• Starbuck@lemmy.world
    1 year ago

    I used to work in image forensics a few years ago, right as GAN technology was entering the scene. Even when it was just making 200x200 pixel faces, everyone in the industry was starting to panic. Everything we had at the time was based on detecting inconsistencies in the pixel content, finding repeated structures that indicated copy/paste attacks, or looking for metadata inconsistencies.

    For pixel inconsistencies, you can look at how the JPEG image is encoded and check for blocks that aren’t encoded consistently. This paper covers DCT-based analysis and some other approaches: https://scholar.google.com/scholar?q=dct+image+forensics&hl=en&as_sdt=0&as_vis=1&oi=scholart#d=gs_qabs&t=1690073435801&u=%23p%3DKmFtRm3WpQ8J That’s just one example, but it’s ultimately looking for things like someone photoshopping a region out or patching something in.
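
    To make the block-consistency idea concrete, here is a minimal sketch, not the method from the linked paper. It assumes a single uniform quantization step for every DCT coefficient (real JPEG uses an 8x8 quantization table and rounds pixels afterwards), and the names `quantization_residual` and `STEP` are made up for illustration. A block that really went through that quantizer has DCT coefficients sitting on multiples of the step; a block patched in from elsewhere generally does not.

```python
import math
import random

N = 8  # JPEG works on 8x8 pixel blocks


def _scale(k):
    # Normalisation that makes the DCT-II basis orthonormal.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


# BASIS[k][n]: orthonormal 1-D DCT-II basis.
BASIS = [[_scale(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def _transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2-D DCT of an 8x8 block: D * X * D^T."""
    return _matmul(_matmul(BASIS, block), _transpose(BASIS))


def idct2(coeffs):
    """Inverse 2-D DCT: D^T * C * D."""
    return _matmul(_matmul(_transpose(BASIS), coeffs), BASIS)


def quantization_residual(block, step):
    """Mean distance of the block's DCT coefficients from the nearest
    multiple of `step`. Assumption: one uniform step for all coefficients."""
    total = 0.0
    for row in dct2(block):
        for c in row:
            total += abs(c - step * round(c / step))
    return total / (N * N)


# Toy data: one block that never saw the quantizer, one that did.
STEP = 16  # hypothetical quantization step
rng = random.Random(0)
raw = [[rng.uniform(0, 255) for _ in range(N)] for _ in range(N)]
quantized = [[STEP * round(c / STEP) for c in row] for row in dct2(raw)]
compressed = idct2(quantized)

residual_raw = quantization_residual(raw, STEP)
residual_compressed = quantization_residual(compressed, STEP)
```

    In this toy setup `residual_compressed` comes out near zero while `residual_raw` does not, so a map of per-block residuals would highlight regions whose encoding history differs from the rest of the image.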

    Similarly, copy-move detection looks for “edges” and “intersections” in an image and builds constellations of points, which you can compare using scale-invariant transforms to find duplicates. This article covers an example where North Korea tried to make their landing force look more impressive: https://www.theguardian.com/world/2013/mar/27/north-korea-photoshop-hovercraft
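
    As a rough illustration of copy-move detection, the sketch below swaps in a much simpler technique than the keypoint approach described above: it compares every small patch exactly and votes on the shift between identical patches. It only catches unscaled, unrotated, uncompressed copies, and the function name `find_copy_move` and the toy image are assumptions for the example.

```python
import random
from collections import Counter, defaultdict


def find_copy_move(image, block=4):
    """Return ((dy, dx), votes) for the most common shift between identical
    non-overlapping block x block patches, or None if there are no matches."""
    h, w = len(image), len(image[0])
    seen = defaultdict(list)  # patch contents -> list of top-left positions
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            key = tuple(image[y + dy][x + dx]
                        for dy in range(block) for dx in range(block))
            seen[key].append((y, x))

    shifts = Counter()
    for positions in seen.values():
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                (y0, x0), (y1, x1) = positions[i], positions[j]
                dy, dx = y1 - y0, x1 - x0
                # Skip patches that overlap each other.
                if max(abs(dy), abs(dx)) >= block:
                    shifts[(dy, dx)] += 1
    return shifts.most_common(1)[0] if shifts else None


# Toy image: random noise with an 8x8 region copied from (2, 3) to (20, 18).
rng = random.Random(1)
image = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
for dy in range(8):
    for dx in range(8):
        image[20 + dy][18 + dx] = image[2 + dy][3 + dx]

result = find_copy_move(image)
```

    Here `result[0]` is the dominant shift between source and pasted region. On real photos you would need to skip flat areas such as sky, which match themselves everywhere, and tolerate compression noise, which is part of why keypoint descriptors are used in practice.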

    The problem is that when the entire image is forged, there is no baseline to detect against. The whole thing is uniformly fake. So we’re back to the old “I can tell by looking at it” approach, which is extremely imprecise and labor intensive. In fact, if you look at how GANs work, it’s trivial to embed any detector algorithm into the training process and make something that also defeats that detector.
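
    The last point can be shown with a deliberately tiny toy, not a real GAN. It assumes a hypothetical detector that scores an image by how far its mean brightness is from what real images have, and a one-parameter "generator". Putting the detector's score into the training loss and following its gradient is enough to drive that score to roughly zero. All names and constants here are illustrative assumptions.

```python
import random

REAL_MEAN = 0.5  # hypothetical statistic the detector learned from real images


def detector_score(sample):
    """Hypothetical detector: suspicion grows as the sample mean drifts
    away from REAL_MEAN. Zero means 'looks real'."""
    mean = sum(sample) / len(sample)
    return (mean - REAL_MEAN) ** 2


def generate(bias, noise):
    """Toy one-parameter generator: noise shifted by a learnable bias."""
    return [bias + n for n in noise]


def score_grad_wrt_bias(bias, noise):
    """d(detector_score)/d(bias) by the chain rule; the sample mean moves
    one-for-one with the bias. A real framework would use autograd here."""
    mean = bias + sum(noise) / len(noise)
    return 2.0 * (mean - REAL_MEAN)


def train_against_detector(bias=1.5, steps=200, lr=0.1, seed=0):
    """Gradient descent on the detector's own score."""
    rng = random.Random(seed)
    for _ in range(steps):
        noise = [rng.gauss(0.0, 0.05) for _ in range(64)]
        bias -= lr * score_grad_wrt_bias(bias, noise)
    return bias


rng = random.Random(0)
probe_noise = [rng.gauss(0.0, 0.05) for _ in range(64)]
score_before = detector_score(generate(1.5, probe_noise))
tuned_bias = train_against_detector()
score_after = detector_score(generate(tuned_bias, probe_noise))
```

    After training, `score_after` is far below `score_before`: the generator has learned exactly the thing the detector checks for. In a real GAN the detector's output would be one more term in the generator loss, which works the same way as long as gradients can flow through the detector.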