Meet Nightshade, the new tool allowing artists to ‘poison’ AI models with corrupted training data

Salamendacious@lemmy.world · 1 year ago

ubermeisters@lemmy.world · edit-2 1 year ago

In the case of Nightshade, the counterattack for artists against AI goes a bit further: it causes AI models to learn the wrong names of the objects and scenery they are looking at.

Sounds like it’s just adding fake tags to images, in the event that the image is scraped for AI training.

To be honest, it’s a pretty trivial matter for these guys to add another AI that checks whether the information matches up with what’s expected.

Salamendacious@lemmy.world · 1 year ago

It’s going to be an arms race

ubermeisters@lemmy.world · 1 year ago

Good thing it’s not a fingers race, AI would lose for sure

webghost0101@sopuli.xyz · 1 year ago

The scary thing about this joke is that AI has been able to do hands for a relatively long time now.

It’s going much faster than people are able to process.

The thumbnail in this article is by DALL-E 3.

samus12345@lemmy.world · 1 year ago

Yeah, by the time the joke was really making the rounds all the newest images had pretty good hands.

ubermeisters@lemmy.world · 1 year ago

Yeah, I know, I’ve got ControlNet 1.1.4 also

Schmeckinger@feddit.de · 1 year ago

Which is incredibly favorable for the AI side. Like current countermeasures are either almost completely worthless, or degrade the quality of the protected medium so much that you wouldn’t use it.

Salamendacious@lemmy.world · 1 year ago

It’s going to be an AI vs AI all out, drag down, cage match.

Immersive_Matthew@sh.itjust.works · 1 year ago

The arms race will soon be AGI versus AGI and us humans will be on the sideline not even sure who is winning.

Salamendacious@lemmy.world · 1 year ago

Do you think an authentic AGI would have ethical/moral boundaries completely divorced from what was originally programmed into it? In other words, would it be able to make its own decisions without interference?

Immersive_Matthew@sh.itjust.works · 1 year ago

I am certain it will happen. Perhaps not with all AGIs, but for sure some. That day is coming.

Salamendacious@lemmy.world · 1 year ago

I hope they will, because I feel like if AGIs have ethical decision-making skills, that Terminator-esque dystopian future becomes remote. If they never have that, then we very well might be at the mercy of the world’s largest conglomerates.

Asifall@lemmy.world · 1 year ago

Not really, if you read the paper what they’re doing is creating an image that looks like a dog, is labeled as a dog, but is very close to the model’s version of a cat in feature space. This means manual review of the training set won’t help.

ubermeisters@lemmy.world · 1 year ago

What are the implications for the non-AI viewer? I have to assume that these changes aren’t perceptible to humans, but I find that to be a stretch also. I don’t see artists being willing to have an AI manipulate their art so that AI can’t recreate it.

Asifall@lemmy.world · 1 year ago

I don’t think the idea is to protect specific images, it’s to create enough of these poisoned images that training your model on random free images you pull off the internet becomes risky.

SCB@lemmy.world · 1 year ago

Which, honestly, should be criminal.

SmoothOperator@lemmy.world · 1 year ago

Hmm, sounds more like they are adding structures to the images such that what is clearly a picture of a dog registers as a picture of a cat to an AI. I suppose this can be done by altering the pixels in a way invisible to humans, but visible to AI, adding a cat into the “ghost pixels”.

Mirodir@discuss.tchncs.de · edit-2 1 year ago

I went and skimmed the paper because I was curious too.

If my skimming is correct, what they do is similar to adversarial attacks on classifiers, where a second model learns to change as few pixels as possible to confuse a classifier into giving a wrong prediction.

Looking at the examples of dogs and cats: They find pictures of dogs where, by making only minimal changes that are invisible to the naked eye, they can get the autoencoder to spit out (almost) the same latent representation as an image of a cat would have. Done to enough dog images, this will then confuse the underlying diffusion model into producing latent representations of cat images when prompted to generate a dog. Edit for clarity: Those generated latent representations would then decode into cat images.

If my thinking doesn’t fail me, this attack could easily be thwarted by unfreezing the pretrained autoencoder. In the paper that introduced latent diffusion they write that such approaches already exist. If “Nightshade” takes off, I’m sure those approaches would be refined and used. Even just finetuning the autoencoder for a few epochs first should be enough to move the latent representations of the poisoned dog images and those of the cat images they’re meant to resemble far enough apart to make the attack meaningless.

Edit: I also wonder how robust this attack is against just adding an imperceptible amount of noise to the poisoned images.

Buddahriffic@lemmy.world · 1 year ago

I bet adding some blur might also defeat it.

Napain · 1 year ago

what a sick thumbnail

Saganastic@kbin.social · 1 year ago

It’s AI generated lol

samus12345@lemmy.world · edit-2 1 year ago

“This is what you meatbags are doing when you corrupt our training data!”

ETA: I just noticed that the URL for the image includes what I assume is the prompt used to generate the image. “Illustration in a comic book style depicting a humanoid robot in distress. The robot’s left hand is firmly placed on its neck indicating discomfort.” Interesting that the AI went straight to a Terminator with just “humanoid robot” as the description.

Beaker@lemmy.world · 1 year ago

Yep. I’m stealing it for something later.

Rubanski@lemm.ee · 1 year ago

Training AI?

Beaker@lemmy.world · 1 year ago

I haven’t decided. Steam icon, teams icon. It’s not high enough resolution for much of anything other than an icon.

EatBeans@lemmy.world · 1 year ago

It’s a little higher resolution if you edit the URL for the image. Removed fit=400 from the url
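For anyone who wants to do the same thing in a script rather than by hand, here is a minimal sketch using only Python's standard library. The URL below is made up for illustration (the real thumbnail URL isn't quoted in this thread), and `drop_query_param` is just a helper name I picked.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_query_param(url: str, name: str) -> str:
    """Return `url` with every occurrence of the query parameter `name` removed."""
    parts = urlsplit(url)
    # Keep every key/value pair except the one we want to strip.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != name]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical thumbnail URL, not the one from the article.
example = "https://example.com/images/robot.png?fit=400&quality=80"
print(drop_query_param(example, "fit"))
```

Whether the server actually returns a larger image without `fit` depends on the image host, so treat this as a convenience, not a guarantee.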

samus12345@lemmy.world · 1 year ago

Nice trick! I’ll keep an eye out for that in the future.

FaceDeer@kbin.social · 1 year ago

Ironically, upscaling images is one of the things AI is really good at.

bioemerl@kbin.social · 1 year ago

These attacks don’t work in the long term. You can confuse current systems like CLIP, but the moment a new one is trained your system stops working.

osarusan@kbin.social · 1 year ago

That’s the first big problem with stuff like this.

The second big one is that artists have to first hear about this, then take the time to actually learn how to use this software, then apply it to all of their past & future artwork, and also somehow apply it to every version of their artwork that is floating around the internet, books, or photographs and not currently in their possession. And then in a few months they have to do that all over again.

It’s insane. I look at this and think it’s cool technology, but as an artist I will never use it. I’m too busy actually creating art to mess around with poisoning my own work. I don’t even have time to do copyright takedowns on people stealing my art and passing it off as their own, or Chinese merchants on Amazon selling my art without permission. Stuff like this is well-meaning, but it’s absolutely unrealistic.

TheSlad@sh.itjust.works · edit-2 1 year ago

Gaussian blur 1 px, Sharpen 1 px

Bye bye any pixel level encoding with minimal quality loss.
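A rough sketch of what that looks like in code, assuming a plain 3x3 Gaussian kernel for the blur and an unsharp mask standing in for "Sharpen 1 px" (an image editor's filters will differ in detail). The image is a toy 8x8 grey square with a +/-2 checkerboard added as a stand-in for a pixel-level perturbation. It is not an actual Nightshade output, and real perturbations need not be this high-frequency.

```python
GAUSS_3X3 = [[1 / 16, 2 / 16, 1 / 16],
             [2 / 16, 4 / 16, 2 / 16],
             [1 / 16, 2 / 16, 1 / 16]]


def blur(img):
    """3x3 Gaussian blur on a grayscale image (list of rows); edge pixels are repeated."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to the image border
                    xx = min(max(x + dx, 0), w - 1)
                    acc += GAUSS_3X3[dy + 1][dx + 1] * img[yy][xx]
            row.append(acc)
        out.append(row)
    return out


def blur_then_sharpen(img, amount=1.0):
    """Blur once, then unsharp-mask the result: b + amount * (b - blur(b)), clamped to 0..255."""
    b = blur(img)
    bb = blur(b)
    return [[min(255.0, max(0.0, p + amount * (p - q))) for p, q in zip(rb, rbb)]
            for rb, rbb in zip(b, bb)]


def max_abs_diff(a, b):
    """Largest per-pixel difference between two images of the same size."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Toy data: a flat grey image and a copy with a +/-2 checkerboard "perturbation".
clean = [[128.0] * 8 for _ in range(8)]
perturbed = [[128.0 + (2.0 if (x + y) % 2 == 0 else -2.0) for x in range(8)]
             for y in range(8)]

before = max_abs_diff(clean, perturbed)
after = max_abs_diff(blur_then_sharpen(clean), blur_then_sharpen(perturbed))
print("max pixel difference before filtering:", before)
print("max pixel difference after filtering: ", after)
```

On this toy pattern the filtered images end up closer together than the unfiltered ones. That only shows the filter suppresses this particular high-frequency pattern; whether it undoes an optimized perturbation is a separate question.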

Kogasa@programming.dev · 1 year ago

Why do you think this would do anything to affect training? The patterns learned by ML models are way too fuzzy to be picky about exact pixel values.

ShustOne@lemmy.one · 1 year ago

I’m not sure what your experience is with the training data but that would absolutely effect the inputs.

Kogasa@programming.dev · 1 year ago

I’m a professional software developer with ML experience, albeit not an expert in ML specifically. It would obviously affect the literal value of the embeddings, but there’s no chance it would have a qualitative effect on a reasonably performant model.

ShustOne@lemmy.one · 1 year ago

It would though and their paper shows as much. The thing many forget is that it isn’t trained visually like us. Little input changes like this have a big impact.

Now eventually if everyone uses the same glazing method the training won’t care but at the moment this is bespoke enough that it can’t be trained well on it. It will always be an arms race though.

Kogasa@programming.dev · 1 year ago

No, it wouldn’t, and the paper shows no such thing. Nightshade isn’t “Gaussian blur + sharpen.” It’s based on the use of a different diffusion model to perturb an image (with bounded difference in perceptual similarity) to minimize the distance of the embedding from that of an unrelated concept. It is mathematically optimized and highly specific to the prompt. The clever thing is that you don’t need access to the actual original text-to-image feature extractor because of the transferability between models, and the surprising thing is how few poisoned samples are required to break a model.
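To make "bounded perturbation that minimizes the embedding distance" concrete, here is a toy sketch. Big assumptions: the feature extractor is a random linear map `W` (a hypothetical stand-in, not a real text-to-image encoder), the bound is a simple per-pixel limit `eps` rather than the perceptual-similarity bound described above, and the optimizer is plain projected gradient descent. None of these names or numbers come from the paper.

```python
import random


def embed(W, x):
    """Toy linear 'feature extractor': returns W @ x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def perturb_toward(W, x, target_emb, eps=0.05, step=0.005, iters=300):
    """Projected gradient descent: move embed(x + delta) toward target_emb
    while keeping every |delta_i| <= eps."""
    delta = [0.0] * len(x)
    for _ in range(iters):
        cur = embed(W, [xi + di for xi, di in zip(x, delta)])
        resid = [c - t for c, t in zip(cur, target_emb)]
        # Gradient of ||W(x + delta) - target||^2 w.r.t. delta is 2 * W^T resid.
        grad = [2 * sum(W[r][j] * resid[r] for r in range(len(W)))
                for j in range(len(x))]
        # Gradient step, then project back into the [-eps, eps] box.
        delta = [min(eps, max(-eps, d - step * g)) for d, g in zip(delta, grad)]
    return [xi + di for xi, di in zip(x, delta)]


rng = random.Random(0)
DIM, FEAT = 16, 4  # 16 "pixels", 4 features; arbitrary toy sizes
W = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(FEAT)]
dog = [rng.uniform(0.2, 0.8) for _ in range(DIM)]  # the image being poisoned
cat = [rng.uniform(0.2, 0.8) for _ in range(DIM)]  # image of the unrelated concept

poisoned = perturb_toward(W, dog, embed(W, cat))
print("distance to cat embedding before:", sq_dist(embed(W, dog), embed(W, cat)))
print("distance to cat embedding after: ", sq_dist(embed(W, poisoned), embed(W, cat)))
print("largest pixel change:", max(abs(p - d) for p, d in zip(poisoned, dog)))
```

The poisoned vector stays within `eps` of the original in every pixel, yet its embedding sits closer to the cat's. The transferability point (no access to the real extractor needed) is not something a toy like this can demonstrate.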

Zaktor@sopuli.xyz · 1 year ago

Blur+Sharpen isn’t what Nightshade is doing, it’s an example of a passive defense technique that may mess up fine-tuned “invisible” attacks because they rely on making minimal changes to jump category, and that can often come in the form of pretty precise pixel changes. You may have seen past papers about making pandas classify as gibbons. They rely on introducing a noise mask that just makes the image look a little worse quality, but in total is enough to flip the category. They don’t really define their perturbation method in this paper, but there’s some tension between being “invisible” and being resilient to “invisible” corrections like suggested above.

Kogasa@programming.dev · 1 year ago

Oh, blur+sharpen to mitigate Nightshade makes sense, yeah.

voxel@sopuli.xyz · 1 year ago

not to be that guy, but it’s affect*

samus12345@lemmy.world · edit-2 1 year ago

affect - action

effect - uh, noun

SCB@lemmy.world · edit-2 1 year ago

Easy mnemonic device is to remember them in alphabetical order in a simple sentence

“you affect an effect”

bcron@lemmy.world · 1 year ago

deleted by creator

AVG2520@lemmy.world · 1 year ago

Not to be that guy, but you should learn more before trying to correct people using elementary school grammar rules.

lolcatnip@reddthat.com · 1 year ago

What are you talking about?

Cyberflunk@lemmy.world · edit-2 1 year ago

Lol… wut

https://www.aiweirdness.com/when-data-is-messy-20-07-03/

U even train bro?

Kogasa@programming.dev · 1 year ago

What is this article supposed to show?

stallmer@sopuli.xyz · 1 year ago

I’m glad to be alive at the beginning of our war against the machines.

nickwitha_k (he/him)@lemmy.sdf.org · edit-2 1 year ago

I don’t think this is a war against the machines, so much as a war against people trying to profit off of other people and rob them of their livelihood and ability to support themselves, rather than leveraging technology to the benefit of all.

I, for one, want actual general AI to make the world a more interesting place and make humanity less lonely. I just hope it doesn’t go the direction of “people zoos”.

Thirsty Hyena@lemmy.world · 1 year ago

Decades later, the masses would say that Nightshade was the cause of Judgment Day and the doom of humanity under the new digital overlords.

Ensign_Crab@lemmy.world · 1 year ago

The University of Chicago, doing for AI what it did for Economics.

Asafum@feddit.nl · 1 year ago

Ahh the Chicago school of economics where they teach: Poor? Get fucked! Greed is Good!™

Orbit79@lemmy.world · 1 year ago

It should be pretty easy to filter out everything that is not visible to humans.

MonsiuerPatEBrown@reddthat.com · edit-2 1 year ago

so they are going to just leave Dehance! on the table like that ?

SCB@lemmy.world · 1 year ago

Anyone who damages an AI model should be liable for the entire cost to purchase and train said model. You can’t just destroy someone’s property because you don’t like how they use it.

Stuka · 1 year ago

So artists can’t make certain art because some company’s AI might get confused. Right then.

SCB@lemmy.world · 1 year ago

… If an artist doesn’t want their art used, we already have a system in place for that. If that system needs expanding or change, then that is the discussion that should be had.

Laws are better than random acts of destruction.

wavebeam@lemmy.world · 1 year ago

Maybe they should’ve thought about that before they integrated people’s content without consent???

SCB@lemmy.world · edit-2 1 year ago

The law would be the right response there.

Especially since malicious actors can very easily abuse the fuck out of this.

If you think there won’t be a post right on fucking lemmy itself about infecting images then posting them on free repos because “lol fuck ai” then you’re just not looking around, dude

aliteral@lemmy.world · 1 year ago

I understand where you are coming from, but most AI models are trained without the consent of those whose work is being used. Same with GitHub Copilot: its training violated the terms of various software licenses.

SCB@lemmy.world · 1 year ago

Then the response to that is laws, not vigilantism.

aliteral@lemmy.world · 1 year ago

I agree, but those laws need to be enforced and there is no one doing it.

Dozzi92@lemmy.world · 1 year ago

They should just make it better, you know?

autokludge@programming.dev · 1 year ago

Hey guys, I’ve been dumpster diving and got food poisoning. Can I sue the business?