A sex offender convicted of making more than 1,000 indecent images of children has been banned from using any “AI creating tools” for the next five years in the first known case of its kind.
Anthony Dover, 48, was ordered by a UK court “not to use, visit or access” artificial intelligence generation tools without the prior permission of police as a condition of a sexual harm prevention order imposed in February.
The ban prohibits him from using tools such as text-to-image generators, which can make lifelike pictures based on a written command, and “nudifying” websites used to make explicit “deepfakes”.
Dover, who was given a community order and £200 fine, has also been explicitly ordered not to use Stable Diffusion software, which has reportedly been exploited by paedophiles to create hyper-realistic child sexual abuse material, according to records from a sentencing hearing at Poole magistrates court.
How could they possibly enforce this ban?
Wasn’t the point that what he was using them for was already illegal? Sounds like he already couldn’t get caught, so doesn’t seem like that’ll do much…
What do you mean? Like how would they catch him?
In the States parole/probation means you lose most of your civil liberties. In other words, if this was the U.S. a PO would check his phone and possibly his computer. Possibly even pull ISP records depending on how bad they want to catch you/how full of shit they think you are.
How will they even know he’s doing it? It doesn’t say they’re monitoring his internet connection. And even if they were monitoring his internet connection, he could go to some public wifi hotspot and sit in a car and do it.
I edited my comment. You’re too quick.
But yeah, he could get around it. But, he’s an addict. He’s going to want that porn in places other than his car and make mistakes. If he’s tech savvy, he can probably stay one step ahead of his probation agent (assuming he has one). If he’s not, he’ll slip up because he’s addicted, and that’s how people get caught.
Is it weird that the whole detect/evade game just sounds super fun to me?
Not really. You’re probably a bit of a dopamine/norepinephrine (adrenaline) junky, like most Westerners. It’s bred into us by consumer culture.
It’s weird that it’s not weird though.
Nahh, I’m not so much about the chase as the metagame.
Might want to check out cybersecurity and pen testing. It’s not exactly the same thing, but it’s kinda close in some regards.
What do you mean by metagame? Like you find the cat and mouse stuff that is happening fascinating?
Put monitoring software on his devices.
He could just get a burner phone. Realistically, there is no way to police this.
This is pretty similar to restraining orders, make it more difficult and make the consequences more severe.
In the modern world when we have cellphones that can do pretty much anything… it’s fucking hard. There will be a parole officer and monitoring software with periodic physical inspections along with watching his purchases. (That’s, at least, the American approach).
Usually the way it works is that when this dude slips up once he goes to prison for violating his court order.
Or a burner laptop/Chromebook/whatever. Couple that with a VPN, using a neighbor’s wifi, public hotspots, etc, and I don’t really see how they can realistically enforce this against someone motivated to do what they’re gonna do.
Have part of his probation be having his property searched to check for such devices.
There’s a log for everything. There really is. It’s just hard to piece it all together.
As a UK citizen, I’m ashamed of my government.
I am firmly against child abusers, but AI images don’t harm anyone and are a safe and harmless way for pedophiles to fulfil their urges, which they cannot control.
Where does the training data come from to create indecent images of children?
It doesn’t need csam data for training, it just needs to know what a boob looks like, and what a child looks like. I run some sdxl-based models at home and I’ve observed it can be difficult to avoid more often than you’d think. There are keywords in porn that blend the lines across datasets (“teen”, “petite”, “young”, “small” etc). The word “girl” in particular I’ve found that if you add that to basically any porn prompt gives you a small chance of inadvertently creating the undesirable. You have to be really careful and use words like “woman”, “adult”, etc instead to convince your image model not to make things that look like children. If you’ve ever wondered why internet-based porn generators are on super heavy guardrails, this is why.
Thanks for the reply, it’s given me a good idea of what’s most likely happening :)
It’s a shame that the rest of the thread went to shit, but unfortunately it’s an emotional topic, and brings out emotional responses
Always happy to try and productively add to someone’s learning.
I’m not going to say that csam in training sets isn’t a problem. However, even if you remove it, the model remains largely the same, and its capabilities remain functionally identical.
At that point it’s still using photos of children to generate csam even if you could somehow assure the model is 100% free of csam
That would be true, it’d be pretty difficult to build a model without any pictures of children at all, and then try and describe to the model how to alter an adult to make a child. Is anyone asking for that though? To make it illegal to have regular pictures of children in these datasets?
No but it is a reason why generating csam should be illegal. You’re using data trained on pictures of real kids
It is true, a 10 year old naked woman is just a 30 year old naked woman scaled down by 40%. /s
No buddy, there isn’t some vector of “this is the distance between kid and adult” that a model can apply to generate what a hypothetical child looks like. The base model was almost certainly trained on more than just anatomical drawings from Wikipedia - it ate some csam.
If you’ve seen stuff about “Hitler - Germany + Italy = Mussolini”: for models where that’s true (which is not universal), it takes an awful lot of training data to establish and strengthen those vectors. Unless the generated images were comically inaccurate, a lot of training went into this too.
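For anyone who hasn’t seen the vector-arithmetic thing mentioned above, here is a toy sketch of the idea. Everything in it is an assumption for illustration: the four axes and the hand-written numbers are made up, whereas a real model learns hundreds of dimensions from a large corpus, and the analogy only holds where the training data supports it.

```python
import math

# Hypothetical hand-made "embeddings". Axes: [leader, city, germany, italy].
# Real embeddings are learned from data, not written by hand.
VECTORS = {
    "Hitler":    [1.0, 0.0, 1.0, 0.0],
    "Mussolini": [1.0, 0.0, 0.0, 1.0],
    "Germany":   [0.0, 0.0, 1.0, 0.0],
    "Italy":     [0.0, 0.0, 0.0, 1.0],
    "Berlin":    [0.0, 1.0, 1.0, 0.0],
    "Rome":      [0.0, 1.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word whose vector is closest to a - b + c, excluding the inputs."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))


result = analogy("Hitler", "Germany", "Italy")
print(result)
```

With these made-up numbers the nearest remaining word is the Italian leader. That only shows the arithmetic works when the directions already exist in the space; it says nothing about how much data a real model needs to learn them.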
Right, and the google image ai gobbled up a bunch of images of black george washington, right? They must have been in the data set, there’s no way to blend a vector from one value to another, like you said. That would be madness. Nope, must have been copious amounts of asian nazis in the training set, since the model is incapable of blending concepts.
You’re incorrect and you should fucking know better.
I have no idea why my comment above was downvoted to hell but AI can’t “dream up” what a naked young person looks like. An AI can figure that adults wear different clothes and put a black woman in a revolutionary war outfit. These are totally different concepts.
You can downvote me if you like but your AI generated csam is based on real csam so fuck off. I’m disappointed there is such a large proportion of people defending csam here especially since lemmy should be technically oriented - I expect to see more input from fellow AI fluent people.
You’re spreading misinformation and getting called out for it.
Just a note - csam has been found in model training sets: https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse
It isn’t misinformation, though, generative AI needs a basis for its generation.
Bro googled the word vector and was waiting to use it.
No, they’re referring to the internal workings of AI models, which are essentially a series of incredibly high-dimension matrices with extra bits around them to make them work. Individual concepts are embedded as vectors in the space that these models work in. That’s why linear algebra is brought up so frequently in discussions of AI.
While it’s true that linear algebra and vectors are used in learning models, they’re not using the term correctly in a way that says they know something about the subject (at least, the modern subject). Concepts aren’t embedded as vectors. In older models (before the craze), concepts were manually embedded as numbers or a collection of numbers, which could be a vector (but could be something else as well), and the machine would learn by modifying weights. However, in current models (and by current, I mean at least more than a couple years), concepts are learnt by the machine (weights are still modified by the machine as well) and the machine makes its own connections between features presented to it.
For example, you give it a dataset of 10x10 pixel images (with text descriptions) and it reads that as 100 pixels split into 3 numbers (RGB) and then looks for connections between those numbers and in which pixels. It’s not identifying what a boob is, but knows that when an image has ‘boob’ in the text description then there’s a very high likelihood that there will be a circular collection of pixels with lots of red somewhere in the image that are also connected to other pixels that are often also lots of red. That’s me breaking down what a human would think given the same task/information, but the reality is the machine will come up with its own connections/concepts which are both often far better than humans (when the model works, at least) and far more ineffable to humans.
The whole point of diffusion models is that you can generate new concepts using training data. Models trained on any nsfw images can combine those concepts with any of its non-nsfw concepts. Of course, that’s not to say there isn’t CSAM in any training data, because there objectively has been in the past, but there doesn’t need to be any to generate it.
Thanks for the reply, that makes a lot of sense :)
Thanks for not being a dick! I aim to inform
AI is able to fill in the last field in a table like “Old / young” vs “Clothed / naked” when given three of the four fields.
Csam is in the training data. From a few months ago
Please reiterate your statement but instead using the “goose chase meme” format.
It brings me to ask the question if lolicon could be their next target?
UK legislators have a long history of taking actions not informed by science or reason but rather the popular, often hysteric, opinion.
This case is yet another attempt at tightening screws where they shouldn’t be.
AI imagery was produced by Stable Diffusion, the model that, for all we know, did not take real CSAM as inputs and caused no harm to actual children. At the same time, such images are important at discouraging the consumption of real CSAM, with very real children being traumatized.
By banning AI imagery production using safe models, legislators leave no legal way for pedophiles to get something by the harmless means, directing many to the harmful ways as equally illegal, while also prosecuting those who did no harm.
I thought pedophiles looking at CSAM were more likely to attack a child, not less. They are actively fantasizing about it, and that can escalate.
I am basing this belief on what I remember of discussions regarding that “ask a rapist” reddit megathread. Apparently psychologists thought that was horrifying.
The bias with this approach is that it highlights those who did offend, while telling us nothing of those who didn’t. This is often repeated throughout research as well.
It’s very likely that a lot of child abusers did watch CSAM (after all, if you see no issue in child abuse, there’s no issue for you in the creation of such imagery), but how many CSAM viewers end up being abusers and is there an elevated risk? That is the question.
I guess if we’d make an “ask a pedophile” thread instead of “ask a rapist”, we could get some insights. Pedophiles, catch the idea!
But then we cannot say that in either direction. We simply don’t know if they are more or less likely to attack a child without data about it.
By “harmful ways” I meant consuming more real CSAM - something that is frustratingly underresearched as well, but one can guess.
I don’t have any of these tendencies, but I like to think that if I did I would chemically remove my sex drive.
That’s up to everyone. Besides, most pedophiles do have sexual interest towards adults as well, and current means reduce that drive too.
Chemical castration in this context increases misery and makes building healthy adult relationships harder. Most pedophiles do not opt for that, for all I know.
Current therapeutic methods do include suppressing sex drive in case the client struggles with impulse control. Otherwise, it is not offered, but can be given on request.
counter point:
if you have a folder of AI generated CP and put in a couple of pictures of actual CP it’s going to muddle the case as the offender could claim all of them are simply AI generated. Real harm could go unnoticed if those two were to be treated differently.
Additionally, not every offender will stop at AI generated images, and if their curiosity becomes enough they could go on to want to experience “the real thing”.
You do realize that slippery slope argument is what’s used when it comes to banning anything else, right?
“Can’t legalize marijuana or people will start wanting to do meth” for example.
I don’t believe those two are comparable.
Weed and meth are rather different in how they affect people.
AI images are often used as a way to imitate reality
It doesn’t matter if you believe it, for those who lived through D.A.R.E and the war on drugs, that argument was common and on plenty of people’s lips. It’s a stupid argument but I think that’s the point OP is trying to make
then why is that person repeating a stupid argument at me? those aren’t comparable at all.
A better comparison would be idk, CBD weed with no THC being legal and that being the “gateway” to normal weed. Or buying a knock off product and wanting to try the original. Or looking at AI generated photos of people eating spaghetti and wanting to see what it actually looks like
It’s a stupid argument being juxtaposed with your argument…you’re so close, you got this.
I think the solution here is not banning AI materials outright but to make them identifiable - even by means of digital signatures if you want.
For example, Stable Diffusion could insert a particular piece of metadata into the picture containing the signature and proving the image is AI-generated, etc.
Without AI materials, said curiosity may lead people straight to the “real thing”, and every darknet or even Telegram dweller will tell you it’s frighteningly easy to find it even if you never intended to. With AI materials, people can have a chance to stop there.
The fact that digital signatures don’t work like that and sign specific content.
It’s a mathematical procedure that involves the data being signed. If you apply it to any other content, the signature won’t be valid.
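A minimal sketch of that content-binding property, under stated assumptions: it uses an HMAC from the Python standard library as a stand-in, which is a shared-secret tag rather than a true public-key signature, and the key and byte strings are hypothetical placeholders. The point it illustrates is the same, though: the tag is computed from the exact bytes, so it does not validate against any other content.

```python
import hashlib
import hmac

# Hypothetical secret held by whoever produces the tags (assumption for this sketch).
SECRET_KEY = b"hypothetical-generator-key"


def sign(content: bytes) -> str:
    """Compute a tag that depends on every byte of the content."""
    return hmac.new(SECRET_KEY, content, hashlib.sha256).hexdigest()


def verify(content: bytes, tag: str) -> bool:
    """Recompute the tag for this content and compare in constant time."""
    return hmac.compare_digest(sign(content), tag)


original = b"bytes of file A"   # placeholder for the signed file
other = b"bytes of file B"      # placeholder for any different file

tag = sign(original)
ok_original = verify(original, tag)   # tag matches the content it was made from
ok_other = verify(other, tag)         # same tag attached to different content

print(ok_original, ok_other)
```

A real scheme would use an asymmetric key pair so that anyone can verify without being able to sign, but copying a tag from one file onto another fails in either design.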
meta data is trivially easy to strip off a picture, you don’t even need to bother using tools for it - just take a screenshot and delete the original
Can be baked in pixels, or even better sent to identification for a system similar to what Apple uses to detect CSAM, but as an “alright” ID (but just in police’s hands, not on device or something).
But even then, if every pixel gets marked as ‘created by AI’, it would still be trivial to take real CSAM and run it through an image-to-image generator with denoising turned down to 0.05 and suddenly you have real CSAM that has been marked as ‘legal’ since it is technically AI generated.
Also, keep in mind that there are several open source projects out there where anyone who knows what they are doing could just strip out any protections that might be put in place.
Apple-like ID system solves the latter by technical means.
As per image-to-image, feeding the model with recognised CSAM should be unavailable to begin with.
Yeah but the point is you can’t easily add it to any picture you want (if it’s implemented well), thus providing a way to prove that the pictures were created using AI and no harm has been done to children in their creation. It would be a valid solution to the “easy to hide actual CSAM between AI generated pictures” problem.
you can’t easily add it to any picture you want (if it’s implemented well)
Edit for the downvoters: StackExchange - How do I add exif data to an image?
Going to need you to elaborate on this. EXIF data is just bytes in a file, like any of the other bytes in the file. It can be changed and is often changed without the user’s consent. Are you proposing we create a new type of hardware, something akin to Secure Enclave, and then mass-produce and add it to every consumer CPU to ensure some specific types of EXIF data aren’t tampered with?
I was thinking of an approach based on cryptographic signatures. If all images that come from a certain AI model are signed with a digital certificate, you can tamper with metadata all you want, you’re not gonna be able to produce the correct signature to add to an image unless you have access to the certificate’s private key. This technology has been around for ages and is used in every web browser and would be pretty simple to implement.
The only weak point with this approach would be that it relies on the private key not being publicly accessible, which makes this a lot harder or maybe even impossible to implement for open source models that anyone can run on their own hardware. But then again, at least for what we’re talking about here, the goal wouldn’t need to be a system covering every model, just one that makes at least a couple models safe to use for this specific purpose.
I guess the more practical question is whether this would be helpful for any other use case. Because if not, I highly doubt it’s gonna be implemented. Nobody is gonna want the PR nightmare of building a feature with no other purpose than to help pedophiles generate stuff to get off to “safely”, no matter how well intentioned.
I disagree that it should be allowed, but I think their proposal would be something like attaching an identifier to the model, the random seed, the “temperature,” and any other relevant parameters that allow exact reproduction of the image without having access to anything but the model. Then you can prove it came from the model.
Here’s a thought experiment, though, what would prevent someone from taking a real image and a model, then working with them until they can reproduce a very close approximation of the real image from text and parameter input? These models aren’t like a hash function, they can be viewed in reverse to some extent. Backpropagation is how they are trained.
Try to take your emotion from the discussion. There is finally a way for people with an illness (in this case pedophilia) to “satisfy” urges without causing harm to children. They need professional help which cannot be gained easily in the UK due to a certain government removing funds.
This isn’t a give pedos stuff celebration, it’s a discussion that needs to happen and if you’re not mature enough to not get emotional, don’t partake in the conversation.
What were those models trained on?
Not taking sides, but his argument hinged on the stable diffusion model not having CSAM in it, and using non-CSAM images in order to generate CSAM.
So he’s already answered the question of what the models are trained on.
Whether those models actually are clean/safe is a different question.
Here’s the problem, it doesn’t matter if it was or not. It does, but that’s a different issue.
My point is, how do you know it wasn’t trained on csam?
You can’t possibly. You can point to all the places where csam isn’t and say “we haven’t found any illegal images yet.” But you can’t say with 100% certainty that there are none.
And since you can’t prove that no csam is used to train the model, any argument beyond that point is moot. If this were almost any other issue I’d say eliminating 99.99% of the risk is completely valid and safe. But we’re not talking about a celebrity or a porn star. We’re talking about child victims of sexual assault, and to that end we should not accept anything other than absolute certainty. And because absolute certainty cannot exist, we should not simply accept it as a society.
I’m not disagreeing, I also don’t want these models producing CSAM.
But in the hypothetical that we have a clean model that still generates CSAM, what would be your argument against it?
Obviously goat sex
My understanding is that CSAM doesn’t satisfy anything. Iirc research on the subject suggests that it causes most pedophiles to go out and look for the real thing.
Which scans. How many people watch normal porn and think: “well, that’s good enough” and just stop pursuing a real partner?
Could you please provide such paper? I couldn’t obtain the same findings.
The difference between pedophiles and non-pedophiles is that the latter don’t have to satisfy themselves with less; it’s not morally wrong nor illegal to pursue relationships with an adult partner. It is, however, with children.
No one says pedophiles don’t want to have relationships/sex with children after being exposed to either CSAM or AI imagery; but there is a difference between a wish and intention, and if we can help them to keep their wishes at bay, we should.
If dating adults would deeply traumatize them and would be illegal, many people would probably find a relief in porn without a real action. We just don’t normally consider this perspective because in reality it’s totally okay and we don’t have to limit ourselves.
That’s why satisfy was in quotations, it’s not a black and white matter, for a lot of people this does nothing. But for a lot of people this is something that is potentially life altering.
And I agree with what you’re saying to an extent. But you watch porn to satisfy an urge, if I watch a certain category of porn it doesn’t mean I want to go out and experience that category.
This is a complicated matter, and something a magistrate is not equipped to deal with.
Which scans. How many people watch normal porn and think: “well, that’s good enough” and just stop pursuing a real partner
Enough that our birth rates are dropping and fewer people are getting married.
You’re going to have to source your claim there.
If you think porn is the reason for declining birth rates and higher rates of loneliness I have a bridge to sell you
It’s not a matter of entitlement but of real-world harm. And generated imagery involving imaginary children does not constitute child sexual abuse.
I’d gladly give pedophiles generated imagery if that were to stop them from lurking in search of real CSAM, supporting the industry that creates a very tangible harm - actual child abuse.
And my life has nothing to do with either, so don’t make it personal. I only share my opinion on what we should really do to protect children, not to protect our deeply rooted views.
Using csam in training data causes harm
Answered in another thread. TL;DR those are accidentally scraped pictures, not intentional training. But we should filter datasets better.
Sure they’re entitled to something.
Coping mechanisms to help them not pursue that desire, or a first class ticket on a rocket to the sun.
There is no middle ground.
There’s no csam because there’s no child. Critical thinking is hard I know.
Except when the data is trained on csam
Now that’s 100% reprehensible. I didn’t read the link, but the only excuse I can think of is if it’s used to automatically recognise csam, so a human doesn’t have to look at it.
The link explains that they are in a dataset used to train a text-to-image model. Images with hashes matching known CSAM. There are tools that could have caught this which this dataset failed to use. Gigantic and repugnant failure. Makes me want to never download a dataset.
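For a rough idea of what that kind of check looks like, here is a minimal sketch, assuming a plain SHA-256 blocklist and placeholder byte strings. The function names and the blocklist are hypothetical. Real screening tools work from hash lists maintained by child-protection organisations and typically use perceptual hashes, so this exact-match version only catches byte-identical copies.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def filter_dataset(items, blocked_hashes):
    """Split (name, bytes) pairs into kept and dropped by exact hash match."""
    kept, dropped = [], []
    for name, data in items:
        if sha256_hex(data) in blocked_hashes:
            dropped.append(name)
        else:
            kept.append(name)
    return kept, dropped


# Placeholder files standing in for scraped images (assumption for this sketch).
dataset = [
    ("a.jpg", b"harmless image bytes 1"),
    ("b.jpg", b"bytes of a known flagged file"),
    ("c.jpg", b"harmless image bytes 2"),
]

# Hypothetical blocklist; in practice this would come from an external hash list.
blocklist = {sha256_hex(b"bytes of a known flagged file")}

kept, dropped = filter_dataset(dataset, blocklist)
print(kept, dropped)
```

Changing a single byte of a flagged file changes its SHA-256 digest, which is why exact hashing alone misses re-encoded or resized copies and anything that was never on a list to begin with.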
Now think of the photos that don’t have any matching hashes. Social media has a ton of csam and as long as they scrape from Facebook/insta/twitter or from porn sites with no verification system they will continue to have csam in their training data.
Why didn’t he get banned from using the internet?
I definitely can’t let you do that, Hal.
Is he extorting actual kids or just having a computer generate fap material? The difference decides whether or not I give a damn.
He is fapping to porn that was generated by an AI that trained on csam.
Yes, just like the pictures of astronauts on horses were trained on an extensive collection of space derby pictures.
Not quite. You see, unfortunately, space derbies don’t actually exist. The other, unfortunately, actually does.
Be in denial if you want. That csam is trained on csam.
Any proof for this? Would be an interesting read.
No, it’d be hard to tell since model makers are usually close-lipped. But Twitter has been included in a lot of the image models and traditionally has a very large issue with cp.
Oh… You sounded so confident at first.
See the sibling comment with a link.
Kind of just contradicted yourself there. And have you ever heard the phrase “correlation does not imply causation”?
But how can that be? Surely just the fact that it can create those pictures is incontrovertible proof that it was trained on pictures of spacesuited cowboys?
He used Stable Diffusion, which, for all we know, was NOT trained on CSAM.
Csam is in the training data. From a few months ago
Thanks for the correction!
It’s worth noting that this only includes CSAM accidentally scraped along with everything else on the open Web. No specialized CSAM training took place.
In any case, I welcome the efforts at filtering such content before it enters the dataset.
It’s obviously accidental, but that doesn’t change that it happened and is something that will be near impossible to avoid as long as they continue to scrape data in the way they do for their models. They would need a human to filter it out like they already use for most LLMs.
How will you enforce this kind of policy? I just buy a VPN, use proxychains or anonsurf, what are you gonna do? Put a police officer to live in the same room as me?
Install spyware on your devices, one would assume.
I think it would be hard to enforce but it’s an interesting precedent, and it means that if they catch him they can whack him
Strange to use yourself as the person creating child pornography in this hypothetical.
Come on, they’re raising concerns about how this ruling will be made effective. An actual criminal would just stay silent and use the VPN.
Jesus Christ is that @PotatoKat@lemmy.world ’s music I hear?
Just annoyed to see everyone saying with such definitive wording that there isn’t any csam in training data. I’m a victim of CSA and can’t imagine how I would feel if photos of me were used to help get people off like that.
Right? Training data is an absurd blob of everything the algorithm can get its hands on. It’s like trying to assure that there’s no alcohol or coca-cola in a lake.
It’s great to see you wading into this shitshow with the folding chair, ngl.
This is the best summary I could come up with:
The Internet Watch Foundation (IWF) said the prosecutions were a “landmark” moment that “should sound the alarm that criminals producing AI-generated child sexual abuse images are like one-man factories, capable of churning out some of the most appalling imagery”.
Susie Hargreaves, the charity’s chief executive, said that while AI-generated sexual abuse imagery currently made up “a relatively low” proportion of reports, they were seeing a “slow but continual increase” in cases, and that some of the material was “highly realistic”.
The Lucy Faithfull Foundation (LFF), which runs the confidential Stop It Now helpline for people worried about their thoughts or behaviour, said it had received multiple calls about AI images and that it was a “concerning trend growing at pace”.
The decision to ban an adult sex offender from using AI generation tools could set a precedent for future monitoring of people convicted of indecent image offences.
Stability AI, the company behind Stable Diffusion, said the concerns about child abuse material related to an earlier version of the software, which was released to the public by one of its partners.
It said that since taking over the exclusive licence in 2022 it had invested in features to prevent misuse including “filters to intercept unsafe prompts and outputs” and that it banned any use of its services for unlawful activity.
wank foundation
Wank watch foundation