How can I add a simple requirement "do not train AI on the source code of the program" to AGPLv3 or GPLv3 and thereby create a new license?

JustVik · 1 month ago

How can I add a simple requirement "do not train AI on the source code of the program" to AGPLv3 or GPLv3 and thereby create a new license?

hendrik@palaver.p3x.de · 1 month ago

I mean, I can also read your code, take inspiration from it and use that to write some proprietary software. I think that’s at least somewhat similar to what happens with AI. AI doesn’t reproduce the code verbatim, but instead learns from it, and as far as I know nobody has found it repeating large chunks of one specific piece of software. So I’d say it’s like me reading copyrighted computer science books, learning programming that way and nowadays using my skill to code Free Software (or whatever).

I know this position is disputed. But I think there is some truth to it. At the same time it’s close to the tech companies’ rationale. It’s morally wrong that they get to profit from other people’s labor. And they’re definitely exploiting the fact that law and licences come from a time when AI wasn’t an issue… But I’m really split on the topic. Ultimately we’d need some consensus on how to handle this. And some laws and regulation. And we don’t have that yet.

And I think it’s also similar to other companies profiting off of FLOSS projects. Like with Redis, MongoDB(?) and all the projects that shifted from open-source to source-available due to Amazon et al. just taking things and making a profit by selling them as a cloud service without ever contributing back. It’s just a sad situation. And ultimately it harms me and everyone. Because I’m subject to the same license. And now I can’t use, modify and share some software anymore. These non-commercial clauses are difficult, too. Even if I just run a small Fediverse instance and collect donations, that could be construed as commercial. The same goes for trying to make a living off of Free Software. And I think all of this drama is an even bigger problem than AI being trained on other people’s code. And it all cuts down on freedom. I mean for a legitimate reason… And I get it… Still, the freedom gets lost.

JustVik · 1 month ago

Thank you very much for your reply. I agree with you insofar as I am already inclined to think that a complete prohibition on training “AI” models on the source code of software is not a very good solution and is difficult to enforce under current laws. I hope that someday someone smart will come up with some approaches to such problems.

hendrik@palaver.p3x.de · 1 month ago

Indeed. I hope so. And we desperately need some clear regulations. Even the big AI companies struggle with the lack of clear rules. I can see how we need to go through quite some legal battles to settle some questions arising with the new technology. And that’s currently taking place. But it extends past that. Currently, companies are retreating from the European market due to a completely unmanageable situation. I’ve seen local language models (starting with Llama 3.2) being banned / not licensed within the EU. And that’s going to lead to all kinds of complications. Just because the EU can’t get proper regulations out in time. That’ll hold back technological progress in the EU and mess with companies. In effect, it also takes away my freedom to run language models on my own hardware…

I hope they get that straight. And there is some demand… So maybe it’s happening sooner rather than later. But these are very difficult questions to answer. About AI safety, copyright, the effect and impact on society and freedom… And I think a lot of these questions are difficult to tackle with licensing anyway. We definitely need laws governing whether AI training is fair use. Or whether generating a voice that sounds 70% like David Attenborough is alright to do.