Source: https://cybre.space/@tindall/106539167944483388
From the same Mastodon thread:
The model is known to reproduce some code, including GPL-licensed code, verbatim; therefore, it must contain verbatim copies of that code, however it is encoded.
[…]
the snippet in question is clearly, deeply original. it is a cursed coding crime that contains several “magic constants” with high entropy.
So it should be required to be open source now, right?
If the license doesn’t matter to them, they could’ve use Microsoft’s own huge private codebase.
Time to delete all github repos. I did this with my sources and I don’t want to go back.
Thing is, they could just crawl all public code. It wouldn’t even need to be on their site, just takes more work.
That’s true of course, but I’m not quite sure whether this would be legal (IANAL, ofc).
I hope there will be soon licenses that forbid these kind of abusive actions.
Some copy-left licenses might already cover closed-source redistribution like this. But I’m guessing we’d need a court to decide that for sure.
deleted by creator