Now I get it. And yes, now I agree with you; it would give a bit more merit to their claim that the data used as input was obtained illegally. (Unless Meta has the right to use The Pile.)
The link does not mention GPT (OpenAI, Microsoft) or LaMDA/Bard (Google, Alphabet), but if Meta is doing it, odds are that the others are doing it too.
Sadly, this would be up to the copyright holders of this data. It does not apply to NYT content that you can freely access online; for NYT, it has to be about the output, not the input.