Now that’s interesting. I really have been waiting for something like this. I wonder if the LLM companies will now actually have to explain where their models get the detailed information about these books from, or if they can get away with stating that they have no idea how their own systems work.
I found this comment on Hacker News to be spot-on:
The complaint lays out in steps why the plaintiffs believe the datasets have illicit origins — in a Meta paper detailing LLaMA, the company points to sources for its training datasets, one of which is called ThePile, which was assembled by a company called EleutherAI. ThePile, the complaint points out, was described in an EleutherAI paper as being put together from “a copy of the contents of the Bibliotik private tracker.” Bibliotik and the other “shadow libraries” listed, says the lawsuit, are “flagrantly illegal.”
This is the makers of AI explicitly saying that they did use copyrighted works from a book piracy website. If you downloaded a book from that website, you would be sued and found guilty of infringement. If you downloaded all of them, you would be liable for many billions of dollars in damages.
It is legal to create new knowledge about works or bodies of works. They don’t have a leg to stand on.
“The Library.”