• tojikomori@kbin.social
    2 years ago

    This reply’s interesting:

    How can data licensed under the CC-BY-SA licenses (that SO content is licensed under) be “misused”? The license explicitly allows others to do essentially anything they want with the data as long as attribution is given, in particular profit off of it.

    When SO content is applied as parametric knowledge, I’d expect the outcome to fail both the “BY” and the “SA” clauses, since model interpreters can’t provide attribution for it and their output won’t share the license. That’s true even if the output is considered public domain: CC-BY-SA content can’t be relicensed under a public-domain-equivalent license. It seems practically indistinguishable from using any other in-copyright content as training material.

    None of that’s to say SO is right to stop data dumps. It feels like they’re trying to find a technical solution to a legal problem, perhaps even one that rises to criminality on the part of OpenAI and others?