• Akisamb@programming.dev (OP, M)
    9 months ago

    It does seem odd that scraping activity from just two accounts allegedly managed to cause such an extended server outage. The irony of this situation also hasn’t been lost on online creatives, who have extensively criticized both companies (and generative AI systems in general) for training their models on masses of online data scraped from their works without consent. Stable Diffusion and Midjourney have both been targeted with several copyright lawsuits, with the latter being accused of creating an artist database for training purposes in December.

    As far as I know, they do not hold copyright over the output of their models. Apart from banning the users, they pretty much have no way to stop this. Even if they did hold copyright, it's still legally unsettled whether training LLMs constitutes a copyright violation.

    In a similar fashion, a lot of the recent chat LLMs have been trained on output from ChatGPT. After all, why pay humans to produce training data when your competitor has already done it for you?