Maven, a new social network backed by OpenAI’s Sam Altman, found itself in controversy today after it imported a huge number of posts and profiles from the Fediverse and then ran AI analysis to alter the content.

  • danc4498@lemmy.world
    English
    arrow-up
    51
    arrow-down
    20
    ·
    5 months ago

    People can complain, but the Fediverse is built to make consuming users’ data easy. If you don’t want AI using your data, don’t put it on such an easily “scrapable” network.

    • bbuez@lemmy.world
      English
      arrow-up
      15
      ·
      5 months ago

      Alternatively, use a closed ecosystem susceptible to data rot and loss.

      Want to contribute to our open source project? Join our discord

      Would you want art to be unfindable because scraping for AI image generation happens? It’s a solution looking for problems.

    • Scrubbles@poptalk.scrubbles.tech
      English
      arrow-up
      7
      ·
      5 months ago

      This is what I’ve been saying the entire time. It sucks, and it’s wrong, but the Fediverse is built from the ground up as an open sharing platform, where our data is shared with anyone. It shouldn’t be, and it’s wrong, but there is nothing to stop anyone from doing it. To change that would alter federation at a core level.

      • danc4498@lemmy.world
        English
        arrow-up
        12
        ·
        5 months ago

        I would rather my content be open to the world for however it wants to use it than owned by a single company that gets to profit off aggregating and selling it.

      • tooLikeTheNope
        English
        arrow-up
        2
        ·
        5 months ago

        Yeah, but doesn’t Hubzilla (https://hubzilla.org/page/info/discover) apply a privacy layer to how its content is distributed? The issue then also lies in how a social network is implemented to serve its purpose: in the case of Hubzilla vs. Lemmy, for instance, the difference is between a social network and a public board.

        • bamboo@lemm.ee
          English
          arrow-up
          2
          ·
          5 months ago

          If it ends up being ruled that training an LLM is fair use so long as the LLM doesn’t reproduce the works it is trained on verbatim, then licensing becomes irrelevant.

        • Scrubbles@poptalk.scrubbles.tech
          English
          arrow-up
          2
          ·
          5 months ago

          I’ve had this argument with other people, but essentially at this point there is no licensing beyond server ownership here, and most servers don’t have any licenses defined. Even if they do, then sure they did something wrong… but how would you ever prove it or enforce it? The only way to actually disallow them is to switch from open federation to closed - which goes against what we’re trying to build with federation.

          • Grandwolf319@sh.itjust.works
            English
            arrow-up
            1
            arrow-down
            1
            ·
            edit-2
            5 months ago

            There have been instances before where LLMs gave up clues as to what sources they used. When that happens, they can be sued.

            I’m okay with people using our data for whatever, since it’s all open and it should be. But I’d rather put in a little bit of effort to make for-profit use technically illegal. It’s better than nothing.

    • lambalicious@lemmy.sdf.org
      English
      arrow-up
      3
      arrow-down
      2
      ·
      5 months ago

      People can complain, but the Fediverse is built to make consuming users’ data easy

      Correction: it is built to make consuming users’ data not easy, but more human.

      What you are thinking of is AP, not “Fediverse”, and even then that’s a stretch.

      • danc4498@lemmy.world
        English
        arrow-up
        3
        arrow-down
        1
        ·
        5 months ago

        Correction: it is built to make consuming users’ data not easy, but more human.

        What does that even mean?

        What you are thinking of is AP, not “Fediverse”, and even then that’s a stretch.

        Honestly, I think Fediverse is inseparable from AP (or some similar protocol). You can split hairs if you want, but the thing that makes it different from all other social media services is that it allows the content created by users on one service to be imported into a different service.

        You can hope and dream that it is only services like Lemmy consuming user content from services like Mastodon, but this same protocol makes it easy for services like ChatGPT to consume the same data.

    • Grandwolf319@sh.itjust.works
      English
      arrow-up
      3
      arrow-down
      2
      ·
      5 months ago

      Just because our data is accessible doesn’t mean it’s legally licensed to be used by a for-profit company. Free doesn’t mean you can do what you want with it; it just means no cost.

      • danc4498@lemmy.world
        English
        arrow-up
        3
        arrow-down
        1
        ·
        5 months ago

        I don’t disagree. I’m just saying that so long as you’re putting content on this platform, you are powerless to stop any service from using the features of the platform in whatever way they want.

        It was built for easy and open consumption of user content by other services.

        • Grandwolf319@sh.itjust.works
          English
          arrow-up
          1
          ·
          5 months ago

          Oh yeah for sure. Anything I type here is for the whole world to see and I’m okay with that as long as it’s anonymous.