Meta announced a new AI model called Voicebox yesterday, one it says is the most versatile yet for speech generation, but it’s not releasing it yet. The model is still only a research project, but Meta says it can generate speech in six languages from samples as short as two seconds and could be used for “natural, authentic” translation in the future, among other things.

  • blaine@kbin.social · 1 year ago

    Mark Zuckerberg was on the Lex Fridman podcast less than a week ago talking about this, and he said Meta would continue to open-source their models until they reach the point of “super intelligence”.

    So what changed in the last week?

    • Clairvoidance@kbin.social · 1 year ago

      That was specifically about LLMs. In that same podcast he also highlighted how worrisome scams are, and you can probably extend that to any reality-faking technology as it gets more and more convincing.
      It’s self-explanatory that the threat of extinction by AI and the threat of crafting a fake reality to shape the outcome of the real one are two different threats.

      • blaine@kbin.social · 1 year ago

        So creating a text-based AI that impersonates influencers or celebrities is a “cool feature” to “increase engagement” and is totally viable to release to the public, but doing the (checks notes) same thing using voice is incredibly “dangerous” and needs to be protected?

        • Clairvoidance@kbin.social · 1 year ago

          Well snarked, especially enjoyed the copypaste of the checking-notes phenomenon. Can you figure out why one would be seen as more harmful in the immediate future than the other?

        • conciselyverbose@kbin.social · 1 year ago

          People understand that text can be fake.

          People don’t really understand that voices can be. It’s opening up a lot of scams where people pretend to be kidnapped (or otherwise desperate) relatives and take money from people. If you make it easier to automate that without a human in play and have it appear responsive? A lot more is going to happen, a lot more convincingly.

          I don’t at all believe Facebook cares about that, but it is a real downside to the tech.

      • Maeve@kbin.social · 1 year ago

        Ok, following you, only commenter so far. Your posts are thought-experiment inducing. Thank you!

    • NotMyOldRedditName@kbin.social · 1 year ago

      I didn’t watch it, but wasn’t that about LLaMA? That’s text generation, not speech generation.

      Speech has more implications if it can replicate someone’s voice. Imagine getting a ransom voice mail from your child.

      That doesn’t happen with text generation the same way.