https://futurism.com/the-byte/government-ai-worse-summarizing

The upshot: these AI summaries were so bad that the assessors agreed that using them could require more work down the line, because of the amount of fact-checking they require. If that’s the case, then the purported upsides of using the technology — cost-cutting and time-saving — are seriously called into question.

  • DPRK_Chopra [comrade/them]@hexbear.net
    14 days ago

    Also, this study inexplicably used Llama 2, which does indeed suck and is nowhere near state of the art. Look at this scorecard from a couple of months ago: https://www.trustbit.tech/en/llm-leaderboard-juli-2024

    Note the massive jump in quality for open source models. We went from around 50% for Llama 2 to 80%+ for Llama 3 on a lot of benchmarks. Llama 2 was released in July 2023, and Llama 3.1 just came out in July 2024.

    You don’t have to be a redditor, bazinga brain, treat enjoyer, etc. to realize these Silicon Valley freaks are onto something with this technology and that the field is evolving quickly.

    • impartial_fanboy [he/him]@hexbear.net
      14 days ago

      To expand on that for people who think it’s all just smoke and mirrors: I think that, just like with the assembly line, workplaces will be reorganized to take advantage of the capabilities of LLMs and, perhaps more importantly, designed to work around their weaknesses.

      It’s just that people are still figuring out what that new organization will look like. There hasn’t been a Henry Ford type for LLMs yet (and hopefully it won’t be a Nazi this time). Obviously there’s no guarantee there will be such a person or organization, but I don’t think it’s super unlikely either.

      • DPRK_Chopra [comrade/them]@hexbear.net
        14 days ago

        Well said. This is all so new that we’re still figuring out how to grapple with its implications.

        I do think people here have a tendency to just hate all of it out of hand, which I get to some extent. The last thing we want is Elon to have terminators or something, haha.

        We went from “it can’t even draw hands!!!” last year to “they’ll just use it for porn!!!” now, ignoring the fact that it can render pretty amazing looking videos in such a short time span.

        • impartial_fanboy [he/him]@hexbear.net
          14 days ago

          > I do think people here have a tendency to just hate all of it out of hand, which I get to some extent.

          Yeah the hype cycle is certainly annoying. As is the accompanying fire/re-hire at lower pay cycle that follows any automation.

          > ignoring the fact that it can render pretty amazing looking videos in such a short time span.

          I actually think the generative aspect of neural networks is the least interesting/useful/innovative/etc. Though it will admittedly be more interesting when an LLM can, say, use Blender to make a video rather than just generating it wholesale. Or at least generate the files and 3D models needed for a person to edit it just like they would anything else. I suspect there will have to be a pretty significant architecture change before they can make convincing, coherent movie-length videos.

          Chaotic system control, like what they’re doing with nuclear fusion plasma, is the most interesting, to me anyway.