• Thorry84@feddit.nl
    link
    fedilink
    arrow-up
    21
    arrow-down
    2
    ·
    5 months ago

    You’ve quoted them stating they used LLMs while claiming they did not use a LLM? What am I missing here?

    • everett
      link
      fedilink
      arrow-up
      9
      arrow-down
      3
      ·
      5 months ago

      What am I missing here?

      “L” “M” “M”

        • blindsight@beehaw.org
          link
          fedilink
          arrow-up
          6
          ·
          5 months ago

          Correct.

          large language models (LLM) vs. large multi-modal models (LMM)

          Regardless, they both use an LLM as the main driver. Multi modal just means that the LLM is interfaced with generative and/or predictive AIs for other types of content like images, sound, video, etc.

          This is using a generalist tool for a specialized job. I’d expect the limit for LMMs is telling you if your picture is a heart or a kidney… Maybe. With low accuracy. Diagnosing? lol, hell no.