• hex@programming.dev
    link
    fedilink
    English
    arrow-up
    59
    ·
    3 months ago

    Facts are not a data type for LLMs

    I kind of like this because it highlights the way LLMs operate kind of blind and drunk, they’re just really good at predicting the next word.

    • CleoTheWizard@lemmy.world
      link
      fedilink
      English
      arrow-up
      24
      ·
      3 months ago

      They’re not good at predicting the next word, they’re good at predicting the next common word while excluding most unique choices.

      What results is essentially if you made a Venn diagram of human language and only ever used the center of it.
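
      The "common word while excluding most unique choices" behaviour resembles top-k sampling, one widely used decoding strategy. Below is a minimal sketch of it, assuming a toy vocabulary with made-up scores; the function name `top_k_sample` and the numbers are illustrative only, and real deployments vary in which strategy and settings they use.

```python
import math
import random


def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample one token, considering only the k highest-scoring candidates."""
    # Keep the k most likely tokens; everything rarer is discarded outright.
    kept = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # Softmax over the survivors; a lower temperature sharpens the distribution.
    scaled = [score / temperature for _, score in kept]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    tokens = [tok for tok, _ in kept]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical scores for the word following "The sky is" (invented numbers).
toy_logits = {"blue": 4.0, "clear": 3.2, "grey": 2.9, "falling": 0.5, "mauve": -1.0}

# With k=3, "falling" and "mauve" can never be produced, however often we sample.
samples = {top_k_sample(toy_logits, k=3) for _ in range(1000)}
```

      With `k=3` the two unusual continuations are unreachable no matter how many times you sample, which is the "center of the Venn diagram" effect in miniature.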

      • hex@programming.dev
        link
        fedilink
        English
        arrow-up
        14
        ·
        3 months ago

        Yes, thanks for clarifying what I meant! AI will never create anything unique unless prompted uniquely and even then it will tend to revert back to what you expect most.

  • swlabr@awful.systems
    link
    fedilink
    English
    arrow-up
    45
    ·
    3 months ago

    ATTN: If you’re coming into this thread to say, “The output of AI is bad because your prompts suck,” I’m just proud that you managed to figure out how to use the internet at all. Good job, you!

    • froztbyte@awful.systems
      link
      fedilink
      English
      arrow-up
      15
      ·
      3 months ago

      remember remember, eternal september

      (not that I much agree with the classist overtones of the original, but fuck me does it come to mind often)

  • Sibbo@sopuli.xyz
    link
    fedilink
    English
    arrow-up
    33
    arrow-down
    2
    ·
    3 months ago

    Well, to be fair, AI can do it in seconds. Which beats humans.

    But whether that is relevant if the results are worthless is another question.

  • kbal@fedia.io
    link
    fedilink
    arrow-up
    20
    ·
    3 months ago

    “Made strange choices about what to highlight.”

    They certainly do. For a while it was common to see AI-generated summaries under links to articles on lemmy, so I got a feel for them. Seems to me you would not need any fancy artificial intelligence to do equally well: Just take random excerpts, or maybe just read every third sentence.
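
    For reference, the "every third sentence" baseline is only a few lines. This is a rough sketch, assuming a naive sentence split on terminal punctuation; the function name `every_third_sentence` is made up for illustration, and the split will stumble on abbreviations and quoted speech.

```python
import re


def every_third_sentence(text, step=3):
    """Return every `step`-th sentence of `text`, starting with the first."""
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return " ".join(sentences[::step])


sample = "One. Two! Three? Four. Five. Six. Seven."
print(every_third_sentence(sample))  # prints "One. Four. Seven."
```

    It has no idea what matters in the text, which is the point of the comparison.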

  • David Gerard@awful.systemsOPM
    link
    fedilink
    English
    arrow-up
    18
    ·
    3 months ago

    i have seen the light from the helpful posters here, made up bullshit alleged summaries of documents are great actually

  • khalid_salad@awful.systems
    link
    fedilink
    English
    arrow-up
    11
    ·
    3 months ago

    Could it be because a statistical relation isn’t the same as a semantic one? No, I must be prompting it wrong. I’ll just add “engineer” to my title and then everyone will take me seriously.

  • RagnarokOnline@programming.dev
    link
    fedilink
    English
    arrow-up
    26
    arrow-down
    20
    ·
    3 months ago

    I had GPT 3.5 break down 6x 45-minute verbatim interviews into bulleted summaries and it did great. I even asked it to anonymize people’s names and it did that too. I did re-read the summaries to make sure no duplicate info or hallucinations existed and it only needed a couple of corrections.

    Beats manually summarizing that info myself.

    Maybe their prompt sucks?

        • Steve@awful.systems
          link
          fedilink
          English
          arrow-up
          14
          ·
          3 months ago

          “tools” doesn’t mean “good”

          good tools are designed well enough so it’s clear how they are used, held, or what-fucking-ever.

          fuck these simpleton takes are a pain in the arse. They’re always pushed by these idiots that have based their whole world view on fortune cookie aphorisms

          • froztbyte@awful.systems
            link
            fedilink
            English
            arrow-up
            10
            ·
            3 months ago

            it makes me feel fucking ancient to find that this dipshit didn’t seem to get the remark, and it wasn’t even that long ago

    • David Gerard@awful.systemsOPM
      link
      fedilink
      English
      arrow-up
      26
      ·
      3 months ago

      I got AcausalRobotGPT to summarise your post and it said “I’m not saying it’s always programming.dev, but”

    • TexasDrunk@lemmy.world
      link
      fedilink
      English
      arrow-up
      1
      arrow-down
      10
      ·
      3 months ago

      I also use it for that pretty often. I always double check and usually it’s pretty good. Once in a great while it turns the summary into a complete shitshow but I always catch it on a reread, ask a second time, and it fixes things up. My biggest problem is that I’m dragged into too many useless meetings every week and this saves a ton of time over rereading entire transcripts and doing a poor job of summarizing because I have real work to get back to.

      I also use it as a rubber duck. It works pretty well if you tell it what it’s doing and tell it to ask questions.

  • beefbot@lemmy.blahaj.zone
    link
    fedilink
    English
    arrow-up
    7
    arrow-down
    4
    ·
    3 months ago

    Is it only me, or is the linked article not super long on details & is reaching a conclusion from 2 examples? This is important & I need to hear more, & I’m generally biased against AI at this point— but the article isn’t doing enough to convince me

    • self@awful.systems
      link
      fedilink
      English
      arrow-up
      14
      ·
      3 months ago

      did you click through to any of the inline citations? David’s shorter articles on pivot mostly gather and summarize those, so if you need to read the original research and its conclusions that’s where to go

  • Lvxferre@mander.xyz
    link
    fedilink
    English
    arrow-up
    8
    arrow-down
    12
    ·
    edit-2
    3 months ago

    You could use them to know what the text is about, and if it’s worth your reading time. In this situation, it’s fine if the AI makes shit up, as you aren’t reading its output for the information itself anyway; and the distinction between summary and shortened version becomes moot.

    However, here’s the catch. If the text is long enough to warrant the question “should I spend my time reading this?”, it should contain an introduction for that very purpose. In other words, if the text is well-written you don’t need this sort of “Gemini/ChatGPT, tell me what this text is about” in the first place.

    EDIT: I’m not addressing documents in this. My bad, I know. [In my defence I’m reading shit on a screen the size of an ant.]

    • queermunist she/her
      link
      fedilink
      English
      arrow-up
      19
      ·
      edit-2
      3 months ago

      ChatGPT gives you a bad summary full of hallucinations and, as a result, you choose not to read the text based on that summary.

      • Lvxferre@mander.xyz
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        7
        ·
        3 months ago

        (For clarity I’ll re-emphasise that my top comment is the result of misreading the word “documents” out, so I’m speaking on general grounds about AI “summaries”, not just about AI “summaries” of documents.)

        The key here is that the LLM is likely to hallucinate the claims of the text being shortened, but not the topic. So provided that you care about the latter but not the former, in order to decide if you’re going to read the whole thing, it’s good enough.

        And that is useful in a few situations. For example, if you have a metaphorical pile of a hundred or so scientific papers, and you only need the ones about a specific topic (like “Indo-European urheimat” or “Argiope spiders” or “banana bonds”).

        That backtracks to the OP. The issue with using AI summaries for documents is that you typically know the topic at hand, and you want the content instead. That’s bad because then the hallucinations won’t be “harmless”.

        • queermunist she/her
          link
          fedilink
          English
          arrow-up
          12
          ·
          3 months ago

          But the claims of the text are often why you read it in the first place! If you have a hundred scientific papers you’re going to read the ones that make claims either supporting or contradicting your research.

          You might as well just skim the titles and guess.

          • Lvxferre@mander.xyz
            link
            fedilink
            English
            arrow-up
            2
            arrow-down
            10
            ·
            3 months ago

            “But the claims of the text are often why you read it in the first place!”

            By “not caring about the former” [claims], I mean in the LLM output, because you know that the LLM will fuck them up. But it’ll still somewhat accurately represent the topic of the text, and you can use this to your advantage.

            “You might as well just skim the titles and guess.”

            Nirvana fallacy.

            • self@awful.systems
              link
              fedilink
              English
              arrow-up
              16
              ·
              3 months ago

              not reading the fucking sidebar and thinking this is high school debate club fallacy

              • Lvxferre@mander.xyz
                link
                fedilink
                English
                arrow-up
                1
                arrow-down
                9
                ·
                3 months ago

                “not reading the fucking sidebar”

                Yeah, I get that this is a place to vent. And I get why to vent about this. LLMs and other A"I" systems (with quotation marks because this shite is not intelligent!) are being shoved down every bloody where, regardless of actual usefulness, safety, or user desire. Telling you to put glue on your pizza, to eat poisonous mushrooms, that “cherish” has five letters, that Latin had no [w], that the Chinese are inferior to Westerners.

                While a crowd of irrationals tell you “it is intelligent, you can’t prove otherwise! CHRUST IT YOU DIRTY SCEPTIC/INFIDEL/LUDDITE REEEE! LALALA I’M PRETENDING TO NOT SEE THE HALLUCINATION LALALA”.

                I also get the privacy nightmare that this shit is. And the whole deal behind “we’re using your content as training data, and then selling the result back to you”. Or that it’s eating electricity like there’s no tomorrow, in a planet where global warming is a present issue.

                I get it. I get it all. That’s why I’m here. And if you (or anyone else) think that I’m here for any other reason, by all means, check my profile - you’ll find plenty of pieces of criticism against those stupid corporate AI takes from vulture capital. (And plenty of instances of me calling HN “Redditors LARPing as Hax0rz”.)

                However. Pretending that there’s no use case ever for LLMs is the wrong way to go.

                “and thinking this is high school debate club fallacy”

                If calling it “nirvana fallacy” rubs you the wrong way, here’s an alternative: “this argument is fucking stupid, in a very specific way: it pretends that either something is perfect or it’s useless, with no middle ground.”

                The other user however does not deserve the unnecessary abrasiveness so I’ll keep simply calling it “nirvana fallacy”.

                • self@awful.systems
                  link
                  fedilink
                  English
                  arrow-up
                  9
                  ·
                  3 months ago

                  holy shit, imagine getting a second chance to not be a fucking debatelord and doubling down this hard

                  off you fuck

                • froztbyte@awful.systems
                  link
                  fedilink
                  English
                  arrow-up
                  6
                  ·
                  edit-2
                  3 months ago

                  “this argument”

                  I agree, you’re quite right, and I thank you for taking the time and putting in the effort on such a wonderfully thorough portrayal of why your argument is total horseshit

            • queermunist she/her
              link
              fedilink
              English
              arrow-up
              12
              ·
              3 months ago

              Unless it doesn’t accurately represent the topic, which happens, and then a researcher chooses not to read the text based on the chatbot’s summary.

              “Nirvana fallacy.”

              All these chatbots do is guess. I’m just saying a researcher might as well cut out the hallucinating middleman.

              • Lvxferre@mander.xyz
                link
                fedilink
                English
                arrow-up
                1
                ·
                3 months ago

                “All these chatbots do is guess.”

                To call it “guessing” overestimates its abilities. It’s doing something even dumber - picking words and throwing them into a grammatically consistent whole, with barely any regard to meaning.

                But sometimes dumb shit is still useful. Just like my bash scripts.

                “I’m just saying a researcher might as well cut out the hallucinating middleman.”

                That’s why I called it a “nirvana fallacy” - it does hallucinate, so it is not perfect, nor will it ever be (contrary to what tech bros want you to have “faith”, i.e. brainlessness, towards). But by “cutting out the hallucinating middleman” you’re probably ignoring a lot of articles that might be useful for your research. Yes, ignoring them - because if you’re going by title alone you won’t read them, as it’s too much stuff to bother.

                I’ll give you a practical example. Suppose for a moment that you need info on the evolution of Hittite and other Anatolian languages. Based on the title alone (as you proposed in an earlier comment), would this article be useful? Probably not - the title doesn’t mention “Anatolian”, “Hittite”, “Luwian”, anything like that. And the article is 30 pages long so might as well skip it.

                Now throw “Summarise https://allegatifac.unipv.it/silvialuraghi/Gender FoL.pdf into three paragraphs.” into Gemini.

                Here's the output:

                The PDF you linked discusses the origin of the gender system in Proto-Indo-European (PIE), the ancestral language to many European and Asian languages. In the early 20th century, all known Indo-European languages seemed to have a three-gender system, including feminine. However, the discovery of Hittite, a language with only two genders and no clear feminine pronouns, challenged this theory.

                The paper explores two opposing viewpoints that emerged among Hittitologists and Indo-Europeanists. The first, called the Schwundhypothese, suggests that Hittite lost its feminine gender over time. The other, the Herkunfthypothese, proposes that PIE itself only had a two-gender system, and the feminine arose later in some descendant languages.

                The debate continues, with the unearthed evidence from Hittite sparking more questions than answers. The paper delves into the typological considerations of these gender systems, but a definitive explanation for the origin of the Proto-Indo-European gender system remains elusive.

                Can you trust the claims within that output? Fuck no, Gemini is likely losing its marbles. (For example, it implies that early PIE had masculine vs. neuter - not quite.) But it mentions Hittite enough times to make you correctly conclude that it’s a topic that the article talks about, and perhaps you should be reading those thirty pages instead of ignoring them.

                [Sorry for the fairly specific example. I wanted something that I could talk about, so I used an article that I know by heart about a topic that I know something about.]

                • queermunist she/her
                  link
                  fedilink
                  English
                  arrow-up
                  1
                  ·
                  3 months ago

                  It just so happened to work out in your very specific example.

                  Do you think it will always give a summary that is useful to researchers? It obviously won’t! It will, often, work just fine. When sorting through hundreds of documents, though, it will produce sorting errors and that will cause researchers to dismiss important documents by accident.

                  Maybe that’s fine. Maybe having the occasional error in the summary, which causes the researcher to dismiss the paper erroneously, is better than just guessing based on the title.

                  But that’s a lot of power to put in the hands of a dumbass chatbot.

    • David Gerard@awful.systemsOPM
      link
      fedilink
      English
      arrow-up
      19
      ·
      3 months ago

      Both the use cases here are government documents. I’m baffled at the idea of it being “fine if the AI makes shit up”.

      • Lvxferre@mander.xyz
        link
        fedilink
        English
        arrow-up
        3
        arrow-down
        1
        ·
        3 months ago

        No, it’s just rambling. My bad.

        I focused too much on using AI to summarise and ended up not talking about it summarising documents, even if the text is about the latter.

        And… well, the latter is such a dumb idea that I don’t feel like telling people “the text is right, don’t do that”, it’s obvious.

    • V0ldek@awful.systems
      link
      fedilink
      English
      arrow-up
      6
      ·
      3 months ago

      “if the text is well-written you don’t need this sort of ‘Gemini/ChatGPT, tell me what this text is about’ on first place.”

      And if it’s badly written then the LLM will shit itself.

      Now let’s ask ourselves how much of the text in the world is “well-written”?

      Or even better, you could apply this to Copilot. How much code in the world is good code? The answer is fucking none, mate.

  • z00s@lemmy.world
    link
    fedilink
    English
    arrow-up
    6
    arrow-down
    12
    ·
    edit-2
    3 months ago

    The problem is not the LLMs, but what people are trying to do with them.

    They are currently spoons, but people are desperately wishing they were katanas.

    They work really well for soup, but they can’t cut steak. But they’re being hyped as super ninja steak knives, and people are getting pissed when they can’t cut steak.

    If you give them watery, soupy tasks they can do successfully, they can lighten your workload, as long as you’re aware of what they are and aren’t good at.

    What people want LLMs to be able to do, i.e. “Steak” tasks:

    • write complex documents

    • apply complex knowledge/rules to a situation

    • write complex code and create entire programs based on a vague description

    What LLMs can currently do, i.e. “Soup” tasks:

    • check this document and fix all spelling, punctuation and grammatical errors

    • summarise this paragraph as dot points

    • write a python program that sorts my photographs into folders based on the year they were taken
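
    For scale, the photo-sorting task looks roughly like this when written by hand. It is a minimal sketch under a loud assumption: it uses each file's modification time as a stand-in for "the year they were taken", since reading real EXIF capture dates needs a third-party library. The function name `sort_photos_by_year` and the suffix list are illustrative.

```python
import shutil
from datetime import datetime
from pathlib import Path

# Assumption: these extensions are what counts as a "photograph".
PHOTO_SUFFIXES = {".jpg", ".jpeg", ".png", ".heic"}


def sort_photos_by_year(source_dir, dest_dir):
    """Move photos from source_dir into dest_dir/<year>/ and return the new paths."""
    source, dest = Path(source_dir), Path(dest_dir)
    moved = []
    for path in sorted(source.iterdir()):
        if not path.is_file() or path.suffix.lower() not in PHOTO_SUFFIXES:
            continue
        # Assumption: modification time approximates the capture date.
        year = datetime.fromtimestamp(path.stat().st_mtime).year
        target_dir = dest / str(year)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / path.name
        if target.exists():
            continue  # skip rather than overwrite an existing file
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


# Example call (paths are placeholders):
# sort_photos_by_year("unsorted_photos", "photos_by_year")
```

    Even at this size there are judgment calls (which date to trust, what to do on name collisions) that someone has to check.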

    Half of Lemmy is hyping katanas, the other half is yelling “Why won’t my spoon cut this steak?!! AI is so dumb!!!”

    Update: wow, the pure vitriol pouring out of the replies is just stunning. Seems there are a lot of you out there who have, in one way or another, tied your ego very strongly to either the success or failure of AI.

    Take a step back, friends, and go outside for a while.

    • V0ldek@awful.systems
      link
      fedilink
      English
      arrow-up
      17
      ·
      3 months ago

      “What LLMs can currently do: summarise this paragraph as dot points”

      The entire point here is that they can’t?

      • fuzzzerd@programming.dev
        link
        fedilink
        English
        arrow-up
        1
        arrow-down
        13
        ·
        3 months ago

        Clearly this post is about LLMs not succeeding at this task, but anecdotally I’ve seen it work OK and also fail. Just like humans, who are the benchmark, except that the LLMs are faster.

        • self@awful.systems
          link
          fedilink
          English
          arrow-up
          7
          ·
          3 months ago

          humans are clearly faster at generating utterly banal shit, as proven by your posts in this thread

    • self@awful.systems
      link
      fedilink
      English
      arrow-up
      14
      ·
      3 months ago

      they don’t do any of that soup shit reliably either and reading the article might have told you that

    • blakestacey@awful.systems
      link
      fedilink
      English
      arrow-up
      11
      ·
      3 months ago

      I’d offer congratulations on obfuscating a bad claim with a poor analogy, but you didn’t even do that very well.

    • istewart@awful.systems
      link
      fedilink
      English
      arrow-up
      10
      ·
      3 months ago

      Why did this immediately give me a flashback to Donald Trump yelling, “when it comes to great steaks, I’ve just raised the stakes!”?

    • FredFig@awful.systems
      link
      fedilink
      English
      arrow-up
      9
      ·
      edit-2
      3 months ago

      “Food analogy”

      This level of discourse wouldn’t fly on 4chan, how is it so popular with LLM fans?

      • David Gerard@awful.systemsOPM
        link
        fedilink
        English
        arrow-up
        9
        ·
        3 months ago

        needs to be a car analogy

        • What people want LLMs to do, i.e. Corvette tasks
        • What LLMs actually do, i.e. Trabant tasks

        • self@awful.systems
          link
          fedilink
          English
          arrow-up
          6
          ·
          3 months ago

          “What LLMs actually do, i.e. Trabant tasks”

          more of a Power Wheels Barbie Jeep whose battery got left out in the sun too long, but I’ll allow it

    • froztbyte@awful.systems
      link
      fedilink
      English
      arrow-up
      9
      ·
      3 months ago

      good god this entire post is the most tortured believer whataboutism I’ve encountered this month and there’s extremely strong competition here

      “are currently spoons, but people are desperately wishing they were katanas”

      “ie. ‘Steak’ tasks”

      you should make a youtube channel, The Katana Steak-Eater. I’d watch the shit out of that at least one saturday afternoon

    • self@awful.systems
      link
      fedilink
      English
      arrow-up
      18
      ·
      3 months ago

      your post history tells me you’re pretty fucking comfortable with pointless nonsense

  • lightnsfw@reddthat.com
    link
    fedilink
    English
    arrow-up
    6
    arrow-down
    15
    ·
    3 months ago

    Ok? I don’t have another human available to skim a shitload of documents for me to find answers I need and I don’t have time to do it myself. AI is my best option.

    • s3p5r@lemm.ee
      link
      fedilink
      English
      arrow-up
      26
      ·
      3 months ago

      So long as you don’t care about whether they’re the right or relevant answers, you do you, I guess. Did you use AI to read the linked post too?

      • jaemo@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        6
        arrow-down
        14
        ·
        3 months ago

        Yep. Go ahead and ignore all the cases where it’s getting answers correct and actually helping. We’re all just hallucinating, it’s in no way my lived experience. Your reality is the prime reality and we’re the NPCs.

      • lightnsfw@reddthat.com
        link
        fedilink
        English
        arrow-up
        1
        arrow-down
        15
        ·
        3 months ago

        I didn’t read the post at all because its premise is irrelevant to my situation. If I had another human to read documentation for me I would do that. I don’t so the next best thing is AI. I have to double check its findings but it gets me 95% of the way there and saves hours of work. It’s a useful tool.

        • ebu@awful.systems
          link
          fedilink
          English
          arrow-up
          25
          ·
          3 months ago

          “I didn’t read the post at all”

          rather refreshing to have someone come out and just say it. thank you for the chuckle

          • self@awful.systems
            link
            fedilink
            English
            arrow-up
            17
            ·
            3 months ago

            we really do need “my source is that I made it the fuck up” for people who aggressively don’t want to read any of the text they’re allegedly commenting on

        • V0ldek@awful.systems
          link
          fedilink
          English
          arrow-up
          10
          ·
          3 months ago

          This is hall of fame shit right here, someone should study the way you use the internet sir

  • Scary le Poo@beehaw.org
    link
    fedilink
    English
    arrow-up
    2
    arrow-down
    19
    ·
    3 months ago

    I keep having to remind people: ChatGPT is only as good as the prompt you give it. I am astounded at the amount of garbage that some people get, but I also know that it’s generally because their prompts are garbage.

    Sometimes its output sucks, even with good input. But likely, if the output is bad, the input was bad.