• EatATaco@lemm.ee
    English
    arrow-up
    5
    ·
    2 months ago

    I listened to a podcast (This American Life, IIRC) where some researchers were talking about their efforts to determine whether or not AI could reason. One test they did was asking it to stack a random set of items in a stable way (a set it wouldn’t have come across in any data set: a plank of wood, 12 eggs, a book, a bottle, and a nail... probably some other things too). ChatGPT 3 basically just said to put one object on top of another (as you would expect from a pure text predictor), and there’s no way it would have been stable.

    However, GPT-4 basically said to put the wood down, place the eggs in a 3 x 4 grid with the book on top (to stop them from rolling away), then put the bottle on top of that, and finally the nail (even noting you have to put the head side down because you couldn’t make it stable with the point down). It was certainly something that could work, and it was a novel solution.

    Now I’m not saying this proves it can think, but I think this “well it’s just a text predictor” line kind of hand-waves away the question. It also raises a question of its own: based on how often I hear people parroting the exact same arguments against AI thinking, how much are we simply “text predictors” ourselves?

    • Buddahriffic@lemmy.world
      English
      arrow-up
      2
      ·
      2 months ago

      The sheer size of it and its training data makes it hard to really say what it’s doing. Like for an object that it wouldn’t have come across in its training data, a) how could they tell it was truly a new thing that had never been discussed anywhere on the internet where the training could have consumed it, and b) how could they tell that any description provided for it didn’t map it to another object that would behave similarly when stacking?

      Stacking things isn’t a novel problem. The internet will have many examples of people talking about stacking (including this one here, eventually). The “put the flat part down” advice for the nail could have been a direct quote, even. Putting a plank of wood at the bottom would be pretty common, and even the eggs and book thing has probably been discussed before.

      I mean, I can’t dismiss the possibility that it is doing something more complex, but examples like that don’t convince me that it is. It is capable of very impressive things, and even if it needs to regurgitate every answer it gives, few problems we want to solve day to day are truly novel, so regurgitating previous discussions plus a massive set of associations means that it can map a pretty large problem space to a large solution space with high accuracy.

      I’m having trouble thinking of ways to even determine if it can really problem solve that won’t accidentally map to some similar discussion among nerds that like to go into incredible detail and are willing to speculate in any direction just for the sake of enjoying a thought experiment.

      Like even known or suspected unsolvable problems have been discussed to greater levels of detail than I’ve likely considered them, so even asking it to do its best trying to solve the traveling salesman problem in polynomial time would likely impress me because computer science students and alums much smarter than I am have discussed it at length.

      • EatATaco@lemm.ee
        English
        arrow-up
        3
        ·
        2 months ago

        how could they tell it was truly a new thing

        Sure, there is a chance the exact question had been asked before, and answered, but we are talking remote possibilities here.

        that any description provided for it didn’t map it to another object that would behave similarly when stacking.

        If it has to say ‘this item is like that other item and thus I can use what I’ve learned about stacking that other item to stack this item’ then I would absolutely argue that it is reasoning and not just “predicting text” (or, again, predicting text might be the equivalent of reasoning).

        Stacking things isn’t a novel problem.

        Sure, stacking things is not a novel problem, which is why we have the word “stack”: it describes something we do. But stacking that list of things is (almost certainly) a novel problem. It’s just that you use what you’ve learned and apply that knowledge to this new problem. A non-novel problem is if I say “2 + 2 = 4” and then turn around and ask you “what does 2 + 2 equal?” If I then ask you “what’s 2 + 3?” (assuming you have no other data set), that is a novel problem, even if it’s been answered before.

        I mean, I can’t dismiss the possibility that it is doing something more complex, but examples like that don’t convince me that it is. It is capable of very impressive things, and even if it needs to regurgitate every answer it gives, few problems we want to solve day to day are truly novel, so regurgitating previous discussions plus a massive set of associations means that it can map a pretty large problem space to a large solution space with high accuracy.

        How are you convinced that humans are reasoning creatures? This honestly sounds like you could be describing 99.99% of human thought, meaning we almost never reason (if not actually never). Are we even reasonable?

        • Buddahriffic@lemmy.world
          English
          arrow-up
          1
          arrow-down
          1
          ·
          2 months ago

          I don’t mean it needs to be the exact question, just something with equivalence. If someone talked about stacking boards and other things, the board could go on the bottom and then maybe someone else talked about stacking balls and books that way, so it used that because “eggs” were associated with “round”. Follow up with the nail thing from another conversation.

          It’s definitely a form of intelligence, but I don’t think it’s anywhere close to 99.9% of human thought. I think it’s missing entire dimensions of thought.

          • EatATaco@lemm.ee
            English
            arrow-up
            2
            ·
            2 months ago

            I’m not saying it’s 99.9% of human intelligence, I’m saying you’re describing 99.9% of human thought.

            This is what humans do: we hear about one thing and then we learn how to apply it to another. You even mention “stacking balls” here, and then make the connection that eggs are also round and would need to be stacked in the same way to prevent rolling. This is reasoning: using what you’ve learned and applying it to a novel problem.

            What you are describing as novel problems are really just doing the same thing at a completely different level. Like I play soccer, but no matter how much I trained, there is no way I would ever reach Messi’s skill, because he was just born with special skill in that area, but still just human like the rest of us.

            And remember, I’m mostly just pointing to the “text predictor” claim. I’m not convinced it’s not one, and I think that claim appeared true for early models, but it’s not so easy to apply to current models.

            • Buddahriffic@lemmy.world
              English
              arrow-up
              2
              ·
              2 months ago

              Yeah, it is hard to say if the “glorified text predictor” is completely accurate, since the sheer size of the model allows for some pretty deep connections.

              And, thinking about it since making that post, it’s hard to say for sure that even Einstein or Newton were doing anything differently or were just the first/most famous to put those particular things together.

              • EatATaco@lemm.ee
                English
                arrow-up
                2
                ·
                2 months ago

                It’s a weird world and cool to think about. Thanks for the civil and interesting discussion.

                  • EatATaco@lemm.ee
                    English
                    arrow-up
                    2
                    ·
                    2 months ago

                    IMO, one of the best QoL updates for Lemmy is to make the votes invisible.

      • Eheran@lemmy.world
        English
        arrow-up
        2
        ·
        2 months ago

        So every human that does not come up with something entirely new that has never existed before is not intelligent? Are people with an IQ of 80 not intelligent anymore, just bio-machines? Seriously, where do you draw the line? You keep shifting the goal to harder and harder things, to the point where most people would not qualify anymore. When GPT-5 then also does that, what will you say? That it did not invent the car? That it did not come up with relativistic effects?

        • Buddahriffic@lemmy.world
          English
          arrow-up
          1
          arrow-down
          1
          ·
          2 months ago

          I don’t mean to shift the goalposts so much as better specify them.

          Ultimately, the sign I’m looking for to confirm we have true AI is the technological singularity when AI is able to iterate on itself (both software and hardware) and improve itself better than humans can, at an accelerating rate.

          If AI ever gets to the point where we are, it will quickly surpass us just due to the way it improves and scales up vs how we do.

          As long as they can’t do that, they are still missing something. They are good at what they do, returning in seconds an essay answer to any question that is accurate more often than not (depending on the question), but there are parts of our circle in the Venn diagram of capabilities that no AIs overlap with… Yet.

          I wouldn’t be surprised to see it before I die though, because I think the circle of what’s possible with AI that we haven’t done yet surrounds our own circle entirely, at least until we connect our brains with theirs and transcend or something.