• kayaven@lemmy.world
    arrow-up
    16
    ·
    4 months ago

    I’m curious whether you could give an image like this to an AI that supports image recognition, like ChatGPT-4, and ask it to solve it for you.

    • moonlight@fedia.io
      arrow-up
      15
      ·
      4 months ago

      This is actually a really interesting question. A modern LLM probably couldn’t do it, but I wonder if something like AlphaZero could?

      My guess is that no current AI is capable, as it requires abstract reasoning and precise movement. But maybe in the next 5 years.

        • Prunebutt@slrpnk.net
          arrow-up
          9
          ·
          4 months ago

          I don’t think that an LLM could do it. But the mechanics of Baba Is You should be in the training set, since it’s a relatively well-known indie game.

          • rockkicker@kbin.run
            arrow-up
            7
            ·
            4 months ago

            I feel like the amount of data required to train any neural network would be larger than all the levels that currently exist for Baba Is You.

            You’d probably just end up overfitting the hell out of your model.

              • rockkicker@kbin.run
                arrow-up
                3
                ·
                4 months ago

                That would require an LLM then, but multiple full walkthroughs are also explained in text on the internet, so how would you be sure it was figuring stuff out by itself?

                • Prunebutt@slrpnk.net
                  arrow-up
                  8
                  ·
                  4 months ago

                  As I said: I don’t think an LLM could do it (since LLMs can’t reason). Just saying that it wouldn’t have to deduce the mechanics from a single screenshot.

                  • rockkicker@kbin.run
                    arrow-up
                    2
                    ·
                    4 months ago

                    I’m saying that if you’re attempting to parse the mechanics of play by shoving in the whole internet and saying “well, the instructions are in there somewhere,” then the best tool for that is an LLM.