• JeeBaiChow@lemmy.world
    1 month ago

    So it generates images that emulate the look of Minecraft?

    I’ve often postulated that in order to describe a reasonably complex system such that an AI is able to generate working code for it, you are effectively writing a detailed spec for the system. You might as well be coding the entire system in English. If you believe that coding languages are essentially shorthand English, well, you see the problem.

  • zaza [she/they/her]
    1 month ago

    the demo is really trippy - very dreamlike - obvs not an actual game but interesting to think what could be achieved if it learned about object permanence for example

  • Goun
    1 month ago

    Can someone explain what this actually is? It’s a Python script that generates… screenshots? I don’t get it

    • Sphks@lemmy.dbzer0.com
      1 month ago

      It is based on image generators like Dall-e and others (more precisely, video generators like Sora). AI-based image generators take an input, like random noise, and try to fill the gaps according to some direction (usually a text like “a cat playing saxophone”). The AI has been taught what cats look like, what saxophones look like, and what playing saxophone looks like.

      Here, the AI has been taught what Minecraft’s first-person view looks like, from hours and hours of video of someone playing, maybe bots.

      Now, if you press the forward arrow, zoom the picture by spreading the pixels outward from the center of the screen. That leaves blanks between these pixels. Have the AI fill the blanks with what it thinks Minecraft should look like. Repeat for each frame and you can go forward. Do similar things for the other commands (turn left, jump…). This way you can explore the world infinitely and the AI invents the world in real time.
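
      To make the "spread the pixels, then fill the blanks" idea concrete, here is a toy sketch in plain Python. It is only my mental model, not how the actual project works: `spread_from_center`, `fill_blanks` and `step_forward` are names I made up, the frame is a small grayscale grid, and the "AI" is replaced by a dumb stand-in that paints every blank with the average of the known pixels.

```python
def spread_from_center(frame, scale=1.25):
    """Push every pixel away from the center; untouched cells stay None (blank)."""
    h, w = len(frame), len(frame[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny = round(cy + (y - cy) * scale)
            nx = round(cx + (x - cx) * scale)
            if 0 <= ny < h and 0 <= nx < w:  # pixels pushed off-screen are dropped
                out[ny][nx] = frame[y][x]
    return out


def fill_blanks(frame):
    """Stand-in for the model (assumption): fill blanks with the mean known value."""
    known = [p for row in frame for p in row if p is not None]
    guess = sum(known) / len(known)
    return [[guess if p is None else p for p in row] for row in frame]


def step_forward(frame):
    """One 'forward' key press: zoom by spreading, then let the 'model' fill gaps."""
    return fill_blanks(spread_from_center(frame))


# Hypothetical 8x8 grayscale frame, just to have something to step through.
frame = [[float(x + y) for x in range(8)] for y in range(8)]
next_frame = step_forward(frame)
```

      In the real thing the fill step would be the trained generator conditioned on the previous frames and the key pressed, which is where all the Minecraft-looking detail comes from.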

      I have not looked at the details, but I think the issue is that there is no memory of the world other than what you see on the screen. If you look to the left you see something; look to the right, then look to the left again, and you see a different world. Edit: yeah, that’s an issue shown in the article.
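
      The memory problem can be shown with an even smaller toy, again purely illustrative and not taken from the project. The view is a strip of four numbers, `turn` is a made-up helper, and the "model" just invents a random value for whatever column comes into view. Since the only state is the current view, whatever scrolls off-screen is gone for good.

```python
import random


def turn(view, direction, rng):
    """Scroll the view by one column and invent the newly exposed column."""
    invented = rng.randint(100, 999)  # stand-in for the model making something up
    if direction == "right":
        return view[1:] + [invented]
    return [invented] + view[:-1]


rng = random.Random(0)
view = [1, 2, 3, 4]

# Look right, then back left: the column that scrolled off is regenerated,
# not remembered, because nothing outside the current view is stored.
looked_right = turn(view, "right", rng)
back_left = turn(looked_right, "left", rng)
```

      The columns that stayed on screen survive the round trip, but the leftmost one comes back as a fresh invention instead of the original `1`.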