• AutoTL;DR (bot)
    English
    12 months ago

    This is the best summary I could come up with:


    Alas, GPT-4, a large language model from Microsoft-backed OpenAI, lacks the capacity to execute DOOM’s source code directly.

    But its multimodal variant, GPT-4V, which can accept images as input as well as text, exhibits the same endearing sub-competence playing DOOM as the fraught text-based models that have launched countless AI startups.

    Interactions are handled through a Manager layer consisting of an open source Python binding to the C Doom engine running on Matplotlib.
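
A rough sketch of what such a Manager loop might look like, purely for illustration: the summary gives no API details, so `StubEngine`, `stub_model`, `run_manager`, and the action names below are all hypothetical stand-ins, not the actual binding or the model's real interface.

```python
# Hypothetical action names; the real key bindings are not given in the summary.
ACTIONS = ("MOVE_FORWARD", "TURN_LEFT", "TURN_RIGHT", "FIRE")


class StubEngine:
    """Stand-in for the Python binding to the Doom engine (assumed interface)."""

    def __init__(self, max_steps=5):
        self.steps = 0
        self.max_steps = max_steps
        self.log = []

    def screenshot(self):
        # A real binding would return pixel data; here it is just a labelled frame.
        return f"frame-{self.steps}".encode()

    def send(self, action):
        # Record the action and advance the fake game by one step.
        self.log.append(action)
        self.steps += 1

    def done(self):
        return self.steps >= self.max_steps


def stub_model(frame):
    """Stand-in for the multimodal model call; this one always fires."""
    return "FIRE"


def run_manager(engine, choose_action, fallback="MOVE_FORWARD"):
    """Grab a frame, ask the model for an action, forward it to the engine."""
    while not engine.done():
        frame = engine.screenshot()
        action = choose_action(frame)
        if action not in ACTIONS:
            # Unrecognised model output is replaced with a safe default.
            action = fallback
        engine.send(action)
    return engine.log
```

Calling `run_manager(StubEngine(max_steps=3), stub_model)` returns the list of actions sent to the engine. Swapping `stub_model` for a function that actually queries a vision model would be the only change needed in this sketch.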

    “For example, it would be very common for the model to see a zombie on the screen, and start firing at it until it hit it (or died),” explains de Wynter.

    When asked to explain its actions, which were generally correct in context, the model gave poor explanations that often included hallucinations (aka incorrect information).

    “So, while this is a very interesting exploration around planning and reasoning, and could have applications in automated video game testing, it is quite obvious that this model is not aware of what it is doing.”


    The original article contains 717 words, the summary contains 163 words. Saved 77%. I’m a bot and I’m open source!