“AI’s Ostensible Emergent Abilities Are a Mirage” paper won the Outstanding Paper Award at NeurIPS 2023

ylai · edited 1 year ago

keepthepace@slrpnk.net · 1 year ago

Note: the paper’s actual title ends in a question mark, and its “Discussion” section states:

We emphasize that nothing in this paper should be interpreted as claiming that large language models cannot display emergent abilities; rather, our message is that previously claimed emergent abilities in [3, 8, 28, 33] might likely be a mirage induced by researcher analyses.

It is clear to anyone who has used them and understands the task they were trained on that LLMs do have emergent abilities. This paper refutes specific claims made in other papers, which the authors argue use inappropriate metrics that show “sudden” emergence rather than a “smooth” one.

ylai · edited 1 year ago

To clarify: the authors/Stanford used this exact declarative (non-question) title for their press release: https://hai.stanford.edu/news/ais-ostensible-emergent-abilities-are-mirage, which also ended up being the title of the previous post on !artificial_intel@lemmy.ml. As already noted by @huginn@feddit.it, the “AI’s Ostensible” title is well in line with the paper’s actual conclusion, which refutes current claims of emergence. I picked that title because it comes from the authors/their employer, for clarity (especially when quoted inside a larger Lemmy post title), and for continuity with the previous post.

It is clear to anyone who has used them and understands the task they were trained on […]

Yet where is the proof? This is exactly the kind of wishy-washy, unsubstantiated claim that this paper investigated and refuted.

[…] that LLMs do have emergent abilities.

I think you really should not drop the sentence immediately preceding your quite selective quote; the authors emphasized it for good reason:

Ergo, emergent abilities may be creations of the researcher’s choices, not a fundamental property of the model family on the specific task.

So regarding “emergent abilities,” the authors quite clearly argue from their analysis that, if anything, it is cherry-picked metrics that carry these “emergent abilities,” not LLMs.

huginn@feddit.it · 1 year ago

This paper is a precise refutation of all current claims of emergence, attributing them to nothing more than bad measurements.

It is clear to anyone who has used them and understands the task they were trained on that LLMs do have emergent abilities.

Not by the definition in this paper, they don’t. They show linear improvement, which is not emergent. The definition used is:

As the complexity of a system increases, new properties may materialize that cannot be predicted even from a precise quantitative understanding of the system’s microscopic details.

The capabilities displayed by LLMs all fall on a linear progression when you use the correct measures. That is antithetical to emergent behavior.

Again: that does not preclude emergence in the future, but it strongly refutes present claims of emergence.
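
To illustrate the measurement argument, here is a toy sketch, not code or data from the paper. It assumes a made-up smooth scaling curve for per-token accuracy (`per_token_accuracy`, with hypothetical constants) and assumes tokens are scored independently, so that an all-or-nothing exact-match score over a 30-token answer is simply the per-token accuracy raised to the 30th power. The function names, the curve, and the answer length are all assumptions chosen for illustration.

```python
# Toy illustration: a smooth per-token metric vs. an all-or-nothing metric.
# Every constant here is hypothetical; nothing is fitted to a real model family.

def per_token_accuracy(params: float) -> float:
    """Assumed smooth curve: accuracy rises gradually with parameter count."""
    return 1.0 - 0.5 * (params / 1e7) ** -0.25


def exact_match(params: float, length: int = 30) -> float:
    """Probability that all `length` tokens are correct, assuming independence."""
    return per_token_accuracy(params) ** length


# Hypothetical model sizes, one per order of magnitude.
SCALES = [1e7, 1e8, 1e9, 1e10, 1e11]

if __name__ == "__main__":
    print("params   per-token   exact-match")
    for n in SCALES:
        print(f"{n:.0e}    {per_token_accuracy(n):.3f}      {exact_match(n):.6f}")
```

In this sketch the underlying model improves gradually at every step, yet the exact-match column stays near zero for the smaller sizes and only becomes visible at the largest ones. Read off that column alone, the ability seems to appear suddenly, although the jump comes from the choice of metric rather than from any change in how the model improves.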

keepthepace@slrpnk.net · 1 year ago

That’s a weird definition. Is it a widely used one? To me, emergence meant acquiring capabilities not specifically trained for. I don’t see why it matters whether they appear suddenly or linearly. I guess that’s an argument in safety discussions?

jacksilver@lemmy.world · 1 year ago

That definition is based on how the paper approached it and seems to be a generally accepted definition. I’ve only read a bit of the paper, but it seems to highlight that how we’ve been evaluating LLMs says a lot more about their emergent capabilities than any actual outcome does.

huginn@feddit.it · 1 year ago

Not only that, but it’s the definition used by every single researcher claiming “emergent behavior.”

keepthepace@slrpnk.net · 1 year ago

Ok thanks.

Mahlzeit@feddit.de · 1 year ago

It’s not the definition in the paper. Here is the context:

The idea of emergence was popularized by Nobel Prize-winning physicist P.W. Anderson’s “More Is Different”, which argues that as the complexity of a system increases, new properties may materialize that cannot be predicted even from a precise quantitative understanding of the system’s microscopic details.

This means that we cannot, for example, predict chemistry from physics. Physics studies how atoms interact, which yields important insights for chemistry, but physics cannot be used to predict, say, the periodic table of elements. Each level has its own laws, which must be derived empirically.

LLMs obviously show emergence. Knowing the mathematical, technological, and algorithmic foundations tells you little about how to use (prompt, train, …) an AI model, just as knowing cell biology will not help you interact with people, even if they are only colonies of cells working together.

The paper talks specifically about “emergent abilities of LLMs”:

The term “emergent abilities of LLMs” was recently and crisply defined as “abilities that are not present in smaller-scale models but are present in large-scale models; thus they cannot be predicted by simply extrapolating the performance improvements on smaller-scale models”

The authors further clarify:

In this paper, […] we specifically mean sharp and unpredictable changes in model outputs as a function of model scale on specific tasks.

Bigger models perform better. An increase in the number of parameters correlates with an increase in performance on tests. It had been alleged that some abilities appear suddenly, for no apparent reason. These “emergent abilities of LLMs” are a very specific kind of emergence.

kaffiene@lemmy.world · edited 1 year ago

That was my feeling reading the paper. I feel that LLMs are overhyped, but linear vs. superlinear growth in metrics is a separate issue and can’t refute what has traditionally been thought of as emergent properties. In other words, this is refutation by redefinition.

huginn@feddit.it · 1 year ago

Arxiv link

AernaLingus [any]@hexbear.net · 1 year ago

Presentation of the paper by one of its authors

ThatWeirdGuy1001@lemmy.world · 1 year ago

I just wish people stopped calling these things AI.

As far as I’m aware, there’s a difference between true AI and what these things are, which would be VI (virtual intelligence).

AI can think entirely on its own, whereas a VI is just selecting from a set list of information with no real thought behind it other than basic computations.

With all that said I could be wrong and would love for someone to explain it to me.

dovah@lemmy.world · 1 year ago

In computer science, AI refers to any machine (most likely software) that is capable of doing tasks that only an intelligent being could. Algorithms for search problems (such as pathfinding), linear regression, and perception are all said to be AI. I believe what you are referring to as “true AI” is known as AGI (artificial general intelligence).

frezik@midwest.social · 1 year ago

The history of AI research doesn’t go that way. It tends to be about pushing computers to do things they currently can’t. Chess was once a major focus. Then an AI could beat a specific grandmaster in a six-game match, and then any grandmaster, and then AIs became so good that no human stands a chance. Then it wasn’t really AI anymore, but a piece of software you can run on a laptop.

Getting to a machine that thinks at a human level is an aspirational goal, one where we create a bunch of useful tools along the way.

kaffiene@lemmy.world · 1 year ago

Historically, AI has been the term for algorithms much, much simpler than LLMs. I agree that it’s not the best term, but that’s what we’ve got and have had since the 60s, so… probably not changing it now. There is a term for the distinction you appear to want to draw: AGI (artificial general intelligence). LLMs are AIs but not AGIs.

Mahlzeit@feddit.de · 1 year ago

You were never into video games, right? The reason I ask, is because games use a lot of AI. One might see “AI” in the game settings, or if the game has some editing tool/level builder/ … one might see it there. If one takes an interest, one might pick up on people talking about the AI of one game or another.

I am always surprised when I hear people say that LLMs are too simple to be real AI, because I’d think that most people who grew up in the last ~20 years would have interacted a lot with these much simpler game AIs. I would have thought that this knowledge would diffuse to parents and peers.

Non-rhetorical question: Any idea why that didn’t happen?

ylai · edited 1 year ago

My impression is that game AI (and I mean in FPS games, not board games) was not considered serious AI in the computer science sense. Most game AIs to this day are “cheating” in the sense that they are not end-to-end (i.e. they cannot operate from screen capture and instead rely on engine information), and they often need additional advantages to hold their ground. For example, virtually all of these FPS game AIs are quite useless once you actually want to interface them with some form of robotics and do open-world exploration. So game AI is somewhat separate from the public’s obsession with the term AI, which suddenly turned nit-picky and goalpost-moving once AI became performant on end-to-end tasks.

The Wikipedia article AI effect (not super-polished) has many good references where people discussed how this is related to anthropocentrism, and people can also be very pushy with that view in the context of animal cognition:

Michael Kearns suggests that “people subconsciously are trying to preserve for themselves some special role in the universe”.[20] By discounting artificial intelligence people can continue to feel unique and special. Kearns argues that the change in perception known as the AI effect can be traced to the mystery being removed from the system. In being able to trace the cause of events implies that it’s a form of automation rather than intelligence.

A related effect has been noted in the history of animal cognition and in consciousness studies, where every time a capacity formerly thought as uniquely human is discovered in animals, (e.g. the ability to make tools, or passing the mirror test), the overall importance of that capacity is deprecated.[citation needed]

Note that there is also a similar effect, not explicitly discussed by that article, where people like to depict ancient societies as dumber than they actually were (e.g. the now-discredited notion of the “Dark Ages”).

Mahlzeit@feddit.de · 1 year ago

The purpose of game AI is to make games fun, not to advance serious research, but it certainly is real AI. Making computers play chess was a subject of much serious research. AI opponents in video games are not fundamentally different from that.

As humans, we have an unfortunate tendency to aggrandize our own group and denigrate others. I see anthropocentrism as just one aspect of that, beside nationalism, racism and such. This psychological goal could be equally well achieved by saying things like: “This is not real intelligence. It’s just artificial, like game AI.”

But I don’t see that take being made. I only see pseudo-smart assertions about how AI is just a marketing term.

I think anthropocentrism may have something to do with why the idea of “emergent abilities” (as step-changes in performance/parameters) is alluring. We like to believe that we are categorically different from animals; or at least, that is the traditional belief in many western cultures. We now know, though, that the brain does the thinking, and that human and other mammal brains only show differences in degree, not in kind. If you believe in some categorical difference between animals and humans, you would expect to find step-changes of that sort. Personally, I would find it nice if I could believe that somewhere along that continuum between animal and human brain, something goes click and makes it OK to eat them.