Don’t use AI to summarize documents — it’s worse than humans in every way

David Gerard@awful.systems · 5 months ago

Lvxferre [he/him]@mander.xyz · 5 months ago

It just so happened to work out in your very specific example.

Emphasis mine. This shit is so trivial to reproduce with other texts that you’re only making a fool of yourself with this hand-waving.

I’ll provide you with further examples, showing that no, it didn’t “just so” happen to work in this “very specific” example. This time I’m using content linked in this comm instead of sci articles, so you can’t assume or falsely claim cherry-picking later on.

The link in the OP

[Input] Summarise https://pivot-to-ai.com/2024/09/04/dont-use-ai-to-summarize-documents-its-worse-than-humans-in-every-way/ into a single paragraph.

[Output] The article titled “Don’t use AI to summarize documents - it’s worse than humans in every way” argues that Large Language Models (LLMs) are not effective at summarizing documents. The author, Marcelo Soares, believes LLMs are best suited for generating generic text that isn’t particularly interesting and that their attempts at summarization are flawed.

AI and the Silicon Valley Hype Machine - NYT

[Input] Summarise https://www.nytimes.com/2024/05/15/opinion/artificial-intelligence-ai-openai-chatgpt-overrated-hype.html into three paragraphs.

[Output] The New York Times opinion piece titled “Will A.I. Ever Live Up to Its Hype?” by Julia Angwin argues that Artificial Intelligence, particularly OpenAI’s ChatGPT, is falling short of its grand promises.

Just a year ago, researchers worried AI might become too powerful. Now, the concern is whether it’s powerful enough. Angwin criticizes OpenAI’s recent announcement of a faster, cheaper ChatGPT as underwhelming, suggesting it’s not the “magic” improvement that was hyped. She compares AI to an unreliable intern whose mistakes make it easier to do things yourself.

Angwin believes acknowledging AI’s limitations is crucial. It would allow for more efficient resource allocation and a shift towards more realistic solutions. We shouldn’t be fooled by Silicon Valley’s hype machine, she argues.

It shows what I said - you can’t use it for the claims themselves, but it’s a good enough way to know the topic of the article a bit past the title.

Do you think it will always give a summary that is useful to researchers? [emphasis in the original]

Still pretending that either something needs to be perfect or it’s useless (nirvana fallacy).

Maybe that’s fine. Maybe having the occasional error in the summary, which causes the researcher to dismiss the paper erroneously, is better than just guessing based on the title.

What you’re proposing (to guess based on the title) leads to more papers being dismissed erroneously. You make the problem worse by ignoring the tool than you would by using it, flaws and all.

And it is not just sci articles. Every bloody time that you have more text than you can reasonably read, those “AI shortened versions” get you to pick up something to read that you otherwise would have skipped.

Since both of us are clearly repeating arguments, I’m going to end the discussion on my part here. I’ll still read any potential reply, but I’m not going to reply further myself.