Interesting paper about the alleged ability of LLMs* to judge the grammaticality of sentences - something that humans are rather good at. Eight phenomena were tested, and the LLMs performed extremely poorly.
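For anyone curious how this kind of probing usually works, here's a minimal sketch (my own illustration, not the paper's actual method): the common "minimal pair" technique compares the log-probability a model assigns to a grammatical sentence against an ungrammatical counterpart differing in one feature. The model name and example pair below are just placeholders.

```python
# Minimal-pair grammaticality probe: the model "prefers" whichever
# sentence it assigns a higher total log-probability.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Sum of token log-probabilities under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted
    # token, so undo the averaging to get the summed log-prob.
    return -out.loss.item() * (ids.shape[1] - 1)

good = "The keys to the cabinet are on the table."
bad = "The keys to the cabinet is on the table."

# Counts as a correct "judgement" iff the grammatical variant wins.
print(sentence_logprob(good) > sentence_logprob(bad))
```

Note that this only tests relative preference between two strings; it doesn't ask the model to explain *why* one is ungrammatical, which is part of what makes the reported failures interesting.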

*LLM = large language model. Stuff like Bard, ChatGPT, LLaMA, etc. I'd argue that they aren't actual language models, given their lack of a semantic component, as the article shows.