I don’t think questions that the vast majority of humans would be unable to answer are a good test for “reasoning”. GPT-4 is not an all-knowing magical oracle that can answer all your questions and solve all your problems, and nobody claims it is.
You don’t even have to go this far; you can trip it up with far simpler questions, like:
Do these parentheses match? (((((((((((((((((((((((((())))))))))))))))))))))
Because it is a feed-forward network and cannot loop, there are pretty severe limits on the problem size you can throw at it, even if those problems are pretty trivial.
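For comparison, here is a rough sketch of how trivial the check is once you are allowed to loop. The function name `parens_match` and the test strings are just mine for illustration; the point is that a single counter walked over the input handles any length, which is exactly the kind of step-by-step iteration a single forward pass doesn't get.

```python
def parens_match(s: str) -> bool:
    """Return True if every '(' in s has a matching ')' in the right order."""
    depth = 0  # number of currently open parentheses
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis appeared before its opener.
                return False
    # Balanced only if nothing is left open at the end.
    return depth == 0


# Illustrative inputs (not the exact string from the comment above).
print(parens_match("(" * 26 + ")" * 26))  # True
print(parens_match("(" * 26 + ")" * 22))  # False
```

The loop runs once per character, so the work grows with the input, while the state it has to carry is just one integer.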
How well would a human perform at these tasks without pen and paper?