Researchers at Apple have released an eyebrow-raising paper that throws cold water on the "reasoning" capabilities of the latest, most powerful large language models.
In the paper, a team of machine learning experts makes the case that the AI industry is grossly overstating the ability of its top AI models, including OpenAI's o3, Anthropic's Claude 3.7, and Google's Gemini.
In particular, the researchers take aim at claims from companies like OpenAI that their most advanced models can now "reason," a supposed capability that the Sam Altman-led company has increasingly leaned on for marketing over the past year. The Apple team characterizes that capability as merely an "illusion of thinking."
It's a particularly noteworthy finding, considering Apple has been accused of falling far behind the competition in the AI space. The company has taken a far more cautious path toward integrating the tech into its consumer-facing products, with seriously mixed results so far.
In theory, reasoning models break a user's prompt into pieces and work through sequential "chain of thought" steps to arrive at an answer. But now, Apple's own top minds are arguing that frontier AI models simply aren't as good at "thinking" as they're made out to be.
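To make the distinction concrete, here is a minimal sketch contrasting a direct prompt with a "chain of thought" prompt. The `query_model` function is a hypothetical placeholder for whatever API a real reasoning model exposes; it is not anything described in the Apple paper, and the example question is illustrative only.

```python
# Hypothetical sketch: the same question posed as a direct prompt versus a
# "chain of thought" prompt. `query_model` is a placeholder for a real LLM API.

def query_model(prompt: str) -> str:
    """Stand-in for an LLM call; here it just echoes the prompt for demonstration."""
    return f"[model response to: {prompt!r}]"

QUESTION = "A train leaves at 3:15 pm and arrives at 5:40 pm. How long is the trip?"

# Direct prompting: ask for the answer in one shot.
direct_prompt = f"{QUESTION}\nAnswer with the duration only."

# Chain-of-thought prompting: ask the model to work through intermediate steps
# before committing to a final answer.
cot_prompt = (
    f"{QUESTION}\n"
    "Think step by step: first compute the minutes from 3:15 to 4:00, "
    "then the remaining time to 5:40, then add them. "
    "State each step before giving the final duration."
)

print(query_model(direct_prompt))
print(query_model(cot_prompt))
```

The Apple researchers' point is that producing intermediate steps like these does not by itself show that a model is reasoning in any robust sense.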
"While these models demonstrate improved performance on reasoning benchmarks, their fundamental capabilities, scaling properties, and limitations remain insufficiently understood," the team wrote in its paper.