When tested with a classic psychological assessment, advanced AI models experienced a total breakdown in focus. A new PNAS Nexus study suggests these systems lack the human-like executive control necessary to override automatic responses and maintain complex goals.
A lot of tools like Claude or ChatGPT have internal tools they call when they do math (or use a python script) rather than have the model actually compute anything.
The underlying tech itself can’t do it because you can’t do math by token probability.
Is that relevant? Mathematicians will use tools and computers that calculate for them too. Are we saying they should all do it in their heads?
Whether they use tools to do it or not is entirely unimportant, that’s just how they do it?