Scientific American | Math
Inside the Secret Meeting Where Mathematicians Struggled to Outsmart AI

Blue illustration of a face with numbers, implying math and AI.
2025-06-06 | 1,132 words | Difficulty: hard
To track the progress of o4-mini, OpenAI had previously tasked Epoch AI, a nonprofit that benchmarks LLMs, with devising 300 math questions whose solutions had not yet been published. Even traditional LLMs can correctly answer many complicated math questions. Yet when Epoch AI posed these questions, which were dissimilar to anything the models had been trained on, to several such models, the most successful solved fewer than 2 percent, showing that these LLMs lacked the ability to reason. But o4-mini would prove to be very different.