The second batch of “First Proof” problems is meant to evaluate AI’s usefulness for research-level math. The best model got six or seven of the ten questions right.
Artificial intelligence has moved from checking homework to attacking problems that professional mathematicians once treated as out of reach. Systems tuned for symbolic reasoning are now cracking long ...
Google DeepMind’s AlphaProof system scored at a silver-medal level when tested against the 2024 International Mathematical Olympiad, solving problems that have historically separated elite human ...
Mathematician Will Sawin discusses his experience reviewing and refining a mathematical proof devised by OpenAI's internal model—and what that could mean for mathematics. Will ...