We discuss a one line prompt which significantly improves the mathematical abilities of o3.
OpenAI’s newest o3 model can spin out multi-page math proofs, keep previous lemmas in memory, and suggest multiple different attack routes to a proof. Yet one tiny sign error, such as \(\geq\) turning into \(\leq \) halfway down, can silently destroy the whole argument. After watching this happen a few times, our team adopted a simple discipline we call checkpoint lemmas. It costs almost nothing in extra prompting and saves lots of time catching errors and backtracking through proofs.
Imagine you ask for a spectral bound on a graph parameter. Mid-proof o3 writes
Ten lines later o3 realizes that it flipped the direction of the inequality, and the entire proof fails. Because every later step built on the wrong arrow, the fix is not a one-line patch. You must rewind the tape.
Before running the prompt, add one sentence:
o3 rereads its own line, realizes the arrow is backwards, and fixes it before continuing. We lost 20 seconds, not 20 minutes.
Great places to insert a checkpoint include any new upper or lower bound (eigenvalues, probabilities, error terms), steps where you divide by a potentially negative quantity, or moments you introduce absolute values or switch to logarithms.
This catches algebra and inequality errors while the context is still fresh, as well as adding sanity checks in intermediate spots in the argument. Boxed lemmas become ready to use building blocks if you reshuffle the argument later. The plain English restatement also lets non-specialists follow along and question the logic.