Prompt Inversion > Blog > Checkpoint Lemmas: A One-Line o3 Hack

May 5, 2025

Checkpoint Lemmas: A One-Line o3 Hack

We discuss a one line prompt which significantly improves the mathematical abilities of o3.

OpenAI’s newest o3 model can spin out multi-page math proofs, keep previous lemmas in memory, and suggest multiple different attack routes to a proof. Yet one tiny sign error, such as \(\geq\) turning into \(\leq \) halfway down, can silently destroy the whole argument. After watching this happen a few times, our team adopted a simple discipline we call checkpoint lemmas. It costs almost nothing in extra prompting and saves lots of time catching errors and backtracking through proofs.

Imagine you ask for a spectral bound on a graph parameter. Mid-proof o3 writes

Claim. The smallest eigenvalue satisfies \[ \lambda_{\text{min}} \leq -\sqrt{\Delta} \]

Ten lines later o3 realizes that it flipped the direction of the inequality, and the entire proof fails. Because every later step built on the wrong arrow, the fix is not a one-line patch. You must rewind the tape.

Checkpoints

Before running the prompt, add one sentence:

After each non-trivial inequality, box it as Lemma, restate it in plain English, then pause. Lemma. The smallest eigenvalue satisfies \[ \lambda_{\text{min}} \leq -\sqrt{\Delta} \] Explain why the sign is less than instead of greater than.

o3 rereads its own line, realizes the arrow is backwards, and fixes it before continuing. We lost 20 seconds, not 20 minutes.

Great places to insert a checkpoint include any new upper or lower bound (eigenvalues, probabilities, error terms), steps where you divide by a potentially negative quantity, or moments you introduce absolute values or switch to logarithms. 

This catches algebra and inequality errors while the context is still fresh, as well as adding sanity checks in intermediate spots in the argument. Boxed lemmas become ready to use building blocks if you reshuffle the argument later. The plain English restatement also lets non-specialists follow along and question the logic.

Recent blog posts

LLMs

Checkpoint Lemmas: A One-Line o3 Hack

We discuss a one line prompt which significantly improves the mathematical abilities of o3.

May 5, 2025
Tanay Wakhare
Read more
LLMs

o3 vs o1-pro: What We Learned After 1,000+ Math Prompts

We evaluate the reasoning abilities of OpenAI's newest model: o3

April 28, 2025
Tanay Wakhare
Read more
LLMs
Security

Next-Gen Jailbreaks and How to Defend Against Them

We dive into the latest generation of LLM jailbreaks and the defense techniques designed to counter these threats.

April 21, 2025
Rimon Melamed
Read more