Quantum Skip-Gram Proof Solver (Lean/mathlib)
Goal
Build a proof solver for a specific mathematical subdomain (e.g., Hilbert spaces) that learns human-style proof moves from Lean mathlib and uses them to propose the next likely action during proof search.
This is not a “find any proof” engine. The objective is to find good proofs:
- short (few steps),
- standard (uses common techniques),
- readable (high-level moves, not low-level term soup),
- robust (works with typical
mathlibpatterns).
Core Idea
Combine two ingredients:
1) Skip-gram training on proofs
- Treat a proof as a sequence of “actions” (tactics, lemma applications, rewrites, structured moves).
- Train a model by masking a step and asking it to predict the missing step given the surrounding context.
- This pushes the model toward learning “what is the natural next move here”, capturing common proof idioms.
2) Quantum-style branching proof search
- At any proof state, there are many plausible next moves.
- Instead of committing early, explore multiple probabilistic branches in parallel.
- Keep a frontier of candidate partial proofs, score them, expand the best ones, prune the rest.
- Eventually return the best complete proof under a chosen optimality metric.
“Quantum” here means: we maintain and evolve a distribution over proof paths, not literal quantum computation.
Vocabulary
Proof State
A Lean goal state:
- local context (hypotheses, types, instances),
- target goal,
- implicit typeclass obligations.
Action (a proof move)
An action is a structured “next step”, for example:
apply lemma_name(with implicit/explicit arguments),refine ...skeleton,have h : ... := ...,- rewriting:
simp [..],rw [..],ring,linarith, - subdomain-specific moves: orthonormal expansion, inner product identities, Cauchy–Schwarz, projection lemmas, etc.
We want actions at a level that reads like a standard mathlib proof.
Data Representation
Proof as a sequence
From existing mathlib proofs, extract sequences like:
- (state_0, action_0) -> state_1
- (state_1, action_1) -> state_2
- …
- state_n is
⊢ True/ closed goal
Where:
state_iis a compact representation of the goal and context,action_iis the next tactic/lemma step.
Skip-gram objective
Train on tuples where one step is removed:
Given context window:
- states/actions around step k, predict:
- action_k (or action_k’s “template class”)
Example: predict that the missing move is “apply Cauchy–Schwarz” instead of some brittle low-level rewriting.
Model Outputs (what the NN should predict)
The model should output a ranked list of “likely next actions”, ideally in layers:
1) Technique class (high-level)
- “use Cauchy–Schwarz”
- “rewrite with inner product symmetry”
- “introduce an epsilon and use completeness”
- “reduce to norm inequality”
- “apply projection property”
- “simp using linearity”
2) Concrete action template
have := inner_mul_self_nonneg ...have h := norm_inner_le ...simp [real_inner_mul_left, ...]rw [inner_add_left, ...]apply le_trans ?_ ?_then fill holes
3) Fully instantiated Lean step A ready-to-run tactic line (best-effort), possibly with metavariables to be filled by subsequent search.
The key is that the top-k list contains human-recognizable proof moves.
Quantum Branching Search (beam + scoring)
Maintain a priority queue / beam of partial proofs.
Each candidate contains:
- current Lean state,
- proof script so far,
- accumulated score.
At each expansion: 1) query NN for top-k next actions, 2) attempt actions in Lean (filter those that actually apply), 3) produce child candidates with updated states, 4) score each child and keep the best B.
Scoring: “optimal proof”
Score a proof path by a weighted combination:
- shorter is better (penalize length),
- common techniques are better (reward higher prior probability / frequency),
- readability is better (penalize “ugly” low-level steps),
- stability is better (prefer simp/lemma moves that don’t require fragile term arguments),
- domain alignment is better (reward canonical Hilbert-space idioms).
A simple prototype score: score(path) =
-
α * log P_model(actions contexts) - β * (#steps)
- γ * (ugliness_penalty)
- δ * (technique_commonness_bonus)
- ε * (subdomain_style_bonus)
Where:
ugliness_penaltycould include: too manysimp?guesses, heavyexactterms, long chains ofrw, etc.style_bonuscould reward known “clean” patterns in Hilbert space proofs.
“Transformation Graphs” link
Interpret the model’s “technique classes” as nodes in a transformation graph:
- nodes = proof techniques / rewrite schemas / lemma families,
- edges = likely transitions between techniques in successful proofs.
The solver is then a guided walk on this graph:
- local goal features determine which transformations are plausible,
- the NN provides probabilities for next transformation nodes,
- the search explores multiple walks and returns the best proof.
This graph is both:
- a learned search prior, and
- an explanation object (“the solver chose C–S then rewrote inner products then applied norm inequality”).
Why Hilbert spaces as a subdomain
Hilbert space proofs in mathlib often reuse a relatively tight toolkit:
- inner product algebra, linearity, conjugate symmetry,
- norm identities and inequalities (C–S, triangle),
- orthogonal decomposition and projections,
- completeness arguments,
- bounded linear maps and adjoints.
This makes the action vocabulary learnable and the “technique graph” meaningful.
Deliverables (prototype milestones)
1) Proof trace extractor
- Parse a curated set of Hilbert-space proofs in
mathlib - Extract (state, action) sequences.
2) Action vocabulary + templates
- Normalize actions into canonical forms (reduce syntax variance).
3) Skip-gram model
- Train to predict missing step / technique class.
4) Guided beam search
- NN proposes top-k actions
- Lean checks applicability
- Scoring + pruning.
5) Explainable output
- Return:
- final Lean proof script,
- technique trace (the transformation path),
- score breakdown (why this proof was “optimal”).
Design constraints (what we want to avoid)
- A solver that only works by brute-force low-level term search.
- Outputs that are technically correct but unreadable.
- Overfitting to exact lemma names without learning technique structure.
- A system that collapses into “simp until it works”.
Summary
A Quantum Skip-Gram Proof Solver is:
- a learned next-action predictor trained by masking proof steps (skip-gram),
- combined with a probabilistic multi-branch proof search (quantum-style),
- optimized to output human-readable, standard, short
mathlib-style proofs, - explainable via a learned transformation graph of proof techniques.
Comments