There is a quiet question at the heart of every verifiable database: how does a proof refer to data it never shows? When a bank proves a statistic about its ledger, the verifier must be sure the proof is about the real, current ledger, not a convenient fiction the prover swapped in. The mechanism that pins data down without disclosing it is the commitment scheme and, in modern proof systems specifically, the polynomial commitment.
What a commitment is
A commitment scheme is the cryptographic equivalent of sealing a value in an envelope:
- Binding. Once committed, you cannot change the value. Opening the envelope later must reveal the original — substitution is detectable.
- Hiding. The sealed envelope reveals nothing about what is inside.
The classic analogy is a sealed bid at an auction: you commit to a number now, reveal it later, and no one can accuse you of changing it, nor can they read it early. A plain hash is binding, but it can only be opened all at once: checking any one fact means revealing everything that was hashed. Databases need more: the ability to commit once to a huge dataset, then later prove specific facts about it, cheaply. That is what polynomial commitments add.
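The sealed-bid analogy can be sketched with a salted hash. This is a minimal illustration, not a production scheme: the function names `commit` and `verify_opening` are hypothetical, and the random nonce is an assumption added here so that low-entropy values (such as a bid amount) cannot be guessed from the digest.

```python
import hashlib
import hmac
import secrets


def commit(value: bytes) -> tuple[bytes, bytes]:
    """Seal `value`; return (commitment, nonce). Publish the commitment, keep the nonce."""
    nonce = secrets.token_bytes(32)  # fixed-length random blinding factor (hiding)
    commitment = hashlib.sha256(nonce + value).digest()
    return commitment, nonce


def verify_opening(commitment: bytes, value: bytes, nonce: bytes) -> bool:
    """Check that (value, nonce) is the content of the sealed envelope (binding)."""
    expected = hashlib.sha256(nonce + value).digest()
    return hmac.compare_digest(commitment, expected)


# Commit to a bid now, reveal it later.
sealed, nonce = commit(b"1250000")
honest = verify_opening(sealed, b"1250000", nonce)    # the original value opens
tampered = verify_opening(sealed, b"1300000", nonce)  # a substituted value does not
```

Note the limitation the text describes: the only way to open this commitment is to reveal the entire value.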
Why polynomial commitments
Modern proof systems encode data as polynomials over a finite field. A table becomes coefficients; a query result becomes an evaluation. A polynomial commitment lets you:
- Commit to a polynomial with a single short value, and
- Open it at any point — prove "this polynomial evaluates to y at point x" — with a short proof, without revealing the whole polynomial.
This is the exact shape a verifiable query needs. The committed polynomial is the dataset's fingerprint; the openings are the proofs about it. Two schemes dominate, and the choice between them is one of the first architectural decisions in an engagement.
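To make "data as a polynomial, query result as an evaluation" concrete, here is a small sketch over a toy prime field. The modulus, the sample column, and the helper names are all assumptions chosen for illustration; real systems use the scalar field of their curve and often a different encoding. The second helper shows the algebraic fact that openings rest on: if p(z) = y, then p(X) - y is exactly divisible by (X - z).

```python
P = 2**61 - 1  # a Mersenne prime, used here only as a toy field modulus


def evaluate(coeffs: list[int], x: int) -> int:
    """Evaluate sum(coeffs[i] * x**i) mod P with Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def divide_by_linear(coeffs: list[int], z: int) -> tuple[list[int], int]:
    """Synthetic division of p(X) by (X - z); returns (quotient, remainder).

    The remainder always equals p(z).
    """
    quotient = [0] * (len(coeffs) - 1)
    carry = 0
    for i in range(len(coeffs) - 1, 0, -1):
        carry = (coeffs[i] + carry * z) % P
        quotient[i - 1] = carry
    remainder = (coeffs[0] + carry * z) % P
    return quotient, remainder


# A hypothetical column of balances, used directly as coefficients.
balances = [120, 75, 300, 42]
z = 5
y = evaluate(balances, z)
q, r = divide_by_linear(balances, z)

# Spot-check the identity p(s) - y == q(s) * (s - z) at an arbitrary point s.
s = 987654321
identity_holds = (evaluate(balances, s) - y) % P == (evaluate(q, s) * (s - z)) % P
```

A commitment scheme lets the verifier check this divisibility against the short commitment alone, without ever seeing `balances`.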
KZG — small and constant, with a setup
KZG (Kate–Zaverucha–Goldberg) commitments are built on elliptic-curve pairings.
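- Commitment and proof are constant-size: a single group element each (about 48 bytes for a compressed BLS12-381 G1 point), no matter how large the dataset. Verification takes a fixed number of pairing operations, which makes this the smallest, fastest-to-verify option.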
- Requires a trusted setup. A one-time ceremony generates public parameters (the "powers of tau"). If an adversary retained the secret randomness of that ceremony, they could forge openings and break soundness, so the ceremony is run as a large multi-party computation in which a single honest participant suffices. The Ethereum KZG ceremony, with more than 140,000 contributions, is the canonical example, and its parameters are reusable.
KZG is the right call when proof size and verifier speed are paramount and a reputable existing ceremony can be relied on.
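For readers who want the algebra, the scheme can be summarized in four lines. The notation is an assumption of this sketch rather than something fixed by the text above: $[a]_1$ and $[a]_2$ denote $a$ times the generators of the two pairing groups, $e$ is the pairing, $\tau$ is the ceremony's secret, and $p(X) = \sum_{i=0}^{d} a_i X^i$ is the committed polynomial.

$$
\begin{aligned}
\text{Setup:}\quad & [\tau^0]_1,\ [\tau^1]_1,\ \dots,\ [\tau^d]_1,\ [\tau]_2 \\
\text{Commit:}\quad & C = [p(\tau)]_1 = \sum_{i=0}^{d} a_i\,[\tau^i]_1 \\
\text{Open at } z:\quad & y = p(z),\qquad q(X) = \frac{p(X) - y}{X - z},\qquad \pi = [q(\tau)]_1 \\
\text{Verify:}\quad & e\big(C - [y]_1,\ [1]_2\big) = e\big(\pi,\ [\tau]_2 - [z]_2\big)
\end{aligned}
$$

The verification equation is the identity $p(\tau) - y = q(\tau)(\tau - z)$ checked "in the exponent". Anyone who knows $\tau$ can satisfy it for a false $y$, which is why the ceremony's secret must be destroyed.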
IPA — transparent, no ceremony
IPA (Inner Product Argument) commitments, used in Bulletproofs and in the transparent configuration of Halo2, take the opposite trade.
- No trusted setup. Parameters are generated from public randomness ("nothing-up-my-sleeve"). There is no toxic waste, nothing an adversary could have kept.
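- Logarithmic proof size. Proofs grow with the logarithm of the circuit size: larger than KZG's constant, but still small in absolute terms. Verification is the costlier side of the trade: a plain IPA verifier does work linear in that size, which Halo-style accumulation can amortize across many proofs.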
IPA is the right call in adversarial or highly-regulated settings where "trust no ceremony" is itself a requirement — which is often exactly the posture of the institutions a zero-knowledge database serves.
| Property | KZG | IPA |
|---|---|---|
| Trusted setup | Required (reusable ceremony) | None — transparent |
| Commitment size | Constant (~48 bytes) | Constant |
| Proof size | Constant | Logarithmic |
| Verify cost | Constant (a few pairings), very fast | Linear; amortizable via accumulation |
| Best when | Smallest proofs, ceremony acceptable | “Trust no ceremony” mandate |
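The proof-size row can be turned into rough numbers. The following is back-of-the-envelope arithmetic under stated assumptions, not a measurement of any library: the KZG figure uses the 48-byte group element quoted above, while the IPA figure assumes a Bulletproofs-style inner-product argument (two group elements per halving round plus two final scalars) with 32-byte compressed points and 32-byte scalars. Real systems add protocol-specific overhead on top.

```python
KZG_GROUP_ELEMENT_BYTES = 48  # matches the ~48-byte figure in the table above


def kzg_opening_bytes() -> int:
    """A KZG opening proof is one group element, regardless of polynomial size."""
    return KZG_GROUP_ELEMENT_BYTES


def ipa_opening_bytes(n: int, group_bytes: int = 32, scalar_bytes: int = 32) -> int:
    """Size of a Bulletproofs-style inner-product argument for vectors of length n.

    Assumes n is a power of two: log2(n) rounds, each contributing two group
    elements, followed by two final scalars.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    rounds = n.bit_length() - 1  # exact log2 for a power of two
    return 2 * rounds * group_bytes + 2 * scalar_bytes


# Compare sizes across a few hypothetical polynomial lengths.
sizes = {n: (kzg_opening_bytes(), ipa_opening_bytes(n)) for n in (2**10, 2**16, 2**20)}
```

Under these assumptions a million-coefficient polynomial opens in 48 bytes with KZG and a little over a kilobyte with IPA, which is the sense in which the logarithmic proof is "still small in absolute terms".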
The role in a zero-knowledge database
Here is where it lands. When a dataset is committed:
    [ private rows ]                       [ public bulletin board ]
    ───────────────                        ─────────────────────────
    row #1 ...                             commit_t0 = 0x4a1f…
    row #2 ...      ── polynomial ──▶      commit_t1 = 0xe09c…
    row #3 ...         commitment          commit_t2 = 0x91b3…  ◀ active
    ...                                    ...
- The owner commits to the current dataset and publishes the fingerprint (to a private append-only log, an internal HSM-signed register, or a public chain — the cryptography is indifferent).
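- From that moment the data is frozen: any later edit produces a different fingerprint, so a proof made against the old commitment will not verify against the new one, and a proof about edited data will not verify against the old one.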
- Each query proof is made relative to the active commitment. The verifier checks the proof against the published fingerprint — and is therefore certain the answer concerns the real, committed data, not a substitute.
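The bookkeeping in these three steps can be sketched as follows. Everything here is hypothetical scaffolding: `fingerprint` uses a plain hash as a stand-in for a real polynomial commitment, `BulletinBoard` models the append-only register, and `references_active` checks only that a proof names the active commitment. Verifying the proof itself is the proof system's job and is deliberately left out.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(rows: list[bytes]) -> bytes:
    """Stand-in for a polynomial commitment: a hash over length-prefixed rows."""
    h = hashlib.sha256()
    for row in rows:
        h.update(len(row).to_bytes(8, "big"))  # length prefix avoids ambiguous joins
        h.update(row)
    return h.digest()


@dataclass
class BulletinBoard:
    """Append-only list of published commitments; the newest entry is active."""
    entries: list[bytes] = field(default_factory=list)

    def publish(self, commitment: bytes) -> int:
        self.entries.append(commitment)
        return len(self.entries) - 1  # index of the new entry

    @property
    def active(self) -> bytes:
        if not self.entries:
            raise LookupError("no commitment has been published")
        return self.entries[-1]


@dataclass(frozen=True)
class QueryProof:
    commitment: bytes  # the fingerprint this proof was generated against
    claim: str         # the public statement being proven (proof bytes omitted)


def references_active(board: BulletinBoard, proof: QueryProof) -> bool:
    """First verifier check: is the proof about the currently published data?"""
    return proof.commitment == board.active


board = BulletinBoard()
rows = [b"acct=1;bal=120", b"acct=2;bal=75"]
board.publish(fingerprint(rows))
proof = QueryProof(commitment=board.active, claim="sum(bal) = 195")
fresh = references_active(board, proof)

# An edit to the data yields a new fingerprint; the old proof no longer matches.
board.publish(fingerprint([b"acct=1;bal=120", b"acct=2;bal=80"]))
stale = references_active(board, proof)
```

Keeping the board append-only means earlier commitments stay on record, so an auditor can still tell which dataset version an old proof was about.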
This is what closes the loop opened in zero-knowledge proofs: the proof convinces without disclosing, and the commitment guarantees it convinces about the right data.
The honest caveat
A commitment proves the answer is faithful to the committed dataset. It does not prove the committed dataset is itself honest. If a firm commits to a fabricated ledger, the proofs are honest about a lie. This is why provenance — signed-at-source data, attested ingestion — becomes more important alongside verifiable queries, not less. We treat that boundary explicitly in the trust topology of a zkDB.
What to read next
- Custom gates for SQL — how the committed data is queried inside a circuit.
- How a verifiable query actually works — the full lifecycle, commitment to verification.
- Zero-knowledge proofs, explained — the primitive this builds on.
Kate, Zaverucha, Goldberg. Constant-Size Commitments to Polynomials and Their Applications. ASIACRYPT 2010 — the KZG scheme.
Bünz et al. Bulletproofs: Short Proofs for Confidential Transactions and More. IACR ePrint 2017/1066 — the inner-product argument.