The Data Trust Paradox

Every modern regulated data architecture, beneath whatever stack of clouds and warehouses and SOC reports it advertises, is built on the same binary choice. Either share the data with the partner, the regulator, the auditor, the counterparty — and accept the loss of control that comes with disclosure. Or protect the data, hold it close, and accept the loss of verifiability that comes with silence.

There is no third option, the lawyers say. There is no third option, the security architects say. There is no third option, the regulators say.

There is now another option.

A choice that has always been false

Pick any regulated data flow at the boundary of two institutions and the paradox is there. A bank computes its capital adequacy ratio for the OCC. A pharmaceutical company reports its trial endpoint to the FDA. A multinational reports its Scope 3 emissions under the EU's Corporate Sustainability Reporting Directive. A SaaS vendor responds to a GDPR right-to-be-forgotten request. A cloud customer requests proof that its query result respected the row-level security policy.

In every case, the producing party knows something. The consuming party wants to verify what the producer knows. The producer cannot — must not — disclose the underlying records. So the producer makes a claim, and the consumer is asked to believe.

The institution of belief, in the modern economy, runs through three intermediaries: the producer's word, the producer's auditor, and the producer's vendor attestation chain. Each has a well-rehearsed failure mode. The producer can lie or be wrong. The auditor lacks the bandwidth to recompute at real volume and ends up sampling, with disclaimers. The vendor attestation chain depends on a chip-fab in another jurisdiction and a supply chain we have stopped pretending to control.

The collective cost of these failure modes — fines, restatements, broken inter-bank trust, regulator-driven data hoarding, the entire compliance-theater industry — is a tax on the information economy. Conservative estimate: hundreds of billions a year.

What zero-knowledge proofs actually change

A zero-knowledge proof is a cryptographic protocol that lets one party convince another that a statement is true while revealing nothing beyond the fact that it is true. For most of the four decades since its first formulation (Goldwasser, Micali, Rackoff, 1985), the construction was theoretically beautiful and practically unusable.
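
To make the definition concrete, here is a minimal sketch of one of the oldest such protocols, a Schnorr proof of knowledge made non-interactive with the Fiat-Shamir heuristic. It is not the proof system a zero-knowledge database would use; it only illustrates the shape of the idea: the prover convinces the verifier that it knows a secret exponent without ever sending it. The group parameters `P`, `Q`, `G` and the function names are assumptions chosen for illustration, and the group is far too small for real use.

```python
import hashlib
import secrets

# Toy group parameters (assumptions for illustration, far too small for real use):
# P = 2*Q + 1 is a safe prime, and G generates the subgroup of prime order Q.
P, Q, G = 1019, 509, 4


def challenge(*values):
    """Fiat-Shamir: derive the verifier's challenge by hashing the transcript."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret):
    """Prove knowledge of `secret` such that public_key = G^secret mod P."""
    public_key = pow(G, secret, P)
    nonce = secrets.randbelow(Q)  # fresh randomness masks the secret
    commitment = pow(G, nonce, P)
    c = challenge(G, public_key, commitment)
    response = (nonce + c * secret) % Q
    return public_key, (commitment, response)


def verify(public_key, proof):
    """Check G^response == commitment * public_key^c; the secret is never seen."""
    commitment, response = proof
    c = challenge(G, public_key, commitment)
    return pow(G, response, P) == (commitment * pow(public_key, c, P)) % P


public_key, proof = prove(secret=123)
print(verify(public_key, proof))  # True
```

The verifier sees only the public key, one group element, and one number. The random nonce is what keeps the response from leaking the secret.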

That gap between theory and practice has closed in the last five years.

Modern PLONK-ish proof systems compose efficiently with database operators. Polynomial commitments fingerprint datasets cheaply and once. Recursive composition collapses long computations into single short artifacts. Hardware acceleration has reduced prover cost by one to two orders of magnitude per year. The canonical analytic SQL benchmark — TPC-H — now runs end-to-end in zero knowledge.

A zero-knowledge database is the application: a system that answers a query and ships, alongside the answer, a short cryptographic artifact that proves the answer is exactly what the agreed query would produce on the committed dataset. The producer never reveals a row. The consumer verifies in milliseconds, on a laptop, from the artifact alone.
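
The commit, answer, verify flow can be sketched for a single aggregate. The example below swaps in a much simpler technique than the PLONK-ish systems mentioned above: Pedersen commitments, whose homomorphic property lets a verifier check a SUM over committed rows without seeing any row. It is a toy under stated assumptions. The names `commit_dataset`, `answer_sum`, and `verify_sum` are hypothetical, the group is tiny, the discrete log of `H` to base `G` is trivially computable at this size (a real deployment needs it to be unknown), and the check holds only modulo `Q`.

```python
import secrets

# Toy parameters (assumptions): P = 2*Q + 1 is a safe prime; G and H both
# generate the subgroup of prime order Q. Far too small for real use.
P, Q, G, H = 1019, 509, 4, 9


def commit(value, blind):
    """Pedersen commitment: G^value * H^blind mod P."""
    return (pow(G, value % Q, P) * pow(H, blind % Q, P)) % P


def commit_dataset(rows):
    """Producer: publish one commitment per row, keep the blinding factors private."""
    blinds = [secrets.randbelow(Q) for _ in rows]
    commitments = [commit(v, r) for v, r in zip(rows, blinds)]
    return commitments, blinds


def answer_sum(rows, blinds):
    """Producer: return the SUM and a one-number proof (the sum of the blinds)."""
    return sum(rows), sum(blinds) % Q


def verify_sum(commitments, answer, proof):
    """Consumer: the product of the row commitments must open to (answer, proof)."""
    product = 1
    for c in commitments:
        product = (product * c) % P
    return product == commit(answer, proof)


rows = [120, 45, 300]                      # private; never leaves the producer
commitments, blinds = commit_dataset(rows)
answer, proof = answer_sum(rows, blinds)

print(verify_sum(commitments, answer, proof))      # True
print(verify_sum(commitments, answer + 1, proof))  # False
```

The consumer handles only the published commitments, the answer, and a single number. A general zero-knowledge database extends the same pattern from one fixed aggregate to arbitrary agreed queries, which is where the heavier proof systems come in.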

The implications follow.

Audit becomes a check, not an investigation

Today, an audit is a sampling exercise. Pull a sample of records, recompute the aggregate, hope the sample was representative, attest with disclaimers about the limits of the sample. With cryptographic verifiability, the audit is a single deterministic check on a 38-kilobyte artifact. Sample size: everything. Statistical confidence: complete. Time to verify: tens of milliseconds.
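
The limits of sampling are easy to put a number on. The sketch below computes the chance that a uniform random sample, drawn without replacement, contains none of the misstated records. The function name and the figures in the example (one million records, 100 of them misstated, a sample of 1,000) are hypothetical and chosen only for illustration.

```python
def miss_probability(total, bad, sample):
    """Chance that a uniform sample without replacement contains no bad record."""
    if sample > total - bad:
        return 0.0  # the sample is too large to avoid every bad record
    p = 1.0
    for i in range(sample):
        # Probability that the next draw is a good record, given all
        # previous draws were also good.
        p *= (total - bad - i) / (total - i)
    return p


print(f"{miss_probability(1_000_000, 100, 1_000):.3f}")
```

Under these assumed figures the sample misses every misstated record roughly 90% of the time. A proof over the full committed dataset has no such term: every record is covered by the check.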

The auditor's role does not vanish. It moves up the stack. The auditor designs the proof system — the circuit, the commitment cadence, the verification-key custody. The auditor confirms the firm's commitment ceremony is honest. The actual computation is checked by the math.

Cross-border data flows become tractable again

Schrems II made the workaround — Standard Contractual Clauses, Transfer Impact Assessments, supplementary measures — into paperwork nobody believes in. With a zero-knowledge database, the data does not leave the EU. Only the answer and the proof do. A US verifier checks the proof against the public commitment, sees no rows, and is mathematically certain of correctness.

The same logic applies to data sovereignty in the UAE, India, Saudi Arabia, China — every regime that has adopted strict locality rules. Cryptographic locality becomes a substitute for physical locality.

Regulators stop demanding bulk data

Today a regulator demands the full ledger because the regulator cannot trust the firm's report. The regulator then becomes a data custodian — with its own breach surface, its own staffing cost, its own data-protection liability when something goes wrong. With cryptographic verifiability, the regulator receives the answer and the proof. The bulk data stays in the firm's custody. The regulator's breach surface contracts. The firm's exposure to a regulator-side breach disappears.

The shift this enables is institutional, not technical

The deepest change is not technical. For most of the modern information economy, the right to verify has been delegated to a handful of professional intermediaries — Big Four firms, ratings agencies, regulators with subpoena power. The intermediary class exists because verification was expensive and required physical access to the data.

Cryptographic verifiability turns that right into a public good. Anyone with the verification key can audit. Watchdogs, journalists, the firm's own board, and individual customers join the set of plausible verifiers. The intermediaries do not disappear — their function moves up the stack, from "did this happen" to "should this be happening."

This is the bet behind zkDB as a category, and behind our work as a firm. The next decade of regulated data infrastructure will be built on cryptographically verifiable claims, and the institutions that move first will set the standards everyone else inherits.

What to read next

The concept primer — what a zero-knowledge database actually is, in plain English.
The trust topology — a map of who needs to trust whom, before and after.
The comparison — when to use ZK proofs, and when to use FHE, TEEs, or differential privacy instead.

Gu, Fang, Nawab. PoneglyphDB: Efficient Non-interactive Zero-Knowledge Proofs for Arbitrary SQL-Query Verification. arXiv:2411.15031, SIGMOD 2025.

