On Article 13

Why Article 13 transparency wants a hash, not a trust score.

The EU AI Act asks your high-risk system to be “sufficiently transparent.” A hash chain proves it. A probabilistic “trust score” just hopes.

Dato Bitarishvili, founding engineer · 5 May 2026 · 10 min read

The two words that decide everything: “sufficiently transparent.”

Article 13(1) of the EU AI Act is, by statutory standards, short. The load-bearing language sits in a single phrase.

“High-risk AI systems shall be designed and developed in such a way as to ensure that their operation is sufficiently transparent” — that is the architectural demand. The continuation specifies the purpose: “to enable deployers to interpret a system's output and use it appropriately.”

Strip the formality and the obligation reads as a request: make the operation legible to the deployer. The deployer is the entity that actually runs the system in production: the bank running the credit-underwriter agent, the hospital running the triage assistant, the recruiter running the candidate-screener. That deployer must be able to read what the AI did and why it did it, and on that basis decide how to use the output appropriately.

“Sufficiently” is the door the regulation leaves open. Sufficient for whom? For the deployer's interpretation. For their downstream regulator's audit. For the data subject's contestation under GDPR Article 22. For the Wirtschaftsprüfer (statutory auditor) reading the evidence three years after the decision. The same evidence has to satisfy four readers at four different moments, and that makes “sufficiently” carry far more weight than the word suggests at first glance.

The Act does not specify the technical mechanism. It does specify the test the result has to pass.

Why “trust score” reads as evasion.

A category of AI-governance vendor ships compliance evidence as a percentage. Compliance confidence: 87%. Risk score: amber. Trust signal: positive. The number is the artifact the vendor expects you to put in the audit folder.

The honest reading of these outputs is that they are opinions of a machine learning system about another machine learning system. The percentage is computed, not arbitrary. But its underlying claim — that the AI's behaviour is probably compliant — is structurally different from the claim Article 13 demands.

Article 13 does not ask: “Is your system probably compliant?” It asks: “Can the deployer read what the output was, why it was that, and use it appropriately?” These two questions are not the same question, and the second cannot be answered by a probability about the first.

The evasion lives in the gap. An 87% trust score does not let the deployer read the input, the decision, and the policy applied. It compresses an entire decision trace into a single number, then asks the deployer to trust the compression. When the regulator asks the deployer to interpret the output, the deployer hands over the score, and the regulator asks: “Can I see the decision?” The score does not contain the decision. It contains an opinion about the decision.

That is what “just hopes” meant in the dek.

What a hash answers that an estimate doesn’t.

A SHA-256 hash makes the opposite kind of claim from a trust score.

A trust score says: given everything I know about this system, I assess the probability of compliant behaviour at X. The probability is the artifact. The reasoning behind it is summarized.

A hash says: here are the bytes that were decided, on this date, by this model version, against this policy. The chain entry is committed; if anyone alters anything in the trace this entry covers, the recomputed hash will no longer match the stored chainHash or the next entry's prevHash, and the chain will refuse to verify.

A regulator asking “can I see what the AI decided?” gets a different answer from each. From the trust score, the answer is methodology and a confidence interval. From the hash, the answer is a JSON object with the decision, the input, the policy match, the confidence flags, and a chainHash that anchors all of it cryptographically.

Article 13 does not specify SHA-256. It specifies that the deployer must be able to interpret the output. Of the two answers, only one renders an interpretable output. The wedge is the difference between “I think it complied” and “here is what it did.” Article 13 wants the second.

The four-step replay.

The chain is not magic; it is a four-step procedure that any auditor with a JSON parser and a SHA-256 implementation can run independently.

Step 1. Each DecisionTrace becomes a HashChainEntry at ingestion. The entry carries prevHash (the previous entry's chainHash), payloadDigest (sha256(canonicalJson(traceView))), sequence (the per-org monotonic counter), and createdAt (ISO-8601).

Step 2. The chain is exported as a single JSON bundle via GET /api/v1/hash-chain/export. The bundle is self-contained — every entry, every hash, the verify algorithm reference. It downloads without an Adjudon login.

Step 3. The auditor checks that each entry's prevHash equals the previous entry's chainHash, recomputes each entry's chainHash from sha256(prevHash || payloadDigest || sequence || createdAt), and compares the result to the stored value. A passing replay returns { verified: true, lastValidSequence: N }. A break returns { verified: false, brokenAtSequence: <n>, brokenReason: 'prev-hash-mismatch' | 'chain-hash-mismatch' }.

Step 4. If the chain verifies, the trace each entry covers is the answer to “what did the AI decide?” — readable, indexed by traceId.

No third state. The chain either verifies, or it tells the auditor exactly which sequence number broke and why.
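Steps 1 and 3 are small enough to sketch in a few dozen lines of Python. This is a minimal illustration, not the published algorithm: the canonical JSON form (sorted keys, compact separators), the reading of “||” as plain string concatenation, the all-zero prevHash for the first entry, and the helper names are all assumptions made for the example. The traces are hypothetical.

```python
import copy
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumption: prevHash used by the first entry


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical_json(value) -> str:
    # Assumption: canonical form = sorted keys, no insignificant whitespace.
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def chain_hash(prev_hash: str, payload_digest: str, sequence: int, created_at: str) -> str:
    # Assumption: "||" is plain string concatenation, sequence in decimal.
    return sha256_hex(f"{prev_hash}{payload_digest}{sequence}{created_at}")


def make_entry(trace_view: dict, prev_hash: str, sequence: int, created_at: str) -> dict:
    """Step 1: turn one trace view into a chain entry."""
    digest = sha256_hex(canonical_json(trace_view))
    return {
        "prevHash": prev_hash,
        "payloadDigest": digest,
        "sequence": sequence,
        "createdAt": created_at,
        "chainHash": chain_hash(prev_hash, digest, sequence, created_at),
    }


def verify_chain(entries: list) -> dict:
    """Step 3: replay the entries in sequence order and report the first break."""
    previous = None
    last_valid = None
    for entry in sorted(entries, key=lambda e: e["sequence"]):
        if previous is not None and entry["prevHash"] != previous["chainHash"]:
            return {"verified": False, "brokenAtSequence": entry["sequence"],
                    "brokenReason": "prev-hash-mismatch"}
        recomputed = chain_hash(entry["prevHash"], entry["payloadDigest"],
                                entry["sequence"], entry["createdAt"])
        if recomputed != entry["chainHash"]:
            return {"verified": False, "brokenAtSequence": entry["sequence"],
                    "brokenReason": "chain-hash-mismatch"}
        previous, last_valid = entry, entry["sequence"]
    return {"verified": True, "lastValidSequence": last_valid}


# Build a three-entry chain from hypothetical traces, then replay it.
chain, prev = [], GENESIS_PREV_HASH
for seq, decision in enumerate(["approve", "flag", "approve"], start=1):
    entry = make_entry({"traceId": f"t-{seq}", "outputDecision": decision},
                       prev, seq, f"2026-05-03T10:14:0{seq}.000Z")
    chain.append(entry)
    prev = entry["chainHash"]

print(verify_chain(chain))  # {'verified': True, 'lastValidSequence': 3}

# Swap the digest of entry 2 without touching its stored chainHash.
tampered = copy.deepcopy(chain)
tampered[1]["payloadDigest"] = sha256_hex("something else")
print(verify_chain(tampered))
# {'verified': False, 'brokenAtSequence': 2, 'brokenReason': 'chain-hash-mismatch'}
```

The replay needs nothing but the entries themselves, which is the point: the function has two return shapes and no way to express “probably fine.”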

Article 13 in plain enumeration.

Article 13 has three paragraphs. Each maps to a specific evidence shape.

Article 13(1) — sufficient transparency. The deployer must interpret the output. Adjudon answer: every DecisionTrace carries inputContext, outputDecision, confidenceScore, status, rationale, and matchedPolicy.name. All six fields are chain-anchored via payloadDigest.

Article 13(2) — instructions for use. The system must be accompanied by instructions covering its intended purpose, performance, foreseeable misuse, and the deployer's monitoring obligations. Adjudon answer: this lives outside the runtime trace — in product documentation, the SDK reference, and the customer-facing docs. The runtime chain demonstrates that the documented instructions match what the runtime actually does.

Article 13(3) — content of instructions. Subparagraphs (a) through (f) list specific items: provider identity, performance characteristics, conditions of operation, foreseeable misuse, expected lifetime. Adjudon answer: this is the bank's responsibility — their AI agent is the high-risk system, not Adjudon. Adjudon supports the bank's compliance by providing the runtime evidence the bank must reference.

The clean separation is intentional. Adjudon is not the high-risk AI system. Adjudon is the runtime evidence layer the bank uses to prove its high-risk system met Article 13(1) at decision time.
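To make the Article 13(1) evidence shape concrete, here is a hedged sketch of the six fields as a typed record with a digest over them. The field names come from the list above; the field types, the example values, the TraceView and MatchedPolicy class names, and the canonicalization (sorted keys, compact separators) are assumptions for illustration, not the actual schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class MatchedPolicy:
    name: str


@dataclass(frozen=True)
class TraceView:
    # The six chain-anchored fields; the types are assumptions.
    inputContext: dict
    outputDecision: str
    confidenceScore: float
    status: str
    rationale: str
    matchedPolicy: MatchedPolicy


def payload_digest(view: TraceView) -> str:
    # Assumption: canonical JSON = sorted keys, no insignificant whitespace.
    canonical = json.dumps(asdict(view), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical credit-underwriting decision.
original = TraceView(
    inputContext={"applicantId": "a-102", "loanAmount": 25000},
    outputDecision="flag",
    confidenceScore=0.91,
    status="flagged",
    rationale="Debt-to-income ratio above policy threshold",
    matchedPolicy=MatchedPolicy(name="credit-dti-limit"),
)

# Rewriting any one of the six fields after the fact yields a different digest.
edited = replace(original, rationale="No issues found")
print(payload_digest(original) == payload_digest(edited))  # False
```

Because the digest covers all six fields at once, a deployer cannot quietly revise the rationale while keeping the decision: the record is either the one that was anchored or it is not.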

What Article 13 does not require.

Vendors over-promise on Article 13 in two consistent ways.

First, “explainability.” Article 13 does not require that the AI vendor explain why the model produced a specific output in the sense of mechanistic interpretability or attention-weight visualizations. It requires that the deployer be able to read what the output was — input, decision, confidence, policy match. There is a difference between understanding why a neural network weighted token X over token Y and being able to read that the system decided to flag this loan with this confidence under this policy. Article 13 wants the second; vendors selling the first are selling research, not compliance.

Second, “certification.” Article 13 does not require the AI vendor to be certified. It requires the high-risk system to operate transparently. A vendor positioning itself as “Article-13-certified” is selling a label that does not exist in the regulation. ISO 42001 certification is real and useful for the AI Management System; Article 13 evidence is what the runtime produces, not what the management certificate produces.

A buyer who pays for explainability tools or “Article-13-certified” vendor stamps is paying for things the regulation did not ask for. The hash chain is the thing it asked for. Buy that.

The chain is the algorithm, published.

The verify algorithm (the four steps in “The four-step replay” above) is published at docs.adjudon.com/concepts/audit-and-security. It is a public document, not a vendor-private specification.

Three architectural commitments follow.

The chain runs without our login. The export bundle plus the published algorithm are sufficient to run a replay on the auditor's own machine. We are not in the request path of step 3.

The chain runs after we are gone. If Adjudon disappears between export and replay, the bundle still verifies. The chain is not a managed service the auditor relies on; it is a static reference the auditor reads.

The chain runs against contradiction. Cardinal Rule #5 in our codebase is that no updateOne() or findByIdAndUpdate() is ever called against an existing chain entry. Tampering breaks the next prevHash. The chain refuses to lie.
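The third commitment is worth a small demonstration, because the obvious attack is to edit an entry and then carefully recompute that entry's own chainHash. The sketch below assumes plain string concatenation for the hash input and uses placeholder digests and a hypothetical first_broken_link helper; it only checks the links between neighbours, which is enough to show why an in-place update cannot hide.

```python
import hashlib


def chain_hash(prev_hash, payload_digest, sequence, created_at):
    # Assumption: the four fields are concatenated as plain strings.
    material = f"{prev_hash}{payload_digest}{sequence}{created_at}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def first_broken_link(entries):
    """Return the sequence of the first entry whose prevHash no longer
    matches its predecessor's chainHash, or None if every link holds."""
    for prev, cur in zip(entries, entries[1:]):
        if cur["prevHash"] != prev["chainHash"]:
            return cur["sequence"]
    return None


# Hypothetical three-entry chain; the digests stand in for real trace digests.
entries, prev = [], "0" * 64
for seq in (1, 2, 3):
    digest = hashlib.sha256(f"trace-{seq}".encode("utf-8")).hexdigest()
    created = f"2026-05-03T10:14:0{seq}.000Z"
    entries.append({"sequence": seq, "prevHash": prev, "payloadDigest": digest,
                    "createdAt": created,
                    "chainHash": chain_hash(prev, digest, seq, created)})
    prev = entries[-1]["chainHash"]

print(first_broken_link(entries))  # None

# An in-place edit of entry 2, with its own chainHash recomputed to match...
target = entries[1]
target["payloadDigest"] = hashlib.sha256(b"rewritten trace").hexdigest()
target["chainHash"] = chain_hash(target["prevHash"], target["payloadDigest"],
                                 target["sequence"], target["createdAt"])

# ...still orphans entry 3, whose prevHash commits to the old chainHash.
print(first_broken_link(entries))  # 3
```

To make the edit stick, the attacker would have to rewrite every later entry as well, and any bundle exported before the edit would still disagree with the result.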

Article 13 demands sufficient transparency. The hash chain — published, exportable, replayable, refusing-to-lie — is the deterministic answer to a probabilistic question. Vendors who ship trust scores are answering a different question: how confident are we that the AI complied? That is not what Article 13 asks. Article 13 asks: what did the AI do?

The chain answers. The estimate doesn't.

What “verified: true” actually returns.

The previous section promised the chain returns bytes, not an estimate. Here are the bytes. The request is what an auditor sends to verify the chain; the response is what comes back. Read the response and notice what is missing — a confidence interval, a percentage, a probability of compliance.

The request
POST /api/v1/hash-chain/verify
Authorization: Bearer <jwt>

The Adjudon response
{
  "verified":           true,
  "ok":                 true,
  "totalChecked":       17493,
  "lastValidSequence":  17493,
  "brokenAtSequence":   null,
  "brokenReason":       null,
  "durationMs":         842,
  "verifiedAt":         "2026-05-03T10:14:22.317Z"
}

Eight fields. None of them is a probability. verified is a boolean from a replay, not an estimate. lastValidSequence tells the auditor exactly how far the chain held. brokenAtSequence would point at the break if there were one. The whole response is facts about a replay, not opinions about a model.
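An auditor's script can consume that response without any interpretation layer. The following is a minimal sketch with a hypothetical summarize_verify_response helper; it assumes a failed replay returns the same eight fields with verified set to false, which the sample above does not show.

```python
import json

EXPECTED_FIELDS = {
    "verified", "ok", "totalChecked", "lastValidSequence",
    "brokenAtSequence", "brokenReason", "durationMs", "verifiedAt",
}


def summarize_verify_response(body: str) -> str:
    """Turn a verify response body into a one-line finding for an audit log."""
    resp = json.loads(body)
    missing = EXPECTED_FIELDS - set(resp)
    if missing:
        raise ValueError(f"response is missing fields: {sorted(missing)}")
    if not isinstance(resp["verified"], bool):
        raise ValueError("verified must be a boolean, not a score")
    if resp["verified"]:
        return (f"chain verified through sequence {resp['lastValidSequence']} "
                f"({resp['totalChecked']} entries checked)")
    return (f"chain broken at sequence {resp['brokenAtSequence']}: "
            f"{resp['brokenReason']}")


sample = """{
  "verified": true, "ok": true,
  "totalChecked": 17493, "lastValidSequence": 17493,
  "brokenAtSequence": null, "brokenReason": null,
  "durationMs": 842, "verifiedAt": "2026-05-03T10:14:22.317Z"
}"""

print(summarize_verify_response(sample))
# chain verified through sequence 17493 (17493 entries checked)
```

There is no threshold to tune in that function. The only branch is on a boolean, and the failure branch names a sequence number rather than a risk band.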

Illustrative trust-score response (not a specific vendor)
{
  "compliance_confidence": 0.87,
  "risk_score":            "amber",
  "trust_signal":          "positive",
  "methodology":           "proprietary-ml-classifier-v3",
  "confidence_interval":   [0.82, 0.92],
  "evaluated_at":          "2026-05-03T10:14:22.317Z"
}

Six fields. Five of them are opinions. compliance_confidence is a model's guess. risk_score is a category assignment. trust_signal is a marketing label. methodology points at a black box. confidence_interval is the model's honesty about its own guess.

Article 13 wants the deployer to read the output. The first response gives the deployer something to read. The second response gives the deployer something to trust. The auditor's terminal sees the difference.

What the auditor reads next.

This post is the argument. /compliance places the EU AI Act in a table with the other DACH/EU frameworks; /solutions/eu-ai-act stays in the Act and walks through Articles 13, 14, 26, 27, 73 with the backend artefact per row.