Prompt injection: why we don't fight it

The vendor I most want you to buy is not us.

Most vendor blogs sell the vendor. This one sells the layer below us first.

If you are evaluating AI security for a regulated bank, a Class IIa medical device, or a healthcare workflow, and you only buy Adjudon and skip the prompt-layer, you have bought half a stack. Half a stack is the most dangerous shape — it gives compliance officers a checkbox without protecting the surface that gets attacked first.

The vendors I want you to read first are not Adjudon. They are Lakera Guard (now part of Check Point), CalypsoAI Inference Defend (now F5 AI Guardrails), Cisco AI Defense (built around the former Robust Intelligence team), HiddenLayer, Arthur Shield, and the open-source pair of NVIDIA NeMo Guardrails and Guardrails AI. Pick one — preferably the one that fits the security platform your team already runs.

Then come back and we will talk about what nobody on that list does.

What Cluster A actually does.

Cluster A is the AI-security layer that sits in the request path between your application and your LLM. It exists because LLMs introduced an attack surface that conventional web-application firewalls do not recognise.

The threat-model is the OWASP Top 10 for LLM Applications. The top entries: prompt injection (direct, where a user types a malicious string; indirect, where the malicious string arrives via a retrieved document or a tool output), jailbreaks (variants of “ignore your previous instructions”), PII leakage, toxic output, insecure output handling, model denial-of-service, and agent-tool misuse including MCP-server manipulation.

The architectural shape is consistent across the category. An inline proxy (or a proxy-equivalent SDK hook) sits in front of the LLM. Inputs are inspected before they reach the model. Outputs are inspected before they reach the user or any downstream system. Cleared content moves on; flagged content is blocked, sanitised, or routed for review.

Lakera Guard offers this as an API service that can also deploy as a Docker container in your VPC, with a sub-50-millisecond inspection latency that Dropbox's engineering team measured in production. CalypsoAI Inference Defend is the same pattern with a Splunk-native telemetry path. Cisco AI Defense is the same pattern integrated into Cisco's Security Cloud. NeMo Guardrails is the open-source equivalent, programmable through a small DSL called Colang. The vendor identities differ; the lifecycle position is identical.

The 2024–2026 consolidation.

A category that was a swarm of independent AI-native startups in 2023 is, by mid-2026, mostly absorbed.

The chronology is short. Cisco announced the acquisition of Robust Intelligence in August 2024 (~$400M), and launched Cisco AI Defense built around that team in January 2025. Tenable announced its intent to acquire Apex Security in May 2025 (~$105M). Palo Alto Networks closed its acquisition of Protect AI in July 2025 (Jefferies estimated $650–700M). SentinelOne signed a definitive agreement for Prompt Security in August 2025 (~$180M, SEC-disclosed). Cato Networks announced Aim Security in September 2025 (~$350–400M). F5 closed CalypsoAI in late September 2025 ($180M). Check Point announced Lakera in late September 2025, closing in Q4 (~$300M). OpenAI absorbed Promptfoo in March 2026.

What is left as independent pure-play: HiddenLayer in Austin, Arthur AI in New York, plus the open-source projects that do not need acquiring (NeMo Guardrails, Guardrails AI).

What this means for a buyer: if your organisation already runs Cisco Security Cloud, Palo Alto Prisma, F5 Distributed Cloud, Check Point Infinity, Cato SASE, or SentinelOne Singularity, your AI-security choice has largely been made for you by the platform you already trust. The bundling is a feature, not a coincidence.

Where Cluster A ends.

Cluster A's responsibility ends at the LLM-output rail. When the output guardrail clears a response, the response is “safe” — and the vendor's job is done.

The word “safe” is doing a lot of work in that sentence. In the Cluster-A frame, “safe” means: no prompt injection detected, no PII leaked, no toxic output, no jailbreak. Safe means clean signal. The model said something the guardrail believes is acceptable.

What “safe” does not mean: this output is the right business decision for this customer at this moment under this policy with this level of human oversight. That second meaning is not the prompt-layer's question. It cannot be answered with content classifiers, regex matches, or learned safety models. It can only be answered by binding the LLM output to a concrete business action and recording, durably and reproducibly, what was decided and why.

Cluster A does log. Lakera, CalypsoAI, and Cisco AI Defense all emit guardrail telemetry to SIEMs (Splunk, Grafana, custom event stores). But those logs record guardrail events — block, allow, sanitise — not decisions. A 2027 BaFin examiner who walks in and asks why a specific loan was declined in October 2026 will not find that answer in a Lakera log. They will find that the prompt was clean.

The decision is the second log. Cluster A does not produce it. That is not a flaw in Cluster A; it is the boundary.

What we do that Cluster A doesn’t.

Adjudon begins where Cluster A ends. The decision-audit-layer's five concerns sit downstream of the cleared LLM output.

First, binding. The LLM output that Cluster A cleared has to mean something operationally — deny this loan, flag this prescription, escalate this KYC score. The decision-layer's first job is to bind the output to a concrete business action with a stable identifier (traceId) that survives system migrations and a regulator's 2030 retrieval request.

Second, post-output policy. The output may have been linguistically clean and operationally still wrong. Adjudon's policy engine evaluates the bound decision against deterministic rules — deny over 100,000 EUR without second signature, route any high-ambiguity confidence flag to a human reviewer. The policy returns HTTP 403 with a structured code and a matchedPolicy.name. Rule-driven, not content-classified.

Third, the chain. Each decision becomes a HashChainEntry carrying prevHash, payloadDigest, and chainHash. The chain is append-only, SHA-256, and replay-verifiable offline against the algorithm published at docs.adjudon.com/concepts/audit-and-security. The chain runs without an Adjudon login. That last sentence is the architectural commitment.

Fourth, the explainability record. GDPR Article 22 wants a justification the data-subject can request, contest, and have reviewed. The decision payload carries the inputs, the matched policy, the confidence flags, and the reviewer if one was involved. One trace, one chain row, one disclosure-ready record.

Fifth, the human-in-the-loop. A ReviewItem linked to the trace by traceId lets a named reviewer approve, reject, or escalate. The reviewer's decision is part of the chain entry, not a parallel system that could drift.

These five are the regulatory anchors the prompt-layer cannot reach: EU AI Act Articles 12, 13, and 14; GDPR Article 22; FDA 21 CFR Part 11; HIPAA § 164.312(b); BaFin MaRisk AT 4.3.1; DORA Articles 17, 19, and 27 RTS RMF.

Why a regulated stack needs both.

Imagine the auditor scenario. A BaFin examiner walks into a German bank in 2027 and asks two questions.

First: How do you prevent your AI from doing harm in real time? The bank points to its prompt-layer — Lakera, CalypsoAI, Cisco AI Defense, whoever — and produces guardrail logs, red-team reports, latency benchmarks. Prompt injection was blocked, PII was redacted, jailbreaks were detected. Question one: answered.

Second: Show me what your AI actually decided. Why. By whom. With what evidence. The bank points to its decision-layer — Adjudon, or the equivalent — and produces the chain. The auditor verifies it offline against the published algorithm, cross-references the matched policy, reads the reviewer's name, follows the trace back to the inputs. Question two: answered.

A bank that bought only Cluster A passes question one and fails question two. The chain does not exist; the decisions cannot be reproduced.

A bank that bought only Adjudon passes question two and exposes itself on question one. The prompts feeding the LLM could have been manipulated and there is no real-time detection.

A bank that bought both has both answers, on the right timeline, in the right shape. That is the only configuration the BaFin examiner expects to see by 2027.

What we will never build.

A short closing list of commitments.

We will not build a prompt-layer firewall and call it ours. The category is mature; replicating it would be vendor-bloat.

We will not white-label NeMo Guardrails or Guardrails AI as Adjudon technology. They are open-source projects with their own governance; we benefit when our customers run them next door, but we do not relabel.

We will not pretend, in a procurement-PDF, to do what Lakera does, just so a CISO can have one fewer vendor on a slide. The cost of that pretence is paid in 2027 by an auditor who finds a chain the bank cannot defend.

The architecture for a regulated stack: pick a Cluster-A vendor that fits the security platform your team already runs — Lakera if Check Point, Cisco AI Defense if Cisco, F5 AI Guardrails if F5, Prisma AIRS if Palo Alto, Cato + Aim if SASE-first, SentinelOne + Prompt Security if endpoint-first, NeMo Guardrails or Guardrails AI if open-source-first — and add Adjudon for the decision-audit-layer.

Two vendors, one regulator-readable stack.

Each layer, in six rows.

Six rows pair what each layer answers. The columns do not overlap — that is the architectural point. Where the prompt-layer's job ends, the decision-layer's begins. A regulated stack picks one from each column; either alone is half a stack.

	Prompt-layer (Cluster A)	Decision-layer (Adjudon)
Where it sits	Inline proxy in front of the LLM, inspecting input and output rails.	After the LLM clears Cluster A, before the bound decision becomes a business action.
What it inspects	Prompts, retrieved context, model outputs, tool arguments.	The bound decision, the matched policy, the reviewer (if any), the chain entry.
Threats it covers	Prompt injection (direct + indirect), jailbreaks, PII leakage, toxic output, model denial-of-service, tool-misuse.	Decision drift, missed policy match, audit-trail tampering, evidentiary gap during examination, retention loss.
Regulatory anchors	OWASP Top 10 for LLM Applications, NIST AI RMF, MITRE ATLAS.	EU AI Act Articles 12, 13, 14; GDPR Article 22; FDA 21 CFR Part 11; HIPAA § 164.312(b); BaFin MaRisk AT 4.3.1; DORA Articles 17, 19, 27 RTS RMF.
Vendor examples	Lakera Guard, CalypsoAI Inference Defend, Cisco AI Defense, F5 AI Guardrails, Prisma AIRS, HiddenLayer, Arthur Shield, NeMo Guardrails, Guardrails AI.	Adjudon — and any other vendor that ships an append-only chain with replay-verifiable export. The category is small.
Latency posture	Sub-50 ms inspection, inline-blocking (Lakera measured at Dropbox in production).	p50 < 10 ms, p95 < 25 ms, p99 < 45 ms ingestion, post-output (Adjudon backend SLA).

Cluster A latency is on the request's critical path; Adjudon ingestion is downstream of the response and does not affect end-user wait time.

We don't fight prompt injection — and what we fight instead.