The problem
What today's M2M auth can't deliver in agentic AI.
A high-risk AI system can make a thousand decisions an hour. An agentic deployment with tool calls can make ten thousand. The AI Act (Regulation (EU) 2024/1689, in force 1 August 2024; high-risk obligations applicable from 2 August 2026) requires the system to be designed for effective human oversight (Art. 14), to record events over its lifetime (Art. 12), and, for remote biometric identification, to have at least two natural persons with the necessary competence, training and authority separately verify and confirm each identification before any action follows (Art. 14(5)).
In production, the evidence stack stops at the application log. Application logs are mutable, vendor-controlled, and not cryptographically committed. Prompt and response logs capture conversation turns; they do not capture the tool execution that happened mid-conversation, the parameters passed, or the human action that approved or vetoed it. MLflow and Weights & Biases capture training and evaluation metrics; they are not designed for per-decision production-time evidence. Custom audit databases are the deployer's own evidence trail, exactly as verifiable as the deployer's word. Human-in-the-loop UIs capture clicks; the clicks live in the operator's database, on the operator's terms.
When the market-surveillance authority asks for the serious-incident evidence under Article 73, when the conformity-assessment body asks for a demonstration of the Art. 14 oversight design under Article 43, or when a court asks "did two qualified humans actually verify this biometric identification before the police acted?", the deployer has only logs it wrote about itself. There is no operator-independent record of what the agent did, what the tools returned, or what the human in the loop chose. This gap is widening. Tool-using agents are entering production faster than oversight infrastructure can keep up. Standardisation on MCP (Model Context Protocol) makes tool-call topologies more uniform but does not make them auditable.
How EdSSA addresses it
What EdSSA does differently here.
EdSSA Agentic AI runs an attestor at each tool-execution boundary and each human-oversight boundary. When the agent invokes a tool — function call, MCP server invocation, retrieval against a private knowledge base, API call to a downstream service, action against an enterprise system — the attestor emits a cryptographically signed record describing what happened. The record commits to the input, the tool identifier, the model version, the response surface, and the calling context, without exposing the underlying data. When the human in the loop acts — approves, overrides, disregards, reverses, or hits the stop button required by Art. 14(4)(e) — the attestor emits a second signed record committing to the natural person's identity (via the WebAuthn credential set), the timestamp, and the decision metadata.
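A minimal sketch of what a tool-boundary record of this kind could look like, assuming salted SHA-256 commitments for the sensitive fields. The field names and function names are hypothetical, not EdSSA's record format. The standard library has no asymmetric signature primitive, so an HMAC tag stands in for the signature here; a real attestor would need an asymmetric signature so that third parties can verify without holding the signing key.

```python
import hashlib
import hmac
import json
import secrets


def commit(value: bytes, salt: bytes) -> str:
    """Salted SHA-256 commitment: the record stores the digest, never the value."""
    return hashlib.sha256(salt + value).hexdigest()


def _canonical(body: dict) -> bytes:
    """Deterministic JSON encoding so signer and verifier hash identical bytes."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def attest_tool_call(key: bytes, tool_id: str, model_version: str,
                     tool_input: bytes, tool_output: bytes, context: bytes):
    """Return (record, salt). The salt stays with the party holding the raw data."""
    salt = secrets.token_bytes(16)
    body = {
        "tool_id": tool_id,
        "model_version": model_version,
        "input_commitment": commit(tool_input, salt),
        "output_commitment": commit(tool_output, salt),
        "context_commitment": commit(context, salt),
    }
    # HMAC is a symmetric stand-in for the signature (assumption, see lead-in).
    tag = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "tag": tag}, salt


def verify_record(key: bytes, record: dict) -> bool:
    """Recompute the tag over every field except the tag itself."""
    body = {k: v for k, v in record.items() if k != "tag"}
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])
```

Anyone later shown the raw input and the salt can recompute `input_commitment` and confirm it matches the record, while the record alone discloses neither.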
Each attestor's records anchor into a Tier-4 Merkle-anchored audit chain with a seven-year retention floor — well above the Art. 26(6) deployer log-retention minimum of six months. The chain is operator-independent: it can be replicated to mirror stores held by the deployer, the provider, the notified body, the market-surveillance authority, or a third-party transparency-log operator. The chain composes across multi-agent topologies by design — a supervisor agent's chain references the worker agents' chains by cryptographic anchor; the worker agents' chains reference the tool boundaries they crossed.
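The cross-chain referencing described above can be sketched with a plain hash chain, where a supervisor entry embeds the head hash of a worker chain. This is an illustration under assumptions: the entry layout, the `worker_anchor` payload type and the function names are hypothetical, and the Merkle anchoring and replication of the Tier-4 chain are not modelled.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    body = {"prev_hash": prev_hash, "payload": payload}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append an entry whose hash covers the payload and the previous entry's hash."""
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS
    entry = {
        "prev_hash": prev_hash,
        "payload": payload,
        "entry_hash": _digest(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def chain_is_intact(chain: list) -> bool:
    """Walk the chain and recompute every link."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _digest(entry["prev_hash"], entry["payload"]):
            return False
        prev_hash = entry["entry_hash"]
    return True


# A worker records two tool calls; the supervisor anchors the worker's head.
worker: list = []
append_entry(worker, {"type": "tool_call", "tool": "retrieval"})
append_entry(worker, {"type": "tool_call", "tool": "crm_update"})

supervisor: list = []
append_entry(supervisor, {"type": "delegate", "to": "worker-1"})
append_entry(supervisor, {"type": "worker_anchor",
                          "worker_head": worker[-1]["entry_hash"]})
```

Because the supervisor entry commits to the worker's head hash, rewriting any earlier worker entry makes the worker chain fail verification against the anchor the supervisor already recorded.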
EdSSA does not make the model safer, more accurate, less biased, or less prone to hallucination. It does not evaluate AI quality, alignment, fairness, or robustness. It does not gate, filter, or modify the agent's actions. It does not certify that the human in the loop was paying attention, exercising judgement independently, or competent for the decision. What it produces is the cryptographic evidence of what the agents did and what the humans signed — independently verifiable, in regulator-grade form, on demand. A deployer cannot truthfully claim EdSSA guarantees its agents behave well. A deployer can truthfully claim EdSSA produces the per-decision evidence the AI Act requires and the current observability stack cannot.
Use cases
Concrete operational scenarios.
- Every tool call by an agent attested as a processor event at the execution boundary
- Human-in-the-loop override, disregard, reverse and stop events anchored as WebAuthn-attested actions
- Two-person verification evidence for Annex III(1)(a) remote biometric identification under Art. 14(5)
- Multi-agent topologies composing chains via cryptographic anchor across supervisor and worker agents
- AI Act Art. 73 serious-incident reporting evidence, available within the reporting windows (immediately, and no later than 15 days)
- Deployer Art. 26(6) ≥6-month log retention satisfied by the Tier-4 seven-year floor
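For the two-person verification scenario, a check over oversight records might look like the sketch below. The `Confirmation` type and its fields are hypothetical; the rule encoded is the one stated in the text, namely that at least two distinct natural persons confirm an identification before any action follows. Whether each person was qualified is outside what this check can establish.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Confirmation:
    person_id: str           # identity bound to the oversight record (hypothetical field)
    identification_id: str   # which biometric identification was reviewed
    confirmed: bool          # True if the person confirmed the match
    at: datetime             # when the confirmation was recorded


def two_person_rule_met(confirmations, identification_id: str,
                        action_at: datetime) -> bool:
    """True if at least two distinct persons confirmed before the action time."""
    confirmers = {
        c.person_id
        for c in confirmations
        if c.identification_id == identification_id
        and c.confirmed
        and c.at < action_at
    }
    return len(confirmers) >= 2


t0 = datetime(2026, 8, 3, 12, 0, tzinfo=timezone.utc)
records = [
    Confirmation("officer-a", "ident-1", True, t0),
    Confirmation("officer-a", "ident-1", True, t0 + timedelta(minutes=1)),
    Confirmation("officer-b", "ident-1", True, t0 + timedelta(minutes=2)),
]
```

Two confirmations from the same person count once, and a confirmation recorded after the action does not count at all.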
Compliance & standards
Standards and regulatory regimes.
- AI Act Art. 14 human oversight (the headline article).
- Art. 12 record-keeping: the strongest architectural fit; the Tier-4 chain is structurally the record-keeping substrate the article describes.
- Art. 14(5) two-person verification for Annex III(1)(a) biometric identification.
- Art. 15(5) cybersecurity for I/O integrity at the boundary.
- Art. 26(6) deployer log retention of at least six months.
- Art. 73 serious-incident reporting: immediately and within 15 days by default, within 2 days on the critical-infrastructure fast path, and within 10 days on the fatality path.
- Cross-fits: GDPR Art. 22 automated-decision-making safeguards (Annex III(4)/(5)), NIS2 (Annex III(2) critical infrastructure), and DORA (when the deployer is an EU financial entity).
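The Art. 73 reporting windows named above reduce to simple date arithmetic. The sketch below computes only the outer limit for each path, counted in calendar days from the date of awareness; that counting convention is an assumption, and the obligation to report immediately still applies. The enum and function names are hypothetical.

```python
from datetime import date, timedelta
from enum import Enum


class IncidentPath(Enum):
    """Outer reporting limit in days for each Art. 73 path named in the text."""
    DEFAULT = 15
    CRITICAL_INFRASTRUCTURE = 2
    FATALITY = 10


def latest_report_date(aware_on: date, path: IncidentPath) -> date:
    """Last calendar date on which the report can still be on time."""
    return aware_on + timedelta(days=path.value)


aware = date(2026, 9, 1)
for path in IncidentPath:
    print(path.name, latest_report_date(aware, path).isoformat())
```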
Audit emission
Per-tool-call and per-oversight-event records into the Tier-4 chain. edssa-admin compliance-export --regime ai-act-art14 produces a regulator-grade evidence bundle for market-surveillance authorities, notified bodies, conformity-assessment audits, and the provider's QMS. Recipients verify independently with edssa-admin verify-anchor. No operator cooperation required after the archive handover.
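To illustrate why a recipient needs no operator cooperation, here is a generic Merkle inclusion check: given one record, a short list of sibling hashes and a published root, the recipient recomputes the root locally. This is not the implementation of edssa-admin verify-anchor, whose internals are not described here; the tree layout (an unpaired node is promoted unchanged) and all function names are assumptions.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()  # prefix separates leaves from nodes


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(records: list) -> list:
    """All tree levels, leaves first; an unpaired node is promoted unchanged."""
    if not records:
        raise ValueError("at least one record is required")
    levels = [[leaf_hash(r) for r in records]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = []
        for i in range(0, len(cur), 2):
            if i + 1 < len(cur):
                nxt.append(node_hash(cur[i], cur[i + 1]))
            else:
                nxt.append(cur[i])
        levels.append(nxt)
    return levels


def merkle_root(records: list) -> bytes:
    return build_levels(records)[-1][0]


def inclusion_proof(records: list, index: int) -> list:
    """Sibling hashes from leaf to root, each tagged with the sibling's side."""
    proof = []
    for level in build_levels(records)[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            side = "L" if sibling < index else "R"
            proof.append((side, level[sibling]))
        index //= 2
    return proof


def verify_inclusion(record: bytes, proof: list, root: bytes) -> bool:
    """Recompute the root from one record and its proof; no other records needed."""
    h = leaf_hash(record)
    for side, sibling in proof:
        h = node_hash(sibling, h) if side == "L" else node_hash(h, sibling)
    return h == root
```

The verifier needs only the record, the proof and the root, so the check can run entirely on the recipient's side once the archive has been handed over.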
Customers
Operators in this vertical.
“We had a coherent story for model evaluation. We did not have one that survived a regulator asking who oversaw this agent's decision and what tools it called before it acted. Now we do.”