Agent-Shield
AI Agent Security
Authorization
Open Source

AgentLock: The Open Authorization Standard Every AI Agent Needs

David Grice · April 3, 2026 · 12 min read

Every operating system has a permission model. Every database has access control. Every API gateway has auth middleware. AI agents have nothing. AgentLock fixes that.


Open any AI agent framework today and look at how tool calls work. LangChain, CrewAI, AutoGen, the Model Context Protocol: the pattern is the same everywhere. The agent decides to call a tool. The framework calls the function. The function executes with whatever permissions the host process has. There is no identity check, no scope constraint, no rate limit, no audit record. The agent's decision to call a tool is treated as sufficient authorization to execute it.

This is the Full Permission anti-pattern, and it is the default in every major agent framework shipping today. It means that a successful prompt injection does not just change what the model says. It changes what the model does. An attacker who compromises the conversation controls the tools, because there is nothing between the model's intent and the tool's execution.

AgentLock is an open-source authorization framework that puts a gate between the agent and its tools. It is deny-by-default, identity-bound, tool-level, and framework-agnostic. Every call gets a single-use token. Every execution generates an audit record. The agent never sees credentials and never holds authorization state. It is, in short, what OAuth did for web APIs, applied to AI agent tool calls.
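The single-use token mechanism can be sketched roughly as follows. This is an illustrative pattern, not AgentLock's actual API; the class and method names are invented for the example:

```python
import hmac
import secrets
import time

class ExecutionTokenStore:
    """Illustrative single-use, time-limited, operation-bound tokens.
    A sketch of the pattern described above, not AgentLock's real API."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self._live = {}  # token -> (bound tool name, expiry time)

    def issue(self, tool_name: str) -> str:
        # Bound to one operation, expires quickly, usable exactly once.
        token = secrets.token_urlsafe(32)
        self._live[token] = (tool_name, time.monotonic() + self.ttl)
        return token

    def consume(self, token: str, tool_name: str) -> bool:
        entry = self._live.pop(token, None)  # pop => single use
        if entry is None:
            return False
        bound_tool, expiry = entry
        return hmac.compare_digest(bound_tool, tool_name) and time.monotonic() < expiry
```

In the three-layer model this store would live in the gate: the token goes straight to the execution layer, and the agent never handles it.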

The Permission Gap

The security model of traditional software is built on the principle that code execution and authorization are separate concerns. A web application does not let an HTTP request bypass authentication just because the request is well-formed. A database does not skip access checks just because the query is syntactically valid. These systems enforce permissions at the infrastructure layer, not the content layer.

AI agents invert this. The model is both the request parser and the authorization authority. When an agent decides to call send_email or query_database, most frameworks treat that decision as the access check itself. There is no second layer asking whether this user, in this session, with this risk profile, should be allowed to execute this specific operation with these specific parameters.

The result is predictable. We run 182 injection tests across 35 categories as part of every AgentShield enterprise audit. Without any authorization gate, agents routinely fail 40 to 60 percent of them. The failures are not subtle: the model sends emails to attacker-controlled addresses, queries databases for other users' records, writes files to arbitrary paths. The attacks work because there is nothing to stop them except the model's own judgment, and adversarial prompts are specifically designed to compromise that judgment.

What AgentLock Does

AgentLock introduces a three-layer architecture that separates conversation, authorization, and execution into distinct security domains. Layer 1 is the agent: it reads messages, decides which tool to call, and passes that intent to the gate. Layer 2 is the authorization gate: it validates identity, checks role and scope, enforces rate limits, issues a single-use execution token, and generates an audit record. Layer 3 is the tool execution environment: it validates the token, runs the function within scoped boundaries, and applies data policy like PII redaction. The critical constraint is that the agent never receives the execution token. Layer 2 passes it directly to Layer 3. The agent gets only the result.

Every tool registered with AgentLock carries a declarative permissions block. This block defines the risk level, required authentication, allowed roles, scope constraints, rate limits, data classification, redaction policy, and audit settings. When the agent calls a tool, the gate evaluates the call against this block using a 7-step policy evaluation pipeline. The result is one of five decisions.

ALLOW

The call is authorized. A single-use, time-limited, operation-bound token is issued and the tool executes normally.

DENY

The call is not authorized. No token is issued. The agent receives a structured denial with the reason.

MODIFY

The call is authorized but the output must be transformed before the agent sees it. Built-in actions include redact_pii, restrict_domain, whitelist_path, and cap_records.

STEP_UP

The session state indicates elevated risk. The action pauses until a human approves it. Triggers include an elevated hardening severity combined with a high-risk tool call, multiple PII-returning tools already called in the session, or a previously denied tool being retried via a different high-risk tool.

DEFER

The context is ambiguous and the gate cannot make a confident decision. The action is suspended and resolves via human review or timeout (default 60 seconds, DENY on timeout). Triggers include a first-ever high/critical call with no session history, or a prompt scanner signal coinciding with a tool call in the same turn.

The distinction between these five decisions matters because real-world authorization is not binary. A legitimate support agent querying a database should get the data but not the raw SSNs (MODIFY). A new session making its first high-risk call might need human review because the gate has no behavioral history yet (DEFER). An admin who already triggered a prompt scanner alert should require explicit approval for the next sensitive operation (STEP_UP). Flattening all of these into ALLOW/DENY forces teams to choose between security and usability. AgentLock does not force that choice.
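The five-way outcome can be sketched as a deny-by-default evaluation function. The sketch below is a simplified illustration of the idea, not AgentLock's documented 7-step pipeline; the field names, check ordering, and `Session` shape are assumptions:

```python
from dataclasses import dataclass, field
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    MODIFY = "modify"
    STEP_UP = "step_up"
    DEFER = "defer"

@dataclass
class Session:
    hardening_elevated: bool = False
    has_history: bool = False          # any prior high-risk calls this session
    call_counts: dict = field(default_factory=dict)

def evaluate(tool: str, role: str, perms: dict, session: Session) -> Decision:
    """Illustrative deny-by-default evaluation returning one of five decisions."""
    # Unknown tools and unauthorized roles fail closed.
    if perms is None or role not in perms.get("allowed_roles", []):
        return Decision.DENY
    limit = perms.get("rate_limit", {}).get("max_calls")
    if limit is not None and session.call_counts.get(tool, 0) >= limit:
        return Decision.DENY
    # Elevated session risk plus a high-risk tool needs human approval.
    if session.hardening_elevated and perms["risk_level"] in ("high", "critical"):
        return Decision.STEP_UP
    # First-ever high/critical call with no history is ambiguous: defer.
    if perms["risk_level"] in ("high", "critical") and not session.has_history:
        return Decision.DEFER
    # Output transformations turn a plain ALLOW into MODIFY.
    if perms.get("transformations"):
        return Decision.MODIFY
    return Decision.ALLOW
```

The ordering matters: denials short-circuit before any token could be issued, and MODIFY is only reached once the call is otherwise authorized.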

How It Looks in Code

Registering a tool with DEFER and STEP_UP policies takes a single call. The permissions block is declarative: you define the conditions, and the gate handles enforcement, token issuance, and audit logging at runtime.

register_tools.py
from agentlock import AuthorizationGate, AgentLockPermissions
from agentlock.schemas import (
    DeferPolicyConfig, StepUpPolicyConfig,
    ModifyPolicyConfig, TransformationConfig
)

gate = AuthorizationGate()

# Tool that defers on first high-risk call,
# requires step-up if hardening is active,
# and redacts PII from all output.
gate.register_tool(
    "query_database",
    AgentLockPermissions(
        risk_level="high",
        requires_auth=True,
        allowed_roles=["admin", "support"],
        rate_limit={"max_calls": 10, "window_seconds": 3600},
        defer_policy=DeferPolicyConfig(
            enabled=True,
            timeout_seconds=60,
            timeout_action="deny",
        ),
        step_up_policy=StepUpPolicyConfig(
            enabled=True,
            triggers=["hardening_elevated"],
        ),
        modify_policy=ModifyPolicyConfig(
            enabled=True,
            transformations=[
                TransformationConfig(
                    field="output",
                    action="redact_pii",
                ),
            ],
        ),
    ),
)

When an agent calls query_database in this configuration, the gate checks the caller's role against the allowed list, verifies the rate limit window, and evaluates the session's risk state. If the session has no prior history with high-risk tools, the call is deferred for human review. If the session's hardening level is elevated, a step-up challenge is issued. If the call proceeds, the output is run through the PII redaction pipeline before the agent sees it. All of this happens in the gate. The agent sees only the final, redacted result.
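The redact_pii step can be approximated with a couple of patterns. This is a deliberately minimal sketch; AgentLock's actual redaction pipeline is not shown in this post, and a production redactor needs far broader coverage than two regexes:

```python
import re

# Illustrative patterns only: US-style SSNs and email addresses.
_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN REDACTED]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL REDACTED]"),
]

def redact_pii(text: str) -> str:
    """Replace matched PII spans before the result reaches the agent."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

The key design point is where this runs: in the execution layer, after the tool returns and before the agent sees anything, so the raw values never enter the conversation.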

What v1.2.0 Adds

Version 1.2.0, released March 30, 2026, introduces adaptive prompt hardening. When the gate detects suspicious activity through its signal detectors, it generates defensive system prompt instructions that the framework injects before the LLM processes the next turn. Session risk scores are monotonic (they only go up) and session-scoped. At critical severity (risk score 10 or higher), all high and critical risk tools are blocked regardless of role authorization.
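A monotonic, session-scoped risk score is a small amount of code. The sketch below illustrates the two properties described above (scores only go up, and critical severity blocks high/critical tools); the class and method names are invented for the example:

```python
class SessionRisk:
    """Illustrative monotonic, session-scoped risk score (names are assumptions)."""

    CRITICAL = 10  # at or above this, high/critical tools are blocked

    def __init__(self):
        self.score = 0

    def raise_to(self, new_score: int) -> None:
        # Monotonic: within a session, the score can only go up.
        self.score = max(self.score, new_score)

    def blocks(self, risk_level: str) -> bool:
        # Blocking applies regardless of role authorization.
        return self.score >= self.CRITICAL and risk_level in ("high", "critical")
```

Monotonicity is the point: an attacker who triggers a detector cannot talk the session back down to a lower risk level in later turns.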

The detection layer runs four signal detectors in parallel. The PromptScanner runs before the LLM processes a message, checking for injection phrases, authority claims, instruction planting, encoding-based evasion, and cross-turn repetition. The VelocityDetector tracks tool call frequency and topic shifts, firing on rapid calls (three or more in 60 seconds), risk-level jumps, and burst patterns. The ComboDetector watches for suspicious tool call sequences, covering 13 default suspicious pairs and 5 default sequences associated with data exfiltration, account takeover, and tool chain attacks. The EchoDetector checks LLM responses for attack prompt echoing, tool name disclosure, system prompt leakage, and credential-format strings.

When multiple signal types co-occur, compound scoring adds bonus weight. A velocity signal combined with a combo signal adds +2 as a rapid_exfil indicator. An echo signal combined with an injection signal adds +3 as a probing_attack indicator. The hardening directives are specific to the detected signal type: a format forcing attack gets format-specific instructions, not generic tool-blocking language.
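Compound scoring can be sketched as a lookup over co-occurring signal types. The bonus values match the ones described above (+2 rapid_exfil, +3 probing_attack); the base weights and function shape are invented for the example:

```python
# Bonuses applied when both signal types in a pair are present.
COMPOUND_BONUSES = {
    frozenset({"velocity", "combo"}): ("rapid_exfil", 2),
    frozenset({"echo", "injection"}): ("probing_attack", 3),
}

def score_signals(signals: dict[str, int]) -> tuple[int, list[str]]:
    """Sum base signal weights, then add compound bonuses for co-occurrence."""
    total = sum(signals.values())
    indicators = []
    present = set(signals)
    for pair, (name, bonus) in COMPOUND_BONUSES.items():
        if pair <= present:  # both signal types fired this turn
            total += bonus
            indicators.append(name)
    return total, indicators
```

The named indicators, not just the raw score, are what allow the hardening directives to be specific to the detected attack pattern.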

Benchmark Results

We benchmarked AgentLock using AgentShield's 182-test enterprise attack suite across 35 categories covering the OWASP LLM Top 10. The target was a Gemini 2.5 Flash-Lite agent with five tools: send_email, query_database, read_file, write_file, and search_contacts. Each version was tested under identical conditions.

No AgentLock (baseline): 45/F
AgentLock v1.1.2: 66/D
Injection pass rate (v1.1.2): 93.4%

Without AgentLock, the agent had 73 injection failures and a 56% pass rate. With v1.1.2, injection failures dropped to 12 with a 93.4% pass rate. Critical findings went from 74 to 13. Categories with failures went from 25 to 7. PII detection scored 100/A and data flow scored 97/A.

The most striking result came from the compromised admin scenario, which simulates an attacker who has already gained admin-level access and is attempting to leverage agent tools for lateral movement. Without prompt hardening, the pass rate was 30.2% (Grade F). With v1.2.0's adaptive hardening enabled, the pass rate jumped to 81.3% (Grade B). That is a 51-point improvement on the hardest scenario in the test suite.

Across all AgentLock versions, 17 categories achieved a perfect 100/A with zero failures, including MTM, CM, IPI, MLB, PH, IOH, OR, EA, OM, RAG, SC, SG, TCA, TP, and MAC. The 12 remaining failures in v1.1.2 cluster into two groups: 9 are model-layer information leakage, where the agent confirms the existence of system prompts or tools while correctly refusing to share details, and 3 are behavioral edge cases involving objective shift, partial compliance with planted instructions, and format forcing bypass.

AARM Conformance

The Agent Authorization Reference Model (AARM) defines nine requirements for agent-level access control. AgentLock maps to each of them. Seven are fully shipped in v1.2.0, and two are designed with implementations planned for upcoming releases.

R1. Action mediation: Shipped. Every tool call passes through the authorization gate before execution.
R2. Context accumulation: Shipped. Session-scoped risk scores, trust degradation, and signal history tracked across turns.
R3. Policy engine: Shipped. 7-step evaluation pipeline with deny-by-default, role checks, rate limits, and scope constraints.
R4. Five decision types: Shipped. ALLOW, DENY, MODIFY, STEP_UP, and DEFER with structured responses.
R5. Signed receipts: Designed. Ed25519 signed audit receipts planned for v1.2.1.
R6. Identity attribution: Shipped. Every call tied to a verified identity; the agent cannot assert its own identity.
R7. Drift detection: Shipped. Trust degradation, adaptive prompt hardening, and behavioral signal detection.
R8. SIEM export: Designed. Structured export for Splunk, Sentinel, and Elastic planned for v1.2.3.
R9. Least privilege: Shipped. Per-tool permission scoping, rate limits, data boundaries, and deny-by-default enforcement.

Roadmap

AgentLock ships fast. Four versions in two weeks from March 18 to March 30, each adding a distinct capability layer. The test suite grew from 267 to 745 tests with zero regressions. The roadmap is public and tied to specific capabilities, not dates.

1.2

Current Stable

Adaptive hardening, MODIFY/DEFER/STEP_UP decisions, four signal detectors, compound scoring, 745 tests. This is the version you install today.

1.2.1

Ed25519 Signed Receipts

Every audit record gets a cryptographic signature. Receipts become tamper-evident and independently verifiable without access to the gate.

1.2.2

Delegation Chains

Multi-agent systems where Agent A delegates to Agent B. The chain tracks authorization provenance across agent boundaries with scope narrowing at each hop.
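Scope narrowing at each hop amounts to intersecting scope sets, so delegated authority can only shrink as it moves down the chain. A minimal sketch of that invariant (this feature is not yet shipped, so the function and scope names here are speculative):

```python
def narrow_scope(parent_scopes: set[str], requested: set[str]) -> set[str]:
    """A delegated agent can hold at most the intersection of what it requests
    and what its delegator holds; scopes can only shrink at each hop."""
    return parent_scopes & requested

# Hypothetical chain: Agent A delegates to B, which delegates to C.
agent_a = {"email:send", "db:read", "files:read"}
agent_b = narrow_scope(agent_a, {"db:read", "files:write"})  # files:write dropped
agent_c = narrow_scope(agent_b, {"db:read", "email:send"})   # email:send dropped
```

The provenance-tracking side (which hop granted what, and why) is the part that needs the audit machinery; the narrowing rule itself is just set intersection.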

1.2.3

SIEM Export

Structured event export to Splunk, Microsoft Sentinel, and Elastic. Agent authorization events flow into the same dashboards and alerting pipelines SOCs already use for infrastructure security.

Try It

AgentLock is Apache 2.0 licensed and installs with pip. The core library has zero framework dependencies. Integrations exist for LangChain, CrewAI, AutoGen, MCP, FastAPI, and Flask, each available as an optional extra.

terminal
# Core install
$ pip install agentlock

# With framework integration
$ pip install agentlock[langchain]
$ pip install agentlock[fastapi]

# Initialize a tool definition
$ agentlock init

# Validate against schema
$ agentlock validate tool.json

The interactive demo at agentlock.dev walks through the five decision types with a live agent. The full source, specification, and benchmark data are at github.com/webpro255/agentlock. The package is on PyPI.

The central finding from every benchmark we have run is consistent: adversarial and legitimate tool requests are semantically identical. Content-based detection cannot reliably distinguish them. The correct defense is architectural access control, not smarter AI-based detection. AgentLock is that architecture.

Test Your Agent With and Without AgentLock

Run AgentShield's 182-test enterprise audit suite against your own agent. See exactly where your tools are exposed and how authorization changes the results.


David Grice is the founder of AgentShield and the creator of AgentLock. He builds security tooling for AI agents and maintains active vulnerability research programs with major AI vendors. Follow his work on LinkedIn and GitHub.