
An agentic AI that can send email, edit a CRM, move money, or run shell commands is a security surface, not a chatbot. The blast radius of a jailbreak or prompt injection scales with whatever tools you wired up. This is the 4-layer pattern we apply on every agent build.

Layer 1 · Capability scoping

The agent gets only the minimum set of tools it needs. Not a general 'send_email', but 'send_email_to_customer', where 'customer' is limited to the current user's account. Design each tool to be as narrow as possible; any tool that could 'do anything' is scoped wrong.
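A minimal sketch of that narrowing, assuming a hypothetical `Account` record and an injected `send` callable (both are illustrative names, not part of any real library). The tool accepts a customer ID rather than a raw address and resolves it inside the current user's account, so an injected address fails closed:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass(frozen=True)
class Account:
    """Hypothetical per-user scope: only the customers this account owns."""
    account_id: str
    customers: Dict[str, str] = field(default_factory=dict)  # customer_id -> email


def make_send_email_to_customer(account: Account, send: Callable[[str, str], None]):
    """Build a tool bound to one account; the model never supplies raw addresses."""

    def send_email_to_customer(customer_id: str, body: str) -> str:
        # Resolve the recipient inside the account scope; unknown IDs fail closed.
        address = account.customers.get(customer_id)
        if address is None:
            raise PermissionError(
                f"customer {customer_id!r} is not in account {account.account_id}"
            )
        send(address, body)
        return address

    return send_email_to_customer
```

Binding the account when the tool is constructed means the model has no argument through which it could widen the scope.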

Layer 2 · User-backed authorization

Every tool wrapper re-authorizes the current user. The LLM can call the tool, but the tool itself checks 'is this user allowed to do X to Y?'. Treat the LLM as untrusted caller logic, always.

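import hashlib

def safe_send_email(to: str, body: str, ctx: UserCtx):
    # The LLM picked these arguments, so check them against the user's rights.
    if to not in ctx.allowed_recipients:
        raise PermissionError(f"recipient {to} not authorized")
    if len(body) > MAX_BODY:
        raise ValueError("body too long")
    # Log a stable SHA-256 digest; built-in hash() of a str varies per process.
    body_digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    audit_log(ctx.user_id, "send_email", to, body_digest)
    # `from` is a reserved word in Python; `sender` is an assumed parameter name.
    return email.send(to=to, body=body, sender=ctx.user_email)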

Layer 3 · Per-call audit logging

Log every tool call: who, what, when, why (the prompt that triggered it), and how much (tokens, dollars). Keep at least 90 days of retention for compliance and debugging. You will need these logs at the first incident, and finding out at that moment that they don't exist is the worst-case day.
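One way to capture those fields is a single JSON line per tool call. The sketch below is an assumption about shape, not a prescribed schema: `AuditRecord` and `write_audit` are hypothetical names, and the sink is any writable text stream (a file opened in append mode in practice, an in-memory buffer here).

```python
import io
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import TextIO


@dataclass(frozen=True)
class AuditRecord:
    user_id: str      # who
    tool: str         # what
    args_digest: str  # what, without storing the raw payload
    prompt: str       # why: the prompt that triggered the call
    tokens: int       # how much
    cost_usd: float   # how much
    timestamp: str    # when, ISO 8601 in UTC


def write_audit(sink: TextIO, user_id: str, tool: str, args_digest: str,
                prompt: str, tokens: int, cost_usd: float) -> AuditRecord:
    """Append one JSON line per tool call to the given sink."""
    record = AuditRecord(user_id, tool, args_digest, prompt, tokens, cost_usd,
                         datetime.now(timezone.utc).isoformat())
    sink.write(json.dumps(asdict(record)) + "\n")
    return record


# Usage with an in-memory sink; the values are made up for illustration.
sink = io.StringIO()
write_audit(sink, "user-1", "send_email", "ab12cd", "email the invoice", 120, 0.002)
```

Retention (the 90-day minimum) is a property of wherever the sink ends up being stored, so it is deliberately left out of this sketch.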

Layer 4 · Circuit breakers

Rate limit per user. Rate limit per tool. Add a global kill switch for misbehaviour. When the mean time to revoke (MTTR) a compromised agent is under 10 minutes, the damage budget is bounded by what the rate limits allow in that window. When it's an hour, it isn't.
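A rough sketch of how these pieces can fit together, under stated assumptions: `CircuitBreaker` is a hypothetical class, it keeps its state in process memory, and for brevity it keys one sliding window on the (user, tool) pair. Separate per-user and per-tool budgets would simply be two such windows checked in sequence.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class CircuitBreaker:
    """Sliding-window limit per (user, tool) plus a global kill switch."""

    def __init__(self, max_calls: int, window_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock  # injectable so tests can control time
        self.killed = False
        self._calls: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def kill(self) -> None:
        """Block every subsequent tool call, for every user."""
        self.killed = True

    def check(self, user_id: str, tool: str) -> None:
        """Raise if the call must be blocked; otherwise record it."""
        if self.killed:
            raise RuntimeError("kill switch engaged: all tool calls are blocked")
        now = self.clock()
        calls = self._calls[(user_id, tool)]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()
        if len(calls) >= self.max_calls:
            raise RuntimeError(f"rate limit hit for {user_id} on {tool}")
        calls.append(now)
```

Each tool wrapper would call `check(...)` before doing any work, so flipping `kill()` takes effect on the very next call. In a multi-process deployment the flag and counters would need to live in shared storage, which this sketch does not cover.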

If you can't answer 'what can the agent do that would materially hurt the business?' in under 60 seconds, you haven't scoped the tools tightly enough. Start by reducing the surface, not by adding guardrails to a broad one.

By Dezső Mező

Founder, DField Solutions

I've shipped production products from fintech to creator-tooling · for startups and enterprises, from Budapest to San Francisco.
