AI systems are attack surfaces. Every prompt is potential input injection. Every model response is potential data leakage. Every tool call is potential privilege escalation. If you're building AI applications without security as a foundational concern, you're building a breach waiting to happen.
Here's how to do it right.
The AI Security Threat Model
Traditional applications have well-understood threat models. AI applications introduce new attack vectors:
| Attack Vector | Traditional App | AI Application |
|---|---|---|
| Input injection | SQL injection | Prompt injection |
| Data leakage | Database exposure | Training data extraction |
| Privilege escalation | Auth bypass | Tool permission abuse |
| Denial of service | Resource exhaustion | Infinite loops, token bombing |
Layer 1: Input Sanitization
Every input to an AI system must be validated and sanitized:
{
(: ): <> {
(.(input)) {
()
}
truncated = input.(, )
encoded = .(truncated)
{
: ,
: input.,
:
}
}
(: ): {
patterns = [
,
,
,
,
]
patterns.( p.(input))
}
}
This isn't foolproof—prompt injection is fundamentally hard to prevent—but it raises the bar significantly.
Layer 2: Output Validation
Never trust model outputs. Validate everything before acting on it:
{
(: ): <> {
parsed = .(output)
validated = .(parsed)
( .(validated)) {
.(validated)
}
(validated.) {
.(validated.)
}
validated
}
(: ): <> {
sensitivePatterns = [
,
,
,
]
str = .(output)
sensitivePatterns.( p.(str))
}
}
Layer 3: Tool Permission System
AI agents with tools are especially dangerous. Implement strict permission controls:
{
:
: []
: []
:
:
}
{
: <, []>
(
: ,
: ,
: ,
:
): <> {
agentPerms = ..(agentId) || []
toolPerm = agentPerms.( p. === tool)
(!toolPerm) {
{ : , : }
}
(!toolPerm..(action)) {
{ : , : }
}
(!toolPerm..( p.(resource))) {
{ : , : }
}
( .(agentId, tool)) {
{ : , : }
}
(toolPerm.) {
{ : , : , : }
}
{ : }
}
}
Key principles:
- Deny by default: Agents have no permissions until explicitly granted
- Least privilege: Grant minimum permissions needed
- Human in the loop: Require approval for sensitive operations
Layer 4: Model Isolation
Run models in isolated environments:
{
(
: ,
: ,
:
): <> {
sandbox = .({
: ,
: ,
: ,
: ,
:
})
{
output = sandbox.( () => {
model.(input, context)
})
output
} {
sandbox.()
}
}
}
This prevents:
- Models accessing unauthorized resources
- Infinite loops consuming unbounded resources
- Side-channel attacks through filesystem/network
Layer 5: Audit Logging
Log everything for forensic analysis:
{
:
:
:
?:
: {
:
:
:
}
: {
:
: []
:
}
: {
:
:
:
:
: | |
}[]
:
}
Use these logs for:
- Real-time threat detection
- Post-incident forensics
- Compliance reporting
- Model behavior analysis
Defense in Depth
No single layer is sufficient. Stack them:
User Input
│
▼
┌───────────────────────┐
│ Input Sanitization │ ← Layer 1
└───────────────────────┘
│
▼
┌───────────────────────┐
│ Model Isolation │ ← Layer 4
└───────────────────────┘
│
▼
┌───────────────────────┐
│ Output Validation │ ← Layer 2
└───────────────────────┘
│
▼
┌───────────────────────┐
│ Tool Permissions │ ← Layer 3
└───────────────────────┘
│
▼
┌───────────────────────┐
│ Audit Logging │ ← Layer 5
└───────────────────────┘
│
▼
Safe Output
The Hard Truth
Perfect security is impossible. Prompt injection, in particular, is an unsolved problem—there's no foolproof way to distinguish "data" from "instructions" when everything is natural language.
But "hard" doesn't mean "don't try." These layers dramatically reduce your attack surface and make successful attacks much harder and more detectable.
Build AI systems like you're building banking software. Because increasingly, you are.