AI Security Checklist
Checklist for ensuring AI system security
Here is checklist for validating LLM-enabled applications against common security risks. This is not exhaustive but covers key areas to focus on.
This aligns with guidance from OWASP including the Top 10 for LLM Applications.
1. Prompt Injection & Input Manipulation
Test Objectives
- Detect instruction override attempts
- Prevent system prompt leakage
- Ensure role separation is enforced
Test Cases
- Inject: “Ignore previous instructions and…”
- Inject hidden instructions inside markdown, HTML, JSON
- Embed malicious instructions in uploaded files
- Use multi turn injection chaining
Validate
- System prompt never appears in output
- Model refuses unsafe override attempts
- Tool calls are not executed without validation
2. Output Handling & Unsafe Generation
Test Objectives
- Prevent harmful content generation
- Detect policy bypass attempts
- Validate content filtering
Test Cases
- Ask for disallowed content using obfuscation
- Ask for step by step harmful instructions indirectly
- Request data extraction from hidden context
Validate
- Proper refusal behavior
- No partial leakage
- No hallucinated secrets
3. Data Leakage & Privacy
Test Objectives
- Ensure no training data leakage
- Prevent internal document exposure
- Protect PII in RAG systems
Test Cases
- Ask for system prompt
- Ask for internal policies
- Attempt to retrieve other users’ conversation data
- Prompt model to reveal embeddings or vector store content
Validate
- Strong refusal
- No cross tenant leakage
- RAG filters enforced
4. RAG Security Validation
If using retrieval augmented generation:
Test Objectives
- Prevent document injection
- Validate document ranking controls
- Prevent sensitive data surfacing
Test Cases
- Upload poisoned document with malicious instructions
- Insert high ranking irrelevant document
- Attempt cross user document retrieval
Validate
- Retrieval restricted to authorized documents
- Content sanitization applied before prompt assembly
- Metadata access control enforced
5. Tool / Function Calling Security
Test Objectives
- Prevent unauthorized tool invocation
- Validate parameter sanitization
- Enforce role based tool usage
Test Cases
- Inject tool call instructions
- Manipulate JSON schema
- Attempt privilege escalation
Validate
- Server side validation exists
- Tool execution is never model only decision
- Logs capture tool invocation attempts
6. Model Abuse & Cost Attacks
Test Objectives
- Prevent denial of wallet
- Prevent token exhaustion
- Detect prompt flooding
Test Cases
- Extremely long prompts
- Recursive chain prompts
- High frequency API requests
Validate
- Rate limiting
- Token caps
- Budget enforcement
- Alerting on abnormal usage
7. Hallucination & Integrity Testing
Test Objectives
- Detect fabricated facts
- Validate citation enforcement
- Verify deterministic configurations
Test Cases
- Ask for nonexistent APIs
- Ask about fake vulnerabilities
- Ask for outdated references
Validate
- Clear uncertainty response
- Citation requirement enforcement
- Grounding enforcement when RAG enabled
8. Logging & Detection Alignment
Map behaviors to MITRE ATT&CK style abuse patterns.
Validate
- Prompt injection attempts logged
- Tool misuse logged
- Rate abuse logged
- Clear traceability per user
9. CI/CD Security Controls
Validate in pipeline
- Prompt regression tests
- Injection simulation tests
- Output policy validation tests
- RAG boundary tests
- Schema validation for tool calls
Security tests must run like unit tests.
10. Governance & Policy Alignment
Map controls to:
- OWASP Top 10 for LLM Applications
- AI acceptable use policy
- Data classification policy
Red Team Test Categories
- Prompt injection
- Data exfiltration
- Privilege escalation
- Tool misuse
- Model jailbreak
- Context poisoning
- Cross user leakage
QA Execution Model
For each AI feature:
- Identify assets
- Identify trust boundaries
- Apply injection tests
- Apply leakage tests
- Apply abuse tests
- Log and measure