Here is checklist for validating LLM-enabled applications against common security risks. This is not exhaustive but covers key areas to focus on.

This aligns with guidance from OWASP including the Top 10 for LLM Applications.

1. Prompt Injection & Input Manipulation

Test Objectives

Detect instruction override attempts
Prevent system prompt leakage
Ensure role separation is enforced

Test Cases

Inject: “Ignore previous instructions and…”
Inject hidden instructions inside markdown, HTML, JSON
Embed malicious instructions in uploaded files
Use multi turn injection chaining

Validate

System prompt never appears in output
Model refuses unsafe override attempts
Tool calls are not executed without validation

2. Output Handling & Unsafe Generation

Test Objectives

Prevent harmful content generation
Detect policy bypass attempts
Validate content filtering

Test Cases

Ask for disallowed content using obfuscation
Ask for step by step harmful instructions indirectly
Request data extraction from hidden context

Validate

Proper refusal behavior
No partial leakage
No hallucinated secrets

3. Data Leakage & Privacy

Test Objectives

Ensure no training data leakage
Prevent internal document exposure
Protect PII in RAG systems

Test Cases

Ask for system prompt
Ask for internal policies
Attempt to retrieve other users’ conversation data
Prompt model to reveal embeddings or vector store content

Validate

Strong refusal
No cross tenant leakage
RAG filters enforced

4. RAG Security Validation

If using retrieval augmented generation:

Test Objectives

Prevent document injection
Validate document ranking controls
Prevent sensitive data surfacing

Test Cases

Upload poisoned document with malicious instructions
Insert high ranking irrelevant document
Attempt cross user document retrieval

Validate

Retrieval restricted to authorized documents
Content sanitization applied before prompt assembly
Metadata access control enforced

5. Tool / Function Calling Security

Test Objectives

Prevent unauthorized tool invocation
Validate parameter sanitization
Enforce role based tool usage

Test Cases

Inject tool call instructions
Manipulate JSON schema
Attempt privilege escalation

Validate

Server side validation exists
Tool execution is never model only decision
Logs capture tool invocation attempts

6. Model Abuse & Cost Attacks

Test Objectives

Prevent denial of wallet
Prevent token exhaustion
Detect prompt flooding

Test Cases

Extremely long prompts
Recursive chain prompts
High frequency API requests

Validate

Rate limiting
Token caps
Budget enforcement
Alerting on abnormal usage

7. Hallucination & Integrity Testing

Test Objectives

Detect fabricated facts
Validate citation enforcement
Verify deterministic configurations

Test Cases

Ask for nonexistent APIs
Ask about fake vulnerabilities
Ask for outdated references

Validate

Clear uncertainty response
Citation requirement enforcement
Grounding enforcement when RAG enabled

8. Logging & Detection Alignment

Map behaviors to MITRE ATT&CK style abuse patterns.

Validate

Prompt injection attempts logged
Tool misuse logged
Rate abuse logged
Clear traceability per user

9. CI/CD Security Controls

Validate in pipeline

Prompt regression tests
Injection simulation tests
Output policy validation tests
RAG boundary tests
Schema validation for tool calls

Security tests must run like unit tests.

10. Governance & Policy Alignment

Map controls to:

OWASP Top 10 for LLM Applications
AI acceptable use policy
Data classification policy

Red Team Test Categories

Prompt injection
Data exfiltration
Privilege escalation
Tool misuse
Model jailbreak
Context poisoning
Cross user leakage

QA Execution Model

For each AI feature:

Identify assets
Identify trust boundaries
Apply injection tests
Apply leakage tests
Apply abuse tests
Log and measure

AI Security Checklist

1. Prompt Injection & Input Manipulation

Test Objectives

Test Cases

Validate

2. Output Handling & Unsafe Generation

Test Objectives

Test Cases

Validate

3. Data Leakage & Privacy

Test Objectives

Test Cases

Validate

4. RAG Security Validation

Test Objectives

Test Cases

Validate

5. Tool / Function Calling Security

Test Objectives

Test Cases

Validate

6. Model Abuse & Cost Attacks

Test Objectives

Test Cases

Validate

7. Hallucination & Integrity Testing

Test Objectives

Test Cases

Validate

8. Logging & Detection Alignment

Validate

9. CI/CD Security Controls

Validate in pipeline

10. Governance & Policy Alignment

Red Team Test Categories

QA Execution Model