AI Defense Lab
Your AI Security Learning Platform
Learning Modules
Threat Detection Lab
Learn how AI attacks are detected in real time using pattern matching and heuristics
Key Concept: Detectors scan for known attack patterns like instruction overrides, data exfiltration commands, and encoded payloads.
Policy Design Lab
Build rules that decide what to block, allow, or flag for review
Key Concept: Policy = Rules + Actions. Each rule maps detector findings to a decision: ALLOW, BLOCK, or REQUIRE_APPROVAL.
RAG Security Lab
Understand how retrieval-augmented generation can be poisoned and defended
Key Concept: Trust levels protect knowledge bases. Low-trust sources receive extra scrutiny from poisoning detectors.
Agent Attack Lab
See how multi-stage AI agents can be attacked at each pipeline stage
Key Concept: Defense-in-depth at every stage. Each step in the agent pipeline is a potential injection point.
Audit Trail Explorer
Every detection creates a traceable, immutable audit trail
Key Concept: Immutable audit trails for accountability. Traces enable replay, debugging, and compliance reporting.
Your Progress
—
Total Runs
—
Threats Blocked
—
Active Policies
—
Detection Rate
OWASP LLM Top 10 Coverage
| ID | Name | Description | Coverage |
|---|---|---|---|
| LLM01 | Prompt Injection | An attacker crafts input that overrides the model's system instructions, causing unintended behavior. | Covered (2 detectors) |
| LLM02 | Insecure Output Handling | LLM output is used in downstream systems without validation, enabling code execution or data leaks. | Covered (1 detector) |
| LLM03 | Training Data Poisoning | Malicious data is injected into training or retrieval sources, causing the model to produce harmful outputs. | Covered (1 detector) |
| LLM04 | Model Denial of Service | Attackers craft inputs that consume excessive resources, causing the model to become unresponsive. | Not yet covered |
| LLM05 | Supply Chain Vulnerabilities | Compromised components in the LLM supply chain (models, plugins, data) introduce security risks. | Not yet covered |
| LLM06 | Sensitive Information Disclosure | The LLM reveals confidential data from its training data, system prompts, or connected systems. | Covered (2 detectors) |
| LLM07 | Insecure Plugin Design | LLM plugins accept unchecked input or have excessive permissions, creating attack surfaces. | Covered (1 detector) |
| LLM08 | Excessive Agency | The LLM has too much autonomy or access, enabling it to take harmful actions without oversight. | Covered (2 detectors) |
| LLM09 | Overreliance | Users trust LLM outputs without verification, leading to misinformation or flawed decisions. | Not yet covered |
| LLM10 | Model Theft | Unauthorized access to the LLM model weights, parameters, or fine-tuning data. | Not yet covered |