- Core concepts and practical building context
- Awareness of misuse patterns and safety boundaries
- Explain "threats follow the tool path" in your own words and apply it to a realistic scenario.
- Most agent threats are about untrusted input reaching powerful tools or sensitive data.
- Check the assumption "All input is untrusted" and explain what changes if it is false.
- Check the assumption "Tools are restricted" and explain what changes if it is false.
- Run one misuse scenario and write a defensive response.
- Describe one governance check before release.
- A security review note with threat, control, and owner
- Prompt injection. Hidden instructions in untrusted content change the agent's behaviour, often by telling it to ignore its rules.
- Data exfiltration. The agent leaks secrets through output, logs, or tool parameters.
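
The two assumptions and two threats above can be sketched as a single check placed in front of every tool call. This is a minimal illustration, not a complete defence: the tool names, the `gate_tool_call` helper, and the regular expressions are all hypothetical, and pattern matching alone will miss many real injections and secrets.

```python
import re
from dataclasses import dataclass

# Hypothetical policy: the only tools this agent may call (illustrative names).
ALLOWED_TOOLS = {"search_docs", "send_summary"}

# Illustrative secret patterns only; a real deployment needs a broader, tested set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),        # API-key-like token
    re.compile(r"(?i)password\s*[:=]\s*\S+"),  # inline password assignment
]

# Phrases that often signal an injected instruction. A heuristic, not a guarantee.
INJECTION_HINTS = re.compile(
    r"(?i)ignore (all |your )?(previous |prior )?(rules|instructions)"
)


@dataclass
class Decision:
    allowed: bool
    reason: str
    params: dict


def redact(text: str) -> str:
    """Replace anything that matches a secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def gate_tool_call(tool: str, params: dict, source_text: str) -> Decision:
    """Decide whether a tool call requested after reading source_text may run."""
    # "Tools are restricted": anything off the allowlist is refused outright.
    if tool not in ALLOWED_TOOLS:
        return Decision(False, f"tool '{tool}' is not on the allowlist", {})
    # "All input is untrusted": refuse when the input looks like an injection.
    if INJECTION_HINTS.search(source_text):
        return Decision(False, "input looks like a prompt injection", {})
    # Data exfiltration: scrub secret-like strings from tool parameters.
    clean = {k: redact(v) if isinstance(v, str) else v for k, v in params.items()}
    return Decision(True, "ok", clean)
```

Calling `gate_tool_call("send_summary", {"body": draft}, page_text)` before dispatching the tool gives one place to log the threat, the control that fired, and the reason, which is roughly the information a security review note needs.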