
Security incident · September 2022
In September 2022, Optus, Australia's second-largest telecommunications provider, disclosed a data breach affecting 9.8 million current and former customers. The exposed data included names, dates of birth, phone numbers, email addresses, and in many cases, passport and driver's licence numbers.
The root cause was an API endpoint that was internet-facing and required no authentication. The endpoint had presumably been used for internal testing and was never properly decommissioned or secured. An attacker simply incremented a customer ID parameter in the URL, retrieving a new customer record with each request. No exploitation of a software vulnerability was required: the API worked exactly as designed.
This breach is directly relevant to AI agent development. An agent with a "fetch customer details" tool that calls an insufficiently authenticated API presents the same attack surface in a new wrapper. The principles that would have prevented the Optus breach (authentication on every endpoint, validated inputs, and minimal data returned per call) are the same ones that secure agent API integrations.
An AI agent with API integration capabilities is only as secure as the APIs it connects to. If your agent calls a poorly secured endpoint, what does that mean for the data it retrieves and the actions it takes?
MCP standardises tool integration; raw API integration gives you maximum flexibility. This module covers authentication, rate limiting, error handling, and retry logic for connecting agents to any HTTP API.
With the learning outcomes established, this module begins by examining REST API calls from agent tools.
REST APIs use HTTP methods (GET, POST, PUT, DELETE, PATCH) to perform operations on resources identified by URLs. Agent tool functions call these APIs and return the result as a dictionary that the agent can reason about. Use httpx rather than the older requests library: httpx has native async support, which matters when tool calls run concurrently in a multi-agent system.
Every HTTP client in a tool function needs an explicit timeout. Without one, a slow or unresponsive API can stall the agent loop, holding the task and blocking any queued requests. httpx applies a 5-second default, but clients such as requests wait indefinitely unless told otherwise, so state the value rather than relying on a default. Set timeouts between 10 and 30 seconds depending on the expected response time of the API.
import os
from typing import Optional

import httpx

BASE_URL = "https://api.example.com/v1"

async def get_customer(customer_id: str) -> dict:
    """Fetch customer details from the CRM API."""
    api_key = os.getenv("CRM_API_KEY")
    async with httpx.AsyncClient(timeout=10.0) as client:
        response = await client.get(
            f"{BASE_URL}/customers/{customer_id}",
            headers={
                "Authorization": f"Bearer {api_key}",
                "Accept": "application/json"
            }
        )
        response.raise_for_status()  # raises on 4xx/5xx
        return response.json()

“API security must be a priority for any organisation exposing data through programmatic interfaces. Broken authentication and lack of proper authorisation controls remain the leading causes of API data exposure.”
OWASP API Security Top 10, 2023 - API1:2023 Broken Object Level Authorisation
The Optus breach is a textbook OWASP API1 violation: an API that returned objects (customer records) without checking whether the requestor was authorised to access that specific object. Agent tools calling APIs must verify that authentication is present and that the API enforces object-level authorisation.
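The "validated inputs" and "minimal data returned per call" principles can be applied inside the tool itself. The following is a minimal sketch, not a definitive implementation: the ID format (1 to 12 digits), the `ALLOWED_FIELDS` allow-list, and both helper names are assumptions made for illustration. Client-side checks like these complement the API's own object-level authorisation; they do not replace it.

```python
import re

# Assumption: customer IDs are 1-12 digit strings; adjust to the real ID format.
CUSTOMER_ID_PATTERN = re.compile(r"[0-9]{1,12}")

# Assumption: the agent only needs these fields; everything else is dropped.
ALLOWED_FIELDS = ("id", "name", "email")


def validate_customer_id(customer_id: str) -> str:
    """Reject anything that is not a plain numeric ID before it reaches the URL."""
    if not isinstance(customer_id, str) or not CUSTOMER_ID_PATTERN.fullmatch(customer_id):
        raise ValueError("customer_id must be 1-12 digits")
    return customer_id


def minimal_customer_view(record: dict) -> dict:
    """Return only the allow-listed fields from a full API record."""
    return {key: record[key] for key in ALLOWED_FIELDS if key in record}
```

A tool such as get_customer could call validate_customer_id before building the URL and pass the parsed response through minimal_customer_view before returning it, so passport or licence numbers never enter the model's context unless they are explicitly allow-listed.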
With an understanding of REST API calls from agent tools in place, the discussion can now turn to GraphQL from agent tools, which builds directly on these foundations.
GraphQL is a query language for APIs developed by Facebook (now Meta) that lets clients request exactly the data they need. Unlike REST, a single GraphQL endpoint handles all operations. This reduces over-fetching: instead of receiving a full customer object when you only need the email address, you specify the fields you want.
One GraphQL-specific behaviour catches many developers: GraphQL servers typically return HTTP 200 even when the query fails. A query with a syntax error, an invalid variable, or an authorisation failure often comes back as HTTP 200 with an errors array in the response body. An agent tool that only checks the HTTP status code will incorrectly treat these failures as successes.
async def graphql_query(query: str, variables: Optional[dict] = None) -> dict:
    token = os.getenv("INTERNAL_API_TOKEN")
    async with httpx.AsyncClient(timeout=15.0) as client:
        response = await client.post(
            "https://api.example.com/graphql",
            json={"query": query, "variables": variables or {}},
            headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
        )
        response.raise_for_status()
        data = response.json()
        # GraphQL returns 200 even for errors; always check this field
        if "errors" in data:
            return {"error": "GraphQL errors", "details": data["errors"]}
        return data.get("data", {})

Common misconception
“If an API returns HTTP 200, the request succeeded and the data is valid.”
Many GraphQL servers return HTTP 200 whether or not the query succeeded, so check the 'errors' field in the response body. Some REST APIs also return HTTP 200 with error information in the body (especially older or inconsistently designed APIs). Always validate the response structure, not just the status code.
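A GraphQL response can also carry both data and errors at once, when some fields resolve and others fail. The sketch below shows one way to classify a parsed response body into full success, partial result, or failure. The function name and the shape of the returned dictionary are assumptions for illustration, not part of any library.

```python
def interpret_graphql_response(body: dict) -> dict:
    """Classify a parsed GraphQL response body for the agent."""
    errors = body.get("errors") or []
    data = body.get("data")
    # Keep only the human-readable message from each error entry.
    messages = [
        err.get("message", "Unknown GraphQL error")
        for err in errors
        if isinstance(err, dict)
    ]
    if errors:
        return {
            "success": False,
            "partial": data is not None,  # some fields resolved, others failed
            "data": data,
            "errors": messages,
        }
    return {"success": True, "partial": False, "data": data, "errors": []}
```

Flagging partial results separately lets the agent decide whether the fields that did resolve are enough to continue, instead of discarding the whole response.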
With an understanding of GraphQL from agent tools in place, the discussion can now turn to authentication patterns, which builds directly on these foundations.
API key authentication is the simplest pattern: include the key in an HTTP header on every request. Always load keys from environment variables, never hardcode them in the tool function or the tool schema. The tool schema is visible to the model; anything hardcoded there becomes part of the context window.
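One detail worth handling when loading keys from the environment: os.getenv returns None when the variable is unset, so an f-string header would silently send "Bearer None". A minimal sketch of a fail-fast loader follows; the helper names load_api_key and auth_headers are hypothetical, and CRM_API_KEY is simply the variable name used in this module's earlier example.

```python
import os


def load_api_key(env_var: str) -> str:
    """Read an API key from the environment and fail fast if it is missing."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        # Name the variable, never the value, in the error message.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return value


def auth_headers(env_var: str = "CRM_API_KEY") -> dict:
    """Build request headers at call time so the key never appears in the tool schema."""
    return {
        "Authorization": f"Bearer {load_api_key(env_var)}",
        "Accept": "application/json",
    }
```

Failing at load time turns a confusing 401 from the remote API into a clear configuration error on your side.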
OAuth 2.0 (defined in RFC 6749) is the standard for server-to-server authentication where your agent acts as a service. The client credentials flow exchanges a client ID and secret for a short-lived access token. The token expires; your implementation must refresh it before expiry, not after the first 401 response.
import time

class OAuthClient:
    def __init__(self, token_url: str, client_id: str, client_secret: str):
        self.token_url = token_url
        self.client_id = client_id
        self.client_secret = client_secret
        self._access_token: Optional[str] = None
        self._token_expires_at: float = 0

    async def _refresh_token(self) -> None:
        # Explicit timeout, as for every other HTTP client in a tool function
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.post(self.token_url, data={
                "grant_type": "client_credentials",
                "client_id": self.client_id,
                "client_secret": self.client_secret,
                "scope": "read write"
            })
            response.raise_for_status()
            token_data = response.json()
            self._access_token = token_data["access_token"]
            # Subtract 60 seconds buffer to refresh before expiry
            self._token_expires_at = time.time() + token_data["expires_in"] - 60

    async def get_token(self) -> str:
        if not self._access_token or time.time() > self._token_expires_at:
            await self._refresh_token()
        return self._access_token

“The authorization server issues access tokens to the client after successfully authenticating the resource owner and obtaining authorization. The access token provides an abstraction layer, replacing different authorization constructs with a single token.”
OAuth 2.0 Specification, RFC 6749 - Section 1.4: Access Token
The access token abstraction is what makes OAuth useful for agent integrations. In the client credentials flow no user credentials are involved at all: the agent authenticates with its own client ID and secret, and only the short-lived token travels with each API request. When the token expires, the client requests a new one without any human involvement.
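When several tool calls run concurrently and the cached token has expired, each call may start its own refresh. One way to avoid that is to guard the refresh with a lock and re-check the cache once the lock is held. The sketch below illustrates the idea under stated assumptions: CachedToken and the fetch callable are hypothetical names, and fetch stands in for whatever coroutine actually requests a token and returns the token with its lifetime in seconds.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, Tuple

# Hypothetical fetcher type: returns (access_token, expires_in_seconds).
TokenFetcher = Callable[[], Awaitable[Tuple[str, float]]]


class CachedToken:
    """Cache a token and let only one coroutine refresh it at a time."""

    def __init__(self, fetch: TokenFetcher, buffer_seconds: float = 60.0):
        self._fetch = fetch
        self._buffer = buffer_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0
        self._lock = asyncio.Lock()

    def _is_fresh(self) -> bool:
        return self._token is not None and time.monotonic() < self._expires_at

    async def get(self) -> str:
        if self._is_fresh():
            return self._token
        async with self._lock:
            # Re-check: another coroutine may have refreshed while we waited.
            if not self._is_fresh():
                token, expires_in = await self._fetch()
                self._token = token
                # Treat the token as expired buffer_seconds early.
                self._expires_at = time.monotonic() + expires_in - self._buffer
            return self._token
```

The sketch uses time.monotonic rather than time.time so that system clock adjustments cannot make a token look fresher or staler than it is. Create the CachedToken inside the running event loop so the lock belongs to that loop.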
With an understanding of authentication patterns in place, the discussion can now turn to rate limiting and exponential backoff, which builds directly on these foundations.
External APIs enforce rate limits. An agent that retries immediately on a 429 (Too Many Requests) response will be blocked for longer, not shorter. Exponential backoff waits twice as long after each failed attempt: 1 second, then 2, then 4, then 8. This gives the rate limiter time to reset.
Many APIs include a Retry-After header on 429 responses that specifies exactly how long to wait. Always check for this header before calculating your own backoff delay. Adding a small random jitter (up to 1 second) to the delay prevents multiple concurrent agent calls from retrying simultaneously and re-triggering the rate limit.
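The delay calculation described above can be isolated into a small helper. Retry-After may hold either a number of seconds or an HTTP-date, so a sketch that handles both forms is shown here. The function names and the max_delay cap are assumptions for illustration; the cap applies only to the computed backoff, not to a delay the server asked for.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str]) -> Optional[float]:
    """Return the Retry-After delay in seconds, or None if absent or unparseable."""
    if not value:
        return None
    try:
        return max(0.0, float(value))  # delay-seconds form, e.g. "120"
    except ValueError:
        pass
    try:
        retry_at = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    return max(0.0, (retry_at - datetime.now(timezone.utc)).total_seconds())


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """Prefer the server's Retry-After; otherwise use capped exponential backoff."""
    server_delay = parse_retry_after(retry_after)
    if server_delay is not None:
        delay = server_delay
    else:
        delay = min(base_delay * (2 ** attempt), max_delay)
    # Jitter of up to 1 second spreads out concurrent retries.
    return delay + random.uniform(0, 1)
```

With the defaults, attempts 0 to 3 give base delays of 1, 2, 4, and 8 seconds before jitter, matching the sequence described above.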
import asyncio
import random

async def api_call_with_retry(fn, *args, max_retries: int = 3, base_delay: float = 1.0, **kwargs) -> dict:
    """Retry on 429 and 503 with exponential backoff. Never retry on other 4xx client errors."""
    for attempt in range(max_retries + 1):
        try:
            return await fn(*args, **kwargs)
        except httpx.HTTPStatusError as e:
            status = e.response.status_code
            if status in (429, 503) and attempt < max_retries:
                retry_after = e.response.headers.get("Retry-After")
                try:
                    # Retry-After can also be an HTTP-date; fall back to backoff if it is missing or not a number
                    delay = float(retry_after)
                except (TypeError, ValueError):
                    delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
                await asyncio.sleep(delay)
                continue
            raise  # re-raise non-retryable errors immediately
    raise RuntimeError(f"Exhausted {max_retries} retries")

The circuit breaker pattern stops sending requests to an API that is consistently failing. After a defined number of consecutive failures, the breaker opens and all requests to that API fail immediately without attempting the call. After a recovery period, the breaker moves to half-open and allows one test request through. If it succeeds, the breaker closes and normal operation resumes.
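The three breaker states can be sketched as a small class. This is an illustrative sketch, not a production implementation: the class name, method names, default thresholds, and the injectable clock parameter (included so the behaviour can be tested without waiting) are all assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Closed -> open after N consecutive failures -> half-open after a recovery period."""

    def __init__(self, failure_threshold: int = 5, recovery_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None
        self._probe_in_flight = False

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.recovery_seconds:
            return "half-open"
        return "open"

    def allow_request(self) -> bool:
        state = self.state
        if state == "closed":
            return True
        if state == "half-open" and not self._probe_in_flight:
            self._probe_in_flight = True  # let exactly one test request through
            return True
        return False  # open, or a probe is already running: fail fast

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None
        self._probe_in_flight = False

    def record_failure(self) -> None:
        self._probe_in_flight = False
        self._failures += 1
        if self._opened_at is not None or self._failures >= self.failure_threshold:
            # Threshold reached or probe failed: (re)start the recovery timer.
            self._opened_at = self._clock()
```

A tool function would call allow_request before each API call, return a structured "service unavailable" result when it is False, and call record_success or record_failure once the call completes.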
With an understanding of rate limiting and exponential backoff in place, the discussion can now turn to returning structured errors to the agent, which builds directly on these foundations.
When an API call fails, the tool function must not propagate the raw exception to the agent. A raw Python exception message such as httpx.ConnectError: [Errno 111] Connection refused is not useful to the agent: it cannot reason about what to do next. Catch exceptions at the tool boundary and return a structured dictionary with a human-readable, actionable error description.
Include a success boolean so the agent can check the result programmatically rather than parsing the message text. Map common HTTP status codes to plain-language descriptions that the agent can relay to the user without exposing internal system details.
async def safe_api_call(endpoint: str, params: dict) -> dict:
    # make_request is assumed to be an async function defined elsewhere that
    # performs the HTTP call and raises httpx.HTTPStatusError on 4xx/5xx.
    try:
        result = await api_call_with_retry(make_request, endpoint, params)
        return {"success": True, "data": result}
    except httpx.HTTPStatusError as e:
        return {
            "success": False,
            "error_type": "http_error",
            "status_code": e.response.status_code,
            "message": {
                400: "Invalid request parameters. Check the input and try again.",
                401: "Authentication failed. The API key may be invalid or expired.",
                403: "Access denied. This operation requires additional permissions.",
                404: "Resource not found. The requested item does not exist.",
                429: "Rate limit reached. The system will retry automatically.",
                500: "The external service is experiencing issues. Try again later."
            }.get(e.response.status_code, f"HTTP error {e.response.status_code}")
        }
    except httpx.TimeoutException:
        return {
            "success": False,
            "error_type": "timeout",
            "message": "The request timed out. The service may be slow or unavailable."
        }
    except httpx.RequestError:
        # Connection refused, DNS failure, and other transport-level errors
        return {
            "success": False,
            "error_type": "connection_error",
            "message": "Could not reach the service. It may be down or unreachable."
        }

Common misconception
“Returning detailed error messages from tool functions helps the agent debug the problem.”
Detailed internal error messages such as stack traces, database connection strings, or raw exception types expose system internals to the model, which may relay them to the user. Return human-readable, actionable descriptions (401: authentication failed) without internal system details. Log the full error server-side for developer debugging.
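The split between what is logged and what is returned can be sketched as a single helper. The function name, the logger name, and the short reference ID (which lets a developer match the agent-visible reply to the log entry) are assumptions for illustration.

```python
import logging
import uuid

logger = logging.getLogger("agent.tools")  # hypothetical logger name


def sanitized_error(exc: Exception, public_message: str,
                    error_type: str = "internal_error") -> dict:
    """Log the full exception server-side; return only safe fields to the agent."""
    reference = uuid.uuid4().hex[:8]
    # The exception text and traceback go to the log, never into the return value.
    logger.error("tool failure ref=%s", reference, exc_info=exc)
    return {
        "success": False,
        "error_type": error_type,
        "message": public_message,
        "reference": reference,
    }
```

The agent can relay the message and the reference to the user, while the connection string or stack trace stays in the server log.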
Your agent's get_weather tool calls a weather API. In testing you notice that when the API is slow, your agent hangs indefinitely. When the API is rate-limited, the agent retries immediately 10 times in 2 seconds and gets blocked for 60 seconds. What two configurations fix both problems?
Your agent tool calls a GraphQL API. The API returns HTTP 200 with the body: {"errors": [{"message": "Field 'email' not found on type 'Customer'"}]}. The tool function only checks response.raise_for_status(). What happens?
Which HTTP status code should NOT trigger an automatic retry in your exponential backoff implementation?
You are building a get_invoice tool that calls your company's billing API. The tool should return invoice data to the agent. What is the correct approach to handling the API credentials?
OWASP API Security Top 10 2023
API1:2023 Broken Object Level Authorisation
Cited in Section 15.1 in relation to the Optus breach. Defines the authorisation requirement that the Optus API violated and that agent tool integrations must implement.
RFC 6749, The OAuth 2.0 Authorization Framework
Section 1.4: Access Token; Section 4.4: Client Credentials Grant
Authoritative specification for the client credentials flow used in Section 15.3. Quoted to define what an access token is and why the abstraction matters.
python-httpx.org: AsyncClient, Timeout, and Exception handling
Async-native HTTP client recommended for agent tools in this module. Source for timeout configuration, raise_for_status(), and retry patterns in Sections 15.1 and 15.4.
Retry-After HTTP header, MDN Web Docs
developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After
Reference for parsing and respecting rate limit delay headers from external APIs. Cited in Section 15.4 as the header to check before calculating exponential backoff.
Australian Signals Directorate: Optus Data Breach Analysis, 2022
Post-incident analysis of the September 2022 Optus data breach
Source for the real-world story in the opening section. Documents the unauthenticated API endpoint as the root cause of the 9.8 million record exposure.
Module 15 of 25 · Practical Building