Your tool crashes. The model sees a stack trace it can't interpret. It retries with the same parameters. Crashes again. The user waits while the agent burns through attempts, never understanding what went wrong.
Error handling isn't just about catching exceptions. It's about giving the model actionable information it can use to fix the problem. Clear error messages, recovery strategies, and graceful degradation turn failures into learning opportunities.
Why Error Handling Matters for Agents
When you call an API yourself, you can read stack traces, check logs, and debug. The model can't. It only sees what you return from the tool.
A stack trace like TypeError: Cannot read property 'length' of undefined at line 42 tells the model nothing about which parameter was wrong or how to fix it. But Error: file_path is required and must be a string, received undefined tells the model exactly what to change.
Good error handling means:
- The model understands what went wrong
- It knows how to fix it
- Operations can recover automatically when possible
- Users see progress even when things fail
Without this, your agent gets stuck repeating the same mistakes.
Error Messages for LLMs
Format errors so the model can act on them. Return strings that explain the problem and suggest fixes.
// Bad: Generic error
if (!file_path) {
throw new Error('Invalid input');
}
// Good: Specific, actionable error
if (!file_path) {
return 'Error: file_path is required. Please provide the path to the file you want to read (e.g., "src/app/page.tsx")';
}
When validation fails, show exactly what's wrong:
const parseResult = WRITE_SCHEMA.safeParse(input);
if (!parseResult.success) {
return `Error: Invalid input parameters. ${parseResult.error.errors
.map((e) => `${e.path.join('.')}: ${e.message}`)
.join(', ')}`;
}
This produces: Error: Invalid input parameters. file_path: Required, content: Expected string, received number
The model sees which fields are wrong and adjusts the next call.
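One way to keep messages consistent across tools is a small formatter that always states the problem, the value received, and a suggested fix. The sketch below is illustrative only; `ToolErrorInfo` and `formatToolError` are hypothetical names, not part of any SDK.

```typescript
// Hypothetical helper: builds an error string the model can act on.
interface ToolErrorInfo {
  field: string;       // which parameter was wrong
  problem: string;     // what is wrong with it
  received?: unknown;  // the value the tool actually got
  suggestion?: string; // how to fix the next call
}

function formatToolError(info: ToolErrorInfo): string {
  const parts = [`Error: ${info.field} ${info.problem}.`];
  if ('received' in info) {
    // JSON.stringify(undefined) returns undefined, so handle that case explicitly
    const shown =
      info.received === undefined ? 'undefined' : JSON.stringify(info.received);
    parts.push(`Received: ${shown}.`);
  }
  if (info.suggestion) parts.push(info.suggestion);
  return parts.join(' ');
}

const message = formatToolError({
  field: 'file_path',
  problem: 'is required and must be a string',
  received: undefined,
  suggestion: 'Provide the path to the file you want to read (e.g., "src/app/page.tsx").',
});
```

Here `message` names the field, shows that the value was undefined, and ends with the suggested fix, all in one string the tool can return.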
Validation Before Execution
Validate parameters before doing any work. Zod catches type errors before your tool runs.
import { z } from 'zod';
import { tool, RunContext } from '@openai/agents';
import { zodToJsonSchema } from 'zod-to-json-schema';
const BASH_SCHEMA = z.object({
command: z.string().min(1).describe('The bash command to execute'),
run_in_background: z.boolean().optional(),
timeout: z.number().min(0).max(600000).optional(),
});
export const bashTool = tool({
name: 'Bash',
description: 'Executes a bash command',
parameters: zodToJsonSchema(BASH_SCHEMA) as any,
async execute(input: any, runContext?: RunContext<AgentContext>) {
const parseResult = BASH_SCHEMA.safeParse(input);
if (!parseResult.success) {
return `Error: ${parseResult.error.errors
.map((e) => `${e.path.join('.')}: ${e.message}`)
.join(', ')}`;
}
const { command, run_in_background, timeout } = parseResult.data;
// Now TypeScript knows these are the correct types
// Execute command...
}
});
Use .safeParse() instead of .parse(). It returns a result object instead of throwing. Return the error message so the model sees it.
Security Validation
Validate security constraints before executing operations. Some commands should never run:
const prohibitedCommands = [
/git\s+rm\b/,
/git\s+reset\s+--hard/,
/git\s+push\s+.*--force/,
/\bsudo\b/,
/\brm\s+-rf\s+\//,
];
const isProhibited = prohibitedCommands.some((pattern) =>
pattern.test(command)
);
if (isProhibited) {
const errorMsg = `Error: This command is not allowed for security reasons. Prohibited commands include: git rm, git reset --hard, git push --force, sudo, and destructive file operations.`;
await mux?.put({
type: 'coding_agent.run_terminal_command.error',
data: { command, error: errorMsg },
});
return errorMsg;
}
Return clear error messages explaining why the command was blocked. This helps the model choose an alternative approach.
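A possible refinement is to attach a reason to each pattern, so the returned error names the specific rule that matched instead of listing every prohibited command. This is a sketch under assumptions: the `CommandRule` shape, the `checkCommand` name, and the reason strings are all hypothetical.

```typescript
// Each rule pairs a pattern with a human-readable reason (illustrative list).
interface CommandRule {
  pattern: RegExp;
  reason: string;
}

const rules: CommandRule[] = [
  { pattern: /git\s+reset\s+--hard/, reason: 'git reset --hard discards uncommitted work' },
  { pattern: /git\s+push\s+.*--force/, reason: 'force pushes can overwrite remote history' },
  { pattern: /\bsudo\b/, reason: 'sudo escalates privileges' },
];

// Returns null when the command is allowed, or an error string naming the rule.
function checkCommand(command: string): string | null {
  const hit = rules.find((rule) => rule.pattern.test(command));
  return hit
    ? `Error: Command blocked (${hit.reason}). Choose an approach that does not need it.`
    : null;
}
```

A command such as `ls -la` passes through with `null`, while `git push origin main --force` comes back with the force-push reason in the message.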
Try-Catch with Context
Wrap operations in try-catch blocks and provide context about what failed:
async execute(input: any, runContext?: RunContext<AgentContext>) {
const { file_path, content } = parseResult.data;
const sandboxClient = runContext?.context.sandboxClient;
const mux = runContext?.context.mux;
const todos = runContext?.context.todos;
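if (!sandboxClient) {
// Return instead of throwing so the model still receives feedback
return 'Error: Sandbox client not available. This is an environment problem, so retrying with different parameters will not help.';
}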
await mux?.put({
type: 'coding_agent.create_file.started',
data: { file_path },
});
try {
await sandboxClient.writeFile(file_path, content);
await mux?.put({
type: 'coding_agent.create_file.completed',
data: { file_path },
});
return `Successfully created ${file_path}`;
} catch (error) {
const errorMsg = error instanceof Error ? error.message : String(error);
await mux?.put({
type: 'coding_agent.create_file.error',
data: { file_path, error: errorMsg },
});
return appendTodoContext(`Error creating file: ${errorMsg}`, todos);
}
}
Key points:
- Emit events - Even on error, emit an error event so the frontend knows what happened
- Extract message safely - Use error instanceof Error to get the message without throwing
- Append context - Include todo context so the model sees current task state
- Return, don't throw - Return error strings so the model sees them
The model gets feedback even when operations fail, and can adjust its approach.
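If many tools repeat the same try-catch shape, the conversion from thrown value to returned string can be factored out. The following is a minimal sketch, not the article's implementation; `safeExecute` is a hypothetical name, and event emission is left out to keep it short.

```typescript
// Hypothetical wrapper: runs a tool body and converts any thrown value
// into a returned error string, so the model always receives feedback.
async function safeExecute(
  action: string,
  body: () => Promise<string>
): Promise<string> {
  try {
    return await body();
  } catch (error) {
    // Thrown values are not guaranteed to be Error instances
    const message = error instanceof Error ? error.message : String(error);
    return `Error ${action}: ${message}`;
  }
}
```

A tool's execute function could then return `safeExecute('creating file', async () => { ... })` and keep only the happy path inside the callback.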
Null Checks for Resources
Check if resources exist before using them. Return helpful errors when they don't:
const content = await sandboxClient.readFile(file_path);
if (content === null) {
const errorMsg = `Error: Could not read file ${file_path}. The file may not exist. Use the Glob tool to search for files if you're unsure of the path.`;
await mux?.put({
type: 'coding_agent.read_file.error',
data: { file_path, error: errorMsg },
});
return appendTodoContext(errorMsg, todos);
}
// Handle empty files differently
if (content.trim() === '') {
const warningMsg = `Warning: File ${file_path} exists but has empty contents.`;
return appendTodoContext(warningMsg, todos);
}
Distinguish between "doesn't exist" and "exists but empty". The model handles each case differently.
Retry with Exponential Backoff
Some operations fail temporarily. Retry them with increasing delays:
export async function retryWithBackoff<T>(
operation: () => Promise<T>,
options: {
maxAttempts?: number;
initialDelayMs?: number;
backoffMultiplier?: number;
maxDelayMs?: number;
useJitter?: boolean;
} = {}
): Promise<T> {
const {
maxAttempts = 3,
initialDelayMs = 1000,
backoffMultiplier = 2,
maxDelayMs = 30000,
useJitter = true,
} = options;
let lastError: unknown;
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
return await operation();
} catch (error) {
lastError = error;
if (attempt === maxAttempts - 1) {
throw error;
}
// Calculate delay: 1s, 2s, 4s, 8s...
let delay = initialDelayMs * Math.pow(backoffMultiplier, attempt);
delay = Math.min(delay, maxDelayMs);
// Add jitter (randomize ±25% of delay)
if (useJitter) {
const jitter = delay * 0.25;
delay = delay - jitter + Math.random() * jitter * 2;
}
await new Promise(resolve => setTimeout(resolve, delay));
}
}
throw lastError;
}
Use it for operations that might fail due to rate limits or temporary network issues:
const result = await retryWithBackoff(
() => uploadToSupabase({ file: buffer, path: 'screenshots/image.png' }),
{ maxAttempts: 3, initialDelayMs: 1000 }
);
Jitter prevents thundering herd problems when multiple operations retry simultaneously.
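To see the schedule the delay formula produces, it can help to isolate the calculation in a pure function. The sketch below mirrors the arithmetic in retryWithBackoff; the function name `backoffDelay` and the injectable `random` parameter are additions for illustration, not part of the original helper.

```typescript
// Computes the delay before the next retry. `random` is injectable so the
// schedule can be checked deterministically; it defaults to Math.random.
function backoffDelay(
  attempt: number, // 0-based index of the attempt that just failed
  initialDelayMs = 1000,
  backoffMultiplier = 2,
  maxDelayMs = 30000,
  random: () => number = Math.random
): number {
  const base = Math.min(
    initialDelayMs * Math.pow(backoffMultiplier, attempt),
    maxDelayMs
  );
  const jitter = base * 0.25;
  // random() in [0, 1) maps to [base - 25%, base + 25%)
  return base - jitter + random() * jitter * 2;
}

// With random fixed at 0.5 the jitter cancels out and the base schedule shows.
const schedule = [0, 1, 2, 3, 4, 5].map((n) =>
  backoffDelay(n, 1000, 2, 30000, () => 0.5)
);
// schedule is [1000, 2000, 4000, 8000, 16000, 30000]
```

Note that jitter is applied after the cap, so an individual delay can exceed maxDelayMs by up to 25%. If the cap must be strict, apply Math.min again after adding jitter.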
Database-Specific Error Handling
Database errors need special handling. Map error codes to user-friendly messages and mark which errors are retryable:
interface DatabaseError {
status: number;
body: {
error: string;
details?: string;
retryable?: boolean;
errorType?: string;
};
}
const mapDatabaseError = (error: unknown): DatabaseError => {
const err = error as { message?: string; code?: string };
const message = err?.message || 'Internal server error';
const normalizedMessage = message.toLowerCase();
// Connection pool exhaustion - retryable
if (
normalizedMessage.includes('max clients reached') ||
normalizedMessage.includes('connection pool exhausted')
) {
return {
status: 503,
body: {
error: 'Database connection pool is temporarily exhausted. Please retry.',
details: message,
retryable: true,
errorType: 'POOL_EXHAUSTION',
},
};
}
// Authentication errors - not retryable
if (normalizedMessage.includes('password authentication failed') || err?.code === '28P01') {
return {
status: 401,
body: {
error: 'Database authentication failed. Check your connection string and password.',
details: message,
},
};
}
// Generic error
return {
status: 500,
body: { error: message },
};
};
Use the mapped error with retry logic:
const MAX_RETRIES = 3;
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
try {
const result = await pool.query({ text: query, values: params });
return c.json(result.rows);
} catch (error) {
const mapped = mapDatabaseError(error);
// Retry if error is retryable and we have attempts left
if (mapped.body?.retryable && attempt < MAX_RETRIES) {
await new Promise(resolve => setTimeout(resolve, 500 * attempt));
continue;
}
return c.json(mapped.body, mapped.status);
}
}
This automatically retries connection pool errors but immediately fails on authentication errors.
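Matching on message text can break when a driver or server rewords its errors. A possible complement, assuming the driver exposes the PostgreSQL SQLSTATE on a `code` property (as the 28P01 check above suggests), is to classify by code. The set below is an illustrative subset, and `isRetryableDbError` is a hypothetical name.

```typescript
// Illustrative subset of PostgreSQL SQLSTATE codes that are usually transient.
const RETRYABLE_SQLSTATES = new Set<string>([
  '53300', // too_many_connections
  '57P03', // cannot_connect_now
  '40001', // serialization_failure
  '40P01', // deadlock_detected
]);

// Returns true only when the thrown value carries a known transient code.
function isRetryableDbError(error: unknown): boolean {
  if (typeof error !== 'object' || error === null) return false;
  const code = (error as { code?: unknown }).code;
  return typeof code === 'string' && RETRYABLE_SQLSTATES.has(code);
}
```

Authentication failures (28P01) are deliberately absent from the set, so they still fail immediately.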
Enum-Based Error Status
For complex systems, use enums to standardize error responses:
export enum GitHubStatus {
SUCCESS = 'SUCCESS',
INSTALLATION_NOT_FOUND = 'INSTALLATION_NOT_FOUND',
INSTALLATION_SUSPENDED = 'INSTALLATION_SUSPENDED',
REPO_NOT_FOUND = 'REPO_NOT_FOUND',
PUSH_ACCESS_DENIED = 'PUSH_ACCESS_DENIED',
NETWORK_ERROR = 'NETWORK_ERROR',
UNKNOWN_ERROR = 'UNKNOWN_ERROR',
}
const STATUS_MESSAGES: Record<GitHubStatus, string> = {
[GitHubStatus.SUCCESS]: 'Operation completed successfully',
[GitHubStatus.INSTALLATION_NOT_FOUND]: 'GitHub connection not found. Please connect your repository.',
[GitHubStatus.INSTALLATION_SUSPENDED]: 'GitHub installation is suspended. Check your GitHub settings.',
[GitHubStatus.REPO_NOT_FOUND]: 'Repository not found. Verify the repository exists and you have access.',
[GitHubStatus.PUSH_ACCESS_DENIED]: 'Push access denied. You need write permissions for this repository.',
[GitHubStatus.NETWORK_ERROR]: 'Network error. Please check your connection and try again.',
[GitHubStatus.UNKNOWN_ERROR]: 'An unexpected error occurred.',
};
export interface GitHubResponse<T = any> {
status: GitHubStatus;
message: string;
data?: T;
}
export const createResponse = <T = any>(
status: GitHubStatus,
data?: T
): GitHubResponse<T> => ({
status,
message: STATUS_MESSAGES[status],
// Compare against undefined so falsy payloads such as 0 or '' are still included
...(data !== undefined && { data }),
});
Use it to return consistent responses:
try {
const result = await octokit.repos.get({ owner, repo });
return createResponse(GitHubStatus.SUCCESS, result.data);
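} catch (error) {
// The caught value is `unknown` under strict TypeScript, so cast before reading `status`
const status = (error as { status?: number })?.status;
if (status === 404) {
return createResponse(GitHubStatus.REPO_NOT_FOUND);
}
if (status === 403) {
return createResponse(GitHubStatus.PUSH_ACCESS_DENIED);
}
return createResponse(GitHubStatus.UNKNOWN_ERROR);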
}
This separates error classification from error messages. You can change messages without touching error handling logic.
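The classification step can also live in its own function, which keeps catch blocks short and makes the mapping easy to test. This is a sketch with a trimmed copy of the enum so it stands alone; `getHttpStatus` and `classifyGitHubError` are hypothetical names, and the assumption is that the thrown value carries a numeric `status` property.

```typescript
// Trimmed copy of the enum so this sketch is self-contained.
enum GitHubStatus {
  REPO_NOT_FOUND = 'REPO_NOT_FOUND',
  PUSH_ACCESS_DENIED = 'PUSH_ACCESS_DENIED',
  UNKNOWN_ERROR = 'UNKNOWN_ERROR',
}

// Reads a numeric `status` from an unknown thrown value without assuming its shape.
function getHttpStatus(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'status' in error) {
    const status = (error as { status: unknown }).status;
    return typeof status === 'number' ? status : undefined;
  }
  return undefined;
}

// Maps an HTTP status to a classification; anything unrecognized is UNKNOWN_ERROR.
function classifyGitHubError(error: unknown): GitHubStatus {
  switch (getHttpStatus(error)) {
    case 404:
      return GitHubStatus.REPO_NOT_FOUND;
    case 403:
      return GitHubStatus.PUSH_ACCESS_DENIED;
    default:
      return GitHubStatus.UNKNOWN_ERROR;
  }
}
```

The catch block then reduces to a single line that passes the result of `classifyGitHubError(error)` to `createResponse`.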
Graceful Degradation
Some failures shouldn't stop the entire operation. Use Promise.allSettled to continue even when some operations fail:
const nextConfigFiles = ['next.config.js', 'next.config.mjs', 'next.config.ts'];
const results = await Promise.allSettled(
nextConfigFiles.map((configFile) => sandboxClient.readFile(configFile))
);
const isNextJsProject = results.some(
(result) => result.status === 'fulfilled' && result.value !== null
);
if (isNextJsProject) {
// Adjust behavior for Next.js projects
}
The operation continues even if some config files don't exist. This pattern is useful for optional features or multi-source data fetching.
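For multi-source fetching, it is often useful to keep the successes and report the failures as warnings rather than discard them. A minimal sketch of that idea follows; `loadAll` and the source names in the usage note are hypothetical.

```typescript
// Runs every loader, keeps the successes, and reports failures as warnings
// instead of aborting the whole operation.
async function loadAll(
  sources: Record<string, () => Promise<string>>
): Promise<{ loaded: Record<string, string>; warnings: string[] }> {
  const names = Object.keys(sources);
  // The async wrapper turns synchronous throws into rejections as well
  const results = await Promise.allSettled(
    names.map(async (name) => sources[name]())
  );

  const loaded: Record<string, string> = {};
  const warnings: string[] = [];
  results.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      loaded[names[i]] = result.value;
    } else {
      const reason =
        result.reason instanceof Error ? result.reason.message : String(result.reason);
      warnings.push(`Warning: ${names[i]} unavailable (${reason})`);
    }
  });
  return { loaded, warnings };
}
```

Calling it with one loader that resolves and one that rejects yields one entry in `loaded` and one warning string, which the tool can append to its result so the model knows what was missing.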
Logging with Context
Add context prefixes to logs so you can trace errors through the system:
console.error('[Database Studio] Error closing pool:', err);
console.error('[Rate Limit Redis] Connection error:', err.message);
console.error('[Webhook] Signature verification failed');
console.warn('[Webhook] Could not extract projectId from payload');
The prefix tells you which component logged the error. This is critical when debugging production issues across multiple services.
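To avoid retyping the prefix at every call site, it can be bound once per component. This is one possible shape, not a prescribed API; `createLogger` is a hypothetical helper built on the standard console methods.

```typescript
// Hypothetical helper: binds a component prefix once so call sites cannot forget it.
function createLogger(component: string) {
  const prefix = `[${component}]`;
  return {
    // Exposed separately so the formatted line can be reused or checked
    format: (message: string): string => `${prefix} ${message}`,
    error: (message: string, detail?: unknown): void =>
      console.error(`${prefix} ${message}`, detail ?? ''),
    warn: (message: string): void => console.warn(`${prefix} ${message}`),
  };
}

const webhookLog = createLogger('Webhook');
const line = webhookLog.format('Signature verification failed');
```

Here `line` carries the same `[Webhook]` prefix as the hand-written calls above.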
Resource Cleanup on Error
Use finally blocks to ensure cleanup happens even when operations fail:
const fileLockManager = runContext?.context.fileLockManager;
const releaseLock = fileLockManager
? await fileLockManager.acquireLock(file_path)
: null;
let updatedCode = '';
try {
// Re-read file after acquiring lock
const currentCode = await sandboxClient.readFile(file_path);
// Compare against null so an existing but empty file is not treated as deleted
if (currentCode === null) {
throw new Error(`File was deleted while waiting for lock`);
}
// Apply edit
updatedCode = await applyCodeEdit(currentCode, edit, instruction);
// Write updated content
await sandboxClient.writeFile(file_path, updatedCode);
} finally {
// Always release lock, even if write fails
if (releaseLock) {
releaseLock();
}
}
The lock is released whether the operation succeeds or fails. Without this, a failed operation would leave the file locked forever.
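The acquire, try, finally sequence can be wrapped in a helper so callers cannot forget the release. A minimal sketch, assuming (as in the snippet above) that acquiring a lock resolves to a release function; `withLock` is a hypothetical name.

```typescript
// Hypothetical helper: acquires a lock, runs the work, and always releases.
async function withLock<T>(
  acquire: () => Promise<() => void>,
  work: () => Promise<T>
): Promise<T> {
  const release = await acquire();
  try {
    return await work();
  } finally {
    release(); // runs on success and on failure
  }
}
```

Errors from `work` still propagate to the caller, so the surrounding tool code remains responsible for turning them into a returned error string.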
Runtime Error Detection
Some errors only appear after execution completes. Check for them asynchronously:
export async function checkRuntimeErrors(
projectId: string,
waitTime: number = 4.0
): Promise<string> {
try {
// Wait for errors to be captured
await new Promise(resolve => setTimeout(resolve, waitTime * 1000));
const redisErrors = await getRedisErrors(projectId);
if (redisErrors && redisErrors.length > 0) {
const errorMessages = redisErrors
.map(error => `${error.name || 'Error'}: ${error.message}`)
.filter(Boolean);
return errorMessages.join('\n\n');
}
return '';
} catch (error) {
console.error('[Runtime Errors] Failed to check:', error);
return ''; // Return empty string on error, don't fail the operation
}
}
Use this after file writes to catch syntax errors, type errors, or runtime exceptions:
await sandboxClient.writeFile(file_path, content);
let result = `Successfully wrote ${file_path}`;
const projectId = runContext?.context.projectId;
if (projectId) {
const errors = await checkRuntimeErrors(projectId);
if (errors && errors.trim().length > 0) {
result += `\n\nRuntime errors detected:\n${errors}`;
}
}
return result;
The model sees the errors immediately and can fix them in the next turn.
HTTP Status Codes
Use appropriate status codes for different error types:
// 400 - Bad Request (client sent invalid data)
if (!body.role || !['user', 'assistant'].includes(body.role)) {
return c.json({ error: 'Role must be either "user" or "assistant"' }, 400);
}
// 401 - Unauthorized (authentication failed)
if (!isValidSignature(signature, body)) {
return c.json({ error: 'Invalid signature' }, 401);
}
// 404 - Not Found (resource doesn't exist)
if (!project) {
return c.json({ error: 'Project not found' }, 404);
}
// 429 - Too Many Requests (rate limited)
if (await isRateLimited(userId)) {
return c.json({ error: 'Too many requests. Please try again later.' }, 429);
}
// 500 - Internal Server Error (something broke on our side)
catch (error) {
console.error('[API] Unexpected error:', error);
return c.json({ error: 'Internal server error' }, 500);
}
// 503 - Service Unavailable (temporary failure, retry possible)
if (isPoolExhausted) {
return c.json({
error: 'Database temporarily unavailable. Please retry.',
retryable: true
}, 503);
}
Status codes help clients (including other services) understand what went wrong and whether to retry.
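On the client side, the status code and the optional `retryable` flag can drive the retry decision. A minimal sketch follows, with `shouldRetry` as a hypothetical helper; the policy of retrying only 429 and 503 is an assumption to adapt, not a rule.

```typescript
// Illustrative client-side policy: decides whether a failed request is worth retrying.
function shouldRetry(status: number, body?: { retryable?: boolean }): boolean {
  // An explicit hint from the server takes precedence over the status code
  if (body?.retryable !== undefined) return body.retryable;
  // 429 (rate limited) and 503 (temporarily unavailable) are usually transient
  return status === 429 || status === 503;
}
```

With this policy, a 401 or 404 fails immediately, while a 503 from the pool-exhaustion branch above is retried.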
Common Mistakes
Throwing instead of returning: The model can't see thrown exceptions. Return error strings.
// Bad: Model sees nothing
if (!file_path) throw new Error('Missing file_path');
// Good: Model sees the error
if (!file_path) return 'Error: file_path is required';
Generic error messages: "Error: Invalid input" doesn't help. Be specific.
// Bad
return 'Error: Invalid input';
// Good
return 'Error: file_path must be a string, received undefined';
No retry for transient failures: Network issues and rate limits are temporary. Retry them.
Missing error events: Emit error events to the message queue so the frontend knows what failed.
Not cleaning up resources: Use finally blocks to release locks, close connections, and clean up even when operations fail.
What We're Skipping
Circuit breakers, error budgets, distributed tracing, structured logging frameworks, custom error classes. These add sophistication but aren't necessary initially.
The patterns here—validation, try-catch, retry, graceful degradation, clear messages—handle most failure scenarios. Start with these. Add complexity when you need it.
What's Next
Your tools now handle errors gracefully. They validate inputs, retry transient failures, provide clear feedback, and clean up resources even when things go wrong.
Next, we'll look at how to test these error handling patterns to ensure they work correctly under failure conditions. Testing error paths is just as important as testing happy paths.