Building AI Coding Agents

Your tool crashes. The model sees a stack trace it can't interpret. It retries with the same parameters. Crashes again. The user waits while the agent burns through attempts, never understanding what went wrong.

Error handling isn't just about catching exceptions. It's about giving the model actionable information it can use to fix the problem. Clear error messages, recovery strategies, and graceful degradation turn failures into feedback the model can act on.

Why Error Handling Matters for Agents

When you call an API yourself, you can read stack traces, check logs, and debug. The model can't. It only sees what you return from the tool.

A stack trace like TypeError: Cannot read property 'length' of undefined at line 42 tells the model nothing about which parameter was wrong or how to fix it. But Error: file_path is required and must be a string, received undefined tells it exactly what to change.

Good error handling means:

The model understands what went wrong
It knows how to fix it
Operations can recover automatically when possible
Users see progress even when things fail

Without this, your agent gets stuck repeating the same mistakes.

Error Messages for LLMs

Format errors so the model can act on them. Return strings that explain the problem and suggest fixes.

// Bad: Generic error
if (!file_path) {
  throw new Error('Invalid input');
}
 
// Good: Specific, actionable error
if (!file_path) {
  return 'Error: file_path is required. Please provide the path to the file you want to read (e.g., "src/app/page.tsx")';
}

When validation fails, show exactly what's wrong:

const parseResult = WRITE_SCHEMA.safeParse(input);
if (!parseResult.success) {
  // `issues` lists every failed field with its path and message
  return `Error: Invalid input parameters. ${parseResult.error.issues
    .map((e) => `${e.path.join('.')}: ${e.message}`)
    .join(', ')}`;
}

This produces: Error: Invalid input parameters. file_path: Required, content: Expected string, received number

The model sees which fields are wrong and adjusts the next call.

Validation Before Execution

Validate parameters before doing any work. Zod catches type errors before your tool runs.

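import { z } from 'zod';
import { tool, type RunContext } from '@openai/agents';
import { zodToJsonSchema } from 'zod-to-json-schema';
// AgentContext is the application's own context type (sandbox client, event mux, todos)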
 
const BASH_SCHEMA = z.object({
  command: z.string().min(1).describe('The bash command to execute'),
  run_in_background: z.boolean().optional(),
  timeout: z.number().min(0).max(600000).optional(),
});
 
export const bashTool = tool({
  name: 'Bash',
  description: 'Executes a bash command',
  parameters: zodToJsonSchema(BASH_SCHEMA) as any,
  async execute(input: any, runContext?: RunContext<AgentContext>) {
    const parseResult = BASH_SCHEMA.safeParse(input);
    if (!parseResult.success) {
      return `Error: ${parseResult.error.issues
        .map((e) => `${e.path.join('.')}: ${e.message}`)
        .join(', ')}`;
    }
 
    const { command, run_in_background, timeout } = parseResult.data;
    // Now TypeScript knows these are the correct types
 
    // Execute command...
  }
});

Use .safeParse() instead of .parse(). It returns a result object instead of throwing. Return the error message so the model sees it.

Security Validation

Validate security constraints before executing operations. Some commands should never run:

const prohibitedCommands = [
  /git\s+rm\b/,
  /git\s+reset\s+--hard/,
  /git\s+push\s+.*--force/,
  /\bsudo\b/,
  /\brm\s+-rf\s+\//,
];
 
const isProhibited = prohibitedCommands.some((pattern) =>
  pattern.test(command)
);
 
if (isProhibited) {
  const errorMsg = `Error: This command is not allowed for security reasons. Prohibited commands include: git rm, git reset --hard, git push --force, sudo, and destructive file operations.`;
 
  await mux?.put({
    type: 'coding_agent.run_terminal_command.error',
    data: { command, error: errorMsg },
  });
 
  return errorMsg;
}

Return clear error messages explaining why the command was blocked. This helps the model choose an alternative approach.
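One way to make the blocked-command message more specific is to name the rule that matched and suggest an alternative. The sketch below is illustrative only: the `CommandRule` shape, the `checkCommand` helper, and the suggestion texts are assumptions, not part of the original tool.

```typescript
// Hypothetical rule list: each pattern carries a label and a suggested alternative.
interface CommandRule {
  pattern: RegExp;
  label: string;
  suggestion: string;
}

const COMMAND_RULES: CommandRule[] = [
  {
    pattern: /git\s+reset\s+--hard/,
    label: 'git reset --hard',
    suggestion: 'Use git stash to set changes aside instead.',
  },
  {
    pattern: /git\s+push\s+.*--force/,
    label: 'git push --force',
    suggestion: 'Push without --force.',
  },
  {
    pattern: /\bsudo\b/,
    label: 'sudo',
    suggestion: 'Run the command without elevated privileges.',
  },
];

// Returns null when the command is allowed, or an error string naming the rule that matched.
function checkCommand(command: string): string | null {
  const rule = COMMAND_RULES.find((r) => r.pattern.test(command));
  if (!rule) return null;
  return `Error: "${rule.label}" is not allowed for security reasons. ${rule.suggestion}`;
}
```

Naming the specific rule gives the model less to guess about than a list of every prohibited command.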

Try-Catch with Context

Wrap operations in try-catch blocks and provide context about what failed:

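async execute(input: any, runContext?: RunContext<AgentContext>) {
  // Validate first, as in the sections above
  const parseResult = WRITE_SCHEMA.safeParse(input);
  if (!parseResult.success) {
    return `Error: Invalid input parameters. ${parseResult.error.issues
      .map((e) => `${e.path.join('.')}: ${e.message}`)
      .join(', ')}`;
  }

  const { file_path, content } = parseResult.data;
  const sandboxClient = runContext?.context.sandboxClient;
  const mux = runContext?.context.mux;
  const todos = runContext?.context.todos;

  if (!sandboxClient) {
    // Return instead of throwing so the model sees the failure
    return 'Error: Sandbox client not available. This is an environment problem, so retrying with different parameters will not help.';
  }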
 
  await mux?.put({
    type: 'coding_agent.create_file.started',
    data: { file_path },
  });
 
  try {
    await sandboxClient.writeFile(file_path, content);
 
    await mux?.put({
      type: 'coding_agent.create_file.completed',
      data: { file_path },
    });
 
    return `Successfully created ${file_path}`;
 
  } catch (error) {
    const errorMsg = error instanceof Error ? error.message : String(error);
 
    await mux?.put({
      type: 'coding_agent.create_file.error',
      data: { file_path, error: errorMsg },
    });
 
    return appendTodoContext(`Error creating file: ${errorMsg}`, todos);
  }
}

Key points:

Emit events - Even on error, emit an error event so the frontend knows what happened
Extract message safely - Use error instanceof Error to get the message without throwing
Append context - Include todo context so the model sees current task state
Return, don't throw - Return error strings so the model sees them

The model gets feedback even when operations fail, and can adjust its approach.
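The snippet above calls appendTodoContext without showing it. A minimal sketch of what such a helper might look like follows; the `Todo` shape and the output format are assumptions made for illustration, not the original implementation.

```typescript
// Hypothetical todo shape; the real one is not shown in this article.
interface Todo {
  content: string;
  status: 'pending' | 'in_progress' | 'completed';
}

// Appends a short summary of open tasks to a tool result so the model
// sees the current task state alongside the success or error message.
function appendTodoContext(message: string, todos?: Todo[]): string {
  const open = (todos ?? []).filter((t) => t.status !== 'completed');
  if (open.length === 0) return message;
  const lines = open.map((t) => `- [${t.status}] ${t.content}`);
  return `${message}\n\nOpen todos:\n${lines.join('\n')}`;
}
```

When there are no open todos the message passes through unchanged, so the helper is safe to call on every return path.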

Null Checks for Resources

Check if resources exist before using them. Return helpful errors when they don't:

const content = await sandboxClient.readFile(file_path);
 
if (content === null) {
  const errorMsg = `Error: Could not read file ${file_path}. The file may not exist. Use the Glob tool to search for files if you're unsure of the path.`;
 
  await mux?.put({
    type: 'coding_agent.read_file.error',
    data: { file_path, error: errorMsg },
  });
 
  return appendTodoContext(errorMsg, todos);
}
 
// Handle empty files differently
if (content.trim() === '') {
  const warningMsg = `Warning: File ${file_path} exists but has empty contents.`;
  return appendTodoContext(warningMsg, todos);
}

Distinguish between "doesn't exist" and "exists but empty". The model handles each case differently.
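The same distinction can be made explicit in the types. The sketch below assumes, as in the snippet above, that a read returns null for a missing file; the `ReadOutcome` type and both function names are hypothetical.

```typescript
type ReadOutcome =
  | { kind: 'missing' }
  | { kind: 'empty' }
  | { kind: 'ok'; content: string };

// Classifies the raw read result (null means the file was not found).
function classifyRead(content: string | null): ReadOutcome {
  if (content === null) return { kind: 'missing' };
  if (content.trim() === '') return { kind: 'empty' };
  return { kind: 'ok', content };
}

// Turns each outcome into the text the model will see.
function describeRead(filePath: string, outcome: ReadOutcome): string {
  switch (outcome.kind) {
    case 'missing':
      return `Error: Could not read file ${filePath}. The file may not exist.`;
    case 'empty':
      return `Warning: File ${filePath} exists but has empty contents.`;
    case 'ok':
      return outcome.content;
  }
}
```

Because the switch covers every member of the union, the compiler flags any outcome added later that has no message.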

Retry with Exponential Backoff

Some operations fail temporarily. Retry them with increasing delays:

export async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: {
    maxAttempts?: number;
    initialDelayMs?: number;
    backoffMultiplier?: number;
    maxDelayMs?: number;
    useJitter?: boolean;
  } = {}
): Promise<T> {
  const {
    maxAttempts = 3,
    initialDelayMs = 1000,
    backoffMultiplier = 2,
    maxDelayMs = 30000,
    useJitter = true,
  } = options;
 
  let lastError: unknown;
 
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
 
      if (attempt === maxAttempts - 1) {
        throw error;
      }
 
      // Calculate delay: 1s, 2s, 4s, 8s...
      let delay = initialDelayMs * Math.pow(backoffMultiplier, attempt);
      delay = Math.min(delay, maxDelayMs);
 
      // Add jitter (randomize ±25% of delay)
      if (useJitter) {
        const jitter = delay * 0.25;
        delay = delay - jitter + Math.random() * jitter * 2;
      }
 
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
 
  throw lastError;
}

Use it for operations that might fail due to rate limits or temporary network issues:

const result = await retryWithBackoff(
  () => uploadToSupabase({ file: buffer, path: 'screenshots/image.png' }),
  { maxAttempts: 3, initialDelayMs: 1000 }
);

Jitter prevents thundering herd problems when multiple operations retry simultaneously.
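To see the arithmetic in isolation, the delay calculation can be pulled into a pure function. This is a sketch that mirrors the formula in retryWithBackoff; the `computeDelayMs` name and the injectable `random` parameter are additions for illustration.

```typescript
// Computes the delay before the retry that follows a failed attempt (0-based).
// `random` is injectable so the jitter bounds can be checked deterministically.
function computeDelayMs(
  attempt: number,
  initialDelayMs = 1000,
  backoffMultiplier = 2,
  maxDelayMs = 30000,
  random: () => number = Math.random
): number {
  const base = Math.min(initialDelayMs * Math.pow(backoffMultiplier, attempt), maxDelayMs);
  const jitter = base * 0.25;
  // random() in [0, 1) maps to [base - 25%, base + 25%)
  return base - jitter + random() * jitter * 2;
}

// With random fixed at 0.5 the jitter cancels out and the base schedule shows through.
const schedule = [0, 1, 2, 3].map((a) => computeDelayMs(a, 1000, 2, 30000, () => 0.5));
// schedule is [1000, 2000, 4000, 8000]
```

With the real Math.random, two clients that fail at the same moment get different delays within that ±25% window, so their retries spread out instead of arriving together.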

Database-Specific Error Handling

Database errors need special handling. Map error codes to user-friendly messages and mark which errors are retryable:

interface DatabaseError {
  status: number;
  body: {
    error: string;
    details?: string;
    retryable?: boolean;
    errorType?: string;
  };
}
 
const mapDatabaseError = (error: unknown): DatabaseError => {
  const err = error as { message?: string; code?: string };
  const message = err?.message || 'Internal server error';
  const normalizedMessage = message.toLowerCase();
 
  // Connection pool exhaustion - retryable
  if (
    normalizedMessage.includes('max clients reached') ||
    normalizedMessage.includes('connection pool exhausted')
  ) {
    return {
      status: 503,
      body: {
        error: 'Database connection pool is temporarily exhausted. Please retry.',
        details: message,
        retryable: true,
        errorType: 'POOL_EXHAUSTION',
      },
    };
  }
 
  // Authentication errors - not retryable
  if (normalizedMessage.includes('password authentication failed') || err?.code === '28P01') {
    return {
      status: 401,
      body: {
        error: 'Database authentication failed. Check your connection string and password.',
        details: message,
      },
    };
  }
 
  // Generic error
  return {
    status: 500,
    body: { error: message },
  };
};

Use the mapped error with retry logic:

const MAX_RETRIES = 3;
 
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
  try {
    const result = await pool.query({ text: query, values: params });
    return c.json(result.rows);
  } catch (error) {
    const mapped = mapDatabaseError(error);
 
    // Retry if error is retryable and we have attempts left
    if (mapped.body?.retryable && attempt < MAX_RETRIES) {
      await new Promise(resolve => setTimeout(resolve, 500 * attempt));
      continue;
    }
 
    return c.json(mapped.body, mapped.status);
  }
}

This automatically retries connection pool errors but immediately fails on authentication errors.
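The retry-only-if-retryable idea can also be factored into a reusable helper. The following is a sketch under assumptions: `retryIf` and `isPoolExhaustion` are hypothetical names, and the predicate simply reuses the message checks from mapDatabaseError.

```typescript
// Retries `operation` only while `isRetryable(error)` returns true.
async function retryIf<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 3,
  delayMs = 500
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on the last attempt or on any error the predicate rejects
      if (attempt >= maxAttempts || !isRetryable(error)) throw error;
      // Linear backoff: delayMs, 2 * delayMs, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}

// Hypothetical predicate: pool exhaustion is transient, everything else is permanent.
const isPoolExhaustion = (error: unknown): boolean =>
  error instanceof Error &&
  /max clients reached|connection pool exhausted/i.test(error.message);
```

A call might look like `retryIf(() => pool.query({ text: query, values: params }), isPoolExhaustion)`, which leaves authentication failures to surface on the first attempt.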

Enum-Based Error Status

For complex systems, use enums to standardize error responses:

export enum GitHubStatus {
  SUCCESS = 'SUCCESS',
  INSTALLATION_NOT_FOUND = 'INSTALLATION_NOT_FOUND',
  INSTALLATION_SUSPENDED = 'INSTALLATION_SUSPENDED',
  REPO_NOT_FOUND = 'REPO_NOT_FOUND',
  PUSH_ACCESS_DENIED = 'PUSH_ACCESS_DENIED',
  NETWORK_ERROR = 'NETWORK_ERROR',
  UNKNOWN_ERROR = 'UNKNOWN_ERROR',
}
 
const STATUS_MESSAGES: Record<GitHubStatus, string> = {
  [GitHubStatus.SUCCESS]: 'Operation completed successfully',
  [GitHubStatus.INSTALLATION_NOT_FOUND]: 'GitHub connection not found. Please connect your repository.',
  [GitHubStatus.INSTALLATION_SUSPENDED]: 'GitHub installation is suspended. Check your GitHub settings.',
  [GitHubStatus.REPO_NOT_FOUND]: 'Repository not found. Verify the repository exists and you have access.',
  [GitHubStatus.PUSH_ACCESS_DENIED]: 'Push access denied. You need write permissions for this repository.',
  [GitHubStatus.NETWORK_ERROR]: 'Network error. Please check your connection and try again.',
  [GitHubStatus.UNKNOWN_ERROR]: 'An unexpected error occurred.',
};
 
export interface GitHubResponse<T = any> {
  status: GitHubStatus;
  message: string;
  data?: T;
}
 
export const createResponse = <T = any>(
  status: GitHubStatus,
  data?: T
): GitHubResponse<T> => ({
  status,
  message: STATUS_MESSAGES[status],
  // Check against undefined so falsy payloads (0, '', false) are still included
  ...(data !== undefined && { data }),
});

Use it to return consistent responses:

try {
  const result = await octokit.repos.get({ owner, repo });
  return createResponse(GitHubStatus.SUCCESS, result.data);
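} catch (error) {
  // Caught values are `unknown` in strict TypeScript; read the HTTP status defensively
  const status = (error as { status?: number } | null)?.status;
  if (status === 404) {
    return createResponse(GitHubStatus.REPO_NOT_FOUND);
  }
  if (status === 403) {
    return createResponse(GitHubStatus.PUSH_ACCESS_DENIED);
  }
  return createResponse(GitHubStatus.UNKNOWN_ERROR);
}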

This separates error classification from error messages. You can change messages without touching error handling logic.
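The separation becomes clearer when classification lives in its own function. The sketch below is self-contained, so it uses a trimmed-down string union rather than the GitHubStatus enum; `RepoStatus`, `classifyError`, and `toResponse` are hypothetical names.

```typescript
type RepoStatus = 'SUCCESS' | 'REPO_NOT_FOUND' | 'ACCESS_DENIED' | 'UNKNOWN_ERROR';

// Message text lives in one table and can change without touching classification.
const MESSAGES: Record<RepoStatus, string> = {
  SUCCESS: 'Operation completed successfully',
  REPO_NOT_FOUND: 'Repository not found. Verify the repository exists and you have access.',
  ACCESS_DENIED: 'Access denied. Check your permissions for this repository.',
  UNKNOWN_ERROR: 'An unexpected error occurred.',
};

// Classification only: maps a caught value to a status, with no message text.
function classifyError(error: unknown): RepoStatus {
  const status = (error as { status?: unknown } | null)?.status;
  if (status === 404) return 'REPO_NOT_FOUND';
  if (status === 403) return 'ACCESS_DENIED';
  return 'UNKNOWN_ERROR';
}

function toResponse(status: RepoStatus): { status: RepoStatus; message: string } {
  return { status, message: MESSAGES[status] };
}
```

A catch block then reduces to `return toResponse(classifyError(error))`, and both halves can be tested independently.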

Graceful Degradation

Some failures shouldn't stop the entire operation. Use Promise.allSettled to continue even when some operations fail:

const nextConfigFiles = ['next.config.js', 'next.config.mjs', 'next.config.ts'];
 
const results = await Promise.allSettled(
  nextConfigFiles.map((configFile) => sandboxClient.readFile(configFile))
);
 
const isNextJsProject = results.some(
  (result) => result.status === 'fulfilled' && result.value !== null
);
 
if (isNextJsProject) {
  // Adjust behavior for Next.js projects
}

The operation continues even if some config files don't exist. This pattern is useful for optional features or multi-source data fetching.
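For multi-source fetching, it usually helps to keep the successful values and record why the others failed. Promise.allSettled matters most when a source can reject rather than return null. The `loadAll` helper below is a hypothetical sketch of that pattern.

```typescript
// Runs every loader, keeps the values that succeeded, and records why the others failed.
async function loadAll<T>(
  loaders: Record<string, () => Promise<T>>
): Promise<{ values: Record<string, T>; failures: string[] }> {
  const names = Object.keys(loaders);
  const results = await Promise.allSettled(names.map((name) => loaders[name]()));

  const values: Record<string, T> = {};
  const failures: string[] = [];
  results.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      values[names[i]] = result.value;
    } else {
      const reason =
        result.reason instanceof Error ? result.reason.message : String(result.reason);
      failures.push(`${names[i]}: ${reason}`);
    }
  });
  return { values, failures };
}
```

The `failures` list can be appended to the tool result so the model knows which sources were skipped.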

Logging with Context

Add context prefixes to logs so you can trace errors through the system:

console.error('[Database Studio] Error closing pool:', err);
console.error('[Rate Limit Redis] Connection error:', err.message);
console.error('[Webhook] Signature verification failed');
console.warn('[Webhook] Could not extract projectId from payload');

The prefix tells you which component logged the error. This is critical when debugging production issues across multiple services.
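To avoid retyping prefixes, a small factory can stamp them on automatically. This is a sketch; `createLogger` and the swappable `sink` parameter are assumptions, not part of the original code.

```typescript
type LogFn = (message: string, ...details: unknown[]) => void;

// Creates logging functions that stamp every message with a component prefix.
// `sink` defaults to console but can be swapped out, for example in tests.
function createLogger(
  component: string,
  sink: { error: LogFn; warn: LogFn } = console
): { error: LogFn; warn: LogFn } {
  const prefix = `[${component}]`;
  return {
    error: (message, ...details) => sink.error(`${prefix} ${message}`, ...details),
    warn: (message, ...details) => sink.warn(`${prefix} ${message}`, ...details),
  };
}

const log = createLogger('Webhook');
log.warn('Could not extract projectId from payload');
```

Each module creates its logger once, so a prefix can't be forgotten or misspelled at individual call sites.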

Resource Cleanup on Error

Use finally blocks to ensure cleanup happens even when operations fail:

const fileLockManager = runContext?.context.fileLockManager;
 
const releaseLock = fileLockManager
  ? await fileLockManager.acquireLock(file_path)
  : null;
 
let updatedCode = '';
 
try {
  // Re-read file after acquiring lock
  const currentCode = await sandboxClient.readFile(file_path);
  // Compare against null so an empty file is not mistaken for a deleted one
  if (currentCode === null) {
    throw new Error(`File ${file_path} was deleted while waiting for lock`);
  }
 
  // Apply edit
  updatedCode = await applyCodeEdit(currentCode, edit, instruction);
 
  // Write updated content
  await sandboxClient.writeFile(file_path, updatedCode);
 
} finally {
  // Always release lock, even if write fails
  if (releaseLock) {
    releaseLock();
  }
}

The lock is released whether the operation succeeds or fails. Without this, a failed operation would leave the file locked forever.
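The acquire, try, finally shape can be wrapped once and reused. The sketch below assumes that acquiring a lock resolves to a release function, as in the snippet above; `withLock` and the `Release` type are hypothetical.

```typescript
type Release = () => void | Promise<void>;

// Acquires a lock, runs `work`, and releases the lock no matter how `work` ends.
// `acquire` is a stand-in for something like fileLockManager.acquireLock(file_path).
async function withLock<T>(
  acquire: () => Promise<Release>,
  work: () => Promise<T>
): Promise<T> {
  const release = await acquire();
  try {
    return await work();
  } finally {
    await release();
  }
}
```

Callers then cannot forget the finally block, because the release is built into the helper.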

Runtime Error Detection

Some errors only appear after execution completes. Check for them asynchronously:

export async function checkRuntimeErrors(
  projectId: string,
  waitTime: number = 4.0
): Promise<string> {
  try {
    // Wait for errors to be captured
    await new Promise(resolve => setTimeout(resolve, waitTime * 1000));
 
    const redisErrors = await getRedisErrors(projectId);
 
    if (redisErrors && redisErrors.length > 0) {
      const errorMessages = redisErrors
        // Drop entries without a message before formatting
        .filter(error => Boolean(error.message))
        .map(error => `${error.name || 'Error'}: ${error.message}`);
 
      return errorMessages.join('\n\n');
    }
 
    return '';
  } catch (error) {
    console.error('[Runtime Errors] Failed to check:', error);
    return ''; // Return empty string on error, don't fail the operation
  }
}

Use this after file writes to catch syntax errors, type errors, or runtime exceptions:

await sandboxClient.writeFile(file_path, content);
 
let result = `Successfully wrote ${file_path}`;
 
const projectId = runContext?.context.projectId;
if (projectId) {
  const errors = await checkRuntimeErrors(projectId);
  if (errors && errors.trim().length > 0) {
    result += `\n\nRuntime errors detected:\n${errors}`;
  }
}
 
return result;

The model sees the errors immediately and can fix them in the next turn.

HTTP Status Codes

Use appropriate status codes for different error types:

// 400 - Bad Request (client sent invalid data)
if (!body.role || !['user', 'assistant'].includes(body.role)) {
  return c.json({ error: 'Role must be either "user" or "assistant"' }, 400);
}
 
// 401 - Unauthorized (authentication failed)
if (!isValidSignature(signature, body)) {
  return c.json({ error: 'Invalid signature' }, 401);
}
 
// 404 - Not Found (resource doesn't exist)
if (!project) {
  return c.json({ error: 'Project not found' }, 404);
}
 
// 429 - Too Many Requests (rate limited)
if (await isRateLimited(userId)) {
  return c.json({ error: 'Too many requests. Please try again later.' }, 429);
}
 
// 500 - Internal Server Error (something broke on our side)
try {
  // ...handler logic
} catch (error) {
  console.error('[API] Unexpected error:', error);
  return c.json({ error: 'Internal server error' }, 500);
}
 
// 503 - Service Unavailable (temporary failure, retry possible)
if (isPoolExhausted) {
  return c.json({
    error: 'Database temporarily unavailable. Please retry.',
    retryable: true
  }, 503);
}

Status codes help clients (including other services) understand what went wrong and whether to retry.
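On the client side, the retry decision can be read straight off the status code and the optional retryable flag. This is a sketch under the conventions listed above; `shouldRetry` is a hypothetical helper, and a real client may also want to honor a Retry-After header.

```typescript
// Decides whether a client should retry, based on the status codes listed above.
function shouldRetry(status: number, body?: { retryable?: boolean }): boolean {
  if (body?.retryable === true) return true; // the server said so explicitly
  if (status === 429 || status === 503) return true; // rate limited or temporarily unavailable
  return false; // 400, 401, 404, 500: resending the same request is unlikely to help
}
```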

Common Mistakes

Throwing instead of returning: The model can't see thrown exceptions. Return error strings.

// Bad: Model sees nothing
if (!file_path) throw new Error('Missing file_path');
 
// Good: Model sees the error
if (!file_path) return 'Error: file_path is required';
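As a safety net, an execute function can be wrapped so that anything it throws still comes back as a string. The wrapper below is a sketch; `returnErrors` and the `ToolExecute` type are hypothetical and independent of any particular agent framework.

```typescript
type ToolExecute<I> = (input: I) => Promise<string>;

// Wraps a tool's execute function so that any thrown error comes back
// as an "Error in <tool>: ..." string instead of propagating.
function returnErrors<I>(toolName: string, execute: ToolExecute<I>): ToolExecute<I> {
  return async (input) => {
    try {
      return await execute(input);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return `Error in ${toolName}: ${message}`;
    }
  };
}
```

Explicit returns with specific messages are still preferable; the wrapper only covers the throws nobody anticipated.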

Generic error messages: "Error: Invalid input" doesn't help. Be specific.

// Bad
return 'Error: Invalid input';
 
// Good
return 'Error: file_path must be a string, received undefined';

No retry for transient failures: Network issues and rate limits are temporary. Retry them.

Missing error events: Emit error events to the message queue so the frontend knows what failed.

Not cleaning up resources: Use finally blocks to release locks, close connections, and clean up even when operations fail.

What We're Skipping

Circuit breakers, error budgets, distributed tracing, structured logging frameworks, custom error classes. These add sophistication but aren't necessary initially.

The patterns here—validation, try-catch, retry, graceful degradation, clear messages—handle most failure scenarios. Start with these. Add complexity when you need it.

What's Next

Your tools now handle errors gracefully. They validate inputs, retry transient failures, provide clear feedback, and clean up resources even when things go wrong.

Next, we'll look at how to test these error handling patterns to ensure they work correctly under failure conditions. Testing error paths is just as important as testing happy paths.