Back to home

02 — TOOL CREATION

Tool structure using @openai/agents SDK

8 min read

Your agent needs hands. Reading files, querying databases, executing commands—these are tools. Without them, your agent is just a chatbot with opinions. With them, it becomes something that can actually do work.

But tools aren't just functions you pass to the LLM. They need structure. Parameter validation so the model can't send garbage. Context access so they can read state and emit updates. Error handling that gives useful feedback instead of cryptic crashes.

We'll build tools using the @openai/agents SDK. You'll see how to define parameters with Zod, access runtime context, and structure tools so they're maintainable when you have dozens of them.

Why Tool Structure Matters

The LLM decides which tools to call based on their descriptions. It generates parameters based on your schema. If your schema is vague, the model guesses. If your validation is weak, you get runtime errors.

Good tool structure gives you:

Type safety: Zod validates parameters before execution. Wrong types? The tool fails fast with a clear message.

Context access: Tools need more than just parameters. They need to read project state, emit progress updates, and coordinate with other systems.

Clear contracts: Your tool description tells the model what it does. Your parameters define what it needs. Get both right and the model uses your tools correctly.

Get it wrong and you debug obscure failures where the model sends undefined when you expected a string, or calls tools in the wrong order because the descriptions were ambiguous.

Basic Tool Definition

Here's the simplest tool structure:

import { tool } from '@openai/agents';
import { promises as fs } from 'node:fs';
import { z } from 'zod';
 
export const readFileTool = tool({
  name: 'read_file',
  description: 'Reads the contents of a file from the filesystem',
  parameters: z.object({
    file_path: z.string().describe('The absolute path to the file to read'),
  }),
  async execute({ file_path }) {
    try {
      const content = await fs.readFile(file_path, 'utf-8');
      return `File contents:\n${content}`;
    } catch (error) {
      return `Error reading file: ${error instanceof Error ? error.message : String(error)}`;
    }
  }
});

Four fields define the tool:

  • name: How the model references the tool
  • description: What the model uses to decide when to call it
  • parameters: Zod schema defining what the tool expects
  • execute: Async function that does the work

The model sees the name and description. It generates parameters matching your schema. Your execute function receives validated parameters.
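
To wire this into an agent, register the tool and run it. A minimal sketch, assuming the agent name, instructions, and prompt are placeholders of your own choosing:

import { Agent, run } from '@openai/agents';
 
// Hypothetical agent wiring: name and instructions are illustrative
const fileAgent = new Agent({
  name: 'File assistant',
  instructions: 'Use read_file to inspect files before answering.',
  tools: [readFileTool],
});
 
const result = await run(fileAgent, 'Summarize the contents of /tmp/notes.txt');
console.log(result.finalOutput);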

Parameter Schemas with Zod

Zod gives you type-safe parameter validation. The model generates JSON. Zod parses it and validates types before your tool runs.
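
You can run the same validation yourself to see what happens to a bad tool call. A quick sketch with a deliberately wrong type:

const fileParams = z.object({
  file_path: z.string().describe('The absolute path to the file to read'),
});
 
// Simulate the model sending a number where a string is expected
const parsed = fileParams.safeParse({ file_path: 42 });
 
if (!parsed.success) {
  // Reports an invalid_type issue along the lines of "Expected string, received number"
  console.log(parsed.error.issues);
}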

Required vs Optional Parameters

parameters: z.object({
  // Required string
  table_name: z.string().describe('Name of the database table'),
 
  // Optional string
  where_clause: z.string().optional().describe('SQL WHERE clause filter'),
 
  // Nullable string
  order_by: z.string().nullable().describe('Column to sort by'),
 
  // With default value
  method: z.string().default('GET').describe('HTTP method (GET, POST, PUT, DELETE)'),
})

Use .optional() when the parameter might not be provided. Use .nullable() when it can be explicitly null. Use .default() to provide fallback values.
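
The same schema also determines the TypeScript types your execute function receives. A small sketch of the inferred type (defaults are filled in after parsing, so method is always a string):

const queryParams = z.object({
  table_name: z.string(),
  where_clause: z.string().optional(),
  order_by: z.string().nullable(),
  method: z.string().default('GET'),
});
 
type QueryParams = z.infer<typeof queryParams>;
// {
//   table_name: string;
//   where_clause?: string | undefined;
//   order_by: string | null;
//   method: string;
// }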

Complex Parameter Types

parameters: z.object({
  // Enum values
  operation: z.enum(['create', 'update', 'delete']).describe('Type of database operation'),
 
  // Array of strings
  columns: z.array(z.string()).describe('List of column names to include'),
 
  // Number with constraints
  limit: z.number().min(1).max(100).describe('Maximum number of results (1-100)'),
 
  // Boolean flag
  include_metadata: z.boolean().describe('Whether to include table metadata'),
})

Use enums to restrict choices. The model will only pick from your options. Use arrays when you need multiple values. Add constraints like .min() and .max() for validation.

Descriptions Matter

The .describe() calls aren't just documentation. The model reads them:

// Bad: Vague description
file_path: z.string().describe('A file path')
 
// Good: Specific description with examples
file_path: z.string().describe(
  'Path to the API route file (e.g., "src/app/api/users/route.ts")'
)
 
// Bad: No guidance
diff: z.string().describe('Code changes')
 
// Good: Format specification
diff: z.string().describe(
  'A formatted code diff representing the changes to apply'
)

Be specific. Include examples. Specify formats. The model uses these to generate correct parameters.

Accessing Runtime Context

Tools need more than parameters. They need to access project state, emit events, read configuration. That's what RunContext provides.

Defining Context Type

First, define what context your tools need:

// types/agent.ts
export interface DatabaseAgentContext {
  fileSystem: FileSystemClient;
  mux: EventEmitter;
  helpers: AgentHelpers;
  changesTracked: Array<{ filePath: string; content: string }>;
  databaseUrl: string;
  apiRouteGenerationModel: LanguageModel;
  seederGenerationModel: LanguageModel;
  abortSignal?: AbortSignal;
}

This defines what your tools can access. File system client for file operations. Event emitter for progress updates. Helpers for common operations. Models for sub-generation tasks.
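
The context object itself is supplied when you start a run, and the SDK hands it to each tool as runContext. A hedged sketch, assuming databaseAgent and the clients, helpers, and models shown above already exist in your application code:

import { run } from '@openai/agents';
 
// Assemble the shared state your tools expect (all of these values come
// from your own application, not from the SDK)
const context: DatabaseAgentContext = {
  fileSystem,
  mux,
  helpers,
  changesTracked: [],
  databaseUrl: process.env.DATABASE_URL!,
  apiRouteGenerationModel,
  seederGenerationModel,
};
 
const result = await run(databaseAgent, 'Add a users table with an email column', {
  context,
});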

Using Context in Tools

import { RunContext } from '@openai/agents';
 
export const manageTableSchemaTool = tool({
  name: 'manage_table_schema',
  description: 'Modifies database table schema',
  parameters: z.object({
    diff: z.string().describe('Code diff to apply to schema'),
  }),
  async execute({ diff }, runContext?: RunContext<DatabaseAgentContext>) {
    // Extract context
    const { fileSystem, mux, helpers, changesTracked } =
      runContext?.context || {};
 
    // Validate context is available
    if (!fileSystem || !mux || !helpers) {
      throw new Error('Context not available');
    }
 
    // Use context for operations
    await helpers.emitStepStarted("Setting up tables", "schema");
 
    const currentSchema = await fileSystem.readFile("src/db/schema.ts");
    const updatedSchema = await applyCodeEdit(currentSchema, diff);
 
    await fileSystem.writeFile("src/db/schema.ts", updatedSchema);
 
    // Track changes
    changesTracked?.push({
      filePath: "src/db/schema.ts",
      content: updatedSchema
    });
 
    await helpers.emitStepCompleted("Set up tables", "schema");
 
    return `✅ Schema updated successfully`;
  }
});

The runContext parameter gives you access to shared state. Extract what you need. Validate it exists. Use it to coordinate operations.
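
That extract-and-validate boilerplate repeats in every tool. One way to factor it out is a small helper that throws once if anything essential is missing; this is a sketch, not part of the SDK:

import { RunContext } from '@openai/agents';
 
// Hypothetical helper: narrows the optional runContext into a usable context
function requireContext(
  runContext?: RunContext<DatabaseAgentContext>
): DatabaseAgentContext {
  const ctx = runContext?.context;
  if (!ctx?.fileSystem || !ctx?.mux || !ctx?.helpers) {
    throw new Error('Context not available');
  }
  return ctx;
}
 
// Inside a tool's execute:
// const { fileSystem, helpers, changesTracked } = requireContext(runContext);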

Real-World Tool Examples

Let's look at complete tools from actual agent implementations.

File Management Tool

export const manageApiRoutesTool = tool({
  name: 'manage_api_routes',
  description: 'Creates or modifies API route files in Next.js applications',
  parameters: z.object({
    file_path: z.string().describe(
      'Path to the API route file (e.g., "src/app/api/users/route.ts")'
    ),
    diff: z.string().describe(
      'Instant apply formatted diff showing the changes to apply'
    )
  }),
  async execute({ file_path, diff }, runContext?: RunContext<DatabaseAgentContext>) {
    const { fileSystem, helpers, changesTracked } = runContext?.context || {};
    if (!fileSystem || !helpers) throw new Error('Context not available');
 
    try {
      // Emit progress
      await helpers.emitStepStarted("Editing API routes", "api", {
        api_path: file_path,
        operation: "modify"
      });
 
      // Read existing file or create template
      let currentContent: string;
      try {
        currentContent = await fileSystem.readFile(file_path) ||
          "import { NextRequest, NextResponse } from 'next/server';";
      } catch {
        currentContent = "import { NextRequest, NextResponse } from 'next/server';";
      }
 
      // Validate input
      if (!diff.trim()) {
        const errorMsg = "❌ API changes are required. Please specify what changes to make.";
        await helpers.emitStepFailed("Created API routes", errorMsg, "api");
        return errorMsg;
      }
 
      // Apply changes
      const updatedContent = await applyCodeEdit(currentContent, diff);
      await fileSystem.writeFile(file_path, updatedContent);
 
      // Track for later use
      changesTracked?.push({
        filePath: file_path,
        content: updatedContent
      });
 
      await helpers.emitStepCompleted("Edited API routes", "api");
 
      return `✅ Successfully modified API route at ${file_path}`;
 
    } catch (error) {
      const errorMsg = `Error: ${error instanceof Error ? error.message : String(error)}`;
      await helpers.emitStepFailed("Created API routes", errorMsg, "api");
      return errorMsg;
    }
  }
});

Notice the pattern (a reusable skeleton follows the list):

  1. Validate context exists
  2. Emit progress event (started)
  3. Read current state
  4. Validate input
  5. Perform operation
  6. Track changes
  7. Emit completion
  8. Return clear message
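
That sequence is worth keeping as a template. A stripped-down skeleton (the name, messages, and file path are placeholders; applyCodeEdit is the same project helper used above):

export const exampleTool = tool({
  name: 'example_tool',
  description: 'One-sentence description the model can act on',
  parameters: z.object({
    input: z.string().describe('What the model must provide'),
  }),
  async execute({ input }, runContext?: RunContext<DatabaseAgentContext>) {
    const { fileSystem, helpers, changesTracked } = runContext?.context || {};
    if (!fileSystem || !helpers) throw new Error('Context not available'); // 1. validate context
 
    try {
      await helpers.emitStepStarted('Doing the thing', 'example');          // 2. emit started
      const current = await fileSystem.readFile('some/file.ts');            // 3. read current state
      if (!input.trim()) return '❌ Input is required.';                    // 4. validate input
 
      const updated = await applyCodeEdit(current, input);                  // 5. perform operation
      await fileSystem.writeFile('some/file.ts', updated);
      changesTracked?.push({ filePath: 'some/file.ts', content: updated }); // 6. track changes
 
      await helpers.emitStepCompleted('Did the thing', 'example');          // 7. emit completion
      return '✅ Done';                                                      // 8. clear message
    } catch (error) {
      const msg = `Error: ${error instanceof Error ? error.message : String(error)}`;
      await helpers.emitStepFailed('Doing the thing', msg, 'example');
      return msg;
    }
  },
});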

LLM Generation Tool

Sometimes tools need to call other models for sub-tasks:

// generateText is assumed to come from the Vercel AI SDK ('ai' package)
import { generateText } from 'ai';
 
export const generateSeederFileTool = tool({
  name: 'generate_seeder_file',
  description: 'Generates database seeder files with sample data',
  parameters: z.object({
    table_name: z.string().describe('Name of the table to create seeder for'),
    instructions: z.string().describe(
      'Comprehensive description of the sample data to generate, ' +
      'including patterns, distributions, and specific requirements'
    )
  }),
  async execute({ table_name, instructions }, runContext?: RunContext<DatabaseAgentContext>) {
    const { fileSystem, helpers, changesTracked } = runContext?.context || {};
    const model = runContext?.context.seederGenerationModel;
    const abortSignal = runContext?.context.abortSignal;
 
    if (!fileSystem || !helpers || !model) {
      throw new Error('Required context not available');
    }
 
    try {
      const seederFilePath = `src/db/seeds/${table_name}.ts`;
 
      await helpers.emitStepStarted("Adding test data", "seeder", {
        table_name,
        seeder_path: seederFilePath
      });
 
      // Read schema for context
      const schemaContext = await fileSystem.readFile("src/db/schema.ts");
      if (!schemaContext) {
        throw new Error("Schema file not found");
      }
 
      // Generate code using LLM
      const { text: rawCode } = await generateText({
        model,
        messages: [
          { role: 'system', content: SEEDER_GENERATION_PROMPT },
          { role: 'user', content: generateSeederPrompt(instructions, schemaContext, table_name) }
        ],
        maxRetries: 3,
        ...(abortSignal ? { abortSignal } : {}),
      });
 
      // Clean up markdown fences
      let generatedCode = rawCode;
      if (generatedCode.includes('```')) {
        generatedCode = generatedCode
          .replace(/```(?:typescript|ts|javascript|js)?\n?/g, '')
          .replace(/```$/g, '')
          .trim();
      }
 
      // Write to filesystem
      await fileSystem.writeFile(seederFilePath, generatedCode);
 
      // Track changes
      changesTracked?.push({
        filePath: seederFilePath,
        content: generatedCode
      });
 
      // Execute the seeder
      await helpers.runSeederCommand(table_name);
 
      await helpers.emitStepCompleted("Added test data", "seeder");
 
      return `✅ Successfully generated and executed seeder for '${table_name}'`;
 
    } catch (error) {
      const errorMsg = `Error: ${error instanceof Error ? error.message : String(error)}`;
      await helpers.emitStepFailed("Added test data", errorMsg, "seeder");
      return errorMsg;
    }
  }
});

Key points:

  • Sub-models passed via context
  • Respect abort signals for cancellation (see the wiring sketch below)
  • Clean up LLM output (markdown fences)
  • Chain operations (generate → write → execute)
  • Track all changes for rollback
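
The abort signal in that list has to come from somewhere. A sketch of one way to wire cancellation through the shared context; the cancel trigger is whatever your application uses (a button, a timeout, a dropped connection):

// One AbortController per agent run
const controller = new AbortController();
 
// Include the signal in the shared context so every tool can respect it
const context = {
  // ...fileSystem, mux, helpers, models, and the rest of DatabaseAgentContext
  abortSignal: controller.signal,
};
 
// When the user cancels, abort once and in-flight work stops:
controller.abort();
 
// Long-running calls pass the signal along, as the seeder tool does:
// await generateText({ model, messages, abortSignal });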

HTTP Request Tool

export const testApiRouteTool = tool({
  name: 'test_api_route',
  description: 'Tests API endpoints by making HTTP requests',
  parameters: z.object({
    endpoint: z.string().describe(
      'The API endpoint to test (e.g., "/api/users")'
    ),
    method: z.string().default('GET').describe(
      'HTTP method (GET, POST, PUT, DELETE, PATCH). Defaults to GET.'
    ),
    data: z.string().nullable().optional().describe(
      'JSON data for POST/PUT/PATCH requests'
    ),
    headers: z.string().nullable().optional().describe(
      'Additional headers (e.g., "Content-Type: application/json")'
    )
  }),
  async execute({ endpoint, method = 'GET', data, headers }, runContext?: RunContext<DatabaseAgentContext>) {
    const { helpers } = runContext?.context || {};
    if (!helpers) throw new Error('Context not available');
 
    try {
      // Normalize endpoint to full URL
      const fullUrl = endpoint.startsWith("http")
        ? endpoint
        : `http://localhost:3000${endpoint.startsWith("/") ? endpoint : "/" + endpoint}`;
 
      await helpers.emitStepStarted("Testing API route", "test", {
        endpoint: fullUrl,
        method: method.toUpperCase()
      });
 
      // Build fetch options with an indexable headers object
      const requestHeaders: Record<string, string> = {};
      const fetchOptions: RequestInit = {
        method: method.toUpperCase(),
        headers: requestHeaders,
      };
 
      // Add headers
      if (headers) {
        const headerLines = headers.trim().split('\n');
        for (const header of headerLines) {
          const [key, ...valueParts] = header.split(':');
          if (key && valueParts.length > 0) {
            requestHeaders[key.trim()] = valueParts.join(':').trim();
          }
        }
      } else if (["POST", "PUT", "PATCH"].includes(method.toUpperCase()) && data) {
        requestHeaders['Content-Type'] = 'application/json';
      }
 
      // Add data
      if (data && ["POST", "PUT", "PATCH"].includes(method.toUpperCase())) {
        fetchOptions.body = data;
      }
 
      // Execute request
      const response = await fetch(fullUrl, fetchOptions);
      const statusCode = response.status.toString();
      const responseText = await response.text();
 
      // Format JSON if possible
      let formattedResponse;
      try {
        const parsed = JSON.parse(responseText);
        formattedResponse = JSON.stringify(parsed, null, 2);
      } catch {
        formattedResponse = responseText;
      }
 
      await helpers.emitStepCompleted("Tested API route", "test", {
        endpoint: fullUrl,
        status_code: statusCode
      });
 
      if (statusCode.startsWith("2")) {
        return `✅ API test successful!\n\nStatus: ${statusCode}\nResponse:\n\`\`\`json\n${formattedResponse}\n\`\`\``;
      } else {
        return `⚠️ API test completed with status ${statusCode}\n\nResponse:\n\`\`\`json\n${formattedResponse}\n\`\`\``;
      }
 
    } catch (error) {
      const errorMsg = `Error testing API: ${error instanceof Error ? error.message : String(error)}`;
      await helpers.emitStepFailed("Tested API route", errorMsg, "test");
      return errorMsg;
    }
  }
});

This shows:

  • URL normalization
  • Building fetch options with proper headers
  • Response parsing and status handling
  • JSON formatting for readability
  • Different success messages based on status code

Organizing Multiple Tools

When you have dozens of tools, organization matters:

// tools/database/index.ts
export const manageTableSchemaTool = tool({ /* ... */ });
export const manageApiRoutesTool = tool({ /* ... */ });
export const generateSeederFileTool = tool({ /* ... */ });
export const executeSqlQueryTool = tool({ /* ... */ });
export const testApiRouteTool = tool({ /* ... */ });
 
// Export as array for easy registration
export const allDatabaseTools = [
  manageTableSchemaTool,
  manageApiRoutesTool,
  generateSeederFileTool,
  executeSqlQueryTool,
  testApiRouteTool,
];
 
export default allDatabaseTools;

Group related tools together and export them as arrays. That makes it easy to register the right tools for each agent type:

import allDatabaseTools from './tools/database';
import allDesignTools from './tools/design';
 
// Database agent gets database tools
const dbAgent = new Agent({
  tools: allDatabaseTools,
  // ...
});
 
// Design agent gets design tools
const designAgent = new Agent({
  tools: allDesignTools,
  // ...
});

Different agents, different capabilities.
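
Because the exports are plain arrays, composing capabilities is just array composition. A hypothetical full-stack agent could take both sets:

const fullstackAgent = new Agent({
  tools: [...allDatabaseTools, ...allDesignTools],
  // ...
});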

Error Handling Patterns

Good error handling gives the model useful feedback:

Input Validation Errors

if (!diff.trim()) {
  const errorMsg = "❌ Changes are required. Please specify what changes to make.";
  await helpers.emitStepFailed("Operation", errorMsg);
  return errorMsg; // Don't throw - return message so model sees it
}
 
if (!command.startsWith("npm")) {
  return "❌ Invalid command. Must start with 'npm'";
}

Return error messages instead of throwing. The model sees the message and can adjust its next call.

Operation Errors

try {
  const result = await fileSystem.writeFile(filePath, content);
 
  if (!result.success) {
    await helpers.emitStepFailed("Operation", result.error);
    return `❌ Operation failed:\n${result.error}`;
  }
 
  await helpers.emitStepCompleted("Operation");
  return `✅ Success: File written successfully`;
 
} catch (error) {
  const errorMsg = `Error: ${error instanceof Error ? error.message : String(error)}`;
  await helpers.emitStepFailed("Operation", errorMsg);
  return errorMsg;
}

Emit events for both success and failure. Return detailed messages. The model uses these to decide what to do next.

Retries and Fallbacks

Some operations should retry:

const maxAttempts = 3;
let response = null;
 
for (let attempt = 0; attempt < maxAttempts; attempt++) {
  try {
    response = await fetch(url);
 
    if (response.status === 502 && attempt < maxAttempts - 1) {
      // Retry on gateway errors
      await helpers.startService(); // Try to fix the issue
      await new Promise(resolve => setTimeout(resolve, 1000));
      continue;
    }
 
    break; // Success or non-retryable error
  } catch (error) {
    if (attempt === maxAttempts - 1) throw error;
    await new Promise(resolve => setTimeout(resolve, 1000));
  }
}

Retry transient failures. Try to recover from known issues. Give up after the maximum attempts with a clear message.
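
The same loop can be factored into a reusable helper so individual tools stay readable. A sketch, with arbitrary defaults for the attempt count and delay:

// Generic retry wrapper: retries on thrown errors with a fixed delay between attempts
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await new Promise(resolve => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
 
// Usage inside a tool:
// const response = await withRetry(() => fetch(fullUrl, fetchOptions));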

Common Patterns

Progress Tracking

await helpers.emitStepStarted("Operation name", "category", {
  additional: "context"
});
 
// Do work
 
await helpers.emitStepCompleted("Operation name", "category", {
  result: "data"
});

Emit events so the frontend shows progress. Include context so users know what's happening.

Change Tracking

// Track changes for later rollback or review
changesTracked?.push({
  filePath: file_path,
  content: updatedContent
});

Keep a record of what changed. Useful for rollback, review, or git commits.

Context Passing

// Update context for subsequent tools
if (runContext?.context) {
  runContext.context.clonedWebsiteContext = clonedData;
}

Tools can pass data to each other through shared context.

What We're Skipping

Tool versioning, dynamic tool loading, tool result caching, complex parameter transformations. These matter for production systems but not for understanding the basics.

Focus on getting tools working first. Make them reliable. Add sophistication later.

What's Next

You have tools that the agent can call. Next, we'll look at agent context and state management—how to manage conversation history, coordinate between tools, and handle long-running operations.

That's where your agent becomes stateful and can handle complex multi-step tasks.