
02 — TOOL CREATION

Tool execution with RunContext


Your tool receives parameters from the LLM. But parameters aren't enough. Where does the tool read files from? How does it tell the frontend what's happening? How does it coordinate with other tools running concurrently?

This is what RunContext<AgentContext> solves. It's the shared infrastructure your tools need to do real work—file system access, event streaming, state coordination, cancellation signals.

Without context, you'd pass the same services as parameters to every tool. With context, you define them once and every tool gets access.

What Context Provides

Context is a container for everything tools need beyond their direct parameters:

Core Services: File system client, message queue, project identifier. The infrastructure every tool uses.

Mutable State: Todo lists, tracked changes, session data. Tools can read and modify shared state so subsequent tools see updates.

Optional Capabilities: Desktop mode detection, cancellation signals, environment-specific features. Tools check if these exist and adapt behavior.

Callbacks: Functions tools call to signal state transitions back to the agent. Like marking when a file operation completes.

Here's what AgentContext looks like in practice:

export interface AgentContext {
  // Core services - always available
  projectId: string;
  sandboxClient: SandboxClient;
  mux: AgentMux;
 
  // Optional project info
  template?: string;
  userEmail?: string;
  gitRepoUrl?: string;
 
  // Mutable state - tools can modify
  filesInSession?: string[];
  todos?: TodoItem[];
  attachmentImageUrls?: string[];
 
  // Control flow
  abortSignal?: AbortSignal;
 
  // Environment-specific
  desktopClient?: DesktopClient;
  fileLockManager?: FileLockManager;
 
  // State tracking
  hasFileOperations?: boolean;
  agentUserMessageContent?: AgentInputItem[];
 
  // Callbacks
  markFileOperationDone?: () => Promise<void>;
 
  // User info
  userId?: string;
}

Most of these properties fall into three categories:

  1. Infrastructure (projectId, sandboxClient, mux): Required by the interface and always available. Tools depend on these.
  2. Session State (todos, filesInSession): Mutable. Tools read and update it, and the changes persist across tool calls.
  3. Optional Features (desktopClient, abortSignal): Conditional. Tools check that they exist before using them.
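As an illustration of the "check existence before use" rule for optional features, here is a minimal sketch of a loop that honors abortSignal only when the agent supplied one. CancellableContext and processUntilAborted are hypothetical names made up for this example; only the abortSignal field comes from the interface above.

```typescript
// Hypothetical minimal context: only the optional field this sketch needs.
interface CancellableContext {
  abortSignal?: AbortSignal;
}

// Processes items one by one and stops early once the signal is aborted.
// Returns the items that were processed before cancellation.
function processUntilAborted(
  items: string[],
  ctx?: CancellableContext,
): string[] {
  const done: string[] = [];
  for (const item of items) {
    // Optional feature: consult the signal only if one was provided.
    if (ctx?.abortSignal?.aborted) break;
    done.push(item);
  }
  return done;
}
```

When no signal is present the loop simply runs to completion, so the same code works in environments that do not support cancellation.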

Specialized Context Interfaces

Different agent types need different capabilities. Your coding agent needs file operations. Your database agent needs query execution and model access for generating seeders.

Define specialized contexts by extending the base pattern:

export interface DatabaseAgentContext {
  // Core infrastructure
  projectId: string;
  sandboxClient: SandboxClient;
  mux: AgentMux;
 
  // Domain-specific helpers
  helpers: DatabaseHelpers;
  changesTracked: Array<{ filePath: string; content: string }>;
  datahostHostUrl: string;
 
  // Sub-models for generation tasks
  apiRouteGenerationModel: ReturnType<typeof anthropic>;
  seederGenerationModel: ReturnType<typeof anthropic>;
 
  // Control flow
  abortSignal?: AbortSignal;
}

Same core services (projectId, sandboxClient, mux). Different helpers and models. Each agent type gets exactly what it needs.
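One way to avoid repeating the core fields in every context is a shared base interface. The sketch below is an assumption about how that could look, not how the interfaces above are actually declared: BaseAgentContext, CodingContext, DatabaseContext, and emitStarted are hypothetical names, and the service types are reduced stand-ins.

```typescript
// Stand-in service types; the real SandboxClient and AgentMux are not shown here.
interface SandboxClient {
  writeFile(path: string, content: string): Promise<void>;
}
interface AgentMux {
  put(event: { type: string; data: unknown }): Promise<void>;
}

// Hypothetical base: the three core services every agent type shares.
interface BaseAgentContext {
  projectId: string;
  sandboxClient: SandboxClient;
  mux: AgentMux;
}

// Each specialized context adds only what its agent needs.
interface CodingContext extends BaseAgentContext {
  todos?: { id: string; content: string }[];
}
interface DatabaseContext extends BaseAgentContext {
  datahostHostUrl: string;
}

// A helper typed against the base works with any specialized context.
async function emitStarted(ctx: BaseAgentContext, toolName: string): Promise<void> {
  await ctx.mux.put({
    type: `${toolName}.started`,
    data: { projectId: ctx.projectId },
  });
}
```

Helpers that only need core services can then be shared between agent types without knowing which specialized context they received.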

Accessing Context in Tools

Tools receive context as the second parameter to execute():

export const writeTool = tool({
  name: "Write",
  description: "Writes a file to the local filesystem",
  parameters: WRITE_PARAMETERS,
  async execute(input: any, runContext?: RunContext<AgentContext>) {
    // Parameters supplied by the LLM
    const { file_path, content } = input;

    // Extract what you need
    const sandboxClient = runContext?.context.sandboxClient;
    const mux = runContext?.context.mux;
    const todos = runContext?.context.todos;

    // Validate required services exist
    if (!sandboxClient) throw new Error("Sandbox client not available");

    // Use services for operations
    await sandboxClient.writeFile(file_path, content);
    const result = `Successfully wrote ${file_path}`;

    // Return result with context
    return appendTodoContext(result, todos);
  }
});

Pattern:

  1. Extract the services you need from runContext?.context
  2. Validate required services exist
  3. Use services to perform operations
  4. Optionally enrich results with state

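The ?. optional chaining operator handles cases where no context is provided (for example, in tests): the lookup yields undefined instead of throwing, and the validation step then reports the missing service.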
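Because the tool only talks to services through context, it can be exercised in tests with an in-memory fake. The following is a rough sketch under simplified assumptions: RunContextLike, TestContext, writeFileTool, and makeFakeSandbox are hypothetical stand-ins written for this example, not part of the SDK or the codebase described here.

```typescript
// Simplified stand-ins for the real types.
interface SandboxClient {
  writeFile(path: string, content: string): Promise<void>;
}
interface TestContext {
  sandboxClient?: SandboxClient;
}
interface RunContextLike<T> {
  context: T;
}

// Same extract / validate / use steps as the Write tool above.
async function writeFileTool(
  input: { file_path: string; content: string },
  runContext?: RunContextLike<TestContext>,
): Promise<string> {
  const sandboxClient = runContext?.context.sandboxClient;
  if (!sandboxClient) throw new Error("Sandbox client not available");
  await sandboxClient.writeFile(input.file_path, input.content);
  return `Successfully created ${input.file_path}`;
}

// In-memory fake that records writes instead of touching a file system.
function makeFakeSandbox(): { client: SandboxClient; files: Map<string, string> } {
  const files = new Map<string, string>();
  return {
    files,
    client: {
      async writeFile(path, content) {
        files.set(path, content);
      },
    },
  };
}
```

A test can pass `{ context: { sandboxClient: fake.client } }` and inspect `fake.files`, or pass no context at all and assert that the tool rejects with "Sandbox client not available".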

Common Context Patterns

Pattern 1: Event Lifecycle

The mux (message queue) streams progress to the frontend. Use it to emit lifecycle events:

async execute(input: any, runContext?: RunContext<AgentContext>) {
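  // parseResult is assumed to hold the input after validation against the tool's
  // parameter schema (validation omitted here and in the snippets below)
  const { file_path, content } = parseResult.data;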
  const sandboxClient = runContext?.context.sandboxClient;
  const mux = runContext?.context.mux;
 
  if (!sandboxClient) throw new Error("Sandbox client not available");
 
  // Emit start event
  await mux?.put({
    type: "coding_agent.create_file.started",
    data: { file_path },
  });
 
  try {
    // Send progress chunk
    await mux?.put({
      type: "coding_agent.create_file.chunk",
      data: { file_path, chunk: content },
    });
 
    // Perform operation
    await sandboxClient.writeFile(file_path, content);
 
    // Emit completion event
    await mux?.put({
      type: "coding_agent.create_file.completed",
      data: { file_path },
    });
 
    return `Successfully created ${file_path}`;
 
  } catch (error) {
    const errorMsg = error instanceof Error ? error.message : String(error);
 
    // Emit error event
    await mux?.put({
      type: "coding_agent.create_file.error",
      data: { file_path, error: errorMsg },
    });
 
    return `Error writing file: ${errorMsg}`;
  }
}

Four events: started, chunk, completed, error. The frontend shows real-time progress, so the user sees what's happening as it happens.
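If several tools emit the same lifecycle events, the repetition can be factored into a small wrapper. This is one possible sketch, assuming the mux exposes only the put() method used above; withLifecycle and the MuxEvent shape are hypothetical and written for this example. Progress chunks are left to the wrapped operation.

```typescript
// Assumed event shape, modeled on the mux.put() calls above.
interface MuxEvent {
  type: string;
  data: Record<string, unknown>;
}
interface AgentMux {
  put(event: MuxEvent): Promise<void>;
}

// Wraps an operation with started / completed / error events.
// Errors are converted into a message string, as in the tool above.
async function withLifecycle(
  mux: AgentMux | undefined,
  prefix: string,
  data: Record<string, unknown>,
  operation: () => Promise<string>,
): Promise<string> {
  await mux?.put({ type: `${prefix}.started`, data });
  try {
    const result = await operation();
    await mux?.put({ type: `${prefix}.completed`, data });
    return result;
  } catch (error) {
    const errorMsg = error instanceof Error ? error.message : String(error);
    await mux?.put({ type: `${prefix}.error`, data: { ...data, error: errorMsg } });
    return `Error: ${errorMsg}`;
  }
}
```

A tool body would then shrink to a single call such as `withLifecycle(mux, "coding_agent.create_file", { file_path }, async () => { ... })`, keeping event names consistent across tools.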

Pattern 2: State Sharing

Tools can modify context state directly. Other tools see the updates:

export const todoWriteTool = tool({
  name: "TodoWrite",
  description: "Create and manage task list",
  parameters: TODO_WRITE_PARAMETERS,
  async execute(input: any, runContext?: RunContext<AgentContext>) {
    const { todos } = parseResult.data;
    const mux = runContext?.context.mux;
    const currentTodos = runContext?.context.todos || [];
 
    // Merge new todos with existing by ID
    const todosMap = new Map(currentTodos.map(t => [t.id, { ...t }]));
 
    for (const newTodo of todos) {
      todosMap.set(newTodo.id, {
        id: newTodo.id,
        content: newTodo.content,
        status: newTodo.status,
        priority: newTodo.priority,
      });
    }
 
    const updatedTodos = Array.from(todosMap.values());
 
    // Modify context state directly (skipped when no context was provided)
    if (runContext) {
      runContext.context.todos = updatedTodos;
    }
 
    // Emit event so frontend updates
    await mux?.put({
      type: "coding_agent.todo_write.completed",
      data: { todos: updatedTodos },
    });
 
    return `Updated ${todos.length} todo items`;
  }
});

Direct mutation: runContext.context.todos = updatedTodos. Subsequent tools read runContext.context.todos and see the updated list.

This is how appendTodoContext works—it reads context.todos and appends them to tool results so the LLM always sees current task status.
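The implementation of appendTodoContext is not shown in this article. A minimal sketch of what such a helper could look like follows; the output format and the string types for status and priority are assumptions made for illustration.

```typescript
// Fields taken from the TodoWrite tool above; the value types are assumed.
interface TodoItem {
  id: string;
  content: string;
  status: string;
  priority: string;
}

// Appends the current todo list to a tool result so the LLM sees task status.
// Returns the result unchanged when there are no todos.
function appendTodoContext(result: string, todos?: TodoItem[]): string {
  if (!todos || todos.length === 0) return result;
  const lines = todos.map((todo) => `- [${todo.status}] ${todo.content}`);
  return `${result}\n\nCurrent todos:\n${lines.join("\n")}`;
}
```

Because the helper reads whatever list it is given at call time, any tool that passes runContext.context.todos picks up the latest state written by TodoWrite.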

Pattern 3: Sequential Access Control

When multiple tools try to edit the same file concurrently, you get race conditions. FileLockManager serializes access:

async execute(input: any, runContext?: RunContext<AgentContext>) {
  const { file_path, edit, instruction } = parseResult.data;
  const sandboxClient = runContext?.context.sandboxClient;
  const fileLockManager = runContext?.context.fileLockManager;
 
  if (!sandboxClient) throw new Error("Sandbox client not available");
 
  // Read original content
  const originalCode = await sandboxClient.readFile(file_path);
 
  try {
    // Acquire lock for this file
    const releaseLock = fileLockManager
      ? await fileLockManager.acquireLock(file_path)
      : null;
 
    let currentCode: string | null = null;
 
    try {
      // Re-read file after acquiring lock
      // (ensures we have latest version if another tool was writing)
      currentCode = await sandboxClient.readFile(file_path);
      if (!currentCode) {
        throw new Error(`File ${file_path} was deleted while waiting for lock`);
      }
 
      // Apply edit
      const updatedCode = await applyCodeEdit(currentCode, edit, instruction);
 
      // Write updated content
      await sandboxClient.writeFile(file_path, updatedCode);
 
    } finally {
      // Always release lock, even if write fails
      if (releaseLock) {
        releaseLock();
      }
    }
 
    return `Successfully updated ${file_path}`;
 
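  } catch (error) {
    // error is typed unknown in strict TypeScript, so narrow it before reading .message
    const errorMsg = error instanceof Error ? error.message : String(error);
    return `Error editing file: ${errorMsg}`;
  }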
}

Key points:

  1. Acquire before work: Get lock before modifying file
  2. Re-read after lock: File might have changed while waiting for lock
  3. Release in finally: Always release, even if operation fails
  4. Fallback gracefully: If no lock manager, skip locking (useful for tests)

Without this, two concurrent edits could overwrite each other.
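The article does not show how FileLockManager is built. A common way to get this behavior in single-threaded JavaScript is a per-path promise chain, sketched below. Only the acquireLock(path) call returning a release function matches the usage above; the internals are an assumption, not the actual implementation.

```typescript
// Sketch of a per-file lock: callers for the same path run one at a time,
// in the order they asked for the lock. Different paths do not block each other.
class FileLockManager {
  // For each path, a promise that settles when the most recent holder releases.
  private tails = new Map<string, Promise<void>>();

  // Resolves with a release function once all earlier holders have released.
  async acquireLock(filePath: string): Promise<() => void> {
    const previous = this.tails.get(filePath) ?? Promise.resolve();

    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });

    // The next caller waits on this tail, which settles after we release.
    const tail = previous.then(() => current);
    this.tails.set(filePath, tail);

    await previous;

    return () => {
      release();
      // Clean up the entry if nobody queued behind this holder.
      if (this.tails.get(filePath) === tail) {
        this.tails.delete(filePath);
      }
    };
  }
}
```

This only serializes tools running inside one process. Edits coming from other processes or machines would need a lock held by the file system or a shared service.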

Pattern 4: Conditional Features

Some features only exist in certain environments. Check before using:

async execute(input: any, runContext?: RunContext<AgentContext>) {
  const { file_path, content } = parseResult.data;
  const sandboxClient = runContext?.context.sandboxClient;
  const projectId = runContext?.context.projectId;
 
  if (!sandboxClient) throw new Error("Sandbox client not available");
 
  // Write the file
  await sandboxClient.writeFile(file_path, content);
 
  let result = `Successfully wrote ${file_path}`;
 
  // Check for runtime errors after writing
  if (projectId) {
    let errors = "";
 
    // Desktop mode has specialized error checking
    if (runContext?.context.desktopClient) {
      errors = await runContext.context.desktopClient.checkForErrorsInDestop();
    } else {
      // Cloud mode uses different error checking
      errors = await checkRuntimeErrors(projectId);
    }
 
    if (errors && errors.trim().length > 0) {
      result += `\n\nRuntime errors detected:\n${errors}`;
    }
  }
 
  return result;
}

Optional chaining: runContext?.context.desktopClient. If it exists, use specialized error checking. If not, fall back to standard approach.

Tools adapt to their environment without needing separate implementations.
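Since the same desktop-or-cloud check is likely to appear in more than one tool, it could be pulled into a helper. The sketch below is hypothetical: appendRuntimeErrors and ErrorCheckContext are invented names, the cloud checker is passed in as a parameter so the example stays self-contained, and the desktop method name is copied as spelled in the snippet above.

```typescript
// Reduced stand-in for the desktop client used above.
interface DesktopClient {
  checkForErrorsInDestop(): Promise<string>;
}
interface ErrorCheckContext {
  projectId?: string;
  desktopClient?: DesktopClient;
}

// Appends runtime errors to a tool result, choosing the checker by environment.
async function appendRuntimeErrors(
  result: string,
  ctx: ErrorCheckContext | undefined,
  checkRuntimeErrors: (projectId: string) => Promise<string>,
): Promise<string> {
  // No project means there is nothing to check.
  if (!ctx?.projectId) return result;

  const errors = ctx.desktopClient
    ? await ctx.desktopClient.checkForErrorsInDestop()
    : await checkRuntimeErrors(ctx.projectId);

  return errors.trim().length > 0
    ? `${result}\n\nRuntime errors detected:\n${errors}`
    : result;
}
```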

Pattern 5: Callbacks for State Transitions

Sometimes tools need to notify the agent about state changes. Callbacks enable this:

async execute(input: any, runContext?: RunContext<AgentContext>) {
  const { file_path, content } = parseResult.data;
  const sandboxClient = runContext?.context.sandboxClient;
 
  if (!sandboxClient) throw new Error("Sandbox client not available");
 
  // Write file
  await sandboxClient.writeFile(file_path, content);
 
  // Notify agent that a file operation completed
  await runContext?.context.markFileOperationDone?.();
 
  return `Successfully wrote ${file_path}`;
}

The agent tracks whether any file operations have occurred. This callback signals a state transition. The agent uses this to decide whether to run build checks, update git status, etc.

Callbacks let tools communicate with the agent without tight coupling.
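On the agent side, the callback can be a closure over the agent's own state. The sketch below is an assumed shape: AgentSession, buildContext, and shouldRunBuildChecks are hypothetical names, and only markFileOperationDone and hasFileOperations come from the article.

```typescript
// Only the callback field of the context is needed for this sketch.
interface CallbackContext {
  markFileOperationDone?: () => Promise<void>;
}

// Hypothetical agent-side owner of the state that the callback updates.
class AgentSession {
  private hasFileOperations = false;

  // Called (indirectly) by tools after a successful file operation.
  async markFileOperationDone(): Promise<void> {
    this.hasFileOperations = true;
  }

  // The arrow function keeps `this` bound to the session.
  buildContext(): CallbackContext {
    return { markFileOperationDone: () => this.markFileOperationDone() };
  }

  // The agent can consult the flag after the run, e.g. before build checks.
  shouldRunBuildChecks(): boolean {
    return this.hasFileOperations;
  }
}
```

A boolean copied into the context is a snapshot, because primitives are copied by value. Routing the update through a callback lets the tool change the agent's own flag rather than a copy.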

Context Initialization

Context is created once per conversation and passed to all tool executions.

Here's how the coding agent builds context:

// In CodingAgent class, before running the agent
const agentContext: AgentContext = {
  // Core services
  projectId: this.projectId,
  sandboxClient: this.sandboxClient as any,
  mux: this.mux,
 
  // Project info
  template: this.template,
  userEmail: this.userEmail,
  gitRepoUrl: this.gitRepoUrl,
 
  // Session state
  filesInSession,
  todos: this.todos,
  attachmentImageUrls: this.attachmentImageUrls,
 
  // Concurrency control
  fileLockManager: this.fileLockManager,
 
  // State tracking
  hasFileOperations: this.hasFileOperations,
  agentUserMessageContent: agentUserMessageContent as AgentInputItem[],
 
  // Callback
  markFileOperationDone: () => this.markFileOperationDone(),
 
  // Optional features (use spread to only include if defined)
  ...(this.abortSignal ? { abortSignal: this.abortSignal } : {}),
  ...(this.desktopClient ? { desktopClient: this.desktopClient } : {}),
  ...(this.userId ? { userId: this.userId } : {}),
};
 
// Pass context to run function
const stream = await run(this.agent, inputMessages, {
  context: agentContext,
  maxTurns: 100,
  stream: true,
  ...(this.abortSignal ? { signal: this.abortSignal } : {}),
});

Build context from agent instance properties. Use conditional spread for optional fields. Pass to run() options.

Every tool execution in this conversation gets the same context object. State mutations persist. Services remain consistent.
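The conditional spread used for the optional fields differs subtly from assigning a possibly undefined value: the spread omits the key entirely, while direct assignment keeps the key with the value undefined. A small sketch of the difference follows; RunOptions and both builder functions are hypothetical and exist only for this comparison.

```typescript
// Hypothetical options shape; only `signal` is optional.
interface RunOptions {
  maxTurns: number;
  signal?: AbortSignal;
}

// Conditional spread: the `signal` key is added only when a signal exists.
function buildRunOptions(signal?: AbortSignal): RunOptions {
  return { maxTurns: 100, ...(signal ? { signal } : {}) };
}

// Direct assignment: the `signal` key is always present, possibly as undefined.
function buildRunOptionsDirect(signal?: AbortSignal): RunOptions {
  return { maxTurns: 100, signal };
}
```

The distinction matters for code that checks `"signal" in options`, iterates over keys, or is compiled with TypeScript's exactOptionalPropertyTypes option, which rejects an explicit undefined for an optional property.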

Real-World Example: Edit Tool

Here's a complete tool using multiple context features:

export const editDiffTool = tool({
  name: "Edit",
  description: "Edits an existing file using a diff format",
  parameters: EDIT_DIFF_PARAMETERS,
  async execute(input: any, runContext?: RunContext<AgentContext>) {
    const { file_path, instruction, edit } = parseResult.data;
 
    // Extract context services
    const sandboxClient = runContext?.context.sandboxClient;
    const mux = runContext?.context.mux;
    const todos = runContext?.context.todos;
    const fileLockManager = runContext?.context.fileLockManager;
    const userId = runContext?.context.userId;
 
    if (!sandboxClient) throw new Error("Sandbox client not available");
 
    // Read original content
    const originalCode = await sandboxClient.readFile(file_path);
    if (!originalCode) {
      return `Error: File ${file_path} does not exist`;
    }
 
    // Emit start event
    await mux?.put({
      type: "coding_agent.edit_file.started",
      data: { file_path },
    });
 
    try {
      // Acquire lock for sequential access
      const releaseLock = fileLockManager
        ? await fileLockManager.acquireLock(file_path)
        : null;
 
      let updatedCode = "";
 
      try {
        // Re-read after lock (file might have changed)
        const currentCode = await sandboxClient.readFile(file_path);
        if (!currentCode) {
          throw new Error(`File was deleted while waiting for lock`);
        }
 
        // Apply edit
        updatedCode = await applyCodeEdit(currentCode, edit, instruction);
 
        // Send progress chunk
        await mux?.put({
          type: "coding_agent.edit_file.chunk",
          data: { file_path, chunk: updatedCode },
        });
 
        // Write updated content
        await sandboxClient.writeFile(file_path, updatedCode);
 
      } finally {
        // Always release lock
        if (releaseLock) {
          releaseLock();
        }
      }
 
      // Emit completion
      await mux?.put({
        type: "coding_agent.edit_file.completed",
        data: { file_path, result: "File updated successfully" },
      });
 
      // Mark file operation done
      await runContext?.context.markFileOperationDone?.();
 
      let result = `Successfully updated ${file_path}`;
 
      // Check for runtime errors (environment-specific)
      const projectId = runContext?.context.projectId;
      if (projectId) {
        let errors = "";
        if (runContext?.context.desktopClient) {
          errors = await runContext.context.desktopClient.checkForErrorsInDestop();
        } else {
          errors = await checkRuntimeErrors(projectId);
        }
        if (errors && errors.trim().length > 0) {
          result += `\n\nRuntime errors detected:\n${errors}`;
        }
      }
 
      // Append current todos to result
      return appendTodoContext(result, todos);
 
    } catch (error) {
      const errorMsg = error instanceof Error ? error.message : String(error);
 
      await mux?.put({
        type: "coding_agent.edit_file.error",
        data: { file_path, error: errorMsg },
      });
 
      return appendTodoContext(`Error editing file: ${errorMsg}`, todos);
    }
  }
});

This tool uses:

  • sandboxClient: Read and write files
  • mux: Stream lifecycle events (started/chunk/completed/error)
  • fileLockManager: Serialize concurrent edits to same file
  • todos: Append task context to results
  • markFileOperationDone: Signal state transition to agent
  • desktopClient: Environment-specific error checking
  • projectId: Conditional feature gating

All through context. No parameters needed for infrastructure.

What We're Skipping

Context versioning, middleware patterns, dependency injection frameworks, sophisticated state machines. These add complexity you don't need initially.

The patterns shown here—extract services, emit events, share state, acquire locks, check optional features—cover most use cases. Start simple. Add sophistication when you need it.

What's Next

Your tools now have access to shared infrastructure through context. They can coordinate operations, maintain state, and adapt to different environments.

Next, we'll look at how the agent coordinates these tools—managing conversation history, handling streaming responses, and orchestrating multi-step operations.

That's where your tools become part of a larger system that can handle complex, multi-turn interactions.