Your agent is editing a file. While it's working, the user switches to another file in their editor. The agent finishes, streams a success message to the frontend. But wait—the frontend still shows the old file content. Or worse: two tools try to edit the same file simultaneously. The second one overwrites the first's changes. The user sees nothing. Your agent thinks it succeeded.
This is the state management problem. Your agent isn't just one process. It's a distributed system: tools running in parallel, events streaming to multiple frontends (web and desktop), conversation history in a database, file operations in a sandbox, real-time updates flowing through websockets. All of it needs to stay synchronized.
Get state management wrong, and you get race conditions, lost updates, stale UI, and confused agents. Get it right, and everything just works—tools coordinate seamlessly, multiple agents share context, the UI updates instantly, and users never see inconsistency.
Why State Actually Matters
Here's what happens without good state management:
Your agent edits a file while the user is also editing it. Who wins? If you don't coordinate, you get merge conflicts or lost changes.
One tool reads a file while another is mid-edit. The reader gets stale data and makes decisions based on old information. The resulting code doesn't match reality.
A sub-agent updates the todo list. But the main agent doesn't see it. So it duplicates work or skips tasks.
The user cancels a request. But the agent keeps running. It burns through API credits doing work nobody wants.
Your agent streams events to the frontend. But events arrive out of order. The UI shows "File created" before "Creating file." Users get confused.
This isn't theoretical. These are real bugs you'll hit. State management is how you prevent them.
The Mental Model: State Layers
Think of state in three layers:
Layer 1: Request-Scoped State - Lives for one agent execution. Things like "which files am I editing right now?" or "what's the current task?" Dies when the request completes.
Layer 2: Session-Scoped State - Lives across multiple requests in a conversation. Things like conversation history, project context, accumulated changes. Persists in a database.
Layer 3: Global State - Lives across all sessions and users. Things like project files in the sandbox, Supabase configuration, Stripe integration settings. Shared by everyone working on the project.
Your challenge: coordinate between these layers without losing data or creating conflicts.
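To make the layering concrete, here's a minimal sketch. The type names are illustrative, not from the codebase; the real system spreads this state across AgentContext, the database, and the sandbox:

```typescript
interface RequestState {
  // Layer 1: dies when the request completes
  filesInSession: string[];
  currentTask?: string;
}

interface SessionState {
  // Layer 2: persisted per conversation in the database
  conversationHistory: { role: 'user' | 'assistant'; content: string }[];
}

interface GlobalState {
  // Layer 3: shared across all sessions and users
  projectFiles: Record<string, string>;
  integrations: { supabase?: string; stripe?: string };
}

// A request reads all three layers but only owns the first.
function newRequestState(): RequestState {
  return { filesInSession: [] };
}

console.log(newRequestState().filesInSession.length); // 0
```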
Agent Context: The State Container
The AgentContext interface is your primary state container. It's passed to every tool, every agent, every operation:
// server/src/types/agent.ts
export interface AgentContext {
projectId: string; // Which project we're working on
sandboxClient: SandboxClient; // File system access
mux: AgentMux; // Event streaming
todos?: TodoItem[]; // Current task list
filesInSession?: string[]; // Files touched this session
abortSignal?: AbortSignal; // Cancellation mechanism
fileLockManager?: FileLockManager; // Coordinate file access
userId?: string; // Who's making the request
userEmail?: string;
gitRepoUrl?: string;
template?: string;
desktopClient?: DesktopClient;
attachmentImageUrls?: string[];
hasFileOperations?: boolean;
markFileOperationDone?: () => Promise<void>;
}
Every tool receives this through RunContext:
// server/src/agents/coding/tools/v2/write.tsx
export const writeTool = tool({
name: "Write",
async execute(input: any, runContext?: RunContext<AgentContext>) {
// Extract what you need from context
const sandboxClient = runContext?.context.sandboxClient;
const mux = runContext?.context.mux;
const todos = runContext?.context.todos;
const fileLockManager = runContext?.context.fileLockManager;
const abortSignal = runContext?.context.abortSignal;
// Check if operation was cancelled
if (abortSignal?.aborted) {
throw new Error('Operation cancelled by user');
}
// Use context for operations
await sandboxClient?.writeFile(input.file_path, input.content);
// Emit events through shared mux
await mux?.put({
type: 'coding_agent.create_file.completed',
data: { file_path: input.file_path }
});
return `Successfully created ${input.file_path}`;
}
});
Why this pattern works:
- Single source of truth: All tools see the same context
- No global state: Context is explicitly passed, easier to test
- Type-safe: TypeScript ensures tools access valid properties
- Composable: Sub-agents can create their own context layers
The Mux: Event Streaming Hub
The AgentMux is your pub/sub system for real-time events. Tools emit events, frontend subscribes:
// server/src/lib/mux.ts
export class AgentMux {
private emitter: EventEmitter;
constructor() {
this.emitter = new EventEmitter();
this.emitter.setMaxListeners(0); // Unlimited listeners
}
async put(event: any): Promise<void> {
this.emitter.emit('event', event);
}
subscribe(handler: (event: any) => void | Promise<void>): () => void {
this.emitter.on('event', handler);
return () => this.emitter.off('event', handler); // Cleanup function
}
listenerCount(): number {
return this.emitter.listenerCount('event');
}
clear(): void {
this.emitter.removeAllListeners();
}
}
Usage pattern in tools:
// Tool emits start event
await mux?.put({
type: 'coding_agent.read_file.started',
data: { file_path: 'src/app.ts' }
});
// Do the work
const content = await sandboxClient.readFile('src/app.ts');
// Emit completion event
await mux?.put({
type: 'coding_agent.read_file.completed',
data: { file_path: 'src/app.ts', content }
});
Usage in SSE handler:
// server/src/routes/agent-routes.ts
const mux = agent.getMux();
mux.subscribe(async (event) => {
stream.writeSSE({
data: JSON.stringify(event),
});
});
Why mux instead of direct writes?
- Decoupling: Tools don't need to know about SSE or HTTP
- Multiple subscribers: Web frontend, desktop app, test harness all listen
- Event ordering: Single emitter ensures order is preserved
- Memory efficient: No buffering, events flow through immediately
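The pattern is easy to see in isolation. Here's a trimmed-down, self-contained sketch of the mux; note that Node's EventEmitter runs handlers synchronously on emit, which is what preserves ordering:

```typescript
import { EventEmitter } from 'node:events';

// Trimmed-down AgentMux: put() emits, subscribe() returns a cleanup function.
class MiniMux {
  private emitter = new EventEmitter();
  async put(event: unknown): Promise<void> {
    this.emitter.emit('event', event);
  }
  subscribe(handler: (event: unknown) => void): () => void {
    this.emitter.on('event', handler);
    return () => this.emitter.off('event', handler);
  }
}

const mux = new MiniMux();
const seenByWeb: unknown[] = [];
const seenByDesktop: unknown[] = [];

const unsubWeb = mux.subscribe(e => seenByWeb.push(e));
mux.subscribe(e => seenByDesktop.push(e));

// emit() runs handlers synchronously, so they have fired by the next line
void mux.put({ type: 'started' });
void mux.put({ type: 'completed' });
unsubWeb(); // one frontend disconnects; the other keeps listening
void mux.put({ type: 'after-disconnect' });

console.log(seenByWeb.length, seenByDesktop.length); // 2 3
```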
Conversation History: Context Across Turns
Users iterate. "Add a button." Then "Make it blue." Then "Add a click handler." Each request builds on the previous. Your agent needs conversation history:
// server/src/routes/agent-routes.ts
export interface ConversationMessage {
role: 'user' | 'assistant';
content: string;
}
// Client sends history with each request
const conversationHistory: ConversationMessage[] = (body.chatHistory ?? []).map(m => {
let content = m.content;
// Include tool usage in context
if (m.actionLogs && m.actionLogs.length > 0) {
const filteredLogs = filterActionLogs(m.actionLogs);
content += `\n\nTools called:\n${
filteredLogs.map(log =>
`- ${log.tool_name}: ${log.input}`
).join('\n')
}`;
}
return {
role: m.role as 'user' | 'assistant',
content
};
});
await agent.processRequest({
userPrompt: currentPrompt,
conversationHistory, // Past context
// ...
});
What gets included in history:
- Previous user prompts
- Assistant responses
- Tool calls made (read, write, edit, etc.)
- Results from those tool calls
Why include tool calls?
Imagine this conversation:
Turn 1:
User: "Create a button component"
Agent: *uses write tool to create Button.tsx*
Turn 2:
User: "Make it blue"
Without tool call history, the agent doesn't know it already created Button.tsx. It might try to create it again (error) or ask which file to edit (annoying).
With history:
Previous turn context:
Tool used: write(file_path="src/components/Button.tsx", content="...")
Now the agent knows: "Ah, I created Button.tsx. I should edit that file to make it blue."
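The folding step can be sketched as follows. The ActionLog shape here is a hypothetical simplification; the real type and the filterActionLogs helper live in the server codebase:

```typescript
interface ActionLog { tool_name: string; input: string; }
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
  actionLogs?: ActionLog[];
}

// Fold tool-call summaries into the message content so the next
// turn's model sees what was already done.
function withToolContext(m: ChatMessage): { role: 'user' | 'assistant'; content: string } {
  let content = m.content;
  if (m.actionLogs && m.actionLogs.length > 0) {
    content += '\n\nTools called:\n' +
      m.actionLogs.map(log => `- ${log.tool_name}: ${log.input}`).join('\n');
  }
  return { role: m.role, content };
}

const turn = withToolContext({
  role: 'assistant',
  content: 'Created the button component.',
  actionLogs: [{ tool_name: 'write', input: 'src/components/Button.tsx' }],
});
console.log(turn.content.includes('- write: src/components/Button.tsx')); // true
```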
Shared State: Integration Context
Your agent needs to know about project integrations. Is Supabase configured? What's the database URL? Which Stripe account should we use?
This context is fetched once, cached, and passed to the agent:
// server/src/agents/coding/agent.ts
export class CodingAgent {
private supabaseContext: string | null = null;
private stripeContext: string | null = null;
async initialize(): Promise<void> {
// Fetch integration contexts
this.supabaseContext = await getSupabaseContext(this.projectId);
this.stripeContext = await getStripeContext(this.projectId);
this.initialized = true;
}
private buildSystemPrompt(): string {
let prompt = BASE_CODING_AGENT_PROMPT;
// Inject contexts into prompt
if (this.supabaseContext) {
prompt += `\n\n${this.supabaseContext}`;
}
if (this.stripeContext) {
prompt += `\n\n${this.stripeContext}`;
}
return prompt;
}
}
Supabase context example:
// server/src/lib/supabase-utils.ts
export async function getSupabaseContext(projectId: string): Promise<string | null> {
// Query database for Supabase projects linked to this project
const { data } = await supabase
.from('supabase_projects')
.select('*')
.eq('project_id', projectId);
if (!data || data.length === 0) return null;
const project = data[0]; // select() returns an array of rows
// Format as XML for LLM consumption
return `
<supabase_context>
<supabase_project>
<id>${project.id}</id>
<supabase_url>${project.endpoint}</supabase_url>
<database_connection_string>${project.db_url}</database_connection_string>
<api_keys>
<api_key>
<name>anon</name>
<value>${project.anon_key}</value>
<type>client-side</type>
</api_key>
<api_key>
<name>service_role</name>
<value>${project.service_role_key}</value>
<type>server-side</type>
</api_key>
</api_keys>
</supabase_project>
<guidelines>
When using Supabase:
1. Use anon key for client-side code
2. Use service_role key for server-side admin operations
3. Connection string format: postgresql://...
</guidelines>
</supabase_context>`;
}
Why XML format?
LLMs are good at parsing structured context. XML is verbose but unambiguous. The agent can extract the anon key when writing client code, or the service_role key when writing server code.
Stripe context works identically:
// server/src/lib/stripe-utils.ts
export async function getStripeContext(projectId: string): Promise<string | null> {
const { data } = await supabase
.from('stripe_sandboxes')
.select('*')
.eq('project_id', projectId);
if (!data || data.length === 0) return null;
const sandbox = data[0]; // select() returns an array of rows
return `
<stripe_context>
<stripe_sandbox>
<publishable_key>${sandbox.publishable_key}</publishable_key>
<secret_key>${sandbox.secret_key}</secret_key>
<account_id>${sandbox.account_id}</account_id>
</stripe_sandbox>
<guidelines>
When implementing Stripe:
1. Use publishable key in frontend
2. Use secret key in API routes (never expose)
3. Test with card 4242 4242 4242 4242
4. Webhook signing secret: ${sandbox.webhook_secret}
</guidelines>
</stripe_context>`;
}
Now when a user says "Add Stripe checkout," the agent knows which keys to use.
Abort Signals: Cancellation That Works
Users cancel requests. Maybe they clicked the wrong button. Maybe the agent is taking too long. Maybe they closed the tab. You need to handle this gracefully:
// server/src/routes/agent-routes.ts
agentRoutes.post('/coding-agent', async (c) => {
// HTTP abort signal (client disconnect)
const httpAbortSignal = c.req.raw.signal;
// Agent abort controller (for manual cancellation, e.g., credit exhaustion)
const agentAbortController = new AbortController();
return streamSSE(c, async (stream) => {
const agent = new CodingAgent({
projectId: body.projectId,
abortSignal: agentAbortController.signal,
abortController: agentAbortController, // Agent can abort itself
});
// When HTTP connection closes, abort agent
const handleAbort = () => {
agentAbortController.abort();
};
httpAbortSignal.addEventListener('abort', handleAbort);
try {
await agent.processRequest({...});
} finally {
httpAbortSignal.removeEventListener('abort', handleAbort);
agent.cleanup();
}
});
});
Dual abort signal pattern:
- HTTP signal (c.req.raw.signal): triggers when the client disconnects
- Agent controller (agentAbortController): the agent can abort itself (e.g., when credits run out)
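The forwarding between the two signals can be sketched in isolation. Here httpController stands in for c.req.raw.signal:

```typescript
// Either abort source stops the agent.
const httpController = new AbortController();  // stands in for c.req.raw.signal
const agentController = new AbortController(); // the agent's own controller

// Forward client disconnects into the agent's controller.
httpController.signal.addEventListener('abort', () => agentController.abort());

// The agent could also call agentController.abort() itself
// (e.g., on credit exhaustion) without the HTTP signal ever firing.

httpController.abort(); // simulate the client closing the tab
console.log(agentController.signal.aborted); // true
```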
Usage in tools:
export const bashTool = tool({
name: "Bash",
async execute(input: any, runContext?: RunContext<AgentContext>) {
const abortSignal = runContext?.context.abortSignal;
const sandboxClient = runContext?.context.sandboxClient;
// Check before expensive operation
if (abortSignal?.aborted) {
throw new Error('Operation cancelled');
}
// Long-running operation with periodic checks
const result = await sandboxClient.executeCommand(
input.command,
{ signal: abortSignal } // Pass signal to sandbox
);
return result;
}
});
Agent self-abort example (credit exhaustion):
// server/src/agents/coding/agent.ts
private async calculateAndDeductTokens(usage: TokenUsage, model: ModelName) {
const tokensToDeduct = this.calculateWeightedTokens(usage, model);
// Try to deduct credits
const { data } = await supabase.rpc('decrement_credits', {
user_id: this.context.userId,
tokens: tokensToDeduct,
});
// Check if credits exhausted
if (data.monthly_credits <= 0 && data.one_time_credits <= 0) {
// Abort the agent - no more credits
this.abortController?.abort();
await this.mux.put({
type: 'coding_agent.credits_exhausted',
data: { message: 'Credits exhausted. Stopping agent.' }
});
}
}
File Lock Manager: Preventing Race Conditions
Two tools try to edit the same file. Without coordination, the second edit overwrites the first. Users lose changes:
// server/src/agents/coding/utils.ts
export class FileLockManager {
private locks: Map<string, Promise<void>> = new Map();
async acquireLock(filePath: string): Promise<() => void> {
// Wait for any existing lock on this file
const existingLock = this.locks.get(filePath) || Promise.resolve();
let releaseLock: () => void;
// Create new lock
const newLock = new Promise<void>((resolve) => {
releaseLock = () => {
// Clean up lock when released
const currentLock = this.locks.get(filePath);
if (currentLock === newLock) {
this.locks.delete(filePath);
}
resolve();
};
});
// Register lock
this.locks.set(filePath, newLock);
// Wait for previous operation to finish
await existingLock;
// Return release function
return releaseLock!;
}
}
Usage in tools:
export const editTool = tool({
name: "Edit",
async execute(input: any, runContext?: RunContext<AgentContext>) {
const fileLockManager = runContext?.context.fileLockManager;
const sandboxClient = runContext?.context.sandboxClient;
if (!fileLockManager) {
throw new Error('FileLockManager not available');
}
// Acquire lock - waits if file is locked
const release = await fileLockManager.acquireLock(input.file_path);
try {
// Read current content
const content = await sandboxClient.readFile(input.file_path);
// Apply edit
const newContent = applyEdit(content, input.old_string, input.new_string);
// Write back
await sandboxClient.writeFile(input.file_path, newContent);
return 'Edit successful';
} finally {
// Always release lock
release();
}
}
});
How it works:
- Tool A wants to edit app.ts → acquires the lock immediately
- Tool B wants to edit app.ts → waits on Tool A's lock
- Tool A finishes and releases the lock
- Tool B's lock resolves, and it can proceed
- No overwrites, no lost changes
Why not use a mutex library?
This is simpler and enough for our use case. We're not dealing with high concurrency—usually 1-3 tools at a time. The Promise-based approach is clear and works well with async/await.
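To convince yourself the promise chain actually serializes, here's a self-contained demo using the same lock manager logic with two simulated writers:

```typescript
// Same promise-chaining lock as above, trimmed for a standalone run.
class FileLockManager {
  private locks = new Map<string, Promise<void>>();
  async acquireLock(filePath: string): Promise<() => void> {
    const existing = this.locks.get(filePath) ?? Promise.resolve();
    let release!: () => void;
    const next = new Promise<void>(resolve => {
      release = () => {
        if (this.locks.get(filePath) === next) this.locks.delete(filePath);
        resolve();
      };
    });
    this.locks.set(filePath, next);
    await existing; // queue behind the previous holder
    return release;
  }
}

const sleep = (ms: number) => new Promise(r => setTimeout(r, ms));
const manager = new FileLockManager();
const order: string[] = [];

async function edit(name: string, delay: number) {
  const release = await manager.acquireLock('app.ts');
  try {
    order.push(`${name}:start`);
    await sleep(delay); // simulate a read-modify-write
    order.push(`${name}:end`);
  } finally {
    release(); // always release, even on error
  }
}

// B starts while A holds the lock, but must wait its turn.
await Promise.all([edit('A', 20), edit('B', 0)]);
console.log(order.join(',')); // A:start,A:end,B:start,B:end
```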
Todo State: Shared Task Tracking
Your agent tracks what it's working on. Sub-agents update the list. Tools append it to results so the agent stays aware:
// server/src/types/agent.ts
export interface TodoItem {
id: string;
content: string;
status: 'pending' | 'in_progress' | 'completed' | 'cancelled';
priority?: 'high' | 'medium' | 'low';
}
Formatting todos for context:
// server/src/agents/coding/utils.ts
export const formatTodoContext = (todos?: TodoItem[]): string => {
if (!todos || todos.length === 0) {
return '';
}
const todoLines = ['\n\n=== Current Task List ==='];
for (const todo of todos) {
const priority = todo.priority || 'medium';
const emoji = todo.status === 'completed' ? '✅' :
todo.status === 'in_progress' ? '🔄' : '⏳';
todoLines.push(
`${emoji} [${todo.id}] ${todo.content} (${priority} priority)`
);
}
todoLines.push('========================');
return todoLines.join('\n');
};
export const appendTodoContext = (result: string, todos?: TodoItem[]): string => {
const todoContext = formatTodoContext(todos);
return result + todoContext;
};
Usage in tools:
export const writeTool = tool({
name: "Write",
async execute(input: any, runContext?: RunContext<AgentContext>) {
const todos = runContext?.context.todos;
const sandboxClient = runContext?.context.sandboxClient;
// Perform write operation
await sandboxClient?.writeFile(input.file_path, input.content);
let result = `Successfully created ${input.file_path}`;
// Append todo context so agent sees current tasks
result = appendTodoContext(result, todos);
return result;
}
});
Example output:
Successfully created src/components/Button.tsx
=== Current Task List ===
✅ [1] Create Button component (high priority)
🔄 [2] Add click handler (medium priority)
⏳ [3] Write tests (low priority)
========================
The agent sees this after every tool call. It knows what's done, what's in progress, what's next.
Putting It All Together: Request Flow
Let's trace a complete request to see how state flows:
// 1. Request arrives
agentRoutes.post('/coding-agent', async (c) => {
const body = await c.req.json<CodingAgentRequest>();
const httpAbortSignal = c.req.raw.signal;
const agentAbortController = new AbortController();
return streamSSE(c, async (stream) => {
// 2. Create agent with initial state
const agent = new CodingAgent({
projectId: body.projectId,
chatSessionId: body.chatSessionId,
abortSignal: agentAbortController.signal,
abortController: agentAbortController,
});
// 3. Initialize agent (fetches integration contexts)
await agent.initialize();
// 4. Subscribe to event stream
const mux = agent.getMux();
mux.subscribe((event) => {
stream.writeSSE({ data: JSON.stringify(event) });
});
// 5. Process request with conversation history
const result = await agent.processRequest({
userPrompt: body.prompt,
conversationHistory: body.chatHistory,
attachmentUrls: body.attachmentUrls,
fileStructure: body.fileStructure,
});
// 6. Persist message to database
await addMessageToProject(body.projectId, {
role: 'assistant',
content: result.content,
userId: body.userId,
chatSessionId: body.chatSessionId,
parts: result.parts,
});
// 7. Send completion
stream.writeSSE({
data: JSON.stringify({
type: 'complete',
data: 'Agent processing complete'
}),
});
});
});
Inside the agent:
// server/src/agents/coding/agent.ts
async processRequest(options: ProcessRequestOptions) {
// Build context for this request
const context: AgentContext = {
projectId: this.projectId,
sandboxClient: this.sandboxClient,
mux: this.mux,
todos: this.todos,
fileLockManager: this.fileLockManager,
abortSignal: this.abortSignal,
userId: this.userId,
};
// Route to appropriate model
const selectedModel = await this.routeRequest(options);
// Initialize agent with selected model
this.agent = this.initializeAgent(selectedModel);
// Run agent with context
const stream = await run(this.agent, {
userPrompt: options.userPrompt,
conversationHistory: options.conversationHistory,
context, // Context flows to all tools
});
// Stream events
for await (const event of stream) {
// Event processing, credit tracking, etc.
await this.mux.put(event);
}
return result;
}
Inside tools:
export const bashTool = tool({
name: "Bash",
async execute(input: any, runContext?: RunContext<AgentContext>) {
// Extract state from context
const sandboxClient = runContext?.context.sandboxClient;
const mux = runContext?.context.mux;
const abortSignal = runContext?.context.abortSignal;
const fileLockManager = runContext?.context.fileLockManager;
const todos = runContext?.context.todos;
// Check abort
if (abortSignal?.aborted) throw new Error('Cancelled');
// Emit start event
await mux?.put({
type: 'coding_agent.bash.started',
data: { command: input.command }
});
// Execute command
const result = await sandboxClient.executeCommand(
input.command,
{ signal: abortSignal }
);
// Emit completion
await mux?.put({
type: 'coding_agent.bash.completed',
data: { output: result }
});
// Append todo context
return appendTodoContext(result, todos);
}
});
State flows:
- HTTP → Agent: Request body + abort signal
- Agent → Context: Builds AgentContext with all state
- Context → Tools: Tools receive via RunContext
- Tools → Mux: Emit events through shared mux
- Mux → SSE: Frontend receives real-time updates
- Tools → Agent: Return results with todo context
- Agent → Database: Persist conversation
Real-Time Sync: SSE Manager
Multiple frontends (web, desktop) need to stay in sync. The SSE manager handles this:
// server/src/lib/sse-manager.ts
export class SSEManager {
private clients: Map<string, SSEClient[]> = new Map();
private supabaseChannels: Map<string, RealtimeChannel> = new Map();
createStream(
c: Context,
projectId: string,
channelType: 'sync' | 'messages'
): Response {
const clientId = crypto.randomUUID();
const key = `${projectId}:${channelType}`;
// Register client
if (!this.clients.has(key)) {
this.clients.set(key, []);
this.setupSupabaseChannel(projectId, channelType);
}
// Create SSE stream
return streamSSE(c, async (stream) => {
const client = { id: clientId, stream };
this.clients.get(key)!.push(client);
// Cleanup on disconnect
c.req.raw.signal.addEventListener('abort', () => {
this.removeClient(key, clientId);
});
// Keep-alive ping every 30s
const pingInterval = setInterval(() => {
stream.writeSSE({ data: 'ping' });
}, 30000);
// Wait for the connection to close
await new Promise<void>((resolve) => {
c.req.raw.signal.addEventListener('abort', () => resolve(), { once: true });
});
clearInterval(pingInterval);
});
}
private setupSupabaseChannel(projectId: string, channelType: string) {
const channel = supabase.channel(`${projectId}:${channelType}`)
.on('broadcast', { event: 'sync' }, ({ payload }) => {
// Forward broadcast to all SSE clients
this.broadcast(`${projectId}:${channelType}`, {
type: 'sync',
data: payload
});
})
.on('postgres_changes',
{
event: 'INSERT',
schema: 'public',
table: 'messages',
filter: `project_id=eq.${projectId}`
},
(payload) => {
// Forward DB changes to SSE clients
this.broadcast(`${projectId}:${channelType}`, {
type: 'message.new',
data: payload.new
});
}
)
.subscribe();
this.supabaseChannels.set(`${projectId}:${channelType}`, channel);
}
private broadcast(key: string, event: SSEEvent) {
const clients = this.clients.get(key) || [];
for (const client of clients) {
client.stream.writeSSE({ data: JSON.stringify(event) });
}
}
}
How sync works:
- Desktop app writes file → broadcasts via Supabase Realtime
- SSE Manager receives broadcast → forwards to all connected clients
- Web frontend receives event → updates file tree
Same flow in reverse: web → Supabase → desktop.
Common Patterns
Pattern 1: Lazy Initialization
Don't fetch state until you need it:
export class CodingAgent {
private initialized: boolean = false;
private supabaseContext: string | null = null;
async initialize(): Promise<void> {
if (this.initialized) return; // Only initialize once
// Fetch expensive state
this.supabaseContext = await getSupabaseContext(this.projectId);
this.stripeContext = await getStripeContext(this.projectId);
// Connect to sandbox
await this.sandboxClient.connectToProject();
this.initialized = true;
}
async processRequest(options: ProcessRequestOptions) {
await this.initialize(); // Initialize on first request
// ... process
}
}
Pattern 2: Cleanup on Exit
Always clean up resources:
async processRequest(options: ProcessRequestOptions) {
const release = await this.fileLockManager.acquireLock(filePath);
try {
// Do work
return result;
} finally {
// Always release lock, even on error
release();
}
}
Pattern 3: Event Correlation
Include correlation IDs for debugging:
const operationId = crypto.randomUUID();
await mux.put({
type: 'coding_agent.operation.started',
data: { operationId, file_path: 'app.ts' }
});
// Do work...
await mux.put({
type: 'coding_agent.operation.completed',
data: { operationId, file_path: 'app.ts', duration: elapsed }
});
Now you can trace individual operations through the event stream.
Pattern 4: Optimistic Updates
Update local state immediately, sync to DB async:
// Update local todo state
this.todos.push({ id, content, status: 'pending' });
// Sync to database asynchronously
this.syncTodos().catch(err => {
console.error('Failed to sync todos:', err);
// Optionally rollback local state
});
Common Mistakes
Not checking abort signals: Tools keep running after user cancels. Waste API credits and confuse users.
Forgetting to release locks: File remains locked forever. Next edit hangs indefinitely.
Not appending todo context: Agent loses track of tasks. Duplicates work or skips steps.
Sharing mux between projects: Events leak between users. Privacy nightmare.
Not cleaning up event listeners: Memory leak. Server slows down over time.
Forgetting to persist messages: Conversation history lost on refresh. Frustrating UX.
Not handling context fetch failures gracefully: If Supabase context fetch fails, don't crash—continue without it.
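The listener-cleanup mistake above is worth a concrete sketch. Using the same subscribe-returns-cleanup shape as AgentMux:

```typescript
import { EventEmitter } from 'node:events';

// Same shape as AgentMux.subscribe: the returned function is the cleanup.
const emitter = new EventEmitter();
function subscribe(handler: (e: unknown) => void): () => void {
  emitter.on('event', handler);
  return () => emitter.off('event', handler);
}

// Handlers registered per request accumulate forever if never removed.
const unsubs = Array.from({ length: 5 }, () => subscribe(() => {}));
console.log(emitter.listenerCount('event')); // 5

unsubs.forEach(unsub => unsub()); // what a finally block should do per request
console.log(emitter.listenerCount('event')); // 0
```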
Testing State Management
Mock the context for unit tests:
import { describe, it, expect } from 'vitest';
import { bashTool } from './bash';
describe('Bash Tool', () => {
it('respects abort signal', async () => {
const abortController = new AbortController();
const mockContext: AgentContext = {
projectId: 'test',
sandboxClient: mockSandboxClient,
mux: new AgentMux(),
abortSignal: abortController.signal,
fileLockManager: new FileLockManager(),
};
// Start operation
const promise = bashTool.execute(
{ command: 'sleep 10' },
{ context: mockContext }
);
// Abort after 100ms
setTimeout(() => abortController.abort(), 100);
// Should throw
await expect(promise).rejects.toThrow('cancelled');
});
});
What We're Skipping
State persistence strategies (when to write to DB), state migration (handling schema changes), distributed state (multiple server instances), conflict resolution (handling concurrent edits differently), state snapshots (checkpointing for recovery), state debugging tools (inspecting state at runtime).
These matter for production systems, but the patterns we've covered handle most cases.
What's Next
You now understand how state flows through your agent system—from HTTP request to tool execution to real-time frontend updates. But state is just plumbing. The magic happens in your prompts. In the next guide, we'll explore prompt engineering: how to write system prompts that get reliable results, how to structure tool descriptions, and how to guide agent behavior.
That's where your agent goes from "technically works" to "feels intelligent."