Building AI Coding Agents

LLMs are stateless. They don't remember your file structure, your database schema, or what you built yesterday. Every request starts from scratch.

That's a problem when you're building a coding agent. Without context, your agent can't see what files exist, what integrations you've set up, or what the user just fixed. It's flying blind.

Good context management changes everything. Your agent knows the codebase, understands the current state, and makes informed decisions. But loading context is expensive - you need smart caching and careful initialization.

We'll build a system that loads context once, caches everything, and assembles rich user messages from multiple sources. This is what makes a coding agent actually useful.

What Context Does a Coding Agent Need?

Think about what you need to know when working on a project:

Project structure: What files and directories exist? What's the layout?
External integrations: Database connections, API credentials, third-party services
Conversation history: What did we talk about? What tools were called?
Session state: What's the current workspace? Are there runtime errors? Did they upload files?
Project metadata: Configuration settings, environment variables, feature flags

Your agent needs all of this to be useful. Load too little and it's confused. Load too much and requests get expensive. The key is knowing what to load, when to load it, and how to cache it.

The Initialization Pattern (Load Once, Cache Everything)

Don't reload context on every request. Load it once during initialization and cache it in your agent instance.

Here's the pattern:

export class CodingAgent {
  private initialized = false;
  private fileStructure: string = '';
  private integrationContext: Record<string, any> = {};
  private projectConfig: any = null;
 
  async initialize() {
    if (this.initialized) return;
 
    // Load everything in parallel for speed
    await Promise.all([
      this.loadFileStructure(),
      this.loadIntegrations(),
      this.loadProjectConfig()
    ]);
 
    this.initialized = true;
  }
 
  private async loadFileStructure() {
    this.fileStructure = await this.getCodebaseStructure();
  }
 
  private async loadIntegrations() {
    // Load external service credentials (databases, APIs, etc.)
    this.integrationContext = await this.fetchIntegrationCredentials();
  }
 
  private async loadProjectConfig() {
    // Load environment variables, feature flags, etc.
    this.projectConfig = await this.fetchProjectConfiguration();
  }
}

Run initialize() once before handling the first request. Everything gets loaded in parallel for speed, then cached in instance properties. Subsequent requests just use the cached data.
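
One gap worth knowing about: the initialized flag is only set after loading finishes, so two requests that arrive at the same time could both start loading. Here's a minimal sketch of one way to close that gap, assuming you cache the in-flight promise instead of a boolean. InitOnce and its loaders parameter are hypothetical names for illustration, not part of the agent class above:

```typescript
// Sketch: memoize the initialization promise so concurrent callers share one load.
// Each loader is a hypothetical stand-in for loadFileStructure(), loadIntegrations(), etc.
type ContextLoader = () => Promise<void>;

class InitOnce {
  private initPromise: Promise<void> | null = null;
  private readonly loaders: ContextLoader[];

  constructor(loaders: ContextLoader[]) {
    this.loaders = loaders;
  }

  initialize(): Promise<void> {
    if (this.initPromise === null) {
      // Start all loaders in parallel and remember the combined promise.
      this.initPromise = Promise.all(this.loaders.map((load) => load()))
        .then(() => undefined)
        .catch((err) => {
          // Forget the failed attempt so a later call can retry.
          this.initPromise = null;
          throw err;
        });
    }
    return this.initPromise;
  }
}
```

Every caller awaits the same promise, so the loaders run once no matter how many requests race to initialize. If loading fails, the next call starts over.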

Structuring State in Your Agent Class

Store context as private instance properties. This keeps state encapsulated, makes testing easier, and lets you run one agent instance per session so concurrent sessions never share state.

export class CodingAgent {
  // Project context (cached, rarely changes)
  private fileStructure: string = '';
  private projectConfig: any = null;
 
  // Integration context (cached, invalidated when credentials change)
  private integrationContext: Record<string, any> = {};
 
  // Session state (request-specific, changes frequently)
  private currentWorkspace: string = '';
  private attachments: string[] = [];
  private errorState: boolean = false;
 
  // Agent instance
  private agent?: Agent<AgentContext>;
}

Project context rarely changes - load once, reuse. Session state changes per request - reset as needed. Integration context is semi-permanent - cache with invalidation logic.
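
Here's a rough sketch of the per-session idea, assuming one agent instance per session ID and a reset method for request-specific fields. SessionAgent, resetSessionState, and getAgentForSession are hypothetical names, and the registry is a plain in-memory Map:

```typescript
// Sketch: one agent instance per session, with a reset for request-specific state.
class SessionAgent {
  // Session state (request-specific, changes frequently)
  currentWorkspace = '';
  attachments: string[] = [];
  errorState = false;

  // Clear request-specific fields; cached project context would be left alone.
  resetSessionState(): void {
    this.currentWorkspace = '';
    this.attachments = [];
    this.errorState = false;
  }
}

// In-memory registry keyed by session ID (an assumption for this sketch).
const agentsBySession = new Map<string, SessionAgent>();

function getAgentForSession(sessionId: string): SessionAgent {
  let agent = agentsBySession.get(sessionId);
  if (!agent) {
    agent = new SessionAgent();
    agentsBySession.set(sessionId, agent);
  }
  return agent;
}
```

Because each session gets its own instance, one user's workspace or attachments can't leak into another user's request.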

Building Rich User Messages

The user's message isn't just their text input. It's their request plus all the context the agent needs to act on it.

Assemble user message content from multiple sources:

const agentUserMessageContent: Array<{type: string; text?: string}> = [];
 
// 1. User request (wrapped in XML for clarity)
agentUserMessageContent.push({
  type: 'input_text',
  text: `<user_request>\n${userRequest}\n</user_request>`
});
 
// 2. Current workspace context
if (currentWorkspace) {
  agentUserMessageContent.push({
    type: 'input_text',
    text: `Current workspace: ${currentWorkspace.path}\n${currentWorkspace.details}`
  });
}
 
// 3. Runtime errors (if detected)
if (detectedErrors) {
  agentUserMessageContent.push({
    type: 'input_text',
    text: `Detected errors:\n${detectedErrors}`
  });
}
 
// 4. Attachments (images/files), already converted to content items
if (processedAttachments && processedAttachments.length > 0) {
  agentUserMessageContent.push(...processedAttachments);
}
 
// 5. File structure (codebase overview)
if (this.fileStructure) {
  agentUserMessageContent.push({
    type: 'input_text',
    text: `<project_structure>\n${this.fileStructure}\n</project_structure>`
  });
}
 
// 6. Relevant file contents
if (relevantFiles && relevantFiles.length > 0) {
  for (const file of relevantFiles) {
    const content = await this.readFile(file.path);
    agentUserMessageContent.push({
      type: 'input_text',
      text: `<file path="${file.path}">\n${content}\n</file>`
    });
  }
}
 
// 7. Integration context (if configured)
if (Object.keys(this.integrationContext).length > 0) {
  const formattedContext = this.formatIntegrationContext(this.integrationContext);
  agentUserMessageContent.push({
    type: 'input_text',
    text: formattedContext
  });
}

Each piece of context gets added conditionally. If there's no current workspace, skip it. If there are no errors, skip it. Build exactly what's needed for this request.

Context Formatting: XML for Structured Data

Plain text works for most context. But when you have structured data - credentials, API keys, configuration - XML keeps things organized.

Here's how to format integration context:

function formatIntegrationContext(integrations: Record<string, any>): string {
  const xmlParts: string[] = ['<integrations>'];
 
  // Include guidelines so the agent knows what to do
  xmlParts.push(`<guidelines>`);
  xmlParts.push(`When using these integrations:`);
  xmlParts.push(`- Use the provided credentials securely`);
  xmlParts.push(`- Follow the documented API patterns`);
  xmlParts.push(`- Handle errors appropriately`);
  xmlParts.push(`</guidelines>`);
 
  // Add each integration's details
  for (const [name, config] of Object.entries(integrations)) {
    xmlParts.push(`<integration name="${escapeXML(name)}">`);
    xmlParts.push(`  <endpoint>${escapeXML(config.endpoint)}</endpoint>`);
 
    if (config.credentials) {
      xmlParts.push(`  <credentials>`);
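      for (const [key, value] of Object.entries(config.credentials)) {
        // Put the key in an attribute: arbitrary strings aren't always valid XML element names
        xmlParts.push(`    <credential name="${escapeXML(key)}">${escapeXML(String(value))}</credential>`);
      }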
      xmlParts.push(`  </credentials>`);
    }
 
    xmlParts.push(`</integration>`);
  }
 
  xmlParts.push(`</integrations>`);
  return xmlParts.join('\n');
}

Why XML? Tags give each value a clear boundary, the guidelines travel with the data, and LLMs generally handle tagged input well. This pattern works for databases, external APIs, or any structured credentials.
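
The formatter above calls escapeXML, which this guide doesn't define. Here's a minimal sketch, assuming it only needs to escape the five characters XML reserves in text and attribute values:

```typescript
// Sketch of the escapeXML helper used by formatIntegrationContext.
// Ampersands are replaced first so the other entities aren't double-escaped.
function escapeXML(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}
```

Without escaping, a password containing < or & would break the structure the agent relies on to tell one field from the next.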

Conversation History Management

Conversation history needs careful handling. You're not just passing text - you're passing context about what tools were called and what they returned.

Here's the pattern:

const inputMessages: AgentInputItem[] = [];
 
// Add conversation history (oldest first)
if (conversationHistory && conversationHistory.length > 0) {
  const chronologicalHistory = [...conversationHistory].reverse();
  for (const msg of chronologicalHistory) {
    let content = msg.content;
 
    // Include tool calls and results
    if (msg.actionLogs && msg.actionLogs.length > 0) {
      const filteredLogs = filterActionLogs(msg.actionLogs);
      content += `\n\nTools called:\n${filteredLogs.map(log => `---\n${log}\n---`).join('\n')}`;
    }
 
    inputMessages.push({
      role: msg.role as 'user' | 'assistant',
      content: content
    });
  }
}
 
// Add current user message LAST
inputMessages.push({
  role: 'user',
  content: agentUserMessageContent
});
 
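// Pass to agent (this.agent is declared optional, so guard before use)
if (!this.agent) throw new Error('Agent has not been created');
const stream = await run(this.agent, inputMessages, {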
  context: agentContext,
  maxTurns: 100,
  stream: true,
});

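History comes from your database. The code assumes your query returns it newest first, so it reverses the list into chronological order (oldest first); skip the reverse if your query already sorts ascending. Add the current message last, then pass everything to run(). The agent sees the full conversation and can reference previous decisions.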
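
The history loop calls filterActionLogs without showing it. Here's one possible version, assuming action logs are plain strings and that filtering means dropping empty entries and capping very long ones. The 2000-character default is an arbitrary number picked for this sketch:

```typescript
// Hypothetical sketch of filterActionLogs; the real filtering rules depend on your tools.
const MAX_LOG_CHARS = 2000; // assumed per-entry budget, tune for your model and costs

function filterActionLogs(logs: string[], maxChars: number = MAX_LOG_CHARS): string[] {
  return logs
    .map((log) => log.trim())
    // Drop entries that carry no information.
    .filter((log) => log.length > 0)
    // Cap long tool output so one noisy call doesn't dominate the history.
    .map((log) =>
      log.length > maxChars ? `${log.slice(0, maxChars)}\n[truncated]` : log
    );
}
```

Keeping tool output short matters here because every past message gets resent on every request.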

Dynamic Prompt Selection Based on State

Your system prompt should change based on the situation. Different modes need different instructions. Initial setup needs onboarding. Ongoing work needs standard guidelines.

Select the prompt dynamically:

private createSystemPrompt(): string {
  // 1. Check for override (testing/debugging)
  if (this.systemPromptOverride) {
    return this.systemPromptOverride;
  }
 
  // 2. Read-only mode has limited capabilities
  if (this.mode === 'readonly') {
    return READ_ONLY_MODE_PROMPT;
  }
 
  // 3. Initial setup needs extra guidance
  if (this.isInitialSetup) {
    return INITIAL_SETUP_PROMPT;
  }
 
  // 4. Different environments have different constraints
  if (this.environment === 'production') {
    return PRODUCTION_MODE_PROMPT;
  }
 
  // 5. Standard agent prompt
  return STANDARD_AGENT_PROMPT;
}

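Call this when building your agent. The checks run in priority order and the first match wins, so a read-only session in production gets the read-only prompt. The prompt adapts to the current state without any manual switching.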
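
If you want to test the selection order on its own, one option is to pull it into a pure function. This is a sketch under assumptions: the PromptState shape, the 'readwrite' and 'development' values, and the short placeholder prompt strings are all made up for illustration:

```typescript
// Sketch: prompt selection as a pure function, so precedence can be unit tested.
interface PromptState {
  systemPromptOverride?: string;
  mode: 'readonly' | 'readwrite';             // 'readwrite' is an assumed name
  isInitialSetup: boolean;
  environment: 'development' | 'production';  // 'development' is an assumed name
}

// Placeholder texts; real prompts would be full instruction documents.
const READ_ONLY_MODE_PROMPT = 'read-only prompt';
const INITIAL_SETUP_PROMPT = 'initial setup prompt';
const PRODUCTION_MODE_PROMPT = 'production prompt';
const STANDARD_AGENT_PROMPT = 'standard prompt';

function selectSystemPrompt(state: PromptState): string {
  // Same order as createSystemPrompt(): the first matching check wins.
  if (state.systemPromptOverride) return state.systemPromptOverride;
  if (state.mode === 'readonly') return READ_ONLY_MODE_PROMPT;
  if (state.isInitialSetup) return INITIAL_SETUP_PROMPT;
  if (state.environment === 'production') return PRODUCTION_MODE_PROMPT;
  return STANDARD_AGENT_PROMPT;
}
```

With the logic isolated like this, a test can pin down what happens when several conditions are true at once.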

Caching Strategies

Loading context is expensive. Fetch it once, cache it, and check the cache before refetching.

Here's the pattern:

private async getProjectConfig(): Promise<any> {
  // Check cache first
  if (this.projectConfig !== null) {
    return this.projectConfig;
  }
 
  // Fetch from source (database, file system, API)
  const config = await this.fetchConfiguration();
 
  // Cache result
  this.projectConfig = config;
  return this.projectConfig;
}

Same pattern works for file structure, integration contexts, and project metadata. Load once, cache in instance properties, reuse on subsequent calls.

For context that can change during a session (like file structure after writes), add a method to invalidate the cache:

private invalidateFileStructure() {
  this.fileStructure = '';
}
 
async refreshFileStructure() {
  this.invalidateFileStructure();
  this.fileStructure = await this.getCodebaseStructure();
}
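
Here's a small standalone sketch of the same idea, assuming you'd rather rescan lazily on the next read than eagerly on every write. FileStructureCache and its scan callback are hypothetical names; scan stands in for getCodebaseStructure():

```typescript
// Sketch: cache the file listing, and rescan only after something invalidates it.
class FileStructureCache {
  private structure: string | null = null;
  private readonly scan: () => Promise<string>;

  constructor(scan: () => Promise<string>) {
    this.scan = scan;
  }

  // Return the cached listing, rescanning only if it was invalidated.
  async get(): Promise<string> {
    if (this.structure === null) {
      this.structure = await this.scan();
    }
    return this.structure;
  }

  // Call after any tool that creates, moves, or deletes files.
  invalidate(): void {
    this.structure = null;
  }
}
```

The lazy version avoids rescanning after each of several back-to-back writes: they all invalidate, and only the next read pays for a scan.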

State Persistence

Some state needs to persist beyond the current session. Track project stage, user preferences, or feature flags in your storage layer.

Here's the pattern:

private async updateProjectState(key: string, value: any) {
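  // Skip the write if the value is unchanged. Note that === compares objects
  // by reference, so this only short-circuits for primitives or the same object.
  const currentValue = this.projectState[key];
  if (currentValue === value) return;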
 
  // Update storage layer (database, file, etc.)
  await this.storage.update('project_state', {
    projectId: this.projectId,
    [key]: value
  });
 
  // Update cached value
  this.projectState[key] = value;
 
  // Notify listeners (optional)
  this.eventEmitter.emit('state_changed', {
    key,
    oldValue: currentValue,
    newValue: value
  });
}

Storage for persistence, instance property for cache, event emitter for real-time updates. This keeps all parts of your system in sync.
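
To try the flow without a real database, here's a sketch with the storage layer behind a small interface and plain callbacks in place of an event emitter. ProjectStateStore, ProjectStateTracker, and onChange are hypothetical names, and the update signature is an assumption for this sketch:

```typescript
// Sketch: persist first, then update the cache, then notify listeners.
interface StateChange {
  key: string;
  oldValue: unknown;
  newValue: unknown;
}

// Assumed storage interface; back it with a database, a file, or a test double.
interface ProjectStateStore {
  update(projectId: string, key: string, value: unknown): Promise<void>;
}

class ProjectStateTracker {
  private readonly cache = new Map<string, unknown>();
  private readonly listeners: Array<(change: StateChange) => void> = [];
  private readonly projectId: string;
  private readonly store: ProjectStateStore;

  constructor(projectId: string, store: ProjectStateStore) {
    this.projectId = projectId;
    this.store = store;
  }

  onChange(listener: (change: StateChange) => void): void {
    this.listeners.push(listener);
  }

  // Resolves to true if a write happened, false if the value was unchanged.
  async set(key: string, value: unknown): Promise<boolean> {
    const oldValue = this.cache.get(key);
    if (this.cache.has(key) && oldValue === value) return false;

    await this.store.update(this.projectId, key, value); // persist first
    this.cache.set(key, value);                          // then cache
    for (const listener of this.listeners) {
      listener({ key, oldValue, newValue: value });      // then notify
    }
    return true;
  }
}
```

Writing to storage before touching the cache means a failed write leaves the cached value matching what is actually persisted.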

Mental Model

Here's how it all flows:

Initialization
    ↓
Load & Cache Context (file structure, integrations, project config)
    ↓
User Request Arrives
    ↓
Build User Message (request + workspace + errors + attachments + structure + cached contexts)
    ↓
Select System Prompt (based on mode, stage, environment)
    ↓
Add Conversation History (oldest first)
    ↓
Pass to LLM with Context
    ↓
Stream Response Back

Context loads once. Prompts adapt to state. Messages include everything the agent needs. History provides continuity. State persists across sessions.

What We're Skipping (For Now)

There's more you might add - context invalidation strategies, partial context updates, smart summarization for long histories, metrics on context usage. We'll cover these in later guides.

Right now, focus on the basics. Load context once, cache it, build rich messages, and adapt to state. Get this working first, then optimize.

What's Next

You have context and state management working. In the next guide, we'll build tools your agent can actually call - reading files, searching code, executing commands.

That's where your agent becomes truly capable.