Building AI Agents In Action: Architectures, Algorithms, and Source Code Using LangGraph, FastAPI, Vue, Docker
Here is a comprehensive structure and content draft for the book “Building AI Agents In Action”. This manuscript focuses on the practical implementation of autonomous agents using the modern stack of LangGraph, FastAPI, Vue.js, and Docker.
Building AI Agents In Action: Architectures, Algorithms, and Source Code Using LangGraph, FastAPI, Vue, Docker
Author: [Photon AI]
Format: Technical / Educational
Table of Contents
Part I: The Architecture of Autonomy
1. Introduction to Agentic Workflows: From Chatbots to Agents.
2. The Tech Stack Demystified: Why LangGraph, FastAPI, and Vue?
3. Designing Agentic Brains: State Machines vs. DAGs.
Part II: Backend & The Brain (LangGraph & FastAPI)
4. Foundations of LangGraph: Nodes, Edges, and State.
5. Building the Toolkit: Shell, File Ops, and Web Search.
6. Browser Automation: Integrating browser-use for Web Agents.
7. Creating the API: FastAPI for Real-Time Agent Streaming.
8. Memory and Persistence: Checkpointing and RAG.
Part III: Frontend & Interaction (Vue.js)
9. Visualizing Thought: Building a Streaming UI in Vue 3.
10. Controlling the Agent: Human-in-the-Loop Interfaces.
Part IV: Deployment & Infrastructure (Docker)
11. Sandboxing for Safety: Dockerizing Tool Execution.
12. Production-Grade Deployment: Multi-stage builds and Orchestration.
13. Security: Guardrails and Sandboxes in Production.
Chapter 1: Introduction to Agentic Workflows
(Excerpt)
The era of simple “prompt-response” AI is ending. We are entering the age of Agentic Workflows—systems that don’t just generate text but plan, reason, utilize tools, and execute code to achieve complex goals.
In this book, we move beyond theory. We will build a robust system capable of interacting with a file system, browsing the web autonomously, and executing shell commands—all wrapped in a secure Docker container and presented through a reactive Vue.js frontend.
The Modern Agent Stack
- Orchestration: LangGraph. Unlike sequential chains, LangGraph allows for cyclic graphs, enabling agents to loop, retry, and self-correct.
- Backend: FastAPI. High performance, native async support, and perfect for handling Server-Sent Events (SSE) for streaming agent thoughts.
- Frontend: Vue.js. Reactivity is key when visualizing an agent’s step-by-step reasoning process.
- Infrastructure: Docker. The only safe way to run an agent with shell and file access permissions.
Chapter 4: Foundations of LangGraph
(Source Code Focus)
The core of our agent is the Graph. In LangGraph, we define a State that circulates between Nodes.
Defining the Agent State
First, we define the data structure that our agent will pass around and update.
# agent/state.py
from typing import Annotated, Sequence, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    # The add_messages reducer appends new messages to the history
    # instead of overwriting the key on every state update
    messages: Annotated[Sequence[BaseMessage], add_messages]
    # Specific fields for tool execution tracking
    next_action: str
    user_intent: str
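To make the reducer concrete, here is a quick illustrative sketch (the message contents are invented for the example): when a node returns new messages, add_messages appends them to the existing history rather than replacing it.

# sketch: how the add_messages reducer merges state updates
from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

history = [HumanMessage(content="Check disk usage")]
update = [AIMessage(content="Running the disk-usage tool...")]

merged = add_messages(history, update)
print([m.content for m in merged])
# ['Check disk usage', 'Running the disk-usage tool...']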
The Graph Architecture
We will build a “Supervisor” pattern where an LLM decides whether to call a tool (Search, Shell, Browser) or finish.
# agent/graph.py
from langgraph.graph import StateGraph, END

from .nodes import call_model, should_continue, tool_node
from .state import AgentState

def create_graph():
    workflow = StateGraph(AgentState)

    # Define Nodes (the LLM itself is initialized in nodes.py)
    workflow.add_node("agent", call_model)
    workflow.add_node("tools", tool_node)

    # Define Entry Point
    workflow.set_entry_point("agent")

    # Define Conditional Edges (The "Brain")
    # After the agent acts, decide: Do we stop? Or call a tool?
    workflow.add_conditional_edges(
        "agent",
        should_continue,
        {
            "continue": "tools",
            "end": END,
        },
    )

    # Define Normal Edges
    # After a tool is used, go back to the agent to observe the result
    workflow.add_edge("tools", "agent")

    # compile() also accepts a checkpointer (see Chapter 8), which is what
    # makes the thread_id passed in from the API actually persist state
    return workflow.compile()
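graph.py imports call_model, tool_node, and should_continue from a nodes module that this excerpt has not shown. As a bridge, here is a minimal sketch of what that module could look like; the use of the prebuilt ToolNode and the single-tool list are our assumptions, not requirements of the design.

# agent/nodes.py (a minimal sketch; the tool list is an assumption)
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import ToolNode

from .state import AgentState
from tools.browser_tool import browse_website  # the Chapter 6 tool

tools = [browse_website]

# Bind the tool schemas to the LLM so it can emit tool calls
model = ChatOpenAI(model="gpt-4o", temperature=0).bind_tools(tools)

# Prebuilt node that executes whichever tool the last AI message requested
tool_node = ToolNode(tools)

def call_model(state: AgentState) -> dict:
    # Ask the LLM for the next step given the full message history
    response = model.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: AgentState) -> str:
    # Route to the tool node if the model requested a tool, else finish
    last_message = state["messages"][-1]
    if getattr(last_message, "tool_calls", None):
        return "continue"
    return "end"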
Chapter 6: Browser Automation
(Integrating browser-use)
One of the most powerful capabilities of a modern agent is the ability to “see” and “click” the web. We will integrate the browser-use library as a LangChain tool.
The Browser Tool Wrapper
We need a safe wrapper that executes browser actions within a controlled headless instance.
# tools/browser_tool.py
import asyncio

from browser_use import Agent as BrowserAgent
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def browse_website(url: str, objective: str) -> str:
    """
    Navigates to a URL and performs an objective using the browser.

    Args:
        url: The website URL.
        objective: What to achieve (e.g., "Find the price of the iPhone 15").
    """
    async def _run():
        # Fold the URL into the task so the browser agent actually visits it
        agent = BrowserAgent(
            task=f"Go to {url} and complete this objective: {objective}",
            llm=ChatOpenAI(model="gpt-4o", temperature=0),
            # Run headless on servers; the exact configuration option for
            # this depends on your browser-use version, so check its docs
        )
        # Note: In production, manage the browser context lifecycle better
        result = await agent.run()
        # final_result() returns the last extracted content (API may vary by version)
        return result.final_result() or "Task completed, but no text extracted."

    try:
        # Run the async browser task in a sync context
        return asyncio.run(_run())
    except Exception as e:
        return f"Browser failed: {e}"
Chapter 7: Creating the API with FastAPI
(Streaming the Thoughts)
Agents take time. Users don’t want to wait 10 seconds for a black box to resolve. We must stream the “tokens” and the “steps” back to the frontend using Server-Sent Events (SSE).
The Streaming Endpoint
# api/main.py
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

from agent.graph import create_graph

app = FastAPI()
graph = create_graph()

class UserRequest(BaseModel):
    message: str
    thread_id: str

@app.post("/chat")
async def chat_endpoint(request: UserRequest):
    # thread_id only persists state if the graph was compiled with a checkpointer
    config = {"configurable": {"thread_id": request.thread_id}}
    inputs = {"messages": [("user", request.message)]}

    async def event_generator():
        try:
            # Stream the graph execution node by node
            async for event in graph.astream(inputs, config):
                # Each event maps a node name to that node's state update
                for node_name, node_output in event.items():
                    if node_name != "__end__":
                        # Send JSON updates to the frontend as SSE frames
                        yield f"data: {json.dumps({'type': 'step', 'node': node_name, 'output': str(node_output)})}\n\n"
        except Exception as e:
            yield f"data: {json.dumps({'type': 'error', 'message': str(e)})}\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(event_generator(), media_type="text/event-stream")
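Before wiring up the frontend, you can sanity-check the stream from Python. This sketch assumes the httpx package is installed and the server is running on localhost:8000.

# sketch: consuming the SSE stream with httpx
import httpx

payload = {"message": "List the workspace files", "thread_id": "demo-1"}
with httpx.stream("POST", "http://localhost:8000/chat", json=payload, timeout=None) as response:
    for line in response.iter_lines():
        if line.startswith("data: "):
            print(line[len("data: "):])  # raw JSON event or the [DONE] sentinel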
Chapter 9: Visualizing Thought in Vue.js
(Source Code Focus)
The Vue component needs to handle an incoming stream of SSE data and render a “Chain of Thought” visualization.
The Agent Chat Component
<!-- src/components/AgentChat.vue -->
<template>
  <div class="chat-container">
    <div v-for="(msg, index) in history" :key="index" class="message">
      <div :class="msg.role">{{ msg.content }}</div>
      <!-- Visualization of Agent Steps (Tools used) -->
      <div v-if="msg.steps" class="steps-log">
        <div v-for="(step, sIdx) in msg.steps" :key="sIdx" class="step-badge">
          🤖 <strong>{{ step.node }}</strong>: {{ formatOutput(step.output) }}
        </div>
      </div>
    </div>
  </div>
</template>
<script setup>
import { ref } from 'vue';

const history = ref([]);

const startChat = async (message) => {
  history.value.push({ role: 'user', content: message, steps: [] });
  // Add an assistant placeholder that the stream will fill in
  history.value.push({ role: 'assistant', content: '', steps: [] });
  const currentMsgIndex = history.value.length - 1;

  // EventSource only supports GET, so we POST with fetch and parse the
  // SSE frames from the response body ourselves
  const response = await fetch('http://localhost:8000/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // thread_id would normally come from the session (placeholder here)
    body: JSON.stringify({ message, thread_id: 'session-1' }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE frames are separated by a blank line
    const frames = buffer.split('\n\n');
    buffer = frames.pop();
    for (const frame of frames) {
      if (!frame.startsWith('data: ')) continue;
      const payload = frame.slice('data: '.length);
      if (payload === '[DONE]') return;

      const data = JSON.parse(payload);
      if (data.type === 'step') {
        // Append steps to the current message visualization
        history.value[currentMsgIndex].steps.push(data);
      } else if (data.type === 'token') {
        // Append raw text to the content
        history.value[currentMsgIndex].content += data.content;
      }
    }
  }
};

const formatOutput = (output) => {
  // Truncate long shell/browser outputs for UI cleanliness
  return output.length > 100 ? output.substring(0, 100) + '...' : output;
};
</script>
<style scoped>
.steps-log {
  background: #f4f4f4;
  padding: 10px;
  border-left: 3px solid #42b883;
  margin-top: 5px;
  font-family: monospace;
  font-size: 0.9em;
}
.step-badge {
  margin-bottom: 4px;
  color: #35495e;
}
</style>
Chapter 11: Sandbox Safety with Docker
(Deploy, Sandbox, Shell, File Ops)
This is the most critical part of the book. An agent that can run rm -rf or install malicious Python packages must be contained.
The Docker Strategy
We use a Multi-Stage Docker Build.
- Builder Stage: Compiles the Vue frontend into static files.
- Runner Stage: A Python image that serves both the API (FastAPI) and the static Frontend (Vue).
- Isolation: The agent tools (Shell, File Ops) run inside this container. If the agent goes rogue, it only destroys the container, not the host server.
Dockerfile
# --- Stage 1: Build the Vue Frontend ---
FROM node:18-alpine AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package*.json ./
RUN npm install
COPY frontend/ .
RUN npm run build

# --- Stage 2: The Backend Runtime ---
FROM python:3.11-slim

# Install system dependencies needed for browser automation (Playwright)
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    gnupg \
    procps \
    libnss3 \
    libnspr4 \
    libatk1.0-0 \
    libatk-bridge2.0-0 \
    libcups2 \
    libdrm2 \
    libxkbcommon0 \
    libxcomposite1 \
    libxdamage1 \
    libxfixes3 \
    libxrandr2 \
    libgbm1 \
    libasound2 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy Python requirements
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Install Playwright browsers to a shared path so the non-root user below
# can find them (the default cache lives under /root and is inaccessible)
ENV PLAYWRIGHT_BROWSERS_PATH=/ms-playwright
RUN playwright install --with-deps chromium

# Copy the compiled Vue files from Stage 1
COPY --from=frontend-builder /app/frontend/dist ./static

# Copy Backend Code
COPY backend/ .

# Expose port
EXPOSE 8000

# Security Principle: Run as a non-root user
RUN useradd -m agentuser
USER agentuser

# Command to serve both API and Static files
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Docker Compose for Orchestration
# docker-compose.yml
version: '3.8'

services:
  agent-app:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - LANGCHAIN_API_KEY=${LANGCHAIN_API_KEY}
      - LANGCHAIN_TRACING_V2=true
    volumes:
      # Mount a safe workspace directory for file ops
      - ./agent_workspace:/app/workspace
    restart: unless-stopped
Chapter 13: Production-Grade Security
(Summary)
When deploying agents with shell access, standard API security is not enough.
- The Allow-List Pattern: Do not let the LLM generate arbitrary shell commands. Force it to choose from a pre-defined list of Python functions (e.g., read_file, write_file, list_directory); a sketch follows this list.
- Jailbreak Detection: Implement a middleware check in FastAPI that scores incoming prompts for "jailbreak" attempts before sending them to the LLM.
- Resource Limits: Configure Docker (via Compose) to limit CPU and memory usage (deploy: resources: limits:). This prevents a runaway agent loop from freezing your server.
- Ephemeral Containers: Ideally, spin up a fresh Docker container for every user session and destroy it when the session ends. This ensures no "memory" of user data persists between sessions.
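As a concrete illustration of the allow-list pattern, here is a minimal sketch. The WORKSPACE path matches the volume mounted in the Compose file above; the helper names are our own, not a fixed API.

# tools/file_ops.py (a minimal sketch of allow-listed file tools)
from pathlib import Path

from langchain_core.tools import tool

# Matches the ./agent_workspace volume mounted in docker-compose.yml
WORKSPACE = Path("/app/workspace")

def _safe_path(relative: str) -> Path:
    # Resolve the path and refuse anything that escapes the workspace
    path = (WORKSPACE / relative).resolve()
    if not path.is_relative_to(WORKSPACE):
        raise ValueError("Path escapes the agent workspace")
    return path

@tool
def read_file(path: str) -> str:
    """Read a text file from the agent workspace."""
    return _safe_path(path).read_text()

@tool
def write_file(path: str, content: str) -> str:
    """Write text to a file inside the agent workspace."""
    _safe_path(path).write_text(content)
    return f"Wrote {len(content)} characters to {path}"

@tool
def list_directory(path: str = ".") -> str:
    """List the entries of a workspace directory."""
    return "\n".join(p.name for p in _safe_path(path).iterdir())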
Appendix A: Putting It All Together
Project: “The DevOps Agent”
- Goal: An agent that checks out a GitHub repo, runs its tests via the shell, and, if a test fails, reads the error logs, modifies the code using file ops, and re-runs the tests.
- Implementation: Combines LangGraph's loop capability, Docker's isolation, and Vue's real-time log streaming. A skeleton of the retry loop follows.
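To show the loop shape without the full tooling, here is a self-contained skeleton; the node bodies are placeholders standing in for the real shell and file-ops tools.

# sketch: the DevOps agent's test-and-fix loop (node bodies are placeholders)
from typing import TypedDict

from langgraph.graph import StateGraph, END

class DevOpsState(TypedDict):
    tests_passed: bool
    attempts: int

def run_tests(state: DevOpsState) -> dict:
    # Real node: shell out to pytest inside the sandbox container.
    # Placeholder: pretend the suite passes on the third attempt.
    attempts = state["attempts"] + 1
    return {"attempts": attempts, "tests_passed": attempts >= 3}

def fix_code(state: DevOpsState) -> dict:
    # Real node: read the failing logs and edit files via the file-ops tools
    return {}

def route(state: DevOpsState) -> str:
    return "done" if state["tests_passed"] else "fix"

workflow = StateGraph(DevOpsState)
workflow.add_node("test", run_tests)
workflow.add_node("fix", fix_code)
workflow.set_entry_point("test")
workflow.add_conditional_edges("test", route, {"fix": "fix", "done": END})
workflow.add_edge("fix", "test")

graph = workflow.compile()
print(graph.invoke({"tests_passed": False, "attempts": 0}))
# {'tests_passed': True, 'attempts': 3}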
This book structure provides a complete end-to-end guide, from the Python code running the logic to the JavaScript displaying the results, all wrapped in the safety of containerization.