Deploying AI Agents to Production: A Hands-On Guide

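The previous two articles covered building and optimizing an AI Agent. This one walks through deploying the Agent to production so that it can serve real users.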

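🎯 What you will learn

  • How to choose a suitable deployment architecture
  • Containerized deployment with Docker
  • Building an API service with FastAPI
  • Load balancing and high availability
  • Monitoring, logging, and alerting
  • Security and access control
  • Hands-on case: deploying an intelligent customer service system

Tech stack: Docker, FastAPI, Nginx, Prometheus, Grafana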


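1. Choosing a deployment architecture

1.1 Architecture comparison

Architecture     | Best suited for            | Pros                            | Cons
Single machine   | Personal projects, MVPs    | Simple and fast                 | Cannot scale out
Containerized    | Small and medium projects  | Easy to migrate and scale       | Requires container knowledge
Serverless       | Infrequent calls           | Pay per use, no ops work        | Slow cold starts
Kubernetes       | Large projects             | High availability, autoscaling  | High complexity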

1.2 Recommended architecture

Small and medium projects (recommended):

┌─────────────────────────────────────┐
│        Nginx (load balancer)        │
└──────────────┬──────────────────────┘
               │
       ┌───────┴───────┐
       │               │
┌──────▼─────┐  ┌─────▼──────┐
│  Agent 1   │  │  Agent 2   │
│  (Docker)  │  │  (Docker)  │
└──────┬─────┘  └─────┬──────┘
       │               │
       └───────┬───────┘
               │
┌──────────────▼──────────────────────┐
│        Redis (cache + queue)        │
└──────────────┬──────────────────────┘
               │
┌──────────────▼──────────────────────┐
│        PostgreSQL (database)        │
└─────────────────────────────────────┘

2. Containerizing with Docker

2.1 Dockerfile

# Use the official Python image
FROM python:3.11-slim

# Set the working directory
WORKDIR /app

# Install system dependencies (curl is needed by the health check below)
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy the dependency list first so this layer is cached between code changes
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

# Expose the service port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Start command
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
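The HEALTHCHECK above depends on curl being installed in the image. As a rough alternative, the same probe can be written with the Python standard library. The sketch below is only an illustration: the `probe` and `main` names are hypothetical, and the URL is the `/health` endpoint used in this article.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 10.0) -> bool:
    """Return True when the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors (4xx/5xx), refused connections and timeouts all count as unhealthy.
        return False


def main() -> int:
    """Exit code for Docker: 0 marks the container healthy, 1 unhealthy."""
    return 0 if probe("http://localhost:8000/health") else 1
```

To use it as a health check, a script would call `sys.exit(main())` under a `__main__` guard and the Dockerfile would run that script in place of the curl command.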

2.2 requirements.txt

fastapi==0.109.0
uvicorn[standard]==0.27.0
langchain==0.1.0
langchain-openai==0.0.2
chromadb==0.4.22
redis==5.0.1
psycopg2-binary==2.9.9
python-dotenv==1.0.0
prometheus-client==0.19.0
pydantic==2.5.3
httpx==0.26.0

2.3 docker-compose.yml

version: '3.8'

services:
  # AI Agent service
  agent:
    build: .
    container_name: ai-agent
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://user:password@postgres:5432/agentdb
    depends_on:
      - redis
      - postgres
    volumes:
      - ./logs:/app/logs
      - ./data:/app/data
    restart: unless-stopped
    networks:
      - agent-network

  # Redis cache
  redis:
    image: redis:7-alpine
    container_name: ai-agent-redis
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    restart: unless-stopped
    networks:
      - agent-network

  # PostgreSQL database
  postgres:
    image: postgres:16-alpine
    container_name: ai-agent-postgres
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=agentdb
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
    restart: unless-stopped
    networks:
      - agent-network

  # Nginx load balancer
  nginx:
    image: nginx:alpine
    container_name: ai-agent-nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - agent
    restart: unless-stopped
    networks:
      - agent-network

volumes:
  redis-data:
  postgres-data:

networks:
  agent-network:
    driver: bridge
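The agent service receives OPENAI_API_KEY, REDIS_URL, and DATABASE_URL from docker-compose.yml. One way to catch a misconfigured deployment early is to validate these variables when the process starts. The following is a minimal sketch under that assumption; `Settings` and `load_settings` are hypothetical names, not part of the original code.

```python
import os
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlparse


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    redis_url: str
    database_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three variables set in docker-compose.yml and fail fast if any is unusable."""
    required = ("OPENAI_API_KEY", "REDIS_URL", "DATABASE_URL")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    # Only the URL scheme is checked here; connectivity belongs in the readiness check.
    expected = {"REDIS_URL": {"redis", "rediss"}, "DATABASE_URL": {"postgresql", "postgres"}}
    for key, schemes in expected.items():
        if urlparse(env[key]).scheme not in schemes:
            raise RuntimeError(f"{key} has an unexpected scheme")
    return Settings(env["OPENAI_API_KEY"], env["REDIS_URL"], env["DATABASE_URL"])
```

Calling `load_settings()` at import time in main.py would make a container with a missing variable exit immediately instead of failing on its first request.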

3. FastAPI service

3.1 Main application (main.py)

from fastapi import FastAPI, HTTPException, Depends
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import Optional
import logging
from prometheus_client import Counter, Histogram, generate_latest
from fastapi.responses import Response
import time

# Initialize FastAPI
app = FastAPI(
    title="AI Agent API",
    description="Intelligent assistant API service",
    version="1.0.0"
)

# CORS configuration
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # restrict this to specific domains in production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Prometheus metrics
REQUEST_COUNT = Counter(
    'agent_requests_total',
    'Total number of requests',
    ['method', 'endpoint', 'status']
)

REQUEST_DURATION = Histogram(
    'agent_request_duration_seconds',
    'Request duration in seconds',
    ['method', 'endpoint']
)

# Logging configuration
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('logs/agent.log'),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)

# Request and response models
class ChatRequest(BaseModel):
    message: str
    user_id: Optional[str] = None
    session_id: Optional[str] = None

class ChatResponse(BaseModel):
    response: str
    session_id: str
    timestamp: float

# Agent instance (global singleton)
from agent import ProjectManagementAgent
agent = ProjectManagementAgent()

# Middleware: request timing and metrics
@app.middleware("http")
async def add_process_time_header(request, call_next):
    start_time = time.time()
    response = await call_next(request)
    process_time = time.time() - start_time
    
    # Record metrics
    REQUEST_COUNT.labels(
        method=request.method,
        endpoint=request.url.path,
        status=response.status_code
    ).inc()
    
    REQUEST_DURATION.labels(
        method=request.method,
        endpoint=request.url.path
    ).observe(process_time)
    
    response.headers["X-Process-Time"] = str(process_time)
    return response

# Health check
@app.get("/health")
async def health_check():
    """Health check endpoint."""
    return {
        "status": "healthy",
        "timestamp": time.time()
    }

# Readiness check
@app.get("/ready")
async def readiness_check():
    """Readiness check endpoint."""
    try:
        # Check that the Agent is ready
        # Check the database connection
        # Check the Redis connection
        return {"status": "ready"}
    except Exception as e:
        raise HTTPException(status_code=503, detail=str(e))

# Prometheus metrics endpoint
@app.get("/metrics")
async def metrics():
    """Prometheus metrics."""
    return Response(
        content=generate_latest(),
        media_type="text/plain"
    )

# Chat endpoint
@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """
    Handle a user message.
    
    Args:
        request: the chat request
    
    Returns:
        The Agent's response
    """
    try:
        logger.info(f"Received message: {request.message} (user: {request.user_id})")
        
        # Call the Agent
        response = agent.run(request.message)
        
        logger.info(f"Agent response: {response[:100]}...")
        
        return ChatResponse(
            response=response,
            session_id=request.session_id or "default",
            timestamp=time.time()
        )
        
    except Exception as e:
        logger.error(f"Failed to handle message: {e}", exc_info=True)
        raise HTTPException(status_code=500, detail=str(e))

# Session management
@app.post("/session/create")
async def create_session(user_id: str):
    """Create a new session."""
    session_id = f"session_{user_id}_{int(time.time())}"
    return {"session_id": session_id}

@app.delete("/session/{session_id}")
async def delete_session(session_id: str):
    """Delete a session."""
    # Clean up session data
    return {"status": "deleted"}

# Startup and shutdown events
@app.on_event("startup")
async def startup_event():
    """Runs when the application starts."""
    logger.info("AI Agent API started")
    # Initialize database connections
    # Warm up the model

@app.on_event("shutdown")
async def shutdown_event():
    """Runs when the application shuts down."""
    logger.info("AI Agent API shut down")
    # Release resources

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
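The `/ready` endpoint in main.py leaves its three checks (Agent, database, Redis) as comments. One possible way to fill them in is to run a set of named check functions and report each result. This is only a sketch: `run_readiness_checks` is a hypothetical helper, and the real checks (for example a Redis ping or a `SELECT 1`) are left to the caller.

```python
from typing import Callable, Dict, Tuple


def run_readiness_checks(checks: Dict[str, Callable[[], None]]) -> Tuple[bool, Dict[str, str]]:
    """Run every check; a check signals failure by raising an exception."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:
            # Keep going so the response lists every failing dependency, not just the first.
            results[name] = f"failed: {exc}"
    ready = all(value == "ok" for value in results.values())
    return ready, results
```

Inside `readiness_check`, the endpoint could return the results when `ready` is true and raise `HTTPException(status_code=503, detail=results)` otherwise, so that the load balancer stops routing traffic to an instance whose dependencies are down.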

3.2 Starting the service

# Development
uvicorn main:app --reload --host 0.0.0.0 --port 8000

# Production (multiple worker processes; requires gunicorn to be installed)
gunicorn main:app \
    --workers 4 \
    --worker-class uvicorn.workers.UvicornWorker \
    --bind 0.0.0.0:8000 \
    --timeout 120 \
    --access-logfile logs/access.log \
    --error-logfile logs/error.log

4. Load balancing

4.1 Nginx configuration

# nginx.conf

events {}

http {
    # Rate-limit zone; limit_req_zone is only valid at the http level
    limit_req_zone $binary_remote_addr zone=agent_limit:10m rate=10r/s;

    upstream agent_backend {
        least_conn;  # least-connections load balancing
        # agent1 and agent2 must match the service names of the Agent containers
        server agent1:8000 max_fails=3 fail_timeout=30s;
        server agent2:8000 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 80;
        server_name your-domain.com;

        # Redirect to HTTPS
        return 301 https://$server_name$request_uri;
    }

    server {
        listen 443 ssl http2;
        server_name your-domain.com;

        # SSL certificate
        ssl_certificate /etc/nginx/ssl/cert.pem;
        ssl_certificate_key /etc/nginx/ssl/key.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;

        # Logs
        access_log /var/log/nginx/agent_access.log;
        error_log /var/log/nginx/agent_error.log;

        # Rate limiting (uses the zone defined above)
        limit_req zone=agent_limit burst=20 nodelay;

        # Proxy configuration
        location / {
            proxy_pass http://agent_backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;

            # WebSocket support
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }

        # Health check
        location /health {
            proxy_pass http://agent_backend/health;
            access_log off;
        }

        # Prometheus metrics (internal network only)
        location /metrics {
            allow 10.0.0.0/8;
            deny all;
            proxy_pass http://agent_backend/metrics;
        }
    }
}
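The `least_conn` directive sends each new request to the backend that currently has the fewest active connections. The idea can be sketched in a few lines of Python. This is a simplified model, not Nginx's implementation: it ignores server weights and the `max_fails` / `fail_timeout` handling, and `LeastConnBalancer` is a hypothetical name.

```python
from typing import Dict, Iterable


class LeastConnBalancer:
    """Track active connections per backend and pick the least loaded one."""

    def __init__(self, backends: Iterable[str]):
        self.active: Dict[str, int] = {backend: 0 for backend in backends}
        if not self.active:
            raise ValueError("at least one backend is required")

    def acquire(self) -> str:
        # min() returns the first backend among ties, following insertion order.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend: str) -> None:
        # Call this when the proxied request finishes.
        self.active[backend] -= 1
```

Because Agent requests vary a lot in duration (a long LLM call holds its connection open), least-connections tends to spread load more evenly than plain round-robin in this kind of service.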


5. Monitoring and logging

5.1 Prometheus configuration

# prometheus.yml

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'ai-agent'
    static_configs:
      - targets: ['agent:8000']
    metrics_path: '/metrics'

5.2 Grafana dashboard

{
  "dashboard": {
    "title": "AI Agent Monitoring",
    "panels": [
      {
        "title": "Request rate",
        "targets": [
          {
            "expr": "rate(agent_requests_total[5m])"
          }
        ]
      },
      {
        "title": "Response time (p95)",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, sum(rate(agent_request_duration_seconds_bucket[5m])) by (le))"
          }
        ]
      },
      {
        "title": "Error rate (5xx per second)",
        "targets": [
          {
            "expr": "rate(agent_requests_total{status=~\"5..\"}[5m])"
          }
        ]
      }
    ]
  }
}
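The response-time panel relies on `histogram_quantile`, which estimates a percentile from cumulative bucket counts rather than from raw samples. The sketch below mirrors the general idea (find the bucket containing the target rank, then interpolate linearly inside it). It is a simplified illustration written for this article, not Prometheus's actual code, and the bucket values in the usage note are made up.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets ending in +Inf."""
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    buckets = sorted(buckets)
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must end with an +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Beyond the last finite bucket (or an empty bucket): no interpolation possible.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound
```

For example, with buckets `[(0.1, 50), (0.5, 90), (1.0, 100), (inf, 100)]` the 95th percentile falls in the 0.5 to 1.0 second bucket and is estimated as 0.75 seconds. The estimate is only as precise as the bucket boundaries, so buckets should be chosen around the latencies that matter.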

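5.3 Log aggregation

# Structured logging with structlog
import structlog

logger = structlog.get_logger()

# user_id, message, and response_time come from the request being handled
logger.info(
    "agent_request",
    user_id=user_id,
    message=message,
    response_time=response_time,
    status="success"
)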
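If adding structlog is not an option, a similar one-JSON-object-per-line format can be approximated with the standard `logging` module. The sketch below is an assumption-laden alternative, not the structlog API: `JsonFormatter` and the `fields` key passed through `extra` are conventions invented for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Key-value pairs passed as extra={"fields": {...}} become top-level keys.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, ensure_ascii=False)
```

A handler configured with `handler.setFormatter(JsonFormatter())` would then accept calls such as `logger.info("agent_request", extra={"fields": {"user_id": "u1", "status": "success"}})`, and a log collector can index the fields without parsing free text.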

6. Security

6.1 API authentication

import os

import jwt  # PyJWT
from fastapi import Depends, HTTPException, Security
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

# Signing secret; the JWT_SECRET_KEY variable name is an assumption,
# use whatever your deployment provides
SECRET_KEY = os.environ["JWT_SECRET_KEY"]

security = HTTPBearer()

def verify_token(credentials: HTTPAuthorizationCredentials = Security(security)):
    """Verify the JWT token."""
    try:
        token = credentials.credentials
        payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
        return payload
    except jwt.ExpiredSignatureError:
        raise HTTPException(status_code=401, detail="Token has expired")
    except jwt.InvalidTokenError:
        raise HTTPException(status_code=401, detail="Invalid token")

@app.post("/chat")
async def chat(
    request: ChatRequest,
    user=Depends(verify_token)
):
    """Chat endpoint that requires authentication."""
    # Handle the request
    pass
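To make clear what verifying an HS256 token involves, here is a standard-library sketch of signing and verification: recompute the HMAC-SHA256 signature over the header and payload, compare it in constant time, then check the `exp` claim. It is meant for understanding only; `sign_hs256` and `verify_hs256` are hypothetical names, and a maintained JWT library remains the safer choice in production.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, secret: str) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(sig)}"


def verify_hs256(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the payload, or raise ValueError if the token is malformed, forged, or expired."""
    header, body, sig = token.split(".")  # raises ValueError unless there are three parts
    expected = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("invalid signature")
    if json.loads(_b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(body))
    exp = payload.get("exp")
    if exp is not None and (time.time() if now is None else now) >= exp:
        raise ValueError("token expired")
    return payload
```

Pinning the accepted algorithm, as the `algorithms=["HS256"]` argument does in the endpoint code, matters because a verifier that trusts the token's own `alg` header can be tricked into skipping the signature check.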

6.2 Rate limiting

from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

# Rate-limit per client IP address
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/chat")
@limiter.limit("10/minute")
async def chat(request: Request, chat_request: ChatRequest):
    """Allow at most 10 requests per minute per client."""
    pass
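A limit such as "10/minute" can be enforced in several ways. The sketch below shows one of them, a sliding-window log that remembers the timestamps of recent requests per client. It is an illustration of the concept and not necessarily the strategy slowapi applies internally; `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within any `window_seconds` span."""

    def __init__(self, limit: int = 10, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

This in-memory version only works within a single process. With several gunicorn workers or several containers, the counters need shared storage such as Redis, otherwise each process enforces its own separate limit.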

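6.3 Input validation

# Pydantic v2 (the version pinned in requirements.txt) replaces @validator with @field_validator
from pydantic import BaseModel, field_validator

class ChatRequest(BaseModel):
    message: str
    
    @field_validator('message')
    @classmethod
    def validate_message(cls, v: str) -> str:
        if len(v) > 1000:
            raise ValueError('Message must not exceed 1000 characters')
        if not v.strip():
            raise ValueError('Message must not be empty')
        return v.strip()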

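7. Deployment workflow

7.1 Building the image

# Build the Docker image
docker build -t ai-agent:latest .

# Push it to the image registry
docker tag ai-agent:latest your-registry/ai-agent:latest
docker push your-registry/ai-agent:latest

7.2 Starting the services

# Start everything with docker-compose
docker-compose up -d

# Follow the logs
docker-compose logs -f agent

# Check status
docker-compose ps

7.3 Rolling update

# Pull the new image
docker-compose pull agent

# Rebuild and recreate only the agent service. With a single container this
# causes a brief interruption; zero downtime requires updating the instances
# behind Nginx one at a time.
docker-compose up -d --no-deps --build agent

# Roll back: retag the previous image, then recreate the service without rebuilding
# (assumes the service runs image: ai-agent:latest, as in section 8.1)
docker tag ai-agent:previous-version ai-agent:latest
docker-compose up -d --no-deps --no-build agent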
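A rolling update only avoids downtime when instances are replaced one at a time and each one is confirmed healthy before the next is touched. The control flow can be sketched as follows. The `update` and `is_healthy` callables are placeholders for whatever the deployment actually does (for example recreating a container and polling `/health`); `rolling_update` is a hypothetical helper.

```python
from typing import Callable, List


def rolling_update(
    instances: List[str],
    update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Update instances one at a time; stop at the first one that fails its health check."""
    updated: List[str] = []
    for name in instances:
        update(name)
        if not is_healthy(name):
            # The remaining instances still run the old version and keep serving traffic.
            raise RuntimeError(
                f"{name} is unhealthy after the update; {len(updated)} instance(s) already updated"
            )
        updated.append(name)
    return updated
```

Stopping at the first failure keeps the blast radius to a single instance, which is also the point at which a rollback of that instance would be triggered.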

8. High availability

8.1 Running multiple instances

# docker-compose.yml

services:
  agent1:
    image: ai-agent:latest
    container_name: agent-1
    # ... configuration

  agent2:
    image: ai-agent:latest
    container_name: agent-2
    # ... configuration

  agent3:
    image: ai-agent:latest
    container_name: agent-3
    # ... configuration

8.2 Health checks and automatic restart

services:
  agent:
    # ...
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    restart: unless-stopped

8.3 Data backup

#!/bin/bash
# backup.sh

# Back up PostgreSQL
docker exec ai-agent-postgres pg_dump -U user agentdb > backup_$(date +%Y%m%d).sql

# Back up Redis
docker exec ai-agent-redis redis-cli SAVE
docker cp ai-agent-redis:/data/dump.rdb backup_redis_$(date +%Y%m%d).rdb

# Upload to S3
aws s3 cp backup_$(date +%Y%m%d).sql s3://your-bucket/backups/

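9. Performance optimization

9.1 Connection pooling

import os

from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# The same DATABASE_URL variable that docker-compose.yml passes to the agent service
DATABASE_URL = os.environ["DATABASE_URL"]

# Each worker process gets its own pool, so total connections scale with the worker count
engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=20,       # connections kept open in the pool
    max_overflow=10,    # extra connections allowed during bursts
    pool_timeout=30,    # seconds to wait for a free connection
    pool_recycle=3600   # replace connections older than one hour
)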
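The effect of `pool_size` and `pool_timeout` is easier to see in a stripped-down model: a fixed set of connections sits in a queue, and a caller that cannot get one within the timeout fails instead of waiting forever. The sketch below is a simplification and not SQLAlchemy's implementation (it creates all connections eagerly and has no overflow); `BoundedPool` is a hypothetical name.

```python
import queue
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class BoundedPool(Generic[T]):
    """Hand out at most `pool_size` connections, waiting up to `timeout` seconds for one."""

    def __init__(self, factory: Callable[[], T], pool_size: int = 20, timeout: float = 30.0):
        self._timeout = timeout
        self._idle: "queue.Queue[T]" = queue.Queue(maxsize=pool_size)
        for _ in range(pool_size):
            self._idle.put(factory())

    def acquire(self) -> T:
        try:
            return self._idle.get(timeout=self._timeout)
        except queue.Empty:
            raise TimeoutError("no connection became available within the timeout") from None

    def release(self, conn: T) -> None:
        # Return the connection so another caller can reuse it.
        self._idle.put(conn)
```

A timeout error here is a useful signal: it usually means requests hold connections too long or the pool is too small for the traffic, both of which are better surfaced than hidden behind unbounded waiting.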

9.2 Caching strategy

import redis
import json
import hashlib

redis_client = redis.Redis(host='redis', port=6379, decode_responses=True)

def cached_agent_response(message: str, ttl: int = 3600):
    """Return the Agent's response, served from the cache when possible."""
    cache_key = f"agent:{hashlib.md5(message.encode()).hexdigest()}"
    
    # Try the cache first
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)
    
    # Call the Agent
    response = agent.run(message)
    
    # Store the response in the cache with an expiry
    redis_client.setex(cache_key, ttl, json.dumps(response))
    
    return response
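The cache-aside pattern used above (look up, compute on a miss, store with an expiry) can be tried without a Redis server by swapping in an in-memory dictionary. The sketch below keeps the same key scheme as the article; `TTLCache` and the `compute` callable are hypothetical stand-ins for Redis and `agent.run`.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache-aside helper with per-entry expiry."""

    def __init__(self, ttl: float = 3600.0):
        self.ttl = ttl
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def key_for(message: str) -> str:
        # Same key scheme as the Redis example: a hash of the message text.
        return f"agent:{hashlib.md5(message.encode()).hexdigest()}"

    def get_or_compute(
        self, message: str, compute: Callable[[str], str], now: Optional[float] = None
    ) -> str:
        now = time.monotonic() if now is None else now
        key = self.key_for(message)
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, still fresh
        value = compute(message)
        self._store[key] = (now + self.ttl, value)
        return value
```

Keying on the exact message text means only identical questions hit the cache, and a cached answer ignores conversation history, so this pattern fits stateless FAQ-style queries better than multi-turn sessions.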

9.3 Asynchronous processing

from celery import Celery

celery_app = Celery('agent', broker='redis://redis:6379/0')

@celery_app.task
def process_message_async(message: str, user_id: str):
    """Process a message in a background worker."""
    response = agent.run(message)
    # Save the result to the database
    # Notify the user
    return response

@app.post("/chat/async")
async def chat_async(request: ChatRequest):
    """Asynchronous chat endpoint."""
    task = process_message_async.delay(request.message, request.user_id)
    return {"task_id": task.id, "status": "processing"}
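The asynchronous endpoint returns a `task_id`, so the client needs some way to ask for the result later. The submit-then-poll flow can be sketched with a thread pool standing in for Celery. This is an in-process illustration only, with `JobQueue` as a hypothetical name; unlike Celery, jobs here do not survive a restart and are not shared across processes.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class JobQueue:
    """Run jobs in background threads and expose their status by task id."""

    def __init__(self, max_workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        task_id = uuid.uuid4().hex
        self._jobs[task_id] = self._pool.submit(fn, *args)
        return task_id

    def status(self, task_id: str) -> dict:
        future = self._jobs.get(task_id)
        if future is None:
            return {"task_id": task_id, "status": "unknown"}
        if not future.done():
            return {"task_id": task_id, "status": "processing"}
        if future.exception() is not None:
            return {"task_id": task_id, "status": "failed", "error": str(future.exception())}
        return {"task_id": task_id, "status": "done", "result": future.result()}
```

A matching status endpoint (for example a GET route that takes the task id) would return the dictionary from `status`, and the client polls it until the status is no longer "processing".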

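10. Troubleshooting

10.1 Common problems

Problem 1: The container fails to start

# View the logs
docker-compose logs agent

# Validate the configuration
docker-compose config

# Open a shell inside the container (works only while the container is running)
docker-compose exec agent /bin/bash

Problem 2: Slow API responses

# Inspect Prometheus metrics
curl http://localhost:8000/metrics

# Inspect database connections
docker-compose exec postgres psql -U user -d agentdb -c "SELECT * FROM pg_stat_activity;"

# Inspect Redis status
docker-compose exec redis redis-cli INFO

Problem 3: Memory leak

# Monitor memory usage
docker stats

# Restart the service
docker-compose restart agent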
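Restarting the service only hides a leak. To find where memory is actually growing, the standard-library `tracemalloc` module can compare allocation snapshots taken before and after a workload. The sketch below is one possible approach; `top_allocation_growth` is a hypothetical helper, and `workload` would be something like a batch of calls to the Agent.

```python
import tracemalloc
from typing import Callable, List


def top_allocation_growth(workload: Callable[[], None], limit: int = 5) -> List[str]:
    """Run the workload and report the source lines whose allocated memory changed the most."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        workload()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to returns the differences sorted from largest to smallest.
    stats = after.compare_to(before, "lineno")
    return [str(stat) for stat in stats[:limit]]
```

Each returned entry names a file and line together with the size difference, which usually points at the structure that keeps growing, such as an unbounded conversation history or cache.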

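11. Summary

11.1 Key takeaways

  1. Containerization is standard practice: Docker + docker-compose
  2. Expose the Agent as an API service: FastAPI provides a RESTful API
  3. Load balancing: Nginx distributes requests
  4. Monitoring and alerting: Prometheus + Grafana
  5. Security first: authentication, rate limiting, input validation

11.2 Production checklist

  • Docker image builds successfully
  • Environment variables are configured correctly
  • Database connection works
  • Redis cache is available
  • Health check endpoint responds
  • Logs are written as expected
  • Prometheus metrics are reachable
  • SSL certificate is configured
  • Backup strategy is in place
  • Monitoring alerts are configured

11.3 Next steps

  • 📊 Part 4: Performance optimization and cost control (how to cut costs while improving performance)

Follow me so you don't miss upcoming articles! 🚀


If you found this helpful, please like, bookmark, and share! ❤️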
