OpenClaw Reports "Connection error" When Connecting to a Local vLLM: Troubleshooting and Fix

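Summary: After the vLLM service was migrated, OpenClaw 2026.2.x began returning Connection error. Troubleshooting showed that models.json takes precedence over openclaw.json, so requests were still directed at the old server. strace confirmed that requests were being blocked locally, because repeated failures had triggered the cooldown protection mechanism. The fix is to update the IP in ~/.openclaw/agents/main/agent/models.json and restart the gateway. It is recommended to use…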

伟大的大威

伟大的大威 · Published 2026-02-28 17:18:45

Applies to: OpenClaw 2026.2.x
Symptom: after changing the vLLM service address, every message returns Connection error, and the gateway log keeps printing embedded run agent end: isError=true error=Connection error.

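1. Background

After migrating the vLLM inference service from the old server (10.10.85.220) to the new server (192.168.1.221), I updated baseUrl in ~/.openclaw/openclaw.json and restarted the gateway. From then on, every message, whatever its content, failed with: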

warn  agent/embedded  embedded run agent end: isError=true error=Connection error.

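2. Troubleshooting

2.1 Confirm the gateway is healthy

openclaw gateway status

The output showed the gateway running normally, RPC probe OK, and the correct listen address, which rules out the gateway itself.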

2.2 Confirm the vLLM service is reachable

curl http://192.168.1.221:8000/v1/models
curl -X POST http://192.168.1.221:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-oss-120b","messages":[{"role":"user","content":"hello"}],"max_tokens":10}'

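Both endpoints responded normally, which rules out a network problem.

2.3 Test directly from Node.js (to rule out curl-specific differences)

node -e "
const http = require('http');
const req = http.request({
  hostname: '192.168.1.221', port: 8000,
  path: '/v1/completions', method: 'POST',
  headers: {'Content-Type': 'application/json'}
}, res => {
  // Print the status code and the response body as it arrives
  console.log('status:', res.statusCode);
  res.on('data', c => console.log(c.toString()));
});
// Surface connection failures instead of crashing on an unhandled error event
req.on('error', e => console.error('request failed:', e.message));
req.write(JSON.stringify({model:'gpt-oss-120b', prompt:'hi', max_tokens:5}));
req.end();
"

The direct call from Node.js also succeeded, which rules out a runtime problem.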

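2.4 Capture system calls with strace

# Take the first matching PID in case pgrep returns several
PID=$(pgrep -f "openclaw.*gateway" | head -n 1)
# Trace network syscalls for 10 seconds (attaching may require root)
timeout 10 strace -f -p "$PID" -e trace=network 2>&1 | grep -E "connect|192\.168"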

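Key finding: in those 10 seconds there was not a single TCP connect call to 192.168.1.221:8000. All network activity was mDNS traffic (port 5353).

This means OpenClaw never sent the HTTP request at all; the request was blocked locally.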
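To see at a glance where a process is actually connecting, the strace output can be reduced to a list of destinations. The following is a minimal sketch, assuming strace's usual rendering of IPv4 connect calls (sin_port=htons(...), sin_addr=inet_addr("...")). The function name summarize_connects is hypothetical and is not part of strace or OpenClaw.

```shell
# summarize_connects: read strace output on stdin and print each IPv4
# connect() destination as "count ip:port", most frequent first.
# Hypothetical helper; assumes strace's usual connect() line format.
# Lines for other syscalls (for example sendto) are ignored.
summarize_connects() {
  sed -n 's/.*connect(.*sin_port=htons(\([0-9]*\)), sin_addr=inet_addr("\([0-9.]*\)").*/\2:\1/p' \
    | sort | uniq -c | sort -rn
}
```

It could be fed from a bounded trace, for example `timeout 10 strace -f -p "$PID" -e trace=connect 2>&1 | summarize_connects`. An empty result while messages are being sent would match the finding above: no connect call is being made at all.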

2.5 Check provider status and the overriding config file

openclaw models status

The output included:

source=models.json: ~/.openclaw/agents/main/agent/models.json

Inspect that file:

cat ~/.openclaw/agents/main/agent/models.json

This revealed the root cause:

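{
  "providers": {
    "vllm": {
      "baseUrl": "http://10.10.85.220:8000/v1",  // ← old IP!
      ...
    }
  }
}

models.json takes precedence over openclaw.json, so every request was actually sent to the decommissioned old server. After repeated failures, OpenClaw's cooldown protection kicked in, and from then on requests were blocked locally without any network request being sent.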

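3. Root Cause Analysis

OpenClaw has several layers of configuration files. In order of precedence from highest to lowest:

| Config file | Path | Description |
| --- | --- | --- |
| `models.json` | `~/.openclaw/agents/main/agent/models.json` | Highest precedence; agent-level model configuration |
| `auth-profiles.json` | `~/.openclaw/agents/main/agent/auth-profiles.json` | Authentication credentials |
| `openclaw.json` | `~/.openclaw/openclaw.json` | Global configuration |

If you edit only openclaw.json, the stale settings in models.json remain in effect and override the global configuration.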
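Because the effective value can come from more than one file, it helps to list every baseUrl under the config directory side by side. The following is a minimal sketch. The function name list_base_urls is hypothetical, and the sketch only assumes that the setting appears as a "baseUrl" key in JSON files, as in the models.json excerpt shown earlier.

```shell
# list_base_urls [DIR]: print every "baseUrl" entry found in *.json files
# under DIR as file:line:content, so stale overrides stand out.
# Hypothetical helper; DIR defaults to ~/.openclaw.
# The body runs in a subshell so that no variables leak into the caller.
list_base_urls() (
  dir=${1:-"$HOME/.openclaw"}
  # /dev/null is passed as an extra file so grep always prints the file name.
  find "$dir" -type f -name '*.json' -exec grep -n '"baseUrl"' /dev/null {} +
)
```

Running `list_base_urls` after a migration should show the same host in every line; any file still carrying the old address is a candidate override.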

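In addition, OpenClaw has a built-in cooldown protection mechanism: after several consecutive failed requests to a provider, it temporarily blocks all requests to that provider. The request is never sent and Connection error is returned immediately. This is a safeguard, but it can easily send troubleshooting in the wrong direction.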
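The behavior described here resembles a generic circuit breaker. The sketch below illustrates that general pattern only; it is not OpenClaw's implementation, and the threshold, the cooldown length, and all names (guarded_call, MAX_FAILURES, COOLDOWN_SECS) are assumptions made for illustration.

```shell
# Illustrative cooldown (circuit-breaker) guard. All names and numbers are
# hypothetical; this is not OpenClaw's code.
MAX_FAILURES=3       # assumed: consecutive failures before cooldown starts
COOLDOWN_SECS=60     # assumed: how long requests stay blocked
failures=0
blocked_until=0

# guarded_call NOW CMD [ARGS...]
# NOW is the current time in seconds (passed in to keep the sketch testable).
# Runs CMD unless the provider is cooling down; returns non-zero on error.
guarded_call() {
  now=$1
  shift
  if [ "$now" -lt "$blocked_until" ]; then
    # Short-circuit: CMD is never run, so no network traffic is produced.
    echo "Connection error (cooldown, request not sent)"
    return 1
  fi
  if "$@"; then
    failures=0
    return 0
  fi
  failures=$((failures + 1))
  if [ "$failures" -ge "$MAX_FAILURES" ]; then
    blocked_until=$((now + COOLDOWN_SECS))
  fi
  echo "Connection error"
  return 1
}
```

The sketch prints a distinct message for the short-circuit case to make the two paths visible. In the incident described in this article, both cases surfaced as the same Connection error, which is why a syscall trace was needed to tell them apart.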

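4. Solutions

Option 1: Edit models.json directly (recommended)

# Replace the old IP with the new one (dots escaped so they match literally)
sed -i 's/10\.10\.85\.220/192.168.1.221/g' \
  ~/.openclaw/agents/main/agent/models.json

# Confirm the change
grep baseUrl ~/.openclaw/agents/main/agent/models.json

# Restart the gateway to clear the cooldown state
openclaw gateway restart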

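Option 2: Update all config files together

After a server migration, check and update every config file that references the IP:

# Search everywhere for the old IP (-F treats the pattern as a fixed string)
grep -rF "10.10.85.220" ~/.openclaw/

# Replace in bulk
find ~/.openclaw/ -name "*.json" -exec \
  sed -i 's/10\.10\.85\.220/192.168.1.221/g' {} \;

openclaw gateway restart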

Option 3: Reset the cooldown

If you only want to lift the cooldown without changing the configuration:

openclaw models status --reset-cooldown
# Or simply restart the gateway
openclaw gateway restart

5. Verify the Fix

After restarting, watch the logs; requests should now succeed:

openclaw logs --follow 2>&1 | grep -E "agent end|agent start"

Example of normal output:

debug  agent/embedded  embedded run agent start: runId=xxx
info   agent/embedded  embedded run agent end: runId=xxx isError=false
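
When many runs scroll past, a tally is easier to read than raw lines. The following is a minimal sketch that relies only on the log format shown in this article (the text "embedded run agent end" and the isError= field). The function name count_run_results is hypothetical.

```shell
# count_run_results: read gateway log lines on stdin and report how many
# "agent end" entries succeeded or failed, based on the isError= field.
# Hypothetical helper; relies only on the log format shown in this article.
count_run_results() {
  awk '
    /embedded run agent end/ && /isError=false/ { ok++ }
    /embedded run agent end/ && /isError=true/  { bad++ }
    END { printf "ok=%d failed=%d\n", ok, bad }
  '
}
```

Since `openclaw logs --follow` does not end on its own, the input needs to be bounded, for example `timeout 60 openclaw logs --follow 2>&1 | count_run_results`.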

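6. Prevention

After migrating a server, besides updating openclaw.json, also check:
- ~/.openclaw/agents/main/agent/models.json
- ~/.openclaw/agents/*/agent/models.json (multi-agent setups)

Change configuration through the CLI rather than editing files by hand, so that overriding files are not missed:
```
openclaw config set models.providers.vllm.baseUrl http://192.168.1.221:8000/v1
```
This command updates all related configuration together.

When you hit Connection error, troubleshoot in this order:
- openclaw gateway status → confirm the gateway is healthy
- curl <vllm_url>/v1/models → confirm the backend is reachable
- openclaw models status → check provider status and the config source
- Check whether the config file named in the source= field still contains stale settings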
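
The order above can be wrapped in a small script. The following is a sketch under stated assumptions: it assumes `openclaw gateway status` exits non-zero when the gateway is unhealthy, which this article does not confirm, and the helper names run_step and diagnose are hypothetical. The last step prints the provider status so that the source= line can be read by eye.

```shell
# run_step LABEL CMD [ARGS...]: run one check quietly and print OK or FAILED.
run_step() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "[OK]     $label"
  else
    echo "[FAILED] $label"
    return 1
  fi
}

# diagnose VLLM_URL: walk through the checks in the order listed above and
# stop at the first failure. VLLM_URL is the server root, for example
# http://192.168.1.221:8000 (without the /v1 suffix).
# Assumption: "openclaw gateway status" signals problems via its exit code.
diagnose() {
  url=$1
  run_step "gateway status" openclaw gateway status || return 1
  run_step "backend reachable" curl -fsS --max-time 5 "$url/v1/models" || return 1
  # Show provider status; inspect the source= line for the effective file.
  openclaw models status
}
```

A typical call would be `diagnose http://192.168.1.221:8000`. If both checks pass and the error persists, the file named in the source= line is the next place to look.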

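7. Summary

| Symptom | Cause | Fix |
| --- | --- | --- |
| `Connection error` keeps appearing | Cooldown blocking requests + old IP in models.json | Fix models.json + restart the gateway |
| strace shows no TCP connect | Request never sent; blocked locally | Same as above |
| Direct calls via curl/Node.js work | Underlying network is fine; the problem is inside OpenClaw | Same as above |
| Editing openclaw.json has no effect | Overridden by the higher-precedence models.json | Edit the correct config file |

🦞 Key lesson: OpenClaw has a multi-layer configuration override mechanism, and openclaw.json is not the only config that takes effect. When troubleshooting, always use openclaw models status to confirm which config source is actually in effect (the source= field).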
