Implementing custom slash commands for nanobot: a full retrospective, from pitfalls to launch

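This post documents how the /soul and /mem slash commands were implemented in the nanobot AI Agent framework. /mem went smoothly because it reuses the existing memory-archiving logic, while /soul ran into LLM API compatibility problems along the way. The first version was a no-op because it never called the LLM; the second tried to force a tool call and found that the Moonshot API rejects tool_choice when thinking (deep reasoning) mode is enabled. The final version sidesteps the problem by having /soul generate plain text with no tools at all, while /mem relies on the tool_choice fallback that already existed. Together, the two commands let users explicitly trigger persona-file writes and memory archiving.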

Takoony · Published 2026-03-19 11:17:17

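This post walks through the full process of implementing two slash commands, /soul and /mem, in an LLM-based AI Agent framework (nanobot). It covers the requirements analysis, the three iterations, the LLM API compatibility pitfalls, and the technical reasoning behind the final design.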

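1. Background and requirements

nanobot is an asynchronous AI Agent framework written in Python, built around the AgentLoop class. It receives user messages from a message bus, builds the context, calls the LLM, executes tool calls, and returns the response.

The framework already has a memory system: when the context window is nearly full, MemoryConsolidator automatically archives older conversation turns into MEMORY.md (long-term memory) and HISTORY.md (a searchable log). This process is triggered passively, though: it runs only once the token count exceeds a threshold.

The framework also has a SOUL.md file that defines the AI's persona (personality, values, communication style), but the only way to change it is to edit the file by hand.

The requirements are straightforward:

Command	Purpose
`/soul <description>`	New slash command that lets the user force a write to the persona file from within the conversation
`/mem [text]`	New slash command that lets the user explicitly trigger memory archiving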

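2. Analysis of the original code

Before changing anything, look at the original slash-command handling (inside the _process_message method):

# Original code: slash-command matching
cmd = msg.content.strip().lower()
if cmd == "/new":
    ...  # handle /new
if cmd == "/help":
    ...  # handle /help

Key point: matching uses == (exact match). /new works, but "/new extra text" is not recognized as a command and is sent to the LLM as an ordinary message.

That design is fine for /new and /help, which take no arguments. For /soul and /mem, it is the first pitfall.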

3. Version 1: /mem works first time, /soul stumbles

3.1 /mem, version 1 (correct)

# /mem, version 1
if cmd == "/mem":
    snapshot = session.messages[session.last_consolidated:]
    if not snapshot:
        return OutboundMessage(
            channel=msg.channel, chat_id=msg.chat_id,
            content="No new messages to consolidate into memory.",
        )
    ok = False
    lock = self.memory_consolidator.get_lock(session.key)
    async with lock:
        ok = await self.memory_consolidator.archive_messages(snapshot)
        if ok:
            session.last_consolidated = len(session.messages)
            self.sessions.save(session)
    return OutboundMessage(
        channel=msg.channel, chat_id=msg.chat_id,
        content="Memory consolidation triggered." if ok else "Memory consolidation failed.",
    )

Why did /mem work on the first try? Because it reuses the existing MemoryConsolidator.archive_messages() method. That method calls MemoryStore.consolidate() internally, and consolidate() already handles the compatibility problems of the LLM call:

# Compatibility handling that already exists in memory.py (excerpt)
async def consolidate(self, messages, provider, model):
    # ... building chat_messages from messages is elided in this excerpt ...
    forced = {"type": "function", "function": {"name": "save_memory"}}
    response = await provider.chat_with_retry(
        messages=chat_messages,
        tools=_SAVE_MEMORY_TOOL,
        model=model,
        tool_choice=forced,  # force a call to the save_memory tool
    )
    # If tool_choice is not supported, fall back to "auto"
    if response.finish_reason == "error" and _is_tool_choice_unsupported(response.content):
        response = await provider.chat_with_retry(
            messages=chat_messages,
            tools=_SAVE_MEMORY_TOOL,
            model=model,
            tool_choice="auto",  # fallback
        )

/mem stands on the shoulders of giants: MemoryConsolidator had already done all the dirty work.

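3.2 /soul, version 1 (failed: a no-op)

# /soul, version 1: completely wrong
if cmd == "/soul":
    soul_path = self.workspace / "SOUL.md"
    if soul_path.exists():
        content = soul_path.read_text(encoding="utf-8")
        soul_path.write_text(content, encoding="utf-8")  # reads the file and writes the same content straight back!
        return OutboundMessage(
            channel=msg.channel, chat_id=msg.chat_id,
            content=f"Soul file reloaded: {soul_path}",
        )

This code has three problems:

It reads the file and writes back identical content, so it is a no-op.
It never calls the LLM, so it cannot distill a persona from the conversation.
The reply says "reloaded", but ContextBuilder._load_bootstrap_files() re-reads the file on every build_messages call, so there is no cache that could need reloading.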

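4. Version 2: /soul stumbles over tool_choice

Version 2 followed the pattern of MemoryStore.consolidate(), which sits behind /mem: define a dedicated tool (save_soul) and use tool_choice to force the LLM to call it.

# /soul, version 2: the tool_choice approach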
_SAVE_SOUL_TOOL = [{
    "type": "function",
    "function": {
        "name": "save_soul",
        "description": "Save the synthesized personality to SOUL.md.",
        "parameters": {
            "type": "object",
            "properties": {
                "soul_content": {
                    "type": "string",
                    "description": "Full SOUL.md content in markdown format.",
                }
            },
            "required": ["soul_content"],
        },
    },
}]

async def _handle_soul(self, msg, session):
    # ... build the prompt ...
    response = await self.provider.chat_with_retry(
        messages=chat_messages,
        tools=self._SAVE_SOUL_TOOL,
        model=self.model,
        tool_choice={"type": "function", "function": {"name": "save_soul"}},
    )
    # Parse the tool call result and write it to the file

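4.1 The pitfall: Moonshot's API rejects tool_choice combined with thinking

Testing after deployment failed immediately:

litellm.BadRequestError: MoonshotException - tool_choice 'specified' is incompatible with thinking enabled

When thinking (deep reasoning) mode is enabled, the Moonshot (Kimi) API does not allow tool_choice to force a specific tool. This is an LLM provider compatibility problem.

4.2 Why doesn't /mem have this problem?

The difference lies in the call chain:

/mem → archive_messages() → MemoryStore.consolidate()
                                    ↓
                            chat_with_retry() → _safe_chat() → chat()
                                    ↓
                            existing tool_choice fallback logic ✅

/soul (version 2) → calls chat_with_retry() directly
                        ↓
                 tool_choice error → _safe_chat() swallows the exception
                        ↓
                 returns a response with finish_reason="error"

MemoryStore.consolidate() already handles the case where tool_choice is not supported. Version 2 of /soul called chat_with_retry() on its own, and although it added fallback logic too, it ran into a deeper problem.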

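4.3 How _safe_chat swallows exceptions

chat_with_retry wraps the actual LLM call in _safe_chat:

# providers/base.py
import asyncio

async def _safe_chat(self, **kw):
    try:
        return await self.chat(**kw)
    except asyncio.CancelledError:
        raise  # the only exception allowed to propagate
    except Exception as e:
        return LLMResponse(finish_reason="error", content=str(e))

Every exception, BadRequestError included, is converted into LLMResponse(finish_reason="error"). That means:

The except Exception as e: branch in version 2 can never be triggered.
Errors can only be detected by checking response.finish_reason == "error".

Version 2 did also add an _is_tool_choice_unsupported check. However, chat_with_retry's internal retry logic first decides that this error is not transient. For a non-transient error it retries once with images stripped, and if the request contains no images it simply returns the error response. In practice the call never reached the fallback branch we had written.

That is why version 2 of /soul still failed despite its two layers of protection.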
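
The exception-swallowing behavior is easy to reproduce in isolation. The following is a minimal sketch, not the framework's code: LLMResponse is a simplified stand-in, and FakeProvider and call_with_try_except are hypothetical names introduced for illustration. It shows that a try/except around the call never fires, and that only the finish_reason check sees the failure.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class LLMResponse:
    # Simplified stand-in for the framework's response type (assumption).
    finish_reason: str
    content: str = ""


class FakeProvider:
    """Hypothetical provider whose chat() always rejects the request."""

    async def chat(self, **kw):
        raise ValueError("tool_choice 'specified' is incompatible with thinking enabled")

    async def _safe_chat(self, **kw):
        # Same shape as the wrapper shown above: exceptions become error responses.
        try:
            return await self.chat(**kw)
        except asyncio.CancelledError:
            raise
        except Exception as e:
            return LLMResponse(finish_reason="error", content=str(e))


async def call_with_try_except(provider):
    """Report which branch actually observed the failure."""
    try:
        response = await provider._safe_chat(messages=[])
    except Exception:
        return "except branch"  # dead code: _safe_chat lets nothing propagate
    if response.finish_reason == "error":
        return "finish_reason check"
    return "success"


result = asyncio.run(call_with_try_except(FakeProvider()))
```

Here result ends up as "finish_reason check", which is the only place a caller of such a wrapper can react to the error.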

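5. Version 3: a completely different approach

5.1 The change in thinking

tool_choice has compatibility problems on some providers, and what /soul needs is essentially just "have the LLM generate a piece of text, then write it to a file". So why use a tool call at all?

Let the LLM output plain text directly and pass no tools parameter, which avoids the compatibility problem entirely.

# /soul, version 3: plain-text generation, the final design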
_SOUL_CONSOLIDATION_PROMPT = (
    "You are a personality editor for an AI assistant. "
    "Given the current SOUL.md and user instructions, produce an updated SOUL.md. "
    "Output ONLY the new SOUL.md content in markdown, nothing else. "
    "Keep sections: Personality, Values, Communication Style. "
    "Merge new traits with existing ones. Do not wrap in code fences."
)

async def _handle_soul(self, msg, session):
    extra_text = msg.content.strip()[5:].strip()  # drop the "/soul" prefix
    # ... read the current SOUL.md (soul_path) and build prompt ...

    response = await self.provider.chat_with_retry(
        messages=[
            {"role": "system", "content": self._SOUL_CONSOLIDATION_PROMPT},
            {"role": "user", "content": prompt},
        ],
        model=self.model,
        # Note: no tools parameter and no tool_choice
    )

    content = (response.content or "").strip()
    # Defensive handling: strip a code fence the LLM may have wrapped around the output
    if content.startswith("```") and content.endswith("```"):
        content = content.split("\n", 1)[-1].rsplit("\n", 1)[0].strip()

    soul_path.write_text(content, encoding="utf-8")
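
The post-processing step can also be written as small standalone helpers, which makes it easy to test without an LLM. This is a sketch under assumptions: strip_code_fence and is_usable_soul are hypothetical names, the fence check is a line-based variant of the one above, and the minimum length of 10 comes from the validation step listed in the architecture diagram later in this post.

```python
FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def strip_code_fence(text: str) -> str:
    """Remove one surrounding code fence if present; otherwise just strip whitespace."""
    content = text.strip()
    lines = content.splitlines()
    # Require an opening and a closing line, so a lone fence is left untouched.
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        return "\n".join(lines[1:-1]).strip()
    return content


def is_usable_soul(content: str, min_length: int = 10) -> bool:
    """Reject empty or suspiciously short output before overwriting SOUL.md."""
    return len(content.strip()) >= min_length
```

A handler would call strip_code_fence on response.content and write the file only when is_usable_soul returns True, so that a failed or empty generation cannot wipe the existing persona.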

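5.2 Why does /mem use a tool call while /soul uses plain text?

The choice is not arbitrary; it follows from the shape of the data:

Dimension	/mem	/soul
Output structure	Two fields: `history_entry` + `memory_update`	A single block of text
Why a tool call is or is not used	Needs structured output so each field can be written to its own file	Only needs one piece of markdown text
Compatibility risk	A mature fallback mechanism already exists	Newly written code, not worth reinventing the wheel

MemoryStore.consolidate(), which sits behind /mem, needs the LLM to return both history_entry (written to HISTORY.md) and memory_update (written to MEMORY.md), so the structured output of a tool call is a reasonable fit. /soul only needs one block of markdown, and plain-text output is entirely sufficient.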
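
To make the "two fields" point concrete, here is a rough sketch of what parsing such a tool call might look like. The field names history_entry and memory_update come from the post; parse_save_memory_args is a hypothetical helper, and the assumption that arguments arrive as a JSON string follows the OpenAI-style function-calling format rather than nanobot's actual code.

```python
import json


def parse_save_memory_args(arguments):
    """Return (history_entry, memory_update) from a save_memory tool call.

    Accepts either a JSON string or an already-decoded dict.
    """
    if isinstance(arguments, str):
        arguments = json.loads(arguments)
    if not isinstance(arguments, dict):
        raise ValueError("tool arguments must decode to a JSON object")
    missing = [key for key in ("history_entry", "memory_update") if key not in arguments]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Each field goes to a different file: HISTORY.md and MEMORY.md respectively.
    return arguments["history_entry"], arguments["memory_update"]
```

With plain-text output the two values would have to be separated by some ad hoc delimiter, which is exactly the parsing burden a tool call removes. For /soul there is nothing to separate, so that benefit does not apply.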

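6. Another subtle bug: argument matching for slash commands

6.1 Symptom

When testing "/mem 我是AI助手" ("I am an AI assistant"), nanobot replied "OK, I've remembered that", yet MEMORY.md was empty.

6.2 Root cause

cmd = msg.content.strip().lower()
if cmd == "/mem":  # exact match!
    ...  # handle /mem

When the user types "/mem 我是AI助手", cmd is "/mem 我是ai助手", which is not equal to "/mem", so the command does not match. The message goes through the normal conversation flow, and the LLM takes it upon itself to reply "saved" even though nothing was written.

This is a typical combination of an LLM hallucination and a command-routing bug: the code failed to route the command, and the LLM "helpfully" pretended to have executed it.

6.3 The fix

# After the fix: match both the bare command and the command with arguments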
if cmd == "/soul" or cmd.startswith("/soul "):
    return await self._handle_soul(msg, session)

if cmd == "/mem" or cmd.startswith("/mem "):
    return await self._handle_mem(msg, session)

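The handler then extracts the argument:

from datetime import datetime

async def _handle_mem(self, msg, session):
    extra_text = msg.content.strip()[4:].strip()  # drop the "/mem" prefix
    snapshot = list(session.messages[session.last_consolidated:])
    if extra_text:
        # Add the text supplied by the user to the archive as an extra message
        snapshot.append({
            "role": "user",
            "content": extra_text,
            "timestamp": datetime.now().isoformat(),
        })

Note [4:] rather than [5:]: /mem is 4 characters and /soul is 5. Hard-coded offsets like these are not elegant, but they are reliable enough for something as simple as slash commands.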
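
If the hard-coded offsets ever become a maintenance burden, one alternative is to split the command name from its argument in a single helper. The sketch below is not what the post's code does; split_command is a hypothetical name, and it relies only on the standard str.split behavior.

```python
def split_command(text: str):
    """Split "/cmd args" into (command, args); return (None, text) for non-commands.

    The command name is lowercased, while the argument keeps its original case.
    """
    stripped = text.strip()
    if not stripped.startswith("/"):
        return None, stripped
    parts = stripped.split(maxsplit=1)
    command = parts[0].lower()
    args = parts[1].strip() if len(parts) > 1 else ""
    return command, args
```

Routing can then compare the command name exactly (command == "/mem"), which handles both the bare and the parameterized form, and no handler needs to know how long its own prefix is.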

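7. Comparison of the three versions

Version	/soul approach	/mem approach	Command matching	Result
V1	Read the file → write it back unchanged (no-op)	Calls archive_messages	`==` exact match	/soul does nothing; /mem does not accept arguments
V2	tool_choice forces a call to save_soul	Same as V1 + fixed the scope of the ok variable	`startswith` prefix match	/soul fails on Moonshot
V3	Plain-text generation, no tool call	Same as V2 + accepts attached text	Same as V2	Everything passes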

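8. Lessons learned

8.1 Reuse > rewrite

/mem was right the first time because it reuses MemoryConsolidator, a component already proven in production. Version 2 of /soul tried to implement similar tool-call logic on its own and fell into the provider-compatibility pit.

Lesson: when adding a feature to an existing system, look for a reusable component first instead of implementing from scratch.

8.2 Know your abstraction layers

You would not know that _safe_chat swallows exceptions without reading the source. The except Exception branch in version 2 was written with confidence, but because the underlying exception handling was not understood, that code could never run.

Lesson: when calling a framework's internal methods, you must understand how they propagate exceptions instead of adding try-except on instinct.

8.3 Choose the right LLM interaction mode

Not every scenario needs a tool call. When the output is a single block of text, plain-text generation is simpler and more compatible. Tool calls suit scenarios that need structured output (multiple fields, precise parsing).

Scenario	Recommended approach
Output is structured data (JSON, multiple fields)	Tool call
Output is a single block of text	Plain-text generation
Must work across multiple providers	Plain-text generation (safer)

8.4 Things to watch for when designing slash commands

Argument matching: if a command accepts arguments, an exact == match is not enough; use startswith plus a space check.
Argument extraction: when extracting arguments with a fixed offset (such as [5:]), mind the length of the command name.
Empty arguments: /soul with no argument should have a sensible default behavior (show the current content) rather than fail.
Defending against LLM hallucination: when command routing fails, the message goes through the normal conversation flow and the LLM may "pretend" it executed the command. This kind of bug is hard to spot because the reply the user sees looks correct.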
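
One way to defend against the last point is to refuse to forward anything that looks like a command but is not recognized. The following sketch is a suggestion, not part of the post's implementation: route and KNOWN_COMMANDS are hypothetical names, and the command set simply lists the four commands mentioned in this post.

```python
KNOWN_COMMANDS = {"/new", "/help", "/soul", "/mem"}


def route(text: str) -> str:
    """Classify a message as "command:<name>", "unknown-command", or "chat"."""
    stripped = text.strip()
    if not stripped.startswith("/"):
        return "chat"
    name = stripped.split(maxsplit=1)[0].lower()
    if name in KNOWN_COMMANDS:
        return f"command:{name}"
    # Looks like a command but is not one: answer with an error
    # instead of letting the LLM improvise a reply.
    return "unknown-command"
```

The trade-off is that ordinary messages that happen to start with a slash, such as a file path, would also be rejected, so a real implementation might want a stricter pattern for what counts as a command.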

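9. Cross-provider compatibility: not just a Moonshot problem

9.1 tool_choice incompatibility is an industry-wide issue

The pit that version 2 of /soul fell into is not unique to Moonshot. The error markers the framework has already accumulated make that clear:

_TOOL_CHOICE_ERROR_MARKERS = (
    "tool_choice",                    # Moonshot and some Chinese domestic models
    "toolchoice",                     # camelCase spelling used by some providers
    "does not support",               # generic "not supported" message
    'should be ["none", "auto"]',     # explicitly rejects the specified mode
)

These four markers cover at least three different error formats, which shows that the developers had already hit this problem with several providers in earlier iterations. It is not one vendor's problem but the current state of the LLM API ecosystem:

Moonshot (Kimi): tool_choice='specified' is not allowed in thinking mode.
Some Chinese domestic models: forcing a specific tool is not supported at all; tool_choice only accepts "none" and "auto".
Some open-source model deployments: when requests are forwarded through litellm, the tool_choice format may not be recognized by the underlying model.

The conflict is most visible when a model has "deep thinking" (thinking/reasoning) mode enabled. A plausible explanation is that forcing a tool with tool_choice clashes with how the reasoning chain is generated: the model is supposed to "think it through" before deciding which tool to call, and tool_choice=specified skips that step.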
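
Given those markers, the detection function presumably boils down to a substring check on the error text. The post does not show the body of _is_tool_choice_unsupported, so the following is only a guess at its shape; is_tool_choice_unsupported (without the leading underscore) is a hypothetical stand-in.

```python
_TOOL_CHOICE_ERROR_MARKERS = (
    "tool_choice",
    "toolchoice",
    "does not support",
    'should be ["none", "auto"]',
)


def is_tool_choice_unsupported(content):
    """Guess whether an error message is complaining about tool_choice (sketch)."""
    text = (content or "").lower()  # lowercase so camelCase spellings also match
    return any(marker in text for marker in _TOOL_CHOICE_ERROR_MARKERS)
```

A check like this is deliberately loose. The "does not support" marker in particular would also match unrelated errors, which is acceptable when the only consequence of a false positive is one extra retry with tool_choice="auto".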

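9.2 Provider compatibility of the final design

Command	LLM call style	Moonshot	Minimax	GLM	OpenAI	General compatibility
`/soul`	Plain-text generation (no tools, no tool_choice)	✅	✅	✅	✅	All providers
`/mem`	tool_choice=forced → automatic fallback to auto on failure	✅	✅	✅	✅	Backed by a fallback

In detail:

/soul (plain-text generation): the call passes neither tools nor tool_choice. This is the most basic LLM API call there is: send messages, receive text. Any provider that follows the OpenAI Chat Completions format supports it, so the tool_choice compatibility problem cannot arise.

# The /soul call: a lowest-common-denominator API call
response = await self.provider.chat_with_retry(
    messages=[
        {"role": "system", "content": self._SOUL_CONSOLIDATION_PROMPT},
        {"role": "user", "content": prompt},
    ],
    model=self.model,
    # No tools, no tool_choice, no parameter that could cause trouble
)

/mem (tool call + fallback): the call goes through MemoryStore.consolidate(), which has two levels of fallback:

tool_choice=forced (named tool)
    │
    ├─ success → parse the tool call result ✅
    │
    ├─ failure (_is_tool_choice_unsupported)
    │   └─ fall back to tool_choice="auto" → the LLM decides whether to call the tool
    │       ├─ save_memory was called → parse the result ✅
    │       └─ not called → _fail_or_raw_archive() → raw messages written straight to HISTORY.md ✅
    │
    └─ 3 consecutive failures → _raw_archive() as the last resort, so no data is lost ✅

Even if a provider does not support tool calls at all (some lightweight open-source models, for example), /mem eventually falls through to _raw_archive() and writes the raw messages directly to HISTORY.md. No data is lost; it just has not been summarized by the LLM.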
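
The fallback ladder can be sketched end to end with a scripted provider. Everything below is illustrative: Response, ScriptedProvider and this simplified consolidate are hypothetical, the error check is reduced to a plain substring test, and the three-consecutive-failures counter from the diagram is omitted for brevity.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Response:
    # Simplified stand-in for the framework's response type (assumption).
    finish_reason: str
    content: str = ""
    tool_args: dict = field(default_factory=dict)


class ScriptedProvider:
    """Hypothetical provider that replays a fixed list of responses."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.calls = []  # records the tool_choice used for each call

    async def chat_with_retry(self, *, tool_choice, **kw):
        self.calls.append(tool_choice)
        return self._responses.pop(0)


async def consolidate(provider, messages):
    """Try forced tool_choice, then "auto"; fall back to archiving raw messages."""
    forced = {"type": "function", "function": {"name": "save_memory"}}
    response = await provider.chat_with_retry(messages=messages, tool_choice=forced)
    if response.finish_reason == "error" and "tool_choice" in response.content:
        response = await provider.chat_with_retry(messages=messages, tool_choice="auto")
    if response.finish_reason != "error" and response.tool_args:
        return "summary", response.tool_args
    return "raw", messages  # keep the raw messages rather than lose them
```

The property worth preserving in any real implementation is the last line: whatever the provider does, the function returns something that can be written to disk.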

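9.3 Design takeaway: pick the approach by compatibility risk

When building AI Agent features for a multi-provider environment, the LLM interaction style can be chosen in the following order of preference:

Compatibility risk	Approach	When to use
🟢 No risk	Plain-text generation	Output is a single block of text (e.g. `/soul`)
🟡 Low risk	tool call + `tool_choice="auto"`	Structured output is wanted but not enforced
🟠 Medium risk	tool call + `tool_choice=forced` + fallback	A specific tool must be called (e.g. `/mem`)
🔴 High risk	tool call + `tool_choice=forced` without fallback	Not recommended unless you are certain only one provider will be used

Core principle: if plain text can solve the problem, do not use a tool call. If a tool call is required, it must have a fallback.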

10. Final architecture

User input: "/soul 你是AI助手"
    │
    ▼
_process_message()
    │
    ├─ cmd.startswith("/soul ") → True
    │
    ▼
_handle_soul()
    │
    ├─ extract the argument: "你是AI助手"
    ├─ read the current SOUL.md
    ├─ build the prompt (current persona + user instruction)
    │
    ▼
provider.chat_with_retry()  ← plain-text mode, no tools parameter
    │
    ├─ check finish_reason != "error"
    ├─ strip a possible code fence
    ├─ validate content length >= 10
    │
    ▼
soul_path.write_text()  → SOUL.md updated

User input: "/mem 我是AI助手"
    │
    ▼
_process_message()
    │
    ├─ cmd.startswith("/mem ") → True
    │
    ▼
_handle_mem()
    │
    ├─ extract the argument: "我是AI助手"
    ├─ collect the session's unarchived messages
    ├─ append the argument as an extra user message
    │
    ▼
memory_consolidator.archive_messages()
    │
    ├─ MemoryStore.consolidate()
    │   ├─ tool_choice=forced → save_memory tool call
    │   ├─ (on failure) → fall back to tool_choice=auto
    │   └─ parse history_entry + memory_update
    │
    ├─ write HISTORY.md (append)
    ├─ write MEMORY.md (overwrite)
    │
    ▼
session.last_consolidated updated → prevents archiving the same messages twice
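
The role of session.last_consolidated in the last step can be shown with a tiny model of the session. This is a sketch only: Session here is a two-field stand-in for the real class, and take_snapshot and mark_archived are hypothetical helpers that mirror the slicing and cursor update seen in the /mem handler.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    # Two-field stand-in for the framework's session object (assumption).
    messages: list = field(default_factory=list)
    last_consolidated: int = 0


def take_snapshot(session, extra_text=""):
    """Copy the messages after the cursor, optionally adding the /mem argument."""
    snapshot = list(session.messages[session.last_consolidated:])
    if extra_text:
        # The extra text is archived but never added to session.messages.
        snapshot.append({"role": "user", "content": extra_text})
    return snapshot


def mark_archived(session):
    """Advance the cursor so the same messages are not archived again."""
    session.last_consolidated = len(session.messages)
```

After mark_archived, a second /mem with no argument yields an empty snapshot, which corresponds to the "No new messages to consolidate into memory." reply in the first version of the handler.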

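11. Why do slash commands take effect immediately without entering the context?

This design question is easy to overlook but very important.

11.1 Message isolation for slash commands

None of the slash commands (/new, /help, /soul, /mem) enter the model's conversation context. The reason is the early-return mechanism in _process_message:

async def _process_message(self, msg, session):  # signature simplified for illustration
    cmd = msg.content.strip().lower()

    if cmd == "/new":
        return OutboundMessage(...)                    # ← returns immediately
    if cmd == "/help":
        return OutboundMessage(...)                    # ← returns immediately
    if cmd == "/soul" or cmd.startswith("/soul "):
        return await self._handle_soul(msg, session)   # ← returns immediately
    if cmd == "/mem" or cmd.startswith("/mem "):
        return await self._handle_mem(msg, session)    # ← returns immediately

    # ═══ Everything below runs only for non-command messages ═══

    history = session.get_history()
    messages = self.context.build_messages(       # build the LLM context
        history=history,
        current_message=msg.content,              # the user message enters the context only here
    )
    final_content, _, all_msgs = await self._run_agent_loop(messages)  # call the LLM
    self._save_turn(session, all_msgs, ...)       # save to the session history

Slash commands return before build_messages is reached, so:

The command text is never sent to the LLM.
The command text is never written to session.messages by _save_turn.
The conversation history contains no trace of any slash command.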

11.2 Why is the change visible in the very next turn?

The key is when SOUL.md and MEMORY.md are loaded: they are read on every conversation turn, not once at startup.

Every time the user sends an ordinary message, the call chain is:

User sends "你好"
    │
    ▼
_process_message()
    │
    ├─ not a slash command, continue
    │
    ▼
context.build_messages()
    │
    ▼
context.build_system_prompt()
    │
    ├─ _load_bootstrap_files()
    │   └─ read SOUL.md        ← read from disk every time, always the latest content
    │
    ├─ memory.get_memory_context()
    │   └─ read MEMORY.md      ← read from disk every time, always the latest content
    │
    └─ assemble the system prompt → send to the LLM

The corresponding code:

# context.py: files are re-read on every build
def _load_bootstrap_files(self) -> str:
    parts = []
    for filename in self.BOOTSTRAP_FILES:  # ["AGENTS.md", "SOUL.md", "USER.md", "TOOLS.md"]
        file_path = self.workspace / filename
        if file_path.exists():
            content = file_path.read_text(encoding="utf-8")  # read live, no cache
            parts.append(f"## {filename}\n\n{content}")
    return "\n\n".join(parts)  # joining step assumed; omitted in the original excerpt

# memory.py: likewise read live every time
def get_memory_context(self) -> str:
    long_term = self.read_long_term()  # reads MEMORY.md
    return f"## Long-term Memory\n{long_term}" if long_term else ""

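11.3 The complete data flow

Putting the whole process together:

Step 1: the user sends "/soul 你是AI助手"
    │
    ├─ early return, not added to the session history
    ├─ _handle_soul() calls the LLM to generate the new persona
    └─ writes the SOUL.md file (on disk)

Step 2: the user sends "/mem 记住我叫小明"
    │
    ├─ early return, not added to the session history
    ├─ _handle_mem() calls the LLM to archive the memory
    └─ writes the MEMORY.md file (on disk)

Step 3: the user sends "你好" (an ordinary message)
    │
    ├─ not a command, normal flow
    ├─ build_system_prompt()
    │   ├─ reads SOUL.md   → "你是AI助手..." (written in step 1)
    │   └─ reads MEMORY.md → "用户叫小明..." (written in step 2)
    │
    ├─ the system prompt contains the latest persona and memory
    └─ when replying, the LLM already knows it is an AI assistant and that the user is called Xiaoming

The neat part of this design is that slash commands use the file system as an intermediary, which decouples the two concerns: the command stays out of the context while its effect enters the context. The command itself is a one-off control instruction and should not pollute the conversation history. Its products (SOUL.md and MEMORY.md) are system-level configuration and flow naturally into the system prompt of every turn.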
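
The "file system as intermediary" idea can be demonstrated without any LLM. The sketch below uses a temporary directory as the workspace; build_system_prompt is a simplified, hypothetical stand-in for the framework's method, and placing MEMORY.md directly in the workspace root is an assumption made for brevity.

```python
import tempfile
from pathlib import Path


def build_system_prompt(workspace: Path) -> str:
    """Re-read the persona and memory files from disk on every call (no cache)."""
    parts = []
    for filename in ("SOUL.md", "MEMORY.md"):
        path = workspace / filename
        if path.exists():
            content = path.read_text(encoding="utf-8")
            parts.append(f"## {filename}\n\n{content}")
    return "\n\n".join(parts)


def demo():
    """Return the system prompt before and after a simulated /soul write."""
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        soul = workspace / "SOUL.md"
        soul.write_text("You are formal.", encoding="utf-8")
        before = build_system_prompt(workspace)
        # Simulates what /soul does: overwrite the file and touch nothing else.
        soul.write_text("You are humorous.", encoding="utf-8")
        after = build_system_prompt(workspace)
    return before, after
```

Nothing is invalidated or reloaded between the two calls. The second prompt differs only because the file on disk changed, which is the whole mechanism.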

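11.4 Important: /soul uses a separate LLM call, not the main conversation

This point is easy to confuse and deserves a dedicated explanation.

When the user sends "/soul 你是一个幽默的助手" ("you are a humorous assistant"), _handle_soul makes one LLM call. That call is completely isolated from the main conversation the user is having: it has its own system prompt and its own message list, and it carries no conversation history.

User sends: "/soul 你是一个幽默的助手"
    │
    ▼
Extract the argument: extra_text = "你是一个幽默的助手"
    │
    ▼
Build a separate LLM request (note: not the main conversation's call)
    │
    ├─ system: "You are a personality editor..."      ← dedicated system prompt
    ├─ user: "current SOUL.md + user instruction"     ← only these two, no conversation history
    ├─ no tools parameter                             ← plain-text generation
    └─ no tool_choice parameter                       ← no tool_choice compatibility risk
    │
    ▼
The LLM returns the new SOUL.md content
(this LLM call can see "你是一个幽默的助手" because it is in the prompt)
    │
    ▼
Write the file: workspace/SOUL.md
    │
    ▼
Reply to the user: "Soul updated."
    │
    ▼
This exchange ends: the prompt is not saved to the session, and the LLM context is discarded
    │
    ▼
Next ordinary turn:
    ├─ the conversation history does not contain "/soul 你是一个幽默的助手"
    ├─ but build_system_prompt() re-reads SOUL.md
    └─ the new persona is in effect, and the LLM replies in a humorous style

The isolation is plain to see in the code:

# The LLM call inside _handle_soul
response = await self.provider.chat_with_retry(
    messages=[
        # A separate system prompt, not the main conversation's build_system_prompt()
        {"role": "system", "content": self._SOUL_CONSOLIDATION_PROMPT},
        # A separate user message containing only the current SOUL.md + the user instruction
        {"role": "user", "content": prompt},
    ],
    model=self.model,
    # No tools, no tool_choice
)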

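Compare this with the LLM call of the main conversation:

# The main conversation: these messages are what _run_agent_loop sends to the LLM
initial_messages = self.context.build_messages(
    history=history,              # includes the full conversation history
    current_message=msg.content,  # the current user message
    channel=msg.channel,
    chat_id=msg.chat_id,
)
# build_messages calls build_system_prompt() internally,
# which includes SOUL.md, MEMORY.md, skills and all other context

The difference between the two is clear at a glance:

Dimension	LLM call made by /soul	LLM call of the main conversation
System prompt	The fixed `_SOUL_CONSOLIDATION_PROMPT`	`build_system_prompt()` (includes SOUL.md, MEMORY.md, skills)
Conversation history	None	`session.get_history()`
Tools	None	The full tool registry
Where the result goes	Written to the SOUL.md file	Returned to the user + saved to the session
Lifetime	One-off, discarded after use	Accumulates in the session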