Solving the Gemini 3 Thought Signature Problem with LangChain Hooks


信马堂


信马堂 · Published 2025-11-25 11:30:23

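📋 Summary

Problem: When using OpenRouter + Gemini 3 Pro, LangChain's ChatOpenAI does not support Thought Signature, which causes a 400 error (Function call is missing a thought_signature).

Solution: Hook into LangChain using two complementary techniques, both of which are required:

Inheritance and override: Extend the ChatOpenAI class and override _get_request_payload (to inject the Thought Signature) and _convert_chunk_to_generation_chunk (to extract the Thought Signature)
Monkey Patch: Hook the add_ai_message_chunks function to fix LangChain's message-merging logic, so that thought_signature is not lost when streamed chunks are merged

Why both are required: The overrides handle injection and extraction but cannot fix LangChain's internal merging logic; the Monkey Patch fixes the merging logic but cannot inject or extract anything. Neither works without the other.

In our experience, this is the only workable way to extend LangChain's behavior without modifying its core code.

Core technique: LangChain Hooks, implemented through two complementary mechanisms (inheritance with overrides, and Monkey Patch), so that the Thought Signature is carried through the whole request/response cycle.

Applies to: LangChain projects that use Gemini 3 Pro through OpenRouter.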

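1. Background and Problem

Google recently released the Gemini 3 Pro model. When Gemini 3 Pro is used for multi-turn tool calling, thought_signature must be handled correctly; otherwise the API returns a 400 error.

The latest langchain-google-genai release supports Gemini 3 natively, but our project uses OpenRouter, which exposes an OpenAI-style API. The latest LangChain ChatOpenAI does not support Thought Signature, so calls to Gemini 3 Pro through OpenRouter cannot handle it correctly and fail with a 400 error.

⚠️ Core issues:

Hard requirement: Gemini 3 Pro requires thought_signature to be maintained across a multi-turn conversation in order to keep the context consistent
Interface mismatch: OpenRouter uses an OpenAI-compatible interface rather than Google's native interface, so the support built into langchain-google-genai cannot be used directly
Framework limitation: LangChain's ChatOpenAI (used for OpenAI-compatible interfaces) does not pass along or preserve thought_signature
Solution: We have to handle the Gemini 3 Thought Signature ourselves through LangChain Hooks; otherwise requests fail with a 400 error

💡 This document explains:

Why Thought Signature is needed and how it works
The challenges introduced by OpenRouter
How to solve them with LangChain Hooks
The complete code and best practices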

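2. Thought Signature Core Concepts

2.1 What Is a Thought Signature?

thought_signature is a key mechanism of the Google Gemini API for maintaining the model's "thinking state" across the turns of a conversation. It is an encrypted representation of the model's internal reasoning at the moment it called a tool. Gemini 3 Pro requires thought_signature to be handled and passed back correctly in multi-turn tool-calling scenarios; otherwise the API returns a 400 error.

Key rules:

Must be passed back: With Gemini 3 Pro, the thought signature must be passed back during function calling; otherwise you receive a 400 validation error
Multi-step calls: With consecutive (multi-step) function calls, each function call has its own signature, and all of them must be passed back
Exact position: Each signature must be returned in exactly the position where it was received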
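
The rules above can be turned into a small pre-flight check. The following is a minimal sketch, assuming OpenAI-style message dicts and the function.thought_signature placement described in Section 5.1.5; the helper name find_unsigned_tool_calls is hypothetical and not part of LangChain or the Gemini API.

```python
from typing import Any, Dict, List


def find_unsigned_tool_calls(messages: List[Dict[str, Any]]) -> List[str]:
    """Return the ids of assistant tool calls that carry no thought_signature."""
    missing = []
    for message in messages:
        if message.get("role") != "assistant":
            continue
        for tool_call in message.get("tool_calls") or []:
            function = tool_call.get("function") or {}
            # An empty or absent signature is what triggers the 400 error.
            if not function.get("thought_signature"):
                missing.append(tool_call.get("id", "<no id>"))
    return missing


history = [
    {"role": "user", "content": "Weather in Paris and Rome?"},
    {
        "role": "assistant",
        "tool_calls": [
            {"id": "call_1", "type": "function",
             "function": {"name": "get_weather", "arguments": "{}",
                          "thought_signature": "sig-A"}},
            {"id": "call_2", "type": "function",
             "function": {"name": "get_weather", "arguments": "{}"}},
        ],
    },
]
print(find_unsigned_tool_calls(history))  # ['call_2']
```

Running a check like this just before the request is sent makes a missing signature visible locally instead of as a remote 400 response.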

Reference:

For more details, see the official Google AI documentation on Thought Signatures.

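3. Why a Custom Solution Is Needed

The latest langchain-google-genai release already supports Gemini 3's thought_signature natively (see GitHub Issue #1364), but our project uses OpenRouter as its LLM provider.

Core problem:

OpenRouter exposes an OpenAI-compatible API, and Gemini models are accessed through the ChatOpenAI class from langchain-openai. However, the latest ChatOpenAI does not support Gemini 3's thought_signature, so using it directly results in the error InvalidArgument: 400 Function call is missing a thought_signature.

The native support in langchain-google-genai cannot be used, because requests must go through the OpenRouter API.

We therefore have to implement Thought Signature support ourselves through LangChain Hooks:

Extend the ChatOpenAI class to create ChatOpenRouterGemini3
Hook a LangChain internal function to fix the message-merging logic
Implement injection, extraction, and propagation of the Thought Signature

Comparison of options:

Option	Use case	Status
`langchain-google-genai`	Calling the Google API directly	✅ Officially supported
`langchain-openai` + OpenRouter	Using Gemini through OpenRouter	❌ Not supported
This article's approach	OpenRouter + Gemini 3	✅ Custom implementation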

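4. LangChain Hook Mechanism in Detail (Core Technique)

🔥 This is the core technique of the whole solution!

The heart of this article is solving the Thought Signature problem by hooking into LangChain (including a Monkey Patch). In our experience, this is the only workable way to extend LangChain's behavior without modifying its core code.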

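4.1 How Hook and Monkey Patch Relate

Clarifying the terms:

Hook: A programming pattern that lets you insert custom code at specific points in a program's execution. A Hook is an abstract concept with several possible implementations.
Monkey Patch: One concrete technique for implementing a Hook in Python: replacing a function or method dynamically at runtime.
Override: Another way to implement a Hook: subclass a class and override its methods.

In short:

Hook is the concept, Monkey Patch is an implementation: Hook is the broader design pattern, and Monkey Patch is one specific way to realize it in Python.
Two implementation approaches:
- Approach 1: Override - subclass and override methods (recommended, safer)
- Approach 2: Monkey Patch - replace a function in a module at runtime (use only when necessary)

⚠️ Important distinction:

Our solution uses both approaches:

Override: Override the _get_request_payload and _convert_chunk_to_generation_chunk methods
Monkey Patch: Replace the module-level function add_ai_message_chunks

Both are ways of implementing a Hook, but the mechanisms differ.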

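4.2 Why Hook LangChain?

Root cause:

LangChain's standard implementation does not carry thought_signature through, specifically:

Lost during merging: When add_ai_message_chunks merges streamed messages, the tool_call_chunks.extras field is dropped
Ignored during serialization: LangChain's serialization logic does not handle custom data in the extras field
Not extensible: LangChain's core classes (such as AIMessage) do not allow a thought_signature field to be added directly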

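4.3 Overridable Hook Methods in LangChain

There are two ways to implement a Hook in LangChain:

4.3.1 Approach 1: Override

Subclass a LangChain class and override its methods. This is the most common and the recommended approach. In the ChatOpenAI class, the following methods can be overridden:

📋 Key overridable methods:

Method	When it is called	Purpose	Overridden by us?
`_get_request_payload`	Before the API request is sent	Build and modify the request payload	✅ Yes
`_convert_chunk_to_generation_chunk`	For each chunk during streaming	Convert the streamed response and extract data	✅ Yes
`_create_chat_result`	When the final response is created	Build the ChatResult object	❌ No
`_generate`	On synchronous generation	Handle the synchronous API call	❌ No
`_astream`	On asynchronous streaming	Handle the asynchronous streaming API call	❌ No
`_convert_input`	When input messages are converted	Convert the input into message format	❌ No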

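4.3.2 The Methods We Override

1. _get_request_payload(self, input_, **kwargs)

When it is called: Before every API request, LangChain calls this method to build the request payload.

Purpose: This is where we inject thought_signature into the request.

Our implementation:

Call the parent method first to get the base payload
Detect the active loop to determine which messages need a real signature
Inject the DUMMY signature into history messages and the real signature into active-loop messages
Inject reasoning_details and function.thought_signature
Add include_reasoning=True to extra_body

def _get_request_payload(self, input_, **kwargs):
    """Override: handle thought_signature before the request is sent."""
    payload = super()._get_request_payload(input_, **kwargs)
    
    # thought_signature injection logic
    # ... see Section 5.1.6 for the full implementation ...
    
    return payload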

2. _convert_chunk_to_generation_chunk(self, chunk, default_chunk_class, base_generation_info)

When it is called: During streaming, every chunk returned by the API passes through this method.

Purpose: This is where we extract thought_signature and the reasoning text from the streamed response.

Our implementation:

Accumulate the parts of reasoning_details
Extract the thought_signature from items of type reasoning.encrypted
Extract the reasoning text from items of type reasoning.text
Store the extracted data in AIMessageChunk.additional_kwargs
Do the final cleanup at the finish_reason stage

def _convert_chunk_to_generation_chunk(
    self, chunk: dict, default_chunk_class: type, base_generation_info: Optional[dict]
) -> Optional[ChatGenerationChunk]:
    """Override: handle reasoning_details and multiple tool calls."""
    # Extract and accumulate reasoning_details
    # ... see Section 5.2 for the implementation ...

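4.3.3 Approach 2: Monkey Patch (Runtime Replacement)

Module-level functions inside LangChain cannot be overridden through inheritance; they can only be changed with a Monkey Patch.

📋 The function we Monkey Patch:

Function	Module	When it is called	Purpose
`add_ai_message_chunks`	`langchain_core.messages.ai`	When streamed message chunks are merged	Preserve the `thought_signature` in `tool_call_chunks.extras`

⚠️ Why Monkey Patch add_ai_message_chunks?

When LangChain merges several AIMessageChunk objects, it calls add_ai_message_chunks. This is a module-level function, not a method, so it cannot be overridden through inheritance. In addition, the standard implementation drops the tool_call_chunks.extras field, so the thought_signature we stored there is lost in the merge.

We therefore have to replace this function with a Monkey Patch to make sure thought_signature survives the merge.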

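4.4 Comparing the Two Approaches

4.4.1 How the Two Approaches Appear in Our Implementation

📊 Comparison:

Aspect	Override (Approach 1)	Monkey Patch (Approach 2)
Where it is implemented	Methods overridden in the `ChatOpenRouterGemini3` class	Function replaced at module level
Concrete example	`def _get_request_payload(self, ...)` `def _convert_chunk_to_generation_chunk(self, ...)`	`ai_module.add_ai_message_chunks = _patched_add_ai_message_chunks`
Code location	`openrouter_gemini3.py`, lines 110 and 332	`openrouter_gemini3.py`, lines 502-560
Scope	Affects only `ChatOpenRouterGemini3` instances	Affects every caller of `add_ai_message_chunks`
How the original is invoked	Parent method called through `super()`	Reference to the original function saved and called inside the wrapper
Why this approach	These are methods, so they can be overridden through inheritance, which is safer	This is a module-level function that cannot be overridden, so a Monkey Patch is required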

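💡 Code examples side by side:

Approach 1: Override example

# Inside the ChatOpenRouterGemini3 class
class ChatOpenRouterGemini3(ChatOpenAI):
    def _get_request_payload(self, input_, **kwargs):
        """Override: handle thought_signature before the request is sent."""
        # 1. Call the parent method first
        payload = super()._get_request_payload(input_, **kwargs)
        
        # 2. Add the custom logic
        if "messages" in payload:
            # thought_signature injection goes here
            ...
        
        return payload

Characteristics:

✅ The code lives inside the class, so the structure is clear
✅ Only instances of this class are affected
✅ The parent method is called through super(), preserving backward compatibility
✅ Easy to understand and maintain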

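Approach 2: Monkey Patch example

# Defined at module level
from langchain_core.messages.ai import add_ai_message_chunks as _original_add_ai_message_chunks

def _patched_add_ai_message_chunks(left, *others):
    """Patched version of the function."""
    # 1. Call the original function first
    result = _original_add_ai_message_chunks(left, *others)
    
    # 2. Add the custom logic
    # Collect and restore thought_signature
    # ...
    
    return result

# 3. Replace the function in the module
from langchain_core.messages import ai as ai_module
ai_module.add_ai_message_chunks = _patched_add_ai_message_chunks

Characteristics:

⚠️ Replaces the function at module level, so the effect is global
⚠️ A reference to the original function must be saved
⚠️ Runs at module load time and affects every subsequent call
✅ The only way to change a module-level function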
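
Because a Monkey Patch is global, it helps to be able to undo it, for example in tests. The following is a minimal sketch of a reversible patch; the context manager patched and the object fake_module are hypothetical illustrations (the standard library's unittest.mock.patch.object offers comparable behavior).

```python
import contextlib
import types


@contextlib.contextmanager
def patched(module, name, replacement):
    """Temporarily replace module.<name>; restore the original on exit."""
    original = getattr(module, name)
    setattr(module, name, replacement)
    try:
        # Hand the original to the caller so it can still be invoked.
        yield original
    finally:
        setattr(module, name, original)


# Hypothetical stand-in for a library module with one function.
fake_module = types.SimpleNamespace(combine=lambda a, b: a + b)

with patched(fake_module, "combine", lambda a, b: f"{a}|{b}") as original:
    inside = fake_module.combine("x", "y")   # patched behavior
    untouched = original("x", "y")           # original still reachable

after = fake_module.combine("x", "y")        # original restored
print(inside, untouched, after)  # x|y xy xy
```

The finally clause restores the original even if the code inside the with block raises, which keeps the global side effect bounded.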

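4.4.2 Choosing the Right Hook Approach

💡 Guidelines:

Prefer an override:
- When the target is a method of a class (such as _get_request_payload)
- When only one class's behavior needs to change
- It is safer and easier to maintain
Use a Monkey Patch:
- When the target is a module-level function (such as add_ai_message_chunks)
- When global behavior must change for every caller of that function
- Only when inheritance cannot do the job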

4.4.3 Call Flow of the Hooked Methods

📊 The complete call flow:

User calls agent.ainvoke() or agent.astream()
    ↓
ChatOpenRouterGemini3._get_request_payload()  ← 🔥 Hook 1: inject thought_signature
    ↓
API request is sent to OpenRouter
    ↓
Streamed response chunks are received
    ↓
ChatOpenRouterGemini3._convert_chunk_to_generation_chunk()  ← 🔥 Hook 2: extract thought_signature
    ↓
LangChain merges the chunks
    ↓
add_ai_message_chunks()  ← 🔥 Hook 3 (Monkey Patch): preserve thought_signature
    ↓
Final result is returned to the user

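4.5 Best Practices

💡 Best practices for implementing Hooks:

Save the original function: Always keep a reference to the original so it can be restored or called when needed
Stay backward compatible: The wrapper should call the original function first so existing behavior is preserved
Handle errors: Add appropriate error handling so that a failing Hook does not take down the whole system
Log: Record when and how Hooks are applied to make debugging and monitoring easier
Document: State clearly what each Hook is for and what it affects
Check versions: Check the LangChain version before applying a Hook to make sure it is compatible
Prefer inheritance: Use overrides first and reach for a Monkey Patch only when necessary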
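
Several of these practices (version check, saved original, error handling, logging) can be combined in one small helper. This is a sketch under stated assumptions: apply_hook, make_safe_wrapper, and the version prefixes are hypothetical, and in a real project installed_version would typically come from importlib.metadata.version("langchain-core").

```python
import logging
import types
from typing import Callable, Tuple

logger = logging.getLogger("thought_signature_hook")


def apply_hook(module, name: str, make_wrapper: Callable,
               installed_version: str, tested_prefixes: Tuple[str, ...]) -> bool:
    """Wrap module.<name> only on versions we have tested. Returns True if applied."""
    if not installed_version.startswith(tested_prefixes):
        logger.warning("Skipping hook on %s: untested version %s", name, installed_version)
        return False
    original = getattr(module, name, None)
    if original is None:
        logger.error("Cannot hook %s: attribute not found", name)
        return False
    if getattr(original, "_is_hooked", False):
        return True  # already applied; keep the call idempotent
    wrapper = make_wrapper(original)
    wrapper._is_hooked = True
    wrapper._original = original  # saved so the patch can be reverted
    setattr(module, name, wrapper)
    logger.info("Hook applied to %s", name)
    return True


def make_safe_wrapper(original):
    def wrapper(*args, **kwargs):
        result = original(*args, **kwargs)  # original behavior always runs first
        try:
            return result.upper()           # custom post-processing (illustrative)
        except Exception:
            logger.exception("Hook post-processing failed; returning original result")
            return result
    return wrapper


# Hypothetical stand-in for a library module.
fake = types.SimpleNamespace(join=lambda a, b: a + b)
applied = apply_hook(fake, "join", make_safe_wrapper, "1.0.3", ("1.0.",))
print(applied, fake.join("a", "b"), fake.join(1, 2))  # True AB 3
```

The last call shows the fallback: the post-processing step fails on an integer, the failure is logged, and the caller still receives the original result.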

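5. Complete Implementation

Building on the LangChain Hook mechanism, we implemented full Thought Signature support. This chapter walks through the implementation details.

5.1 Thought Signature Injection

According to the official Google Gemini API documentation, the Thought Signature must be injected into the request payload in two places.

5.1.1 Fixing add_ai_message_chunks (Monkey Patch)

When LangChain merges AIMessageChunk objects, the extras field of tool_call_chunks is lost. We patch this function so that thought_signature is preserved.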

from langchain_core.messages import AIMessageChunk
from langchain_core.messages.ai import add_ai_message_chunks as _original_add_ai_message_chunks

def _patched_add_ai_message_chunks(
    left: AIMessageChunk, *others: AIMessageChunk
) -> AIMessageChunk:
    """Patched add_ai_message_chunks that preserves the extras field of tool_call_chunks."""
    # Call the original function first
    result = _original_add_ai_message_chunks(left, *others)
    
    # Collect the extras of every tool_call_chunk
    all_chunks = [left] + list(others)
    tool_call_extras_map = {}  # {tool_call_id: extras}
    
    for chunk in all_chunks:
        if hasattr(chunk, 'tool_call_chunks') and chunk.tool_call_chunks:
            for tcc in chunk.tool_call_chunks:
                if isinstance(tcc, dict):
                    tcc_id = tcc.get('id')
                    tcc_extras = tcc.get('extras', {})
                    if tcc_id and tcc_extras and 'thought_signature' in tcc_extras:
                        tool_call_extras_map[tcc_id] = tcc_extras.copy()
    
    # Inject them into the merged tool_call_chunks
    if hasattr(result, 'tool_call_chunks') and result.tool_call_chunks:
        for tcc in result.tool_call_chunks:
            if isinstance(tcc, dict):
                tcc_id = tcc.get('id')
                if tcc_id in tool_call_extras_map:
                    if "extras" not in tcc:
                        tcc["extras"] = {}
                    tcc["extras"].update(tool_call_extras_map[tcc_id])
    
    return result

# Apply the monkey patch
from langchain_core.messages import ai as ai_module
ai_module.add_ai_message_chunks = _patched_add_ai_message_chunks

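5.1.2 When and Where to Inject

📋 Injection locations required by the Gemini API:

Message-level reasoning_details: Holds the full reasoning details, including the thought_signature in an item of type reasoning.encrypted
Function-call-level function.thought_signature: The function object of every tool call must contain a thought_signature field

We implement the injection by overriding _get_request_payload. LangChain calls this method before every API request, at a point where the payload is fully built but not yet sent to the server. That makes it the best moment to modify the payload.

Execution flow:

Call the parent method: Call super()._get_request_payload() first to get the base payload built by LangChain
Walk the messages: Iterate over every message in the payload and find those that contain tool calls
Extract the signature: Extract the previously saved thought_signature from the message history
Inject the signature: Inject it in two places: reasoning_details and function.thought_signature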

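5.1.3 Signature Extraction Strategy

According to Google's documentation, each function call in a multi-step sequence may have its own signature. Our extraction strategy follows this priority order:

Extraction priority (highest to lowest):

From tool_calls.extras: The most reliable source, because we store the signature here during streaming
From AIMessage.additional_kwargs['_thought_signature']: The fallback option
From reasoning_details: Used if neither of the first two sources has it
DUMMY signature: If all of the above fail, use a placeholder signature (history messages outside the active loop only)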
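
The priority chain can be sketched as a standalone function over plain dicts. The function name resolve_signature and the placeholder value of DUMMY_THOUGHT_SIGNATURE are assumptions for illustration; the article does not state the real placeholder string.

```python
from typing import Any, Dict, List, Tuple

# Placeholder value assumed for illustration only.
DUMMY_THOUGHT_SIGNATURE = "dummy-signature"


def resolve_signature(tool_calls: List[Dict[str, Any]],
                      additional_kwargs: Dict[str, Any]) -> Tuple[str, str]:
    """Return (signature, source) following the priority order described above."""
    # Priority 1: the extras stored on each tool call during streaming
    for tool_call in tool_calls:
        signature = (tool_call.get("extras") or {}).get("thought_signature")
        if signature:
            return signature, "tool_calls.extras"
    # Priority 2: the message-level backup copy
    signature = additional_kwargs.get("_thought_signature")
    if signature:
        return signature, "additional_kwargs"
    # Priority 3: the encrypted item inside reasoning_details
    for item in additional_kwargs.get("reasoning_details") or []:
        if (isinstance(item, dict)
                and item.get("type") == "reasoning.encrypted"
                and item.get("data")):
            return item["data"], "reasoning_details"
    # Last resort: the placeholder
    return DUMMY_THOUGHT_SIGNATURE, "dummy"


print(resolve_signature(
    [{"id": "call_1", "name": "get_weather", "args": {}}],
    {"reasoning_details": [{"type": "reasoning.encrypted", "data": "sig-R"}]},
))  # ('sig-R', 'reasoning_details')
```

Returning the source alongside the signature is a design choice that makes it easy to log which fallback was used.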

5.1.4 Injecting into reasoning_details

As delivered through OpenRouter, reasoning_details is an array of reasoning items. Each item has the following structure:

{
  "format": "google-gemini-v1",
  "index": 0,
  "type": "reasoning.encrypted",
  "data": ""
}

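Our implementation logic:

Messages inside the active loop: If the original message has reasoning_details, use it as is; if not, build one from the thought_signature
History messages: Do not inject reasoning_details; inject only function.thought_signature (using DUMMY), which reduces token consumption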
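
The two cases above can be expressed as one small function. This is a sketch that mirrors the logic of Section 5.1.6 on plain dicts; the function name reasoning_details_for is hypothetical.

```python
from typing import Any, Dict, List, Optional


def reasoning_details_for(additional_kwargs: Dict[str, Any],
                          signature: Optional[str],
                          in_active_loop: bool) -> Optional[List[Dict[str, Any]]]:
    """Decide what to send as reasoning_details for one assistant message."""
    if not in_active_loop:
        return None  # history messages: skip reasoning_details to save tokens
    existing = additional_kwargs.get("reasoning_details")
    if existing:
        return existing  # reuse what the model originally returned
    if signature:
        # Rebuild a minimal entry in the structure shown above
        return [{
            "format": "google-gemini-v1",
            "index": 0,
            "type": "reasoning.encrypted",
            "data": signature,
        }]
    return None


print(reasoning_details_for({}, "sig-A", in_active_loop=True))
print(reasoning_details_for({}, "sig-A", in_active_loop=False))  # None
```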

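5.1.5 Injecting into function.thought_signature

According to Google's documentation, the function object of every tool call must contain a thought_signature field. This is the field the API validates; if it is missing, the request fails with a 400 error.

⚠️ Key requirements:

According to Google's documentation, each function call in a multi-step sequence may have its own signature. Our implementation:

Prefers per-call signatures: Tries to extract an individual signature from the extras of each tool_call
Falls back to the message-level signature: Uses the message-level signature if no individual one is found
Ensures every function call has a signature: Even when DUMMY is used, every function call gets a thought_signature field

Injection format: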

{
  "role": "assistant",
  "tool_calls": [
    {
      "id": "call_123",
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\":\"Paris\"}",
        "thought_signature": ""  // 🔥 must be injected here
      }
    }
  ],
  "reasoning_details": [  // 🔥 required for messages inside the active loop
    {
      "format": "google-gemini-v1",
      "index": 0,
      "type": "reasoning.encrypted",
      "data": ""
    }
  ]
}
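
A minimal sketch of the per-call injection with message-level fallback, operating on a payload message shaped like the one above. The helper name inject_function_signatures and the id-keyed lookup table are assumptions made for this illustration.

```python
from typing import Any, Dict


def inject_function_signatures(message: Dict[str, Any],
                               per_call_signatures: Dict[str, str],
                               message_signature: str) -> Dict[str, Any]:
    """Set function.thought_signature on every tool call of one payload message."""
    for tool_call in message.get("tool_calls") or []:
        function = tool_call.get("function")
        if function is None:
            continue
        # Prefer the signature recorded for this call id; otherwise fall back
        # to the message-level signature so that no call is left unsigned.
        own = per_call_signatures.get(tool_call.get("id"))
        function["thought_signature"] = own or message_signature
    return message


message = {
    "role": "assistant",
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": "{}"}},
        {"id": "call_2", "type": "function",
         "function": {"name": "get_weather", "arguments": "{}"}},
    ],
}
inject_function_signatures(message, {"call_1": "sig-1"}, "sig-msg")
print([tc["function"]["thought_signature"] for tc in message["tool_calls"]])
# ['sig-1', 'sig-msg']
```

Keying the lookup by call id keeps two calls to the same tool apart, which a lookup by function name would not.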

5.1.6 Full Implementation

The full implementation follows, with detailed comments:

def _get_request_payload(self, input_, **kwargs):
    """Override: handle thought_signature before the request is sent.
    
    According to the Google Gemini API documentation, thought_signature
    must be injected in two places:
    1. Message-level reasoning_details (active loop only)
    2. Function-call-level function.thought_signature (all messages)
    """
    # 1. Call the parent method to get the base payload built by LangChain
    payload = super()._get_request_payload(input_, **kwargs)
    
    if "messages" not in payload:
        return payload
    
    # 2. Get the message objects (used to extract signatures)
    messages_objs = self._convert_input(input_).to_messages()
    
    # 3. Find where the active loop starts (separates history from active messages)
    active_loop_start = self._find_active_loop_start(messages_objs)
    
    # 4. Walk every message and handle thought_signature
    for i, (msg_dict, orig_msg) in enumerate(zip(payload["messages"], messages_objs)):
        # Skip messages without tool calls
        if "tool_calls" not in msg_dict or not msg_dict["tool_calls"]:
            continue
        
        # 5. Extract thought_signature (depends on position and availability)
        message_thought_signature = None
        use_dummy = False
        
        if i < active_loop_start:
            # History message: use the DUMMY signature to save tokens
            message_thought_signature = DUMMY_THOUGHT_SIGNATURE
            use_dummy = True
        else:
            # Active loop: try to extract the real signature
            if isinstance(orig_msg, (AIMessage, AIMessageChunk)):
                # Priority 1: tool_calls.extras (most reliable)
                if hasattr(orig_msg, 'tool_calls') and orig_msg.tool_calls:
                    for tc in orig_msg.tool_calls:
                        if isinstance(tc, dict):
                            extras = tc.get("extras", {})
                            if "thought_signature" in extras:
                                message_thought_signature = extras["thought_signature"]
                                break
                
                # Priority 2: additional_kwargs
                if not message_thought_signature:
                    message_thought_signature = orig_msg.additional_kwargs.get("_thought_signature")
                
                # Priority 3: reasoning_details
                if not message_thought_signature:
                    reasoning_details = orig_msg.additional_kwargs.get("reasoning_details")
                    if reasoning_details:
                        for item in reasoning_details:
                            if isinstance(item, dict) and item.get("type") == "reasoning.encrypted":
                                message_thought_signature = item.get("data")
                                break
                
                # Still nothing: fall back to DUMMY
                if not message_thought_signature:
                    message_thought_signature = DUMMY_THOUGHT_SIGNATURE
                    use_dummy = True
        
        # 6. Inject reasoning_details (active-loop messages only)
        reasoning_details_to_inject = None
        if not use_dummy and isinstance(orig_msg, (AIMessage, AIMessageChunk)):
            # Prefer the reasoning_details from the original message
            reasoning_details_to_inject = orig_msg.additional_kwargs.get("reasoning_details")
            
            # Otherwise build one from the thought_signature
            if not reasoning_details_to_inject and message_thought_signature:
                reasoning_details_to_inject = [{
                    "format": "google-gemini-v1",
                    "index": 0,
                    "type": "reasoning.encrypted",
                    "data": message_thought_signature
                }]
        
        # Inject at message level (active loop only)
        if reasoning_details_to_inject and not use_dummy:
            msg_dict["reasoning_details"] = reasoning_details_to_inject
        
        # 7. Inject function.thought_signature (needed on every message)
        # Every function call must carry a thought_signature
        if "tool_calls" in msg_dict:
            for tc in msg_dict["tool_calls"]:
                if "function" not in tc:
                    continue
                
                # Try to find this tool call's own signature, if it has one
                tool_call_signature = None
                if not use_dummy and isinstance(orig_msg, (AIMessage, AIMessageChunk)):
                    if hasattr(orig_msg, 'tool_calls') and orig_msg.tool_calls:
                        tc_id = tc.get("id")
                        tc_name = tc["function"].get("name")
                        for orig_tc in orig_msg.tool_calls:
                            if not isinstance(orig_tc, dict):
                                continue
                            # Match by id when both sides have one (names repeat
                            # when the same tool is called twice); else by name
                            if tc_id and orig_tc.get("id"):
                                matched = orig_tc.get("id") == tc_id
                            else:
                                orig_tc_name = orig_tc.get("name") or orig_tc.get("function", {}).get("name")
                                matched = orig_tc_name == tc_name
                            if matched:
                                extras = orig_tc.get("extras", {})
                                if "thought_signature" in extras:
                                    tool_call_signature = extras["thought_signature"]
                                    break
                
                # Use the per-call signature if present, else the message-level one
                final_signature = tool_call_signature if tool_call_signature else message_thought_signature
                tc["function"]["thought_signature"] = final_signature
    
    # 8. Add include_reasoning (enables reasoning output)
    model_name = payload.get("model") or getattr(self, "model_name", "")
    if isinstance(model_name, str) and "google/gemini-3" in model_name:
        extra_body = payload.get("extra_body") or {}
        extra_body["include_reasoning"] = True
        payload["extra_body"] = extra_body
    
    return payload

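5.1.7 Key Takeaways

✅ Key points about injection:

Inject in both places: Both reasoning_details and function.thought_signature must carry the signature
Active loop vs. history: Messages inside the active loop use the real signature and the full reasoning_details; history messages use the DUMMY signature and get no reasoning_details
Per-call signatures: Each tool call may have its own signature, in line with the multi-step function-calling requirement in Google's documentation
Extraction priority: Signatures are looked up from the most reliable source to the least, so a real signature is used whenever possible
Reasoning output: include_reasoning=True enables output of the reasoning text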

5.2 Thought Signature Extraction (Streaming)

During streaming, we need to extract thought_signature from reasoning_details:

def _convert_chunk_to_generation_chunk(
    self,
    chunk: dict,
    default_chunk_class: type,
    base_generation_info: Optional[dict],
) -> Optional[ChatGenerationChunk]:
    """Override: handle reasoning_details and multiple tool calls."""
    
    # ... accumulation of reasoning_details elided; that part also defines
    # choice, message_chunk and generation_info used below ...
    
    if finish_reason := choice.get("finish_reason"):
        if isinstance(message_chunk, AIMessageChunk):
            chunk_id = chunk.get("id", "default")
            if chunk_id in self._accumulated_reasoning:
                accumulated = self._accumulated_reasoning[chunk_id]
                
                # 🔥 Extract thought_signature
                thought_sig = None
                for item in accumulated:
                    if isinstance(item, dict) and item.get("type") == "reasoning.encrypted" and "data" in item:
                        thought_sig = item["data"]
                        break
                
                if thought_sig:
                    message_chunk.additional_kwargs["_thought_signature"] = thought_sig
                    
                    # 🔥 Key step: store it in the extras of every tool_call
                    if message_chunk.tool_calls:
                        for tc in message_chunk.tool_calls:
                            if isinstance(tc, dict):
                                if "extras" not in tc:
                                    tc["extras"] = {}
                                tc["extras"]["thought_signature"] = thought_sig
    
    return ChatGenerationChunk(
        message=message_chunk, generation_info=generation_info or None
    )
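
The accumulation step elided above can be sketched independently of LangChain. This example assumes OpenAI-style streaming chunks in which reasoning_details arrives under choices[0].delta; that chunk shape and the class name ReasoningAccumulator are assumptions for illustration, not confirmed details of the author's code.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


class ReasoningAccumulator:
    """Collect reasoning_details per response id until the stream finishes."""

    def __init__(self) -> None:
        self._items: Dict[str, List[Dict[str, Any]]] = defaultdict(list)

    def feed(self, chunk: Dict[str, Any]) -> Optional[str]:
        """Consume one chunk; return the signature once finish_reason appears."""
        response_id = chunk.get("id", "default")
        choice = (chunk.get("choices") or [{}])[0]
        delta = choice.get("delta") or {}
        self._items[response_id].extend(delta.get("reasoning_details") or [])
        if not choice.get("finish_reason"):
            return None
        # Stream finished: drop the buffer and look for the encrypted item.
        for item in self._items.pop(response_id):
            if item.get("type") == "reasoning.encrypted" and item.get("data"):
                return item["data"]
        return None


stream = [
    {"id": "gen-1", "choices": [{"delta": {"reasoning_details": [
        {"type": "reasoning.text", "text": "Need the weather. "}]}}]},
    {"id": "gen-1", "choices": [{"delta": {"reasoning_details": [
        {"type": "reasoning.encrypted", "data": "sig-xyz"}]}}]},
    {"id": "gen-1", "choices": [{"delta": {}, "finish_reason": "tool_calls"}]},
]
accumulator = ReasoningAccumulator()
results = [accumulator.feed(chunk) for chunk in stream]
print(results)  # [None, None, 'sig-xyz']
```

Popping the buffer at finish time keeps the per-response state from growing across requests.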

5.3 Extracting and Displaying Reasoning Text

Gemini 3 Pro can output its reasoning process. We extract the content of reasoning.text items from reasoning_details:

# During streaming: extract reasoning.text
if item_type == "reasoning.text" and isinstance(item, dict):
    reasoning_text = item.get("text", "")
    if reasoning_text:
        # Collect every reasoning.text part from the accumulated reasoning_details
        accumulated_reasoning_text_parts = []
        for acc_item in self._accumulated_reasoning[chunk_id]:
            if isinstance(acc_item, dict) and acc_item.get("type") == "reasoning.text":
                acc_text = acc_item.get("text", "")
                if acc_text:
                    accumulated_reasoning_text_parts.append(acc_text)
        
        # Store the full reasoning text accumulated so far
        if accumulated_reasoning_text_parts:
            full_reasoning_content = "".join(accumulated_reasoning_text_parts)
            message_chunk.additional_kwargs["reasoning_content"] = full_reasoning_content

# At the finish_reason stage: final extraction of all reasoning text
if finish_reason:
    reasoning_text_parts = []
    for item in accumulated:
        if isinstance(item, dict) and item.get("type") == "reasoning.text":
            text = item.get("text", "")
            if text:
                reasoning_text_parts.append(text)
    
    if reasoning_text_parts:
        full_reasoning_content = "".join(reasoning_text_parts)
        message_chunk.additional_kwargs["reasoning_content"] = full_reasoning_content

5.3.1 Front-End Display Optimization

To avoid sending the same reasoning text more than once, we added deduplication logic to handle_chat_model_stream_openrouter:

def handle_chat_model_stream_openrouter(event, starti, result_str, last_content, 
                                        need_output, state, thinking_str, thinking_signature):
    """OpenRouter-specific handle_chat_model_stream with reasoning support.
    
    Note: sse_publish, sse_type and channel_id come from the surrounding
    project code and are not shown in this excerpt.
    """
    chunk = event["data"]["chunk"]
    
    if hasattr(chunk, 'additional_kwargs'):
        additional_kwargs = getattr(chunk, 'additional_kwargs', {})
        if isinstance(additional_kwargs, dict) and 'reasoning_content' in additional_kwargs:
            reasoning_content = additional_kwargs['reasoning_content']
            
            if reasoning_content != "":
                # 🔥 Key fix: deduplicate, sending only the newly added reasoning text
                if '_last_sent_reasoning' not in state:
                    state['_last_sent_reasoning'] = ""
                
                last_sent_reasoning = state['_last_sent_reasoning']
                
                # If the current text extends what was already sent, keep only the new part
                if reasoning_content.startswith(last_sent_reasoning):
                    incremental_reasoning = reasoning_content[len(last_sent_reasoning):]
                else:
                    # No match: probably a new reasoning text (e.g. a new tool call)
                    incremental_reasoning = reasoning_content
                    state['_last_sent_reasoning'] = ""
                
                if incremental_reasoning:
                    # Publish the reasoning content to the front end
                    sse_publish({
                        "message": incremental_reasoning,
                        "type": "thinking"
                    }, sse_type, channel_id)
                    state['_last_sent_reasoning'] = reasoning_content
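
The deduplication rule can be isolated into a pure function, which makes it easy to test without an SSE channel. This is a minimal sketch of the same prefix-based logic; the function name incremental_text and the sample snapshots are made up for illustration.

```python
def incremental_text(current: str, last_sent: str) -> str:
    """Return only the part of `current` that has not been sent yet."""
    if current.startswith(last_sent):
        return current[len(last_sent):]
    # `current` does not extend the previous text: treat it as a new message.
    return current


last_sent = ""
sent = []
# Each snapshot is the full accumulated reasoning text at that moment.
for snapshot in ["Let me", "Let me check", "Let me check", "New thought"]:
    delta = incremental_text(snapshot, last_sent)
    if delta:
        sent.append(delta)
        last_sent = snapshot

print(sent)  # ['Let me', ' check', 'New thought']
```

The third snapshot repeats the second one, so nothing is sent for it; the fourth does not share the prefix and is sent in full.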

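5.4 Monkey Patch: Fixing Message Merging

⚠️ Why is a Monkey Patch needed?

Because LangChain's standard implementation does not carry thought_signature through, we use a Monkey Patch to hook a LangChain internal function and make sure thought_signature is not lost during serialization and merging.

Key point: In our experience, this is the only way to get the Thought Signature through without modifying LangChain's core code.

When LangChain merges AIMessageChunk objects, the extras field of tool_call_chunks is lost. The patched function that preserves thought_signature is shown in Section 5.1.1.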

5.5 Complete Class Skeleton

from typing import Any, Dict, List, Optional

from langchain_core.messages import BaseMessage
from langchain_core.outputs import ChatGenerationChunk
from langchain_openai import ChatOpenAI
from pydantic import PrivateAttr

# get_current_openrouter_key() is a project helper that is not shown here.


class ChatOpenRouterGemini3(ChatOpenAI):
    """
    ChatOpenRouter implementation dedicated to Gemini 3 Pro.
    
    Supports sequential calls to multiple tools and passes
    thought_signature through correctly.
    """
    
    # Local state kept in Pydantic private attributes
    _accumulated_reasoning: Dict[str, List[Dict[str, Any]]] = PrivateAttr(default_factory=dict)
    _latest_reasoning: Optional[List[Dict[str, Any]]] = PrivateAttr(default=None)
    
    def __init__(self, **kwargs):
        """Initialize ChatOpenRouterGemini3."""
        kwargs.setdefault("base_url", "https://openrouter.ai/api/v1")
        kwargs.setdefault("api_key", get_current_openrouter_key())
        
        # 🔥 Fix for JsonPointerException: use a stable name
        model_name = kwargs.get('model', 'unknown')
        safe_model_name = model_name.replace('/', '_').replace('-', '_')
        kwargs.setdefault("name", f"ChatOpenAI_Gemini3_{safe_model_name}")
        
        super().__init__(**kwargs)
    
    def _find_active_loop_start(self, messages: List[BaseMessage]) -> int:
        """Find the index where the active conversation loop starts."""
        # ... implementation not shown in this article (see Section 5.6) ...
    
    def _get_request_payload(self, input_, **kwargs):
        """Override: handle thought_signature before the request is sent."""
        # ... see Section 5.1.6 ...
    
    def _convert_chunk_to_generation_chunk(
        self,
        chunk: dict,
        default_chunk_class: type,
        base_generation_info: Optional[dict],
    ) -> Optional[ChatGenerationChunk]:
        """Override: handle reasoning_details and multiple tool calls."""
        # ... see Section 5.2 ...

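5.6 Supporting Optimizations (Active Loop + Message Truncation)

The Active Loop is the stretch of conversation from the most recent user-initiated message up to the current message. According to the Google Gemini API requirements, only tool calls inside the active loop need a real thought_signature; history messages can use the DUMMY signature as a placeholder, which reduces token consumption.

We implemented active-loop detection and message truncation, so that history messages are cut automatically when the payload grows too large. For the actual code, refer to the project source.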
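
Since the project source is not reproduced here, the following is only a sketch of what active-loop detection and truncation could look like, based on the definition above. It works on plain role/content dicts, uses character count as a stand-in for a token budget, and drops whole earlier turns so tool calls stay paired with their results; all names and the budgeting rule are assumptions, not the author's implementation.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def find_active_loop_start(messages: List[Message]) -> int:
    """Index of the most recent user message (0 if there is none)."""
    for index in range(len(messages) - 1, -1, -1):
        if messages[index].get("role") == "user":
            return index
    return 0


def _size(messages: List[Message]) -> int:
    # Character count as a rough, assumed proxy for token usage.
    return sum(len(str(m.get("content") or "")) for m in messages)


def truncate_history(messages: List[Message], max_chars: int) -> List[Message]:
    """Drop the oldest turns before the active loop until the budget fits."""
    start = find_active_loop_start(messages)
    system = [m for m in messages[:start] if m.get("role") == "system"]
    history = [m for m in messages[:start] if m.get("role") != "system"]
    active = messages[start:]  # never truncated
    while history and _size(system + history + active) > max_chars:
        # Remove one whole turn: the first message plus everything up to
        # (not including) the next user message.
        cut = 1
        while cut < len(history) and history[cut].get("role") != "user":
            cut += 1
        history = history[cut:]
    return system + history + active


conversation = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "old question"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "call_0"}]},
    {"role": "tool", "content": "old result"},
    {"role": "assistant", "content": "old answer"},
    {"role": "user", "content": "new question"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "call_1"}]},
]
print(find_active_loop_start(conversation))  # 5
print([m["role"] for m in truncate_history(conversation, max_chars=30)])
# ['system', 'user', 'assistant']
```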

5.6.1 Configuration Example

Configure the Gemini 3 Pro model in application.yml:

gemini-3:
  model: google/gemini-3-pro-preview
  max_tokens: 40000
  module: esapiens.llms.openrouter_gemini3
  clazz: OpenRouterGemini3
  input_tokens_limit: 524288
  stream_handler: gemini_2_5_pro

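6. Summary

6.1 Key Fixes

✅ Core fixes:

LangChain Hook (core technique): Hook the add_ai_message_chunks function with a Monkey Patch so that thought_signature is not lost during serialization and merging. This is the core of the Thought Signature fix.
Streaming (user experience): Extract and accumulate the reasoning text correctly so the reasoning process can be displayed in real time
Deduplication (performance): Avoid sending the same reasoning text to the front end repeatedly, which improves the user experience
Active Loop + DUMMY signature (supporting optimization): Use the DUMMY signature for history messages to reduce token consumption
Message truncation (supporting measure): Truncate history messages automatically when the payload grows too large, to stay within the token limit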

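6.2 Technical Challenges Solved

🔧 Key technical problems solved:

Thought Signature loss: The LangChain Hook mechanism fixes the loss of thought_signature during message merging
JsonPointerException: Giving each LLM instance a stable name derived from its model name resolves the naming conflict in LangChain's event-stream system
Reasoning text loss: The accumulation logic during streaming, together with deduplication, ensures that the reasoning text is displayed correctly
OpenRouter compatibility: Extending ChatOpenAI and applying Hooks provides full support for the OpenRouter API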

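6.3 Before and After

Metric	Before the fix	After the fix
400 errors	❌ Frequent	✅ Resolved
Thought Signature propagation	❌ Lost	✅ Passed correctly
Reasoning text display	❌ Not shown	✅ Shown in real time
Multiple tool calls	❌ Failed	✅ Succeeded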

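6.4 Best Practices

Recommendations:

Use the dedicated ChatOpenRouterGemini3 class for the Gemini 3 Pro model
Enable logging in production to monitor Hook execution and Thought Signature propagation
Set a sensible message-truncation threshold to stay within the token limit
Check LangChain releases regularly to make sure the Monkey Patch remains compatible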
