A Complete Guide to Thinking Output Formats Across Vendors

Table of Contents

  1. Enabling and Disabling Thinking
  2. How to Call
  3. Native Parsing, with Examples
  4. LangChain's Unified, Standardized Approach
  5. Summary and Comparison

Format Comparison at a Glance

| Feature | Gemini | Deepseek | 豆包 (Doubao) | Claude |
| --- | --- | --- | --- | --- |
| Native thinking field | candidates.content.parts[].thought | `<think>...</think>` inside choices.message.content (also choices.message.reasoning_content) | choices.message.thinking | content[].thinking |
| Native text field | candidates.content.parts[].text | choices.message.content | choices.message.content | content[].text |
| How to enable | include_thoughts=true | choose the reasoner model | choose the thinking model | thinking.type=enabled |
| Temperature | must be 1 | flexible | flexible | flexible |
| Token control | budget_tokens | no direct control | thinking_mode | budget_tokens |

Structural Differences

| Vendor | Structure | Characteristics |
| --- | --- | --- |
| Gemini | candidates[].content.parts[] | nested arrays, flexible |
| Deepseek | choices[].message | OpenAI standard |
| 豆包 | choices[].message | OpenAI standard |
| Claude | content[] | simple and direct |

Enabling and Disabling Thinking

Gemini

Defaults

| Parameter | Default | Notes |
| --- | --- | --- |
| temperature | 0.7 | must be 1 in thinking mode |
| include_thoughts | True | thoughts are output by default (watch out: a real project of ours was burned by this default and quietly paid a lot of extra token cost) |
| Model | gemini-2.5-flash | default thinking model |

Enable

import google.generativeai as genai

# Option 1: choose a thinking model
model = genai.GenerativeModel(
    model_name="gemini-2.0-flash-thinking-exp-01-21",
    generation_config=genai.types.GenerationConfig(
        temperature=1,  # ⚠️ must be 1 (default 0.7)
        include_thoughts=True,  # output thinking content (set explicitly rather than relying on the default)
    ),
)

Disable

# Use a non-thinking model
model = genai.GenerativeModel(
    model_name="gemini-2.0-flash",  # no "thinking" in the name
)

Deepseek

Defaults

| Parameter | Default | Notes |
| --- | --- | --- |
| Model | deepseek-chat | no thinking by default |
| temperature | 0.7 | flexible |
| top_p | 1.0 | default |

Enable

from openai import OpenAI

client = OpenAI(
    api_key="your-key",
    base_url="https://api.deepseek.com"
)

# Use the reasoner model to enable thinking (default is deepseek-chat)
response = client.chat.completions.create(
    model="deepseek-reasoner",  # ✓ thinking enabled
    messages=[{"role": "user", "content": "Your prompt"}],
)

Disable

# Use the chat model to disable thinking (the default)
response = client.chat.completions.create(
    model="deepseek-chat",  # ✗ no thinking (default)
    messages=[{"role": "user", "content": "Your prompt"}],
)

豆包

Defaults

| Parameter | Default | Notes |
| --- | --- | --- |
| Model | doubao-seed-1.6-flash | non-thinking model by default |
| thinking_mode | auto | decides automatically whether to think |
| temperature | 0.7 | flexible |

Enable

from volcenginesdkarkruntime import Ark

client = Ark(api_key="your-key")

# Option 1: choose a thinking model (default is doubao-seed-1.6-flash)
response = client.chat.completions.create(
    model="doubao-seed-1.6-thinking-250715",
    messages=[{"role": "user", "content": "Your prompt"}],
)

# Option 2: adaptive mode (thinking_mode defaults to "auto")
response = client.chat.completions.create(
    model="doubao-seed-1.6-flash",
    messages=[{"role": "user", "content": "Your prompt"}],
    extra_body={"thinking_mode": "auto"},  # decide automatically whether to think (the default)
)

Disable

# Explicitly disable thinking
response = client.chat.completions.create(
    model="doubao-seed-1.6-thinking-250715",
    messages=[{"role": "user", "content": "Your prompt"}],
    extra_body={"thinking_mode": "non-thinking"},  # explicitly off
)

Claude

Defaults

| Parameter | Default | Notes |
| --- | --- | --- |
| thinking.type | unset | thinking disabled by default |
| thinking.budget_tokens | unlimited | no cap when unset |
| temperature | 1.0 | default |
| max_tokens | 4096 | default |

Enable

import anthropic

client = anthropic.Anthropic(api_key="your-key")

response = client.messages.create(
    model="claude-opus-4-1",
    max_tokens=16000,
    thinking={
        "type": "enabled",  # enable thinking (unset by default)
        "budget_tokens": 10000,  # thinking token budget (unlimited by default)
    },
    messages=[{"role": "user", "content": "Your prompt"}],
)

Disable

# Omitting the thinking parameter disables it (the default)
response = client.messages.create(
    model="claude-opus-4-1",
    max_tokens=16000,
    messages=[{"role": "user", "content": "Your prompt"}],
)

How to Call

Calling Gemini

import google.generativeai as genai

genai.configure(api_key="your-key")

model = genai.GenerativeModel(
    model_name="gemini-2.0-flash-thinking-exp-01-21",
    generation_config=genai.types.GenerationConfig(
        temperature=1,
        include_thoughts=True,
    ),
)

# Call
response = model.generate_content("Solve: 2+2=?")

# Get the response
print(response.text)

Calling Deepseek

from openai import OpenAI

client = OpenAI(
    api_key="your-key",
    base_url="https://api.deepseek.com"
)

# Call
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "user", "content": "Solve: 2+2=?"}
    ],
)

# Get the response
print(response.choices[0].message.content)

Calling 豆包

from volcenginesdkarkruntime import Ark

client = Ark(api_key="your-key")

# Call
response = client.chat.completions.create(
    model="doubao-seed-1.6-thinking-250715",
    messages=[
        {"role": "user", "content": "Solve: 2+2=?"}
    ],
)

# Get the response
print(response.choices[0].message.content)

Calling Claude

import anthropic

client = anthropic.Anthropic(api_key="your-key")

# Call
response = client.messages.create(
    model="claude-opus-4-1",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[
        {"role": "user", "content": "Solve: 2+2=?"}
    ],
)

# Get the response (a list of content blocks)
print(response.content)

Native Parsing, with Examples

Gemini Native Parsing

Response format

{
  "candidates": [{
    "content": {
      "parts": [
        {"thought": "Let me calculate 2+2..."},
        {"text": "The answer is 4"}
      ]
    }
  }]
}

Parsing code

import google.generativeai as genai

genai.configure(api_key="your-key")
model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp-01-21")

response = model.generate_content(
    "Solve: 2+2=?",
    generation_config=genai.types.GenerationConfig(
        temperature=1,
        include_thoughts=True,
    ),
)

# Parse the thinking and the answer
for part in response.candidates[0].content.parts:
    if getattr(part, "thought", None):  # getattr: SDK parts may define both attributes
        print(f"[Thinking] {part.thought}")
    elif getattr(part, "text", None):
        print(f"[Answer] {part.text}")

Sample output

[Thinking] Let me calculate 2+2. This is a simple arithmetic problem. 2 plus 2 equals 4.
[Answer] The answer is 4

Deepseek Native Parsing

Response format

{
  "choices": [{
    "message": {
      "content": "<think>\nLet me calculate 2+2. This is simple arithmetic.\n</think>\n\nThe answer is 4",
      "reasoning_content": "Let me calculate 2+2. This is simple arithmetic."
    }
  }]
}

Parsing code

from openai import OpenAI

client = OpenAI(
    api_key="your-key",
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Solve: 2+2=?"}],
)

message = response.choices[0].message

# Option 1: use reasoning_content
thinking = message.reasoning_content
print(f"[Thinking] {thinking}")

# Option 2: extract from content
content = message.content
if "<think>" in content and "</think>" in content:
    start = content.index("<think>") + 7
    end = content.index("</think>")
    thinking = content[start:end].strip()
    answer = content[end + 8:].strip()
    print(f"[Thinking] {thinking}")
    print(f"[Answer] {answer}")

Sample output

[Thinking] Let me calculate 2+2. This is simple arithmetic.
[Answer] The answer is 4
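Index arithmetic over literal tag lengths (the `+ 7` and `+ 8` in the extraction code) is brittle. A regex over the same `<think>` layout is a safer alternative; this is a minimal sketch, and the `split_think_tags` helper is our own, not part of any SDK:

```python
import re

def split_think_tags(content: str) -> tuple[str, str]:
    """Split Deepseek-style content into (thinking, answer)."""
    match = re.search(r"<think>(.*?)</think>", content, flags=re.DOTALL)
    if match is None:
        # No thinking block: the whole string is the answer
        return "", content.strip()
    return match.group(1).strip(), content[match.end():].strip()

# Canned response string in the format shown above
raw = "<think>\nLet me calculate 2+2. This is simple arithmetic.\n</think>\n\nThe answer is 4"
thinking, answer = split_think_tags(raw)
print(f"[Thinking] {thinking}")
print(f"[Answer] {answer}")
```

The regex also degrades gracefully when a response arrives without `<think>` tags, which the index-based version does not handle.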

豆包 Native Parsing

Response format

{
  "choices": [{
    "message": {
      "content": "The answer is 4",
      "thinking": "Let me calculate 2+2. This is simple arithmetic."
    },
    "thinking_mode": "thinking"
  }]
}

Parsing code

from volcenginesdkarkruntime import Ark

client = Ark(api_key="your-key")

response = client.chat.completions.create(
    model="doubao-seed-1.6-thinking-250715",
    messages=[{"role": "user", "content": "Solve: 2+2=?"}],
)

message = response.choices[0].message

# Access thinking and content directly
thinking = message.thinking
answer = message.content

print(f"[Thinking] {thinking}")
print(f"[Answer] {answer}")

Sample output

[Thinking] Let me calculate 2+2. This is simple arithmetic.
[Answer] The answer is 4

Claude Native Parsing

Response format

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me calculate 2+2. This is simple arithmetic."
    },
    {
      "type": "text",
      "text": "The answer is 4"
    }
  ]
}

Parsing code

import anthropic

client = anthropic.Anthropic(api_key="your-key")

response = client.messages.create(
    model="claude-opus-4-1",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Solve: 2+2=?"}],
)

# Iterate over the content blocks
for block in response.content:
    if block.type == "thinking":
        print(f"[Thinking] {block.thinking}")
    elif block.type == "text":
        print(f"[Answer] {block.text}")

Sample output

[Thinking] Let me calculate 2+2. This is simple arithmetic.
[Answer] The answer is 4

LangChain's Unified, Standardized Approach

Enabling standardized output

import os

# Set this environment variable to enable standardized output
os.environ["LC_OUTPUT_VERSION"] = "v1"

Unified parsing interface

No matter which vendor you use, parsing is exactly the same:

# Every vendor is parsed the same way
for block in response.content_blocks:
    if block.type in ["thinking", "reasoning"]:
        # thinking: Claude
        # reasoning: Gemini, Deepseek, 豆包
        # getattr: a block usually carries only one of the two fields
        thinking = getattr(block, "thinking", None) or getattr(block, "reasoning", None)
        print(f"[Thinking] {thinking}")
    elif block.type == "text":
        answer = block.text
        print(f"[Answer] {answer}")
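The loop can also be wrapped into a small reusable helper. A sketch with stand-in blocks built from `SimpleNamespace` (the `split_blocks` helper is our own, not part of LangChain):

```python
from types import SimpleNamespace

def split_blocks(blocks) -> tuple[str, str]:
    """Collect (thinking, answer) from standardized content blocks.

    getattr() guards against blocks that define only one of the
    two reasoning fields.
    """
    thinking_parts, text_parts = [], []
    for block in blocks:
        if block.type in ("thinking", "reasoning"):
            value = getattr(block, "thinking", None) or getattr(block, "reasoning", None)
            if value:
                thinking_parts.append(value)
        elif block.type == "text":
            text_parts.append(block.text)
    return "\n".join(thinking_parts), "\n".join(text_parts)

# Stand-in blocks shaped like a standardized response
blocks = [
    SimpleNamespace(type="reasoning", reasoning="Let me calculate 2+2."),
    SimpleNamespace(type="text", text="The answer is 4"),
]
thinking, answer = split_blocks(blocks)
print(f"[Thinking] {thinking}")
print(f"[Answer] {answer}")
```

With a real model, `split_blocks(response.content_blocks)` replaces the stand-in list.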

Gemini + LangChain

import os
from langchain_google_genai import ChatGoogleGenerativeAI

os.environ["LC_OUTPUT_VERSION"] = "v1"

model = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash-thinking-exp-01-21",
    temperature=1,
)

response = model.invoke("Solve: 2+2=?")

# Unified parsing
for block in response.content_blocks:
    if block.type == "reasoning":
        print(f"[Thinking] {block.reasoning}")
    elif block.type == "text":
        print(f"[Answer] {block.text}")

Deepseek + LangChain

import os
from langchain_openai import ChatOpenAI

os.environ["LC_OUTPUT_VERSION"] = "v1"

model = ChatOpenAI(
    model="deepseek-reasoner",
    openai_api_key="your-key",
    openai_api_base="https://api.deepseek.com/v1",
)

response = model.invoke("Solve: 2+2=?")

# Unified parsing
for block in response.content_blocks:
    if block.type == "reasoning":
        print(f"[Thinking] {block.reasoning}")
    elif block.type == "text":
        print(f"[Answer] {block.text}")

豆包 + LangChain

import os
from langchain_community.chat_models import ChatVolcano

os.environ["LC_OUTPUT_VERSION"] = "v1"

model = ChatVolcano(
    model="doubao-seed-1.6-thinking-250715",
    volcengine_api_key="your-key",
)

response = model.invoke("Solve: 2+2=?")

# Unified parsing
for block in response.content_blocks:
    if block.type == "reasoning":
        print(f"[Thinking] {block.reasoning}")
    elif block.type == "text":
        print(f"[Answer] {block.text}")

Claude + LangChain

import os
from langchain_anthropic import ChatAnthropic

os.environ["LC_OUTPUT_VERSION"] = "v1"

model = ChatAnthropic(
    model="claude-opus-4-1",
    thinking={"type": "enabled", "budget_tokens": 10000}
)

response = model.invoke("Solve: 2+2=?")

# Unified parsing
for block in response.content_blocks:
    if block.type == "thinking":
        print(f"[Thinking] {block.thinking}")
    elif block.type == "text":
        print(f"[Answer] {block.text}")

Summary and Comparison

Enable/Disable Comparison

| Vendor | Enable | Disable |
| --- | --- | --- |
| Gemini | choose a thinking model + include_thoughts=true | choose a non-thinking model |
| Deepseek | choose the reasoner model | choose the chat model |
| 豆包 | choose a thinking model, or thinking_mode="auto" | thinking_mode="non-thinking" |
| Claude | set thinking.type="enabled" | omit the thinking parameter |
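For code that switches vendors at runtime, the comparison can be captured as a plain mapping from vendor to the request kwargs that toggle thinking. A sketch reusing the model names and parameter shapes from the earlier examples (the `THINKING_TOGGLE` mapping is our own convention, not any vendor's API):

```python
# One entry per vendor: the request kwargs that turn thinking on or off.
THINKING_TOGGLE = {
    "gemini": {
        "on": {"model": "gemini-2.0-flash-thinking-exp-01-21",
               "include_thoughts": True, "temperature": 1},
        "off": {"model": "gemini-2.0-flash"},
    },
    "deepseek": {
        "on": {"model": "deepseek-reasoner"},
        "off": {"model": "deepseek-chat"},
    },
    "doubao": {
        "on": {"model": "doubao-seed-1.6-thinking-250715"},
        "off": {"model": "doubao-seed-1.6-thinking-250715",
                "extra_body": {"thinking_mode": "non-thinking"}},
    },
    "claude": {
        "on": {"model": "claude-opus-4-1",
               "thinking": {"type": "enabled", "budget_tokens": 10000}},
        "off": {"model": "claude-opus-4-1"},  # simply omit the thinking parameter
    },
}
```

A caller can then splat `THINKING_TOGGLE[vendor]["on"]` into the appropriate SDK's create call.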

Native Format Comparison

| Vendor | Format | Characteristics |
| --- | --- | --- |
| Gemini | parts[].thought | a thought summary, not the full process |
| Deepseek | `<think>...</think>` XML-style tags | full thinking process, OpenAI-SDK compatible |
| 豆包 | message.thinking | direct field, supports adaptive mode |
| Claude | content[].thinking | content-block array, native support |
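All four native shapes can be collapsed by a single helper. A minimal sketch over plain dicts mirroring the response formats shown earlier (real SDK objects expose the same fields as attributes; the `normalize` function is our own illustration, not a vendor API):

```python
def normalize(vendor: str, resp: dict) -> tuple[str, str]:
    """Reduce a vendor's native response dict to (thinking, answer)."""
    if vendor == "gemini":
        parts = resp["candidates"][0]["content"]["parts"]
        thinking = next((p["thought"] for p in parts if "thought" in p), "")
        answer = next((p["text"] for p in parts if "text" in p), "")
    elif vendor == "deepseek":
        msg = resp["choices"][0]["message"]
        thinking = msg.get("reasoning_content", "")
        # Drop any <think>...</think> prefix from the visible content
        answer = msg["content"].split("</think>")[-1].strip()
    elif vendor == "doubao":
        msg = resp["choices"][0]["message"]
        thinking, answer = msg.get("thinking", ""), msg["content"]
    elif vendor == "claude":
        blocks = resp["content"]
        thinking = next((b["thinking"] for b in blocks if b["type"] == "thinking"), "")
        answer = next((b["text"] for b in blocks if b["type"] == "text"), "")
    else:
        raise ValueError(f"unknown vendor: {vendor}")
    return thinking, answer

claude_resp = {"content": [
    {"type": "thinking", "thinking": "Let me calculate 2+2."},
    {"type": "text", "text": "The answer is 4"},
]}
print(normalize("claude", claude_resp))
```

This is essentially what LangChain's standardized `content_blocks` does for you, which is why the next section's approach needs no per-vendor branches.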

Invocation Comparison

| Vendor | SDK | Style |
| --- | --- | --- |
| Gemini | google.generativeai | native SDK |
| Deepseek | openai | OpenAI-compatible SDK |
| 豆包 | volcenginesdkarkruntime | native SDK |
| Claude | anthropic | native SDK |

LangChain Unified Approach

| Feature | Notes |
| --- | --- |
| Enable | set LC_OUTPUT_VERSION=v1 |
| Parsing | every vendor uses response.content_blocks |
| Block types | thinking (Claude) or reasoning (the rest) |
| Advantage | one codebase supports every vendor, no changes needed |