AI Agent Development with LangChain (2): Chapter 2, Models

Starting from scratch, this series walks you step by step through building your own AI agent with the LangChain framework. Chapter 2: Models.

誉鏐 · Published 2026-01-23 06:09:17

Chapter 2: Models

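Large language models (LLMs) are powerful AI tools that can interpret and generate text much as a human would. They are versatile enough to write content, translate between languages, summarize, and answer questions without task-specific training.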

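Beyond text generation, many models also support:
(1) Tool calling: calling external tools (such as database queries or API calls) and using the results in the response.
(2) Structured output: constraining the model's response to follow a defined format.
(3) Multimodality: processing and returning data other than text, such as images, audio, and video.
(4) Reasoning: working through multiple reasoning steps to reach a conclusion.

Models are the reasoning engine of agents. They drive the agent's decision-making process, determining which tools to call, how to interpret the results, and when to give a final answer.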

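The quality and capabilities of the model you choose directly affect your agent's baseline reliability and performance. Different models excel at different tasks: some are better at following complex instructions, some at structured reasoning, and some support larger context windows for handling more information.

LangChain's standard model interface gives you access to many provider integrations, so you can easily experiment with and switch between models to find the one that best fits your needs.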

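1. Basic usage

1.1 With an agent (as seen in Chapter 1: you specify the model name, or pass a configured model, when creating the agent)

1.2 Standalone: models can be called directly for tasks such as text generation, classification, or extraction, without an agent framework

The same model interface works in both contexts, so you can start simple and scale up to more complex agent-based workflows as needed.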

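2. Initializing a model

The easiest way to get started with a standalone model in LangChain is to use init_chat_model to initialize one from the chat model provider of your choice (example below):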

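Parameters of init_chat_model:

model: The name or identifier of the specific model you want to use. You can also specify the model and its provider in a single argument using the "{model_provider}:{model}" format, for example "openai:o1".
api_key: The key required to authenticate with the model provider, usually issued when you sign up for access to the model. It is typically supplied by setting an environment variable.
temperature: Controls the randomness of the model's output. Higher values make responses more creative; lower values make them more deterministic.
max_tokens: Limits the number of tokens in the response, effectively controlling the length of the output.
timeout: The maximum time (in seconds) to wait for a response from the model before canceling the request.
max_retries: The maximum number of times the system will resend a request that fails because of issues such as network timeouts or rate limits.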
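
To make timeout and max_retries more concrete, here is a minimal sketch of a retry loop with exponential backoff. It is not how LangChain or any provider SDK actually implements retries; `call_with_retries`, `flaky_request`, and the backoff schedule are hypothetical choices made only for illustration.

```python
import time


def call_with_retries(request, max_retries=2, base_delay=0.01):
    """Call `request()` and retry on TimeoutError up to `max_retries` times.

    Hypothetical helper: real clients decide for themselves which errors
    are retryable and how long to wait between attempts.
    """
    attempt = 0
    while True:
        try:
            return request()
        except TimeoutError:
            if attempt >= max_retries:
                raise  # retries exhausted, surface the last error
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
            attempt += 1


# Simulated request that times out twice before succeeding.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = call_with_retries(flaky_request, max_retries=2)
print(result, calls["count"])
```

With max_retries=2 the request is attempted at most three times in total: the original call plus two retries.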

from langchain.chat_models import init_chat_model

model = init_chat_model(
    "claude-sonnet-4-5-20250929",
    # Kwargs passed to the model:
    temperature=0.7,
    timeout=30,
    max_tokens=1000,
)
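
The example above passes a bare model name, while the parameter list also mentions a combined provider-and-model string such as "openai:o1". That convention can be pictured as a simple string split. The helper below is hypothetical (`split_model_id` is not part of LangChain) and does not model how init_chat_model infers a provider when none is given.

```python
def split_model_id(model_id):
    """Split "provider:model" into (provider, model).

    Hypothetical helper for illustration. Returns provider=None when the
    identifier carries no provider prefix.
    """
    provider, sep, name = model_id.partition(":")
    if not sep:
        return None, model_id
    return provider, name


with_provider = split_model_id("openai:o1")
without_provider = split_model_id("claude-sonnet-4-5-20250929")
print(with_provider)
print(without_provider)
```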

3. invoke()

The most direct way to call a model is invoke(), which accepts either a single message or a list of messages.

response = model.invoke("Why do parrots have colorful feathers?")
print(response)

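You can pass a chat model a list of messages to represent the conversation history. Each message has a role, which the model uses to tell who sent that message in the conversation.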

# Dictionary format

conversation = [
    {"role": "system", "content": "You are a helpful assistant that translates English to French."},
    {"role": "user", "content": "Translate: I love programming."},
    {"role": "assistant", "content": "J'adore la programmation."},
    {"role": "user", "content": "Translate: I love building applications."}
]

response = model.invoke(conversation)
print(response)  # AIMessage("J'adore créer des applications.")

# Message object format

from langchain.messages import HumanMessage, AIMessage, SystemMessage

conversation = [
    SystemMessage("You are a helpful assistant that translates English to French."),
    HumanMessage("Translate: I love programming."),
    AIMessage("J'adore la programmation."),
    HumanMessage("Translate: I love building applications.")
]

response = model.invoke(conversation)
print(response)  # AIMessage("J'adore créer des applications.")
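
Because the model only sees what is in the message list, a multi-turn chat comes down to appending each user message and each reply to the same list before the next call. Here is a minimal sketch in the dictionary format. The stub `fake_model` stands in for model.invoke so the code runs without an API key; both `fake_model` and `chat_turn` are hypothetical names.

```python
def fake_model(messages):
    """Stand-in for model.invoke: reports how many messages it received."""
    return {"role": "assistant", "content": f"(reply after {len(messages)} messages)"}


def chat_turn(history, user_text, model=fake_model):
    """Append the user message, call the model, and record its reply."""
    history.append({"role": "user", "content": user_text})
    reply = model(history)
    history.append(reply)
    return reply["content"]


history = [{"role": "system", "content": "You are a helpful assistant."}]
first = chat_turn(history, "Translate: I love programming.")
second = chat_turn(history, "Translate: I love building applications.")
roles = [m["role"] for m in history]
print(first)
print(second)
print(roles)
```

The second call sees four messages (system, user, assistant, user), which is what lets a real model answer in context.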

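4. Streaming

Most models can stream their output while it is being generated. Showing output progressively makes for a noticeably better user experience, especially with longer responses.

Calling stream() returns an iterator that yields output chunks as they are produced. You can use a loop to process each chunk in real time: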

for chunk in model.stream("Why do parrots have colorful feathers?"):
    print(chunk.text, end="|", flush=True)

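Unlike invoke(), which returns a single AIMessage once the model has finished generating the full response, stream() returns multiple AIMessageChunk objects, each containing a portion of the output text. Importantly, the chunks in a stream are designed to be combined into a full message by summation: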

full = None  # None | AIMessageChunk
for chunk in model.stream("What color is the sky?"):
    full = chunk if full is None else full + chunk
    print(full.text)

# The
# The sky
# The sky is
# The sky is typically
# The sky is typically blue
# ...

print(full.content_blocks)
# [{"type": "text", "text": "The sky is typically blue..."}]
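
The summation works because chunk objects define addition. A toy stand-in shows the idea using only the standard library. `TextChunk` and `fake_stream` are hypothetical and far simpler than LangChain's AIMessageChunk, which also merges other fields.

```python
from dataclasses import dataclass


@dataclass
class TextChunk:
    """Toy chunk that carries a piece of text and supports `+`."""
    text: str

    def __add__(self, other):
        # Merging two chunks concatenates their text.
        return TextChunk(self.text + other.text)


def fake_stream():
    """Stand-in for model.stream: yields the reply piece by piece."""
    for piece in ["The", " sky", " is", " blue"]:
        yield TextChunk(piece)


full = None
for chunk in fake_stream():
    full = chunk if full is None else full + chunk

print(full.text)  # The sky is blue
```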

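5. Batch

Sending a set of independent requests to a model as a batch can significantly improve performance and reduce cost, because the requests can be processed in parallel: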

responses = model.batch([
    "Why do parrots have colorful feathers?",
    "How do airplanes fly?",
    "What is quantum computing?"
])
for response in responses:
    print(response)

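By default, batch() only returns the final output for the whole batch. If you want to receive the output for each input as soon as it finishes generating, use batch_as_completed():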

for response in model.batch_as_completed([
    "Why do parrots have colorful feathers?",
    "How do airplanes fly?",
    "What is quantum computing?"
]):
    print(response)
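
The difference between the two calls can be sketched with a thread pool. This is a stand-in built on the standard library, not LangChain's implementation; `fake_invoke`, `batch`, and `batch_as_completed` here are hypothetical functions that only mimic the behavior described above. The sketch tags each result with the index of its input so that the original order can be restored. To my understanding LangChain's batch_as_completed() does the same, but treat that as an assumption and check the API reference.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_invoke(prompt):
    """Stand-in for model.invoke: longer prompts take longer to answer."""
    time.sleep(0.01 * len(prompt))
    return prompt.upper()


def batch(prompts):
    """Run all prompts in parallel and return results in input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(fake_invoke, prompts))


def batch_as_completed(prompts):
    """Yield (index, result) pairs as each prompt finishes."""
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(fake_invoke, p): i for i, p in enumerate(prompts)}
        for future in as_completed(futures):
            yield futures[future], future.result()


prompts = ["a long prompt", "short", "mid one"]
ordered = batch(prompts)
finished = list(batch_as_completed(prompts))  # completion order may differ
restored = [result for _, result in sorted(finished)]  # sort by input index
print(ordered == restored)
```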

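6. Reasoning

Many models can perform multi-step reasoning to reach a conclusion. This involves breaking a complex problem down into smaller, more manageable steps.
If the underlying model supports it, you can surface this reasoning process to better understand how the model arrived at its final answer.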

for chunk in model.stream("Why do parrots have colorful feathers?"):
    reasoning_steps = [r for r in chunk.content_blocks if r["type"] == "reasoning"]
    print(reasoning_steps if reasoning_steps else chunk.text)
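
The filter in the loop above relies only on each content block being a dictionary with a "type" field. The sketch below applies the same idea to a hand-written list of blocks, so it runs without a model. The `split_blocks` helper is hypothetical, and the "reasoning" key inside the sample block is an assumption about the block layout.

```python
def split_blocks(content_blocks):
    """Separate reasoning blocks from the visible text of a response."""
    reasoning = [b for b in content_blocks if b["type"] == "reasoning"]
    text = "".join(b["text"] for b in content_blocks if b["type"] == "text")
    return reasoning, text


# Hand-written sample data; the "reasoning" key is an assumption.
sample_blocks = [
    {"type": "reasoning", "reasoning": "(intermediate reasoning step)"},
    {"type": "text", "text": "First part of the answer,"},
    {"type": "text", "text": " second part."},
]

reasoning, text = split_blocks(sample_blocks)
print(len(reasoning), text)
```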

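Depending on the model, you can sometimes specify how much effort it should put into reasoning. Likewise, you can ask the model to turn reasoning off entirely. This may take the form of categorical reasoning "tiers" (for example, 'low' or 'high') or an integer token budget.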

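7. Server-side tool use

Some providers support a server-side tool-calling loop: the model can interact with web search, code interpreters, and other tools, and analyze the results within a single conversational turn.

If the model invokes a tool server-side, the content of the response message will include content representing the tool invocation and its result. Accessing the response's content blocks returns the server-side tool calls and results in a provider-agnostic format: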

# Invoke with server-side tool use

from langchain.chat_models import init_chat_model

model = init_chat_model("gpt-4.1-mini")

tool = {"type": "web_search"}
model_with_tools = model.bind_tools([tool])

response = model_with_tools.invoke("What was a positive news story from today?")
print(response.content_blocks)

"""Output"""

[
    {
        "type": "server_tool_call",
        "name": "web_search",
        "args": {
            "query": "positive news stories today",
            "type": "search"
        },
        "id": "ws_abc123"
    },
    {
        "type": "server_tool_result",
        "tool_call_id": "ws_abc123",
        "status": "success"
    },
    {
        "type": "text",
        "text": "Here are some positive news stories from today...",
        "annotations": [
            {
                "end_index": 410,
                "start_index": 337,
                "title": "article title",
                "type": "citation",
                "url": "..."
            }
        ]
    }
]
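
Since the blocks are plain dictionaries, a result can be matched to its call through the shared id, and citations can be read from the text block's annotations. Here is a minimal sketch over the output shown above. `summarize_blocks` is a hypothetical helper, and the field names are taken from that sample output rather than from a formal schema.

```python
def summarize_blocks(content_blocks):
    """Pair each server tool call with its result and collect cited URLs."""
    # Map tool_call_id -> status for every tool result block.
    results = {
        b["tool_call_id"]: b["status"]
        for b in content_blocks
        if b["type"] == "server_tool_result"
    }
    # (tool name, query, status) for every tool call block.
    calls = [
        (b["name"], b["args"].get("query"), results.get(b["id"]))
        for b in content_blocks
        if b["type"] == "server_tool_call"
    ]
    # URLs of all citation annotations attached to text blocks.
    citations = [
        a["url"]
        for b in content_blocks
        if b["type"] == "text"
        for a in b.get("annotations", [])
        if a["type"] == "citation"
    ]
    return calls, citations


blocks = [
    {"type": "server_tool_call", "name": "web_search",
     "args": {"query": "positive news stories today", "type": "search"},
     "id": "ws_abc123"},
    {"type": "server_tool_result", "tool_call_id": "ws_abc123", "status": "success"},
    {"type": "text", "text": "Here are some positive news stories from today...",
     "annotations": [{"type": "citation", "title": "article title", "url": "...",
                      "start_index": 337, "end_index": 410}]},
]

calls, citations = summarize_blocks(blocks)
print(calls)
print(citations)
```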