[LangChain Source Code Analysis, Part 2: Prompt Templates]

LittleStar_Cao · published 2026-03-03 16:55:45

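This series has 4 parts and takes LangChain's Prompt template system apart, from everyday usage down to the source architecture.

Part 1: Prompt templates in 5 minutes (this article)
Part 2: ChatPromptTemplate and the message template hierarchy
Part 3: Few-Shot, Image, and special templates
Part 4: Source architecture: the full path from format to invoke

LangChain Prompts in Depth (1): Prompt Templates in 5 Minutes

A runnable example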

from langchain_core.prompts import PromptTemplate, ChatPromptTemplate

# ---- String template ----
string_prompt = PromptTemplate.from_template("Tell me a joke about {topic}")
result = string_prompt.format(topic="programmers")
print(result)
# Tell me a joke about programmers

# ---- Chat template ----
chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a {style} assistant"),
    ("human", "{question}"),
])
prompt_value = chat_prompt.invoke({"style": "humorous", "question": "What is recursion?"})
print(prompt_value.to_messages())
# [SystemMessage(content='You are a humorous assistant'),
#  HumanMessage(content='What is recursion?')]

print(prompt_value.to_string())
# System: You are a humorous assistant
# Human: What is recursion?

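Two kinds of template: format() on one returns a string, while the other formats into a list of messages. When invoked, though, both produce the same thing: a PromptValue.

PromptValue: the unified interface for both kinds of prompt

Whether you use a string template or a chat template, invoke() returns a PromptValue. It is an abstract base class that defines two conversion methods: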

# langchain_core/prompt_values.py:24
class PromptValue(Serializable, ABC):
    """Base abstract class for inputs to any language model."""

    @abstractmethod
    def to_string(self) -> str:
        """Return prompt value as string."""

    @abstractmethod
    def to_messages(self) -> list[BaseMessage]:
        """Return prompt as a list of messages."""

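Why is it needed? Because PromptValue is one of the input types that a ChatModel's _convert_input accepts. However a prompt template is implemented internally, as long as it produces a PromptValue, the ChatModel can consume it.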

StringPromptValue

# langchain_core/prompt_values.py:54
class StringPromptValue(PromptValue):
    text: str

    def to_string(self) -> str:
        return self.text                           # return the text as-is

    def to_messages(self) -> list[BaseMessage]:
        return [HumanMessage(content=self.text)]   # wrap it in a HumanMessage

ChatPromptValue

# langchain_core/prompt_values.py:80
class ChatPromptValue(PromptValue):
    messages: Sequence[BaseMessage]

    def to_string(self) -> str:
        return get_buffer_string(self.messages)     # join the messages with get_buffer_string

    def to_messages(self) -> list[BaseMessage]:
        return list(self.messages)                  # return the message list directly

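Comparison table

	StringPromptValue	ChatPromptValue
Internal storage	`text: str`	`messages: Sequence[BaseMessage]`
`to_string()`	returns text directly	`get_buffer_string(messages)`
`to_messages()`	wraps as `[HumanMessage(text)]`	returns messages directly
Produced by	`PromptTemplate`	`ChatPromptTemplate`

There is also an ImagePromptValue (prompt_values.py:135), which stores image_url: ImageURL; its to_messages() wraps the image as a content block inside a HumanMessage. Part 3 covers it in detail.

PromptTemplate: the string template

The simplest template. It takes a string containing {variable_name} placeholders and returns the formatted string once the variables are filled in.

The from_template factory method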

# langchain_core/prompts/prompt.py:256-312
@classmethod
def from_template(
    cls,
    template: str,
    *,
    template_format: PromptTemplateFormat = "f-string",
    partial_variables: dict[str, Any] | None = None,
    **kwargs: Any,
) -> PromptTemplate:
    input_variables = get_template_variables(template, template_format)  # extract the variables automatically
    partial_variables_ = partial_variables or {}

    if partial_variables_:
        input_variables = [
            var for var in input_variables if var not in partial_variables_
        ]

    return cls(
        input_variables=input_variables,
        template=template,
        template_format=template_format,
        partial_variables=partial_variables_,
        **kwargs,
    )

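It does essentially one thing: call get_template_variables to extract the variable names from the template string, then construct the instance. Users do not need to declare input_variables by hand.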
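To make the extraction step concrete, here is a minimal standard-library sketch of what variable extraction for the f-string format could look like. It is a simplified stand-in, not LangChain's get_template_variables: the helper names extract_fstring_variables and build_input_variables are hypothetical, and edge cases such as attribute access inside a field are ignored.

```python
from __future__ import annotations

from string import Formatter


def extract_fstring_variables(template: str) -> list[str]:
    """Return the sorted, de-duplicated field names in an f-string style template."""
    names = {
        field_name
        for _, field_name, _, _ in Formatter().parse(template)
        if field_name  # literal-only segments yield None; skip them
    }
    return sorted(names)


def build_input_variables(template: str, partial_variables: dict | None = None) -> list[str]:
    """Mimic the from_template logic: drop variables that are already pre-filled."""
    partial_variables = partial_variables or {}
    return [v for v in extract_fstring_variables(template) if v not in partial_variables]


print(extract_fstring_variables("Hello {name}, {name}: {task}"))
print(build_input_variables("{date} news: {topic}", {"date": "2024-01-01"}))
```

The second call shows why from_template filters the list: a variable supplied through partial_variables is no longer something the caller has to provide.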

The format() method

# langchain_core/prompts/prompt.py:191-201
def format(self, **kwargs: Any) -> str:
    kwargs = self._merge_partial_and_user_variables(**kwargs)     # merge in the partial variables
    return DEFAULT_FORMATTER_MAPPING[self.template_format](self.template, **kwargs)

Two steps: merge the variables, then fill the template with the formatter that matches template_format. DEFAULT_FORMATTER_MAPPING is covered later.

__add__ concatenation

# langchain_core/prompts/prompt.py:142-184
def __add__(self, other: Any) -> PromptTemplate:
    if isinstance(other, PromptTemplate):
        # Concatenate two PromptTemplates
        template = self.template + other.template
        input_variables = list(set(self.input_variables) | set(other.input_variables))
        return PromptTemplate(template=template, input_variables=input_variables, ...)
    if isinstance(other, str):
        # Convert the string to a PromptTemplate first, then concatenate
        prompt = PromptTemplate.from_template(other, template_format=self.template_format)
        return self + prompt

# Usage
p1 = PromptTemplate.from_template("Hello, {name}. ")
p2 = PromptTemplate.from_template("Please help me {task}.")
combined = p1 + p2
print(combined.format(name="Xiao Ming", task="write code"))
# Hello, Xiao Ming. Please help me write code.

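ChatPromptTemplate: the chat template

The real workhorse. It takes a list of message templates, each with a role and a content template, and returns a ChatPromptValue once formatted.

The from_messages factory method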

# langchain_core/prompts/chat.py:1118-1167
@classmethod
def from_messages(
    cls,
    messages: Sequence[MessageLikeRepresentation],
    template_format: PromptTemplateFormat = "f-string",
) -> ChatPromptTemplate:
    return cls(messages, template_format=template_format)  # delegates straight to __init__

All the heavy lifting happens in __init__:

# langchain_core/prompts/chat.py:902-995
def __init__(
    self,
    messages: Sequence[MessageLikeRepresentation],
    *,
    template_format: PromptTemplateFormat = "f-string",
    **kwargs: Any,
) -> None:
    # 1. Convert each message into a MessagePromptTemplate
    messages_ = [
        _convert_to_message_template(message, template_format)
        for message in messages
    ]

    # 2. Infer input_variables and optional_variables automatically
    input_vars: set[str] = set()
    optional_variables: set[str] = set()
    partial_vars: dict[str, Any] = {}
    for _message in messages_:
        if isinstance(_message, MessagesPlaceholder) and _message.optional:
            partial_vars[_message.variable_name] = []      # optional defaults to an empty list
            optional_variables.add(_message.variable_name)
        elif isinstance(_message, (BaseChatPromptTemplate, BaseMessagePromptTemplate)):
            input_vars.update(_message.input_variables)    # collect all input variables

    kwargs = {
        "input_variables": sorted(input_vars),
        "optional_variables": sorted(optional_variables),
        "partial_variables": partial_vars,
        **kwargs,
    }
    super().__init__(messages=messages_, **kwargs)

All 5 input formats are normalized through _convert_to_message_template (see Part 2 for details).

format_messages → format_prompt → ChatPromptValue

# langchain_core/prompts/chat.py:1169-1195
def format_messages(self, **kwargs: Any) -> list[BaseMessage]:
    kwargs = self._merge_partial_and_user_variables(**kwargs)
    result = []
    for message_template in self.messages:
        if isinstance(message_template, BaseMessage):
            result.extend([message_template])                # already a message, append as-is
        elif isinstance(message_template, (BaseMessagePromptTemplate, BaseChatPromptTemplate)):
            message = message_template.format_messages(**kwargs)  # format into a list of messages
            result.extend(message)
    return result

# langchain_core/prompts/chat.py:722-731
def format_prompt(self, **kwargs: Any) -> ChatPromptValue:
    messages = self.format_messages(**kwargs)
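    return ChatPromptValue(messages=messages)               # wrap in a ChatPromptValue

Call chain: format_prompt() calls format_messages(), which iterates over each template and formats it; the resulting list is then wrapped in a ChatPromptValue.

Three template formats

DEFAULT_FORMATTER_MAPPING defines the three template formats LangChain supports: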

# langchain_core/prompts/string.py:208-212
DEFAULT_FORMATTER_MAPPING: dict[str, Callable[..., str]] = {
    "f-string": formatter.format,      # StrictFormatter (a strict version of the built-in Formatter)
    "mustache": mustache_formatter,     # chevron/mustache rendering
    "jinja2": jinja2_formatter,        # SandboxedEnvironment
}

Comparison table

	f-string (default)	mustache	jinja2
Syntax	`{variable}`	`{{variable}}`	`{{ variable }}`
Conditionals	not supported	`{{#flag}}...{{/flag}}`	`{% if flag %}...{% endif %}`
Loops	not supported	`{{#items}}...{{/items}}`	`{% for item in items %}...{% endfor %}`
Safety	high (blocks attribute access)	medium	low (risky even with the sandbox)
Installation	built in	built in	requires `pip install jinja2`

# f-string (default, recommended)
PromptTemplate.from_template("Hello {name}")

# mustache
PromptTemplate.from_template("Hello {{name}}", template_format="mustache")

# jinja2 (⚠ never accept untrusted templates!)
PromptTemplate.from_template("Hello {{ name }}", template_format="jinja2")

partial(): pre-filling variables

# langchain_core/prompts/base.py:279-300
def partial(self, **kwargs: str | Callable[[], str]) -> BasePromptTemplate:
    prompt_dict = self.__dict__.copy()
    prompt_dict["input_variables"] = list(
        set(self.input_variables).difference(kwargs)      # drop the variables that are now filled
    )
    prompt_dict["partial_variables"] = {**self.partial_variables, **kwargs}
    return type(self)(**prompt_dict)                       # return a new instance

from datetime import datetime

prompt = PromptTemplate.from_template("News for {date}: {topic}")

# Static partial
p1 = prompt.partial(date="2024-01-01")
print(p1.format(topic="AI"))
# News for 2024-01-01: AI

# Callable partial (deferred evaluation)
p2 = prompt.partial(date=lambda: datetime.now().strftime("%Y-%m-%d"))
print(p2.format(topic="AI"))
# News for 2026-03-01: AI  (the date is computed at format time, so the output depends on the day you run it)

The key is _merge_partial_and_user_variables (base.py:295-300):

def _merge_partial_and_user_variables(self, **kwargs: Any) -> dict[str, Any]:
    partial_kwargs = {
        k: v if not callable(v) else v()   # if it is a Callable, call it
        for k, v in self.partial_variables.items()
    }
    return {**partial_kwargs, **kwargs}     # user variables override partial variables
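The merge rule is easy to try in isolation. The sketch below is a standalone function that copies the logic shown above, with a hypothetical next_id callable used only for illustration. It shows that a Callable partial is re-evaluated on every merge, and that a user-supplied value wins over the partial one.

```python
from typing import Any


def merge_partial_and_user_variables(
    partial_variables: dict[str, Any], **kwargs: Any
) -> dict[str, Any]:
    """Resolve callables among the partials, then let user kwargs override them."""
    resolved = {k: (v() if callable(v) else v) for k, v in partial_variables.items()}
    return {**resolved, **kwargs}


counter = {"n": 0}


def next_id() -> str:
    """Hypothetical deferred value: a new id each time it is called."""
    counter["n"] += 1
    return f"req-{counter['n']}"


partials = {"request_id": next_id}

first = merge_partial_and_user_variables(partials, topic="AI")
second = merge_partial_and_user_variables(partials, topic="AI")
overridden = merge_partial_and_user_variables(partials, request_id="fixed", topic="AI")

print(first, second, overridden)
```

One consequence of this design, at least in the sketch: the callable still runs when the caller overrides the variable, because every partial is resolved before the two dicts are merged.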

A Prompt is a Runnable

BasePromptTemplate inherits from RunnableSerializable[dict, PromptValue]:

# langchain_core/prompts/base.py:39-41
class BasePromptTemplate(
    RunnableSerializable[dict, PromptValue], ABC, Generic[FormatOutputType]
):

So prompt templates support invoke, batch, stream, and the | pipe operator out of the box:

# langchain_core/prompts/base.py:206-229
def invoke(
    self, input: dict, config: RunnableConfig | None = None, **kwargs: Any
) -> PromptValue:
    config = ensure_config(config)
    if self.metadata:
        config["metadata"] = {**config["metadata"], **self.metadata}
    if self.tags:
        config["tags"] += self.tags
    return self._call_with_config(
        self._format_prompt_with_error_handling,    # this is what actually runs format_prompt
        input,
        config,
        run_type="prompt",
        serialized=self._serialized,
    )

The core of invoke is _format_prompt_with_error_handling (base.py:195-197): it first calls _validate_input to check the input, then calls format_prompt to produce the PromptValue.

The | pipe operator lets a Prompt be chained directly to a ChatModel:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are {role}"),
    ("human", "{question}"),
])

# Assuming a ChatModel is available
# chain = prompt | model | parser
# This chain goes: dict → PromptValue → AIMessage → parsed result
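To show the idea behind the | operator without needing a real model, here is a toy composition built only on the standard library. The Step class and the fake_model step are hypothetical illustrations; LangChain's real Runnable machinery is considerably richer, so read this only as a sketch of the data flow.

```python
from typing import Any, Callable


class Step:
    """A toy stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # Compose left to right: the output of self feeds the input of other.
        return Step(lambda value: other.invoke(self.invoke(value)))


# dict -> prompt text -> fake "message" -> parsed string
prompt = Step(lambda d: f"You are {d['role']}. Question: {d['question']}")
fake_model = Step(lambda text: {"content": text.upper()})  # pretend model: just upper-cases
parser = Step(lambda message: message["content"])

chain = prompt | fake_model | parser
result = chain.invoke({"role": "a tutor", "question": "what is recursion?"})
print(result)
```

Each stage only needs to agree on the type it hands to the next one, which is the role PromptValue plays between a prompt and a model.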

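Summary

Concept	Key class	Core method	Output
Unified interface	`PromptValue`	`to_string()` / `to_messages()`	bridges Prompt ↔ Model
String template	`PromptTemplate`	`format()` → `str`	`StringPromptValue`
Chat template	`ChatPromptTemplate`	`format_messages()` → `list[BaseMessage]`	`ChatPromptValue`
Formatting	`DEFAULT_FORMATTER_MAPPING`	f-string / mustache / jinja2	the formatted string
Pre-filling	`partial()`	supports Callable for deferred evaluation	a new template instance
Runnable	`BasePromptTemplate`	`invoke()` / `|` pipe	`PromptValue`

Question for the next part: ChatPromptTemplate's __init__ accepts 5 input formats: string, tuple, BaseMessage, MessagePromptTemplate, and dict. How is each one converted into a message template? And what does the message template inheritance hierarchy look like?

LangChain Prompts in Depth (2): ChatPromptTemplate and the Message Template Hierarchy

A runnable example: 5 input formats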

from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
)
from langchain_core.messages import SystemMessage

# All 5 formats in one demo
template = ChatPromptTemplate.from_messages([
    # Format 1: a BaseMessage passed directly
    SystemMessage(content="You are a translation assistant"),

    # Format 2: a BaseMessagePromptTemplate passed directly
    HumanMessagePromptTemplate.from_template("Please translate the following into {language}:"),

    # Format 3: a (role, template) tuple
    ("human", "{text}"),

    # Format 4: a placeholder tuple
    ("placeholder", "{history}"),

    # Format 5: a bare string (equivalent to ("human", ...))
    # "One more sentence"  # commented out to avoid another human message
])

print(template.input_variables)
# ['language', 'text']

print(template.optional_variables)
# ['history']

result = template.invoke({
    "language": "English",
    "text": "你好世界",
    "history": [("ai", "Sure, I'll translate it"), ("human", "Thanks")],
})

for msg in result.to_messages():
    print(f"{msg.__class__.__name__}: {msg.content}")
# SystemMessage: You are a translation assistant
# HumanMessage: Please translate the following into English:
# HumanMessage: 你好世界
# AIMessage: Sure, I'll translate it
# HumanMessage: Thanks

All 5 formats end up as BaseMessage or BaseMessagePromptTemplate objects. What performs this conversion?

_convert_to_message_template: the entry-point dispatcher

This is the core function that ChatPromptTemplate.__init__ calls:

# langchain_core/prompts/chat.py:1407-1480
def _convert_to_message_template(
    message: MessageLikeRepresentation,
    template_format: PromptTemplateFormat = "f-string",
) -> BaseMessage | BaseMessagePromptTemplate | BaseChatPromptTemplate:
    if isinstance(message, (BaseMessagePromptTemplate, BaseChatPromptTemplate)):
        message_ = message                          # branch 1: template passed through

    elif isinstance(message, BaseMessage):
        message_ = message                          # branch 2: message passed through

    elif isinstance(message, str):
        message_ = _create_template_from_message_type(
            "human", message, template_format=template_format
        )                                            # branch 3: str → ("human", str)

    elif isinstance(message, (tuple, dict)):
        if isinstance(message, dict):
            # dict: pull out role/content
            message_type_str = message["role"]
            template = message["content"]
        else:
            # tuple: unpack
            message_type_str, template = message

        if isinstance(message_type_str, str):
            message_ = _create_template_from_message_type(
                message_type_str, template, template_format=template_format
            )                                        # branch 4a: role given as a string
        elif hasattr(message_type_str, "model_fields"):
            message_type = message_type_str.model_fields["type"].default
            message_ = _create_template_from_message_type(
                message_type, template, template_format=template_format
            )                                        # branch 4b: a Message class used as the role
        else:
            message_ = message_type_str(
                prompt=PromptTemplate.from_template(cast("str", template))
            )                                        # branch 4c: custom class
    else:
        raise NotImplementedError(f"Unsupported message type: {type(message)}")

    return message_

ASCII dispatch flow diagram

              _convert_to_message_template(message)
                            |
          +-----------------+-----------------+
          |                 |                 |
  BaseMessagePrompt    BaseMessage           str
  Template / Base      ───────►           ───────►
  ChatPromptTemplate   returned as-is    ("human", str)
  ───────►                                   |
  returned as-is                             |
                                             ▼
                       tuple / dict ──► unpack (role, template)
                            |
              +-------------+-------------+
              |             |             |
        role is a str  role has a      role is some
              |        model_fields    other class
              |        attribute          |
              ▼             ▼             ▼
    _create_template    take type's   role(prompt=
    _from_message_type  default, call PromptTemplate
    (role, template)    same function .from_template)

_create_template_from_message_type: mapping roles to templates

# langchain_core/prompts/chat.py:1330-1404
def _create_template_from_message_type(
    message_type: str,
    template: str | list,
    template_format: PromptTemplateFormat = "f-string",
) -> BaseMessagePromptTemplate:

    if message_type in {"human", "user"}:
        message = HumanMessagePromptTemplate.from_template(template, ...)

    elif message_type in {"ai", "assistant"}:
        message = AIMessagePromptTemplate.from_template(template, ...)

    elif message_type == "system":
        message = SystemMessagePromptTemplate.from_template(template, ...)

    elif message_type == "placeholder":
        if isinstance(template, str):
            # ("placeholder", "{history}") → extract the variable name, optional=True
            var_name = template[1:-1]           # strip the curly braces
            message = MessagesPlaceholder(variable_name=var_name, optional=True)
        else:
            # ("placeholder", ["{history}", False]) → explicit optional flag
            var_name_wrapped, is_optional = template
            var_name = var_name_wrapped[1:-1]
            message = MessagesPlaceholder(variable_name=var_name, optional=is_optional)
    else:
        raise ValueError(f"Unexpected message type: {message_type}")

    return message

Role string	Generated template class
`"human"` / `"user"`	`HumanMessagePromptTemplate`
`"ai"` / `"assistant"`	`AIMessagePromptTemplate`
`"system"`	`SystemMessagePromptTemplate`
`"placeholder"`	`MessagesPlaceholder`

Note the special handling of placeholder:

("placeholder", "{history}") → MessagesPlaceholder(variable_name="history", optional=True)
("placeholder", ["{history}", False]) → MessagesPlaceholder(variable_name="history", optional=False)
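The two placeholder forms can be reproduced with a few lines of plain Python. The function below is a hypothetical helper that mirrors the slicing logic shown in the excerpt; the explicit brace check is an addition of this sketch, included so that a malformed template fails loudly instead of silently losing characters.

```python
from __future__ import annotations


def parse_placeholder(template: str | list) -> tuple[str, bool]:
    """Return (variable_name, optional) for a placeholder template."""
    if isinstance(template, str):
        wrapped, optional = template, True   # bare string form: optional by default
    else:
        wrapped, optional = template         # [wrapped_name, optional] form
    # Assumption of this sketch: reject anything not wrapped in curly braces.
    if not (wrapped.startswith("{") and wrapped.endswith("}")):
        raise ValueError(f"Expected a variable wrapped in curly braces, got {wrapped!r}")
    return wrapped[1:-1], bool(optional)


print(parse_placeholder("{history}"))
print(parse_placeholder(["{history}", False]))
```

The bare-string form defaults to optional=True, which is why ("placeholder", "{history}") ends up in optional_variables rather than input_variables.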

The BaseMessagePromptTemplate hierarchy

Inheritance diagram

Serializable (ABC)
    └── BaseMessagePromptTemplate (message.py:16)
            │   ├── format_messages(**kwargs) → list[BaseMessage]  [abstract]
            │   └── input_variables: list[str]  [abstract property]
            │
            ├── MessagesPlaceholder (chat.py:52)
            │       injects a list of messages directly, no template formatting
            │
            ├── BaseStringMessagePromptTemplate (chat.py:225)  [ABC]
            │   │   ├── prompt: StringPromptTemplate
            │   │   ├── format(**kwargs) → BaseMessage  [抽象]
            │   │   └── format_messages → [self.format(**kwargs)]
            │   │
            │   └── ChatMessagePromptTemplate (chat.py:353)
            │           ├── role: str (custom role)
            │           └── format → ChatMessage(content=..., role=self.role)
            │
            └── _StringImageMessagePromptTemplate (chat.py:396)
                    │   prompt: StringPromptTemplate | list[String|Image|DictPromptTemplate]
                    │   ├── format → BaseMessage (supports multimodal content blocks)
                    │   └── _msg_class: type[BaseMessage]
                    │
                    ├── HumanMessagePromptTemplate (chat.py:663)
                    │       _msg_class = HumanMessage
                    │
                    ├── AIMessagePromptTemplate (chat.py:672)
                    │       _msg_class = AIMessage
                    │
                    └── SystemMessagePromptTemplate (chat.py:681)
                            _msg_class = SystemMessage

BaseMessagePromptTemplate: the abstract base class

# langchain_core/prompts/message.py:16
class BaseMessagePromptTemplate(Serializable, ABC):

    @abstractmethod
    def format_messages(self, **kwargs: Any) -> list[BaseMessage]:
        """Format messages from kwargs."""

    @property
    @abstractmethod
    def input_variables(self) -> list[str]:
        """Input variables for this prompt template."""

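Every message template must implement format_messages (which returns a list of messages) and input_variables (which declares the variables it needs).

BaseStringMessagePromptTemplate: single-text messages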

# langchain_core/prompts/chat.py:225-335
class BaseStringMessagePromptTemplate(BaseMessagePromptTemplate, ABC):
    prompt: StringPromptTemplate           # holds one string template

    @classmethod
    def from_template(cls, template: str, ...) -> Self:
        prompt = PromptTemplate.from_template(template, ...)
        return cls(prompt=prompt, **kwargs)

    def format_messages(self, **kwargs: Any) -> list[BaseMessage]:
        return [self.format(**kwargs)]     # format, then wrap in a one-element list

    @property
    def input_variables(self) -> list[str]:
        return self.prompt.input_variables # delegate to the inner prompt

format is abstract; each subclass decides which message type to produce.

_StringImageMessagePromptTemplate: multimodal support

# langchain_core/prompts/chat.py:396
class _StringImageMessagePromptTemplate(BaseMessagePromptTemplate):
    prompt: (
        StringPromptTemplate
        | list[StringPromptTemplate | ImagePromptTemplate | DictPromptTemplate]
    )
    _msg_class: type[BaseMessage]

prompt can be a single text template, or a list that mixes text, image, and dict templates. The format method handles each type separately:

# langchain_core/prompts/chat.py:583-612
def format(self, **kwargs: Any) -> BaseMessage:
    if isinstance(self.prompt, StringPromptTemplate):
        text = self.prompt.format(**kwargs)
        return self._msg_class(content=text, ...)      # plain text

    content: list = []
    for prompt in self.prompt:
        inputs = {var: kwargs[var] for var in prompt.input_variables}
        if isinstance(prompt, StringPromptTemplate):
            content.append({"type": "text", "text": prompt.format(**inputs)})
        elif isinstance(prompt, ImagePromptTemplate):
            content.append({"type": "image_url", "image_url": prompt.format(**inputs)})
        elif isinstance(prompt, DictPromptTemplate):
            content.append(prompt.format(**inputs))
    return self._msg_class(content=content, ...)       # multimodal content blocks

Human / AI / System MessagePromptTemplate

The three subclasses are extremely simple; each only sets _msg_class:

# langchain_core/prompts/chat.py:663-687
class HumanMessagePromptTemplate(_StringImageMessagePromptTemplate):
    _msg_class: type[BaseMessage] = HumanMessage

class AIMessagePromptTemplate(_StringImageMessagePromptTemplate):
    _msg_class: type[BaseMessage] = AIMessage

class SystemMessagePromptTemplate(_StringImageMessagePromptTemplate):
    _msg_class: type[BaseMessage] = SystemMessage

ChatMessagePromptTemplate: custom roles

# langchain_core/prompts/chat.py:353-371
class ChatMessagePromptTemplate(BaseStringMessagePromptTemplate):
    role: str

    def format(self, **kwargs: Any) -> BaseMessage:
        text = self.prompt.format(**kwargs)
        return ChatMessage(content=text, role=self.role, ...)

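Used when you need a custom role name (for example "narrator" or "tool").

MessagesPlaceholder in detail

MessagesPlaceholder is a special message template: it does no template formatting and instead "injects" a list of messages directly into the prompt. Its most common use is inserting chat history.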

# langchain_core/prompts/chat.py:52-217
class MessagesPlaceholder(BaseMessagePromptTemplate):
    variable_name: str                              # name of the variable
    optional: bool = False                          # whether it is optional
    n_messages: PositiveInt | None = None           # maximum number of messages

    def format_messages(self, **kwargs: Any) -> list[BaseMessage]:
        value = (
            kwargs.get(self.variable_name, [])      # optional: default to an empty list
            if self.optional
            else kwargs[self.variable_name]          # not optional: must be provided
        )
        if not isinstance(value, list):
            raise ValueError(...)
        value = convert_to_messages(value)           # normalize everything to BaseMessage
        if self.n_messages:
            value = value[-self.n_messages :]        # keep the last n messages
        return value

    @property
    def input_variables(self) -> list[str]:
        return [self.variable_name] if not self.optional else []
        # when optional, the user need not supply it, so return an empty list

Key details

convert_to_messages turns a tuple such as ("human", "Hi") into HumanMessage(content="Hi")
n_messages limits how many messages are injected (it keeps the last N)
With optional=True, input_variables returns an empty list, so the variable does not appear among the required variables
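The optional and n_messages behavior can be checked with a small standalone function. This is a simplified sketch that leaves out convert_to_messages and works on plain tuples; the name inject_history is hypothetical.

```python
from __future__ import annotations

from typing import Any


def inject_history(
    kwargs: dict[str, Any],
    variable_name: str,
    optional: bool = False,
    n_messages: int | None = None,
) -> list:
    """Look up the message list, apply the optional default, keep the last n."""
    value = kwargs.get(variable_name, []) if optional else kwargs[variable_name]
    if not isinstance(value, list):
        raise ValueError(f"{variable_name} should be a list, got {type(value).__name__}")
    if n_messages:
        value = value[-n_messages:]  # negative slice keeps the tail of the history
    return value


history = [("human", "Hi"), ("ai", "Hello!"), ("human", "How are you?"), ("ai", "Fine.")]

print(inject_history({"history": history}, "history", n_messages=2))
print(inject_history({}, "history", optional=True))
```

Because the slice is taken from the end, n_messages acts as a sliding window over the most recent turns, which is usually what a chat history needs.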

A complete example with chat history

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an assistant with memory"),
    MessagesPlaceholder("history", optional=True),
    ("human", "{input}"),
])

# Round 1: no history
result1 = prompt.invoke({"input": "Hello"})
print([m.__class__.__name__ for m in result1.to_messages()])
# ['SystemMessage', 'HumanMessage']

# Round 2: with history
result2 = prompt.invoke({
    "input": "Do you remember what I said?",
    "history": [
        ("human", "Hello"),
        ("ai", "Hello! How can I help you?"),
    ],
})
print([m.__class__.__name__ for m in result2.to_messages()])
# ['SystemMessage', 'HumanMessage', 'AIMessage', 'HumanMessage']

Automatic inference in ChatPromptTemplate.__init__

Recall the variable-inference logic in __init__:

# langchain_core/prompts/chat.py:976-993
input_vars: set[str] = set()
optional_variables: set[str] = set()
partial_vars: dict[str, Any] = {}
for _message in messages_:
    if isinstance(_message, MessagesPlaceholder) and _message.optional:
        partial_vars[_message.variable_name] = []      # optional → partial, defaults to an empty list
        optional_variables.add(_message.variable_name)
    elif isinstance(_message, (BaseChatPromptTemplate, BaseMessagePromptTemplate)):
        input_vars.update(_message.input_variables)    # collect all required variables

There is also the validate_input_variables model_validator (chat.py:1046-1098), which does some extra processing:

# chat.py:1067-1080
for message in messages:
    if isinstance(message, (BaseMessagePromptTemplate, BaseChatPromptTemplate)):
        input_vars.update(message.input_variables)
    if isinstance(message, MessagesPlaceholder):
        if message.optional and message.variable_name not in values["partial_variables"]:
            values["partial_variables"][message.variable_name] = []
            optional_variables.add(message.variable_name)
        if message.variable_name not in input_types:
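            input_types[message.variable_name] = list[AnyMessage]   # set the type hint

A MessagesPlaceholder's variable is automatically typed as list[AnyMessage], which get_input_schema() then reflects in the Pydantic schema.

__add__ template composition

ChatPromptTemplate.__add__ accepts 5 kinds of right-hand operand (another ChatPromptTemplate, a message template, a BaseMessage, a list or tuple of messages, and a str), handled in the four branches below: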

# langchain_core/prompts/chat.py:1006-1044
def __add__(self, other: Any) -> ChatPromptTemplate:
    partials = {**self.partial_variables}
    if hasattr(other, "partial_variables") and other.partial_variables:
        partials.update(other.partial_variables)

    if isinstance(other, ChatPromptTemplate):
        # ChatPromptTemplate + ChatPromptTemplate → merge the messages
        return ChatPromptTemplate(messages=self.messages + other.messages).partial(**partials)

    if isinstance(other, (BaseMessagePromptTemplate, BaseMessage, BaseChatPromptTemplate)):
        # + a single message template or message
        return ChatPromptTemplate(messages=[*self.messages, other]).partial(**partials)

    if isinstance(other, (list, tuple)):
        # + a list of messages
        other_ = ChatPromptTemplate.from_messages(other)
        return ChatPromptTemplate(messages=self.messages + other_.messages).partial(**partials)

    if isinstance(other, str):
        # + a string → treated as a human message
        prompt = HumanMessagePromptTemplate.from_template(other)
        return ChatPromptTemplate(messages=[*self.messages, prompt]).partial(**partials)

BaseMessagePromptTemplate.__add__ (message.py:84-97) supports composition too: it first wraps itself in a ChatPromptTemplate, then calls the __add__ above:

# langchain_core/prompts/message.py:84-97
def __add__(self, other: Any) -> ChatPromptTemplate:
    prompt = ChatPromptTemplate(messages=[self])
    return prompt.__add__(other)

# Usage
from langchain_core.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate

prompt = (
    SystemMessagePromptTemplate.from_template("You are {role}")
    + HumanMessagePromptTemplate.from_template("{question}")
)
# Equivalent to ChatPromptTemplate.from_messages([("system", "You are {role}"), ("human", "{question}")])

Summary

Concept	Core function/class	Location
Input dispatch	`_convert_to_message_template`	chat.py:1407-1480
Role mapping	`_create_template_from_message_type`	chat.py:1330-1404
Base class	`BaseMessagePromptTemplate`	message.py:16
Single-text message	`BaseStringMessagePromptTemplate`	chat.py:225
Multimodal message	`_StringImageMessagePromptTemplate`	chat.py:396
Message placeholder	`MessagesPlaceholder`	chat.py:52
Automatic variable inference	`__init__` + `validate_input_variables`	chat.py:976-1098
Template composition	`__add__`	chat.py:1006 / message.py:84

Question for the next part: LangChain provides Few-Shot templates for inserting examples into a prompt, and ImagePromptTemplate for multimodal input. How do they organize and format examples internally? And how does StructuredPrompt automatically call with_structured_output?

LangChain Prompts in Depth (3): Few-Shot, Image, and Special Templates

A runnable example

from langchain_core.prompts import (
    FewShotPromptTemplate,
    FewShotChatMessagePromptTemplate,
    ChatPromptTemplate,
    PromptTemplate,
)

# ---- FewShotPromptTemplate (string version) ----
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

example_prompt = PromptTemplate.from_template("Word: {word}\nAntonym: {antonym}")

few_shot = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input.",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
)

print(few_shot.format(input="big"))
# Give the antonym of every input.
#
# Word: happy
# Antonym: sad
#
# Word: tall
# Antonym: short
#
# Word: big
# Antonym:

# ---- FewShotChatMessagePromptTemplate (chat version) ----
chat_examples = [
    {"input": "2+2", "output": "4"},
    {"input": "2+3", "output": "5"},
]

example_prompt_chat = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}"),
])

few_shot_chat = FewShotChatMessagePromptTemplate(
    examples=chat_examples,
    example_prompt=example_prompt_chat,
)

final_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a calculator"),
    few_shot_chat,
    ("human", "{input}"),
])

result = final_prompt.invoke({"input": "4+4"})
for msg in result.to_messages():
    print(f"{msg.__class__.__name__}: {msg.content}")
# SystemMessage: You are a calculator
# HumanMessage: 2+2
# AIMessage: 4
# HumanMessage: 2+3
# AIMessage: 5
# HumanMessage: 4+4

_FewShotPromptTemplateMixin: shared logic

FewShotPromptTemplate and FewShotChatMessagePromptTemplate both inherit from this mixin:

# langchain_core/prompts/few_shot.py:33-117
class _FewShotPromptTemplateMixin(BaseModel):
    examples: list[dict] | None = None              # static list of examples
    example_selector: BaseExampleSelector | None = None  # dynamic selector

    @model_validator(mode="before")
    @classmethod
    def check_examples_and_selector(cls, values: dict) -> Any:
        examples = values.get("examples")
        example_selector = values.get("example_selector")
        if examples and example_selector:
            raise ValueError("Only one of 'examples' and 'example_selector' should be provided")
        if examples is None and example_selector is None:
            raise ValueError("One of 'examples' and 'example_selector' should be provided")
        return values

    def _get_examples(self, **kwargs: Any) -> list[dict]:
        if self.examples is not None:
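            return self.examples                    # static: return as-is
        if self.example_selector is not None:
            return self.example_selector.select_examples(kwargs)  # dynamic: let the selector choose
        raise ValueError(...)

The core design is an either-or choice: supply examples (a static list) or example_selector (a dynamic selector), but not both. The check_examples_and_selector validator enforces this constraint at initialization.

FewShotPromptTemplate: the string version of Few-Shot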

# langchain_core/prompts/few_shot.py:120-253
class FewShotPromptTemplate(_FewShotPromptTemplateMixin, StringPromptTemplate):

    example_prompt: PromptTemplate                  # template for each example
    suffix: str                                     # placed after the examples
    example_separator: str = "\n\n"                 # separator
    prefix: str = ""                                # placed before the examples
    template_format: Literal["f-string", "jinja2"] = "f-string"

The format() method

# langchain_core/prompts/few_shot.py:179-205
def format(self, **kwargs: Any) -> str:
    kwargs = self._merge_partial_and_user_variables(**kwargs)
    # 1. Fetch the examples
    examples = self._get_examples(**kwargs)
    # 2. Keep only the fields that example_prompt needs
    examples = [
        {k: e[k] for k in self.example_prompt.input_variables} for e in examples
    ]
    # 3. Format each example
    example_strings = [
        self.example_prompt.format(**example) for example in examples
    ]
    # 4. Assemble: prefix + examples + suffix
    pieces = [self.prefix, *example_strings, self.suffix]
    template = self.example_separator.join([piece for piece in pieces if piece])
    # 5. Final formatting (fills the variables in prefix/suffix)
    return DEFAULT_FORMATTER_MAPPING[self.template_format](template, **kwargs)
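The assembly steps in format() can be reproduced with plain str.format. The sketch below is a simplified stand-in that assumes the f-string format and a static example list; format_few_shot is a hypothetical name, not a LangChain function.

```python
def format_few_shot(
    examples: list[dict],
    example_template: str,
    prefix: str,
    suffix: str,
    separator: str = "\n\n",
    **kwargs: str,
) -> str:
    """Format each example, join prefix + examples + suffix, then fill the rest."""
    example_strings = [example_template.format(**example) for example in examples]
    pieces = [prefix, *example_strings, suffix]
    template = separator.join(piece for piece in pieces if piece)  # skip empty pieces
    # The joined text is formatted once more, so prefix/suffix variables get filled here.
    return template.format(**kwargs)


examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]

text = format_few_shot(
    examples,
    "Word: {word}\nAntonym: {antonym}",
    prefix="Give the antonym of every input.",
    suffix="Word: {input}\nAntonym:",
    input="big",
)
print(text)
```

Note that the already formatted examples pass through the final format call as well. In this sketch, an example value that itself contains curly braces would be interpreted as a placeholder at that step, so such values would need escaping.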

ASCII flow diagram

format(input="big")
    │
    ▼
_get_examples()
    │
    ▼
examples = [{"word":"happy","antonym":"sad"}, {"word":"tall","antonym":"short"}]
    │
    ▼ example_prompt.format(**example) for each
    │
example_strings = ["Word: happy\nAntonym: sad", "Word: tall\nAntonym: short"]
    │
    ▼ join with example_separator ("\n\n")
    │
template = "Give the antonym...\n\nWord: happy\n...\n\nWord: tall\n...\n\nWord: {input}\nAntonym:"
    │
    ▼ DEFAULT_FORMATTER_MAPPING["f-string"](template, input="big")
    │
"Give the antonym...\n\nWord: happy\nAntonym: sad\n\nWord: tall\nAntonym: short\n\nWord: big\nAntonym:"

FewShotChatMessagePromptTemplate: the chat version of Few-Shot

# langchain_core/prompts/few_shot.py:255-477
class FewShotChatMessagePromptTemplate(
    BaseChatPromptTemplate, _FewShotPromptTemplateMixin
):
    input_variables: list[str] = Field(default_factory=list)
    example_prompt: BaseMessagePromptTemplate | BaseChatPromptTemplate

The format_messages() method

# langchain_core/prompts/few_shot.py:390-409
def format_messages(self, **kwargs: Any) -> list[BaseMessage]:
    examples = self._get_examples(**kwargs)
    examples = [
        {k: e[k] for k in self.example_prompt.input_variables} for e in examples
    ]
    # 每个 example → example_prompt.format_messages → 展平到一个列表
    return [
        message
        for example in examples
        for message in self.example_prompt.format_messages(**example)
    ]

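Key difference: format_messages returns a list[BaseMessage]. Each example may produce several messages (for example, a question and its answer), and all of them are flattened into the result list.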
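
The flattening is easiest to see with plain tuples. This is a minimal sketch under stated assumptions: flatten_examples is a hypothetical helper, messages are modelled as (role, text) pairs rather than BaseMessage objects, and the example data is invented.

```python
def flatten_examples(examples, example_prompt):
    """example_prompt is a list of (role, template) pairs.

    Returns one flat list of (role, text) pairs, mirroring the nested
    comprehension in format_messages.
    """
    return [
        (role, template.format(**example))
        for example in examples
        for role, template in example_prompt
    ]


examples = [
    {"input": "2+2", "output": "4"},
    {"input": "2+3", "output": "5"},
]
example_prompt = [("human", "{input}"), ("ai", "{output}")]
messages = flatten_examples(examples, example_prompt)
print(messages)
```

Two examples with a two-message example_prompt give four entries in a single list, not two nested pairs.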

Comparison table

	FewShotPromptTemplate	FewShotChatMessagePromptTemplate
Base class	`StringPromptTemplate`	`BaseChatPromptTemplate`
example_prompt type	`PromptTemplate`	`BaseMessagePromptTemplate` / `BaseChatPromptTemplate`
Core method	`format()` → `str`	`format_messages()` → `list[BaseMessage]`
Output	Concatenated string	List of messages
Use case	Traditional LLM (text completion)	ChatModel (message format)
prefix/suffix	Yes	No (handled by the enclosing ChatPromptTemplate)

ImagePromptTemplate

# langchain_core/prompts/image.py:16-158
class ImagePromptTemplate(BasePromptTemplate[ImageURL]):
    template: dict = Field(default_factory=dict)     # Note: a dict, not a string!
    template_format: PromptTemplateFormat = "f-string"

template is a dict, for example {"url": "{image_url}"}. format() returns an ImageURL (a TypedDict):

# langchain_core/prompts/image.py:84-132
def format(self, **kwargs: Any) -> ImageURL:
    formatted = {}
    for k, v in self.template.items():
        if isinstance(v, str):
            formatted[k] = DEFAULT_FORMATTER_MAPPING[self.template_format](v, **kwargs)
        else:
            formatted[k] = v
    url = kwargs.get("url") or formatted.get("url")
    detail = kwargs.get("detail") or formatted.get("detail")
    if not url:
        raise ValueError("Must provide url.")
    output: ImageURL = {"url": url}
    if detail:
        output["detail"] = cast("Literal['auto', 'low', 'high']", detail)
    return output
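
To see the precedence between keyword arguments and the formatted template, here is a standalone sketch of the same logic. format_image is a hypothetical helper; it returns a plain dict rather than the ImageURL TypedDict and uses str.format as the only formatter.

```python
def format_image(template: dict, **kwargs) -> dict:
    """Hypothetical re-creation of the format() logic shown above."""
    # Format string values; leave every other value untouched.
    formatted = {
        key: value.format(**kwargs) if isinstance(value, str) else value
        for key, value in template.items()
    }
    # An explicit url/detail keyword wins over the template value.
    url = kwargs.get("url") or formatted.get("url")
    detail = kwargs.get("detail") or formatted.get("detail")
    if not url:
        raise ValueError("Must provide url.")
    output = {"url": url}
    if detail:
        output["detail"] = detail
    return output


from_template = format_image(
    {"url": "{image_url}", "detail": "low"},
    image_url="https://example.com/photo.jpg",
)
overridden = format_image(
    {"url": "{image_url}"},
    image_url="https://example.com/a.jpg",
    url="https://example.com/b.jpg",
)
print(from_template, overridden)
```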

Multimodal message example

from langchain_core.prompts import HumanMessagePromptTemplate

# Build a multimodal message template containing text and an image
multimodal = HumanMessagePromptTemplate.from_template([
    {"type": "text", "text": "Describe this image: {description_hint}"},
    {"type": "image_url", "image_url": "{image_url}"},
])

msg = multimodal.format(
    description_hint="focus on the colors",
    image_url="https://example.com/photo.jpg",
)
print(msg.content)
# [
#   {'type': 'text', 'text': 'Describe this image: focus on the colors'},
#   {'type': 'image_url', 'image_url': {'url': 'https://example.com/photo.jpg'}}
# ]

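Internal flow: when from_template receives a list, it creates a PromptTemplate for each text block and an ImagePromptTemplate for each image block, and stores them in the _StringImageMessagePromptTemplate.prompt list. At format time it walks the list, formats each entry separately, and combines the results into content blocks.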

DictPromptTemplate

# langchain_core/prompts/dict.py:18-151
class DictPromptTemplate(RunnableSerializable[dict, dict]):
    """Note: this is not a BasePromptTemplate! It is a RunnableSerializable[dict, dict]."""
    template: dict[str, Any]
    template_format: Literal["f-string", "mustache"]

DictPromptTemplate is a special case: it does not produce a PromptValue; it is a dict → dict Runnable. It recursively searches the dict values for template variables and substitutes them:

# langchain_core/prompts/dict.py:100-115
def _get_input_variables(template: dict, template_format) -> list[str]:
    input_variables = []
    for v in template.values():
        if isinstance(v, str):
            input_variables += get_template_variables(v, template_format)
        elif isinstance(v, dict):
            input_variables += _get_input_variables(v, template_format)  # recurse
        elif isinstance(v, (list, tuple)):
            for x in v:
                if isinstance(x, str):
                    input_variables += get_template_variables(x, template_format)
                elif isinstance(x, dict):
                    input_variables += _get_input_variables(x, template_format)  # recurse
    return list(set(input_variables))

It is mainly used by _StringImageMessagePromptTemplate.from_template to handle content blocks of arbitrary dict types.
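
The excerpt shows only the variable-collection half. A rough standalone sketch of both halves (collecting variables and filling them in) might look like this; collect_variables and fill are hypothetical names, only f-string syntax is handled, and unlike the excerpt this version also recurses into nested lists.

```python
from string import Formatter


def collect_variables(template):
    """Return the set of f-string variable names found anywhere in the structure."""
    found = set()
    if isinstance(template, str):
        found |= {name for _, name, _, _ in Formatter().parse(template)
                  if name is not None}
    elif isinstance(template, dict):
        for value in template.values():
            found |= collect_variables(value)
    elif isinstance(template, (list, tuple)):
        for item in template:
            found |= collect_variables(item)
    return found


def fill(template, **kwargs):
    """Recursively format every string; other values pass through unchanged."""
    if isinstance(template, str):
        return template.format(**kwargs)
    if isinstance(template, dict):
        return {key: fill(value, **kwargs) for key, value in template.items()}
    if isinstance(template, (list, tuple)):
        return [fill(item, **kwargs) for item in template]
    return template


template = {"type": "audio", "audio": {"url": "{audio_url}", "tags": ["{lang}", 16000]}}
variables = sorted(collect_variables(template))
filled = fill(template, audio_url="https://example.com/a.wav", lang="en")
print(variables, filled)
```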

StructuredPrompt (@beta)

# langchain_core/prompts/structured.py:28-183
@beta()
class StructuredPrompt(ChatPromptTemplate):
    schema_: dict | type                        # schema for the structured output
    structured_output_kwargs: dict[str, Any] = Field(default_factory=dict)

StructuredPrompt inherits from ChatPromptTemplate; the key change is that it overrides the pipe() method:

# langchain_core/prompts/structured.py:149-183
def pipe(self, *others, name=None) -> RunnableSerializable:
    if (others and isinstance(others[0], BaseLanguageModel)) or hasattr(
        others[0], "with_structured_output"
    ):
        return RunnableSequence(
            self,
            others[0].with_structured_output(      # calls with_structured_output automatically!
                self.schema_, **self.structured_output_kwargs
            ),
            *others[1:],
            name=name,
        )
    raise NotImplementedError("Structured prompts need to be piped to a language model.")

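When you use | to connect a StructuredPrompt to a model, it does not simply pass the PromptValue along: it automatically replaces the model with model.with_structured_output(schema_). You therefore do not need to call with_structured_output yourself.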

from pydantic import BaseModel
from langchain_core.prompts import StructuredPrompt

class Answer(BaseModel):
    name: str
    score: int

prompt = StructuredPrompt(
    [("system", "Extract info"), ("human", "{text}")],
    schema_=Answer,
)

# Assuming a model named model exists
# chain = prompt | model
# is equivalent to:
# chain = ChatPromptTemplate(...) | model.with_structured_output(Answer)

__or__ (structured.py:138-147) is overridden as well and delegates to pipe:

def __or__(self, other) -> RunnableSerializable:
    return self.pipe(other)
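
The delegation pattern can be tried without a real model. Everything in this sketch is a toy: ToyModel and ToyStructuredPrompt are hypothetical classes, and a plain list stands in for RunnableSequence.

```python
class ToyModel:
    """Stand-in for a chat model; records the schema it was bound to."""

    def __init__(self, schema=None):
        self.schema = schema

    def with_structured_output(self, schema, **kwargs):
        # Return a new object bound to the schema, leaving self untouched.
        return ToyModel(schema=schema)


class ToyStructuredPrompt:
    """Stand-in for StructuredPrompt: swaps the model during piping."""

    def __init__(self, schema):
        self.schema = schema

    def pipe(self, *others):
        if others and hasattr(others[0], "with_structured_output"):
            # A list plays the role of RunnableSequence here.
            return [self, others[0].with_structured_output(self.schema), *others[1:]]
        raise NotImplementedError(
            "Structured prompts need to be piped to a language model."
        )

    def __or__(self, other):
        return self.pipe(other)


plain_model = ToyModel()
chain = ToyStructuredPrompt(schema={"name": "str"}) | plain_model
print(chain[1].schema)
```

The original model object stays unbound; only the copy placed in the chain carries the schema.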

Summary

Template type	Core class	Input	Output
String Few-Shot	`FewShotPromptTemplate`	`examples` + `example_prompt`	`str`
Chat Few-Shot	`FewShotChatMessagePromptTemplate`	`examples` + `example_prompt`	`list[BaseMessage]`
Image	`ImagePromptTemplate`	`template: dict`	`ImageURL`
Dict	`DictPromptTemplate`	`template: dict`	`dict` (formatted recursively)
Structured	`StructuredPrompt`	`messages` + `schema_`	Automatically chains `with_structured_output`
Shared mixin	`_FewShotPromptTemplateMixin`	`examples` / `example_selector`	Exactly one of the two

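Questions for the next part: starting from the user's call to invoke({"topic": "AI"}) and ending when the ChatModel receives the message list, how many steps happen in between? What safety mechanisms guard variable extraction? And how does the _convert_input bridge between Prompt and ChatModel work?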

LangChain Prompts Deep Dive (Part 4): Source Architecture, the Full Path from format to invoke

A runnable example: walking the full path by hand

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a {role}"),
    ("human", "{question}"),
])

# Manually simulate each step of invoke
input_dict = {"role": "translator", "question": "How do you say hello?"}

# Step 1: _validate_input (base.py:159-193)
validated = prompt._validate_input(input_dict)
print(f"Validated: {validated}")

# Step 2: format_messages (chat.py:1169-1195)
messages = prompt.format_messages(**validated)
print(f"Messages: {messages}")

# Step 3: format_prompt → ChatPromptValue (chat.py:722-731)
prompt_value = prompt.format_prompt(**validated)
print(f"PromptValue type: {type(prompt_value).__name__}")

# Step 4: to_messages() (what the ChatModel consumes)
final_messages = prompt_value.to_messages()
print(f"Final messages: {final_messages}")

# In real use, invoke does all of this in one call:
result = prompt.invoke(input_dict)
assert result.to_messages() == final_messages
print("invoke result matches ✓")

Output (message reprs are abbreviated to their content field; depending on the version they may also show fields such as additional_kwargs):

Validated: {'role': 'translator', 'question': 'How do you say hello?'}
Messages: [SystemMessage(content='You are a translator'), HumanMessage(content='How do you say hello?')]
PromptValue type: ChatPromptValue
Final messages: [SystemMessage(content='You are a translator'), HumanMessage(content='How do you say hello?')]
invoke result matches ✓

Walking through the BasePromptTemplate.invoke source

# langchain_core/prompts/base.py:206-229
def invoke(
    self, input: dict, config: RunnableConfig | None = None, **kwargs: Any
) -> PromptValue:
    config = ensure_config(config)                  # make sure config is not None
    if self.metadata:
        config["metadata"] = {**config["metadata"], **self.metadata}
    if self.tags:
        config["tags"] += self.tags
    return self._call_with_config(
        self._format_prompt_with_error_handling,     # the core function
        input,
        config,
        run_type="prompt",                           # run type for LangSmith tracing
        serialized=self._serialized,
    )

_call_with_config is a general-purpose method provided by the Runnable framework (covered in the Runnables series); it handles callback management and exception handling. The real work happens in _format_prompt_with_error_handling:

# langchain_core/prompts/base.py:195-197
def _format_prompt_with_error_handling(self, inner_input: dict) -> PromptValue:
    inner_input_ = self._validate_input(inner_input)    # validate the input
    return self.format_prompt(**inner_input_)             # format

_validate_input: input validation

# langchain_core/prompts/base.py:159-193
def _validate_input(self, inner_input: Any) -> dict:
    if not isinstance(inner_input, dict):
        if len(self.input_variables) == 1:
            var_name = self.input_variables[0]
            inner_input_ = {var_name: inner_input}      # single-variable shortcut!
        else:
            raise TypeError(f"Expected mapping type as input...")
    else:
        inner_input_ = inner_input

    missing = set(self.input_variables).difference(inner_input_)
    if missing:
        example_key = next(iter(missing))               # any one missing name, used in the hint below
        raise KeyError(
            f"Input to {self.__class__.__name__} is missing variables {missing}. "
            f"Expected: {self.input_variables}"
            f" Received: {list(inner_input_.keys())}"
            f"\nNote: if you intended {{{example_key}}} to be part of the string"
            " and not a variable, please escape it with double curly braces..."
        )
    return inner_input_

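Two important details:

Single-variable shortcut: if the template has exactly one variable, you can pass the value directly instead of a dict.

prompt = ChatPromptTemplate.from_messages([("human", "{question}")])
prompt.invoke("What is Python?")  # equivalent to prompt.invoke({"question": "What is Python?"})

Friendly error message: when a variable is missing, the error asks whether you forgot to escape curly braces.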
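
Both behaviours fit in a few lines of plain Python. This sketch is a simplified imitation, not the library method: validate_input is a hypothetical free function that takes the variable list explicitly, and the error wording is shortened.

```python
def validate_input(input_variables, inner_input):
    """Hypothetical imitation of the two behaviours of _validate_input."""
    if not isinstance(inner_input, dict):
        if len(input_variables) != 1:
            raise TypeError("Expected mapping type as input")
        # Single-variable shortcut: wrap the bare value in a dict.
        inner_input = {input_variables[0]: inner_input}

    missing = set(input_variables) - set(inner_input)
    if missing:
        example_key = sorted(missing)[0]
        raise KeyError(
            f"missing variables {sorted(missing)}; if {{{example_key}}} was meant "
            f"literally, escape it as {{{{{example_key}}}}}"
        )
    return inner_input


wrapped = validate_input(["question"], "What is Python?")

try:
    validate_input(["role", "question"], {"role": "translator"})
    error_message = None
except KeyError as exc:
    error_message = exc.args[0]

print(wrapped)
print(error_message)
```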

The two format_prompt paths

Path A: StringPromptTemplate → StringPromptValue

# langchain_core/prompts/string.py:321-330
class StringPromptTemplate(BasePromptTemplate, ABC):
    def format_prompt(self, **kwargs: Any) -> PromptValue:
        return StringPromptValue(text=self.format(**kwargs))

PromptTemplate.format() → formatted string → StringPromptValue(text=...)

Path B: BaseChatPromptTemplate → ChatPromptValue

# langchain_core/prompts/chat.py:722-731
class BaseChatPromptTemplate(BasePromptTemplate, ABC):
    def format_prompt(self, **kwargs: Any) -> ChatPromptValue:
        messages = self.format_messages(**kwargs)
        return ChatPromptValue(messages=messages)

ChatPromptTemplate.format_messages() → list of messages → ChatPromptValue(messages=...)

ChatPromptTemplate.format_messages in detail

# langchain_core/prompts/chat.py:1169-1195
def format_messages(self, **kwargs: Any) -> list[BaseMessage]:
    kwargs = self._merge_partial_and_user_variables(**kwargs)   # merge partial variables
    result = []
    for message_template in self.messages:
        if isinstance(message_template, BaseMessage):
            result.extend([message_template])                   # a BaseMessage passes through unchanged
        elif isinstance(message_template, (BaseMessagePromptTemplate, BaseChatPromptTemplate)):
            message = message_template.format_messages(**kwargs) # format
            result.extend(message)                               # flatten into the result
        else:
            raise ValueError(f"Unexpected input: {message_template}")
    return result

ASCII full call chain

invoke({"role": "translator", "question": "Hello"})
    │
    ▼
_call_with_config(
    _format_prompt_with_error_handling, input, config
)
    │
    ▼
_format_prompt_with_error_handling(inner_input)
    ├── _validate_input(inner_input)             ← validation + single-variable shortcut
    │       returns validated_dict
    │
    └── format_prompt(**validated_dict)
            │
            ▼  [BaseChatPromptTemplate path]
        format_messages(**kwargs)
            │
            ├── _merge_partial_and_user_variables  ← merges partials (Callables are invoked here)
            │
            └── for message_template in self.messages:
                    │
                    ├── BaseMessage ──────────► appended to result as-is
                    │
                    ├── SystemMessagePromptTemplate
                    │       └── format_messages(**kwargs)
                    │               └── [format(**kwargs)]
                    │                       └── prompt.format(**kwargs)
                    │                               └── DEFAULT_FORMATTER_MAPPING["f-string"]
                    │                                       ("You are a {role}", role="translator")
                    │                                       → "You are a translator"
                    │                       → SystemMessage(content="You are a translator")
                    │
                    ├── HumanMessagePromptTemplate
                    │       └── (same flow as above) → HumanMessage(content="Hello")
                    │
                    └── MessagesPlaceholder
                            └── format_messages(**kwargs)
                                    └── kwargs["history"] → convert_to_messages
                                            → [HumanMessage(...), AIMessage(...)]
            │
            ▼
        ChatPromptValue(messages=[SystemMessage(...), HumanMessage(...), ...])
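
The partial-merging step at the top of the chain is small enough to sketch on its own. The function below is a hypothetical standalone version of that idea: callables among the partial variables are invoked at format time, and user-supplied values win on key collisions. The variable names and values are invented.

```python
def merge_partial_and_user_variables(partial_variables, **kwargs):
    """Resolve callable partials, then let user kwargs override them."""
    resolved = {
        key: value() if callable(value) else value
        for key, value in partial_variables.items()
    }
    return {**resolved, **kwargs}


call_count = 0


def current_date():
    # Counts calls to show that evaluation happens at merge time, not at definition.
    global call_count
    call_count += 1
    return "2026-03-03"


partials = {"date": current_date, "role": "translator"}
merged = merge_partial_and_user_variables(partials, question="Hello")
overridden = merge_partial_and_user_variables(partials, role="reviewer")
print(merged, overridden, call_count)
```

Deferring the call means a partial such as a timestamp is computed fresh on every format, not frozen when the template was built.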

The template variable extraction mechanism

get_template_variables (string.py:254-306) uses a different strategy for each format:

# langchain_core/prompts/string.py:254-306
def get_template_variables(template: str, template_format: str) -> list[str]:
    if template_format == "jinja2":
        input_variables = _get_jinja2_variables_from_template(template)
        # uses jinja2's meta.find_undeclared_variables

    elif template_format == "f-string":
        input_variables = {
            v for _, v, _, _ in Formatter().parse(template) if v is not None
        }
        # uses string.Formatter().parse() from the Python standard library

    elif template_format == "mustache":
        input_variables = mustache_template_vars(template)
        # parses the token stream

    else:
        # without this branch, input_variables would be unbound below
        raise ValueError(f"Unsupported template format: {template_format}")

    return sorted(input_variables)

f-string parsing

from string import Formatter

# Formatter().parse() yields (literal_text, field_name, format_spec, conversion) tuples
list(Formatter().parse("Hello {name}, today is {date}"))
# [('Hello ', 'name', '', None), (', today is ', 'date', '', None)]
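
A few edge cases are worth checking by hand: trailing literal text, escaped braces, format specs, and repeated names. The helper below, fstring_variables, is a hypothetical wrapper around the same set comprehension; it is a sketch of the extraction step only and performs no safety checks.

```python
from string import Formatter


def fstring_variables(template):
    """Sorted, de-duplicated field names; None entries are literal-only chunks."""
    return sorted(
        {name for _, name, _, _ in Formatter().parse(template) if name is not None}
    )


# Trailing literal text yields a chunk whose field_name is None, hence the filter.
simple = fstring_variables("Hello {name}, today is {date}!")
# Doubled braces are literals; the format spec (>5) is not part of the name.
escaped = fstring_variables('{{"k": "{value:>5}"}}')
# A name that appears twice is reported once.
repeated = fstring_variables("{a} and {a} and {b}")
print(simple, escaped, repeated)
```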

jinja2 parsing

# langchain_core/prompts/string.py:98-107
def _get_jinja2_variables_from_template(template: str) -> set[str]:
    env = SandboxedEnvironment()
    ast = env.parse(template)
    return meta.find_undeclared_variables(ast)    # finds every undeclared variable automatically

mustache parsing

# langchain_core/prompts/string.py:123-150
def mustache_template_vars(template: str) -> set[str]:
    variables: set[str] = set()
    section_depth = 0
    for type_, key in mustache.tokenize(template):
        if type_ == "end":
            section_depth -= 1
        elif type_ in {"variable", "section", "inverted section", "no escape"} \
             and key != "." and section_depth == 0:
            variables.add(key.split(".")[0])     # keep only the top-level variable
        if type_ in {"section", "inverted section"}:
            section_depth += 1
    return variables

Safety mechanisms

f-string: blocking attribute access and numeric arguments

# langchain_core/prompts/string.py:282-304
if template_format == "f-string":
    for var in input_variables:
        # Block dangerous patterns such as "obj.attr" and "obj[0]"
        if "." in var or "[" in var or "]" in var:
            raise ValueError(
                f"Invalid variable name {var!r} in f-string template. "
                f"Variable names cannot contain attribute access (.) or indexing ([])."
            )

        # Block all-digit variable names (they would be treated as positional arguments)
        if var.isdigit():
            raise ValueError(
                f"Invalid variable name {var!r} in f-string template. "
                f"Variable names cannot be all digits..."
            )

Why? Because a template such as {obj.__class__.__init__.__globals__} can leak global variables. LangChain blocks attribute access at the source, when variables are extracted.
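
The risk is easy to demonstrate with plain str.format, and the guard is just as easy to imitate. In this sketch Config and its secret value are invented, and check_fstring_variables and is_rejected are hypothetical helpers that mirror the two checks shown in the excerpt with shortened messages.

```python
from string import Formatter


class Config:
    secret = "s3cr3t"


# Plain str.format follows attribute access written inside the braces.
leaked = "{cfg.secret}".format(cfg=Config())


def check_fstring_variables(template):
    """Reject attribute access, indexing, and all-digit names before formatting."""
    names = {name for _, name, _, _ in Formatter().parse(template) if name is not None}
    for name in names:
        if "." in name or "[" in name or "]" in name:
            raise ValueError(f"Invalid variable name {name!r}: no attribute access or indexing")
        if name.isdigit():
            raise ValueError(f"Invalid variable name {name!r}: cannot be all digits")
    return sorted(names)


def is_rejected(template):
    """True when the guard refuses the template."""
    try:
        check_fstring_variables(template)
    except ValueError:
        return True
    return False


print(leaked)
print(is_rejected("{cfg.secret}"), is_rejected("{items[0]}"), is_rejected("{0}"))
print(check_fstring_variables("Tell me a joke about {topic}"))
```

Because the check runs on the parsed field names, a dangerous template is refused before any user value is substituted into it.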

StrictFormatter: rejecting positional arguments

# langchain_core/utils/formatting.py:8-48
class StrictFormatter(Formatter):
    def vformat(self, format_string: str, args: Sequence, kwargs: Mapping[str, Any]) -> str:
        if len(args) > 0:
            raise ValueError(
                "No arguments should be provided, "
                "everything should be passed as keyword arguments."
            )
        return super().vformat(format_string, args, kwargs)

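StrictFormatter ensures every variable is passed as a keyword argument: vformat raises as soon as any positional argument is supplied, so positional fields such as {0} or {1} can never be filled.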
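
The same idea works as a tiny subclass of the standard library's string.Formatter. StrictKeywordFormatter is a hypothetical name chosen to avoid confusion with the real class; the sketch covers only the vformat override shown above.

```python
from string import Formatter


class StrictKeywordFormatter(Formatter):
    """Formatter that refuses positional arguments."""

    def vformat(self, format_string, args, kwargs):
        # Formatter.format() forwards *args here as a tuple.
        if len(args) > 0:
            raise ValueError(
                "No arguments should be provided, "
                "everything should be passed as keyword arguments."
            )
        return super().vformat(format_string, args, kwargs)


formatter = StrictKeywordFormatter()
greeting = formatter.format("Hello {name}", name="Ada")

try:
    formatter.format("Hello {0}", "Ada")
    positional_rejected = False
except ValueError:
    positional_rejected = True

print(greeting, positional_rejected)
```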

jinja2: SandboxedEnvironment

# langchain_core/prompts/string.py:67-70
def jinja2_formatter(template: str, /, **kwargs: Any) -> str:
    return SandboxedEnvironment().from_string(template).render(**kwargs)

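It uses Jinja2's sandboxed environment, which restricts attribute and method access. Even so, the documentation repeatedly warns: do not accept jinja2 templates from untrusted sources.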

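Summary of the safety layers

Layer	Mechanism	Location
Variable extraction	Blocks `.attr`, `[idx]`, and all-digit variable names	string.py:282-304
Formatting	`StrictFormatter` rejects positional arguments	formatting.py:42-48
jinja2	`SandboxedEnvironment` sandbox	string.py:70
mustache	Supports only simple variable substitution	string.py:110-120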

_convert_input: the link between Prompt and ChatModel

Once a Prompt has produced a PromptValue, that value is passed to the ChatModel. The ChatModel's _convert_input serves as the single entry point:

# langchain_core/language_models/chat_models.py:375-386
def _convert_input(self, model_input: LanguageModelInput) -> PromptValue:
    if isinstance(model_input, PromptValue):
        return model_input                                    # produced by a Prompt; used as-is
    if isinstance(model_input, str):
        return StringPromptValue(text=model_input)            # bare string
    if isinstance(model_input, Sequence):
        return ChatPromptValue(messages=convert_to_messages(model_input))  # list of messages
    raise ValueError(f"Invalid input type {type(model_input)}...")

The type definition of LanguageModelInput:

# langchain_core/language_models/base.py:122
LanguageModelInput = PromptValue | str | Sequence[MessageLikeRepresentation]

All three input kinds are accepted, but going through a Prompt template pipeline is the most standard path.
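
One detail in the dispatch order is easy to miss: a str is itself a Sequence, so the string check has to come before the Sequence check. The sketch below uses toy dataclasses (ToyStringValue, ToyChatValue) and a hypothetical convert_input function, and skips the convert_to_messages step entirely.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class ToyStringValue:
    text: str


@dataclass
class ToyChatValue:
    messages: list


def convert_input(model_input):
    """Toy version of the three-way dispatch in _convert_input."""
    if isinstance(model_input, (ToyStringValue, ToyChatValue)):
        return model_input                      # already a prompt value
    if isinstance(model_input, str):            # must precede the Sequence check
        return ToyStringValue(text=model_input)
    if isinstance(model_input, Sequence):
        return ToyChatValue(messages=list(model_input))
    raise ValueError(f"Invalid input type {type(model_input)}")


from_string = convert_input("Hello")
from_list = convert_input([("human", "Hello")])
passthrough = convert_input(from_list)
print(isinstance("Hello", Sequence), from_string, from_list)
```

If the two checks were swapped, a bare string would be treated as a sequence of one-character messages.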

The full data flow at a glance

User input dict
    │
    ▼
BasePromptTemplate.invoke(input)
    ├── ensure_config(config)
    └── _call_with_config(_format_prompt_with_error_handling, ...)
            │
            ▼
        _validate_input(inner_input)
            │   ├── single-variable shortcut: non-dict → {var: value}
            │   └── missing variables → KeyError (with an escaping hint)
            │
            ▼
        format_prompt(**validated_input)
            │
    ┌───────┴───────┐
    │               │
StringPrompt    ChatPrompt
Template        Template
    │               │
    ▼               ▼
format()      format_messages()
    │           ├── merge partial vars
    │           └── for each message_template:
    │                   ├── BaseMessage → passed through
    │                   ├── *MessagePromptTemplate
    │                   │       └── prompt.format(**kwargs)
    │                   │               └── DEFAULT_FORMATTER_MAPPING[fmt]
    │                   └── MessagesPlaceholder
    │                           └── convert_to_messages(value)
    │               │
    ▼               ▼
StringPrompt   ChatPromptValue
Value              │
    │               │
    └───────┬───────┘
            │
            ▼  PromptValue
            │
    ChatModel._convert_input(prompt_value)
            │   → prompt_value (passed through unchanged)
            │
            ▼
    prompt_value.to_messages()
            │
            ▼
    [SystemMessage, HumanMessage, ...]
            │
            ▼
    ChatModel._generate(messages, ...)
            │
            ▼
        AIMessage
            │
            ▼
    OutputParser.invoke(ai_message)
            │
            ▼
      Structured result (str / dict / Pydantic Model)

Overview of the five series

Series	Topic	Core questions
Messages	Message system	6 message types, content blocks, tool calls
Runnables	Composable execution framework	invoke/batch/stream, RunnableSequence, the `|` pipe
ChatModel	Model invocation	_generate, streaming, caching, rate limiting
Outputs+Parsers	Output parsing	Generation → AIMessage → structured data
Prompts	Template system	dict → PromptValue → messages → ChatModel

How the five series close the loop:

  User dict
      │
      ▼
  ┌───────────┐    PromptValue    ┌───────────┐    AIMessage     ┌──────────┐
  │ Prompts   │ ─────────────────►│ ChatModel │ ────────────────►│ Parsers  │
  │ templates │                   │ inference │                  │ parsing  │
  └───────────┘                   └───────────┘                  └──────────┘
      ▲                                 │                             │
      │                                 │                             ▼
      │           Messages              │                     Structured result
      │    (messages carry the data)    │
      │                                 │
      └─── Runnables (orchestration) ───┘

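Messages is the data structure that runs through everything. Runnables is the pipeline framework that strings all the components together. Prompts builds the input side, ChatModel performs inference, and Parsers handles the output side.

That completes the breakdown of all five major modules in LangChain's core layer.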
