[Design and Implementation of LangChain Language Model Components] Prompt Templates for Completion Models

JaydenAI · Published 2026-03-05 09:01:02

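As we know, the prompt is the most important input when calling a language model, and its quality directly determines the quality of the answer, so the importance of prompts is hard to overstate. In LangChain a prompt is represented by the PromptValue type, and we usually generate it from a predefined prompt template. A prompt template is itself a Runnable object; because the PromptValue it produces is exactly what a model takes as input, the two compose naturally into an LCEL chain. LangChain's prompt templates involve quite a few types, and the relationships among them are largely captured in the UML class diagram below. This article covers prompt templates for Completion models; the next one will cover prompt templates for Chat models.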

[Figure: UML class diagram of LangChain's prompt template types]

1. BasePromptTemplate

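The abstract class BasePromptTemplate inherits from RunnableSerializable. Its input is a dictionary supplying the values that replace the placeholders when the template is formatted, and its output is a PromptValue object representing the prompt. BasePromptTemplate delegates the actual formatting to its subclasses through the abstract method format_prompt, which the overridden invoke/ainvoke methods call. BasePromptTemplate is a generic type whose type parameter FormatOutputType is the type produced by the abstract method format.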

FormatOutputType = TypeVar("FormatOutputType")

class BasePromptTemplate(
    RunnableSerializable[dict, PromptValue], ABC, Generic[FormatOutputType]):

    @override
    def invoke(
        self, input: dict, config: RunnableConfig | None = None, **kwargs: Any
    ) -> PromptValue:

    @override
    async def ainvoke(
        self, input: dict, config: RunnableConfig | None = None, **kwargs: Any
    ) -> PromptValue:

    @abstractmethod
    def format_prompt(self, **kwargs: Any) -> PromptValue

    @abstractmethod
    def format(self, **kwargs: Any) -> FormatOutputType
    async def aformat(self, **kwargs: Any) -> FormatOutputType
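The delegation described above (invoke hands the input dictionary to format_prompt, which subclasses implement on top of format) can be sketched with the standard library alone. This is a simplified illustration, not LangChain's actual code: SketchPromptValue, SketchBaseTemplate, and SketchStringTemplate are hypothetical names, and the real invoke also handles configuration, validation, and tracing.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, TypeVar

FormatOutputType = TypeVar("FormatOutputType")


@dataclass
class SketchPromptValue:
    """Hypothetical stand-in for PromptValue; it only carries text."""

    text: str


class SketchBaseTemplate(ABC, Generic[FormatOutputType]):
    """Hypothetical, heavily simplified counterpart of BasePromptTemplate."""

    @abstractmethod
    def format(self, **kwargs: Any) -> FormatOutputType:
        """Produce the subclass-specific formatted result."""

    @abstractmethod
    def format_prompt(self, **kwargs: Any) -> SketchPromptValue:
        """Wrap the formatted result in a prompt value."""

    def invoke(self, input: dict) -> SketchPromptValue:
        # The input dictionary is unpacked into keyword arguments.
        return self.format_prompt(**input)


class SketchStringTemplate(SketchBaseTemplate[str]):
    """Concrete subclass whose format output type is str."""

    def __init__(self, template: str) -> None:
        self.template = template

    def format(self, **kwargs: Any) -> str:
        return self.template.format(**kwargs)

    def format_prompt(self, **kwargs: Any) -> SketchPromptValue:
        return SketchPromptValue(text=self.format(**kwargs))


value = SketchStringTemplate("Tell me about {topic}.").invoke({"topic": "LCEL"})
print(value.text)  # Tell me about LCEL.
```

The base class never needs to know how formatting works; it only fixes the contract that a dictionary goes in and a prompt value comes out.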

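Next, let's look at the fields and properties that BasePromptTemplate defines. A prompt template declares a set of variables through placeholders. Some variables must be given a value at formatting time, and the input_variables field lists their names. Others may be omitted at formatting time (for example, the variable behind a MessagesPlaceholder), and their names are stored in the optional_variables field. The input_types field declares the variable types (string by default). Besides supplying values at formatting time, we can also pre-fill variables through partial_variables. The metadata and tags fields are attached automatically to the tracing data.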

class BasePromptTemplate(
    RunnableSerializable[dict, PromptValue], ABC, Generic[FormatOutputType]):

    input_variables: list[str]
    optional_variables: list[str] = Field(default=[])
    input_types: builtins.dict[str, Any] = Field(default_factory=dict, exclude=True)
    partial_variables: Mapping[str, Any] = Field(default_factory=dict)

    metadata: builtins.dict[str, Any] | None = None
    tags: list[str] | None = None

    output_parser: BaseOutputParser | None = None
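To make the interplay between required variables and pre-filled variables concrete, here is a small standard-library sketch. The helper names check_required and merge_variables are hypothetical, and the behavior shown (zero-argument callables are resolved at formatting time, call-time values win over pre-filled ones) is an assumption about how such merging is typically done rather than a copy of LangChain's implementation.

```python
from collections.abc import Mapping
from typing import Any


def check_required(input_variables: list[str], provided: Mapping[str, Any]) -> None:
    """Raise KeyError if a required variable has no value (hypothetical helper)."""
    missing = set(input_variables) - set(provided)
    if missing:
        raise KeyError(f"Missing variables: {sorted(missing)}")


def merge_variables(
    partial_variables: Mapping[str, Any], user_variables: Mapping[str, Any]
) -> dict[str, Any]:
    """Resolve pre-filled values, then let call-time values override them."""
    resolved = {
        name: (value() if callable(value) else value)
        for name, value in partial_variables.items()
    }
    return {**resolved, **user_variables}


input_variables = ["topic"]  # must be supplied by the caller
partial_variables = {"adjective": lambda: "humorous"}  # pre-filled, resolved lazily
user_input = {"topic": "quantum mechanics"}

check_required(input_variables, user_input)
values = merge_variables(partial_variables, user_input)
text = "Write a {adjective} introduction to {topic}.".format(**values)
print(text)  # Write a humorous introduction to quantum mechanics.
```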

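The output_parser field of BasePromptTemplate specifies the BaseOutputParser object used to parse the language model's output. This may seem odd: BasePromptTemplate produces the model's input, so why would it define a field for processing the model's output? In LangChain's design philosophy, the prompt template is the bridge between the input data and the expected structure. Defining output_parser on BasePromptTemplate rests on the following deeper reasons: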

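End-to-end type safety: including the OutputParser in the prompt template lets the whole LCEL chain be type-checked as a closed loop. Without this member, a prompt template can only declare the string format it produces and cannot declare what data structure it expects the LLM to return.
Automatic instruction injection: many OutputParsers can generate formatting instructions themselves. Once a parser is bound to a prompt template, its get_format_instructions method can be called to produce text such as "Return JSON that contains the 'name' and 'age' fields...". Defining the two together makes it possible to place the "incantation" the parser requires into the generated prompt automatically.
Support for with_types and automatic documentation: deployment tools such as LangServe need to know what an API returns. A template that carries an OutputParser lets the output type of the chain be inferred.
Logical consistency: the prompt template is the source of the output, and the parsing logic of an OutputParser usually depends heavily on the prompt's content. If the prompt asks the model to return CSV, we must pair it with a CSVOutputParser. Binding them together follows the principle of high cohesion: when you swap the prompt template, you usually need to swap or adjust the matching parsing logic as well.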

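In short, binding an OutputParser to the prompt template makes the template self-describing: it not only tells the LLM what to do, but also specifies the structure the result should have.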

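BasePromptTemplate also defines three useful methods. The partial method returns a copy of the template in which the given variables have been added to, or overridden in, the partial_variables field. To support serialization, the dict method converts the template into a dictionary: it passes the specified keyword arguments through to Pydantic's model_dump method and manually inserts a _type field, so that LangChain knows which subclass to instantiate when the template is reloaded from a file. The save method writes the template configuration to local disk as JSON or YAML, depending on the file extension given (".json", ".yaml", or ".yml"). Note that save raises an error if the partial_variables field holds any pre-filled variables.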

class BasePromptTemplate(
    RunnableSerializable[dict, PromptValue], ABC, Generic[FormatOutputType]):

    def partial(self, **kwargs: str | Callable[[], str]) -> BasePromptTemplate
    def dict(self, **kwargs: Any) -> dict
    def save(self, file_path: Path | str) -> None

2. StringPromptTemplate

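We can simply regard StringPromptValue and ChatPromptValue as the prompts for Completion models and Chat models, respectively. StringPromptTemplate is the base class of the prompt templates that generate a StringPromptValue, and it is an abstract class that inherits from BasePromptTemplate. As the following snippet shows, it declares the abstract method format to perform string-based formatting; the overridden format_prompt/aformat_prompt methods call format (or its asynchronous counterpart aformat) and wrap the formatted text in the returned StringPromptValue. StringPromptTemplate also defines pretty_repr and pretty_print to output the template content in a readable form.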

class StringPromptTemplate(BasePromptTemplate, ABC):
    def format_prompt(self, **kwargs: Any) -> PromptValue:
        return StringPromptValue(text=self.format(**kwargs))
    async def aformat_prompt(self, **kwargs: Any) -> PromptValue:
        return StringPromptValue(text=await self.aformat(**kwargs))

    @override
    @abstractmethod
    def format(self, **kwargs: Any) -> str

    def pretty_repr(
        self,
        html: bool = False,  # noqa: FBT001,FBT002
    ) -> str
    def pretty_print(self) -> None

3. PromptTemplate

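PromptTemplate inherits from StringPromptTemplate. The template_format field specifies the template format and thereby determines the template engine that the format method uses; the template string in the template field must be written in that format. The Boolean validate_template field determines whether, at instantiation, the variables in the template are checked against input_variables. In most cases we create a PromptTemplate by calling the class method from_template with a template, a template format, and pre-filled variables. If the template is stored in a text file, we can call the other class method, from_file.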

PromptTemplateFormat = Literal["f-string", "mustache", "jinja2"]

class PromptTemplate(StringPromptTemplate):
    template: str
    template_format: PromptTemplateFormat = "f-string"
    validate_template: bool = False
    def format(self, **kwargs: Any) -> str

    @classmethod
    def from_file(
        cls,
        template_file: str | Path,
        encoding: str | None = None,
        **kwargs: Any,
    ) -> PromptTemplate

    @classmethod
    def from_template(
        cls,
        template: str,
        *,
        template_format: PromptTemplateFormat = "f-string",
        partial_variables: dict[str, Any] | None = None,
        **kwargs: Any,
    ) -> PromptTemplate

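The template_format field has three options, each corresponding to one template engine. The default, "f-string", mimics Python's native f-string syntax and uses single braces ({variable}) as placeholders. It is the most common and intuitive format: simple, efficient, and suitable for the vast majority of prompts with a static structure. Here is an example that uses it.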

from langchain_core.prompts import PromptTemplate
from langchain_core.prompt_values import StringPromptValue

prompt = PromptTemplate.from_template(
    template= "请用{adjective}的语气，写一段关于{topic}的介绍。",
    template_format="f-string",
    partial_variables={"adjective": "幽默"})
assert prompt.format(topic="量子力学") == "请用幽默的语气，写一段关于量子力学的介绍。"

promptValue: StringPromptValue = prompt.invoke(input={"topic": "量子力学"}) # type: ignore
assert promptValue.text == "请用幽默的语气，写一段关于量子力学的介绍。"
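Single-brace placeholders can be discovered with the standard library's string.Formatter, which is a convenient way to see which variables an f-string style template expects. The helper extract_variables below is a hypothetical name, and the sketch is only meant to illustrate the idea of inferring input variables from a template, not to reproduce LangChain's code.

```python
from string import Formatter


def extract_variables(template: str) -> list[str]:
    """Return the sorted placeholder names found in an f-string style template."""
    return sorted(
        {field for _, field, _, _ in Formatter().parse(template) if field}
    )


template = "Write a {adjective} introduction to {topic}. Keep {{braces}} literal."

variables = extract_variables(template)
print(variables)  # ['adjective', 'topic']

# Doubled braces are an escape: they render as literal braces, not as variables.
rendered = template.format(adjective="humorous", topic="quantum mechanics")
print(rendered)
```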

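Jinja2 is a full-featured template engine for Python. Beyond filling in variables, it supports if/else conditions, for loops, and expressions, which makes it suitable for prompts whose formatting involves complex logic. In this format, variables are written in double braces ({{variable}}) and control logic in "{% %}" blocks. To use it, make sure Jinja2 is installed in the execution environment; to learn more about the engine, see its official site.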

from langchain_core.prompts import PromptTemplate

template = """
你是一个向导。
我们要去的地方是{{ location }}。
{% if weather %}今天的具体天气是{{ weather }}，请给出针对性建议。
{% else %}天气情况未知，请提醒用户注意查天气。
{% endif %}
"""
prompt = PromptTemplate.from_template(
    template=template, 
    template_format="jinja2"
)
    
print(prompt.format(location="北京", weather="大雨"))
print(prompt.format(location="北京", weather=""))
prompt.pretty_print()

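In the example above we define a template with two variables, location and weather. Depending on whether the weather variable carries a concrete value, formatting produces different content, as shown below; the last block is the output of pretty_print.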

你是一个向导。
我们要去的地方是北京。
今天的具体天气是大雨，请给出针对性建议。

你是一个向导。
我们要去的地方是北京。
天气情况未知，请提醒用户注意查天气。

你是一个向导。
我们要去的地方是{location}。
今天的具体天气是{weather}，请给出针对性建议。

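Mustache is a "logic-less" template syntax that is very popular in front-end development. It handles variables and lists with simple tags, and it is highly portable across languages: if your prompts need to be shared between Python, JavaScript, and other platforms, Mustache is a good choice. Syntactically, Mustache also uses double braces for variables. To learn more about it, see its GitHub site.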

from langchain_core.prompts import PromptTemplate
template = """
你好{{name}}，你的待办事项有：
{{#items}} 
- {{.}}
{{/items}}
"""

prompt = PromptTemplate.from_template(
    template=template,
    template_format="mustache"
)

print(prompt.format(name="老王", items=["开会", "写代码", "喂猫"]))
prompt.pretty_print()

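The program above uses Mustache syntax to define a template bound to a list. It produces the following output: first the formatted text, then the output of pretty_print.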

你好老王，你的待办事项有：
- 开会
- 写代码
- 喂猫

你好{name}，你的待办事项有：
- {items}

4. ImagePromptTemplate

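As the name suggests, ImagePromptTemplate is the prompt template that generates an ImagePromptValue. As the following snippet shows, it inherits from BasePromptTemplate[ImageURL], which means its format method produces an ImageURL object, and the ImagePromptValue is created from exactly that object.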

class ImagePromptTemplate(BasePromptTemplate[ImageURL]):
    template: dict = Field(default_factory=dict)
    template_format: PromptTemplateFormat = "f-string"

    def format_prompt(self, **kwargs: Any) -> PromptValue:
        return ImagePromptValue(image_url=self.format(**kwargs))

    async def aformat_prompt(self, **kwargs: Any) -> PromptValue:
        return ImagePromptValue(image_url=await self.aformat(**kwargs))

    def format(
        self,
        **kwargs: Any,
    ) -> ImageURL

    async def aformat(self, **kwargs: Any) -> ImageURL

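Because ImageURL has two properties, "url" and "detail", the dictionary in the template field should contain these two keys ("detail" is optional). Each value is the template used to generate the corresponding property, written in the syntax that matches the template_format field. The format method takes each template, renders it with the corresponding engine to obtain url and detail, and builds the returned ImageURL object from them. The following program demonstrates how ImagePromptTemplate is used.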

from langchain_core.prompts.image import ImagePromptTemplate
template = {
    "url":"http://img.resources.com/avatar/{{user_id}}.jpg",
    "detail":"{% if vip %}high{% else %}auto{% endif %}"
}
prompt = ImagePromptTemplate(template=template, template_format="jinja2")
input = {"user_id": "jayden", "vip": True}
imageUrl = prompt.format(**input)
promptValue = prompt.invoke(input)
assert imageUrl == promptValue.image_url # type: ignore
print(promptValue.to_messages())

# output
# [HumanMessage(content=[{'detail': 'high', 'url': 'http://img.resources.com/avatar/jayden.jpg'}], additional_kwargs={}, response_metadata={})]
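The formatting step described for ImagePromptTemplate (render each entry of the template dictionary, require a url, include detail only when present) can be approximated in plain Python. The function format_image_url and the variable quality are hypothetical, and the sketch uses str.format in place of a pluggable template engine, so it only mirrors the "f-string" case.

```python
from typing import Any


def format_image_url(template: dict[str, Any], **kwargs: Any) -> dict[str, str]:
    """Render every string entry of the template, then assemble the result."""
    formatted = {
        key: (value.format(**kwargs) if isinstance(value, str) else value)
        for key, value in template.items()
    }
    url = formatted.get("url")
    if not url:
        raise ValueError("A 'url' entry is required.")
    image_url = {"url": url}
    # "detail" is optional and only included when it renders to a non-empty value.
    if formatted.get("detail"):
        image_url["detail"] = formatted["detail"]
    return image_url


template = {
    "url": "http://img.resources.com/avatar/{user_id}.jpg",
    "detail": "{quality}",
}
result = format_image_url(template, user_id="jayden", quality="high")
print(result)
# {'url': 'http://img.resources.com/avatar/jayden.jpg', 'detail': 'high'}
```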