
ChatOpenAI

This notebook provides a quick overview for getting started with OpenAI chat models. For detailed documentation of all ChatOpenAI features and configurations, head to the API reference.

OpenAI has several chat models. You can find information about their latest models, costs, context windows, and supported input types in the OpenAI docs.

Azure OpenAI

Note that certain OpenAI models can also be accessed via the Microsoft Azure platform. To use the Azure OpenAI service, use the AzureChatOpenAI integration.
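As a minimal sketch (assuming the AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT environment variables are set, and that "my-gpt-4o" is a placeholder name for one of your deployments):

from langchain_openai import AzureChatOpenAI

# A minimal sketch: assumes AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT are
# set in the environment; "my-gpt-4o" is a placeholder deployment name.
azure_llm = AzureChatOpenAI(
    azure_deployment="my-gpt-4o",
    api_version="2024-08-01-preview",  # use the API version enabled for your resource
)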

Overview

Integration details

Class | Package | Local | Serializable | JS support | Package downloads | Package latest
ChatOpenAI | langchain-openai | ❌ | beta | ✅ | PyPI - Downloads | PyPI - Version

Model features

Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs

Setup

To access OpenAI models you'll need to create an OpenAI account, get an API key, and install the langchain-openai integration package.

Credentials

Head to https://platform.openai.com to sign up for OpenAI and generate an API key. Once you've done this, set the OPENAI_API_KEY environment variable:

import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

If you'd like automated tracing of your model calls, you can also set your LangSmith API key by uncommenting below:

# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

Installation

The LangChain OpenAI integration lives in the langchain-openai package:

%pip install -qU langchain-openai

Instantiation

Now we can instantiate our model object and generate chat completions:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # api_key="...",  # if you prefer to pass api key in directly instead of using env vars
    # base_url="...",
    # organization="...",
    # other params...
)
API Reference:ChatOpenAI

Invocation

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 31, 'total_tokens': 36}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'stop', 'logprobs': None}, id='run-63219b22-03e3-4561-8cc4-78b7c7c3a3ca-0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})
print(ai_msg.content)
J'adore la programmation.

Chaining

We can chain our model with a prompt template like so:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)
API Reference:ChatPromptTemplate
AIMessage(content='Ich liebe das Programmieren.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 26, 'total_tokens': 32}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'stop', 'logprobs': None}, id='run-350585e1-16ca-4dad-9460-3d9e7e49aaf1-0', usage_metadata={'input_tokens': 26, 'output_tokens': 6, 'total_tokens': 32})

Tool calling

OpenAI has a tool calling API (we use "tool calling" and "function calling" interchangeably here) that lets you describe tools and their arguments, and have the model return a JSON object with a tool to invoke and the inputs to that tool. Tool calling is extremely useful for building tool-using chains and agents, and for getting structured outputs from models more generally.

ChatOpenAI.bind_tools()

With ChatOpenAI.bind_tools, we can easily pass in Pydantic classes, dict schemas, LangChain tools, or even functions as tools to the model. Under the hood these are converted to an OpenAI tool schema, which looks like:

{
    "name": "...",
    "description": "...",
    "parameters": {...}  # JSONSchema
}

and passed in every model invocation.

from pydantic import BaseModel, Field


class GetWeather(BaseModel):
    """Get the current weather in a given location"""

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


llm_with_tools = llm.bind_tools([GetWeather])
ai_msg = llm_with_tools.invoke(
    "what is the weather like in San Francisco",
)
ai_msg
AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_o9udf3EVOWiV4Iupktpbpofk', 'function': {'arguments': '{"location":"San Francisco, CA"}', 'name': 'GetWeather'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 68, 'total_tokens': 85}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-1617c9b2-dda5-4120-996b-0333ed5992e2-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'San Francisco, CA'}, 'id': 'call_o9udf3EVOWiV4Iupktpbpofk', 'type': 'tool_call'}], usage_metadata={'input_tokens': 68, 'output_tokens': 17, 'total_tokens': 85})

strict=True

Requires langchain-openai>=0.1.21

As of Aug 6, 2024, OpenAI supports a strict argument when calling tools that will enforce that the tool argument schema is respected by the model. See more here: https://platform.openai.com/docs/guides/function-calling

Note: If strict=True the tool definition will also be validated, and only a subset of JSON schema is accepted. Crucially, the schema cannot have optional arguments (those with default values). Read the full docs on which schema types are supported here: https://platform.openai.com/docs/guides/structured-outputs/supported-schemas.
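For instance, a hypothetical schema like the one below would be rejected under strict=True, because the default value makes the parameter optional:

class GetWeatherLoose(BaseModel):
    """Get the current weather in a given location"""

    # Optional because of the default value, so invalid with strict=True
    location: str = Field(default="San Francisco, CA")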

llm_with_tools = llm.bind_tools([GetWeather], strict=True)
ai_msg = llm_with_tools.invoke(
    "what is the weather like in San Francisco",
)
ai_msg
AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_jUqhd8wzAIzInTJl72Rla8ht', 'function': {'arguments': '{"location":"San Francisco, CA"}', 'name': 'GetWeather'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 68, 'total_tokens': 85}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-5e3356a9-132d-4623-8e73-dd5a898cf4a6-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'San Francisco, CA'}, 'id': 'call_jUqhd8wzAIzInTJl72Rla8ht', 'type': 'tool_call'}], usage_metadata={'input_tokens': 68, 'output_tokens': 17, 'total_tokens': 85})

AIMessage.tool_calls

Note that the AIMessage has a tool_calls attribute. This contains the tool-call information in a standardized ToolCall format that is model-provider agnostic.

ai_msg.tool_calls
[{'name': 'GetWeather',
  'args': {'location': 'San Francisco, CA'},
  'id': 'call_jUqhd8wzAIzInTJl72Rla8ht',
  'type': 'tool_call'}]

For more on binding tools and tool-call outputs, see the tool calling docs.
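To close the loop, the tool result can be passed back as a ToolMessage whose tool_call_id matches the call above. A minimal sketch with a hard-coded weather result (in a real agent you would execute the tool yourself):

from langchain_core.messages import ToolMessage

tool_call = ai_msg.tool_calls[0]
# Hypothetical result; substitute the actual output of your tool.
tool_msg = ToolMessage(content="It's sunny.", tool_call_id=tool_call["id"])

final_response = llm_with_tools.invoke(
    [
        ("human", "what is the weather like in San Francisco"),
        ai_msg,
        tool_msg,
    ]
)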

Structured output and tool calling

OpenAI's structured output feature can be used simultaneously with tool calling. The model will either generate tool calls or a response adhering to the desired schema. See the example below:

from langchain_openai import ChatOpenAI
from pydantic import BaseModel


def get_weather(location: str) -> str:
    """Get weather at a location."""
    return "It's sunny."


class OutputSchema(BaseModel):
    """Schema for response."""

    answer: str
    justification: str


llm = ChatOpenAI(model="gpt-4.1")

structured_llm = llm.bind_tools(
    [get_weather],
    response_format=OutputSchema,
    strict=True,
)

# Response contains tool calls:
tool_call_response = structured_llm.invoke("What is the weather in SF?")

# structured_response.additional_kwargs["parsed"] contains parsed output
structured_response = structured_llm.invoke(
    "What weighs more, a pound of feathers or a pound of gold?"
)
API Reference:ChatOpenAI
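As the comment above indicates, the parsed object can then be read from additional_kwargs (a sketch, assuming the model chose to answer rather than call a tool):

parsed = structured_response.additional_kwargs["parsed"]
print(parsed)  # expected to be an OutputSchema instance with answer/justification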

Responses API

Requires langchain-openai>=0.3.9

OpenAI supports a Responses API that is oriented toward building agentic applications. It includes a suite of built-in tools, including web and file search. It also supports management of conversation state, allowing you to continue a conversational thread without explicitly passing in previous messages, as well as output from reasoning processes.

ChatOpenAI will route to the Responses API when one of these features is used. You can also specify use_responses_api=True when instantiating ChatOpenAI.
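For example (a minimal sketch of opting in explicitly):

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", use_responses_api=True)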

note

langchain-openai >= 0.3.26 allows users to opt in to an updated AIMessage format when using the Responses API. Setting

llm = ChatOpenAI(model="...", output_version="responses/v1")

will format output from reasoning summaries, built-in tool invocations, and other response items into the message's content field, rather than additional_kwargs. We recommend this format for new applications.

To trigger a web search, pass {"type": "web_search_preview"} to the model as you would another tool.

tip

You can also pass built-in tools as invocation params:

llm.invoke("...", tools=[{"type": "web_search_preview"}])

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini", output_version="responses/v1")

tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])

response = llm_with_tools.invoke("What was a positive news story from today?")
API Reference:ChatOpenAI

Note that the response includes structured content blocks containing both the text of the response and OpenAI annotations citing its sources. The output message will also contain information from any tool invocations:

response.content
[{'id': 'ws_685d997c1838819e8a2cbf66059ddd5c0f6f330a19127ac1',
  'action': {'query': 'positive news stories today', 'type': 'search'},
  'status': 'completed',
  'type': 'web_search_call'},
 {'type': 'text',
'text': "On June 25, 2025, the James Webb Space Telescope made a groundbreaking discovery by directly imaging a previously unknown exoplanet. This young gas giant, approximately the size of Saturn, orbits a star smaller than our Sun, located about 110 light-years away in the constellation Antlia. This achievement marks the first time Webb has identified an exoplanet not previously known, expanding our understanding of distant worlds. ([straitstimes.com](https://www.straitstimes.com/world/while-you-were-sleeping-5-stories-you-might-have-missed-june-26-2025?utm_source=openai))\n\nAdditionally, in the realm of conservation, a significant milestone was achieved with the successful translocation of seventy southern white rhinos from South Africa to Rwanda's Akagera National Park. This initiative represents the first international translocation from Platinum Rhino, a major captive breeding operation, and is seen as a substantial opportunity to safeguard the future of the white rhino species. ([conservationoptimism.org](https://conservationoptimism.org/7-stories-of-optimism-this-week-17-06-25-23-06-25/?utm_source=openai))\n\nThese developments highlight positive strides in both scientific exploration and wildlife conservation efforts. ",
  'annotations': [{'end_index': 572,
    'start_index': 429,
    'title': 'While You Were Sleeping: 5 stories you might have missed, June 26, 2025 | The Straits Times',
    'type': 'url_citation',
    'url': 'https://www.straitstimes.com/world/while-you-were-sleeping-5-stories-you-might-have-missed-june-26-2025?utm_source=openai'},
   {'end_index': 1121,
    'start_index': 990,
    'title': '7 stories of optimism this week (17.06.25-23.06.25) - Conservation Optimism',
    'type': 'url_citation',
    'url': 'https://conservationoptimism.org/7-stories-of-optimism-this-week-17-06-25-23-06-25/?utm_source=openai'}],
  'id': 'msg_685d997f6b94819e8d981a2b441470420f6f330a19127ac1'}]
tip

You can recover just the text content of the response as a string using response.text(). For example, to stream the response text:

for token in llm_with_tools.stream("..."):
    print(token.text(), end="|")

See the streaming guide for more detail.

Image generation

Requires langchain-openai>=0.3.19

To trigger an image generation, pass {"type": "image_generation"} to the model as you would another tool.

tip

You can also pass built-in tools as invocation params:

llm.invoke("...", tools=[{"type": "image_generation"}])

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini", output_version="responses/v1")

tool = {"type": "image_generation", "quality": "low"}

llm_with_tools = llm.bind_tools([tool])

ai_message = llm_with_tools.invoke(
    "Draw a picture of a cute fuzzy cat with an umbrella"
)
API Reference:ChatOpenAI
import base64

from IPython.display import Image

image = next(
    item for item in ai_message.content if item["type"] == "image_generation_call"
)
Image(base64.b64decode(image["result"]), width=200)

File search

To trigger a file search, pass a file search tool to the model as you would another tool. You will need to populate an OpenAI-managed vector store and include the vector store ID in the tool definition. See the OpenAI documentation for more detail.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini", output_version="responses/v1")

openai_vector_store_ids = [
    "vs_...",  # your IDs here
]

tool = {
    "type": "file_search",
    "vector_store_ids": openai_vector_store_ids,
}
llm_with_tools = llm.bind_tools([tool])

response = llm_with_tools.invoke("What is deep research by OpenAI?")
print(response.text())
API Reference:ChatOpenAI
Deep Research by OpenAI is a newly launched agentic capability within ChatGPT designed to conduct complex, multi-step research tasks on the internet autonomously. It synthesizes large amounts of online information into comprehensive, research analyst-level reports, accomplishing in tens of minutes what would typically take a human many hours. This capability is powered by an upcoming OpenAI o3 model that is optimized for web browsing and data analysis, allowing it to search, interpret, and analyze massive amounts of text, images, and PDFs from the internet, while dynamically adjusting its approach based on the information it finds.

Key features of Deep Research include:
- Independent discovery, reasoning, and consolidation of insights from across the web.
- Ability to use browser and Python programming tools for data analysis and graph plotting.
- Full documentation of outputs with clear citations and a summary of its reasoning process, making it easy to verify and reference.
- Designed to provide thorough, precise, and reliable research especially useful for knowledge-intensive domains such as finance, science, policy, and engineering. It is also valuable for individuals seeking personalized and detailed product research.

It uses reinforcement learning techniques to plan and execute multi-step information-gathering tasks, reacting to real-time information by backtracking or pivoting its search when necessary. Deep Research can browse the open web and user-uploaded files, integrates visual data such as images and graphs into its reports, and cites specific source passages to support its conclusions.

The goal behind Deep Research is to enhance knowledge synthesis, which is essential for creating new knowledge, marking a significant step toward the development of Artificial General Intelligence (AGI) capable of producing novel scientific research.

Users can access Deep Research via ChatGPT by selecting the "deep research" option in the message composer, entering their query, and optionally attaching files or spreadsheets. The research process can take from 5 to 30 minutes, during which users can continue with other tasks. The final output is delivered as a richly detailed and well-documented report within the chat interface.

Currently, Deep Research is available to Pro users with plans to expand access further to Plus, Team, and Enterprise users. It currently supports research using open web sources and uploaded files, with future plans to connect to specialized subscription or internal data sources for even more robust research outputs.

Though powerful, Deep Research has limitations such as occasional hallucinations, difficulty distinguishing authoritative information from rumors, and some formatting or citation issues at launch, which are expected to improve with usage and time.

In summary, Deep Research is a highly advanced AI research assistant capable of automating extensive, in-depth knowledge work by synthesizing vast amounts of online data into comprehensive, credible reports, designed to save users significant time and effort on complex research tasks.

As with web search, the response includes content blocks with citations:

[block["type"] for block in response.content]
['file_search_call', 'text']
text_block = next(block for block in response.content if block["type"] == "text")

text_block["annotations"][:2]
[{'file_id': 'file-3UzgX7jcC8Dt9ZAFzywg5k',
  'filename': 'deep_research_blog.pdf',
  'index': 3121,
  'type': 'file_citation'},
 {'file_id': 'file-3UzgX7jcC8Dt9ZAFzywg5k',
  'filename': 'deep_research_blog.pdf',
  'index': 3121,
  'type': 'file_citation'}]

It will also include information from the built-in tool invocations:

response.content[0]
{'id': 'fs_685d9e7d48408191b9e34ad359069ede019138cfaaf3cea8',
 'queries': ['deep research by OpenAI'],
 'status': 'completed',
 'type': 'file_search_call'}

Computer use

ChatOpenAI supports the "computer-use-preview" model, which is a specialized model for the built-in computer use tool. To enable, pass a computer use tool as you would pass another tool.

Currently, tool outputs for computer use are present in the message's content field. To reply to a computer use tool call, construct a ToolMessage with {"type": "computer_call_output"} in its additional_kwargs. The content of the message will be a screenshot. Below, we demonstrate a simple example.

First, load two screenshots:

import base64


def load_png_as_base64(file_path):
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
    return encoded_string.decode("utf-8")


screenshot_1_base64 = load_png_as_base64(
    "/path/to/screenshot_1.png"
)  # perhaps a screenshot of an application
screenshot_2_base64 = load_png_as_base64(
    "/path/to/screenshot_2.png"
)  # perhaps a screenshot of the Desktop

from langchain_openai import ChatOpenAI

# Initialize model
llm = ChatOpenAI(
    model="computer-use-preview",
    truncation="auto",
    output_version="responses/v1",
)

# Bind computer-use tool
tool = {
    "type": "computer_use_preview",
    "display_width": 1024,
    "display_height": 768,
    "environment": "browser",
}
llm_with_tools = llm.bind_tools([tool])

# Construct input message
input_message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": (
                "Click the red X to close and reveal my Desktop. "
                "Proceed, no confirmation needed."
            ),
        },
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_1_base64}",
        },
    ],
}

# Invoke model
response = llm_with_tools.invoke(
    [input_message],
    reasoning={
        "generate_summary": "concise",
    },
)
API Reference:ChatOpenAI

The response will include a call to the computer-use tool in its content:

response.content
[{'id': 'rs_685da051742c81a1bb35ce46a9f3f53406b50b8696b0f590',
  'summary': [{'text': "Clicking red 'X' to show desktop",
    'type': 'summary_text'}],
  'type': 'reasoning'},
 {'id': 'cu_685da054302481a1b2cc43b56e0b381706b50b8696b0f590',
  'action': {'button': 'left', 'type': 'click', 'x': 14, 'y': 38},
  'call_id': 'call_zmQerFBh4PbBE8mQoQHkfkwy',
  'pending_safety_checks': [],
  'status': 'completed',
  'type': 'computer_call'}]

Next, we construct a ToolMessage with the following properties:

  1. It has a tool_call_id matching the call_id of the computer call.
  2. It has {"type": "computer_call_output"} in its additional_kwargs.
  3. Its content is either an image_url or an input_image output block (see the OpenAI docs for formatting).

from langchain_core.messages import ToolMessage

tool_call_id = next(
    item["call_id"] for item in response.content if item["type"] == "computer_call"
)

tool_message = ToolMessage(
    content=[
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_2_base64}",
        }
    ],
    # content=f"data:image/png;base64,{screenshot_2_base64}",  # <-- also acceptable
    tool_call_id=tool_call_id,
    additional_kwargs={"type": "computer_call_output"},
)
API Reference:ToolMessage

We can now invoke the model again using the message history:

messages = [
    input_message,
    response,
    tool_message,
]

response_2 = llm_with_tools.invoke(
    messages,
    reasoning={
        "generate_summary": "concise",
    },
)
response_2.text()
'VS Code has been closed, and the desktop is now visible.'

Instead of passing back the entire sequence, we can also use the previous_response_id:

previous_response_id = response.response_metadata["id"]

response_2 = llm_with_tools.invoke(
    [tool_message],
    previous_response_id=previous_response_id,
    reasoning={
        "generate_summary": "concise",
    },
)
response_2.text()
'The VS Code window is closed, and the desktop is now visible. Let me know if you need any further assistance.'

Code interpreter

OpenAI implements a code interpreter tool to support the sandboxed generation and execution of code.

Example use:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="o4-mini", output_version="responses/v1")

llm_with_tools = llm.bind_tools(
    [
        {
            "type": "code_interpreter",
            # Create a new container
            "container": {"type": "auto"},
        }
    ]
)
response = llm_with_tools.invoke(
    "Write and run code to answer the question: what is 3^3?"
)
API Reference:ChatOpenAI

Note that the above command created a new container. We can also specify an existing container ID:

code_interpreter_calls = [
    item for item in response.content if item["type"] == "code_interpreter_call"
]
assert len(code_interpreter_calls) == 1
container_id = code_interpreter_calls[0]["container_id"]

llm_with_tools = llm.bind_tools(
    [
        {
            "type": "code_interpreter",
            # Use an existing container
            "container": container_id,
        }
    ]
)

Remote MCP

OpenAI implements a remote MCP tool that allows for model-generated calls to MCP servers.

Example use:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="o4-mini", output_version="responses/v1")

llm_with_tools = llm.bind_tools(
    [
        {
            "type": "mcp",
            "server_label": "deepwiki",
            "server_url": "https://mcp.deepwiki.com/mcp",
            "require_approval": "never",
        }
    ]
)
response = llm_with_tools.invoke(
    "What transport protocols does the 2025-03-26 version of the MCP "
    "spec (modelcontextprotocol/modelcontextprotocol) support?"
)
API Reference:ChatOpenAI
MCP approvals

OpenAI will at times request approval before sharing data with a remote MCP server.

In the above command, we instructed the model to never require approval. We can also configure the model to always request approval, or to always request approval for specific tools:

llm_with_tools = llm.bind_tools(
    [
        {
            "type": "mcp",
            "server_label": "deepwiki",
            "server_url": "https://mcp.deepwiki.com/mcp",
            "require_approval": {
                "always": {
                    "tool_names": ["read_wiki_structure"],
                }
            },
        }
    ]
)
response = llm_with_tools.invoke(
    "What transport protocols does the 2025-03-26 version of the MCP "
    "spec (modelcontextprotocol/modelcontextprotocol) support?"
)

Responses may then include blocks with type "mcp_approval_request".

To submit approvals for an approval request, structure it into a content block in an input message:

approval_message = {
    "role": "user",
    "content": [
        {
            "type": "mcp_approval_response",
            "approve": True,
            "approval_request_id": block["id"],
        }
        for block in response.content
        if block["type"] == "mcp_approval_request"
    ],
}

next_response = llm_with_tools.invoke(
    [approval_message],
    # continue existing thread
    previous_response_id=response.response_metadata["id"],
)

Managing conversation state

The Responses API supports management of conversation state.

Manually manage state

You can manage state manually or using LangGraph, as with other chat models:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini", output_version="responses/v1")

tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])

first_query = "What was a positive news story from today?"
messages = [{"role": "user", "content": first_query}]

response = llm_with_tools.invoke(messages)
response_text = response.text()
print(f"{response_text[:100]}... {response_text[-100:]}")
API Reference:ChatOpenAI
On June 25, 2025, the James Webb Space Telescope made a groundbreaking discovery by directly imaging... exploration and environmental conservation, reflecting positive developments in science and nature.
second_query = (
    "Repeat my question back to me, as well as the last sentence of your answer."
)

messages.extend(
    [
        response,
        {"role": "user", "content": second_query},
    ]
)
second_response = llm_with_tools.invoke(messages)
print(second_response.text())
Your question was: "What was a positive news story from today?"

The last sentence of my answer was: "These stories highlight significant advancements in both space exploration and environmental conservation, reflecting positive developments in science and nature."
tip

You can use LangGraph to manage conversational threads for you, with a variety of backends including in-memory and Postgres. See this tutorial to get started.
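A minimal sketch of thread-scoped memory with LangGraph's in-memory checkpointer (the agent setup and thread ID below are illustrative):

from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(llm, tools=[], checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "thread-1"}}  # arbitrary thread ID

agent.invoke({"messages": [("user", "Hi, I'm Bob.")]}, config)
agent.invoke({"messages": [("user", "What is my name?")]}, config)  # remembers "Bob"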

Passing previous_response_id

When using the Responses API, LangChain messages will include an "id" field in their metadata. Passing this ID to subsequent invocations will continue the conversation. Note that this is equivalent to manually passing in messages from a billing perspective.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4.1-mini",
    output_version="responses/v1",
)
response = llm.invoke("Hi, I'm Bob.")
print(response.text())
API Reference:ChatOpenAI
Hi Bob! How can I assist you today?
second_response = llm.invoke(
    "What is my name?",
    previous_response_id=response.response_metadata["id"],
)
print(second_response.text())
You mentioned that your name is Bob. How can I help you today, Bob?

ChatOpenAI can also automatically specify previous_response_id using the last response in a message sequence:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4.1-mini",
    output_version="responses/v1",
    use_previous_response_id=True,
)
API Reference:ChatOpenAI

If we set use_previous_response_id=True, input messages up to the most recent response will be dropped from request payloads, and previous_response_id will be set using the ID of the most recent response.

That is,

from langchain_core.messages import AIMessage, HumanMessage

llm.invoke(
    [
        HumanMessage("Hello"),
        AIMessage("Hi there!", response_metadata={"id": "resp_123"}),
        HumanMessage("How are you?"),
    ]
)

is equivalent to:

llm.invoke([HumanMessage("How are you?")], previous_response_id="resp_123")

Reasoning output

Some OpenAI models will generate separate text content illustrating their reasoning process. See OpenAI's reasoning documentation for details.

OpenAI can return a summary of the model's reasoning (although it does not expose the raw reasoning tokens). To configure ChatOpenAI to return this summary, specify the reasoning parameter. ChatOpenAI will automatically route to the Responses API if this parameter is set.

from langchain_openai import ChatOpenAI

reasoning = {
"effort": "medium", # 'low', 'medium', or 'high'
"summary": "auto", # 'detailed', 'auto', or None
}

llm = ChatOpenAI(model="o4-mini", reasoning=reasoning, output_version="responses/v1")
response = llm.invoke("What is 3^3?")

# Output
response.text()
API Reference:ChatOpenAI
'3³ = 3 × 3 × 3 = 27.'
# Reasoning
for block in response.content:
    if block["type"] == "reasoning":
        for summary in block["summary"]:
            print(summary["text"])
**Calculating the power of three**

The user is asking about 3 raised to the power of 3. That's a pretty simple calculation! I know that 3^3 equals 27, so I can say, "3 to the power of 3 equals 27." I might also include a quick explanation that it's 3 multiplied by itself three times: 3 × 3 × 3 = 27. So, the answer is definitely 27.

Fine-tuning

You can call fine-tuned OpenAI models by passing in your corresponding modelName parameter.

This generally takes the form of ft:{OPENAI_MODEL_NAME}:{ORG_NAME}::{MODEL_ID}. For example:

fine_tuned_model = ChatOpenAI(
    temperature=0, model_name="ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR"
)

fine_tuned_model.invoke(messages)
AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 31, 'total_tokens': 39}, 'model_name': 'ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0f39b30e-c56e-4f3b-af99-5c948c984146-0', usage_metadata={'input_tokens': 31, 'output_tokens': 8, 'total_tokens': 39})

Multimodal Inputs (Images, PDFs, Audio)

OpenAI has models that support multimodal inputs. You can pass in images, PDFs, or audio to these models. For more information on how to do this in LangChain, head to the multimodal inputs docs.

You can see the list of models that support different modalities in OpenAI's documentation.

For all modalities, LangChain supports both its cross-provider standard as well as OpenAI's native content-block format.

To pass multimodal data into ChatOpenAI, create a content block containing the data and incorporate it into a message, e.g., as below:

message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            # update the prompt as needed
            "text": "Describe the (image / PDF / audio)...",
        },
        content_block,
    ],
}

See below for examples of content blocks.

Images

See examples in the how-to guide here.

URL:

# LangChain format
content_block = {
    "type": "image",
    "source_type": "url",
    "url": url_string,
}

# OpenAI Chat Completions format
content_block = {
    "type": "image_url",
    "image_url": {"url": url_string},
}
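Putting these together, a runnable sketch using LangChain's format and a hypothetical image URL:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")

url_string = "https://example.com/cat.jpg"  # hypothetical image URL
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image", "source_type": "url", "url": url_string},
    ],
}
response = llm.invoke([message])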

In-line base64 data:

# LangChain format
content_block = {
    "type": "image",
    "source_type": "base64",
    "data": base64_string,
    "mime_type": "image/jpeg",
}

# OpenAI Chat Completions format
content_block = {
    "type": "image_url",
    "image_url": {
        "url": f"data:image/jpeg;base64,{base64_string}",
    },
}
PDF

Note: OpenAI requires file names to be specified for PDF inputs. When using LangChain's format, include the filename key.

Read more here.

See examples in the how-to guide here.

In-line base64 data:

# LangChain format
content_block = {
    "type": "file",
    "source_type": "base64",
    "data": base64_string,
    "mime_type": "application/pdf",
    "filename": "my-file.pdf",
}

# OpenAI Chat Completions format
content_block = {
    "type": "file",
    "file": {
        "filename": "my-file.pdf",
        "file_data": f"data:application/pdf;base64,{base64_string}",
    },
}
Audio

See supported models, e.g. "gpt-4o-audio-preview".

See examples in the how-to guide here.

In-line base64 data:

# LangChain format
content_block = {
    "type": "audio",
    "source_type": "base64",
    "mime_type": "audio/wav",  # or appropriate mime-type
    "data": base64_string,
}

# OpenAI Chat Completions format
content_block = {
    "type": "input_audio",
    "input_audio": {"data": base64_string, "format": "wav"},
}

Predicted output

info

Requires langchain-openai>=0.2.6

Some OpenAI models (such as their gpt-4o and gpt-4o-mini series) support predicted outputs, which allow you to pass in a known portion of the LLM's expected output ahead of time to reduce latency. This is useful for cases such as editing text or code, where only a small part of the model's output will change.

Here's an example:

code = """
/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
/// <summary>
/// Gets or sets the user's first name.
/// </summary>
public string FirstName { get; set; }

/// <summary>
/// Gets or sets the user's last name.
/// </summary>
public string LastName { get; set; }

/// <summary>
/// Gets or sets the user's username.
/// </summary>
public string Username { get; set; }
}
"""

llm = ChatOpenAI(model="gpt-4o")
query = (
"Replace the Username property with an Email property. "
"Respond only with code, and with no markdown formatting."
)
response = llm.invoke(
[{"role": "user", "content": query}, {"role": "user", "content": code}],
prediction={"type": "content", "content": code},
)
print(response.content)
print(response.response_metadata)
/// <summary>
/// Represents a user with a first name, last name, and email.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's email.
    /// </summary>
    public string Email { get; set; }
}
{'token_usage': {'completion_tokens': 226, 'prompt_tokens': 166, 'total_tokens': 392, 'completion_tokens_details': {'accepted_prediction_tokens': 49, 'audio_tokens': None, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 107}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_45cf54deae', 'finish_reason': 'stop', 'logprobs': None}

Note that currently predictions are billed as additional tokens and may increase your usage and costs in exchange for this reduced latency.

Audio Generation (Preview)

info

Requires langchain-openai>=0.2.3

OpenAI has a new audio generation feature that allows you to use audio inputs and outputs with the gpt-4o-audio-preview model.

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-audio-preview",
    temperature=0,
    model_kwargs={
        "modalities": ["text", "audio"],
        "audio": {"voice": "alloy", "format": "wav"},
    },
)

output_message = llm.invoke(
    [
        ("human", "Are you made by OpenAI? Just answer yes or no"),
    ]
)
API Reference:ChatOpenAI

output_message.additional_kwargs['audio'] will contain a dictionary like:

{
    'data': '<audio data b64-encoded>',
    'expires_at': 1729268602,
    'id': 'audio_67127d6a44348190af62c1530ef0955a',
    'transcript': 'Yes.'
}

and the format will be what was passed in model_kwargs['audio']['format'].

We can also pass this message with audio data back to the model as part of a message history before the OpenAI expires_at is reached.
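Since the payload here is base64-encoded WAV data, it can also be decoded and saved locally, e.g. (a sketch; the output filename is arbitrary):

import base64

audio_bytes = base64.b64decode(output_message.additional_kwargs["audio"]["data"])
with open("reply.wav", "wb") as f:  # .wav matches the requested format above
    f.write(audio_bytes)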

note

Output audio is stored under the audio key in AIMessage.additional_kwargs, but input content blocks are typed with an input_audio type and key in HumanMessage.content lists.

For more information, see OpenAI's audio docs.

history = [
    ("human", "Are you made by OpenAI? Just answer yes or no"),
    output_message,
    ("human", "And what is your name? Just give your name."),
]
second_output_message = llm.invoke(history)

Flex processing

OpenAI offers a variety of service tiers. The "flex" tier offers cheaper pricing for requests, with the trade-off that responses may take longer and resources might not always be available. This approach is best suited for non-critical tasks, including model testing, data enrichment, or jobs that can be run asynchronously.

To use it, initialize the model with service_tier="flex":

llm = ChatOpenAI(model="o4-mini", service_tier="flex")

Note that this is a beta feature that is only available for a subset of models. See the OpenAI docs for more detail.

API reference

For detailed documentation of all ChatOpenAI features and configurations, head to the API reference.