ChatHuggingFace
This will help you get started with langchain_huggingface chat models. For detailed documentation of all ChatHuggingFace features and configurations, head to the API reference. To see the list of models supported by Hugging Face, check out this page.
Overview
Integration details
| Class | Package | Local | Serializable | JS support |
|---|---|---|---|---|
| ChatHuggingFace | langchain-huggingface | ✅ | beta | ❌ |
Model features
| Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
|---|---|---|---|---|---|---|---|---|---|
| ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ |
Setup
To access Hugging Face models you'll need to create a Hugging Face account, get an API key, and install the langchain-huggingface integration package.
Credentials
Generate a Hugging Face Access Token and store it as an environment variable: HUGGINGFACEHUB_API_TOKEN.
import getpass
import os
if not os.getenv("HUGGINGFACEHUB_API_TOKEN"):
    os.environ["HUGGINGFACEHUB_API_TOKEN"] = getpass.getpass("Enter your token: ")
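Alternatively, you can authenticate with the huggingface_hub library directly; this is what caches the token locally and produces the "Token is valid ... saved" messages shown later on this page. A minimal sketch using the standard huggingface_hub login helper:
from huggingface_hub import login

login()  # prompts for your token and caches it under ~/.cache/huggingface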
Installation
%pip install --upgrade --quiet langchain-huggingface text-generation transformers google-search-results numexpr langchainhub sentencepiece jinja2 bitsandbytes accelerate
Note: you may need to restart the kernel to use updated packages.
Instantiation
You can instantiate a ChatHuggingFace model in two ways: from a HuggingFaceEndpoint or from a HuggingFacePipeline.
HuggingFaceEndpoint
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
llm = HuggingFaceEndpoint(
    repo_id="deepseek-ai/DeepSeek-R1-0528",
    task="text-generation",
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03,
    provider="auto",  # let Hugging Face choose the best provider for you
)
chat_model = ChatHuggingFace(llm=llm)
The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /Users/isaachershenson/.cache/huggingface/token
Login successful
Now, let's leverage Inference Providers to run the model on a specific third-party provider:
llm = HuggingFaceEndpoint(
    repo_id="deepseek-ai/DeepSeek-R1-0528",
    task="text-generation",
    provider="hyperbolic",  # set your provider here
    # provider="nebius",
    # provider="together",
)
chat_model = ChatHuggingFace(llm=llm)
HuggingFacePipeline
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline
llm = HuggingFacePipeline.from_model_id(
    model_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
    ),
)
chat_model = ChatHuggingFace(llm=llm)
Instantiation with quantization
To run a quantized version of the model, you can specify a bitsandbytes quantization config as follows:
from transformers import BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_use_double_quant=True,
)
and pass it as part of model_kwargs to HuggingFacePipeline:
llm = HuggingFacePipeline.from_model_id(
    model_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
        return_full_text=False,
    ),
    model_kwargs={"quantization_config": quantization_config},
)
chat_model = ChatHuggingFace(llm=llm)
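To confirm that quantization actually shrank the model, you can inspect the loaded model's memory footprint. This is a quick sanity check, assuming the wrapped transformers pipeline is reachable via llm.pipeline (get_memory_footprint is a standard transformers model method):
# for 4-bit loading, the reported size should be roughly a quarter of the fp16 footprint
print(f"{llm.pipeline.model.get_memory_footprint() / 1e9:.2f} GB")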
Invocation
from langchain_core.messages import (
    HumanMessage,
    SystemMessage,
)

messages = [
    SystemMessage(content="You're a helpful assistant"),
    HumanMessage(
        content="What happens when an unstoppable force meets an immovable object?"
    ),
]
ai_msg = chat_model.invoke(messages)
print(ai_msg.content)
According to the popular phrase and hypothetical scenario, when an unstoppable force meets an immovable object, a paradoxical situation arises as both forces are seemingly contradictory. On one hand, an unstoppable force is an entity that cannot be stopped or prevented from moving forward, while on the other hand, an immovable object is something that cannot be moved or displaced from its position.
In this scenario, it is un
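ChatHuggingFace also supports tool calling via bind_tools, as indicated in the features table above. Below is a minimal sketch, assuming a hypothetical GetWeather schema; how reliably the model emits tool calls depends on the underlying model and provider:
from pydantic import BaseModel, Field

class GetWeather(BaseModel):
    """Get the current weather in a given location."""

    location: str = Field(..., description="The city, e.g. Paris")

chat_with_tools = chat_model.bind_tools([GetWeather])
ai_msg = chat_with_tools.invoke("What is the weather like in Paris today?")
print(ai_msg.tool_calls)  # parsed tool calls, if the model chose to make any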
API reference
For detailed documentation of all ChatHuggingFace features and configurations, head to the API reference: https://python.langchain.com/api_reference/huggingface/chat_models/langchain_huggingface.chat_models.huggingface.ChatHuggingFace.html
Related
- Chat model conceptual guide
- Chat model how-to guides