FastEmbed by Qdrant

FastEmbed from Qdrant is a lightweight, fast Python library built for embedding generation.

Quantized model weights

ONNX Runtime, no PyTorch dependency

CPU-first design

Data parallelism for encoding large datasets

Dependencies

To use FastEmbed with LangChain, install the fastembed Python package.

%pip install --upgrade --quiet fastembed

from langchain_community.embeddings.fastembed import FastEmbedEmbeddings

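model_name: str (default: "BAAI/bge-small-en-v1.5")
Name of the FastEmbed model to use. The list of supported models is available in the FastEmbed documentation.

max_length: int (default: 512)
The maximum number of tokens. Behavior is unknown for values > 512.

cache_dir: Optional[str] (default: None)
Path to the cache directory. Defaults to local_cache in the parent directory.

threads: Optional[int] (default: None)
The number of threads a single onnxruntime session can use.

doc_embed_type: Literal["default", "passage"] (default: "default")
"default": uses FastEmbed's default embedding method.
"passage": prefixes the text with "passage" before embedding.

batch_size: int (default: 256)
Batch size for encoding. Higher values use more memory but are faster.

parallel: Optional[int] (default: None)
If >1, data-parallel encoding is used, which is recommended for offline encoding of large datasets. If 0, all available cores are used. If None, data-parallel processing is not used and the default onnxruntime threading is used instead.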
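As a rough sketch of how the parameters above fit together, the snippet below collects them into one dictionary and passes them to the constructor. The helper names fastembed_kwargs and build_embeddings are hypothetical and exist only for this illustration. The values are the documented defaults, except doc_embed_type and parallel, which are set here to show a possible offline-encoding setup.

```python
from typing import Any, Dict, Optional


def fastembed_kwargs(parallel: Optional[int] = None) -> Dict[str, Any]:
    """Collect the constructor arguments described above (hypothetical helper)."""
    return {
        "model_name": "BAAI/bge-small-en-v1.5",  # documented default
        "max_length": 512,                       # documented default
        "cache_dir": None,                       # None: use the default cache location
        "threads": None,                         # None: leave threading to onnxruntime
        "doc_embed_type": "passage",             # illustration only; the default is "default"
        "batch_size": 256,                       # documented default
        "parallel": parallel,                    # None: no data parallelism, 0: all cores
    }


def build_embeddings(parallel: Optional[int] = None):
    """Create a FastEmbedEmbeddings instance from the arguments above."""
    # Imported lazily so fastembed_kwargs can be inspected without fastembed installed.
    from langchain_community.embeddings.fastembed import FastEmbedEmbeddings

    return FastEmbedEmbeddings(**fastembed_kwargs(parallel))
```

Calling build_embeddings(parallel=0) would request data-parallel encoding on all available cores. Creating the instance may download the model into the cache directory on first use.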

embeddings = FastEmbedEmbeddings()

document_embeddings = embeddings.embed_documents(
    ["This is a document", "This is some other document"]
)

query_embeddings = embeddings.embed_query("This is a query")
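A common next step is to compare a query embedding against the document embeddings. The following is a minimal sketch using cosine similarity in plain Python. The functions cosine_similarity and rank_documents are hypothetical helpers, not part of FastEmbed or LangChain, and the toy vectors are made up so the example runs without a model.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy stand-in vectors; real ones would come from embed_query / embed_documents.
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
ranking = rank_documents(toy_query, toy_docs)
```

With real embeddings, rank_documents(query_embeddings, document_embeddings) would order the two example documents by their similarity to the query.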