GreenNodeRetriever
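GreenNode is a global AI solutions provider and an NVIDIA Preferred Partner, delivering full-stack AI capabilities from infrastructure to application for enterprises across the US, MENA, and APAC regions. Operating on world-class infrastructure (LEED Gold, TIA‑942, Uptime Tier III), GreenNode empowers enterprises, startups, and researchers with a comprehensive suite of AI services.
This notebook is a guide to getting started with the GreenNodeRerank retriever. It lets you run document search using built-in connectors or your own data sources, and applies GreenNode's reranking capabilities to improve relevance.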
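Integration details
- Provider: GreenNode Serverless AI
- Model type: Reranking models
- Primary use case: Reranking search results by semantic relevance
- Available models: BAAI/bge-reranker-v2-m3 and other high-performance reranking models
- Scoring: Returns relevance scores used to reorder document candidates by how well they match the query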
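Setup
To access GreenNode models, you need to create a GreenNode account, get an API key, and install the langchain-greennode integration package.
Credentials
Sign up for the GreenNode AI Platform and generate an API key. Once you have done this, set the GREENNODE_API_KEY environment variable: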
import getpass
import os
if not os.getenv("GREENNODE_API_KEY"):
    os.environ["GREENNODE_API_KEY"] = getpass.getpass("Enter your GreenNode API key: ")
If you want automated tracing of individual queries, you can also set your LangSmith API key by uncommenting the lines below:
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
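In non-interactive settings such as a scheduled job or a CI run, getpass has no terminal to prompt on. A minimal sketch of a fail-fast check for that case follows; require_env is a hypothetical helper name used for illustration and is not part of langchain-greennode.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of langchain-greennode.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it before running this script."
        )
    return value
```

In a script, calling require_env("GREENNODE_API_KEY") before building any GreenNode client surfaces a missing key immediately instead of at the first API call.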
Installation
This retriever lives in the langchain-greennode package:
%pip install -qU langchain-greennode
Note: you may need to restart the kernel to use updated packages.
Instantiation
The GreenNodeRerank class can be instantiated with optional parameters for the API key and the model name:
from langchain_greennode import GreenNodeRerank
# Initialize the reranker model
reranker = GreenNodeRerank(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-reranker-v2-m3",  # The default reranking model
    top_n=3,
)
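Judging from the outputs shown later in this notebook, top_n caps how many documents come back after scoring. The sketch below mimics that behavior in plain Python with made-up scores; keep_top_n is a hypothetical helper and not the library's implementation.

```python
def keep_top_n(scores: list[float], top_n: int) -> list[dict]:
    """Rank candidate indices by score (highest first) and keep top_n of them."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"index": i, "relevance_score": scores[i]} for i in ranked[:top_n]]


# Illustrative scores for four candidates, not real model output.
example_scores = [0.02, 0.91, 0.0004, 0.15]
top = keep_top_n(example_scores, top_n=3)
```

With four candidates and top_n=3, the lowest-scoring candidate is dropped and the remaining three are ordered from most to least relevant.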
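Usage
Reranking search results
Reranking models improve retrieval-augmented generation (RAG) workflows by rescoring an initial set of search results and reordering them by semantic relevance. The example below shows how to combine GreenNodeRerank with a base retriever to improve the quality of the retrieved documents.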
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_greennode import GreenNodeEmbeddings
# Initialize the embeddings model
embeddings = GreenNodeEmbeddings(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-m3"  # The default embedding model
)
# Prepare documents (finance/economics domain)
docs = [
    Document(
        page_content="Inflation represents the rate at which the general level of prices for goods and services rises"
    ),
    Document(
        page_content="Central banks use interest rates to control inflation and stabilize the economy"
    ),
    Document(
        page_content="Cryptocurrencies like Bitcoin operate on decentralized blockchain networks"
    ),
    Document(
        page_content="Stock markets are influenced by corporate earnings, investor sentiment, and economic indicators"
    ),
]
# Create a vector store and a base retriever
vector_store = FAISS.from_documents(docs, embeddings)
base_retriever = vector_store.as_retriever(search_kwargs={"k": 4})
# Wrap the base retriever so its results are rescored by the reranker
rerank_retriever = ContextualCompressionRetriever(
    base_compressor=reranker, base_retriever=base_retriever
)
# Perform retrieval with reranking
query = "How do central banks fight rising prices?"
results = rerank_retriever.invoke(query)
results
[Document(metadata={'relevance_score': 0.125}, page_content='Central banks use interest rates to control inflation and stabilize the economy'),
Document(metadata={'relevance_score': 0.004913330078125}, page_content='Inflation represents the rate at which the general level of prices for goods and services rises'),
Document(metadata={'relevance_score': 1.6689300537109375e-05}, page_content='Cryptocurrencies like Bitcoin operate on decentralized blockchain networks')]
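Each reranked document carries its score under metadata['relevance_score'], as in the output above, so low-scoring documents can be dropped before they reach a prompt. A minimal sketch, assuming a hypothetical filter_by_score helper and an arbitrary threshold; a small stand-in class replaces LangChain's Document so the snippet runs without any installs.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for langchain_core.documents.Document (same two attributes)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def filter_by_score(docs, min_score):
    """Keep documents whose relevance_score is at least min_score."""
    return [d for d in docs if d.metadata.get("relevance_score", 0.0) >= min_score]


# Scores copied from the reranked output above.
reranked = [
    Doc(
        "Central banks use interest rates to control inflation and stabilize the economy",
        {"relevance_score": 0.125},
    ),
    Doc(
        "Inflation represents the rate at which the general level of prices for goods and services rises",
        {"relevance_score": 0.004913330078125},
    ),
    Doc(
        "Cryptocurrencies like Bitcoin operate on decentralized blockchain networks",
        {"relevance_score": 1.6689300537109375e-05},
    ),
]

# 0.001 is a placeholder threshold chosen only for this example.
relevant = filter_by_score(reranked, min_score=0.001)
```

Here the cryptocurrency document falls below the cutoff and is removed. A suitable threshold depends on the model and the data, so it is worth tuning against real queries rather than reusing this value.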
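Direct usage
The GreenNodeRerank class can also be used on its own to rerank retrieved documents by relevance score. This is especially useful when an initial retrieval step (for example, keyword or vector search) returns a large set of candidates and a second model is needed to refine the results with deeper semantic understanding. The class takes a query and a list of candidate documents and returns a list reordered by predicted relevance.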
test_documents = [
    Document(
        page_content="Carson City is the capital city of the American state of Nevada."
    ),
    Document(
        page_content="Washington, D.C. (also known as simply Washington or D.C.) is the capital of the United States."
    ),
    Document(
        page_content="Capital punishment has existed in the United States since before the United States was a country."
    ),
    Document(
        page_content="The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan."
    ),
]
test_query = "What is the capital of the United States?"
results = reranker.rerank(test_documents, test_query)
results
[{'index': 1, 'relevance_score': 1.0},
{'index': 0, 'relevance_score': 0.01165771484375},
{'index': 3, 'relevance_score': 0.0012054443359375}]
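The rerank call returns positions into the input list rather than the documents themselves. A minimal sketch of joining the two back together follows, using plain strings in place of Document objects and the scores shown above; attach_documents is a hypothetical helper.

```python
def attach_documents(candidates, rerank_results):
    """Pair each rerank result with the candidate it points to, keeping rank order."""
    return [
        (candidates[r["index"]], r["relevance_score"])
        for r in rerank_results
    ]


candidates = [
    "Carson City is the capital city of the American state of Nevada.",
    "Washington, D.C. (also known as simply Washington or D.C.) is the capital of the United States.",
    "Capital punishment has existed in the United States since before the United States was a country.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
]

# Output copied from the rerank call above.
rerank_results = [
    {"index": 1, "relevance_score": 1.0},
    {"index": 0, "relevance_score": 0.01165771484375},
    {"index": 3, "relevance_score": 0.0012054443359375},
]

ranked = attach_documents(candidates, rerank_results)
best_text, best_score = ranked[0]
```

The first pair holds the Washington, D.C. passage with a score of 1.0, and the capital punishment passage (index 2) is absent because only three results were returned.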
Use within a chain
GreenNodeRerank fits directly into LangChain RAG pipelines. Here is an example that builds a simple RAG chain with GreenNodeRerank:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_greennode import ChatGreenNode
# Initialize LLM
llm = ChatGreenNode(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")
# Create a prompt template
prompt = ChatPromptTemplate.from_template(
    """
Answer the question based only on the following context:
Context:
{context}
Question: {question}
"""
)
# Format documents function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
# Create RAG chain
rag_chain = (
    {"context": rerank_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
# Run the chain
answer = rag_chain.invoke("How do central banks fight rising prices?")
answer
'\n\nCentral banks combat rising prices, or inflation, by adjusting interest rates. By raising interest rates, they increase the cost of borrowing, which discourages spending and investment. This reduction in demand helps slow down the rate of price increases, thereby controlling inflation and contributing to economic stability.'
API reference
For more details about the GreenNode Serverless AI API, see the GreenNode Serverless AI documentation.
Related
- Retriever conceptual guide
- Retriever how-to guides