Kuzu

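Kùzu is an embeddable, scalable, extremely fast graph database. It is permissively licensed under the MIT license, and its source code is available on GitHub.

Key characteristics of Kùzu:

Performance and scalability: implements modern, state-of-the-art join algorithms for graphs.

Usability: very easy to set up and get started with, because there is no server to run (embedded architecture).

Interoperability: can conveniently scan and copy data from external columnar formats, CSV, JSON, and relational databases.

Structured property graph model: implements the property graph model, with added structure.

Cypher support: allows convenient querying of the graph in Cypher, a declarative query language.

To get started with Kùzu, see its documentation.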

Setup

Kùzu is an embedded database (it runs in-process), so there is no server to manage. Install the following dependencies to get started:

pip install -U langchain-kuzu langchain-openai langchain-experimental

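This installs Kùzu and its LangChain integration, along with the OpenAI Python package so that we can use OpenAI's LLMs. If you want to use another LLM provider, install the corresponding Python package that LangChain provides for it.

Here is how to first create a Kùzu database on your local machine and connect to it: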

import kuzu

db = kuzu.Database("test_db")
conn = kuzu.Connection(db)

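Creating `KuzuGraph`

Kùzu's integration with LangChain makes it convenient to create and update graphs from unstructured text, and to query graphs via a Text2Cypher pipeline that builds on LangChain's LLM chains. To begin, we create a KuzuGraph object by passing the database object created above to the KuzuGraph constructor.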

from langchain_kuzu.graphs.kuzu_graph import KuzuGraph

graph = KuzuGraph(db, allow_dangerous_requests=True)

Say we want to transform the following text into a graph:

text = "Tim Cook is the CEO of Apple. Apple has its headquarters in California."

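We will use LLMGraphTransformer, which uses an LLM to extract nodes and relationships from the text. To make the graph more useful, we define the following schema so that the LLM only extracts nodes and relationships that match it.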

# Define schema
allowed_nodes = ["Person", "Company", "Location"]
allowed_relationships = [
    ("Person", "IS_CEO_OF", "Company"),
    ("Company", "HAS_HEADQUARTERS_IN", "Location"),
]
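
To make the effect of the schema concrete, here is a minimal sketch that checks candidate triples against `allowed_nodes` and `allowed_relationships` in plain Python. The helper name `matches_schema` is an assumption made for illustration; it is not part of LangChain or Kùzu, and it only illustrates the constraint rather than how LLMGraphTransformer is implemented.

```python
# Schema repeated from the text above so the sketch runs on its own.
allowed_nodes = ["Person", "Company", "Location"]
allowed_relationships = [
    ("Person", "IS_CEO_OF", "Company"),
    ("Company", "HAS_HEADQUARTERS_IN", "Location"),
]


def matches_schema(source_type, rel_type, target_type):
    """Return True if the (source, relationship, target) types fit the schema.

    Hypothetical helper, for illustration only.
    """
    # Both endpoints must be allowed node types.
    if source_type not in allowed_nodes or target_type not in allowed_nodes:
        return False
    # The full triple, including its direction, must be listed.
    return (source_type, rel_type, target_type) in allowed_relationships


print(matches_schema("Person", "IS_CEO_OF", "Company"))  # True
print(matches_schema("Company", "IS_CEO_OF", "Person"))  # False: wrong direction
```

Because each allowed relationship is a directed triple, a relationship with the right label but reversed endpoints does not match.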

The LLMGraphTransformer class provides a convenient way to convert the text into a list of graph documents.

import os

from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

# Read the API key from the environment (assumes OPENAI_API_KEY is set)
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]

# Define the LLMGraphTransformer
llm_transformer = LLMGraphTransformer(
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0, api_key=OPENAI_API_KEY),
    allowed_nodes=allowed_nodes,
    allowed_relationships=allowed_relationships,
)

documents = [Document(page_content=text)]
graph_documents = llm_transformer.convert_to_graph_documents(documents)

graph_documents[:2]

[GraphDocument(nodes=[Node(id='Tim Cook', type='Person', properties={}), Node(id='Apple', type='Company', properties={}), Node(id='California', type='Location', properties={})], relationships=[Relationship(source=Node(id='Tim Cook', type='Person', properties={}), target=Node(id='Apple', type='Company', properties={}), type='IS_CEO_OF', properties={}), Relationship(source=Node(id='Apple', type='Company', properties={}), target=Node(id='California', type='Location', properties={}), type='HAS_HEADQUARTERS_IN', properties={})], source=Document(metadata={}, page_content='Tim Cook is the CEO of Apple. Apple has its headquarters in California.'))]
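
The printed structure is dense, so it can help to flatten it into (source, relationship, target) triples before ingesting it. The sketch below does this with a hypothetical helper, `to_triples`, which relies only on the attributes visible in the output above (`relationships`, `source.id`, `type`, `target.id`). The stand-in objects are assumptions that mimic that output so the example runs without calling an LLM.

```python
from types import SimpleNamespace


def to_triples(graph_documents):
    """Collect (source id, relationship type, target id) tuples.

    Hypothetical helper, for illustration only.
    """
    triples = []
    for doc in graph_documents:
        for rel in doc.relationships:
            triples.append((rel.source.id, rel.type, rel.target.id))
    return triples


# Stand-ins that mirror the printed GraphDocument; objects exposing the
# same attributes can be passed to to_triples in the same way.
tim = SimpleNamespace(id="Tim Cook", type="Person")
apple = SimpleNamespace(id="Apple", type="Company")
california = SimpleNamespace(id="California", type="Location")
sample_docs = [
    SimpleNamespace(
        nodes=[tim, apple, california],
        relationships=[
            SimpleNamespace(source=tim, target=apple, type="IS_CEO_OF"),
            SimpleNamespace(
                source=apple, target=california, type="HAS_HEADQUARTERS_IN"
            ),
        ],
    )
]

for triple in to_triples(sample_docs):
    print(triple)
```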

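We can then call the add_graph_documents method of the KuzuGraph object defined above to ingest the graph documents into the Kùzu database. The include_source argument is set to True so that relationships are also created between each entity node and the source document it came from.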

# Add the graph document to the graph
graph.add_graph_documents(
    graph_documents,
    include_source=True,
)

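Creating `KuzuQAChain`

To query the graph via a Text2Cypher pipeline, we can define a KuzuQAChain object. We then invoke the chain with a question; it runs against the existing database stored in the test_db directory created above.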

from langchain_kuzu.chains.graph_qa.kuzu import KuzuQAChain

# Create the KuzuQAChain with verbosity enabled to see the generated Cypher queries
chain = KuzuQAChain.from_llm(
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0.3, api_key=OPENAI_API_KEY),  # noqa: F821
    graph=graph,
    verbose=True,
    allow_dangerous_requests=True,
)

Note that we set the temperature slightly above zero so that the LLM is not overly terse in its responses.

Let's use the QA chain to ask some questions.

chain.invoke("Who is the CEO of Apple?")

> Entering new KuzuQAChain chain...
Generated Cypher:
MATCH (p:Person)-[:IS_CEO_OF]->(c:Company {id: 'Apple'}) RETURN p
Full Context:
[{'p': {'_id': {'offset': 0, 'table': 1}, '_label': 'Person', 'id': 'Tim Cook', 'type': 'entity'}}]

> Finished chain.

{'query': 'Who is the CEO of Apple?',
 'result': 'Tim Cook is the CEO of Apple.'}

chain.invoke("Where is Apple headquartered?")

> Entering new KuzuQAChain chain...
Generated Cypher:
MATCH (c:Company {id: 'Apple'})-[:HAS_HEADQUARTERS_IN]->(l:Location) RETURN l
Full Context:
[{'l': {'_id': {'offset': 0, 'table': 2}, '_label': 'Location', 'id': 'California', 'type': 'entity'}}]

> Finished chain.

{'query': 'Where is Apple headquartered?',
 'result': 'Apple is headquartered in California.'}
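
When asking several questions in a row, a small wrapper can collect just the answers. The sketch below assumes only that `chain.invoke(question)` returns a dict with a `result` key, as in the outputs above. Both `ask_all` and `FakeChain` are hypothetical names introduced for illustration; `FakeChain` is a stand-in so the example runs without an LLM.

```python
def ask_all(chain, questions):
    """Invoke the chain once per question and map each question to its answer.

    Hypothetical helper, for illustration only.
    """
    return {question: chain.invoke(question)["result"] for question in questions}


class FakeChain:
    """Stand-in with the same invoke() shape; swap in the real chain."""

    def invoke(self, question):
        return {"query": question, "result": f"(answer to: {question})"}


answers = ask_all(
    FakeChain(),
    ["Who is the CEO of Apple?", "Where is Apple headquartered?"],
)
for question, answer in answers.items():
    print(question, "->", answer)
```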

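Refresh graph schema

If you mutate or update the graph, you can inspect the refreshed schema information that the Text2Cypher chain uses to generate Cypher statements. You do not need to call refresh_schema() manually each time, because it is called automatically when you invoke the chain.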

graph.refresh_schema()

print(graph.get_schema)

Node properties: [{'properties': [('id', 'STRING'), ('type', 'STRING')], 'label': 'Person'}, {'properties': [('id', 'STRING'), ('type', 'STRING')], 'label': 'Location'}, {'properties': [('id', 'STRING'), ('text', 'STRING'), ('type', 'STRING')], 'label': 'Chunk'}, {'properties': [('id', 'STRING'), ('type', 'STRING')], 'label': 'Company'}]
Relationships properties: [{'properties': [], 'label': 'HAS_HEADQUARTERS_IN'}, {'properties': [('label', 'STRING'), ('triplet_source_id', 'STRING')], 'label': 'MENTIONS_Chunk_Person'}, {'properties': [('label', 'STRING'), ('triplet_source_id', 'STRING')], 'label': 'MENTIONS_Chunk_Location'}, {'properties': [], 'label': 'IS_CEO_OF'}, {'properties': [('label', 'STRING'), ('triplet_source_id', 'STRING')], 'label': 'MENTIONS_Chunk_Company'}]
Relationships: ['(:Company)-[:HAS_HEADQUARTERS_IN]->(:Location)', '(:Chunk)-[:MENTIONS_Chunk_Person]->(:Person)', '(:Chunk)-[:MENTIONS_Chunk_Location]->(:Location)', '(:Person)-[:IS_CEO_OF]->(:Company)', '(:Chunk)-[:MENTIONS_Chunk_Company]->(:Company)']
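
The relationship patterns in this output are plain strings, so they can be split into their parts with a regular expression. The following sketch copies the patterns from the output above and separates the entity-to-entity relationships from the Chunk links that `include_source=True` added. The helper name `parse_pattern` and the regular expression are assumptions made for illustration, not part of the KuzuGraph API.

```python
import re

# Pattern strings copied from the "Relationships" line of the output above.
relationship_patterns = [
    "(:Company)-[:HAS_HEADQUARTERS_IN]->(:Location)",
    "(:Chunk)-[:MENTIONS_Chunk_Person]->(:Person)",
    "(:Chunk)-[:MENTIONS_Chunk_Location]->(:Location)",
    "(:Person)-[:IS_CEO_OF]->(:Company)",
    "(:Chunk)-[:MENTIONS_Chunk_Company]->(:Company)",
]

PATTERN = re.compile(r"^\(:(\w+)\)-\[:(\w+)\]->\(:(\w+)\)$")


def parse_pattern(pattern):
    """Split '(:A)-[:REL]->(:B)' into ('A', 'REL', 'B'). Hypothetical helper."""
    match = PATTERN.match(pattern)
    if match is None:
        raise ValueError(f"Unrecognized relationship pattern: {pattern!r}")
    return match.groups()


# Keep relationships between entities; skip the Chunk -> entity links.
entity_relationships = [
    parsed
    for parsed in map(parse_pattern, relationship_patterns)
    if parsed[0] != "Chunk"
]
print(entity_relationships)
```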

Use separate LLMs for Cypher and answer generation

You can specify cypher_llm and qa_llm separately to use different LLMs for Cypher generation and answer generation.

chain = KuzuQAChain.from_llm(
    cypher_llm=ChatOpenAI(temperature=0, model="gpt-4o-mini"),
    qa_llm=ChatOpenAI(temperature=0, model="gpt-4"),
    graph=graph,
    verbose=True,
    allow_dangerous_requests=True,
)

chain.invoke("Who is the CEO of Apple?")

> Entering new KuzuQAChain chain...
Generated Cypher:
MATCH (p:Person)-[:IS_CEO_OF]->(c:Company {id: 'Apple'}) RETURN p.id, p.type
Full Context:
[{'p.id': 'Tim Cook', 'p.type': 'entity'}]

> Finished chain.

{'query': 'Who is the CEO of Apple?',
 'result': 'Tim Cook is the CEO of Apple.'}
