Neo4j
Neo4j 是由
Neo4j, Inc开发的图形数据库管理系统。
Neo4j存储的数据元素是节点、连接它们的边,以及节点和边的属性。由其开发者描述为符合 ACID 的事务数据库,具有原生图形存储和处理能力,Neo4j提供了一个基于 GNU 通用公共许可证修改版许可的非开源“社区版”,并提供具有在线备份和高可用性扩展的闭源商业许可证。Neo 也根据闭源商业条款许可了带有这些扩展的Neo4j。
本笔记本展示了如何使用 LLM 为图形数据库提供自然语言界面,您可以使用
Cypher查询语 言进行查询。
Cypher 是一种声明式图形查询语言,允许以属性图进行富有表现力和高效的数据查询。
设置
您将需要一个正在运行的 Neo4j 实例。一个选择是使用他们 Aura 云服务中的免费 Neo4j 数据库实例。您也可以使用 Neo4j Desktop 应用程序 在本地运行数据库,或者运行一个 docker 容器。
您可以通过运行以下脚本来运行本地 docker 容器:
docker run \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-d \
-e NEO4J_AUTH=neo4j/password \
-e NEO4J_PLUGINS=\[\"apoc\"\] \
neo4j:latest
如果您使用的是 docker 容器,则需要等待几秒钟让数据库启动。
from langchain_neo4j import GraphCypherQAChain, Neo4jGraph
from langchain_openai import ChatOpenAI
graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="password")
本指南中,我们默认使用 OpenAI 模型。
import getpass
import os
if "OPENAI_API_KEY" not in os.environ:
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
填充数据库
假设您的数据库是空的,您可以使用 Cypher 查询语言进行填充。以下 Cypher 语句是幂等的,这意味着无论您运行一次还是多次,数据库信息都将保持不变。
graph.query(
"""
MERGE (m:Movie {name:"Top Gun", runtime: 120})
WITH m
UNWIND ["Tom Cruise", "Val Kilmer", "Anthony Edwards", "Meg Ryan"] AS actor
MERGE (a:Actor {name:actor})
MERGE (a)-[:ACTED_IN]->(m)
"""
)
[]
刷新图谱模式信息
如果数据库的模式发生变化,您可以刷新生成 Cypher 语句所需的模式信息。
graph.refresh_schema()
print(graph.schema)
Node properties:
Movie {runtime: INTEGER, name: STRING}
Actor {name: STRING}
Relationship properties:
The relationships:
(:Actor)-[:ACTED_IN]->(:Movie)
增强型 Schema 信息
选择增强型 Schema 版本后,系统可以自动扫描数据库中的示例值并计算一些分布度量。例如,如果节点属性的唯一值少于 10 个,我们将返回 Schema 中的所有可能值。否则,每个节点和关系属性将仅返回一个示例值。
enhanced_graph = Neo4jGraph(
url="bolt://localhost:7687",
username="neo4j",
password="password",
enhanced_schema=True,
)
print(enhanced_graph.schema)
Node properties:
- **Movie**
- `runtime`: INTEGER Min: 120, Max: 120
- `name`: STRING Available options: ['Top Gun']
- **Actor**
- `name`: STRING Available options: ['Tom Cruise', 'Val Kilmer', 'Anthony Edwards', 'Meg Ryan']
Relationship properties:
The relationships:
(:Actor)-[:ACTED_IN]->(:Movie)
查询图谱
我们现在可以使用图谱的 Cypher QA 工具来查询图谱了。
chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0), graph=graph, verbose=True, allow_dangerous_requests=True
)
chain.invoke({"query": "Who played in Top Gun?"})
[1m> Entering new GraphCypherQAChain chain...[0m
Generated Cypher:
[32;1m[1;3mMATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name[0m
Full Context:
[32;1m[1;3m[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}][0m
[1m> Finished chain.[0m
{'query': 'Who played in Top Gun?',
'result': 'Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan played in Top Gun.'}
限制结果数量
您可以使用 top_k 参数来限制 Cypher QA Chain 的结果数量。
默认值为 10。
chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0),
graph=graph,
verbose=True,
top_k=2,
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})
[1m> Entering new GraphCypherQAChain chain...[0m
Generated Cypher:
[32;1m[1;3mMATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name[0m
Full Context:
[32;1m[1;3m[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}][0m
[1m> Finished chain.[0m
{'query': 'Who played in Top Gun?',
'result': 'Tom Cruise, Val Kilmer played in Top Gun.'}
返回中间结果
您可以使用 return_intermediate_steps 参数从 Cypher QA Chain 返回中间步骤。
chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0),
graph=graph,
verbose=True,
return_intermediate_steps=True,
allow_dangerous_requests=True,
)
result = chain.invoke({"query": "Who played in Top Gun?"})
print(f"Intermediate steps: {result['intermediate_steps']}")
print(f"Final answer: {result['result']}")
[1m> Entering new GraphCypherQAChain chain...[0m
Generated Cypher:
[32;1m[1;3mMATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name[0m
Full Context:
[32;1m[1;3m[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}][0m
[1m> Finished chain.[0m
Intermediate steps: [{'query': "MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)\nWHERE m.name = 'Top Gun'\nRETURN a.name"}, {'context': [{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}]}]
Final answer: Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan played in Top Gun.
直接返回结果
您可以使用 return_direct 参数从 Cypher QA Chain 直接返回结果。
chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0),
graph=graph,
verbose=True,
return_direct=True,
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})
[1m> Entering new GraphCypherQAChain chain...[0m
Generated Cypher:
[32;1m[1;3mMATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name[0m
[1m> Finished chain.[0m
{'query': 'Who played in Top Gun?',
'result': [{'a.name': 'Tom Cruise'},
{'a.name': 'Val Kilmer'},
{'a.name': 'Anthony Edwards'},
{'a.name': 'Meg Ryan'}]}
添加 Cypher 生成提示中的示例
您可以为特定问题定义 LLM 要生成的 Cypher 语句
from langchain_core.prompts.prompt import PromptTemplate
CYPHER_GENERATION_TEMPLATE = """Task:Generate Cypher statement to query a graph database.
Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
Schema:
{schema}
Note: Do not include any explanations or apologies in your responses.
Do not respond to any questions that might ask anything else than for you to construct a Cypher statement.
Do not include any text except the generated Cypher statement.
Examples: Here are a few examples of generated Cypher statements for particular questions:
# How many people played in Top Gun?
MATCH (m:Movie {{name:"Top Gun"}})<-[:ACTED_IN]-()
RETURN count(*) AS numberOfActors
The question is:
{question}"""
CYPHER_GENERATION_PROMPT = PromptTemplate(
input_variables=["schema", "question"], template=CYPHER_GENERATION_TEMPLATE
)
chain = GraphCypherQAChain.from_llm(
ChatOpenAI(temperature=0),
graph=graph,
verbose=True,
cypher_prompt=CYPHER_GENERATION_PROMPT,
allow_dangerous_requests=True,
)
chain.invoke({"query": "How many people played in Top Gun?"})
[1m> Entering new GraphCypherQAChain chain...[0m
Generated Cypher:
[32;1m[1;3mMATCH (m:Movie {name:"Top Gun"})<-[:ACTED_IN]-()
RETURN count(*) AS numberOfActors[0m
Full Context:
[32;1m[1;3m[{'numberOfActors': 4}][0m
[1m> Finished chain.[0m
{'query': 'How many people played in Top Gun?',
'result': 'There were 4 actors in Top Gun.'}
为 Cypher 查询和答案生成使用独立的 LLM
您可以使用 cypher_llm 和 qa_llm 参数来定义不同的 LLM
chain = GraphCypherQAChain.from_llm(
graph=graph,
cypher_llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo"),
qa_llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo-16k"),
verbose=True,
allow_dangerous_requests=True,
)
chain.invoke({"query": "Who played in Top Gun?"})
[1m> Entering new GraphCypherQAChain chain...[0m
Generated Cypher:
[32;1m[1;3mMATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE m.name = 'Top Gun'
RETURN a.name[0m
Full Context:
[32;1m[1;3m[{'a.name': 'Tom Cruise'}, {'a.name': 'Val Kilmer'}, {'a.name': 'Anthony Edwards'}, {'a.name': 'Meg Ryan'}][0m
[1m> Finished chain.[0m
{'query': 'Who played in Top Gun?',
'result': 'Tom Cruise, Val Kilmer, Anthony Edwards, and Meg Ryan played in Top Gun.'}