SQL Server
Azure SQL 提供了一个专用的 Vector data type,可简化关系数据库中向量嵌入的创建、存储和查询。这消除了对单独向量数据库和相关集成的需求,提高了解决方案的安全性,同时降低了总体复杂性。
Azure SQL 是一项强大的服务,结合了 可伸缩性、安全性和高可用性,提供了现代数据库解决方案的所有优势。它利用复杂的查询优化器和企业级功能,在执行传统 SQL 查询的同时执行向量相似性搜索,从而增强数据分析和决策制定。
在以下链接中了解更多有关使用 Intelligent applications with Azure SQL Database 的信息
本笔记本演示了如何利用此集成的 SQL vector database 来存储文档,并使用余弦(余弦距离)、L2(欧几里得距离)和 IP(内积)执行向量搜索查询,以查找与查询向量接近的文档。
设置
安装 langchain-sqlserver Python 包。
代码位于一个名为 langchain-sqlserver 的集成包中。
!pip install langchain-sqlserver==0.1.1
凭证
运行此笔记本不需要任何凭证,只需确保您已下载 langchain_sqlserver 包。
如果您想获得一流的自动化模型调用跟踪,还可以通过取消注释下面的行来设置您的 LangSmith API 密钥:
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
初始化
from langchain_sqlserver import SQLServer_VectorStore
在 Azure 门户中,于数据库设置下查找你的 Azure SQL 数据库连接字符串。
详细信息:连接到 Azure SQL 数据库 - Python
import os
import pyodbc
# Define your SQLServer Connection String
_CONNECTION_STRING = (
"Driver={ODBC Driver 18 for SQL Server};"
"Server=<YOUR_DBSERVER>.database.windows.net,1433;"
"Database=test;"
"TrustServerCertificate=yes;"
"Connection Timeout=60;"
"LongAsMax=yes;"
)
# Connection string can vary:
# "mssql+pyodbc://<username>:<password><servername>/<dbname>?driver=ODBC+Driver+18+for+SQL+Server" -> With Username and Password specified
# "mssql+pyodbc://<servername>/<dbname>?driver=ODBC+Driver+18+for+SQL+Server&Trusted_connection=yes" -> Uses Trusted connection
# "mssql+pyodbc://<servername>/<dbname>?driver=ODBC+Driver+18+for+SQL+Server" -> Uses EntraID connection
# "mssql+pyodbc://<servername>/<dbname>?driver=ODBC+Driver+18+for+SQL+Server&Trusted_connection=no" -> Uses EntraID connection
在此示例中,我们使用 Azure OpenAI 来生成 embeddings,但您也可以使用 LangChain 中提供的不同 embeddings。
您可以在 Azure 门户上按照此 指南 部署一个 Azure OpenAI 实例。一旦您的实例运行起来,请确保您知道实例的名称和密钥。您可以在 Azure 门户的实例“密钥和终结点”部分找到密钥。
!pip install langchain-openai
# Import the necessary Libraries
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
# Set your AzureOpenAI details
azure_endpoint = "https://<YOUR_ENDPOINT>.openai.azure.com/"
azure_deployment_name_embedding = "text-embedding-3-small"
azure_deployment_name_chatcompletion = "chatcompletion"
azure_api_version = "2023-05-15"
azure_api_key = "YOUR_KEY"
# Use AzureChatOpenAI for chat completions
llm = AzureChatOpenAI(
azure_endpoint=azure_endpoint,
azure_deployment=azure_deployment_name_chatcompletion,
openai_api_version=azure_api_version,
openai_api_key=azure_api_key,
)
# Use AzureOpenAIEmbeddings for embeddings
embeddings = AzureOpenAIEmbeddings(
azure_endpoint=azure_endpoint,
azure_deployment=azure_deployment_name_embedding,
openai_api_version=azure_api_version,
openai_api_key=azure_api_key,
)
管理向量存储
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_sqlserver import SQLServer_VectorStore
# Initialize the vector store
vector_store = SQLServer_VectorStore(
connection_string=_CONNECTION_STRING,
distance_strategy=DistanceStrategy.COSINE, # optional, if not provided, defaults to COSINE
embedding_function=embeddings, # you can use different embeddings provided in LangChain
embedding_length=1536,
table_name="langchain_test_table", # using table with a custom name
)
添加项到向量存储
## we will use some artificial data for this example
query = [
"I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most.",
"The candy is just red , No flavor . Just plan and chewy . I would never buy them again",
"Arrived in 6 days and were so stale i could not eat any of the 6 bags!!",
"Got these on sale for roughly 25 cents per cup, which is half the price of my local grocery stores, plus they rarely stock the spicy flavors. These things are a GREAT snack for my office where time is constantly crunched and sometimes you can't escape for a real meal. This is one of my favorite flavors of Instant Lunch and will be back to buy every time it goes on sale.",
"If you are looking for a less messy version of licorice for the children, then be sure to try these! They're soft, easy to chew, and they don't get your hands all sticky and gross in the car, in the summer, at the beach, etc. We love all the flavos and sometimes mix these in with the chocolate to have a very nice snack! Great item, great price too, highly recommend!",
"We had trouble finding this locally - delivery was fast, no more hunting up and down the flour aisle at our local grocery stores.",
"Too much of a good thing? We worked this kibble in over time, slowly shifting the percentage of Felidae to national junk-food brand until the bowl was all natural. By this time, the cats couldn't keep it in or down. What a mess. We've moved on.",
"Hey, the description says 360 grams - that is roughly 13 ounces at under $4.00 per can. No way - that is the approximate price for a 100 gram can.",
"The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2"
" smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.",
"I have used this product in smoothies for my son and he loves it. Additionally, I use this oil in the shower as a skin conditioner and it has made my skin look great. Some of the stretch marks on my belly has disappeared quickly. Highly recommend!!!",
"Been taking Coconut Oil for YEARS. This is the best on the retail market. I wish it was in glass, but this is the one.",
]
query_metadata = [
{"id": 1, "summary": "Good Quality Dog Food"},
{"id": 8, "summary": "Nasty No flavor"},
{"id": 4, "summary": "stale product"},
{"id": 11, "summary": "Great value and convenient ramen"},
{"id": 5, "summary": "Great for the kids!"},
{"id": 2, "summary": "yum falafel"},
{"id": 9, "summary": "Nearly killed the cats"},
{"id": 6, "summary": "Price cannot be correct"},
{"id": 3, "summary": "Taste is neutral, quantity is DECEITFUL!"},
{"id": 7, "summary": "This stuff is great"},
{"id": 10, "summary": "The reviews don't lie"},
]
vector_store.add_texts(texts=query, metadatas=query_metadata)
[1, 8, 4, 11, 5, 2, 9, 6, 3, 7, 10]
查询向量存储
一旦创建了向量存储并添加了相关文档,您将最有可能希望在链或代理运行时查询它。
执行简单的相似性搜索可以这样做:
# Perform a similarity search between the embedding of the query and the embeddings of the documents
simsearch_result = vector_store.similarity_search("Good reviews", k=3)
print(simsearch_result)
[Document(metadata={'id': 1, 'summary': 'Good Quality Dog Food'}, page_content='I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most.'), Document(metadata={'id': 7, 'summary': 'This stuff is great'}, page_content='I have used this product in smoothies for my son and he loves it. Additionally, I use this oil in the shower as a skin conditioner and it has made my skin look great. Some of the stretch marks on my belly has disappeared quickly. Highly recommend!!!'), Document(metadata={'id': 5, 'summary': 'Great for the kids!'}, page_content="If you are looking for a less messy version of licorice for the children, then be sure to try these! They're soft, easy to chew, and they don't get your hands all sticky and gross in the car, in the summer, at the beach, etc. We love all the flavos and sometimes mix these in with the chocolate to have a very nice snack! Great item, great price too, highly recommend!")]
筛选支持:
向量存储支持一组可以应用于文档元数据字段的过滤器。此功能使开发人员和数据分析师能够优化他们的查询,确保搜索结果准确地符合他们的需求。通过根据特定的元数据属性应用过滤器,用户可以限制其搜索范围,只关注最相关的数据子集。
# hybrid search -> filter for cases where id not equal to 1.
hybrid_simsearch_result = vector_store.similarity_search(
"Good reviews", k=3, filter={"id": {"$ne": 1}}
)
print(hybrid_simsearch_result)
[Document(metadata={'id': 7, 'summary': 'This stuff is great'}, page_content='I have used this product in smoothies for my son and he loves it. Additionally, I use this oil in the shower as a skin conditioner and it has made my skin look great. Some of the stretch marks on my belly has disappeared quickly. Highly recommend!!!'), Document(metadata={'id': 5, 'summary': 'Great for the kids!'}, page_content="If you are looking for a less messy version of licorice for the children, then be sure to try these! They're soft, easy to chew, and they don't get your hands all sticky and gross in the car, in the summer, at the beach, etc. We love all the flavos and sometimes mix these in with the chocolate to have a very nice snack! Great item, great price too, highly recommend!"), Document(metadata={'id': 3, 'summary': 'Taste is neutral, quantity is DECEITFUL!'}, page_content='The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2 smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.')]
相似性搜索及得分:
如果你想执行相似性搜索并获取相应的得分,可以运行:
simsearch_with_score_result = vector_store.similarity_search_with_score(
"Not a very good product", k=12
)
print(simsearch_with_score_result)
[(Document(metadata={'id': 3, 'summary': 'Taste is neutral, quantity is DECEITFUL!'}, page_content='The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2 smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.'), 0.651870006770711), (Document(metadata={'id': 8, 'summary': 'Nasty No flavor'}, page_content='The candy is just red , No flavor . Just plan and chewy . I would never buy them again'), 0.6908952973052638), (Document(metadata={'id': 4, 'summary': 'stale product'}, page_content='Arrived in 6 days and were so stale i could not eat any of the 6 bags!!'), 0.7360955776468822), (Document(metadata={'id': 1, 'summary': 'Good Quality Dog Food'}, page_content='I have bought several of the Vitality canned dog food products and have found them all to be of good quality. The product looks more like a stew than a processed meat and it smells better. My Labrador is finicky and she appreciates this product better than most.'), 0.7408823529514486), (Document(metadata={'id': 9, 'summary': 'Nearly killed the cats'}, page_content="Too much of a good thing? We worked this kibble in over time, slowly shifting the percentage of Felidae to national junk-food brand until the bowl was all natural. By this time, the cats couldn't keep it in or down. What a mess. We've moved on."), 0.782995248991772), (Document(metadata={'id': 7, 'summary': 'This stuff is great'}, page_content='I have used this product in smoothies for my son and he loves it. Additionally, I use this oil in the shower as a skin conditioner and it has made my skin look great. Some of the stretch marks on my belly has disappeared quickly. Highly recommend!!!'), 0.7912681479906212), (Document(metadata={'id': 2, 'summary': 'yum falafel'}, page_content='We had trouble finding this locally - delivery was fast, no more hunting up and down the flour aisle at our local grocery stores.'), 0.809213468778896), (Document(metadata={'id': 10, 'summary': "The reviews don't lie"}, page_content='Been taking Coconut Oil for YEARS. This is the best on the retail market. I wish it was in glass, but this is the one.'), 0.8281482301097155), (Document(metadata={'id': 5, 'summary': 'Great for the kids!'}, page_content="If you are looking for a less messy version of licorice for the children, then be sure to try these! They're soft, easy to chew, and they don't get your hands all sticky and gross in the car, in the summer, at the beach, etc. We love all the flavos and sometimes mix these in with the chocolate to have a very nice snack! Great item, great price too, highly recommend!"), 0.8283754326400574), (Document(metadata={'id': 6, 'summary': 'Price cannot be correct'}, page_content='Hey, the description says 360 grams - that is roughly 13 ounces at under $4.00 per can. No way - that is the approximate price for a 100 gram can.'), 0.8323967822635847), (Document(metadata={'id': 11, 'summary': 'Great value and convenient ramen'}, page_content="Got these on sale for roughly 25 cents per cup, which is half the price of my local grocery stores, plus they rarely stock the spicy flavors. These things are a GREAT snack for my office where time is constantly crunched and sometimes you can't escape for a real meal. This is one of my favorite flavors of Instant Lunch and will be back to buy every time it goes on sale."), 0.8387189489406939)]
如需执行 Azure SQL 向量存储上的各种搜索的完整列表,请参阅 API 参考。
相似性搜索:当您已拥有想要搜索的嵌入时
# if you already have embeddings you want to search on
simsearch_by_vector = vector_store.similarity_search_by_vector(
[-0.0033353185281157494, -0.017689190804958344, -0.01590404286980629, ...]
)
print(simsearch_by_vector)
[Document(metadata={'id': 8, 'summary': 'Nasty No flavor'}, page_content='The candy is just red , No flavor . Just plan and chewy . I would never buy them again'), Document(metadata={'id': 4, 'summary': 'stale product'}, page_content='Arrived in 6 days and were so stale i could not eat any of the 6 bags!!'), Document(metadata={'id': 3, 'summary': 'Taste is neutral, quantity is DECEITFUL!'}, page_content='The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2 smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.'), Document(metadata={'id': 6, 'summary': 'Price cannot be correct'}, page_content='Hey, the description says 360 grams - that is roughly 13 ounces at under $4.00 per can. No way - that is the approximate price for a 100 gram can.')]
# Similarity Search with Score if you already have embeddings you want to search on
simsearch_by_vector_with_score = vector_store.similarity_search_by_vector_with_score(
[-0.0033353185281157494, -0.017689190804958344, -0.01590404286980629, ...]
)
print(simsearch_by_vector_with_score)
[(Document(metadata={'id': 8, 'summary': 'Nasty No flavor'}, page_content='The candy is just red , No flavor . Just plan and chewy . I would never buy them again'), 0.9648153551769503), (Document(metadata={'id': 4, 'summary': 'stale product'}, page_content='Arrived in 6 days and were so stale i could not eat any of the 6 bags!!'), 0.9655108580341948), (Document(metadata={'id': 3, 'summary': 'Taste is neutral, quantity is DECEITFUL!'}, page_content='The taste of these white cheddar flat breads is like a regular cracker - which is not bad, except that I bought them because I wanted a cheese taste.<br /><br />What was a HUGE disappointment? How misleading the packaging of the box is. The photo on the box (I bought these in store) makes it look like it is full of long flatbreads (expanding the length and width of the box). Wrong! The plastic tray that holds the crackers is about 2 smaller all around - leaving you with about 15 or so small flatbreads.<br /><br />What is also bad about this is that the company states they use biodegradable and eco-friendly packaging. FAIL! They used a HUGE box for a ridiculously small amount of crackers. Not ecofriendly at all.<br /><br />Would I buy these again? No - I feel ripped off. The other crackers (like Sesame Tarragon) give you a little<br />more bang for your buck and have more flavor.'), 0.9840511208615808), (Document(metadata={'id': 6, 'summary': 'Price cannot be correct'}, page_content='Hey, the description says 360 grams - that is roughly 13 ounces at under $4.00 per can. No way - that is the approximate price for a 100 gram can.'), 0.9915737524649991)]
从向量存储中删除项目
按 ID 删除行
# delete row by id
vector_store.delete(["3", "7"])
True