DingoDB
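DingoDB is a distributed multimodal vector database that combines the characteristics of a data lake and a vector database. It can store data of any type and size (key-value pairs, PDFs, audio, video, and so on). It offers real-time, low-latency processing for fast insight and response, and it can efficiently analyze and process multimodal data on the fly.
To use this integration, install langchain-community with pip install -qU langchain-community.
This notebook shows how to use functionality related to the DingoDB vector database.
To run it, you need a DingoDB instance that is up and running.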
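Before running the cells below, it can help to confirm that something is listening on the DingoDB address. The following is a minimal sketch using only the Python standard library. The helper name `is_reachable` is hypothetical, and `127.0.0.1:13000` is simply the address passed to the client later in this notebook. It only checks that a TCP connection can be opened, not that the service behind it is a healthy DingoDB instance.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call with the address used in this notebook (an assumption about your setup):
#   is_reachable("127.0.0.1", 13000)
```

If the check returns False, start or reconfigure the DingoDB instance before continuing.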
%pip install --upgrade --quiet dingodb
# or install latest:
%pip install --upgrade --quiet git+https://git@github.com/dingodb/pydingo.git
We want to use OpenAIEmbeddings, so we need to get an OpenAI API key.
import getpass
import os
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Dingo
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
loader = TextLoader("../../how_to/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
embeddings = OpenAIEmbeddings()
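To make the effect of `chunk_size=1000` and `chunk_overlap=0` more concrete, here is a simplified sketch of separator-based chunking in plain Python. It is an illustration only, not the implementation of `CharacterTextSplitter`: the function name `greedy_chunks` is hypothetical, and the sketch assumes a blank-line separator and no overlap between chunks.

```python
from typing import List


def greedy_chunks(text: str, chunk_size: int = 1000, separator: str = "\n\n") -> List[str]:
    """Split on `separator`, then pack consecutive pieces into chunks.

    Each chunk holds at most `chunk_size` characters, except that a single
    piece longer than `chunk_size` is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate  # the piece still fits, or it starts a new chunk
        else:
            chunks.append(current)  # close the full chunk and start a new one
            current = piece
    if current:
        chunks.append(current)
    return chunks
```

With no overlap, every piece of the input ends up in exactly one chunk, so each chunk is embedded and stored once.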
from dingodb import DingoDB
index_name = "langchain_demo"
dingo_client = DingoDB(user="", password="", host=["127.0.0.1:13000"])
# First, check if our index already exists. If it doesn't, we create it
if (
    index_name not in dingo_client.get_index()
    and index_name.upper() not in dingo_client.get_index()
):
    # Create a new index; adjust these parameters to match your own setup
    dingo_client.create_index(
        index_name=index_name, dimension=1536, metric_type="cosine", auto_id=False
    )
# The OpenAI embedding model `text-embedding-ada-002` uses 1536 dimensions
docsearch = Dingo.from_documents(
    docs, embeddings, client=dingo_client, index_name=index_name
)
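The `dimension` passed to `create_index` has to match the length of the vectors your embedding model produces. A small guard like the one below can catch a mismatch early. This is a sketch under stated assumptions: `assert_dimension` and `INDEX_DIMENSION` are hypothetical names, and the commented call assumes the `embeddings` object defined earlier in this notebook.

```python
from typing import Sequence

INDEX_DIMENSION = 1536  # the dimension used in the create_index call above


def assert_dimension(vector: Sequence[float], expected: int = INDEX_DIMENSION) -> None:
    """Raise ValueError if the vector length differs from the index dimension."""
    if len(vector) != expected:
        raise ValueError(
            f"Embedding has {len(vector)} dimensions, but the index expects {expected}"
        )


# With the notebook's objects, a check could look like this (requires an API key):
#   assert_dimension(embeddings.embed_query("dimension probe"))
```

If you switch to a different embedding model, update the index dimension to match before inserting documents.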
query = "What did the president say about Ketanji Brown Jackson"
docs = docsearch.similarity_search(query)
print(docs[0].page_content)
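Conceptually, a similarity search embeds the query and returns the stored vectors closest to it under the index metric (`cosine` in the index created above). The sketch below shows that idea as an exhaustive scan over an in-memory list. It is an illustration only: `cosine_similarity` and `top_k` are hypothetical helpers, and a real vector store typically relies on an index structure rather than comparing against every vector.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], vectors: Sequence[Sequence[float]], k: int = 4
) -> List[Tuple[int, float]]:
    """Return (position, score) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return scored[:k]
```

The positions returned by `top_k` play the role of the documents returned by `similarity_search`: the first entry is the closest match.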
Add More Text to an Existing Index
You can embed more text and upsert it into an existing Dingo index with the add_texts function.
vectorstore = Dingo(embeddings, "text", client=dingo_client, index_name=index_name)
vectorstore.add_texts(["More text!"])
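When the new texts come with extra fields such as a source or a page number, they can be passed alongside the texts as metadata. The sketch below prepares the two parallel lists that LangChain's `add_texts` interface accepts (`texts` and an optional `metadatas`). The helper `to_texts_and_metadatas` and the record layout are hypothetical, and the final call is shown as a comment because it needs a running DingoDB instance.

```python
from typing import Dict, Iterable, List, Tuple


def to_texts_and_metadatas(
    records: Iterable[Dict[str, str]],
) -> Tuple[List[str], List[Dict[str, str]]]:
    """Split records into parallel lists of texts and metadata dicts.

    Records without a non-empty "text" field are skipped.
    """
    texts: List[str] = []
    metadatas: List[Dict[str, str]] = []
    for record in records:
        text = record.get("text", "").strip()
        if not text:
            continue
        texts.append(text)
        # Everything except the text itself is kept as metadata
        metadatas.append({k: v for k, v in record.items() if k != "text"})
    return texts, metadatas


records = [
    {"text": "More text!", "source": "notes"},
    {"text": "   ", "source": "empty"},  # skipped: no usable text
]
texts, metadatas = to_texts_and_metadatas(records)

# With the notebook's objects, the upsert would then be:
#   vectorstore.add_texts(texts, metadatas=metadatas)
```

Keeping the two lists the same length matters, since each metadata dict is paired with the text at the same position.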