
How to summarize text through parallelization

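LLMs can summarize text and otherwise distill desired information from it, including large volumes of text. In many cases, especially when the amount of text is large compared to the size of the model's context window, it can be helpful (or necessary) to break the summarization task into smaller components.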

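Map-reduce represents one class of strategies for accomplishing this. The idea is to break the text into "sub-documents" and first map each sub-document to an individual summary using an LLM. Then we reduce, or consolidate, those summaries into a single global summary.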
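
As a rough illustration of the split, map, and reduce stages, here is a minimal pure-Python sketch. It is not the LangChain implementation: `split_text`, `first_words`, `join_summaries`, and `map_reduce_summarize` are hypothetical helpers, and the "summarizer" simply keeps the first few words of each chunk as a stand-in for an LLM call.

```python
from typing import Callable, List


def split_text(text: str, max_words: int) -> List[str]:
    """Split text into sub-documents of at most `max_words` words."""
    words = text.split()
    return [
        " ".join(words[i : i + max_words]) for i in range(0, len(words), max_words)
    ]


def first_words(text: str, n: int = 2) -> str:
    """Hypothetical stand-in for an LLM summary: keep the first `n` words."""
    return " ".join(text.split()[:n])


def join_summaries(summaries: List[str]) -> str:
    """Hypothetical stand-in for the reduce step: concatenate the summaries."""
    return " | ".join(summaries)


def map_reduce_summarize(
    text: str,
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[List[str]], str],
    max_words: int = 20,
) -> str:
    chunks = split_text(text, max_words)          # split into sub-documents
    partial = [map_fn(chunk) for chunk in chunks]  # map: one summary per chunk
    return reduce_fn(partial)                      # reduce: one global summary


text = " ".join(f"word{i}" for i in range(50))
print(map_reduce_summarize(text, first_words, join_summaries))
# word0 word1 | word20 word21 | word40 word41
```

In a real pipeline, `map_fn` and `reduce_fn` would each wrap a model call.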

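Note that the map step is typically parallelized over the input documents. This strategy is especially effective when understanding a sub-document does not rely on preceding context, for example when summarizing a corpus of many shorter documents.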
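
Because each sub-document is summarized independently, the map calls can run concurrently. The sketch below shows one way to do that with `asyncio.gather` from the standard library. `fake_summarize` is a hypothetical placeholder for an async model call (it just truncates the input), and the script assumes it runs outside an already-running event loop.

```python
import asyncio
from typing import List


async def fake_summarize(doc: str) -> str:
    """Hypothetical stand-in for an async LLM call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return doc[:10]


async def map_in_parallel(docs: List[str]) -> List[str]:
    # gather runs the coroutines concurrently and returns results
    # in the same order as the inputs.
    return await asyncio.gather(*(fake_summarize(doc) for doc in docs))


docs = ["alpha " * 5, "beta " * 5, "gamma " * 5]
summaries = asyncio.run(map_in_parallel(docs))
print(summaries)
```

The total wall-clock time is close to that of the slowest single call rather than the sum of all calls.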

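LangGraph, built on top of langchain-core, supports map-reduce workflows and is well suited to this problem:

  • LangGraph allows individual steps (such as successive summarizations) to be streamed, allowing for greater control of execution;
  • LangGraph's checkpointing supports error recovery, extension with human-in-the-loop workflows, and easier incorporation into conversational applications;
  • The LangGraph implementation is straightforward to modify and extend.

Below, we demonstrate how to summarize text via a map-reduce strategy.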

Load chat model

Let's first load a chat model:

pip install -qU "langchain[google-genai]"

import getpass
import os

if not os.environ.get("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for Google Gemini: ")

from langchain.chat_models import init_chat_model

llm = init_chat_model("gemini-2.0-flash", model_provider="google_genai")

Load documents

First we load our documents. We will use WebBaseLoader to load a blog post and split it into smaller sub-documents.

from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=1000, chunk_overlap=0
)

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
docs = loader.load()

split_docs = text_splitter.split_documents(docs)
print(f"Generated {len(split_docs)} documents.")
Created a chunk of size 1003, which is longer than the specified 1000
Generated 14 documents.

Create graph

Map step

Let's first define the prompt associated with the map step and associate it with the LLM via a chain:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

map_prompt = ChatPromptTemplate.from_messages(
    # "\n\n" puts a blank line between the instruction and the document text
    [("human", "Write a concise summary of the following:\n\n{context}")]
)

map_chain = map_prompt | llm | StrOutputParser()

Reduce step

We also define a chain that takes the document mapping results and reduces them into a single output.

reduce_template = """
The following is a set of summaries:
{docs}
Take these and distill it into a final, consolidated summary
of the main themes.
"""

reduce_prompt = ChatPromptTemplate([("human", reduce_template)])

reduce_chain = reduce_prompt | llm | StrOutputParser()

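Orchestration via LangGraph

Below we implement a simple application that maps the summarization step over a list of documents, then reduces the results using the above prompts.

Map-reduce flows are particularly useful when texts are long compared to the context window of an LLM. For long texts, we need a mechanism that ensures the context to be summarized in the reduce step does not exceed the model's context window size. Here we implement a recursive "collapsing" of the summaries: the inputs are partitioned based on a token limit, and a summary is generated for each partition. This step is repeated until the total length of the summaries is within the desired limit, which allows text of arbitrary length to be summarized.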
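
The collapsing loop described above can be sketched without any LLM. In this illustrative version, tokens are approximated by whitespace-separated words, `halve` is a hypothetical stand-in for the reduce chain, and `max_rounds` is an assumed safety cap for the case where summaries stop shrinking. None of these names come from LangChain.

```python
from typing import Callable, List


def count_tokens(texts: List[str]) -> int:
    """Assumption: approximate tokens by whitespace-separated words."""
    return sum(len(text.split()) for text in texts)


def partition(texts: List[str], token_max: int) -> List[List[str]]:
    """Greedily group consecutive texts so each group stays within token_max."""
    groups: List[List[str]] = []
    current: List[str] = []
    for text in texts:
        if current and count_tokens(current + [text]) > token_max:
            groups.append(current)
            current = []
        current.append(text)
    if current:
        groups.append(current)
    return groups


def halve(group: List[str]) -> str:
    """Hypothetical stand-in for the reduce chain: keep half of the words."""
    words = " ".join(group).split()
    return " ".join(words[: max(1, len(words) // 2)])


def collapse_until_fits(
    texts: List[str],
    summarize: Callable[[List[str]], str],
    token_max: int,
    max_rounds: int = 10,
) -> List[str]:
    rounds = 0
    while count_tokens(texts) > token_max:
        if rounds == max_rounds:
            raise RuntimeError("summaries did not shrink below token_max")
        # One collapse pass: summarize each partition into a single text.
        texts = [summarize(group) for group in partition(texts, token_max)]
        rounds += 1
    return texts


# Eight "summaries" of six words each: 48 tokens against a budget of 20.
texts = [" ".join(f"t{i}w{j}" for j in range(6)) for i in range(8)]
collapsed = collapse_until_fits(texts, halve, token_max=20)
print(len(collapsed), count_tokens(collapsed))
# 2 12
```

Here the first pass shrinks 48 tokens to 24, which is still over budget, so a second pass is needed. The same reasoning explains why the graph below contains a loop.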

We will need to install langgraph:

pip install -qU langgraph

import operator
from typing import Annotated, List, Literal, TypedDict

from langchain.chains.combine_documents.reduce import (
    acollapse_docs,
    split_list_of_docs,
)
from langchain_core.documents import Document
from langgraph.constants import Send
from langgraph.graph import END, START, StateGraph

token_max = 1000


def length_function(documents: List[Document]) -> int:
    """Get number of tokens for input contents."""
    return sum(llm.get_num_tokens(doc.page_content) for doc in documents)


# This will be the overall state of the main graph.
# It will contain the input document contents, corresponding
# summaries, and a final summary.
class OverallState(TypedDict):
    contents: List[str]
    # Notice here we use operator.add.
    # This is because we want to combine all the summaries we generate
    # from individual nodes back into one list - this is essentially
    # the "reduce" part
    summaries: Annotated[list, operator.add]
    collapsed_summaries: List[Document]
    final_summary: str


# This will be the state of the node that we will "map" all
# documents to in order to generate summaries
class SummaryState(TypedDict):
    content: str


# Here we generate a summary, given a document
async def generate_summary(state: SummaryState):
    response = await map_chain.ainvoke(state["content"])
    return {"summaries": [response]}


# Here we define the logic to map out over the documents
# We will use this as an edge in the graph
def map_summaries(state: OverallState):
    # We will return a list of `Send` objects
    # Each `Send` object consists of the name of a node in the graph
    # as well as the state to send to that node
    return [
        Send("generate_summary", {"content": content}) for content in state["contents"]
    ]


def collect_summaries(state: OverallState):
    return {
        "collapsed_summaries": [Document(summary) for summary in state["summaries"]]
    }


# Add node to collapse summaries
async def collapse_summaries(state: OverallState):
    doc_lists = split_list_of_docs(
        state["collapsed_summaries"], length_function, token_max
    )
    results = []
    for doc_list in doc_lists:
        results.append(await acollapse_docs(doc_list, reduce_chain.ainvoke))

    return {"collapsed_summaries": results}


# This represents a conditional edge in the graph that determines
# if we should collapse the summaries or not
def should_collapse(
    state: OverallState,
) -> Literal["collapse_summaries", "generate_final_summary"]:
    num_tokens = length_function(state["collapsed_summaries"])
    if num_tokens > token_max:
        return "collapse_summaries"
    else:
        return "generate_final_summary"


# Here we will generate the final summary
async def generate_final_summary(state: OverallState):
    response = await reduce_chain.ainvoke(state["collapsed_summaries"])
    return {"final_summary": response}


# Construct the graph
# Nodes:
graph = StateGraph(OverallState)
graph.add_node("generate_summary", generate_summary) # same as before
graph.add_node("collect_summaries", collect_summaries)
graph.add_node("collapse_summaries", collapse_summaries)
graph.add_node("generate_final_summary", generate_final_summary)

# Edges:
graph.add_conditional_edges(START, map_summaries, ["generate_summary"])
graph.add_edge("generate_summary", "collect_summaries")
graph.add_conditional_edges("collect_summaries", should_collapse)
graph.add_conditional_edges("collapse_summaries", should_collapse)
graph.add_edge("generate_final_summary", END)

app = graph.compile()
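
The `summaries` key in the state is annotated with `operator.add`, so updates from the parallel `generate_summary` branches are concatenated rather than overwritten. The snippet below illustrates only that merge semantics in plain Python. `merge_updates` is a hypothetical helper, not LangGraph's internal code, and the update dictionaries are made-up examples of what each branch returns.

```python
import operator
from functools import reduce
from typing import Dict, List

# Each parallel branch returns a partial update for the same state key.
updates: List[Dict[str, List[str]]] = [
    {"summaries": ["summary A"]},
    {"summaries": ["summary B"]},
    {"summaries": ["summary C"]},
]


def merge_updates(initial: List[str], updates: List[Dict[str, List[str]]]) -> List[str]:
    """Fold every branch's list into the existing value with operator.add."""
    return reduce(operator.add, (u["summaries"] for u in updates), initial)


merged = merge_updates([], updates)
print(merged)
# ['summary A', 'summary B', 'summary C']
```

Without a reducer, each branch's update would replace the previous value, and only one summary would survive.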

LangGraph allows the graph structure to be plotted to help visualize its function:

from IPython.display import Image

Image(app.get_graph().draw_mermaid_png())

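Invoke graph

When running the application, we can stream the graph to observe its sequence of steps. Below, we simply print the name of each step.

Note that because the graph contains a loop, it can be helpful to specify a recursion_limit on its execution. This will raise a specific error when the specified limit is exceeded.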

async for step in app.astream(
    {"contents": [doc.page_content for doc in split_docs]},
    {"recursion_limit": 10},
):
    print(list(step.keys()))
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['generate_summary']
['collect_summaries']
['collapse_summaries']
['collapse_summaries']
['generate_final_summary']
print(step)
{'generate_final_summary': {'final_summary': 'The consolidated summary of the main themes from the provided documents highlights the advancements and applications of large language models (LLMs) in artificial intelligence, particularly in autonomous agents and software development. Key themes include:\n\n1. **Integration of LLMs**: LLMs play a crucial role in enabling autonomous agents to perform complex tasks through advanced reasoning and decision-making techniques, such as Chain of Thought (CoT) and Tree of Thoughts.\n\n2. **Memory Management**: The categorization of memory into sensory, short-term, and long-term types parallels machine learning concepts, with short-term memory facilitating in-context learning and long-term memory enhanced by external storage solutions.\n\n3. **Tool Use and APIs**: Autonomous agents utilize external APIs to expand their capabilities, demonstrating adaptability and improved problem-solving skills.\n\n4. **Search Algorithms**: Various approximate nearest neighbor search algorithms, including Locality-Sensitive Hashing (LSH) and FAISS, are discussed for enhancing search efficiency in high-dimensional spaces.\n\n5. **Neuro-Symbolic Architectures**: The integration of neuro-symbolic systems, such as the MRKL framework, combines expert modules with LLMs to improve problem-solving, particularly in complex tasks.\n\n6. **Challenges and Innovations**: The documents address challenges like hallucination and inefficient planning in LLMs, alongside innovative methods such as Chain of Hindsight (CoH) and Algorithm Distillation (AD) for performance enhancement.\n\n7. **Software Development Practices**: The use of LLMs in software development is explored, particularly in creating structured applications like a Super Mario game using the model-view-controller (MVC) architecture, emphasizing task management, component organization, and documentation.\n\n8. **Limitations of LLMs**: Constraints such as finite context length and challenges in long-term planning are acknowledged, along with concerns regarding the reliability of natural language as an interface.\n\nOverall, the integration of LLMs and neuro-symbolic architectures signifies a significant evolution in AI, with ongoing research focused on enhancing planning, memory management, and problem-solving capabilities across various applications.'}}

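Next steps

Check out the LangGraph documentation for details on building with LangGraph, including this guide on the details of map-reduce in LangGraph.

See the summarization how-to guides for additional summarization strategies, including those designed for larger volumes of text.

See also this tutorial for more detail on summarization.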