
From the perspective of the RAG pipeline, query rewriting is a pre-retrieval method. Note that Figure 1 only roughly illustrates the position of query rewriting in RAG. In the sections below, we introduce several common rewriting algorithms and how they improve this process.
Query rewriting is a key technique for aligning the semantics of queries and documents. For example:

Hypothetical Document Embeddings (HyDE) aligns the semantic spaces of queries and documents through hypothetical documents.
Rewrite-Retrieve-Read rewrites the query with an LLM before the traditional retrieve-then-read step.
Step-back prompting abstracts the query into a higher-level question and retrieves facts at that level.
Query2doc generates pseudo-documents with a few LLM prompts and merges them with the original query to build a new query.
ITER-RETGEN combines the previous generation with the query to retrieve again, iterating between retrieval and generation.
Let's dig into the details of these methods.
The paper "Precise Zero-Shot Dense Retrieval without Relevance Labels" [1] proposes a method based on Hypothetical Document Embeddings (HyDE). The main process is shown in Figure 2:
The process consists of four steps:
1. Use an LLM to generate k hypothetical documents based on the query. These generated documents may not be factual and may contain errors, but they should resemble the relevant documents. The purpose of this step is to interpret the user's query through the LLM.
2. Feed the generated hypothetical documents into an encoder, which maps each one to a dense vector f(d_k). The encoder acts as a filter, removing noise from the hypothetical documents. Here, d_k denotes the k-th generated document and f denotes the encoder operation.
3. Average the resulting k vectors using the formula below:

$$\hat{v} = \frac{1}{k}\sum_{i=1}^{k} f(d_i)$$

We can also treat the original query q as one possible hypothesis:

$$\hat{v} = \frac{1}{k+1}\left[\sum_{i=1}^{k} f(d_i) + f(q)\right]$$
4. Use the vector v to retrieve answers from the document corpus. As established in step 3, this vector carries information from both the user's query and the desired answer pattern, which can improve recall. A minimal end-to-end sketch of these four steps follows this list.
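To make the steps concrete, here is a minimal sketch of the HyDE pipeline in plain Python. The generate_hypothetical_docs and encode functions are hypothetical placeholders for an LLM call and a dense encoder; only the averaging mirrors the formula above.

import numpy as np

def hyde_query_vector(query: str, k: int = 4) -> np.ndarray:
    """Compute the HyDE query vector v for a user query (sketch)."""
    # Step 1: ask an LLM for k hypothetical answer documents (hypothetical helper).
    hypothetical_docs = generate_hypothetical_docs(query, k)

    # Step 2: encode each hypothetical document into a dense vector f(d_i) (hypothetical encoder).
    doc_vectors = [encode(d) for d in hypothetical_docs]

    # Step 3: average the k document vectors together with the encoded query.
    v = np.mean(doc_vectors + [encode(query)], axis=0)

    # Step 4: v is then used for nearest-neighbor search over the real corpus.
    return v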
My understanding of HyDE is shown in Figure 3. The goal of HyDE is to generate hypothetical documents so that the final query vector v aligns as closely as possible with the actual documents in the vector space.
HyDE is supported in both LlamaIndex and LangChain. Below, LlamaIndex is used as an example to illustrate the HyDE implementation process.
Place the file [2] in YOUR_DIR_PATH. The test code is as follows (the installed LlamaIndex version is 0.10.12):
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine

# Load documents, build the VectorStoreIndex
dir_path = "YOUR_DIR_PATH"
documents = SimpleDirectoryReader(dir_path).load_data()
index = VectorStoreIndex.from_documents(documents)

query_str = "what did paul graham do after going to RISD"

# Query without transformation: The same query string is used for embedding lookup and also summarization.
query_engine = index.as_query_engine()
response = query_engine.query(query_str)

print('-' * 100)
print("Base query:")
print(response)

# Query with HyDE transformation
hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(query_engine, hyde)
response = hyde_query_engine.query(query_str)

print('-' * 100)
print("After HyDEQueryTransform:")
print(response)
First, look at the default HyDE prompt in LlamaIndex [3]:
############################################
# HYDE
############################################

HYDE_TMPL = (
    "Please write a passage to answer the question\n"
    "Try to include as many key details as possible.\n"
    "\n"
    "\n"
    "{context_str}\n"
    "\n"
    "\n"
    'Passage:"""\n'
)

DEFAULT_HYDE_PROMPT = PromptTemplate(HYDE_TMPL, prompt_type=PromptType.SUMMARY)
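Since DEFAULT_HYDE_PROMPT is an ordinary PromptTemplate, it can be replaced. Below is a minimal sketch of passing a custom prompt; the template wording is my own, not from LlamaIndex, and it must keep the {context_str} placeholder, which HyDEQueryTransform fills with the user query.

from llama_index.core import PromptTemplate
from llama_index.core.indices.query.query_transform import HyDEQueryTransform

# Custom template (hypothetical wording); {context_str} receives the user query.
custom_hyde_prompt = PromptTemplate(
    "Write a short, factual passage that answers the question below.\n"
    "{context_str}\n"
    "Passage:\n"
)

hyde = HyDEQueryTransform(hyde_prompt=custom_hyde_prompt, include_original=True)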
The code of the HyDEQueryTransform class is shown below. The purpose of the _run function is to generate hypothetical documents; three debug statements have been added to _run to monitor their content:
class HyDEQueryTransform(BaseQueryTransform):
    """Hypothetical Document Embeddings (HyDE) query transform.

    It uses an LLM to generate hypothetical answer(s) to a given query,
    and use the resulting documents as embedding strings.

    As described in [Precise Zero-Shot Dense Retrieval without Relevance Labels]
    (https://arxiv.org/abs/2212.10496)
    """

    def __init__(
        self,
        llm: Optional[LLMPredictorType] = None,
        hyde_prompt: Optional[BasePromptTemplate] = None,
        include_original: bool = True,
    ) -> None:
        """Initialize HyDEQueryTransform.

        Args:
            llm_predictor (Optional[LLM]): LLM for generating hypothetical documents
            hyde_prompt (Optional[BasePromptTemplate]): Custom prompt for HyDE
            include_original (bool): Whether to include original query string as one of the embedding strings
        """
        super().__init__()

        self._llm = llm or Settings.llm
        self._hyde_prompt = hyde_prompt or DEFAULT_HYDE_PROMPT
        self._include_original = include_original

    def _get_prompts(self) -> PromptDictType:
        """Get prompts."""
        return {"hyde_prompt": self._hyde_prompt}

    def _update_prompts(self, prompts: PromptDictType) -> None:
        """Update prompts."""
        if "hyde_prompt" in prompts:
            self._hyde_prompt = prompts["hyde_prompt"]

    def _run(self, query_bundle: QueryBundle, metadata: Dict) -> QueryBundle:
        """Run query transform."""
        # TODO: support generating multiple hypothetical docs
        query_str = query_bundle.query_str
        hypothetical_doc = self._llm.predict(self._hyde_prompt, context_str=query_str)
        embedding_strs = [hypothetical_doc]
        if self._include_original:
            embedding_strs.extend(query_bundle.embedding_strs)

        # The following three lines contain the added debug statements.
        print('-' * 100)
        print("Hypothetical doc:")
        print(embedding_strs)

        return QueryBundle(
            query_str=query_str,
            custom_embedding_strs=embedding_strs,
        )
Running the test code produces the following output:
(llamaindex_010) Florian:~ Florian$ python /Users/Florian/Documents/test_hyde.py
----------------------------------------------------------------------------------------------------
Base query:
Paul Graham resumed his old life in New York after attending RISD. He became rich and continued his old patterns, but with new opportunities such as being able to easily hail taxis and dine at charming restaurants. He also started experimenting with a new kind of still life painting technique.
----------------------------------------------------------------------------------------------------
Hypothetical doc:
["After attending the Rhode Island School of Design (RISD), Paul Graham went on to co-found Viaweb, an online store builder that was later acquired by Yahoo for $49 million. Following the success of Viaweb, Graham became an influential figure in the tech industry, co-founding the startup accelerator Y Combinator in 2005. Y Combinator has since become one of the most prestigious and successful startup accelerators in the world, helping launch companies like Dropbox, Airbnb, and Reddit. Graham is also known for his prolific writing on technology, startups, and entrepreneurship, with his essays being widely read and respected in the tech community. Overall, Paul Graham's career after RISD has been marked by innovation, success, and a significant impact on the startup ecosystem.", 'what did paul graham do after going to RISD']
----------------------------------------------------------------------------------------------------
After HyDEQueryTransform:
After going to RISD, Paul Graham resumed his old life in New York, but now he was rich. He continued his old patterns but with new opportunities, such as being able to easily hail taxis and dine at charming restaurants. He also started to focus more on his painting, experimenting with a new technique. Additionally, he began looking for an apartment to buy and contemplated the idea of building a web app for making web apps, which eventually led him to start a new company called Aspra.
embedding_strs is a list containing two elements. The first is the generated hypothetical document, and the second is the original query. They are combined into a single list to facilitate the vector computation.
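Conceptually, the retriever embeds each string in this list and aggregates the vectors (mean pooling by default in LlamaIndex), which mirrors HyDE's averaging formula. A numpy sketch, where embed() is a hypothetical encoder call:

import numpy as np

def aggregate_query_embedding(embedding_strs):
    """Average one embedding per string into a single query vector (sketch)."""
    vectors = [embed(s) for s in embedding_strs]  # embed() is a hypothetical encoder
    return np.mean(vectors, axis=0)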
In this example, HyDE significantly improves output quality by accurately imagining what Paul Graham did after RISD (see the hypothetical document), which improves both the embedding quality and the final output. Of course, HyDE also has failure cases; interested readers can visit this page [4] to test them.
HyDE appears to be unsupervised. No model is trained in HyDE: both the generative model and the contrastive encoder remain unchanged.
In summary, while HyDE introduces a new way of rewriting queries, it has some limitations. It does not rely on query-to-embedding similarity, but instead emphasizes the similarity of one document to another. If the language model is not well-versed in the topic, however, it may not always produce the best results and can amplify errors.
This idea comes from the paper "Query Rewriting for Retrieval-Augmented Large Language Models" [5]. The paper argues that, in real-world scenarios, the original query may not always be optimal for LLM retrieval. It therefore suggests that we should first use an LLM to rewrite the query, and only then retrieve content and generate an answer, rather than retrieving and generating directly from the original query. The idea is illustrated in Figure 4(b):
To illustrate how query rewriting affects context retrieval and prediction performance, consider the following example: the query "The NBA champion of 2020 is the Los Angeles Lakers! Tell me what is langchain framework?" can only be handled accurately through rewriting.
The implementation uses LangChain; install the required basic libraries as follows:
pip install langchain
pip install openai
pip install langchainhub
pip install duckduckgo-search
pip install langchain_openai
Environment configuration and library imports:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPEN_AI_KEY"

from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
Build a chain and execute a simple query:
def june_print(msg, res):
    print('-' * 100)
    print(msg)
    print(res)

base_template = """Answer the users question based only on the following context:

<context>
{context}
</context>

Question: {question}"""

base_prompt = ChatPromptTemplate.from_template(base_template)

model = ChatOpenAI(temperature=0)

search = DuckDuckGoSearchAPIWrapper()

def retriever(query):
    return search.run(query)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | base_prompt
    | model
    | StrOutputParser()
)

query = "The NBA champion of 2020 is the Los Angeles Lakers! Tell me what is langchain framework?"

june_print(
    'The result of query:',
    chain.invoke(query)
)

june_print(
    'The result of the searched contexts:',
    retriever(query)
)
The result is as follows:
(langchain) Florian:~ Florian$ python /Users/Florian/Documents/test_rewrite_retrieve_read.py
----------------------------------------------------------------------------------------------------
The result of query:
I'm sorry, but the context provided does not mention anything about the langchain framework.
----------------------------------------------------------------------------------------------------
The result of the searched contexts:
The Los Angeles Lakers are the 2020 NBA Champions!Watch their championship celebration here!Subscribe to the NBA: https://on.nba.com/2JX5gSN Full Game Highli... Aug 4, 2023. The 2020 Los Angeles Lakers were truly one of the most complete teams over the decade. LeBron James' fourth championship was one of the biggest moments of his career. Only two players from the 2020 team remain on the Lakers. In the storied history of the NBA, few teams have captured the imagination of fans and left a lasting ... James had 28 points, 14 rebounds and 10 assists, and the Lakers beat the Miami Heat 106-93 on Sunday night to win the NBA finals in six games. James was also named Most Valuable Player of the NBA ... Portland Trail Blazers star Damian Lillard recently spoke about the 2020 NBA "bubble" playoffs and had an interesting perspective on the criticism the eventual winners, the Los Angeles Lakers, faced. But perhaps none were more surprising than Adebayo's opinion on the 2020 NBA Finals. The Heat were defeated by LeBron James and the Los Angeles Lakers in six games. Miller asked, "Tell me about ...
The result shows that, in the searched contexts, there is very little information about "langchain".
Now build a rewriter to rewrite the search query.
rewrite_template = """Provide a better search query for \
web search engine to answer the given question, end \
the queries with ’**’. Question: \
{x} Answer:"""
rewrite_prompt = ChatPromptTemplate.from_template(rewrite_template)

def _parse(text):
    return text.strip("**")

rewriter = rewrite_prompt | ChatOpenAI(temperature=0) | StrOutputParser() | _parse

june_print(
    'Rewritten query:',
    rewriter.invoke({"x": query})
)
The result is as follows:
----------------------------------------------------------------------------------------------------
Rewritten query:
What is langchain framework and how does it work?
Construct rewrite_retrieve_read_chain and run it with the rewritten query.
rewrite_retrieve_read_chain = (
    {
        "context": {"x": RunnablePassthrough()} | rewriter | retriever,
        "question": RunnablePassthrough(),
    }
    | base_prompt
    | model
    | StrOutputParser()
)

june_print(
    'The result of the rewrite_retrieve_read_chain:',
    rewrite_retrieve_read_chain.invoke(query)
)
The result is as follows:
----------------------------------------------------------------------------------------------------
The result of the rewrite_retrieve_read_chain:
LangChain is a Python framework designed to help build AI applications powered by language models, particularly large language models (LLMs). It provides a generic interface to different foundation models, a framework for managing prompts, and a central interface to long-term memory, external data, other LLMs, and more. It simplifies the process of interacting with LLMs and can be used to build a wide range of applications, including chatbots that interact with users naturally.
So far, by rewriting the query, we have successfully obtained the correct answer.
Step-back prompting is a simple prompting technique that enables LLMs to abstract away from instances full of specific details and extract high-level concepts and first principles. The idea is to define a "step-back question": a more abstract question derived from the original one.
For example, when a query contains many details, it is difficult for the LLM to retrieve the relevant facts needed to solve the task. As in the first example in Figure 5, for the physics question "What happens to the pressure P of an ideal gas if the temperature is increased by a factor of 2 and the volume is increased by a factor of 8?", the LLM may deviate from the first principles of the ideal gas law when reasoning about the question directly.
Similarly, the question "Which school did Estella Leopold attend between August 1954 and November 1954?" is hard to answer directly because of the specific time-range constraint.
In both cases, asking a broader question helps the model answer the specific query effectively. Instead of directly asking "Which school did Estella Leopold attend during a specific period?", we can ask about "Estella Leopold's education history".
This broader topic encompasses the original question and can supply all the information needed to infer which school Estella Leopold attended at a particular time. Notably, these broader questions are often easier to answer than the original, more specific ones.
Reasoning derived from such abstraction helps prevent errors in the intermediate steps of the chain of thought shown in Figure 5 (left).
In summary, step-back prompting consists of two basic steps:
Abstraction: instead of responding to the query directly, we first prompt the LLM to ask a broad question about a higher-level concept or principle, and then retrieve relevant facts about that concept or principle.
Reasoning: grounded in those facts about the higher-level concept or principle, the LLM can derive the answer to the original question. We call this abstraction-grounded reasoning.
To illustrate how step-back prompting affects context retrieval and prediction performance, here is demo code implemented with LangChain.
Environment configuration and library imports:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPEN_AI_KEY"

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
from langchain_core.runnables import RunnableLambda
from langchain_openai import ChatOpenAI
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
Build a chain and execute the original query:
def june_print(msg, res):
    print('-' * 100)
    print(msg)
    print(res)

question = "was chatgpt around while trump was president?"

base_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.

{normal_context}

Original Question: {question}
Answer:"""

base_prompt = ChatPromptTemplate.from_template(base_prompt_template)

search = DuckDuckGoSearchAPIWrapper(max_results=4)

def retriever(query):
    return search.run(query)

base_chain = (
    {
        # Retrieve context using the normal question (only the first 3 results)
        "normal_context": RunnableLambda(lambda x: x["question"]) | retriever,
        # Pass on the question
        "question": lambda x: x["question"],
    }
    | base_prompt
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
)

june_print('The searched contexts of the original question:', retriever(question))
june_print('The result of base_chain:', base_chain.invoke({"question": question}))
The result is as follows:
(langchain) Florian:~ Florian$ python /Users/Florian/Documents/test_step_back.py
----------------------------------------------------------------------------------------------------
The searched contexts of the original question:
While impressive in many respects, ChatGPT also has some major flaws. ... [President's Name]," refused to write a poem about ex-President Trump, but wrote one about President Biden ... The company said GPT-4 recently passed a simulated law school bar exam with a score around the top 10% of test takers. By contrast, the prior version, GPT-3.5, scored around the bottom 10%. The ... These two moments show how Twitter's choices helped former President Trump. ... With ChatGPT, which launched to the public in late November, users can generate essays, stories and song lyrics ... Donald Trump is asked a question—say, whether he regrets his actions on Jan. 6—and he answers with something like this: " Let me tell you, there's nobody who loves this country more than me ...
----------------------------------------------------------------------------------------------------
The result of base_chain:
Yes, ChatGPT was around while Trump was president. ChatGPT is an AI language model developed by OpenAI and was launched to the public in late November. It has the capability to generate essays, stories, and song lyrics. While it may have been used to write a poem about President Biden, it also has the potential to be used in various other contexts, including generating responses from hypothetical scenarios involving former President Trump.
The result is clearly incorrect.
Now build step_back_question_chain and step_back_chain to obtain the correct result.
# Few Shot Examples
examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?",
    },
    {
        "input": "Jan Sindel’s was born in what country?",
        "output": "what is Jan Sindel’s personal history?",
    },
]
# We now transform these to example messages
example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

step_back_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:""",
        ),
        # Few shot examples
        few_shot_prompt,
        # New question
        ("user", "{question}"),
    ]
)
step_back_question_chain = step_back_prompt | ChatOpenAI(temperature=0) | StrOutputParser()

june_print('The step-back question:', step_back_question_chain.invoke({"question": question}))
june_print(
    'The searched contexts of the step-back question:',
    retriever(step_back_question_chain.invoke({"question": question})),
)

response_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.

{normal_context}
{step_back_context}

Original Question: {question}
Answer:"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

step_back_chain = (
    {
        # Retrieve context using the normal question
        "normal_context": RunnableLambda(lambda x: x["question"]) | retriever,
        # Retrieve context using the step-back question
        "step_back_context": step_back_question_chain | retriever,
        # Pass on the question
        "question": lambda x: x["question"],
    }
    | response_prompt
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
)

june_print('The result of step_back_chain:', step_back_chain.invoke({"question": question}))
The result is as follows:
----------------------------------------------------------------------------------------------------
The step-back question:
When did ChatGPT become available?
----------------------------------------------------------------------------------------------------
The searched contexts of the step-back question:
OpenAI released an early demo of ChatGPT on November 30, 2022, and the chatbot quickly went viral on social media as users shared examples of what it could do. Stories and samples included ... March 14, 2023 - Anthropic launched Claude, its ChatGPT alternative. March 20, 2023 - A major ChatGPT outage affects all users for several hours. March 21, 2023 - Google launched Bard, its ... The same basic models had been available on the API for almost a year before ChatGPT came out. In another sense, we made it more aligned with what humans want to do with it. A paid ChatGPT Plus subscription is available. (Image credit: OpenAI) ChatGPT is based on a language model from the GPT-3.5 series, which OpenAI says finished its training in early 2022.
----------------------------------------------------------------------------------------------------
The result of step_back_chain:
No, ChatGPT was not around while Trump was president. ChatGPT was released to the public in late November, after Trump's presidency had ended. The references to ChatGPT in the context provided are all dated after Trump's presidency, such as the release of an early demo on November 30, 2022, and the launch of ChatGPT Plus subscription. Therefore, it is safe to say that ChatGPT was not around during Trump's presidency.
We can see that by "stepping back" from the original query to a more abstract question, and by retrieving with both the abstract and the original query, the LLM improves its ability to follow the correct reasoning path to a solution.
As Edsger W. Dijkstra said, "The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise."
"Query2doc: Query Expansion with Large Language Models" [6] proposes Query2doc for query rewriting. It uses a few prompts to an LLM to generate pseudo-documents, then combines them with the original query to create a new query, as shown in Figure 6:
In dense retrieval, the new query, denoted q+, is a simple concatenation of the original query q and the pseudo-document d′, separated by [SEP]: q+ = concat(q, [SEP], d′).
Query2doc argues that HyDE implicitly assumes the ground-truth document and the pseudo-document express the same semantics in different words, which may not hold for some queries.
Another difference between Query2doc and HyDE is that Query2doc trains a supervised dense retriever, as described in the paper.
At present, no implementation of Query2doc can be found in LangChain or LlamaIndex.
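Still, the method is straightforward to sketch. Below is a minimal, hypothetical implementation of the dense-retrieval variant; llm and dense_search are placeholder functions, and only the [SEP] concatenation follows the paper.

def query2doc(query: str) -> str:
    """Expand a query with an LLM-generated pseudo-document (sketch)."""
    # The paper uses few-shot prompting; a zero-shot instruction is used here for brevity.
    pseudo_doc = llm(f"Write a passage that answers the following query: {query}")  # hypothetical LLM call
    # New dense-retrieval query: q+ = concat(q, [SEP], d')
    return f"{query} [SEP] {pseudo_doc}"

# Usage: feed the expanded query to any dense retriever (hypothetical).
results = dense_search(query2doc("what is langchain framework"))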
"Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy" [7] proposes ITER-RETGEN, which uses generated content to guide retrieval. It iteratively implements "retrieval-augmented generation" and "generation-augmented retrieval" in a retrieve-read-retrieve-read flow.
As shown in Figure 7, for a given question q and a retrieval corpus D = {d}, where d denotes a passage, ITER-RETGEN performs T consecutive retrieval-generation iterations.
In each iteration t, we first take the generation y_{t-1} from the previous iteration, concatenate it with q, and retrieve the top-k passages. We then prompt the LLM to produce an output y_t, incorporating both the retrieved passages (denoted D_{y_{t-1} ∥ q}) and q into the prompt. Each iteration can therefore be formulated as:

$$y_t = \mathrm{LLM}\left(\mathrm{prompt}\left(q,\ D_{y_{t-1}\,\|\,q}\right)\right)$$
The output of the last iteration, y_T, is produced as the final response.
As with Query2doc, no implementation has yet been found in LangChain or LlamaIndex.
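The loop itself is simple to sketch, however. In the following hypothetical implementation, retrieve and llm are placeholder functions; the structure follows the formulation above.

def iter_retgen(q: str, T: int = 3) -> str:
    """Iterative retrieval-generation synergy (sketch of ITER-RETGEN)."""
    y = ""  # y_0: nothing has been generated before the first iteration
    for _ in range(T):
        # Generation-augmented retrieval: concatenate y_{t-1} with q as the search query.
        passages = retrieve(f"{y} {q}".strip(), top_k=5)  # hypothetical retriever
        # Retrieval-augmented generation: produce y_t from q and the retrieved passages.
        y = llm(question=q, context=passages)  # hypothetical LLM call
    return y  # y_T serves as the final response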
This article has introduced various query rewriting techniques, including code demonstrations of several of them.
In practice, all of these query rewriting methods are worth trying; which method, or combination of methods, to use depends on the observed effect.
However, whatever rewriting method is adopted, calling the LLM involves a performance trade-off that must be weighed in real-world use.
In addition, there are methods such as query routing and decomposing a query into multiple sub-questions. They are not query rewriting, but they are pre-retrieval methods, and there may be an opportunity to introduce them in the future.
[1] https://arxiv.org/pdf/2212.10496.pdf
[2] https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt
[3] https://github.com/run-llama/llama_index/blob/v0.10.12/llama-index-core/llama_index/core/prompts/default_prompts.py#L336
[4] https://docs.llamaindex.ai/en/stable/examples/query_transformations/HyDEQueryTransformDemo.html#failure-case-1-hyde-may-mislead-when-query-can-be-mis-interpreted-without-context
[5] https://arxiv.org/pdf/2305.14283.pdf
[6] https://arxiv.org/pdf/2303.07678.pdf
[7] https://arxiv.org/pdf/2305.15294.pdf
This article is reproduced from the WeChat official account @ArronAI.