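Part 2: Application Challenges

2. Basic Workflow and Related Technologies

4) Prompt

As mentioned earlier, there are two ways to use a model for downstream tasks. One is to collect labeled samples and perform instruction fine-tuning for each task. The other, which is specific to large models, is to give the model instructions in conversational form and expect it to return the desired result. Compared with the former, the latter is more flexible and cheaper to use, and it has therefore become an important feature distinguishing today's large language models from traditional NLP models. In this chapter you will learn:

1) Concepts such as Prompt, In-Context Learning, and Prompt engineering

2) How to write a good Prompt, and related Prompt debugging tools

3) Some new programming paradigms that Prompts have given rise to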

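Agent Frameworks

Having covered the basic concepts and principles of Agents, this section introduces some well-known Agent frameworks and projects that can be used to build your own Agent application quickly.

Orchestration frameworks (Langchain / llamaindex / Semantic Kernel)

As introduced earlier, an Agent is essentially an application form in which an LLM, rather than a human, works out the execution flow for a target task. As a result, orchestration frameworks that grew out of RAG applications, such as Langchain, llamaindex, and Semantic Kernel, have added support for agent and multi-agent applications as application complexity has grown. The following uses langchain as an example to walk through the details of agent applications built on orchestration frameworks; for more on using the other frameworks, see the next chapter, Orchestration and Integration.
In langchain, building an agent application involves several core concepts:

1) Agent

Its core role is to interact with the large model, supplying the tools currently available to the model, the user's input, and the actions executed so far together with the outputs of the corresponding tools. Based on these inputs, the model produces either the next action or the final response to send to the user (AgentAction or AgentFinish). An action here specifies a tool and the input for that tool.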

try:
    # Call the LLM to see what to do.
    output = self.agent.plan(
        intermediate_steps,
        callbacks=run_manager.get_child() if run_manager else None,
        **inputs,
    )
except Exception as e:
    if not self.handle_parsing_errors:
        raise e
    # Parsing failed: turn the error into an observation the agent can see.
    text = str(e).split("`")[1]
    observation = "Invalid or incomplete response"
    output = AgentAction("_Exception", observation, text)
    tool_run_kwargs = self.agent.tool_run_logging_kwargs()
    observation = ExceptionTool().run(
        output.tool,
        verbose=self.verbose,
        color=None,
        callbacks=run_manager.get_child() if run_manager else None,
        **tool_run_kwargs,
    )
    return [(output, observation)]

Naturally, communicating with the large model requires a Prompt. langchain ships with several built-in agent types, such as Zero-shot ReAct, Structured input ReAct, OpenAI Functions, and Self-ask with search, and you can also define a custom agent.
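To make the Agent's contract concrete, here is a minimal sketch that does not use langchain. The `plan` function, the reply format (`Action: tool[input]` / `Final Answer: ...`), and `fake_llm` are all assumptions made for illustration; only the idea of returning either an action or a final answer comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class AgentAction:
    tool: str        # name of the tool to call
    tool_input: str  # input passed to that tool


@dataclass
class AgentFinish:
    output: str  # final answer returned to the user


Step = Tuple[AgentAction, str]  # (action taken, observation returned by the tool)


def plan(llm: Callable[[str], str], tools: List[str], user_input: str,
         intermediate_steps: List[Step]) -> Union[AgentAction, AgentFinish]:
    """Build a prompt from tools, input and history, then parse the model reply."""
    history = "\n".join(
        f"Action: {a.tool}[{a.tool_input}]\nObservation: {obs}"
        for a, obs in intermediate_steps
    )
    prompt = (
        f"Tools: {', '.join(tools)}\nQuestion: {user_input}\n{history}\n"
        "Reply with 'Action: <tool>[<input>]' or 'Final Answer: <text>'."
    )
    reply = llm(prompt).strip()
    if reply.startswith("Final Answer:"):
        return AgentFinish(reply[len("Final Answer:"):].strip())
    # Hypothetical reply format: "Action: tool[input]"
    name, _, rest = reply[len("Action:"):].strip().partition("[")
    return AgentAction(tool=name.strip(), tool_input=rest.rstrip("]"))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: search first, answer once an observation exists."""
    if "Observation:" in prompt:
        return "Final Answer: 42"
    return "Action: Search[meaning of life]"


first = plan(fake_llm, ["Search"], "What is the meaning of life?", [])
second = plan(fake_llm, ["Search"], "What is the meaning of life?", [(first, "42")])
print(first)
print(second)
```

The first call yields an action because no observation exists yet; the second call sees the recorded step and finishes.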

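2) Tools
As extensions of the LLM's capabilities, tools that the Agent can invoke must be provided (a tool can be thought of as a function call). Two conditions must be met: the right tools for the task are available, and each tool has an effective description, so that the model knows what the tool does and when to use it. langchain provides a large number of built-in tools covering search, database queries, calculation, and more; see https://python.langchain.com/docs/integrations/tools/.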

return Tool(
    name="Calculator",
    description="Useful for when you need to answer questions about math.",
    func=LLMMathChain.from_llm(llm=llm).run,
    coroutine=LLMMathChain.from_llm(llm=llm).arun,
)
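The role of the description can be shown with a small framework-free sketch. `SimpleTool`, `render_tool_descriptions`, and `call_tool` are hypothetical names, not langchain APIs; the point is only that the description is what the model sees, while the function is what actually runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SimpleTool:
    name: str
    description: str            # tells the model what the tool does and when to use it
    func: Callable[[str], str]  # the actual function behind the tool


def render_tool_descriptions(tools: List[SimpleTool]) -> str:
    """Format tool names and descriptions for inclusion in the agent prompt."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools)


def call_tool(tools: List[SimpleTool], name: str, tool_input: str) -> str:
    """Look up a tool by the name the model chose and run it."""
    registry: Dict[str, SimpleTool] = {t.name: t for t in tools}
    if name not in registry:
        return f"Unknown tool: {name}"  # fed back to the model as an observation
    return registry[name].func(tool_input)


def add_numbers(expression: str) -> str:
    """Toy calculator that only understands 'a+b+c' style input."""
    return str(sum(float(x) for x in expression.split("+")))


tools = [
    SimpleTool("Calculator", "Useful for when you need to answer questions about math.", add_numbers),
    SimpleTool("Echo", "Useful for repeating the input back.", lambda s: s),
]

print(render_tool_descriptions(tools))
print(call_tool(tools, "Calculator", "1+2"))
```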

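3) Toolkits

langchain groups a set of related tools into a toolkit for easier management, for example the create, read, update, and delete operations for a website. langchain likewise provides many prebuilt toolkits. Usage is much the same: the toolkit is flattened into Tools before being passed to the LLM.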

tools = []
unwanted_tools = ["Get Issue", "Delete File", "Create File", "Create Pull Request"]

# Keep only the toolkit's tools that the agent is allowed to use.
for tool in toolkit.get_tools():
    if tool.name not in unwanted_tools:
        tools.append(tool)
tools += [
    Tool(
        name="Search",
        func=DuckDuckGoSearchRun().run,
        description="useful for when you need to search the web",
    )
]
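The flattening step can also be sketched without langchain. `IssueTrackerToolkit`, `NamedTool`, and `flatten` are hypothetical names chosen for this illustration; the only idea taken from the text is that a toolkit is expanded into a flat list of tools before the model sees it.

```python
from typing import Iterable, List, NamedTuple


class NamedTool(NamedTuple):
    name: str
    description: str


class IssueTrackerToolkit:
    """Hypothetical toolkit grouping related tools for one service."""

    def get_tools(self) -> List[NamedTool]:
        return [
            NamedTool("Get Issue", "Fetch an issue by number."),
            NamedTool("Comment on Issue", "Add a comment to an issue."),
            NamedTool("Delete File", "Delete a file from the repository."),
        ]


def flatten(toolkits: Iterable, extra_tools: Iterable[NamedTool] = (),
            unwanted: Iterable[str] = ()) -> List[NamedTool]:
    """Merge toolkits and standalone tools into one list, dropping unwanted names."""
    blocked = set(unwanted)
    flat = [t for kit in toolkits for t in kit.get_tools() if t.name not in blocked]
    flat.extend(extra_tools)
    return flat


agent_tools = flatten(
    [IssueTrackerToolkit()],
    extra_tools=[NamedTool("Search", "useful for when you need to search the web")],
    unwanted={"Delete File"},
)
names = [t.name for t in agent_tools]
print(names)
```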

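4) AgentExecutor

The runtime environment in which the Agent executes. It ties the agent's workflow together and can be thought of as a loop that keeps running until the agent finishes, while handling errors that may occur during execution and applying fallback strategies. Below is pseudocode for langchain's default, most common general-purpose runtime: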

next_action = agent.get_action(...)
while next_action != AgentFinish:
    observation = run(next_action)
    next_action = agent.get_action(..., next_action, observation)
return next_action
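The pseudocode above can be turned into a small runnable sketch. This is not langchain's implementation: `run_agent`, the `decide` callback, and the iteration cap are assumptions, added to show how a loop of this kind typically guards against tool errors and runaway execution.

```python
from typing import Callable, Dict, List, Tuple, Union

Action = Tuple[str, str]       # (tool name, tool input)
Decision = Union[Action, str]  # an Action, or a final answer string


def run_agent(decide: Callable[[str, List[Tuple[Action, str]]], Decision],
              tools: Dict[str, Callable[[str], str]],
              question: str, max_iterations: int = 5) -> str:
    """Alternate between asking for a decision and running the chosen tool."""
    steps: List[Tuple[Action, str]] = []
    for _ in range(max_iterations):
        decision = decide(question, steps)
        if isinstance(decision, str):  # final answer: stop the loop
            return decision
        name, tool_input = decision
        try:
            observation = tools[name](tool_input)
        except Exception as exc:       # fallback: report the error back to the agent
            observation = f"Tool error: {exc!r}"
        steps.append((decision, observation))
    return "Agent stopped: iteration limit reached."


def multiply(expression: str) -> str:
    """Toy tool that multiplies two integers written as 'a*b'."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


def scripted_decide(question, steps):
    """Stand-in for the LLM: call the calculator once, then answer."""
    if not steps:
        return ("Calculator", "6*7")
    return f"The answer is {steps[-1][1]}"


answer = run_agent(scripted_decide, {"Calculator": multiply}, "What is 6*7?")
stuck = run_agent(lambda q, s: ("Missing", "x"), {}, "loop forever?", max_iterations=3)
print(answer)
print(stuck)
```

The second call shows the fallback path: an unknown tool produces an error observation each round, and the iteration cap ends the run.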

In addition, langchain provides Agent runtimes for specific patterns, such as Plan-and-execute Agent and AutoGPT. Plan-and-execute is used as follows:

from langchain.agents.tools import Tool
from langchain.chains import LLMMathChain
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain.utilities import DuckDuckGoSearchAPIWrapper
from langchain_experimental.plan_and_execute import (
    PlanAndExecute,
    load_agent_executor,
    load_chat_planner,
)

search = DuckDuckGoSearchAPIWrapper()
llm = OpenAI(temperature=0)
llm_math_chain = LLMMathChain.from_llm(llm=llm, verbose=True)
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events",
    ),
    Tool(
        name="Calculator",
        func=llm_math_chain.run,
        description="useful for when you need to answer questions about math",
    ),
]

# The planner breaks the task into steps; the executor runs each step with tools.
model = ChatOpenAI(temperature=0)
planner = load_chat_planner(model)
executor = load_agent_executor(model, tools, verbose=True)
agent = PlanAndExecute(planner=planner, executor=executor)

agent.run(
    "Who is the current prime minister of the UK? What is their current age raised to the 0.43 power?"
)

The key planner prompt is:

SYSTEM_PROMPT = (
    "Let's first understand the problem and devise a plan to solve the problem."
    " Please output the plan starting with the header 'Plan:' "
    "and then followed by a numbered list of steps. "
    "Please make the plan the minimum number of steps required "
    "to accurately complete the task. If the task is a question, "
    "the final step should almost always be 'Given the above steps taken, "
    "please respond to the users original question'. "
    "At the end of your plan, say '<END_OF_PLAN>'"
)
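Because the prompt fixes the output format ('Plan:' header, numbered steps, '<END_OF_PLAN>' marker), the planner's reply can be parsed mechanically. The following is a minimal sketch of such a parser; `parse_plan` and the sample reply are assumptions for illustration and are not taken from langchain's source.

```python
import re
from typing import List


def parse_plan(text: str) -> List[str]:
    """Extract the numbered steps between 'Plan:' and '<END_OF_PLAN>'."""
    body = text.split("Plan:", 1)[-1].split("<END_OF_PLAN>", 1)[0]
    steps = []
    for line in body.splitlines():
        match = re.match(r"\s*\d+[.)]\s*(.+)", line)
        if match:
            steps.append(match.group(1).strip())
    return steps


# Hypothetical model reply that follows the format requested by the prompt.
sample_reply = """Plan:
1. Search for the current prime minister of the UK.
2. Find their current age.
3. Raise the age to the 0.43 power.
4. Given the above steps taken, please respond to the users original question.
<END_OF_PLAN>"""

plan_steps = parse_plan(sample_reply)
for number, step in enumerate(plan_steps, start=1):
    print(number, step)
```

Each parsed step would then be handed to the executor in order.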

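Based on the analysis above, defining your own Agent application comes down to customizing these key components.

OpenAI Native Assistants

In the early days, OpenAI exposed only completion and chat interfaces, and external orchestration frameworks had to parse outputs and, depending on the application pattern, perform retrieval augmentation or call tools. As capabilities such as function calling and Code Interpreter matured, OpenAI moved these core functions that Agents need, previously implemented externally, inside its platform and exposed the Assistants API. This interface greatly simplifies LLM application development and makes RAG and Agent applications easier to build. Agent applications based on OpenAI's native GPTs are therefore likely to appear in large numbers in the near future.

The new Assistants API introduces the following domain concepts.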

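Domain object | Explanation
Assistant | A purpose-built assistant that uses OpenAI models and calls tools. It has several attributes, including tools and file_ids, which correspond to Tool objects and File objects respectively.
Thread | Represents a chat session. It is stateful, like each history entry on the ChatGPT web page, and a past conversation can be continued. It contains multiple Message objects.
Message | Represents a single chat message. Messages have different roles, including user and assistant. Messages are stored as a list in the Thread.
Run | Represents one execution of instructions. It requires the Assistant that executes the command and the Thread of the chat session. One Thread can have multiple Runs.
Run Step | Represents a step of execution; a Run contains multiple Run Steps. Inspecting the Run Steps shows how the assistant arrived at its final result.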

The development process is as follows:

1. Create an Assistant. Because this API is in beta, if you call it via curl you need to add the header OpenAI-Beta: assistants=v1.
assistant = client.beta.assistants.create(
    name="Data visualizer",
    description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
    model="gpt-4-1106-preview",
    tools=[{"type": "code_interpreter"}],
    file_ids=[file.id]
)

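The tools parameter accepts up to 128 entries, so at most 128 tools are supported. OpenAI hosts two tool types, code_interpreter and retrieval; third-party custom tools use function calling. The file_ids parameter specifies the files to attach: up to 20 files, each no larger than 512 MB, with a total of no more than 100 GB. Call the file creation endpoint to obtain a file.id, then pass it to the assistant creation endpoint. A file can also be associated with a specific Assistant through an AssistantFile. Note that deleting an AssistantFile does not delete the original File object; it only removes the association between the File and the Assistant. To delete the file itself, call the file deletion endpoint as well.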

curl https://api.openai.com/v1/assistants/asst_abc123/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v1" \
  -d '{
    "file_id": "file-abc123"
  }'
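For the tools parameter described above, a custom tool is declared as a function-calling entry alongside the hosted tool types. The sketch below builds such a list and checks it against the 128-tool limit quoted in the text. The `get_weather` function and the `validate_tools` helper are hypothetical; no request is sent.

```python
import json

MAX_TOOLS = 128  # limit quoted in the text above
HOSTED_TOOL_TYPES = {"code_interpreter", "retrieval"}

# Hypothetical custom tool; the name and parameters are illustrative only.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}


def validate_tools(tools):
    """Return the tool list unchanged if it respects the limits, else raise."""
    if len(tools) > MAX_TOOLS:
        raise ValueError(f"too many tools: {len(tools)} > {MAX_TOOLS}")
    for tool in tools:
        if tool["type"] not in HOSTED_TOOL_TYPES | {"function"}:
            raise ValueError(f"unsupported tool type: {tool['type']}")
    return tools


assistant_tools = validate_tools(
    [{"type": "code_interpreter"}, {"type": "retrieval"}, get_weather_tool]
)
request_body = json.dumps({"tools": assistant_tools})
print(request_body)
```

When the model decides to call `get_weather`, the Run pauses until the caller executes the function and submits its output.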

2. Create a Thread and its Messages. They can be created separately or together.

thread = client.beta.threads.create(
    messages=[
        {
            "role": "user",
            "content": "Create 3 data visualizations based on the trends in this file.",
            "file_ids": [file.id]
        }
    ]
)

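Although a thread has no limit on the number of Messages, the conversation as a whole is still constrained by the context window. Uploaded files can also be attached within the current thread.

3. Create a Run that combines the Assistant and the Thread. Once created, the Run automatically executes the instructions in the Thread.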

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
    model="gpt-4-1106-preview",
    instructions="additional instructions",
    tools=[{"type": "code_interpreter"}, {"type": "retrieval"}]
)

4. Poll the Run status and check whether it is completed. The full set of status transitions is shown in the figure and described in the table below:

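Status | Definition
queued | A Run enters queued when it is first created or when a requires_action step is completed. It should then move to in_progress almost immediately.
in_progress | While in_progress, the assistant uses the model and tools to carry out the steps. Progress can be viewed by inspecting the Run Steps.
completed | The Run finished successfully. You can view all Messages the Assistant added to the Thread and all steps of the Run. You can also continue the conversation by adding more user Messages to the Thread and creating another Run.
requires_action | When using Function calling, once the model has determined the names and arguments of the functions to call, the Run moves to requires_action. The caller must then run those functions and submit the outputs before the Run continues. If the outputs are not provided before the expires_at timestamp (roughly 10 minutes after creation), the Run moves to expired.
expired | The Run expires when the Function calling outputs are not submitted before expires_at. In addition, if the Run takes too long to execute and exceeds the time stated in expires_at, OpenAI's system expires it.
cancelling | You can attempt to cancel an in-progress Run using the Cancel Run endpoint. Once the attempt succeeds, the Run's status becomes cancelled. Cancellation is attempted but not guaranteed.
cancelled | The Run was successfully cancelled.
failed | The reason for the failure can be found in the Run's last_error object. The failure timestamp is recorded under failed_at.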

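This is an asynchronous process. Use the previously generated assistant_id, thread_id, and run_id to poll for progress and inspect the run steps. If a custom tool (Function call) is invoked, you must submit the tool's result; otherwise the Run stays in requires_action and eventually moves to expired.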
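The status table can be read as a small decision rule for the polling client. The sketch below encodes it without calling the API; the action names (`poll_again`, `submit_tool_outputs`, and so on) and both helper functions are assumptions for illustration, while the status strings are the ones listed in the table.

```python
from typing import List

WAITING = {"queued", "in_progress", "cancelling"}
FAILED_TERMINAL = {"failed", "cancelled", "expired"}


def next_step(status: str) -> str:
    """Map a Run status to what the polling client should do next."""
    if status == "requires_action":
        return "submit_tool_outputs"  # run the requested functions, then submit results
    if status in WAITING:
        return "poll_again"
    if status == "completed":
        return "read_messages"
    if status in FAILED_TERMINAL:
        return "stop_with_error"
    raise ValueError(f"unknown run status: {status}")


def drive(observed_statuses: List[str]) -> List[str]:
    """Walk a sequence of observed statuses until a terminal one is reached."""
    actions = []
    for status in observed_statuses:
        action = next_step(status)
        actions.append(action)
        if action in {"read_messages", "stop_with_error"}:
            break
    return actions


happy_path = drive(["queued", "in_progress", "requires_action",
                    "queued", "in_progress", "completed"])
timed_out = drive(["queued", "requires_action", "expired", "completed"])
print(happy_path)
print(timed_out)
```

In a real client, each `poll_again` would be followed by a short sleep and another retrieve call.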

Retrieve the Run Step information:

curl https://api.openai.com/v1/threads/thread_abc123/runs/run_abc123/steps/step_abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v1"

{
  "id": "step_abc123",
  "object": "thread.run.step",
  "created_at": 1699063291,
  "run_id": "run_abc123",
  "assistant_id": "asst_abc123",
  "thread_id": "thread_abc123",
  "type": "message_creation",
  "status": "completed",
  "cancelled_at": null,
  "completed_at": 1699063291,
  "expired_at": null,
  "failed_at": null,
  "last_error": null,
  "step_details": {
    "type": "message_creation",
    "message_creation": {
      "message_id": "msg_abc123"
    }
  }
}

5. If the status is completed, fetch the final result.

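import time

run = client.beta.threads.runs.retrieve(
    thread_id=thread.id,
    run_id=run.id
)

# Keep polling only while the run is still working; stop on any other status
# so that failed, expired, or cancelled runs do not loop forever.
while run.status in ("queued", "in_progress", "cancelling"):
    print(run.status)
    time.sleep(60)  # wait 60 seconds between polls
    run = client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id
    )

if run.status == "completed":
    messages = client.beta.threads.messages.list(
        thread_id=thread.id
    )
    print(messages.data)
else:
    print(run.status, run.last_error)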

The whole process above can be debugged and developed in the OpenAI Playground.

Langchain added support for this interface right away and provided a tutorial for the new Agent implementation.

!pip install e2b duckduckgo-search
from langchain.agents import AgentExecutor
from langchain.agents.openai_assistant import OpenAIAssistantRunnable
from langchain.tools import DuckDuckGoSearchRun, E2BDataAnalysisTool

tools = [E2BDataAnalysisTool(api_key="..."), DuckDuckGoSearchRun()]
agent = OpenAIAssistantRunnable.create_assistant(
    name="langchain assistant e2b tool",
    instructions="You are a personal math tutor. Write and run code to answer math questions. You can also search the internet.",
    tools=tools,
    model="gpt-4-1106-preview",
    as_agent=True,
)

# The executor runs the tools locally and feeds their outputs back to the assistant.
agent_executor = AgentExecutor(agent=agent, tools=tools)
agent_executor.invoke({"content": "What's the weather in SF today divided by 2.7"})

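The interface is currently in beta, and OpenAI plans to add the following features in the future.

Beyond the API, OpenAI also offers ordinary users a no-code, UI-based way to build Agents: GPTs.

Watch this video to learn how to create one (a resume-wizard GPT by @agishaun):

Compared with building on the current beta API, the GPTs interface is simpler and more intuitive. It offers more built-in plugin capabilities, such as web browsing and image generation, as well as external Action calls, and GPTs can easily be shared with other users or even monetized directly. Because the barrier to creation is so low, thousands of GPTs were listed on the GPT store within just a few days of launch. Directory sites such as the following can be used to explore and learn from the newest and most popular GPTs: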

https://gptsdex.com/

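Dedicated Agent Frameworks (Single-Agent / Multi-Agent)

Beyond the basic agent-building tools above, the industry also has projects designed specifically for building agents. They generally fall into two categories: single-Agent projects and multi-Agent collaboration projects.
Single Agent

babyAGI, developed by Yohei Nakajima, was one of the first concept projects to use large-model capabilities to build a task-driven autonomous Agent after ChatGPT appeared. It attracted enormous attention at the time and strongly influenced later Agent projects. It uses the OpenAI GPT-4 API and a vector database (such as Chroma or Weaviate) to create, prioritize, and execute tasks. Its core implementation lives in a single script, babyagi.py, where it becomes clear that the essence of the project is its prompts. The prompts of its three key Agents are: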

1. task_creation_agent: creates tasks based on the objective

prompt = f""" You are a task creation AI that uses the result of an execution agent to create new tasks with the following objective: {objective}, The last completed task has the result: {result}. This result was based on this task description: {task_description}. These are incomplete tasks: {', '.join(task_list)}. Based on the result, create new tasks to be completed by the AI system that do not overlap with incomplete tasks. Return the tasks as an array."""

2. execution_agent: produces the task result based on the run history and the current task

prompt = f""" You are an AI who performs one task based on the following objective: {objective}\n. Take into account these previously completed tasks: {context}\n. Your task: {task}\nResponse:"""

3. prioritization_agent: prioritizes tasks based on the objective

prompt = f""" You are a task prioritization AI tasked with cleaning the formatting of and reprioritizing the following tasks: {task_names}. Consider the ultimate objective of your team:{OBJECTIVE}. Do not remove any tasks. Return the result as a numbered list, like: #. First task #. Second task Start the task list with number {next_task_id}."""

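Its workflow is as follows:

1. Set the objective, have the task creation Agent generate a task list, and take the first task from the list.
2. Send the task to the execution Agent, which builds a prompt from the context and calls the OpenAI API to complete the task.
3. Store the task and its result in the memory module, for example a vector database such as Chroma or Weaviate.
4. Based on the objective and context, the task creation Agent creates new tasks.
5. The task prioritization Agent reorders the task list according to the objective and the results of previous tasks.
6. Repeat steps 2-5 until the task list is empty, at which point the objective is considered complete.

The project went through many iterations; for its original implementation, see https://yoheinakajima.com/task-driven-autonomous-agent-utilizing-gpt-4-pinecone-and-langchain-for-diverse-applications/. Unfortunately, as a concept product, its core ideas have been absorbed and superseded by frameworks such as langchain, and the project itself is no longer updated.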
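The six-step loop described above can be sketched in a few lines. This is not the code from babyagi.py: the three agents are replaced by plain callbacks, the vector database by a Python list, and the `max_steps` guard is an added assumption so the sketch always terminates.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_task_loop(objective: str, first_task: str,
                  execute: Callable, create_tasks: Callable, prioritize: Callable,
                  max_steps: int = 10) -> List[Tuple[str, str]]:
    tasks: Deque[str] = deque([first_task])
    memory: List[Tuple[str, str]] = []  # stands in for the vector database
    while tasks and len(memory) < max_steps:
        task = tasks.popleft()                          # take the first task
        result = execute(objective, task, memory)       # execution agent
        memory.append((task, result))                   # store task and result
        new_tasks = create_tasks(objective, task, result, list(tasks))  # task creation agent
        tasks.extend([t for t in new_tasks if t not in tasks])
        tasks = deque(prioritize(objective, list(tasks)))  # prioritization agent
    return memory


# Stub agents standing in for the three LLM prompts.
def stub_execute(objective, task, memory):
    return f"done: {task}"


def stub_create_tasks(objective, task, result, pending):
    return ["Book hotel", "Book flight"] if task == "Plan a trip" else []


def stub_prioritize(objective, pending):
    return sorted(pending)


history = run_task_loop("Visit Paris", "Plan a trip",
                        stub_execute, stub_create_tasks, stub_prioritize)
completed = [task for task, _ in history]
print(completed)
```

With real prompts in place of the stubs, the same loop keeps generating and reordering work until the task list is empty.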

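AutoGPT is an open-source autonomous AI agent developed by Toran Bruce Richards and released in March 2023. It is as well known as babyAGI and more full-featured, supporting website scraping, information search, image generation, and creating and running code.

Compared with babyAGI, AutoGPT goes further in Prompt engineering, making full use of the large model and letting the model, rather than hand-written code, control the flow. Its core therefore lies in how the prompt is constructed. Execution is divided into six steps:

1. Create the plan, including the agent's name, role, and goals. The corresponding prompt fragment is: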

You are

AI Name <-(Variable)
AI Role <-(Variable)

Your decisions must always be made independently
without seeking user assistance.

Play to your strengths as a LLM and
pursue simple strategies with no legal complications.

Goals <-(Variable)

2. Provide the list of available tools; the example below includes search, website browsing, and image generation.

COMMANDS:

1. Google Search: "google", args: "input": "<search>"
5. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
20. Generate Image: "generate_image", args: "prompt": "<prompt>"

3. Provide the available commands. Like the tools, these are declared under COMMANDS.

8. List GPT Agents: "list_agents", args: ""
9. Delete GPT Agent: "delete_agent", args: "key": "<key>"
10. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"
11. Read file: "read_file", args: "file": "<file>"
12. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"
13. Delete file: "delete_file", args: "file": "<file>"

4. Enter the plan-execution loop. The framework invokes the relevant tool based on the model's response. Below is the basic logic the framework uses to call a tool.

if command_name == "google":
    # Check if the Google API key is set and use the official search method
    # If the API key is not set or has only whitespaces, use the unofficial search method
    if cfg.google_api_key and (cfg.google_api_key.strip() if cfg.google_api_key else None):
        return google_official_search(arguments["input"])
    else:
        return google_search(arguments["input"])

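5. Prepare the context, which includes the constraints on the model during execution, the resources it can use, the evaluation criteria, the execution history, and the response format. Part of the Prompt is shown below: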

CONSTRAINTS:

1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"

RESOURCES:

1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

PERFORMANCE EVALUATION:

1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

You should only respond in JSON format as described below

RESPONSE FORMAT:
{
    "thoughts":
    {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted\n- list that conveys\n- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    },
    "command": {
        "name": "command name",
        "args":{
            "arg name": "value"
        }
    }
}

Ensure the response can be parsed by Python json.loads
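Since the prompt requires a reply that `json.loads` can parse, the framework side reduces to decoding the reply and looking up the named command. The sketch below illustrates that idea; `dispatch` and the toy `write_to_file` handler are assumptions and do not reproduce AutoGPT's own code.

```python
import json
from typing import Callable, Dict


def dispatch(reply: str, commands: Dict[str, Callable[..., str]]) -> str:
    """Decode the model reply and run the command it names."""
    try:
        data = json.loads(reply)
        name = data["command"]["name"]
        args = data["command"].get("args", {})
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Returned as text so it can be fed back to the model on the next turn.
        return f"Error: invalid response ({exc.__class__.__name__})"
    if name not in commands:
        return f"Error: unknown command '{name}'"
    return commands[name](**args)


def write_to_file(file: str, text: str) -> str:
    """Toy handler that only reports what it would write."""
    return f"wrote {len(text)} chars to {file}"


commands = {"write_to_file": write_to_file}

model_reply = json.dumps({
    "thoughts": {"text": "save notes", "reasoning": "memory is short",
                 "plan": "- write file", "criticism": "none", "speak": "Saving notes."},
    "command": {"name": "write_to_file", "args": {"file": "notes.txt", "text": "hello"}},
})

result = dispatch(model_reply, commands)
bad_result = dispatch("not json at all", commands)
print(result)
print(bad_result)
```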

Adding the conversation history to the Prompt:

memory_to_add = f"Assistant Reply: {assistant_reply} " \
                f"\nResult: {result} " \
                f"\nHuman Feedback: {user_input} "

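6. Combine the plan, tools, context, and other content into the final Prompt and submit it to the model, then wait for the model to return the updated plan for the next step.

Finally, repeat steps 4-6, continually updating the plan until it is complete.
AutoGPT also provides its own front-end UI, which runs on Web, Android, iOS, Windows, and Mac. As one of the leading Agent development frameworks today, AutoGPT can serve as a base that developers adapt and extend for their own vertical scenarios.

Compared with BabyAGI and AutoGPT, the most distinctive feature of AgentGPT is how easy it is to try: there is an official trial site (https://agentgpt.reworkd.ai/zh/) where you can set your own OpenAI key and start using it right away. Its technical implementation differs little from other frameworks; the core is still the Prompts and tool use. Below is one of its prompts: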

start_goal_prompt = PromptTemplate(
    template="""You are a task creation AI called AgentGPT.
You answer in the "{language}" language. You have the following objective "{goal}".
Return a list of search queries that would be required to answer the entirety of the objective.
Limit the list to a maximum of 5 queries. Ensure the queries are as succinct as possible.
For simple questions use a single query.

Return the response as a JSON array of strings. Examples:

query: "Who is considered the best NBA player in the current season?", answer: ["current NBA MVP candidates"]
query: "How does the Olympicpayroll brand currently stand in the market, and what are its prospects and strategies for expansion in NJ, NY, and PA?", answer: ["Olympicpayroll brand comprehensive analysis 2023", "customer reviews of Olympicpayroll.com", "Olympicpayroll market position analysis", "payroll industry trends forecast 2023-2025", "payroll services expansion strategies in NJ, NY, PA"]
query: "How can I create a function to add weight to edges in a digraph using {language}?", answer: ["algorithm to add weight to digraph edge in {language}"]
query: "What is the current weather in New York?", answer: ["current weather in New York"]
query: "5 + 5?", answer: ["Sum of 5 and 5"]
query: "What is a good homemade recipe for KFC-style chicken?", answer: ["KFC style chicken recipe at home"]
query: "What are the nutritional values of almond milk and soy milk?", answer: ["nutritional information of almond milk", "nutritional information of soy milk"]""",
    input_variables=["goal", "language"],
)

More of its prompts can be found at: https://github.com/reworkd/AgentGPT/blob/c2084e4faa46ecd91621be17574ef9532668cbfc/platform/reworkd_platform/web/api/agent/prompts.py
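The start_goal_prompt shown earlier asks for a JSON array of at most five query strings, so the reply can be validated before any search is run. This is a minimal sketch under that assumption; `parse_queries` is a hypothetical helper and is not part of AgentGPT.

```python
import json
from typing import List

MAX_QUERIES = 5  # the prompt limits the list to a maximum of 5 queries


def parse_queries(reply: str) -> List[str]:
    """Decode the model reply and enforce the format the prompt asks for."""
    queries = json.loads(reply)
    if not isinstance(queries, list) or not all(isinstance(q, str) for q in queries):
        raise ValueError("expected a JSON array of strings")
    cleaned = [q.strip() for q in queries if q.strip()]
    return cleaned[:MAX_QUERIES]


single = parse_queries('["current weather in New York"]')
trimmed = parse_queries(json.dumps([f"query {i}" for i in range(8)]))
print(single)
print(trimmed)
```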

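Multi-Agent

https://arxiv.org/pdf/2304.03442.pdf

For a complex system, the need for Agents to collaborate on more complex, system-level work naturally leads to the concept of multi-Agent systems. Compared with single Agents, multi-Agent systems are at an earlier stage, are mostly concept demonstrations, and still have a long way to go. The project that drew the most attention from practitioners is the Stanford town project (Generative Agents: Interactive Simulacra of Human Behavior), which has inspired a large number of multi-Agent projects.

In this virtual town, each character is a separate agent that goes about its day according to its plan and assigned role. When characters meet and talk, the content of their conversations is stored in a memory database and is recalled and referenced in the next day's plans. Many interesting sociological phenomena emerge from this process.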
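The store-then-recall cycle described above can be illustrated with a toy memory. This is a simplified sketch, not the paper's method: retrieval here ranks memories by plain word overlap with the query (newest first on ties), and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memory:
    day: int
    text: str


@dataclass
class VillagerAgent:
    name: str
    memories: List[Memory] = field(default_factory=list)

    def remember(self, day: int, text: str) -> None:
        """Store a conversation or observation in the memory database."""
        self.memories.append(Memory(day, text))

    def recall(self, query: str, top_k: int = 2) -> List[str]:
        """Return the memories sharing the most words with the query."""
        words = set(query.lower().split())
        scored = [(len(words & set(m.text.lower().split())), m.day, m.text)
                  for m in self.memories]
        scored = [s for s in scored if s[0] > 0]
        scored.sort(key=lambda s: (s[0], s[1]), reverse=True)
        return [text for _, _, text in scored[:top_k]]

    def plan_day(self, day: int, topic: str) -> str:
        """Build the next day's plan, referencing recalled memories."""
        recalled = "; ".join(self.recall(topic)) or "nothing relevant"
        return f"Day {day} plan for {self.name}: follow up on {topic} (recalled: {recalled})"


agent = VillagerAgent("Klaus")
agent.remember(1, "Isabella invited me to a party at the cafe")
agent.remember(1, "Bought bread from the market")
recalled = agent.recall("cafe party")
print(recalled)
print(agent.plan_day(2, "party"))
```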

Below we introduce several well-known multi-Agent projects.

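MetaGPT is an open-source multi-agent framework that simulates a software company and has Agents collaborate on development work. From a single-line software requirement it can generate APIs, user stories, data structures, competitive analyses, and more.

MetaGPT follows the standard process of human software development. Agents can act as product managers, software engineers, architects, and other roles, working together to complete the development process.

Try it online: https://huggingface.co/spaces/deepwisdom/MetaGPT

Similar to this project is ChatDev, a multi-agent project initiated by Tsinghua University and other Chinese institutions. It simulates a software company run by collaborating agents: after a human "user" specifies a concrete task requirement, Agents in different roles collaborate interactively to produce a complete piece of software (including source code, an environment dependency specification, a user manual, and so on).

AutoGen is a general-purpose multi-agent framework developed by Microsoft. It provides customizable, conversable agents that integrate LLMs, tools, and humans. By automating the chat among multiple capable agents, it makes it easy to have them perform tasks together, either autonomously or with human feedback, including tasks that require using tools through code. Unlike MetaGPT, AutoGen is general-purpose and can be used to build many different kinds of multi-Agent applications.

The framework has the following characteristics:

The AutoGen framework lets us orchestrate multi-agent workflows. Compared with traditional process-driven task flows, this message-driven task flow is more flexible, which helps with handling complex processes and decoupling domain logic. It provides some general-purpose agent classes, as well as a "Group Chat" context to facilitate collaboration across Agents. Below are some commonly used Agent classes.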

user_proxy.register_function(
    function_map={
        "search_and_index_wikipedia": search_and_index_wikipedia,
        "query_wiki_index": query_wiki_index,
    }
)

Assistant Agent: backed by the large model, it can complete specific tasks and play different AI roles. For example, to set up an analyst role:

analyst = autogen.AssistantAgent(
    name="analyst",
    system_message='''
As the Information Gatherer, you must start by using the search_and_index_wikipedia function to gather relevant data about the user's query. Follow these steps: 1. Upon receiving a query, immediately invoke the search_and_index_wikipedia function to find and index Wikipedia pages related to the query. Do not proceed without completing this step. 2. After successfully indexing, utilize the query_wiki_index to extract detailed information from the indexed content. 3. Present the indexed information and detailed findings to the Reporter, ensuring they have a comprehensive dataset to draft a response. 4. Conclude your part with "INFORMATION GATHERING COMPLETE" to signal that you have finished collecting data and it is now ready for the Reporter to use in formulating the answer. Remember, you are responsible for information collection and indexing only. The Reporter will rely on the accuracy and completeness of your findings to generate the final answer. ''',
    llm_config=llm_config,
)

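Group Chat Manager: provides the initial query to the group chat and manages the interactions among all agents. Its coordination and task-distribution abilities are backed by the large model. For example: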

# Define the group chat manager.
manager = autogen.GroupChatManager(
    groupchat=groupchat,
    llm_config=llm_config,
    system_message='''You should start the workflow by consulting the analyst,
then the reporter and finally the moderator.
If the analyst does not use both the search_and_index_wikipedia and the query_wiki_index, you must request that it does.''',
)
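The message-driven flow that a group chat manager coordinates can be sketched without AutoGen. In this simplified version the speaking order is fixed (analyst, reporter, moderator) rather than chosen by a model, and every name below (`run_group_chat`, the three stub agents, the TERMINATE stop word) is an assumption made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (speaker, content)


def run_group_chat(agents: Dict[str, Callable[[List[Message]], str]],
                   order: List[str], initial_query: str,
                   max_round: int = 6, stop_word: str = "TERMINATE") -> List[Message]:
    """Pass the shared message history to each agent in turn until one stops the chat."""
    messages: List[Message] = [("user", initial_query)]
    for turn in range(max_round):
        speaker = order[turn % len(order)]
        reply = agents[speaker](messages)  # each agent sees the full history
        messages.append((speaker, reply))
        if stop_word in reply:
            break
    return messages


# Stub agents standing in for LLM-backed AssistantAgents.
def analyst(messages):
    return "Indexed 3 pages. INFORMATION GATHERING COMPLETE"


def reporter(messages):
    return f"Draft answer based on: {messages[-1][1]}"


def moderator(messages):
    return "Approved. TERMINATE"


chat = run_group_chat(
    {"analyst": analyst, "reporter": reporter, "moderator": moderator},
    order=["analyst", "reporter", "moderator"],
    initial_query="Who designed the Eiffel Tower?",
)
speakers = [speaker for speaker, _ in chat]
print(speakers)
```

Because agents only exchange messages, adding a new role means adding one more callable rather than rewriting the control flow.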

Below is a small example that uses AutoGen to implement a software development scenario similar to MetaGPT or ChatDev:

# %pip install pyautogen~=0.2.0b4

import autogen

config_list_gpt4 = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4", "gpt-4-0314", "gpt4", "gpt-4-32k", "gpt-4-32k-0314", "gpt-4-32k-v0314"],
    },
)

llm_config = {"config_list": config_list_gpt4, "cache_seed": 42}
user_proxy = autogen.UserProxyAgent(
    name="User_proxy",
    system_message="A human admin.",
    code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
    human_input_mode="TERMINATE"
)
coder = autogen.AssistantAgent(
    name="Coder",
    llm_config=llm_config,
)
pm = autogen.AssistantAgent(
    name="Product_manager",
    system_message="Creative in software product ideas.",
    llm_config=llm_config,
)
groupchat = autogen.GroupChat(agents=[user_proxy, coder, pm], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message="Find a latest paper about gpt-4 on arxiv and find its potential applications in software.")
# type exit to terminate the chat

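More examples: https://microsoft.github.io/autogen/docs/Examples/AgentChat

Summary

Prompts are the language humans use to interact with large models. Having covered concepts, theory, and applications in this chapter, the reader should now have a solid understanding of their value. Agents are the pinnacle of prompt applications. In a sense, they push declarative programming a step further: traditionally this was only possible in certain niche domains using a dedicated DSL, such as SQL, whereas now we only need to describe the goal we want to achieve in natural language and the Agent system will carry it out for us. On the other hand, the transition from RAG/Copilot applications to Agent applications also reflects a shift in mode from "human-led, AI-assisted" to "AI-led, human-assisted".

In practice, for an LLM application, whether RAG or Agent, having a few component elements such as the large model, Prompts, and a vector database is not enough. The ultimate goal is to integrate them effectively into an application system, which is the key reason so many LLM application frameworks started with orchestration and integration.

This article is reposted from the WeChat official account @AI工程化.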
