
Mastering API Modeling: Basic Concepts and Practice
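This blog explores how to integrate Alibaba Cloud's AI services with Elasticsearch. You will learn how to set up and use Alibaba's chat completion, rerank, sparse vector, and dense vector services in Elasticsearch. Bringing these model types into inference tasks can improve search relevance for many use cases, including RAG. The Alibaba Cloud team contributed code to the Elasticsearch open inference API to support these task types, and the examples in this post show how to configure and use the services in an Elasticsearch environment. Note that Alibaba Cloud uses the term service_id rather than model_id.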
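This walkthrough assumes you already have an Alibaba Cloud account with access to the Alibaba Cloud AI Search Open Platform. Next, you need to create a workspace and an API key, which are used to create the inference models.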
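In Elasticsearch, create an endpoint with the "alibabacloud-ai-search" service and supply the service settings: the workspace, the host, the service ID, and the API key used to access the Alibaba Cloud AI Search platform. The following example creates a text embedding endpoint with "ops-text-embedding-001" as the service ID.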
PUT _inference/text_embedding/ali_ai_embeddings
{
"service": "alibabacloud-ai-search",
"service_settings": {
"api_key": "<api_key>",
"service_id": "ops-text-embedding-001",
"host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace": "default"
}
}
You will receive a response from Elasticsearch confirming that the endpoint was created:
{
"inference_id": "ali_ai_embeddings",
"task_type": "text_embedding",
"service": "alibabacloud-ai-search",
"service_settings": {
"similarity": "dot_product",
"dimensions": 1536,
"service_id": "ops-text-embedding-001",
"host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace": "default",
"rate_limit": {
"requests_per_minute": 10000
}
},
"task_settings": {}
}
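The create-endpoint request can also be assembled from a script. The following is a minimal Python sketch, not an official client: the helper name `build_endpoint_request` is hypothetical, and it only builds the method, path, and JSON body in the same shape as the console request, without contacting Elasticsearch or Alibaba Cloud. How you send it (cluster URL, authentication) depends on your deployment and is left out.

```python
import json


def build_endpoint_request(task_type, inference_id, service_id, host, workspace, api_key):
    """Return (method, path, body) for creating an inference endpoint.

    Hypothetical helper: it mirrors the console request format and
    performs no network calls.
    """
    path = f"_inference/{task_type}/{inference_id}"
    body = {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": api_key,
            "service_id": service_id,
            "host": host,
            "workspace": workspace,
        },
    }
    return "PUT", path, json.dumps(body, indent=2)


# Placeholder values copied from the example; replace them with your own.
method, path, body = build_endpoint_request(
    task_type="text_embedding",
    inference_id="ali_ai_embeddings",
    service_id="ops-text-embedding-001",
    host="xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
    workspace="default",
    api_key="<api_key>",
)
print(method, path)
print(body)
```

The same helper covers the other task types in this post by changing `task_type`, `inference_id`, and `service_id`.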
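Note that creating the model requires no additional settings. Elasticsearch automatically connects to the Alibaba Cloud AI Search Open Platform to test your credentials and service ID, and fills in the number of dimensions and the similarity measure for you. Next, test the endpoint to make sure everything is set up correctly. To do so, call the perform inference API: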
POST _inference/text_embedding/ali_ai_embeddings
{
"input": "What is Elastic?"
}
The API call returns the embedding generated for the input text, as shown here:
{
"text_embedding": [
{
"embedding": [
0.048400473,
0.051464397,
… (additional values) …
0.033325635,
-0.008986305
]
}
]
}
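Since the endpoint reports `dot_product` similarity, one quick sanity check on a response is to confirm the vector length and compare two vectors by dot product. The sketch below assumes the response has already been parsed into a Python dict with the shape shown above. The helper names are hypothetical, and the sample vectors are shortened, made-up values (a real response from this endpoint has 1536 values).

```python
def first_embedding(response):
    """Extract the first vector from a parsed text_embedding response."""
    return response["text_embedding"][0]["embedding"]


def dot_product(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


# Illustrative 4-dimensional responses, not real model output.
resp_a = {"text_embedding": [{"embedding": [0.5, 0.5, 0.5, 0.5]}]}
resp_b = {"text_embedding": [{"embedding": [0.5, -0.5, 0.5, 0.5]}]}

vec_a = first_embedding(resp_a)
vec_b = first_embedding(resp_b)
score = dot_product(vec_a, vec_b)
print(len(vec_a), score)
```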
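You are now ready to start exploring. After trying these examples, take a look at some exciting innovations in Elasticsearch for semantic search use cases:
The semantic_text field simplifies the storage and chunking of vectors: just pick your model and Elastic does the rest!
Retrievers let you set up multi-stage retrieval pipelines.
But first, let's dive into the examples!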
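Alibaba Cloud offers several chat completion models; their service IDs are listed in its API documentation.
Set up the inference service for chat completion: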
PUT _inference/completion/ali-chat
{
"service": "alibabacloud-ai-search",
"service_settings": {
"host" : "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"api_key": "xxxxxxxxxxxxxxxxxx",
"service_id": "ops-qwen-turbo",
"workspace" : "default"
}
}
Response:
{
"inference_id": "ali-chat",
"task_type": "completion",
"service": "alibabacloud-ai-search",
"service_settings": {
"service_id": "ops-qwen-turbo",
"host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace": "default",
"rate_limit": {
"requests_per_minute": 1000
}
},
"task_settings": {}
}
Using the configured endpoint, send a POST request to generate a completion:
POST _inference/completion/ali-chat
{
"input":["Where is the capital of Henan?"]
}
Response:
{
"completion": [
{
"result": "The capital of Henan is Zhengzhou."
}
]
}
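Uniquely among the Elastic inference API integrations, the Alibaba Cloud integration lets you include chat history in the input. This example includes the previous question and its response, then adds: "What fun things are there?"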
POST _inference/completion/ali-chat
{
"input":["Where is the capital of Henan?", "The capital of Henan is Zhengzhou.", "What fun things are there?" ]
}
The response clearly takes the history into account:
{
"completion": [
{
"result": "I'm sorry, I do not have enough information to provide a specific list of fun things to do in Zhengzhou, Henan. I can only tell you that Zhengzhou is the capital of Henan province. To find out about fun activities, attractions, or events in Zhengzhou, I would suggest researching local tourism websites, asking locals, or checking out travel guides for the area."
}
]
}
In a future update, Elastic plans to let users include chat history explicitly, to improve ease of use.
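Until then, the history has to be flattened into the `input` array by hand. A minimal sketch of that step follows; the helper name `build_chat_input` is hypothetical, and the alternating order (user message, model reply, ..., new user message) is inferred from the example request above rather than from a documented contract.

```python
import json


def build_chat_input(history, new_message):
    """Flatten (user, assistant) turns plus a new user message into a request body.

    Assumption: the service reads the list as alternating user and model
    turns, ending with the new user message, as in the example request.
    """
    messages = []
    for user_text, assistant_text in history:
        messages.append(user_text)
        messages.append(assistant_text)
    messages.append(new_message)
    return {"input": messages}


history = [
    ("Where is the capital of Henan?", "The capital of Henan is Zhengzhou."),
]
chat_body = build_chat_input(history, "What fun things are there?")
print(json.dumps(chat_body, indent=2))
```

The resulting body can be sent as the payload of POST _inference/completion/ali-chat.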
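On to the next task type: rerank. Reranking uses Alibaba Cloud's powerful models to reorder search results and improve relevance. If you want to learn more about this concept, see this blog on Elastic Search Labs.
Configure the rerank inference service: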
PUT _inference/rerank/ali-rank
{
"service": "alibabacloud-ai-search",
"service_settings": {
"api_key": "xxxxxxxxxxxxxxxxxx",
"service_id": "ops-bge-reranker-larger",
"host" : "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace" : "default"
}
}
{
"inference_id": "ali-rank",
"task_type": "rerank",
"service": "alibabacloud-ai-search",
"service_settings": {
"service_id": "ops-bge-reranker-larger",
"host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace": "default",
"rate_limit": {
"requests_per_minute": 1000
}
},
"task_settings": {}
}
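Send a POST request to rerank the results of your search query. The rerank interface does not need much configuration (task_settings). It returns semantic relevance scores, ordered from most to least relevant, along with the index of each corresponding text in the input array.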
POST _inference/rerank/ali-rank
{
"query": "What is the capital of the USA?",
"input": [
"Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.",
"Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.",
"Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
"Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.",
"North Dakota is a state in the United States. 672,591 people lived in North Dakota in the year 2010. The capital and seat of government is Bismarck."
]
}
{
"rerank": [
{
"index": 3,
"relevance_score": 0.9998832
},
{
"index": 4,
"relevance_score": 0.008847355
},
{
"index": 5,
"relevance_score": 0.0026626128
},
{
"index": 0,
"relevance_score": 0.00068250194
},
{
"index": 2,
"relevance_score": 0.00019716943
},
{
"index": 1,
"relevance_score": 0.00011591934
}
]
}
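Because the response carries only indices and scores, the caller has to join them back to the input texts. A minimal sketch of that join, assuming the request inputs and the parsed response are both available in Python; the helper name `attach_documents` is hypothetical, and the documents are shortened versions of the passages above.

```python
def attach_documents(rerank_response, documents):
    """Pair each relevance score with the input text its index points to."""
    return [
        (item["relevance_score"], documents[item["index"]])
        for item in rerank_response["rerank"]
    ]


# Shortened versions of the six input passages, in request order.
documents = [
    "Carson City is the capital city of the American state of Nevada.",
    "Capital punishment has existed in the United States since before it was a country.",
    "The Commonwealth of the Northern Mariana Islands ... Its capital is Saipan.",
    "Washington, D.C. is the capital of the United States.",
    "Charlotte Amalie is the capital of the United States Virgin Islands.",
    "North Dakota is a state in the United States. The capital is Bismarck.",
]

# Indices and scores copied from the response above.
rerank_response = {
    "rerank": [
        {"index": 3, "relevance_score": 0.9998832},
        {"index": 4, "relevance_score": 0.008847355},
        {"index": 5, "relevance_score": 0.0026626128},
        {"index": 0, "relevance_score": 0.00068250194},
        {"index": 2, "relevance_score": 0.00019716943},
        {"index": 1, "relevance_score": 0.00011591934},
    ]
}

ranked = attach_documents(rerank_response, documents)
for score, text in ranked:
    print(f"{score:.6f}  {text}")
```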
Alibaba Cloud provides a model specifically for generating sparse vectors, with the service ID ops-text-sparse-embedding-001.
PUT _inference/sparse_embedding/ali-sparse-embedding
{
"service": "alibabacloud-ai-search",
"service_settings": {
"api_key": "xxxxxxxxxxxxxxxxxx",
"service_id": "ops-text-sparse-embedding-001",
"host" : "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace" : "default"
}
}
{
"inference_id": "ali-sparse-embedding",
"task_type": "sparse_embedding",
"service": "alibabacloud-ai-search",
"service_settings": {
"service_id": "ops-text-sparse-embedding-001",
"host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace": "default",
"rate_limit": {
"requests_per_minute": 1000
}
},
"task_settings": {}
}
The task_settings for sparse embeddings are used as follows:
POST _inference/sparse_embedding/ali-sparse-embedding
{
"input": "Hello world",
"task_settings": {
"input_type": "search",
"return_token": true
}
}
{
"sparse_embedding": [
{
"is_truncated": false,
"embedding": {
"hello": 0.27783203,
"world": 0.28222656
}
}
]
}
Setting return_token to false returns the following:
{
"sparse_embedding": [
{
"is_truncated": false,
"embedding": {
"8999": 0.28222656,
"35378": 0.27783203
}
}
]
}
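Both forms are maps from a key (a token or a token ID) to a weight, and the weights are the same in the two responses above. One common way to compare two sparse vectors is to sum the products of the weights over their shared keys. The sketch below illustrates that idea only and is not a description of how Elasticsearch scores sparse vectors internally; the helper name `sparse_dot` and the document-side weights are hypothetical.

```python
def sparse_dot(query_vec, doc_vec):
    """Sum of weight products over keys present in both sparse vectors."""
    # Iterate over the smaller map to keep the loop short.
    small, large = sorted((query_vec, doc_vec), key=len)
    return sum(weight * large[key] for key, weight in small.items() if key in large)


# Query weights copied from the return_token == true response above.
query_vec = {"hello": 0.27783203, "world": 0.28222656}

# Hypothetical document-side weights, made up for illustration.
doc_vec = {"hello": 1.0, "elastic": 2.0}

sparse_score = sparse_dot(query_vec, doc_vec)
print(sparse_score)

# The token-ID form carries the same weights under different keys.
id_vec = {"8999": 0.28222656, "35378": 0.27783203}
print(sorted(query_vec.values()) == sorted(id_vec.values()))
```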
Alibaba Cloud also offers several text embedding models for different task types.
Text embedding has only one task_setting:
PUT _inference/text_embedding/ali-embeddings
{
"service": "alibabacloud-ai-search",
"service_settings": {
"api_key": "xxxxxxxxxxxxxxxxxx",
"service_id": "ops-text-embedding-001",
"host" : "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace" : "default"
}
}
{
"inference_id": "ali-embeddings",
"task_type": "text_embedding",
"service": "alibabacloud-ai-search",
"service_settings": {
"service_id": "ops-text-embedding-001",
"host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace": "default",
"rate_limit": {
"requests_per_minute": 1000
},
"similarity": "dot_product",
"dimensions": 1536
},
"task_settings": {}
}
Send a POST request to generate a text embedding:
POST _inference/text_embedding/ali-embeddings
{
"input": "Hello world"
}
{
"text_embedding": [
{
"embedding": [
-0.017036675,
0.07038724,
0.044685286,
0.0064531807,
0.013290042,
0.011183944,
-0.0020014185,
-0.009508779,
… (additional values) …
]
}
]
}
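Whether you use Elasticsearch to implement hybrid search, semantic reranking, or to enhance RAG use cases with summarization, the connection to Alibaba Cloud's AI services opens up a new world of possibilities for Elasticsearch developers. Thanks again to the Alibaba Cloud team for their contribution!
This article is reposted from the WeChat official account @阿里云大數據AI平臺.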