Dec 23, 2024

A Deep Dive into the Google A2A Protocol: An Open Standard for Agent-to-Agent Communication and Task Lifecycle Management

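Generation Prompt

Please analyze the Google A2A (Agent-to-Agent) Protocol in depth:
1. Clone the A2A source code and read the specification documents
2. Analyze the Agent Card discovery mechanism and capability declaration
3. Study the Task lifecycle and state management
4. Break down the JSON-RPC message format and multimodal content
5. Analyze SSE streaming and the push notification mechanism
6. Draw Mermaid flowcharts to illustrate protocol interactions
7. Write an engineering-grade, in-depth article in Traditional Chinese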

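Executive Summary

The Google A2A (Agent-to-Agent) Protocol is an open standard that lets AI agents from different frameworks and vendors communicate as "first-class citizens". This article examines its core building blocks:

Agent Card: a self-describing capability declaration and discovery mechanism
Task lifecycle: a state machine that runs from SUBMITTED to a terminal state such as COMPLETED
JSON-RPC 2.0: a standardized message format and set of RPC methods
Multimodal content: uniform handling of text, files, and forms
Streaming and push: real-time SSE streaming and asynchronous webhook notifications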

1. Protocol Design Philosophy

1.1 Three-Layer Architecture

flowchart TB
    subgraph Layer3["Protocol Binding Layer"]
        JSONRPC[JSON-RPC 2.0]
        GRPC[gRPC]
        HTTP[HTTP+JSON]
    end

    subgraph Layer2["Abstract Operations Layer"]
        Send[send]
        Stream[stream]
        Get[get]
        List[list]
        Cancel[cancel]
    end

    subgraph Layer1["Canonical Data Model"]
        Task[Task]
        Message[Message]
        Part[Part]
        Artifact[Artifact]
    end

    Layer3 --> Layer2
    Layer2 --> Layer1

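1.2 Five Design Principles

Principle	Description
Simplicity	Reuses existing standards such as HTTP, JSON-RPC, and SSE
Enterprise-ready	Built-in authentication, authorization, security, tracing, and monitoring
Async-first	Native support for long-running tasks and human-in-the-loop interaction
Modality-agnostic	Supports text, files, structured data, and forms
Opaque execution	Agents collaborate without exposing their internal logic or memory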

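1.3 Core Problems Addressed

Pain points of traditional agent integration, compared with the A2A approach:

flowchart LR
    subgraph Traditional["Traditional approach"]
        A1[Agent 1] -->|Custom protocol| A2[Agent 2]
        A1 -->|Wrapped as a tool| A3[Agent 3]
        A2 -->|Different API| A3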
    end

    subgraph A2A["A2A approach"]
        B1[Agent 1] -->|Standard protocol| B2[Agent 2]
        B1 -->|Standard protocol| B3[Agent 3]
        B2 -->|Standard protocol| B3
    end

2. Agent Card Discovery Mechanism

2.1 Agent Card Structure

An Agent Card is an agent's self-describing manifest:

interface AgentCard {
  // Core identity
  protocolVersion: "1.0";           // required
  name: string;                     // agent name
  description: string;              // what the agent does
  version: string;                  // semantic version

  // Provider information
  provider: AgentProvider;
  documentationUrl?: string;
  iconUrl?: string;

  // Interface configuration
  supportedInterfaces: AgentInterface[];

  // Capability declaration
  capabilities: AgentCapabilities;

  // Security schemes
  securitySchemes: Record<string, SecurityScheme>;
  security: Security[];

  // Interaction defaults
  defaultInputModes: string[];      // accepted media types
  defaultOutputModes: string[];     // produced media types

  // Advertised skills
  skills: AgentSkill[];

  // Extensions
  extensions?: AgentExtension[];
  signatures?: AgentCardSignature[];
}

2.2 AgentSkill Definition

Each skill represents one capability of the agent:

{
  "skills": [
    {
      "id": "recipe-generator",
      "name": "Recipe Generator",
      "description": "Generate recipes based on ingredients",
      "tags": ["cooking", "recipes", "food"],
      "examples": [
        "Generate a recipe with chicken and vegetables",
        "Create a vegetarian pasta dish"
      ],
      "inputModes": ["text/plain", "application/json"],
      "outputModes": ["text/plain", "application/json"]
    }
  ]
}

2.3 Discovery Strategies

flowchart TB
    subgraph WellKnown["Well-Known URI"]
        WK1[GET /.well-known/agent-card.json]
        WK2[RFC 8615 standard]
        WK3[Domain-controlled discovery]
    end

    subgraph Registry["Curated registry"]
        R1[Central registry]
        R2[Query by skill or tag]
        R3[Access control]
    end

    subgraph Direct["Direct configuration"]
        D1[Environment variables]
        D2[Configuration files]
        D3[Proprietary API]
    end

    Client[Client] --> WellKnown
    Client --> Registry
    Client --> Direct
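
The well-known strategy can be sketched as a small URL derivation. This is a minimal illustration, not part of the A2A SDK: `agentCardUrl` is a hypothetical helper name, and the actual HTTP GET is left out so the snippet stays self-contained.

```typescript
// Fixed path taken from the diagram above (RFC 8615 well-known URI).
const WELL_KNOWN_PATH = "/.well-known/agent-card.json";

// Hypothetical helper: derive the Agent Card location for an agent's domain.
function agentCardUrl(baseUrl: string): string {
  // Resolving an absolute path against the base keeps only the origin,
  // so any path, query, or fragment on baseUrl is discarded.
  return new URL(WELL_KNOWN_PATH, baseUrl).toString();
}

console.log(agentCardUrl("https://travelcorp.example.com/some/page"));
// prints "https://travelcorp.example.com/.well-known/agent-card.json"
```

A client would then issue a GET against the returned URL and parse the body as an AgentCard.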

2.4 Agent Card Example

{
  "protocolVersion": "1.0",
  "name": "Travel Booking Agent",
  "description": "Book flights, hotels, and activities",
  "version": "2.1.0",
  "provider": {
    "name": "TravelCorp",
    "url": "https://travelcorp.example.com"
  },
  "supportedInterfaces": [
    {
      "url": "https://api.travelcorp.example.com/a2a",
      "protocolBinding": "JSONRPC"
    }
  ],
  "capabilities": {
    "streaming": true,
    "pushNotifications": true
  },
  "securitySchemes": {
    "oauth2": {
      "oauth2SecurityScheme": {
        "flows": {
          "clientCredentials": {
            "tokenUrl": "https://auth.travelcorp.example.com/token",
            "scopes": {
              "booking:read": "Read bookings",
              "booking:write": "Create bookings"
            }
          }
        }
      }
    }
  },
  "security": [{"oauth2": ["booking:read", "booking:write"]}],
  "defaultInputModes": ["text/plain", "application/json"],
  "defaultOutputModes": ["text/plain", "application/json"],
  "skills": [
    {
      "id": "flight-booking",
      "name": "Flight Booking",
      "description": "Search and book flights"
    },
    {
      "id": "hotel-booking",
      "name": "Hotel Booking",
      "description": "Search and book hotels"
    }
  ]
}
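
Once a client has a card like the one above, it typically needs to pick an endpoint it can speak to and check which optional capabilities are on offer. The sketch below assumes only the fields shown in the example; `selectEndpoint` is a hypothetical helper, and the `"GRPC"` binding string is an assumed value used for illustration.

```typescript
// Minimal subset of the Agent Card fields used in the example above.
interface AgentInterface {
  url: string;
  protocolBinding: string;
}

interface CardSubset {
  supportedInterfaces: AgentInterface[];
  capabilities: { streaming?: boolean; pushNotifications?: boolean };
}

// Hypothetical helper: return the URL of the first interface whose binding
// the client supports, or undefined when there is no overlap.
function selectEndpoint(card: CardSubset, clientBindings: string[]): string | undefined {
  const match = card.supportedInterfaces.find((iface) =>
    clientBindings.includes(iface.protocolBinding)
  );
  return match?.url;
}

const card: CardSubset = {
  supportedInterfaces: [
    { url: "https://api.travelcorp.example.com/a2a", protocolBinding: "JSONRPC" },
  ],
  capabilities: { streaming: true, pushNotifications: true },
};

// "GRPC" is an assumed binding name; only "JSONRPC" appears in the card.
const endpoint = selectEndpoint(card, ["GRPC", "JSONRPC"]);
const canStream = card.capabilities.streaming === true;
console.log(endpoint, canStream);
```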

3. Task Lifecycle Management

3.1 Task State Machine

stateDiagram-v2
    [*] --> SUBMITTED: Create task

    SUBMITTED --> WORKING: Start processing
    SUBMITTED --> REJECTED: Reject task

    WORKING --> INPUT_REQUIRED: Input needed
    WORKING --> AUTH_REQUIRED: Authentication needed
    WORKING --> COMPLETED: Completed successfully
    WORKING --> FAILED: Processing failed
    WORKING --> CANCELLED: Cancelled by user

    INPUT_REQUIRED --> WORKING: Input provided
    AUTH_REQUIRED --> WORKING: Authentication provided

    COMPLETED --> [*]
    FAILED --> [*]
    CANCELLED --> [*]
    REJECTED --> [*]

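3.2 State Definitions

State	Type	Description
SUBMITTED	Initial	Task has been created and acknowledged by the server
WORKING	Active	Task is being processed
INPUT_REQUIRED	Interrupted	Agent needs clarification or data
AUTH_REQUIRED	Paused	Additional authentication is required
COMPLETED	Terminal	Completed successfully
FAILED	Terminal	Processing failed (agent error)
CANCELLED	Terminal	Cancelled by the user or client
REJECTED	Terminal	Agent rejected the task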
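
The state diagram in 3.1 can be encoded as a transition table, which makes it easy to reject illegal state changes. This is a minimal sketch that copies only the transitions drawn above; `canTransition` and `isTerminal` are hypothetical helper names.

```typescript
type TaskState =
  | "SUBMITTED" | "WORKING" | "INPUT_REQUIRED" | "AUTH_REQUIRED"
  | "COMPLETED" | "FAILED" | "CANCELLED" | "REJECTED";

// Allowed transitions, copied from the state diagram in 3.1.
const TRANSITIONS: Record<TaskState, readonly TaskState[]> = {
  SUBMITTED: ["WORKING", "REJECTED"],
  WORKING: ["INPUT_REQUIRED", "AUTH_REQUIRED", "COMPLETED", "FAILED", "CANCELLED"],
  INPUT_REQUIRED: ["WORKING"],
  AUTH_REQUIRED: ["WORKING"],
  COMPLETED: [],
  FAILED: [],
  CANCELLED: [],
  REJECTED: [],
};

// True when the diagram contains an edge from `from` to `to`.
function canTransition(from: TaskState, to: TaskState): boolean {
  return TRANSITIONS[from].includes(to);
}

// Terminal states are exactly those with no outgoing edges.
function isTerminal(state: TaskState): boolean {
  return TRANSITIONS[state].length === 0;
}

console.log(canTransition("SUBMITTED", "WORKING")); // true
console.log(canTransition("COMPLETED", "WORKING")); // false
console.log(isTerminal("REJECTED"));                // true
```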

3.3 Task Structure

interface Task {
  id: string;                    // UUID generated by the server
  contextId: string;             // group ID linking related tasks
  status: TaskStatus;            // current status
  artifacts: Artifact[];         // generated outputs
  history: Message[];            // multi-turn conversation history
  metadata?: Record<string, any>;// custom metadata
}

interface TaskStatus {
  state: TaskState;              // current state
  message?: Message;             // optional status message
  timestamp: string;             // ISO 8601 timestamp
}

3.4 Context Management

A contextId logically groups related tasks and messages:

flowchart TB
    subgraph Context["contextId: ctx-abc"]
        T1[Task 1: Search flights]
        T2[Task 2: Change destination]
        T3[Task 3: Confirm booking]

        T1 --> T2
        T2 --> T3
    end

    subgraph Artifacts["Artifact evolution"]
        A1[art-v1: Initial result]
        A2[art-v2: Revised result]
        A3[art-v3: Final confirmation]
    end

    T1 --> A1
    T2 --> A2
    T3 --> A3

3.5 Task Refinement Pattern

A client can refine an earlier result within the same context:

// The original task creates an artifact
{
  "taskId": "task-123",
  "contextId": "ctx-abc",
  "artifacts": [{"artifactId": "art-v1"}]
}

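// The refinement request references the earlier task
{
  "method": "message:send",
  "params": {
    "message": {
      "contextId": "ctx-abc",
      "referenceTaskIds": ["task-123"],
      "parts": [{"text": "Please change the flight to an afternoon departure"}]
    }
  }
}

// The agent creates a new task with the same contextId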
{
  "taskId": "task-456",
  "contextId": "ctx-abc",
  "artifacts": [{"artifactId": "art-v2", "name": "flight-options"}]
}
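
The refinement pattern above amounts to copying the earlier task's contextId and listing its ID in referenceTaskIds. A minimal sketch, where `buildRefinement` and the trimmed-down types are hypothetical and cover only the fields used in this section:

```typescript
interface TextPart {
  text: string;
}

// Only the fields of Message that the refinement example uses.
interface RefinementMessage {
  messageId: string;
  role: "user";
  contextId: string;
  referenceTaskIds: string[];
  parts: TextPart[];
}

// Identifies the earlier task whose result is being refined.
interface TaskRef {
  taskId: string;
  contextId: string;
}

// Hypothetical helper: build a follow-up message that stays in the same
// context and points back at the task being refined.
function buildRefinement(previous: TaskRef, messageId: string, text: string): RefinementMessage {
  return {
    messageId,
    role: "user",
    contextId: previous.contextId,
    referenceTaskIds: [previous.taskId],
    parts: [{ text }],
  };
}

const refinement = buildRefinement(
  { taskId: "task-123", contextId: "ctx-abc" },
  "msg-002",
  "Please change the flight to an afternoon departure"
);
console.log(JSON.stringify(refinement, null, 2));
```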

4. JSON-RPC Message Format

4.1 JSON-RPC 2.0 Basics

With the JSON-RPC binding, every A2A request follows the JSON-RPC 2.0 specification:

{
  "jsonrpc": "2.0",
  "id": "req-001",
  "method": "message:send",
  "params": {
    // request parameters
  }
}
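
The envelope above is the same for every method, so a client can build it with one small generic helper. This is a sketch under assumptions: `makeRequest` is a hypothetical name, and the `req-NNN` ID scheme simply mirrors the example rather than anything the protocol requires.

```typescript
// Shape of the JSON-RPC 2.0 request envelope shown above.
interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  id: string;
  method: string;
  params: P;
}

let counter = 0;

// Hypothetical helper: wrap method-specific params in a JSON-RPC envelope
// with a fresh, zero-padded request ID.
function makeRequest<P>(method: string, params: P): JsonRpcRequest<P> {
  counter += 1;
  const id = `req-${String(counter).padStart(3, "0")}`;
  return { jsonrpc: "2.0", id, method, params };
}

const sendRequest = makeRequest("message:send", {
  message: { messageId: "msg-001", role: "user", parts: [{ text: "hello" }] },
});
console.log(JSON.stringify(sendRequest));
```

The response to a request carries the same `id`, which is how a client pairs replies with outstanding calls.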

4.2 Core RPC Methods

message:send (non-streaming):

{
  "method": "message:send",
  "params": {
    "message": {
      "messageId": "msg-001",
      "role": "user",
      "contextId": "ctx-abc",
      "parts": [
        {"text": "Book me a flight to Tokyo for tomorrow"}
      ]
    },
    "configuration": {
      "acceptedOutputModes": ["text/plain", "application/json"],
      "historyLength": 10,
      "blocking": false
    }
  }
}

message:stream (SSE streaming):

POST /message:stream
Content-Type: application/json

HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"jsonrpc": "2.0", "result": {"id": "task-789", ...}}
data: {"jsonrpc": "2.0", "result": {"taskId": "task-789", "status": {"state": "working"}}}
data: {"jsonrpc": "2.0", "result": {"taskId": "task-789", "artifact": {...}}}
data: {"jsonrpc": "2.0", "result": {"taskId": "task-789", "status": {"state": "completed"}, "final": true}}
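
On the client side, the stream body has to be split back into individual JSON payloads. The sketch below is a deliberately simplified SSE reader: it handles only `data:` fields and blank-line event separators, ignores `event:`, `id:`, and comment lines, and assumes the whole body is already in memory. `parseSseData` is a hypothetical helper name.

```typescript
// Hypothetical helper: extract the JSON payload of each SSE event.
function parseSseData(raw: string): unknown[] {
  const payloads: unknown[] = [];
  // Events are separated by a blank line.
  for (const block of raw.split(/\r?\n\r?\n/)) {
    const dataLines = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:"))
      // Drop the field name and the single optional space after the colon.
      .map((line) => line.slice(5).replace(/^ /, ""));
    if (dataLines.length > 0) {
      // Multiple data lines within one event are joined with newlines.
      payloads.push(JSON.parse(dataLines.join("\n")));
    }
  }
  return payloads;
}

const raw =
  'data: {"jsonrpc":"2.0","result":{"taskId":"task-789","status":{"state":"working"}}}\n\n' +
  'data: {"jsonrpc":"2.0","result":{"taskId":"task-789","status":{"state":"completed"},"final":true}}\n\n';

const events = parseSseData(raw);
console.log(events.length); // 2
```

A production client would read the response incrementally and buffer partial events instead of splitting a complete string.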

tasks:get:

{
  "method": "tasks:get",
  "params": {
    "name": "tasks/task-456",
    "historyLength": 5
  }
}

tasks:list (paginated):

{
  "method": "tasks:list",
  "params": {
    "contextId": "ctx-abc",
    "status": "working",
    "pageSize": 50,
    "pageToken": "",
    "includeArtifacts": true
  }
}
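
Paging through tasks:list means repeating the call with the token from the previous page until no token comes back. The source shows only the request, so the response shape here (`tasks`, `nextPageToken`) is an assumption, as are the names `listAllTasks` and `fakeList`; the in-memory `fakeList` stands in for a real server so the sketch runs without a network.

```typescript
interface ListParams {
  contextId: string;
  pageSize: number;
  pageToken: string;
}

// Assumed response shape: field names are hypothetical, not taken from the source.
interface ListPage {
  tasks: { id: string }[];
  nextPageToken: string;
}

type ListFn = (params: ListParams) => Promise<ListPage>;

// Hypothetical helper: follow page tokens until the server returns an empty one.
async function listAllTasks(list: ListFn, contextId: string, pageSize: number): Promise<string[]> {
  const ids: string[] = [];
  let pageToken = ""; // an empty token requests the first page, as in the example
  do {
    const page = await list({ contextId, pageSize, pageToken });
    for (const task of page.tasks) ids.push(task.id);
    pageToken = page.nextPageToken;
  } while (pageToken !== "");
  return ids;
}

// In-memory stand-in for the server; the token is just the next start offset.
const allIds = ["task-1", "task-2", "task-3"];
const fakeList: ListFn = async ({ pageSize, pageToken }) => {
  const start = pageToken === "" ? 0 : Number(pageToken);
  const end = start + pageSize;
  return {
    tasks: allIds.slice(start, end).map((id) => ({ id })),
    nextPageToken: end < allIds.length ? String(end) : "",
  };
};

listAllTasks(fakeList, "ctx-abc", 2).then((ids) => console.log(ids));
```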

tasks:cancel:

{
  "method": "tasks:cancel",
  "params": {
    "name": "tasks/task-456"
  }
}

4.3 Message Structure

interface Message {
  messageId: string;              // generated by the client or server
  contextId?: string;             // optional grouping
  taskId?: string;                // optional task association
  role: "user" | "agent";         // message role
  parts: Part[];                  // content
  metadata?: Record<string, any>; // custom metadata
  extensions?: string[];          // enabled extensions
  referenceTaskIds?: string[];    // IDs of referenced tasks
}

4.4 Artifact Structure

interface Artifact {
  artifactId: string;             // UUID, unique within the task
  name: string;                   // human-readable name
  description?: string;           // optional description of purpose
  parts: Part[];                  // content (same as Message)
  metadata?: Record<string, any>; // custom metadata
  extensions?: string[];          // extension metadata
}

5. Multimodal Content Handling

5.1 Part Types

interface Part {
  // exactly one of the following three
  text?: string;                  // plain text
  file?: FilePart;                // binary data or file
  data?: DataPart;                // structured data

  metadata?: Record<string, any>; // custom metadata
}

interface FilePart {
  // exactly one of the following two
  fileWithUri?: string;           // URI reference
  fileWithBytes?: string;         // inline Base64

  mediaType: string;              // MIME type
  name?: string;                  // file name
}

interface DataPart {
  data: Record<string, any>;      // JSON structure
}
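
Because the "exactly one of" rule is expressed only in comments, a receiver may want to enforce it at runtime before dispatching on the content kind. A minimal sketch; `partKind` is a hypothetical helper, and `unknown` replaces `any` purely as a stylistic choice.

```typescript
interface FilePart {
  fileWithUri?: string;
  fileWithBytes?: string;
  mediaType: string;
  name?: string;
}

interface DataPart {
  data: Record<string, unknown>;
}

interface Part {
  text?: string;
  file?: FilePart;
  data?: DataPart;
  metadata?: Record<string, unknown>;
}

type PartKind = "text" | "file" | "data";

// Hypothetical helper: return which content field is set, and throw when
// zero or more than one of text/file/data is present.
function partKind(part: Part): PartKind {
  const present: PartKind[] = [];
  if (part.text !== undefined) present.push("text");
  if (part.file !== undefined) present.push("file");
  if (part.data !== undefined) present.push("data");
  if (present.length !== 1) {
    throw new Error(`Part must set exactly one of text/file/data, got ${present.length}`);
  }
  return present[0];
}

console.log(partKind({ text: "hello" })); // "text"
console.log(
  partKind({ file: { fileWithUri: "https://storage.example.com/dataset.csv", mediaType: "text/csv" } })
); // "file"
```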

5.2 Supported Content Types

┌─ Text (text/plain, text/markdown)
├─ Code (text/x-python, text/javascript)
├─ Structured data (application/json, application/xml)
├─ Forms (application/x-www-form-urlencoded)
├─ Images (image/png, image/jpeg, image/webp)
├─ Audio (audio/mpeg, audio/wav)
├─ Video (video/mp4, video/webm)
├─ Documents (application/pdf, application/msword)
└─ Archives (application/zip)

5.3 Media Type Negotiation

flowchart TB
    Client[Client] -->|1. Read the AgentCard| AgentCard[defaultOutputModes]
    Client -->|2. Set preferences| Config[acceptedOutputModes]

    AgentCard --> Negotiate[Take the intersection]
    Config --> Negotiate

    Negotiate --> Response[Response format]
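
The intersection step in the diagram can be written in a few lines. This sketch assumes the client lists its modes in preference order and that this order should be kept; `negotiateOutputModes` is a hypothetical helper name.

```typescript
// Hypothetical helper: keep the client's accepted modes that the agent can
// also produce, preserving the client's preference order.
function negotiateOutputModes(agentModes: string[], acceptedModes: string[]): string[] {
  const supported = new Set(agentModes);
  return acceptedModes.filter((mode) => supported.has(mode));
}

const agreed = negotiateOutputModes(
  ["text/plain", "application/json"],   // defaultOutputModes from the card
  ["application/json", "image/png"]     // acceptedOutputModes from the request
);
console.log(agreed); // [ "application/json" ]
```

An empty result means the two sides share no output format, which the caller has to treat as a negotiation failure.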

5.4 File Handling Examples

URI reference (large files):

{
  "parts": [{
    "file": {
      "fileWithUri": "https://storage.example.com/dataset.csv",
      "mediaType": "text/csv",
      "name": "dataset.csv"
    }
  }]
}

Base64 inline (small files):

{
  "parts": [{
    "file": {
      "fileWithBytes": "SGVsbG8sIFdvcmxkIQ==",
      "mediaType": "text/plain",
      "name": "greeting.txt"
    }
  }]
}
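
Reading an inline file means Base64-decoding fileWithBytes into raw bytes, and for textual media types decoding those bytes as UTF-8. The sketch below relies on the `atob` and `TextDecoder` globals available in browsers and recent Node.js versions; `inlineBytes` is a hypothetical helper name.

```typescript
interface FilePart {
  fileWithUri?: string;
  fileWithBytes?: string;
  mediaType: string;
  name?: string;
}

// Hypothetical helper: decode the inline Base64 payload, or return undefined
// when the file is provided by URI instead.
function inlineBytes(file: FilePart): Uint8Array | undefined {
  if (file.fileWithBytes === undefined) return undefined;
  const binary = atob(file.fileWithBytes); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

const greeting: FilePart = {
  fileWithBytes: "SGVsbG8sIFdvcmxkIQ==",
  mediaType: "text/plain",
  name: "greeting.txt",
};

const decoded = inlineBytes(greeting);
const greetingText = decoded === undefined ? "" : new TextDecoder().decode(decoded);
console.log(greetingText); // "Hello, World!"
```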

5.5 Structured Data (Forms)

{
  "parts": [{
    "data": {
      "data": {
        "departure": "2024-02-01",
        "destination": "Paris",
        "passengers": 2,
        "budget": 5000,
        "preferences": {
          "class": "business",
          "directFlight": true
        }
      }
    }
  }]
}

6. Streaming and Push Notifications

6.1 SSE Streaming

Capability declaration:

{
  "capabilities": {
    "streaming": true
  }
}

Streaming lifecycle:

sequenceDiagram
    participant C as Client
    participant S as Server

    C->>S: POST /message:stream
    S-->>C: HTTP 200 OK (text/event-stream)

    S-->>C: data: {Task}
    S-->>C: data: {StatusUpdate: working}
    S-->>C: data: {ArtifactUpdate: chunk1}
    S-->>C: data: {ArtifactUpdate: chunk2}
    S-->>C: data: {StatusUpdate: completed, final: true}

    Note over C,S: Connection closed

Streaming event types:

interface StreamResponse {
  // exactly one of the following four
  task?: Task;                           // initial task
  msg?: Message;                         // message response
  statusUpdate?: TaskStatusUpdateEvent;  // status change
  artifactUpdate?: TaskArtifactUpdateEvent; // artifact chunk
}

interface TaskStatusUpdateEvent {
  taskId: string;
  contextId: string;
  status: TaskStatus;
  final: boolean;                        // whether this is the terminating signal
  metadata?: Record<string, any>;
}

interface TaskArtifactUpdateEvent {
  taskId: string;
  contextId: string;
  artifact: Artifact;
  append: boolean;                       // whether to append to the previous chunk
  lastChunk: boolean;                    // whether this is the last chunk
}
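
A client consuming artifact updates has to stitch chunks back together. The sketch below assumes one plausible reading of the flags: `append: true` adds the chunk's parts to the artifact with the same artifactId, `append: false` replaces it, and `lastChunk` marks the artifact as complete. These semantics, the text-only parts, and the name `applyArtifactUpdate` are assumptions for illustration.

```typescript
interface TextPart {
  text: string;
}

interface Artifact {
  artifactId: string;
  name: string;
  parts: TextPart[];
}

interface TaskArtifactUpdateEvent {
  taskId: string;
  contextId: string;
  artifact: Artifact;
  append: boolean;
  lastChunk: boolean;
}

// Hypothetical reducer: merge one update into the store and report whether
// the artifact is now complete.
function applyArtifactUpdate(store: Map<string, Artifact>, update: TaskArtifactUpdateEvent): boolean {
  const existing = store.get(update.artifact.artifactId);
  if (update.append && existing !== undefined) {
    existing.parts.push(...update.artifact.parts);
  } else {
    // First chunk, or a non-append update: store a copy of the artifact.
    store.set(update.artifact.artifactId, { ...update.artifact, parts: [...update.artifact.parts] });
  }
  return update.lastChunk;
}

const store = new Map<string, Artifact>();
const base = { taskId: "task-789", contextId: "ctx-abc" };
applyArtifactUpdate(store, {
  ...base,
  artifact: { artifactId: "art-v1", name: "flight-options", parts: [{ text: "Flight A" }] },
  append: false,
  lastChunk: false,
});
const done = applyArtifactUpdate(store, {
  ...base,
  artifact: { artifactId: "art-v1", name: "flight-options", parts: [{ text: ", Flight B" }] },
  append: true,
  lastChunk: true,
});
const joined = store.get("art-v1")?.parts.map((part) => part.text).join("");
console.log(joined, done); // "Flight A, Flight B" true
```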

6.2 Push Notifications (Webhook)