cherry-studio/packages/catalog/docs/provider-endpoint-design.md
suyao 5b009769c3
2025-12-24 01:29:07 +08:00

Provider Endpoint Schema Design (IMPLEMENTED)

Problem Analysis

Previous Issues (SOLVED)

  1. provider_type was semantically unclear - represented API format/protocol, not provider type
  2. api_host was in metadata but is a core configuration field
  3. anthropic_api_host existed as a separate field for dual-protocol providers
  4. supported_endpoints was too coarse-grained (all were "CHAT_COMPLETIONS")
  5. No clear mapping between endpoint types, their API hosts, and request formats

Real-World Patterns

Different LLM providers use different API formats:

  • OpenAI: Covers /v1/chat/completions, /v1/embeddings, /v1/images/generations, etc.
  • Anthropic: /v1/messages (Claude API)
  • Gemini: Custom Google API format
  • DeepSeek: Supports both OpenAI format AND Anthropic format at different base URLs

Key Insight

Most providers share the same base_url for all their endpoints - only the API format and endpoint path differ.

Final Schema Design (IMPLEMENTED)

Two-Layer Abstraction

  1. Endpoint Type - What functionality (chat, embeddings, images, etc.)
  2. API Format - What protocol (OpenAI, Anthropic, Gemini, etc.)

import { z } from 'zod'

// Endpoint types - represents the API functionality
export const EndpointTypeSchema = z.enum([
  // LLM endpoints
  'CHAT_COMPLETIONS',
  'TEXT_COMPLETIONS',

  // Embedding endpoints
  'EMBEDDINGS',
  'RERANK',

  // Image endpoints
  'IMAGE_GENERATION',
  'IMAGE_EDIT',
  'IMAGE_VARIATION',

  // Audio endpoints
  'AUDIO_TRANSCRIPTION',
  'AUDIO_TRANSLATION',
  'TEXT_TO_SPEECH',

  // Video endpoints
  'VIDEO_GENERATION'
])

// API format types - represents the protocol/format of the API
export const ApiFormatSchema = z.enum([
  'OPENAI',     // OpenAI standard format (covers chat, embeddings, images, etc.)
  'ANTHROPIC',  // Anthropic format
  'GEMINI',     // Google Gemini API format
  'CUSTOM'      // Custom/proprietary format
])

// Format configuration - maps API format to base URL
export const FormatConfigSchema = z.object({
  format: ApiFormatSchema,
  base_url: z.string().url(),
  default: z.boolean().default(false)
})

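// Provider schema with format configurations
// (ProviderIdSchema, AuthenticationSchema, ApiCompatibilitySchema, and
// MetadataSchema are assumed to be defined elsewhere in the catalog schemas)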
export const ProviderConfigSchema = z.object({
  id: ProviderIdSchema,
  name: z.string(),
  description: z.string().optional(),
  authentication: AuthenticationSchema.default('API_KEY'),

  // API format configurations
  // Each provider can support multiple API formats (e.g., OpenAI + Anthropic)
  formats: z.array(FormatConfigSchema).min(1)
    .refine((formats) => formats.filter(f => f.default).length <= 1, {
      message: 'Only one format can be marked as default'
    }),

  // Supported endpoint types (optional, for documentation)
  supported_endpoints: z.array(EndpointTypeSchema).optional(),

  // API compatibility - kept for online updates
  api_compatibility: ApiCompatibilitySchema.optional(),

  documentation: z.string().url().optional(),
  website: z.string().url().optional(),
  deprecated: z.boolean().default(false),

  // Additional metadata (only truly extra fields go here)
  metadata: MetadataSchema
})

Example Data

Single Format Provider (OpenAI)

{
  "id": "openai",
  "name": "OpenAI",
  "formats": [
    {
      "format": "OPENAI",
      "base_url": "https://api.openai.com",
      "default": true
    }
  ],
  "supported_endpoints": [
    "CHAT_COMPLETIONS",
    "EMBEDDINGS",
    "IMAGE_GENERATION",
    "TEXT_TO_SPEECH",
    "AUDIO_TRANSCRIPTION"
  ]
}

Multi-Format Provider (DeepSeek)

{
  "id": "deepseek",
  "name": "DeepSeek",
  "formats": [
    {
      "format": "OPENAI",
      "base_url": "https://api.deepseek.com",
      "default": true
    },
    {
      "format": "ANTHROPIC",
      "base_url": "https://api.deepseek.com/anthropic"
    }
  ],
  "supported_endpoints": ["CHAT_COMPLETIONS"]
}
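
The following is a minimal sketch of how client code might choose among a provider's formats, using the DeepSeek data above. The names ApiFormat, FormatConfig, and selectFormat are hypothetical and only mirror FormatConfigSchema; the fallback order (requested format, then the default entry, then the first entry) is an assumption, since the schema leaves selection logic to the client.

```typescript
// Hypothetical client-side types mirroring ApiFormatSchema and FormatConfigSchema.
type ApiFormat = 'OPENAI' | 'ANTHROPIC' | 'GEMINI' | 'CUSTOM'

interface FormatConfig {
  format: ApiFormat
  base_url: string
  default?: boolean
}

// Assumed selection order: the requested format if the provider offers it,
// otherwise the entry marked default, otherwise the first entry
// (the schema permits zero defaults, so a final fallback is needed).
function selectFormat(formats: FormatConfig[], preferred?: ApiFormat): FormatConfig {
  if (formats.length === 0) {
    throw new Error('provider has no formats')
  }
  const requested = preferred ? formats.find((f) => f.format === preferred) : undefined
  return requested ?? formats.find((f) => f.default === true) ?? formats[0]
}

// The DeepSeek entry from the example above.
const deepseekFormats: FormatConfig[] = [
  { format: 'OPENAI', base_url: 'https://api.deepseek.com', default: true },
  { format: 'ANTHROPIC', base_url: 'https://api.deepseek.com/anthropic' }
]

console.log(selectFormat(deepseekFormats).base_url)
console.log(selectFormat(deepseekFormats, 'ANTHROPIC').base_url)
```

Falling back silently when the requested format is missing is one possible policy; a stricter client could throw instead.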

Single Format Provider, Non-OpenAI (Anthropic)

{
  "id": "anthropic",
  "name": "Anthropic",
  "formats": [
    {
      "format": "ANTHROPIC",
      "base_url": "https://api.anthropic.com",
      "default": true
    }
  ],
  "supported_endpoints": ["CHAT_COMPLETIONS"]
}

Benefits

  1. Clear Semantics: format names the API protocol; the entries in supported_endpoints name the functionality
  2. Simplified Structure: For most providers a single base_url serves every endpoint; only the format and path differ
  3. Multi-Protocol Support: Providers can support multiple formats naturally (e.g., DeepSeek)
  4. Default Selection: Client knows which format to use by default
  5. No Metadata Pollution: Core config fields are top-level, not in metadata
  6. Extensible: Easy to add new endpoint types or formats
  7. Business Logic Separation: Schema doesn't encode priority/selection logic - that's for client code

Migration Completed

Migration script: scripts/migrate-providers-to-formats.ts

Transformations applied:

  • metadata.provider_type → formats[0].format (mapped to OPENAI/ANTHROPIC/GEMINI)
  • metadata.api_host → formats[0].base_url
  • metadata.anthropic_api_host → formats[1] with format: ANTHROPIC
  • supported_endpoints → set to ["CHAT_COMPLETIONS"] as default
  • Cleaned metadata to remove migrated fields
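
The transformations above can be sketched as a single function. This is an illustration only, not the contents of scripts/migrate-providers-to-formats.ts: the names LegacyProvider, MigratedProvider, and migrateProvider are hypothetical, the legacy provider_type strings ('openai', 'anthropic', 'gemini') are assumed, and throwing on an unmapped value is a choice made here for safety.

```typescript
type ApiFormat = 'OPENAI' | 'ANTHROPIC' | 'GEMINI' | 'CUSTOM'

interface FormatConfig {
  format: ApiFormat
  base_url: string
  default?: boolean
}

// Hypothetical shapes for a provider record before and after migration.
interface LegacyProvider {
  id: string
  name: string
  metadata: Record<string, unknown>
}

interface MigratedProvider {
  id: string
  name: string
  formats: FormatConfig[]
  supported_endpoints: string[]
  metadata: Record<string, unknown>
}

// Assumed legacy provider_type values; the real set may differ.
const LEGACY_TYPE_TO_FORMAT = new Map<string, ApiFormat>([
  ['openai', 'OPENAI'],
  ['anthropic', 'ANTHROPIC'],
  ['gemini', 'GEMINI']
])

function migrateProvider(legacy: LegacyProvider): MigratedProvider {
  // Pull the migrated fields out; everything else stays in metadata.
  const { provider_type, api_host, anthropic_api_host, ...rest } = legacy.metadata

  const format = LEGACY_TYPE_TO_FORMAT.get(String(provider_type).toLowerCase())
  if (!format) {
    throw new Error(`unmapped provider_type: ${String(provider_type)}`)
  }
  if (typeof api_host !== 'string') {
    throw new Error(`provider ${legacy.id} has no api_host`)
  }

  // metadata.provider_type and metadata.api_host become formats[0].
  const formats: FormatConfig[] = [{ format, base_url: api_host, default: true }]

  // metadata.anthropic_api_host becomes a second, non-default ANTHROPIC entry.
  if (typeof anthropic_api_host === 'string') {
    formats.push({ format: 'ANTHROPIC', base_url: anthropic_api_host })
  }

  return {
    id: legacy.id,
    name: legacy.name,
    formats,
    supported_endpoints: ['CHAT_COMPLETIONS'],
    metadata: rest
  }
}
```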

Special Cases

Replicate (per-model endpoints)

For providers where each model has a unique endpoint URL:

  • Provider defines formats: [{ format: "CUSTOM", base_url: "https://api.replicate.com", default: true }]
  • Model stores custom endpoint in metadata.custom_endpoint or similar field
  • Client code handles CUSTOM format by checking model metadata
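
A minimal sketch of that client-side check follows. The names ModelEntry and resolveBaseUrl are hypothetical, metadata.custom_endpoint is the field suggested above ("or similar"), and the model URL in the usage lines is a placeholder rather than a real Replicate endpoint.

```typescript
type ApiFormat = 'OPENAI' | 'ANTHROPIC' | 'GEMINI' | 'CUSTOM'

interface FormatConfig {
  format: ApiFormat
  base_url: string
  default?: boolean
}

// Hypothetical model record; only the fields used here are declared.
interface ModelEntry {
  id: string
  metadata?: { custom_endpoint?: string }
}

// For CUSTOM formats, prefer the per-model endpoint stored in model metadata;
// for every other format, the provider-level base_url applies.
function resolveBaseUrl(format: FormatConfig, model: ModelEntry): string {
  if (format.format !== 'CUSTOM') {
    return format.base_url
  }
  const endpoint = model.metadata?.custom_endpoint
  if (!endpoint) {
    throw new Error(`model ${model.id} has no custom_endpoint`)
  }
  return endpoint
}

const replicateFormat: FormatConfig = {
  format: 'CUSTOM',
  base_url: 'https://api.replicate.com',
  default: true
}

// Placeholder model with a made-up endpoint URL.
const exampleModel: ModelEntry = {
  id: 'example-model',
  metadata: { custom_endpoint: 'https://api.replicate.com/placeholder/example-model' }
}

console.log(resolveBaseUrl(replicateFormat, exampleModel))
```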

Future: Multiple Endpoint Types

When providers add support for embeddings, images, etc.:

  • Simply update supported_endpoints array
  • Client code maps endpoint_type + format to correct API path
    • Example: EMBEDDINGS + OPENAI → {base_url}/v1/embeddings
    • Example: CHAT_COMPLETIONS + ANTHROPIC → {base_url}/v1/messages
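
The endpoint_type + format mapping could be sketched as a lookup table. The names ENDPOINT_PATHS and buildEndpointUrl are hypothetical; the table contains only the paths named in this document, so any other combination is treated as unsupported here.

```typescript
type ApiFormat = 'OPENAI' | 'ANTHROPIC' | 'GEMINI' | 'CUSTOM'
type EndpointType = 'CHAT_COMPLETIONS' | 'EMBEDDINGS' | 'IMAGE_GENERATION'

// Path table limited to the combinations this document mentions.
const ENDPOINT_PATHS: Partial<Record<ApiFormat, Partial<Record<EndpointType, string>>>> = {
  OPENAI: {
    CHAT_COMPLETIONS: '/v1/chat/completions',
    EMBEDDINGS: '/v1/embeddings',
    IMAGE_GENERATION: '/v1/images/generations'
  },
  ANTHROPIC: {
    CHAT_COMPLETIONS: '/v1/messages'
  }
}

// Join a format's base_url with the path for the requested endpoint type.
function buildEndpointUrl(baseUrl: string, format: ApiFormat, endpoint: EndpointType): string {
  const path = ENDPOINT_PATHS[format]?.[endpoint]
  if (!path) {
    throw new Error(`no known path for ${endpoint} with format ${format}`)
  }
  // Strip trailing slashes so the result never contains "//" before the path.
  return baseUrl.replace(/\/+$/, '') + path
}

console.log(buildEndpointUrl('https://api.openai.com', 'OPENAI', 'EMBEDDINGS'))
console.log(buildEndpointUrl('https://api.deepseek.com/anthropic', 'ANTHROPIC', 'CHAT_COMPLETIONS'))
```

Keeping the table in client code rather than in the provider data matches the schema's intent that a provider entry carries only the base_url per format.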