Provider Model Synchronization Guide
This guide explains how to use the provider model synchronization system to automatically fetch and update model catalogs from provider APIs.
Overview
The synchronization system consists of three main components:
- Provider API Configuration (models_api in providers.json)
- Web UI Sync Button (manual sync per provider)
- Batch Sync Script (automated sync for all providers)
Provider API Configuration
Schema
Each provider can have a models_api configuration:
{
  "id": "openrouter",
  "models_api": {
    "endpoints": [
      {
        "url": "https://openrouter.ai/api/v1/models",
        "endpoint_type": "CHAT_COMPLETIONS",
        "format": "OPENAI",
        "transformer": "openrouter"
      }
    ],
    "enabled": true,
    "update_frequency": "realtime",
    "last_synced": "2025-01-15T10:30:00.000Z"
  }
}
Fields
- endpoints: Array of API endpoints to fetch models from
  - url: Full API endpoint URL
  - endpoint_type: Type of models (CHAT_COMPLETIONS, EMBEDDINGS, etc.)
  - format: API format (OPENAI, ANTHROPIC, GEMINI)
  - transformer: Optional custom transformer name (openrouter, aihubmix)
- enabled: Whether sync is enabled for this provider
- update_frequency: Suggested sync frequency
  - realtime: Aggregators that change frequently (OpenRouter, AIHubMix)
  - daily: Most official providers
  - weekly: Stable providers
  - manual: Manual sync only
- last_synced: ISO timestamp of last successful sync (auto-updated)
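The fields above can be written down as TypeScript types, together with a small helper that decides whether a provider is due for another sync. This is a minimal sketch, not the project's actual schema: the type names, the `isSyncDue` helper, and the interval chosen for each `update_frequency` value are assumptions made for illustration.

```typescript
// Field names mirror the schema above; type names and intervals are assumptions.
type UpdateFrequency = 'realtime' | 'daily' | 'weekly' | 'manual'

interface ModelsApiEndpoint {
  url: string
  endpoint_type: string // e.g. CHAT_COMPLETIONS, EMBEDDINGS
  format: string // e.g. OPENAI, ANTHROPIC, GEMINI
  transformer?: string // optional, e.g. openrouter, aihubmix
}

interface ModelsApiConfig {
  endpoints: ModelsApiEndpoint[]
  enabled: boolean
  update_frequency: UpdateFrequency
  last_synced?: string // ISO timestamp, absent before the first sync
}

const HOUR_MS = 60 * 60 * 1000

// Hypothetical minimum age of last_synced before another sync is suggested.
const MIN_INTERVAL_MS: Record<UpdateFrequency, number> = {
  realtime: 0,
  daily: 24 * HOUR_MS,
  weekly: 7 * 24 * HOUR_MS,
  manual: Number.POSITIVE_INFINITY,
}

// Returns true when an automated job should sync this provider now.
function isSyncDue(config: ModelsApiConfig, now: Date = new Date()): boolean {
  if (!config.enabled) return false
  if (config.update_frequency === 'manual') return false
  if (!config.last_synced) return true
  const elapsed = now.getTime() - new Date(config.last_synced).getTime()
  return elapsed >= MIN_INTERVAL_MS[config.update_frequency]
}
```

With these assumed intervals, a `daily` provider last synced at 2025-01-15T10:30:00.000Z is not due at 20:00 the same day and becomes due again at 10:30 the next day.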
Setup
Environment Variables
Most providers require API keys to list their models. Configure your API keys:
- Copy the example file:
  cd packages/catalog
  cp .env.example .env
- Edit .env and add your API keys:
  # Official Providers
  OPENAI_API_KEY=sk-...
  GROQ_API_KEY=gsk_...
  TOGETHER_API_KEY=...
  # China Aggregators
  DEEPSEEK_API_KEY=...
  SILICON_API_KEY=...
- Keep .env secure:
  - Never commit .env to git (already in .gitignore)
  - Use different keys for development and production
  - Rotate keys periodically
API Key Format
Each provider has a corresponding environment variable:
| Provider ID | Environment Variable | Example Format |
|---|---|---|
| openai | OPENAI_API_KEY | sk-... |
| groq | GROQ_API_KEY | gsk_... |
| deepseek | DEEPSEEK_API_KEY | sk-... |
| silicon | SILICON_API_KEY | sk-... |
| together | TOGETHER_API_KEY | ... |
| mistral | MISTRAL_API_KEY | ... |
| perplexity | PERPLEXITY_API_KEY | pplx-... |
See .env.example for the complete list.
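Every row in the table follows the same pattern: the provider ID in upper case followed by `_API_KEY`. The sketch below derives the variable name from that pattern and reports which providers have a key configured. Both function names are hypothetical, and the naming rule is inferred from the table rather than taken from the sync script, so treat `.env.example` as the source of truth for any provider that deviates.

```typescript
// Assumed convention: upper-cased provider ID plus "_API_KEY".
// Characters outside A-Z and 0-9 are replaced with "_" (also an assumption).
function apiKeyEnvVar(providerId: string): string {
  return `${providerId.toUpperCase().replace(/[^A-Z0-9]+/g, '_')}_API_KEY`
}

// Splits provider IDs into those with and without a non-empty key in `env`.
function checkApiKeys(
  providerIds: string[],
  env: Record<string, string | undefined>,
): { found: string[]; missing: string[] } {
  const found: string[] = []
  const missing: string[] = []
  for (const id of providerIds) {
    const value = env[apiKeyEnvVar(id)]
    if (value !== undefined && value.trim() !== '') {
      found.push(id)
    } else {
      missing.push(id)
    }
  }
  return { found, missing }
}
```

In a Node.js script the second argument would typically be `process.env` after the `.env` file has been loaded.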
Usage
Method 1: Web UI (Per Provider)
- Open the provider management page (/providers)
- Find a provider with models_api enabled
- Click the Sync button in the Actions column
- Wait for the sync to complete (toast notification will show progress)
- Review the statistics (fetched, new models, overrides)
Features:
- Real-time progress feedback
- Detailed statistics
- Manual trigger control
- Per-provider sync
Use Cases:
- Testing new provider configurations
- Emergency updates for specific providers
- Validating API changes
Method 2: Batch Sync Script (All Providers)
Run the batch sync script to sync all providers at once:
cd packages/catalog
npm run sync:all
Features:
- Syncs all providers with models_api.enabled = true
- Skips OpenRouter and AIHubMix (use dedicated import scripts)
- Adds delays to avoid rate limiting
- Comprehensive progress logging
- Summary statistics
Use Cases:
- Scheduled updates (cron jobs, CI/CD)
- Initial bulk import
- Regular maintenance updates
Output Example:
============================================================
Batch Provider Model Sync
============================================================
Loading data files...
Loaded:
- 51 providers
- 604 models
- 120 overrides
Providers to sync: 49
Skipping: openrouter, aihubmix (authoritative sources)
API Keys Status:
✓ Found: 12
✗ Missing: 37
Providers without API keys (will likely fail):
- cherryin (env: CHERRYIN_API_KEY)
- silicon (env: SILICON_API_KEY)
...
To configure API keys:
1. Copy .env.example to .env
2. Fill in your API keys
3. Re-run this script
[deepseek] Syncing models...
- Fetching from https://api.deepseek.com/v1/models
✓ Fetched 3 models
+ Adding 1 new models to models.json
+ Generated 2 new overrides
...
============================================================
Sync Summary
============================================================
Total providers: 49
✓ Successful: 47
✗ Failed: 2
Statistics:
- Total models fetched: 520
- New models added: 45
- Overrides generated: 178
- Overrides merged: 12
✓ Batch sync completed
============================================================
How It Works
Data Flow
Provider API → Transformer → ModelConfig
                                  ↓
                      Compare with models.json
                                  ↓
               ┌──────────────────┴─────────────────┐
               ↓                                    ↓
           New Model                         Existing Model
               ↓                                    ↓
      Add to models.json                    Generate Override
                                                    ↓
                                           Merge with existing
                                                    ↓
                                         Save to overrides.json
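The "Compare with models.json" step in the flow splits the transformed models into two groups. A minimal sketch of that split follows, assuming models are matched by ID without regard to case; the `ModelLike` type and the `partitionModels` name are hypothetical and not taken from the codebase.

```typescript
// Minimal shape needed for the comparison; real ModelConfig objects carry more fields.
interface ModelLike {
  id: string
}

// Splits fetched models into ones missing from the base catalog and ones already present.
// IDs are compared in lower case (an assumption about the matching rule).
function partitionModels<T extends ModelLike>(
  fetched: T[],
  baseModels: ModelLike[],
): { newModels: T[]; existingModels: T[] } {
  const knownIds = new Set(baseModels.map((m) => m.id.toLowerCase()))
  const newModels: T[] = []
  const existingModels: T[] = []
  for (const model of fetched) {
    if (knownIds.has(model.id.toLowerCase())) {
      existingModels.push(model) // goes on to override generation
    } else {
      newModels.push(model) // gets added to models.json
    }
  }
  return { newModels, existingModels }
}
```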
Override Generation
The system automatically generates overrides for all models supported by a provider, even if identical to the base model. This serves two purposes:
- Provider Support Tracking: Mark which providers support which models
- Difference Recording: Record any differences from the base model
Override Types:
- Empty Override (identical models):
  { "provider_id": "groq", "model_id": "llama-3.1-8b", "priority": 0 }
  This marks that the provider supports the model with no differences.
- Override with Differences:
  {
    "provider_id": "provider-x",
    "model_id": "gpt-4",
    "priority": 0,
    "pricing": {
      "input": { "per_million_tokens": 5.0, "currency": "USD" },
      "output": { "per_million_tokens": 15.0, "currency": "USD" }
    },
    "limits": { "context_window": 32000 }
  }
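Both override types can come out of a single comparison routine: start from the empty override and attach only the fields that differ from the base model. The sketch below is one way to do that under stated assumptions. The list of compared fields, the `generateOverride` and `deepEqual` names, and the choice to copy a differing field whole (rather than only its changed sub-keys) are all illustrative, not the system's actual logic.

```typescript
interface Override {
  provider_id: string
  model_id: string
  priority: number
  [field: string]: unknown
}

// Hypothetical list of fields that are compared against the base model.
const COMPARED_FIELDS = ['pricing', 'limits', 'capabilities']

// Structural equality for plain JSON values.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) return false
  if (Array.isArray(a) !== Array.isArray(b)) return false
  const aRec = a as Record<string, unknown>
  const bRec = b as Record<string, unknown>
  const aKeys = Object.keys(aRec)
  if (aKeys.length !== Object.keys(bRec).length) return false
  return aKeys.every((k) => k in bRec && deepEqual(aRec[k], bRec[k]))
}

// Builds an auto-generated override (priority 0) for one provider/model pair.
function generateOverride(
  providerId: string,
  modelId: string,
  base: Record<string, unknown>,
  fromProvider: Record<string, unknown>,
): Override {
  const override: Override = { provider_id: providerId, model_id: modelId, priority: 0 }
  for (const field of COMPARED_FIELDS) {
    const value = fromProvider[field]
    // Simplification: a differing field is copied whole, not diffed key by key.
    if (value !== undefined && !deepEqual(value, base[field])) {
      override[field] = value
    }
  }
  return override
}
```

When the provider's data matches the base model, the result contains only `provider_id`, `model_id`, and `priority`, which is the empty override described above.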
Priority System:
- priority < 100: Auto-generated overrides (replaced on sync)
- priority >= 100: Manual overrides (preserved during sync)
Merge Strategy
When syncing:
- New Models: Added directly to models.json
- Existing Models with Differences: Override created/updated in overrides.json
- Manual Overrides: Preserved (priority >= 100)
- Auto Overrides: Replaced with latest data (priority < 100)
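The priority rule can be sketched as a merge keyed on the provider/model pair: a freshly generated override replaces an existing one unless the existing entry has `priority >= 100`. The `mergeOverrides` name, the lower-cased key, and the decision to keep existing entries that the sync did not regenerate are assumptions of this sketch, not confirmed behavior.

```typescript
interface Override {
  provider_id: string
  model_id: string
  priority: number
  [field: string]: unknown
}

// Threshold taken from the priority rule above.
const MANUAL_PRIORITY = 100

// One override per provider/model pair; the key format is an assumption.
function overrideKey(o: Override): string {
  return `${o.provider_id}::${o.model_id}`.toLowerCase()
}

// Merges freshly generated overrides into the existing list.
function mergeOverrides(existing: Override[], generated: Override[]): Override[] {
  const merged = new Map<string, Override>()
  for (const o of existing) {
    merged.set(overrideKey(o), o)
  }
  for (const o of generated) {
    const current = merged.get(overrideKey(o))
    if (current !== undefined && current.priority >= MANUAL_PRIORITY) {
      continue // manual override: keep it untouched
    }
    merged.set(overrideKey(o), o) // new entry, or replaces an auto-generated one
  }
  // Existing entries that were not regenerated are kept as they are in this sketch.
  return Array.from(merged.values())
}
```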
Transformers
Built-in Transformers
- OpenAI-compatible (default): Standard OpenAI API format
  - Used by most providers (deepseek, groq, together, etc.)
  - Handles { data: [...] } responses
  - Basic capability inference
- OpenRouter: Custom transformer for OpenRouter aggregator
  - Normalizes model IDs to lowercase
  - Extracts provider from model ID format (openai/gpt-4)
  - Advanced capability inference from supported_parameters
  - Pricing conversion (per-token → per-million)
- AIHubMix: Custom transformer for AIHubMix aggregator
  - Normalizes model IDs to lowercase
  - Parses CSV fields (types, features, input_modalities)
  - Capability mapping (thinking → REASONING, etc.)
  - Provider extraction from model ID
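Several of the transformations listed above are small pure functions. The following sketch shows how they might look; it is not the code of the built-in transformers. All function names are hypothetical, accepting per-token prices as either strings or numbers is an assumption about the API payload, and the capability map contains only the one pair the list mentions.

```typescript
// Lowercase normalization, as described for the OpenRouter and AIHubMix transformers.
function normalizeModelId(id: string): string {
  return id.trim().toLowerCase()
}

// "openai/gpt-4" yields "openai"; IDs without a provider prefix yield undefined.
function extractProvider(modelId: string): string | undefined {
  const slash = modelId.indexOf('/')
  return slash > 0 ? modelId.slice(0, slash).toLowerCase() : undefined
}

// Converts a per-token price to a per-million-token price.
// Rounds to 6 decimals to hide floating-point noise.
function perTokenToPerMillion(perToken: string | number): number {
  return Number((Number(perToken) * 1_000_000).toFixed(6))
}

// Parses a comma-separated field such as "thinking, tools" into clean lower-case tokens.
function parseCsvField(value: string | undefined): string[] {
  return (value ?? '')
    .split(',')
    .map((part) => part.trim().toLowerCase())
    .filter((part) => part.length > 0)
}

// Only the thinking → REASONING pair comes from this guide; extend as needed.
const CAPABILITY_MAP: Record<string, string> = {
  thinking: 'REASONING',
}

// Maps provider feature names to internal capability names, dropping unknown ones.
function mapCapabilities(features: string[]): string[] {
  return features
    .map((feature) => CAPABILITY_MAP[feature])
    .filter((capability): capability is string => capability !== undefined)
}
```

For example, `perTokenToPerMillion('0.000005')` evaluates to `5`, and `mapCapabilities(parseCsvField('thinking, tools'))` evaluates to `['REASONING']` because `tools` has no entry in the map.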
Adding Custom Transformers
To add a custom transformer:
- Create src/utils/importers/{provider}/transformer.ts
- Implement ITransformer interface
- Update sync endpoint to use your transformer
- Add transformer name to provider config
Example:
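import type { ModelConfig } from '../../../schemas'
import type { ITransformer } from '../base/base-transformer'

// Shape of one entry in the provider's API response (hypothetical; adjust to the real payload)
interface CustomModel {
  id: string
}

export class CustomTransformer implements ITransformer<CustomModel> {
  // Extract the raw model list from the API response
  extractModels(response: any): CustomModel[] {
    return Array.isArray(response?.data) ? response.data : []
  }

  // Transform one API model to the internal format
  transform(apiModel: CustomModel): ModelConfig {
    // Placeholder mapping: fill in the remaining fields your ModelConfig schema requires
    return { id: apiModel.id.toLowerCase() } as ModelConfig
  }
}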
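The step "Update sync endpoint to use your transformer" amounts to mapping the `transformer` name in the provider config to an implementation and falling back to the OpenAI-compatible default when no name is set. The sketch below illustrates that lookup with locally defined stand-ins: the `ModelConfig` and `ITransformer` shapes here are simplified assumptions, and `registerTransformer`, `resolveTransformer`, and `runTransformer` are hypothetical helpers rather than functions from the codebase.

```typescript
// Simplified stand-ins for the project's types (assumptions for this sketch).
interface ModelConfig {
  id: string
}

interface ITransformer<T> {
  extractModels(response: unknown): T[]
  transform(apiModel: T): ModelConfig
}

interface OpenAIModel {
  id: string
}

// Default transformer: reads the { data: [...] } list of an OpenAI-style response.
class OpenAICompatibleTransformer implements ITransformer<OpenAIModel> {
  extractModels(response: unknown): OpenAIModel[] {
    const data =
      typeof response === 'object' && response !== null
        ? (response as { data?: unknown }).data
        : undefined
    return Array.isArray(data) ? (data as OpenAIModel[]) : []
  }

  transform(apiModel: OpenAIModel): ModelConfig {
    return { id: apiModel.id }
  }
}

const DEFAULT_TRANSFORMER = new OpenAICompatibleTransformer()
const registry = new Map<string, ITransformer<any>>()

// Makes a custom transformer available under the name used in the provider config.
function registerTransformer(name: string, transformer: ITransformer<any>): void {
  registry.set(name.toLowerCase(), transformer)
}

// Resolves the configured name; no name means the OpenAI-compatible default.
function resolveTransformer(name?: string): ITransformer<any> {
  if (name === undefined || name === '') return DEFAULT_TRANSFORMER
  const transformer = registry.get(name.toLowerCase())
  if (transformer === undefined) {
    throw new Error(`Unknown transformer: ${name}`)
  }
  return transformer
}

// Runs extraction and transformation over one API response.
function runTransformer(transformer: ITransformer<any>, response: unknown): ModelConfig[] {
  return transformer.extractModels(response).map((model) => transformer.transform(model))
}
```

Throwing on an unknown name, instead of silently using the default, is a design choice of this sketch: it surfaces typos in the provider config early.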
Best Practices
1. Authoritative Sources
OpenRouter and AIHubMix are treated as authoritative sources because:
- They aggregate models from multiple providers
- They have custom transformers with advanced logic
- They should be imported using dedicated scripts:
  npm run import:openrouter
  npm run import:aihubmix
2. Sync Frequency
Recommended sync frequencies:
| Provider Type | Frequency | Reason |
|---|---|---|
| Aggregators | Daily | Models change frequently |
| Official APIs | Weekly | Stable, infrequent updates |
| Beta/Experimental | Manual | May have unstable APIs |
3. API Keys
Most providers require API keys for model listing:
For Batch Script:
- Configure in .env file (see Setup section above)
- Script will automatically use the appropriate key for each provider
- Missing keys will trigger warnings but won't stop the sync
For Web UI:
- Currently uses same .env file (server-side)
- Future enhancement: API key input field in UI
4. Rate Limiting
The batch script includes:
- 1-second delay between providers
- Error handling to continue on failures
- Retry logic (future enhancement)
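The delay and continue-on-failure behavior can be sketched as a sequential loop. This is an illustration under assumptions, not the batch script itself: `selectSyncTargets` and `syncAll` are hypothetical names, and `syncOne` stands in for whatever performs the actual per-provider sync.

```typescript
interface ProviderLike {
  id: string
  models_api?: { enabled: boolean }
}

// Aggregators that are imported by their dedicated scripts instead.
const SKIPPED_PROVIDERS = new Set(['openrouter', 'aihubmix'])

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Providers with sync enabled, minus the skipped aggregators.
function selectSyncTargets(providers: ProviderLike[]): ProviderLike[] {
  return providers.filter(
    (p) => p.models_api?.enabled === true && !SKIPPED_PROVIDERS.has(p.id),
  )
}

// Syncs providers one at a time, pausing between them and continuing on errors.
async function syncAll(
  providers: ProviderLike[],
  syncOne: (provider: ProviderLike) => Promise<void>, // hypothetical per-provider sync
  delayMs: number = 1000,
): Promise<{ successful: string[]; failed: string[] }> {
  const successful: string[] = []
  const failed: string[] = []
  let isFirst = true
  for (const provider of selectSyncTargets(providers)) {
    if (!isFirst) await sleep(delayMs) // delay between providers to avoid rate limits
    isFirst = false
    try {
      await syncOne(provider)
      successful.push(provider.id)
    } catch {
      failed.push(provider.id) // record the failure and move on to the next provider
    }
  }
  return { successful, failed }
}
```

Running the providers sequentially rather than with `Promise.all` keeps the request rate predictable, at the cost of a longer total run time.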
5. Manual Overrides
To create manual overrides that won't be replaced:
- Set priority >= 100 in overrides.json
- Add reason field to document why it's manual
- These will be preserved during sync
Example:
{
  "provider_id": "custom-provider",
  "model_id": "special-model",
  "priority": 100,
  "reason": "Custom pricing negotiated with provider",
  "pricing": {
    "input": { "per_million_tokens": 1.0, "currency": "USD" },
    "output": { "per_million_tokens": 2.0, "currency": "USD" }
  }
}
Troubleshooting
Provider Sync Fails
- Check if models_api.enabled = true
- Verify API endpoint URL is accessible
- Check if API key is required
- Review transformer compatibility
Models Not Appearing
- Check if model IDs are normalized to lowercase
- Verify transformer is extracting models correctly
- Check console logs for transformation errors
Overrides Not Generated
- Verify model exists in base models.json
- Check if differences actually exist (pricing, capabilities, etc.)
- Review merge strategy settings
Future Enhancements
- API key management in Web UI
- Scheduled sync (cron-style)
- Sync history and audit log
- Conflict resolution UI
- Retry logic with exponential backoff
- Webhook notifications
- Differential sync (only changed models)
- Provider-specific transformers registry