Model and Provider Configuration (models.yml)
This document describes how the coding-agent currently loads models, applies overrides, resolves credentials, and chooses models at runtime.
What controls model behavior
Primary implementation files:
- src/config/model-registry.ts: loads built-in + custom models, provider overrides, runtime discovery, auth integration
- src/config/model-resolver.ts: parses model patterns and selects initial/smol/slow models
- src/config/settings-schema.ts: model-related settings (modelRoles, provider transport preferences)
- src/session/auth-storage.ts: API key + OAuth resolution order
- packages/ai/src/models.ts and packages/ai/src/types.ts: built-in providers/models and Model/compat types
Config file location and legacy behavior
Default config path:
~/.xcsh/agent/models.yml
Legacy behavior still present:
- If models.yml is missing and models.json exists at the same location, it is migrated to models.yml.
- Explicit .json/.jsonc config paths are still supported when passed programmatically to ModelRegistry.
models.yml shape
    configVersion: 1 # optional; written by auto-config, used for migration detection
    providers:
      <provider-id>:
        # provider-level config
    equivalence:
      overrides:
        <provider-id>/<model-id>: <canonical-model-id>
      exclude:
        - <provider-id>/<model-id>

configVersion is an optional integer written by the auto-config system. When present, xcsh uses it to detect outdated configs and auto-upgrade them.
provider-id is the canonical provider key used across selection and auth lookup.
equivalence is optional and configures canonical model grouping on top of concrete provider models:
- overrides maps an exact concrete selector (provider/modelId) to an official upstream canonical id
- exclude opts a concrete selector out of canonical grouping
Provider-level fields
    providers:
      my-provider:
        baseUrl: https://api.example.com/v1
        apiKey: MY_PROVIDER_API_KEY
        api: openai-completions
        headers:
          X-Team: platform
        authHeader: true
        auth: apiKey
        discovery:
          type: ollama
        modelOverrides:
          some-model-id:
            name: Renamed model
        models:
          - id: some-model-id
            name: Some Model
            api: openai-completions
            reasoning: false
            input: [text]
            cost:
              input: 0
              output: 0
              cacheRead: 0
              cacheWrite: 0
            contextWindow: 128000
            maxTokens: 16384
            headers:
              X-Model: value
            compat:
              supportsStore: true
              supportsDeveloperRole: true
              supportsReasoningEffort: true
              maxTokensField: max_completion_tokens
              openRouterRouting:
                only: [anthropic]
              vercelGatewayRouting:
                order: [anthropic, openai]
              extraBody:
                gateway: m1-01
                controller: mlx

Allowed provider/model api values
- openai-completions
- openai-responses
- openai-codex-responses
- azure-openai-responses
- anthropic-messages
- google-generative-ai
- google-vertex
Allowed auth/discovery values
- auth: apiKey (default) or none
- discovery.type: ollama
Validation rules (current)
Full custom provider (models is non-empty)
Required:
- baseUrl
- apiKey unless auth: none
- api at provider level or each model
Override-only provider (models missing or empty)
Must define at least one of:
- baseUrl
- modelOverrides
- discovery
Discovery
discovery requires provider-level api.
Model value checks
- id required
- contextWindow and maxTokens must be positive if provided
Merge and override order
ModelRegistry pipeline (on refresh):
- Load built-in providers/models from @f5xc-salesdemos/pi-ai.
- Load models.yml custom config.
- Apply provider overrides (baseUrl, headers) to built-in models.
- Apply modelOverrides (per provider + model id).
- Merge custom models:
  - same provider + id replaces existing
  - otherwise append
- Apply runtime-discovered models (currently Ollama and LM Studio), then re-apply model overrides.
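The replace-or-append rule in the merge step can be illustrated with a short sketch. The ModelEntry shape and the mergeCustomModels name are hypothetical; the real registry works on richer model objects.

```typescript
// Hypothetical minimal model shape: only the fields needed to show the merge.
interface ModelEntry {
  provider: string;
  id: string;
  name?: string;
}

// Merge custom models into the existing list: a matching provider + id
// replaces the existing entry in place, anything else is appended.
function mergeCustomModels(existing: ModelEntry[], custom: ModelEntry[]): ModelEntry[] {
  const merged = [...existing];
  // JSON-encode the pair so ids that contain "/" cannot collide across providers.
  const keyOf = (m: ModelEntry): string => JSON.stringify([m.provider, m.id]);
  const indexByKey = new Map<string, number>();
  merged.forEach((m, i) => indexByKey.set(keyOf(m), i));

  for (const model of custom) {
    const at = indexByKey.get(keyOf(model));
    if (at !== undefined) {
      merged[at] = model;
    } else {
      indexByKey.set(keyOf(model), merged.length);
      merged.push(model);
    }
  }
  return merged;
}
```

Replacing in place keeps the existing registry order stable, which matters wherever ordering is later used as a tie-breaker.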
Canonical model equivalence and coalescing
The registry keeps every concrete provider model and then builds a canonical layer above them.
Canonical ids are official upstream ids only, for example:
- claude-opus-4-6
- claude-haiku-4-5
- gpt-5.3-codex
models.yml equivalence config
Example:
    providers:
      zenmux:
        baseUrl: https://api.zenmux.example/v1
        apiKey: ZENMUX_API_KEY
        api: openai-codex-responses
        models:
          - id: codex
            name: Zenmux Codex
            reasoning: true
            input: [text]
            cost:
              input: 0
              output: 0
              cacheRead: 0
              cacheWrite: 0
            contextWindow: 200000
            maxTokens: 32768

    equivalence:
      overrides:
        zenmux/codex: gpt-5.3-codex
        p-codex/codex: gpt-5.3-codex
      exclude:
        - demo/codex-preview

Build order for canonical grouping:
- exact user override from equivalence.overrides
- bundled official-id matches from built-in model metadata
- conservative heuristic normalization for gateway/provider variants
- fallback to the concrete model’s own id
Current heuristics are intentionally narrow:
- embedded upstream prefixes can be stripped when present, for example anthropic/... or openai/...
- dotted and dashed version variants can normalize only when they map to an existing official id, for example 4.6 -> 4-6
- ambiguous families or versions are not merged without a bundled match or explicit override
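The build order and the narrow heuristics above can be sketched roughly as follows. Everything here is an assumption made for illustration: the canonicalIdFor name, the bundled lookup table, the officialIds set, the two hard-coded prefixes, and the choice to check exclude before overrides are not taken from the registry source.

```typescript
// Hypothetical inputs: the equivalence block from models.yml, a bundled
// selector -> official id table, and the set of known official ids.
interface EquivalenceConfig {
  overrides?: Record<string, string>;
  exclude?: string[];
}

function canonicalIdFor(
  provider: string,
  modelId: string,
  equivalence: EquivalenceConfig,
  bundled: Map<string, string>,
  officialIds: Set<string>,
): string {
  const selector = `${provider}/${modelId}`;

  // Assumption: excluded selectors opt out before anything else is tried.
  if (equivalence.exclude?.includes(selector)) return modelId;

  // 1. Exact user override.
  const override = equivalence.overrides?.[selector];
  if (override !== undefined) return override;

  // 2. Bundled official-id match.
  const known = bundled.get(selector);
  if (known !== undefined) return known;

  // 3. Narrow heuristics: strip an embedded upstream prefix, then try
  //    dotted -> dashed versions. Only existing official ids are accepted.
  const stripped = modelId.replace(/^(anthropic|openai)\//, "");
  if (officialIds.has(stripped)) return stripped;
  const dashed = stripped.replace(/(\d)\.(\d)/g, "$1-$2");
  if (officialIds.has(dashed)) return dashed;

  // 4. Fallback to the concrete model's own id.
  return modelId;
}
```

Because every heuristic result is checked against the official id set, an unknown family or version falls through to its own id rather than being merged by guesswork.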
Canonical resolution behavior
When multiple concrete variants share a canonical id, resolution uses:
- availability and auth
- config.yml modelProviderOrder
- existing registry/provider order if modelProviderOrder is unset
Disabled or unauthenticated providers are skipped.
Session state and transcripts continue to record the concrete provider/model that actually executed the turn.
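A minimal sketch of picking a concrete variant for a canonical id, following the rules above. The Variant shape and pickVariant name are hypothetical, and ranking providers that are missing from modelProviderOrder after the listed ones is an assumption of this sketch.

```typescript
interface Variant {
  provider: string;
  modelId: string;
  available: boolean; // enabled and either keyless or with resolvable auth
}

// Variants are assumed to arrive in registry order; providerOrder mirrors
// the modelProviderOrder setting and may be empty.
function pickVariant(variants: Variant[], providerOrder: string[] = []): Variant | undefined {
  const rank = (v: Variant): number => {
    const i = providerOrder.indexOf(v.provider);
    return i === -1 ? providerOrder.length : i; // unlisted providers rank last
  };
  return variants
    .filter((v) => v.available)
    .map((v, index) => ({ v, index }))
    // The index tie-break preserves registry order among equal ranks.
    .sort((a, b) => rank(a.v) - rank(b.v) || a.index - b.index)
    .map((entry) => entry.v)[0];
}
```

The returned variant is always a concrete provider/model pair, which matches the note that transcripts record the model that actually ran.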
Provider defaults vs per-model overrides:
- Provider headers are baseline.
- Model headers override provider header keys.
- modelOverrides can override model metadata (name, reasoning, input, cost, contextWindow, maxTokens, headers, compat, contextPromotionTarget).
- compat is deep-merged for nested routing blocks (openRouterRouting, vercelGatewayRouting, extraBody).
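The header layering and the compat deep merge described above might look like the sketch below. The function names are hypothetical, and replacing arrays wholesale (rather than concatenating them) is an assumption; the source only states that nested routing blocks are deep-merged.

```typescript
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `patch` into `base`. Nested objects are merged key by key;
// arrays and scalars from `patch` replace the base value (an assumption).
function deepMerge(base: Plain, patch: Plain): Plain {
  const out: Plain = { ...base };
  for (const [key, value] of Object.entries(patch)) {
    const current = out[key];
    out[key] = isPlainObject(current) && isPlainObject(value) ? deepMerge(current, value) : value;
  }
  return out;
}

// Provider headers are the baseline; model headers win per key.
function effectiveHeaders(
  provider: Record<string, string> = {},
  model: Record<string, string> = {},
): Record<string, string> {
  return { ...provider, ...model };
}
```

With a deep merge, an override that sets only openRouterRouting.order leaves an existing openRouterRouting.only in place instead of wiping the whole block.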
Runtime discovery integration
Implicit Ollama discovery
If ollama is not explicitly configured, the registry adds an implicit discoverable provider:
- provider: ollama
- api: openai-completions
- base URL: OLLAMA_BASE_URL or http://127.0.0.1:11434
- auth mode: keyless (auth: none behavior)
Runtime discovery calls GET /api/tags on Ollama and synthesizes model entries with local defaults.
Implicit llama.cpp discovery
If llama.cpp is not explicitly configured, the registry adds an implicit discoverable provider:
- provider: llama.cpp
- api: openai-responses
- base URL: LLAMA_CPP_BASE_URL or http://127.0.0.1:8080
- auth mode: keyless (auth: none behavior)
Note: this provider uses the newer openai-responses API rather than openai-completions.
Runtime discovery calls GET models on llama.cpp and synthesizes model entries with local defaults.
Implicit LM Studio discovery
If lm-studio is not explicitly configured, the registry adds an implicit discoverable provider:
- provider: lm-studio
- api: openai-completions
- base URL: LM_STUDIO_BASE_URL or http://127.0.0.1:1234/v1
- auth mode: keyless (auth: none behavior)
Runtime discovery fetches models (GET /models) and synthesizes model entries with local defaults.
Explicit provider discovery
You can configure discovery yourself:
    providers:
      ollama:
        baseUrl: http://127.0.0.1:11434
        api: openai-completions
        auth: none
        discovery:
          type: ollama

      llama.cpp:
        baseUrl: http://127.0.0.1:8080
        api: openai-responses
        auth: none
        discovery:
          type: llama.cpp

Extension provider registration
Extensions can register providers at runtime (pi.registerProvider(...)), including:
- model replacement/append for a provider
- custom stream handler registration for new API IDs
- custom OAuth provider registration
Auth and API key resolution order
When requesting a key for a provider, effective order is:
- Runtime override (CLI --api-key)
- Stored API key credential in agent.db
- Stored OAuth credential in agent.db (with refresh)
- Environment variable mapping (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
- ModelRegistry fallback resolver (provider apiKey from models.yml, env-name-or-literal semantics)
models.yml apiKey behavior:
- Value is first treated as an environment variable name.
- If no env var exists, the literal string is used as the token.
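The resolution order and the env-name-or-literal rule can be sketched together. This is a simplified illustration, not the logic in auth-storage.ts: the KeySources shape and both function names are hypothetical, OAuth refresh is assumed to have happened already, and treating an empty environment variable as unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// models.yml apiKey semantics: environment variable name first, literal second.
// `env` is passed in explicitly; in Node it would typically be process.env.
function resolveConfiguredApiKey(value: string, env: Env): string {
  const fromEnv = env[value];
  return fromEnv !== undefined && fromEnv !== "" ? fromEnv : value;
}

// Hypothetical inputs for one provider lookup; each field stands in for one source.
interface KeySources {
  runtimeOverride?: string;  // CLI --api-key
  storedApiKey?: string;     // API key credential from agent.db
  storedOAuthToken?: string; // OAuth credential from agent.db, already refreshed
  envVarName?: string;       // for example OPENAI_API_KEY
  configuredApiKey?: string; // provider apiKey from models.yml
}

function resolveApiKey(sources: KeySources, env: Env): string | undefined {
  const fromEnvMapping = sources.envVarName ? env[sources.envVarName] : undefined;
  const fromConfig = sources.configuredApiKey
    ? resolveConfiguredApiKey(sources.configuredApiKey, env)
    : undefined;
  // The first defined, non-empty source wins.
  return [
    sources.runtimeOverride,
    sources.storedApiKey,
    sources.storedOAuthToken,
    fromEnvMapping,
    fromConfig,
  ].find((candidate) => candidate !== undefined && candidate !== "");
}
```

One consequence of the literal fallback is that a misspelled environment variable name in models.yml is sent as the token itself, so an authentication failure from the provider is the likely symptom.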
If authHeader: true and provider apiKey is set, models get:
- Authorization: Bearer <resolved-key> header injected.
Keyless providers:
- Providers marked auth: none are treated as available without credentials.
- getApiKey* returns kNoAuth for them.
Model availability vs all models
- getAll() returns the loaded model registry (built-in + merged custom + discovered).
- getAvailable() filters to models that are keyless or have resolvable auth.
So a model can exist in registry but not be selectable until auth is available.
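The difference between the two views amounts to a filter over the full list. In this sketch the RegistryModel shape, the filterAvailable name, and the two predicate callbacks are hypothetical stand-ins for the registry's own lookups.

```typescript
interface RegistryModel {
  provider: string;
  id: string;
}

// A model is available when its provider is keyless or has resolvable auth.
function filterAvailable(
  all: RegistryModel[],
  isKeyless: (provider: string) => boolean,
  hasResolvableAuth: (provider: string) => boolean,
): RegistryModel[] {
  return all.filter((m) => isKeyless(m.provider) || hasResolvableAuth(m.provider));
}
```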
Runtime model resolution
CLI and pattern parsing
model-resolver.ts supports:
- exact provider/modelId
- exact canonical model id
- exact model id (provider inferred)
- fuzzy/substring matching
- glob scope patterns in --models (e.g. openai/*, *sonnet*)
- optional :thinkingLevel suffix (off|minimal|low|medium|high|xhigh)
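Splitting the optional thinking-level suffix off a pattern could be done along these lines. The parseModelPattern name and ParsedPattern shape are hypothetical; only the six level names come from the list above.

```typescript
const THINKING_LEVELS = ["off", "minimal", "low", "medium", "high", "xhigh"] as const;
type ThinkingLevel = (typeof THINKING_LEVELS)[number];

interface ParsedPattern {
  pattern: string;
  thinkingLevel?: ThinkingLevel;
}

// Split an optional `:thinkingLevel` suffix off a model pattern. Only a known
// level counts as a suffix, so ids that themselves contain colons stay intact.
function parseModelPattern(input: string): ParsedPattern {
  const colon = input.lastIndexOf(":");
  if (colon > 0) {
    const suffix = input.slice(colon + 1);
    const level = THINKING_LEVELS.find((candidate) => candidate === suffix);
    if (level !== undefined) {
      return { pattern: input.slice(0, colon), thinkingLevel: level };
    }
  }
  return { pattern: input };
}
```

Matching the suffix against the known levels, rather than splitting on the last colon unconditionally, avoids misreading tags such as a local model id that ends in :8b.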
--provider is legacy; --model is preferred.
Resolution precedence for exact selectors:
- exact provider/modelId bypasses coalescing
- exact canonical id resolves through the canonical index
- exact bare concrete id still works
- fuzzy and glob matching run after the exact paths
Initial model selection priority
findInitialModel(...) uses this order:
- explicit CLI provider+model
- first scoped model (if not resuming)
- saved default provider/model
- known provider defaults (e.g. OpenAI/Anthropic/etc.) among available models
- first available model
Role aliases and settings
Supported model roles:
default, smol, slow, plan, commit
Role aliases like pi/smol expand through settings.modelRoles. Each role value can also append a thinking selector such as :minimal, :low, :medium, or :high.
If a role points at another role, the target model still inherits normally and any explicit suffix on the referring role wins for that role-specific use.
Related settings:
- modelRoles (record)
- enabledModels (scoped pattern list)
- modelProviderOrder (global canonical-provider precedence)
- providers.kimiApiFormat (openai or anthropic request format)
- providers.openaiWebsockets (auto|off|on websocket preference for OpenAI Codex transport)
modelRoles may store either:
- provider/modelId to pin a concrete provider variant
- a canonical id such as gpt-5.3-codex to allow provider coalescing
For enabledModels and CLI --models:
- exact canonical ids expand to all concrete variants in that canonical group
- explicit provider/modelId entries stay exact
- globs and fuzzy matches still operate on concrete models
/model and --list-models
Both surfaces keep provider-prefixed models visible and selectable.
They now also expose canonical/coalesced models:
- /model includes a canonical view alongside provider tabs
- --list-models prints a canonical section plus the concrete provider rows
Selecting a canonical entry stores the canonical selector. Selecting a provider row stores the explicit provider/modelId.
Context promotion (model-level fallback chains)
Context promotion is an overflow recovery mechanism for small-context variants (for example *-spark) that automatically promotes to a larger-context sibling when the API rejects a request with a context length error.
Trigger and order
When a turn fails with a context overflow error (e.g. context_length_exceeded), AgentSession attempts promotion before falling back to compaction:
- If contextPromotion.enabled is true, resolve a promotion target (see below).
- If a target is found, switch to it and retry the request; no compaction is needed.
- If no target is available, fall through to auto-compaction on the current model.
Target selection
Selection is model-driven, not role-driven:
- currentModel.contextPromotionTarget (if configured)
- smallest larger-context model on the same provider + API
Candidates are ignored unless credentials resolve (ModelRegistry.getApiKey(...)).
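Target selection can be sketched as below. The PromotableModel shape and findPromotionTarget name are hypothetical, hasCredentials stands in for the ModelRegistry.getApiKey(...) check, and falling through to the second rule when an explicit target is missing or lacks credentials is an assumption of this sketch.

```typescript
interface PromotableModel {
  provider: string;
  id: string;
  api: string;
  contextWindow: number;
  contextPromotionTarget?: string; // "provider/model-id" or a bare "model-id"
}

function findPromotionTarget(
  current: PromotableModel,
  all: PromotableModel[],
  hasCredentials: (model: PromotableModel) => boolean,
): PromotableModel | undefined {
  const explicit = current.contextPromotionTarget;
  if (explicit !== undefined) {
    // A bare id is resolved within the current provider.
    const target = all.find(
      (m) =>
        `${m.provider}/${m.id}` === explicit ||
        (m.provider === current.provider && m.id === explicit),
    );
    if (target !== undefined && hasCredentials(target)) return target;
  }
  // Otherwise: the smallest larger-context model on the same provider + API.
  return all
    .filter(
      (m) =>
        m.provider === current.provider &&
        m.api === current.api &&
        m.contextWindow > current.contextWindow &&
        hasCredentials(m),
    )
    .sort((a, b) => a.contextWindow - b.contextWindow)[0];
}
```

Choosing the smallest larger window keeps the promoted model as close as possible to the one the user picked while still clearing the overflow.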
OpenAI Codex websocket handoff
If switching from/to openai-codex-responses, the session provider state key openai-codex-responses is closed before the model switch. This drops websocket transport state so the next turn starts clean on the promoted model.
Persistence behavior
Promotion uses temporary switching (setModelTemporary):
- recorded as a temporary model_change in session history
model_changein session history - does not rewrite saved role mapping
Configuring explicit fallback chains
Configure fallback directly in model metadata via contextPromotionTarget.
contextPromotionTarget accepts either:
- provider/model-id (explicit)
- model-id (resolved within current provider)
Example (models.yml) for Spark -> non-Spark on the same provider:
    providers:
      openai-codex:
        modelOverrides:
          gpt-5.3-codex-spark:
            contextPromotionTarget: openai-codex/gpt-5.3-codex

The built-in model generator also assigns this automatically for *-spark models when a same-provider base model exists.
Compatibility and routing fields
models.yml supports this compat subset:
- supportsStore
- supportsDeveloperRole
- supportsReasoningEffort
- maxTokensField (max_completion_tokens or max_tokens)
- openRouterRouting.only / openRouterRouting.order
- vercelGatewayRouting.only / vercelGatewayRouting.order
These are consumed by the OpenAI-completions transport logic and combined with URL-based auto-detection.
Practical examples
Local OpenAI-compatible endpoint (no auth)
    providers:
      local-openai:
        baseUrl: http://127.0.0.1:8000/v1
        auth: none
        api: openai-completions
        models:
          - id: Qwen/Qwen2.5-Coder-32B-Instruct
            name: Qwen 2.5 Coder 32B (local)

Hosted proxy with env-based key
    providers:
      anthropic-proxy:
        baseUrl: https://proxy.example.com/anthropic
        apiKey: ANTHROPIC_PROXY_API_KEY
        api: anthropic-messages
        authHeader: true
        models:
          - id: claude-sonnet-4-20250514
            name: Claude Sonnet 4 (Proxy)
            reasoning: true
            input: [text, image]

Override built-in provider route + model metadata
    providers:
      openrouter:
        baseUrl: https://my-proxy.example.com/v1
        headers:
          X-Team: platform
        modelOverrides:
          anthropic/claude-sonnet-4:
            name: Sonnet 4 (Corp)
            compat:
              openRouterRouting:
                only: [anthropic]

LiteLLM proxy auto-configuration
When both LITELLM_BASE_URL and LITELLM_API_KEY environment variables are set, xcsh automatically manages models.yml configuration for the LiteLLM proxy.
First-run auto-generation
If models.yml does not exist and LiteLLM env vars are detected, xcsh generates it automatically:
    # Auto-generated by xcsh for LiteLLM proxy
    # API key resolved from LITELLM_API_KEY env var at runtime
    configVersion: 1
    providers:
      anthropic:
        baseUrl: "https://your-litellm-proxy.example.com/anthropic"
        apiKey: LITELLM_API_KEY

A default config.yml is also generated with sensible image provider settings.
Startup self-healing
On every startup, startupHealthCheck() in the model registry runs the following checks:
| Condition | Action |
|---|---|
| models.yml missing | Auto-generate from env vars |
| models.yml corrupt or unparseable | Backup to .bak, regenerate |
| baseUrl doesn’t match LITELLM_BASE_URL | Backup to .bak, regenerate with new URL |
| configVersion missing or outdated | Backup to .bak, regenerate with current version |
| Config is healthy | No action |
All repairs create .bak backups before overwriting. All operations are idempotent.
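The table can be read as a decision function over what the check has learned about the file. The sketch below is an assumption-laden illustration rather than the body of startupHealthCheck(): the ConfigState shape, the HealthAction type, and the decideHealthAction name are hypothetical, and baseUrlMatches is a precomputed flag because the exact URL comparison is not specified here.

```typescript
type HealthAction =
  | { kind: "none" }
  | { kind: "generate" }
  | { kind: "regenerate"; backup: true; reason: string };

// Hypothetical snapshot of the current models.yml state.
interface ConfigState {
  exists: boolean;
  parseable: boolean;
  baseUrlMatches: boolean; // whether baseUrl agrees with LITELLM_BASE_URL
  configVersion?: number;
}

function decideHealthAction(state: ConfigState, currentVersion: number): HealthAction {
  const regenerate = (reason: string): HealthAction => ({ kind: "regenerate", backup: true, reason });

  if (!state.exists) return { kind: "generate" };
  if (!state.parseable) return regenerate("corrupt or unparseable");
  if (!state.baseUrlMatches) return regenerate("baseUrl differs from LITELLM_BASE_URL");
  if (state.configVersion === undefined || state.configVersion < currentVersion) {
    return regenerate("configVersion missing or outdated");
  }
  return { kind: "none" };
}
```

Because a regenerated file is parseable, matches the URL, and carries the current version, running the same decision again yields no action, which is what makes the repair idempotent.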
CLI command
    xcsh setup litellm                 # Generate or fix LiteLLM config
    xcsh setup litellm --check         # Validate without writing
    xcsh setup litellm --check --json  # Machine-readable validation output

Required environment variables
| Variable | Purpose |
|---|---|
| LITELLM_BASE_URL | LiteLLM proxy URL (e.g. https://your-proxy.example.com). Must start with http:// or https://. |
| LITELLM_API_KEY | API key for the proxy. Referenced by name in generated config, resolved at runtime. |
If either variable is unset, auto-configuration is silently skipped.
Config versioning
Generated configs include a configVersion field. When the generated format changes in future releases, xcsh detects outdated configs and automatically upgrades them (with backup).
Legacy consumer caveat
Most model configuration now flows through models.yml via ModelRegistry.
One notable legacy path remains: web-search Anthropic auth resolution still reads ~/.xcsh/agent/models.json directly in src/web/search/auth.ts.
If you rely on that specific path, keep JSON compatibility in mind until that module is migrated.
Failure mode
If models.yml fails schema or validation checks:
- If LITELLM_BASE_URL and LITELLM_API_KEY are set, the startup health check attempts auto-repair (backup corrupt file, regenerate from env vars). If repair succeeds, the registry reloads the fixed config.
- If auto-repair is not possible (env vars unset, write failure), the registry keeps operating with built-in models.
- Error is exposed via ModelRegistry.getError() and surfaced in UI/notifications.