Kit supports four AI providers through a unified interface. Providers are routed through the Vercel AI Gateway using `provider/model` strings, so switching providers requires only changing an environment variable. This page covers provider setup, model configuration, the default-model catalog, Gateway failover, and capabilities.

Quick Setup
With a single API key in `apps/boilerplate/.env.local`, both LLM Chat and RAG Chat work immediately. No `AI_PROVIDER` or `AI_MODEL` configuration is needed; Kit uses sensible defaults.

Available Providers
- API Key: `OPENAI_API_KEY`
- Default Model: `gpt-5-nano`
- Base URL Override: `OPENAI_BASE_URL`

| Model | Context | Input (per 1M tokens) | Output (per 1M tokens) | Best For |
|---|---|---|---|---|
| `gpt-5.2` | 400K | $1.75 | $14.00 | Latest flagship reasoning |
| `gpt-5` | 400K | $1.25 | $10.00 | Coding and agentic tasks |
| `gpt-5-mini` | 400K | $0.25 | $2.00 | Well-defined tasks |
| `gpt-5-nano` | 400K | $0.05 | $0.40 | Default: fast, cheapest |
| `gpt-4.1` | 1M | $2.00 | $8.00 | Complex tasks, coding |
| `gpt-4.1-mini` | 1M | $0.40 | $1.60 | Efficient coding |
| `o3` | 200K | $2.00 | $8.00 | Deep reasoning |
| `o4-mini` | 200K | $1.10 | $4.40 | Reasoning on a budget |
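The per-million-token prices in the table translate into a per-request cost with simple arithmetic. The following is a minimal sketch, not Kit's implementation: `PRICING`, `ModelPricing`, and `estimateCostUsd` are hypothetical names, and only three rows of the table are copied in.

```typescript
// Hypothetical pricing excerpt copied from the table above (USD per 1M tokens).
interface ModelPricing {
  inputPerMillionUsd: number
  outputPerMillionUsd: number
}

const PRICING = new Map<string, ModelPricing>([
  ['gpt-5-nano', { inputPerMillionUsd: 0.05, outputPerMillionUsd: 0.4 }],
  ['gpt-5-mini', { inputPerMillionUsd: 0.25, outputPerMillionUsd: 2.0 }],
  ['gpt-4.1', { inputPerMillionUsd: 2.0, outputPerMillionUsd: 8.0 }],
])

// Estimate the cost of one request from its input and output token counts.
function estimateCostUsd(
  modelId: string,
  inputTokens: number,
  outputTokens: number
): number {
  const pricing = PRICING.get(modelId)
  if (!pricing) {
    throw new Error(`No pricing entry for model "${modelId}"`)
  }
  return (
    (inputTokens * pricing.inputPerMillionUsd +
      outputTokens * pricing.outputPerMillionUsd) /
    1e6
  )
}

// 10,000 input tokens and 2,000 output tokens on the default model
const nanoCost = estimateCostUsd('gpt-5-nano', 10000, 2000)
```

For reasoning models, keep in mind that internal reasoning tokens count toward the output side of this estimate.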
OpenAI is required for RAG embeddings (`text-embedding-3-small`), even when using another provider for chat.

Model Selection
You select a model with a `provider/model` string (e.g. `anthropic/claude-haiku-4-5-20251001`). The Gateway routes it to the matching provider; switching providers needs no code change. When no model is given, the per-provider default from `DEFAULT_MODELS` is used. To set the default provider, use `AI_PROVIDER` (defaults to `anthropic`):

```bash
# Make Anthropic the default provider for unqualified requests
AI_PROVIDER=anthropic
```
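The selection rules above can be sketched as a small resolver. This is an illustration under stated assumptions, not Kit's code: `resolveModelString` is a hypothetical helper, the `xai` provider ID is assumed, and the `DEFAULT_MODELS` excerpt contains only the OpenAI default stated on this page plus an assumed Anthropic entry that reuses the example model ID.

```typescript
type AIProvider = 'openai' | 'anthropic' | 'google' | 'xai'

// Hypothetical excerpt: the OpenAI default comes from this page; the
// Anthropic entry reuses the example model ID and is an assumption.
const DEFAULT_MODELS: Partial<Record<AIProvider, string>> = {
  openai: 'gpt-5-nano',
  anthropic: 'claude-haiku-4-5-20251001',
}

function resolveModelString(
  requested: string | undefined,
  env: Record<string, string | undefined>
): string {
  // A fully qualified provider/model string is passed through unchanged.
  if (requested && requested.includes('/')) return requested

  // Otherwise fall back to AI_PROVIDER, which defaults to anthropic.
  const provider = (env['AI_PROVIDER'] ?? 'anthropic') as AIProvider
  const modelId = requested ?? DEFAULT_MODELS[provider]
  if (!modelId) {
    throw new Error(`No default model known for provider "${provider}"`)
  }
  return `${provider}/${modelId}`
}

const unconfigured = resolveModelString(undefined, {})
const openAiDefault = resolveModelString(undefined, { AI_PROVIDER: 'openai' })
```

Throwing on a missing default is a design choice of this sketch; a real registry would presumably have an entry for every supported provider.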
Gateway Routing
Kit routes every chat request through the Vercel AI Gateway. Instead of constructing per-provider SDK clients, the service passes a plain `provider/model` routing string to `streamText` / `generateText`. The model registry (`model-registry.ts`) is the single source of truth for pricing, reasoning flags, and the routing-string builder:

```typescript
// src/lib/ai/model-registry.ts: Gateway routing string
export function toGatewayModelString(
  provider: AIProvider,
  modelId: string
): string {
  return `${provider}/${modelId}`
}

// src/lib/ai/ai-service.ts: routing a request through the Gateway
const provider = resolveProviderForModel(modelId, this._provider)
const gatewayModel = toGatewayModelString(provider, modelId)
const result = streamText({
  model: gatewayModel, // e.g. 'anthropic/claude-haiku-4-5-20251001'
  messages,
  ...buildGatewaySettings({ modelId, isReasoning /* ... */ }),
})
```
Key behaviors:
- One API key for all providers: the SDK reads `AI_GATEWAY_API_KEY` automatically; no per-provider client construction
- Provider resolved from the model: `resolveProviderForModel` looks up the model in the registry and falls back to the env-configured provider for unknown models
- String routing: the Gateway accepts `provider/model` directly, so no provider import is required
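The registry lookup with an env-configured fallback can be sketched as follows. The `MODEL_PROVIDERS` map is a hypothetical stand-in for the registry, and the function bodies are assumptions that only mirror the behavior described above; Kit's actual implementation may differ.

```typescript
type AIProvider = 'openai' | 'anthropic' | 'google' | 'xai'

// Hypothetical registry excerpt: bare model ID -> owning provider.
const MODEL_PROVIDERS = new Map<string, AIProvider>([
  ['gpt-5-nano', 'openai'],
  ['gpt-4.1', 'openai'],
  ['claude-haiku-4-5-20251001', 'anthropic'],
])

// Known models resolve to their registered provider; unknown models
// fall back to the provider configured in the environment.
function resolveProviderForModel(
  modelId: string,
  fallback: AIProvider
): AIProvider {
  return MODEL_PROVIDERS.get(modelId) ?? fallback
}

function toGatewayModelString(provider: AIProvider, modelId: string): string {
  return `${provider}/${modelId}`
}

const known = toGatewayModelString(
  resolveProviderForModel('gpt-4.1', 'anthropic'),
  'gpt-4.1'
)
const unknown = toGatewayModelString(
  resolveProviderForModel('my-custom-model', 'anthropic'),
  'my-custom-model'
)
```

Here `known` routes to OpenAI even though the fallback provider is Anthropic, while `unknown` inherits the fallback.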
Gateway Fallback
When the primary model is unavailable, the Gateway can fail over to alternative models. This is configured via the optional `AI_GATEWAY_FALLBACK_MODELS` environment variable (a comma-separated `provider/model` list). When set, Kit attaches the list to `providerOptions.gateway.models`:

```bash
# Optional: comma-separated provider/model fallbacks
AI_GATEWAY_FALLBACK_MODELS=openai/gpt-5,google/gemini-2.5-flash
```

```typescript
// src/lib/ai/gateway.ts: attaching fallback models (only when configured)
const fallbackModels = getFallbackModels()
return {
  maxOutputTokens: maxTokens ?? (isReasoning ? 16384 : defaultMaxTokens),
  ...(fallbackModels.length > 0 && {
    providerOptions: { gateway: { models: fallbackModels } },
  }),
}
```
Provider failover is handled server-side by the Gateway, not by a client-side scoring algorithm. If no fallback models are configured, there is no automatic provider switch — the primary model is used directly.
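How the comma-separated variable might be turned into a list is not shown on this page. A minimal sketch, assuming whitespace is trimmed and entries that are not in `provider/model` form are dropped; `parseFallbackModels` and `fallbackOptions` are hypothetical helpers, not Kit's `getFallbackModels`:

```typescript
// Parse a comma-separated provider/model list, skipping blank or
// malformed entries (anything without a provider prefix and a model).
function parseFallbackModels(raw: string | undefined): string[] {
  return (raw ?? '')
    .split(',')
    .map((entry) => entry.trim())
    .filter((entry) => /^[^/\s]+\/\S+$/.test(entry))
}

// Build the providerOptions fragment only when at least one fallback exists.
function fallbackOptions(
  raw: string | undefined
): { providerOptions?: { gateway: { models: string[] } } } {
  const models = parseFallbackModels(raw)
  return models.length > 0 ? { providerOptions: { gateway: { models } } } : {}
}

const parsed = parseFallbackModels('openai/gpt-5, google/gemini-2.5-flash')
const none = fallbackOptions(undefined)
```

Silently dropping malformed entries is a design choice of the sketch; logging them would make misconfiguration easier to spot.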
Provider Capabilities
Not all providers support all features. The capability matrix is a reference for choosing a model and configuring Gateway fallbacks:
| Capability | OpenAI | Anthropic | Google | xAI |
|---|---|---|---|---|
| Streaming | Yes | Yes | Yes | Yes |
| Functions/Tools | Yes | Yes | Yes | Yes |
| Vision | Yes | Yes | Yes | No |
| Embeddings | Yes | No | Yes | No |
| System Messages | Yes | Yes | Yes | Yes |
| Max Context | 1M | 200K (1M beta) | 1M | 2M |
RAG search uses OpenAI's embedding model (configurable via `AI_EMBEDDING_MODEL`, default: `text-embedding-3-small`), regardless of the active chat provider. Ensure `OPENAI_API_KEY` is set (or `AI_API_KEY` as fallback) if you use the RAG system, even with Anthropic or Google as the primary provider.

Reasoning Model Handling
GPT-5 family and o-series models are reasoning models with special parameter constraints. The `isReasoningModel` flag on `ModelInfo` controls runtime behavior:

| Model | Reasoning | Temperature | Default maxTokens |
|---|---|---|---|
| GPT-5, GPT-5.2, GPT-5 Mini, GPT-5 Nano | Yes | Unsupported | 16,384 |
| o3, o4-mini | Yes | Unsupported | 16,384 |
| GPT-4.1 | No | 0.7 (default) | 1,000 |
Key constraints:
- Temperature is unsupported: passing it to reasoning models triggers an SDK warning and may degrade behavior. Kit omits `temperature` entirely for reasoning models.
- `maxOutputTokens` covers both internal reasoning and visible output: reasoning models use most of the token budget for internal chain-of-thought. The default of 1,000 is far too low; Kit uses 16,384 for reasoning models.
- Symptom of too-low `maxOutputTokens`: `finishReason: 'length'` with 0 output tokens, which results in an empty response to the user.
This logic is centralized in `buildGatewaySettings` (`gateway.ts`), the single place where the v6 setting names (`maxTokens` → `maxOutputTokens`) and the reasoning-safety rules live:

```typescript
// Reasoning-aware parameter handling (gateway.ts → buildGatewaySettings)
const isReasoning = isReasoningModel(modelId)
return {
  ...(isReasoning ? {} : { temperature: temperature ?? defaultTemperature }),
  maxOutputTokens: maxTokens ?? (isReasoning ? 16384 : defaultMaxTokens),
}
```
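To see the effect of these rules on concrete models, here is a self-contained sketch of the same idea. It is an approximation rather than the real `buildGatewaySettings`: `REASONING_MODELS` and `sketchGatewaySettings` are hypothetical, the model IDs are taken from the reasoning table, and the non-reasoning defaults (0.7 and 1,000) are copied from the GPT-4.1 row.

```typescript
// Hypothetical stand-in for the registry's isReasoningModel flag.
const REASONING_MODELS = new Set([
  'gpt-5',
  'gpt-5.2',
  'gpt-5-mini',
  'gpt-5-nano',
  'o3',
  'o4-mini',
])

interface SettingsInput {
  modelId: string
  temperature?: number
  maxTokens?: number
}

interface CallSettings {
  temperature?: number
  maxOutputTokens: number
}

function sketchGatewaySettings(input: SettingsInput): CallSettings {
  const isReasoning = REASONING_MODELS.has(input.modelId)
  return {
    // Reasoning models never receive a temperature, even if one was requested.
    ...(isReasoning ? {} : { temperature: input.temperature ?? 0.7 }),
    // An explicit maxTokens wins; otherwise reasoning models get the larger budget.
    maxOutputTokens: input.maxTokens ?? (isReasoning ? 16384 : 1000),
  }
}

const reasoning = sketchGatewaySettings({ modelId: 'o3', temperature: 0.2 })
const standard = sketchGatewaySettings({ modelId: 'gpt-4.1' })
```

In this sketch `reasoning` has no `temperature` key at all, which is different from setting it to `undefined` when the object is spread into a call.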
Streaming Response Diagnostics
`streamText()` from the Vercel AI SDK returns lazy Promises that resolve after the stream ends. Kit reads these for diagnostics:

| Property | Type | Purpose |
|---|---|---|
| `result.finishReason` | `Promise<string>` | Why the stream ended (`'stop'`, `'length'`, `'content-filter'`) |
| `result.usage` | `Promise<object>` | Token counts (`inputTokens`, `outputTokens`) |
| `result.warnings` | `Warning[]` | Unsupported parameters, model deprecations |

finishReason values:

| Value | Meaning | Action |
|---|---|---|
| `'stop'` | Normal completion | None |
| `'length'` | Token budget exhausted | Increase `maxTokens` (sent as `maxOutputTokens`) |
| `'content-filter'` | Provider blocked the response | Review content policy |
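A minimal sketch of turning these values into a log message, assuming the caller has already awaited `finishReason` and counted content chunks while streaming. `StreamDiagnostics` and `describeStreamIssue` are hypothetical names, not part of Kit or the AI SDK.

```typescript
interface StreamDiagnostics {
  finishReason: string
  contentChunks: number // number of text chunks actually streamed to the user
}

// Returns a warning message, or null when the stream finished cleanly.
function describeStreamIssue(d: StreamDiagnostics): string | null {
  if (d.finishReason === 'length') {
    return d.contentChunks === 0
      ? 'Token budget exhausted before any visible output; raise the output token limit'
      : 'Response truncated by the token budget'
  }
  if (d.finishReason !== 'stop') {
    return `Stream ended with finish reason "${d.finishReason}"`
  }
  if (d.contentChunks === 0) {
    return 'Stream completed normally but produced no content'
  }
  return null
}

const clean = describeStreamIssue({ finishReason: 'stop', contentChunks: 12 })
const starved = describeStreamIssue({ finishReason: 'length', contentChunks: 0 })
```

The `starved` case corresponds to a reasoning model that spent its whole budget on internal reasoning.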
Kit logs a warning when the stream completes with 0 content chunks or a non-`'stop'` finish reason. This diagnostic data is essential for debugging empty AI responses.

Custom Base URLs
Each provider supports a custom base URL for proxies, self-hosted models, or alternative endpoints:
| Variable | Default | Purpose |
|---|---|---|
| `OPENAI_BASE_URL` | `https://api.openai.com/v1` | OpenAI API proxy or compatible endpoint |
| `ANTHROPIC_BASE_URL` | `https://api.anthropic.com` | Anthropic API proxy |
| `GOOGLE_AI_BASE_URL` | `https://generativelanguage.googleapis.com/v1` | Google AI proxy |
| `XAI_BASE_URL` | `https://api.x.ai/v1` | xAI API proxy |
| `OPENAI_ORG_ID` | n/a | OpenAI organization ID for billing |
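Resolving an override against these defaults could look like the sketch below. `BASE_URL_DEFAULTS` and `resolveBaseUrl` are hypothetical; the defaults are copied from the table, and stripping trailing slashes is an assumption made here so that path joining stays predictable.

```typescript
// Defaults copied from the table above.
const BASE_URL_DEFAULTS = {
  OPENAI_BASE_URL: 'https://api.openai.com/v1',
  ANTHROPIC_BASE_URL: 'https://api.anthropic.com',
  GOOGLE_AI_BASE_URL: 'https://generativelanguage.googleapis.com/v1',
  XAI_BASE_URL: 'https://api.x.ai/v1',
} as const

type BaseUrlVariable = keyof typeof BASE_URL_DEFAULTS

// Use the env override when it is set and non-blank, else the default.
function resolveBaseUrl(
  variable: BaseUrlVariable,
  env: Record<string, string | undefined>
): string {
  const override = (env[variable] ?? '').trim()
  const url = override.length > 0 ? override : BASE_URL_DEFAULTS[variable]
  return url.replace(/\/+$/, '') // normalize: drop trailing slashes
}

const defaultOpenAi = resolveBaseUrl('OPENAI_BASE_URL', {})
const proxied = resolveBaseUrl('OPENAI_BASE_URL', {
  OPENAI_BASE_URL: 'https://proxy.example.com/v1/',
})
```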
Error Handling and Retries
Retries are handled by the Vercel AI SDK. Both `streamText` and `generateText` accept a `maxRetries` setting (default: 2) and apply exponential backoff with jitter automatically. Failed requests on transient errors (network timeouts, 5xx responses) are retried; client errors (400, 401, 403) fail immediately.

```typescript
// Retries are configured per call via the AI SDK
streamText({
  model: gatewayModel,
  messages,
  maxRetries: 2, // default; set to 0 to disable
})
```
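For intuition about what "exponential backoff with jitter" means, here is a conceptual sketch. It is not the SDK's internal implementation: the "full jitter" variant, the 2,000 ms base delay, and both function names are assumptions chosen for illustration.

```typescript
// Per the text above: 5xx responses are treated as transient,
// while client errors such as 400, 401, and 403 fail immediately.
function isRetryableStatus(status: number): boolean {
  return status >= 500 && status <= 599
}

// "Full jitter" backoff: a random delay in [0, baseMs * 2^attempt).
// baseMs and the injectable random source are illustrative parameters.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 2000,
  random: () => number = Math.random
): number {
  const ceiling = baseMs * Math.pow(2, attempt)
  return Math.floor(random() * ceiling)
}

// Upper bounds of the delay windows for two retries (attempts 0 and 1).
const ceilings = [0, 1].map((attempt) => 2000 * Math.pow(2, attempt))
```

Randomizing the delay spreads out retries from many clients so they do not hit a recovering provider at the same instant.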
Kit's structured error classes (`src/lib/ai/errors.ts`) wrap provider failures with retryability metadata:

| Error Class | When | Retryable |
|---|---|---|
| `AIProviderError` | Base class for all provider errors | Varies |
| `NetworkError` | Connection failures, DNS errors | Yes |
| `TimeoutError` | Request exceeds timeout | Yes |
| `ValidationError` | Invalid config or request | No |
| `InvalidProviderError` | Unknown provider string | No |
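One plausible shape for such a hierarchy is sketched below. The class names come from the table, but the `retryable` field, the constructor signatures, the assumption that every class extends `AIProviderError`, and the `shouldRetry` helper are all hypothetical; the real `errors.ts` may differ.

```typescript
class AIProviderError extends Error {
  readonly retryable: boolean

  constructor(message: string, retryable: boolean = false) {
    super(message)
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype)
    this.name = new.target.name
    this.retryable = retryable
  }
}

class NetworkError extends AIProviderError {
  constructor(message: string) {
    super(message, true)
  }
}

class TimeoutError extends AIProviderError {
  constructor(message: string) {
    super(message, true)
  }
}

class ValidationError extends AIProviderError {
  constructor(message: string) {
    super(message, false)
  }
}

class InvalidProviderError extends AIProviderError {
  constructor(provider: string) {
    super(`Unknown provider "${provider}"`, false)
  }
}

// Only structured errors that are marked retryable should be retried.
function shouldRetry(error: unknown): boolean {
  return error instanceof AIProviderError && error.retryable
}
```

A caller can then branch on `shouldRetry(error)` in a `catch` block instead of matching on error messages.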