AI Providers

Configure Anthropic, OpenAI, Google, and xAI — model catalog, Gateway routing, fallback chains, and provider capabilities

Kit supports four AI providers through a unified interface. Requests are routed through the Vercel AI Gateway using provider/model strings, so switching providers requires only changing an environment variable. This page covers provider setup, model configuration, the default-model catalog, Gateway failover, and capabilities.

Quick Setup

Available Providers

API Key: OPENAI_API_KEY
Default Model: gpt-5-nano
Base URL Override: OPENAI_BASE_URL

| Model | Context | Input (per 1M) | Output (per 1M) | Best For |
| --- | --- | --- | --- | --- |
| gpt-5.2 | 400K | $1.75 | $14.00 | Latest flagship reasoning |
| gpt-5 | 400K | $1.25 | $10.00 | Coding and agentic tasks |
| gpt-5-mini | 400K | $0.25 | $2.00 | Well-defined tasks |
| gpt-5-nano | 400K | $0.05 | $0.40 | Default: fast, cheapest |
| gpt-4.1 | 1M | $2.00 | $8.00 | Complex tasks, coding |
| gpt-4.1-mini | 1M | $0.40 | $1.60 | Efficient coding |
| o3 | 200K | $2.00 | $8.00 | Deep reasoning |
| o4-mini | 200K | $1.10 | $4.40 | Reasoning on a budget |

OpenAI is required for RAG embeddings (text-embedding-3-small), even when using another provider for chat.

Model Selection

You select a model with a provider/model string (e.g. anthropic/claude-haiku-4-5-20251001). The Gateway routes it to the matching provider; switching providers needs no code change. When no model is given, the per-provider default from DEFAULT_MODELS is used. To set the default provider, use AI_PROVIDER (defaults to anthropic):
bash
# Make Anthropic the default provider for unqualified requests
AI_PROVIDER=anthropic
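
The selection rule above can be sketched as a small resolver. This is an illustrative sketch, not Kit's implementation: `resolveModelString` is a hypothetical helper, and only the OpenAI entry in `DEFAULT_MODELS` is stated on this page; the other three entries are placeholders.

```typescript
type AIProvider = 'anthropic' | 'openai' | 'google' | 'xai'

const PROVIDERS: readonly string[] = ['anthropic', 'openai', 'google', 'xai']

// Illustrative defaults. Only the OpenAI entry is stated on this page;
// the other three are placeholders, not Kit's actual DEFAULT_MODELS.
const DEFAULT_MODELS: Record<AIProvider, string> = {
  anthropic: 'claude-haiku-4-5-20251001',
  openai: 'gpt-5-nano',
  google: 'gemini-2.5-flash',
  xai: 'grok-4',
}

function isProvider(value: string): value is AIProvider {
  return PROVIDERS.includes(value)
}

// Hypothetical helper: turn an optional model request into a provider/model string.
function resolveModelString(
  requested: string | undefined,
  env: Record<string, string | undefined>
): string {
  // A fully qualified string already names its provider.
  if (requested !== undefined && requested.includes('/')) return requested

  // AI_PROVIDER defaults to anthropic when unset.
  const configured = env.AI_PROVIDER ?? 'anthropic'
  if (!isProvider(configured)) {
    throw new Error(`Unknown AI_PROVIDER: ${configured}`)
  }
  // Simplification: a bare model ID is prefixed with the configured provider.
  return `${configured}/${requested ?? DEFAULT_MODELS[configured]}`
}

console.log(resolveModelString(undefined, { AI_PROVIDER: 'openai' })) // openai/gpt-5-nano
```

Passing the environment as an argument keeps the helper easy to test; in an application it would typically receive `process.env`.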

Gateway Routing

Kit routes every chat request through the Vercel AI Gateway. Instead of constructing per-provider SDK clients, the service passes a plain provider/model routing string to streamText / generateText. The model registry (model-registry.ts) is the single source of truth for pricing, reasoning flags, and the routing-string builder:
typescript
// src/lib/ai/model-registry.ts — Gateway routing string
export function toGatewayModelString(
  provider: AIProvider,
  modelId: string
): string {
  return `${provider}/${modelId}`
}

// src/lib/ai/ai-service.ts — routing a request through the Gateway
const provider = resolveProviderForModel(modelId, this._provider)
const gatewayModel = toGatewayModelString(provider, modelId)
const result = streamText({
  model: gatewayModel, // e.g. 'anthropic/claude-haiku-4-5-20251001'
  messages,
  ...buildGatewaySettings({ modelId, isReasoning /* ... */ }),
})
Key behaviors:
  • One API key for all providers: the SDK reads AI_GATEWAY_API_KEY automatically, so no per-provider client construction is needed
  • Provider resolved from the model: resolveProviderForModel looks up the model in the registry and falls back to the env-configured provider for unknown models
  • String routing: the Gateway accepts provider/model directly, so no provider import is required
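
A minimal sketch of the registry lookup and its fallback follows. The `MODEL_PROVIDERS` map is an assumed, simplified stand-in for the real registry (which also carries pricing and reasoning flags), and `my-custom-model` is a made-up ID used only to show the fallback path.

```typescript
type AIProvider = 'anthropic' | 'openai' | 'google' | 'xai'

// Assumed registry shape: model ID -> owning provider.
const MODEL_PROVIDERS = new Map<string, AIProvider>([
  ['gpt-5', 'openai'],
  ['gpt-5-nano', 'openai'],
  ['claude-haiku-4-5-20251001', 'anthropic'],
  ['gemini-2.5-flash', 'google'],
])

// Known models route to their own provider; unknown ones use the configured default.
function resolveProviderForModel(modelId: string, fallback: AIProvider): AIProvider {
  return MODEL_PROVIDERS.get(modelId) ?? fallback
}

function toGatewayModelString(provider: AIProvider, modelId: string): string {
  return `${provider}/${modelId}`
}

const knownModel = 'gpt-5'
const unknownModel = 'my-custom-model' // hypothetical ID, not in the registry

console.log(toGatewayModelString(resolveProviderForModel(knownModel, 'anthropic'), knownModel))
// openai/gpt-5
console.log(toGatewayModelString(resolveProviderForModel(unknownModel, 'anthropic'), unknownModel))
// anthropic/my-custom-model
```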

Gateway Fallback

When the primary model is unavailable, the Gateway can fail over to alternative models. This is configured via the optional AI_GATEWAY_FALLBACK_MODELS environment variable (a comma-separated provider/model list). When set, Kit attaches the list to providerOptions.gateway.models:
bash
# Optional: comma-separated provider/model fallbacks
AI_GATEWAY_FALLBACK_MODELS=openai/gpt-5,google/gemini-2.5-flash
typescript
// src/lib/ai/gateway.ts — attaching fallback models (only when configured)
const fallbackModels = getFallbackModels()
return {
  maxOutputTokens: maxTokens ?? (isReasoning ? 16384 : defaultMaxTokens),
  ...(fallbackModels.length > 0 && {
    providerOptions: { gateway: { models: fallbackModels } },
  }),
}
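
One way the comma-separated variable could be parsed is sketched below. `parseFallbackModels` and `withFallbacks` are hypothetical names; Kit's actual `getFallbackModels` may validate entries differently.

```typescript
// Hypothetical parser for the AI_GATEWAY_FALLBACK_MODELS value.
function parseFallbackModels(raw: string | undefined): string[] {
  if (!raw) return []
  return raw
    .split(',')
    .map((entry) => entry.trim())
    // Keep only entries shaped like provider/model with both parts present.
    .filter((entry) => /^[^/\s]+\/\S+$/.test(entry))
}

// Attach the list only when at least one valid fallback is configured.
function withFallbacks(raw: string | undefined) {
  const models = parseFallbackModels(raw)
  return models.length > 0 ? { providerOptions: { gateway: { models } } } : {}
}

console.log(parseFallbackModels('openai/gpt-5, google/gemini-2.5-flash,'))
// [ 'openai/gpt-5', 'google/gemini-2.5-flash' ]
console.log(withFallbacks(undefined))
// {}
```

Dropping malformed entries silently is a design choice made for this sketch; a stricter variant could throw so that typos in the variable surface at startup.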

Provider Capabilities

Not all providers support all features. The capability matrix is a reference for choosing a model and configuring Gateway fallbacks:
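| Capability | OpenAI | Anthropic | Google | xAI |
| --- | --- | --- | --- | --- |
| Streaming | Yes | Yes | Yes | Yes |
| Functions/Tools | Yes | Yes | Yes | Yes |
| Vision | Yes | Yes | Yes | No |
| Embeddings | Yes | No | Yes | No |
| System Messages | Yes | Yes | Yes | Yes |
| Max Context | 1M | 200K (1M beta) | 1M | 2M |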
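
As an illustration of using the matrix when choosing fallbacks, the sketch below encodes the Yes/No rows as data and filters a provider/model list by a required capability. The `CAPABILITIES` map and `filterByCapability` are hypothetical and not part of Kit; the Max Context row is left out.

```typescript
type Capability = 'streaming' | 'tools' | 'vision' | 'embeddings' | 'systemMessages'

// Values copied from the matrix above; the data structure itself is an assumption.
const CAPABILITIES = new Map<string, Record<Capability, boolean>>([
  ['openai', { streaming: true, tools: true, vision: true, embeddings: true, systemMessages: true }],
  ['anthropic', { streaming: true, tools: true, vision: true, embeddings: false, systemMessages: true }],
  ['google', { streaming: true, tools: true, vision: true, embeddings: true, systemMessages: true }],
  ['xai', { streaming: true, tools: true, vision: false, embeddings: false, systemMessages: true }],
])

// Drop entries whose provider is unknown or lacks the required capability.
function filterByCapability(models: string[], required: Capability): string[] {
  return models.filter((model) => {
    const provider = model.split('/')[0] ?? ''
    return CAPABILITIES.get(provider)?.[required] ?? false
  })
}

const candidates = [
  'openai/gpt-5',
  'anthropic/claude-haiku-4-5-20251001',
  'google/gemini-2.5-flash',
]

console.log(filterByCapability(candidates, 'embeddings'))
// [ 'openai/gpt-5', 'google/gemini-2.5-flash' ]
```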

Reasoning Model Handling

The GPT-5 family and the o-series models are reasoning models with special parameter constraints. The isReasoningModel flag on ModelInfo controls runtime behavior:
| Model | Reasoning | Temperature | Default maxTokens |
| --- | --- | --- | --- |
| GPT-5, GPT-5.2, GPT-5 Mini, GPT-5 Nano | Yes | Unsupported | 16,384 |
| o3, o4-mini | Yes | Unsupported | 16,384 |
| GPT-4.1 | No | 0.7 (default) | 1,000 |
Key constraints:
  • Temperature is unsupported — passing it to reasoning models triggers an SDK warning and may degrade behavior. Kit omits temperature entirely for reasoning models.
  • maxOutputTokens covers BOTH internal reasoning AND visible output — reasoning models use most of the token budget for internal chain-of-thought. The default of 1,000 is far too low; Kit uses 16,384 for reasoning models.
  • Symptom of too-low maxOutputTokens: finishReason: 'length' with 0 output tokens → empty response to the user.
This logic is centralized in buildGatewaySettings (gateway.ts), the single place where the AI SDK v6 setting names (maxTokens → maxOutputTokens) and the reasoning-safety rules live:
typescript
// Reasoning-aware parameter handling (gateway.ts → buildGatewaySettings)
const isReasoning = isReasoningModel(modelId)
return {
  ...(isReasoning ? {} : { temperature: temperature ?? defaultTemperature }),
  maxOutputTokens: maxTokens ?? (isReasoning ? 16384 : defaultMaxTokens),
}
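
For experimentation, the excerpt above can be expanded into a self-contained version. This is a sketch under stated assumptions: the `REASONING_MODELS` set stands in for the `isReasoningModel` flag on `ModelInfo`, the input and output interfaces are hypothetical, and the non-reasoning defaults (0.7 and 1,000) are taken from the GPT-4.1 row of the table.

```typescript
// Model IDs from the table above; the Set lookup replaces the registry flag.
const REASONING_MODELS = new Set(['gpt-5', 'gpt-5.2', 'gpt-5-mini', 'gpt-5-nano', 'o3', 'o4-mini'])

interface SettingsInput {
  modelId: string
  temperature?: number
  maxTokens?: number
}

interface GatewaySettings {
  temperature?: number
  maxOutputTokens: number
}

// Assumed defaults for non-reasoning models.
const DEFAULT_TEMPERATURE = 0.7
const DEFAULT_MAX_TOKENS = 1000

function buildGatewaySettings({ modelId, temperature, maxTokens }: SettingsInput): GatewaySettings {
  const isReasoning = REASONING_MODELS.has(modelId)
  return {
    // Reasoning models get no temperature key at all, even if one was requested.
    ...(isReasoning ? {} : { temperature: temperature ?? DEFAULT_TEMPERATURE }),
    // An explicit maxTokens always wins; otherwise reasoning models get the larger budget.
    maxOutputTokens: maxTokens ?? (isReasoning ? 16384 : DEFAULT_MAX_TOKENS),
  }
}

console.log(buildGatewaySettings({ modelId: 'o3', temperature: 0.2 }))
// { maxOutputTokens: 16384 }
console.log(buildGatewaySettings({ modelId: 'gpt-4.1' }))
// { temperature: 0.7, maxOutputTokens: 1000 }
```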

Streaming Response Diagnostics

streamText() from the Vercel AI SDK returns lazy Promises that resolve after the stream ends. Kit reads these for diagnostics:
| Property | Type | Purpose |
| --- | --- | --- |
| result.finishReason | Promise<string> | Why the stream ended ('stop', 'length', 'content-filter') |
| result.usage | Promise<object> | Token counts (inputTokens, outputTokens) |
| result.warnings | Warning[] | Unsupported parameters, model deprecations |

finishReason values:

| Value | Meaning | Action |
| --- | --- | --- |
| 'stop' | Normal completion | None |
| 'length' | Token budget exhausted | Increase maxTokens |
| 'content-filter' | Provider blocked the response | Review content policy |

Kit logs a warning when the stream completes with 0 content chunks or a non-'stop' finish reason. This diagnostic data is essential for debugging empty AI responses.
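
The logging rule can be sketched as a pure function over the values collected once the stream has ended. `diagnoseStream` and `StreamSummary` are hypothetical names, and the message wording is illustrative rather than Kit's actual log output.

```typescript
interface StreamSummary {
  finishReason: string // awaited from result.finishReason after the stream ends
  contentChunks: number // number of content chunks seen while streaming
}

// Returns a warning message when the stream looks unhealthy, or null when it
// completed normally with content.
function diagnoseStream({ finishReason, contentChunks }: StreamSummary): string | null {
  if (finishReason === 'length') {
    return contentChunks === 0
      ? 'Token budget exhausted before any visible output; raise maxTokens'
      : 'Response truncated by the token budget; raise maxTokens'
  }
  if (finishReason !== 'stop') {
    return `Stream ended with finishReason '${finishReason}'`
  }
  if (contentChunks === 0) {
    return 'Stream finished normally but produced no content'
  }
  return null
}

console.log(diagnoseStream({ finishReason: 'stop', contentChunks: 12 })) // null
console.log(diagnoseStream({ finishReason: 'length', contentChunks: 0 }))
// Token budget exhausted before any visible output; raise maxTokens
```

The `'length'` case with zero chunks corresponds to the empty-response symptom that reasoning models show when the output budget is too small.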

Custom Base URLs

Each provider supports a custom base URL for proxies, self-hosted models, or alternative endpoints:
| Variable | Default | Purpose |
| --- | --- | --- |
| OPENAI_BASE_URL | https://api.openai.com/v1 | OpenAI API proxy or compatible endpoint |
| ANTHROPIC_BASE_URL | https://api.anthropic.com | Anthropic API proxy |
| GOOGLE_AI_BASE_URL | https://generativelanguage.googleapis.com/v1 | Google AI proxy |
| XAI_BASE_URL | https://api.x.ai/v1 | xAI API proxy |
| OPENAI_ORG_ID | | OpenAI organization ID for billing |
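
A minimal sketch of how an override could be resolved against these defaults is shown below. `resolveBaseUrl` is a hypothetical helper; the trimming and trailing-slash handling are choices made for this sketch, not documented Kit behavior.

```typescript
type AIProvider = 'openai' | 'anthropic' | 'google' | 'xai'

// Variable names and defaults copied from the table above.
const BASE_URLS: Record<AIProvider, { envVar: string; fallback: string }> = {
  openai: { envVar: 'OPENAI_BASE_URL', fallback: 'https://api.openai.com/v1' },
  anthropic: { envVar: 'ANTHROPIC_BASE_URL', fallback: 'https://api.anthropic.com' },
  google: { envVar: 'GOOGLE_AI_BASE_URL', fallback: 'https://generativelanguage.googleapis.com/v1' },
  xai: { envVar: 'XAI_BASE_URL', fallback: 'https://api.x.ai/v1' },
}

function resolveBaseUrl(
  provider: AIProvider,
  env: Record<string, string | undefined>
): string {
  const { envVar, fallback } = BASE_URLS[provider]
  const override = env[envVar]?.trim()
  // Empty or whitespace-only overrides fall back to the default;
  // trailing slashes are dropped so paths can be appended consistently.
  return (override || fallback).replace(/\/+$/, '')
}

console.log(resolveBaseUrl('openai', {}))
// https://api.openai.com/v1
console.log(resolveBaseUrl('openai', { OPENAI_BASE_URL: 'http://localhost:8080/v1/' }))
// http://localhost:8080/v1
```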

Error Handling and Retries

Retries are handled by the Vercel AI SDK. Both streamText and generateText accept a maxRetries setting (default: 2) and apply exponential backoff with jitter automatically. Requests that fail with transient errors (network timeouts, 5xx responses) are retried; client errors (400, 401, 403) fail immediately.
typescript
// Retries are configured per call via the AI SDK
streamText({
  model: gatewayModel,
  messages,
  maxRetries: 2, // default; set to 0 to disable
})
Kit's structured error classes (src/lib/ai/errors.ts) wrap provider failures with retryability metadata:
| Error Class | When | Retryable |
| --- | --- | --- |
| AIProviderError | Base class for all provider errors | Varies |
| NetworkError | Connection failures, DNS errors | Yes |
| TimeoutError | Request exceeds timeout | Yes |
| ValidationError | Invalid config or request | No |
| InvalidProviderError | Unknown provider string | No |
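
The hierarchy in the table could look roughly like the sketch below. Only the class names and the retryable column come from this page; the constructor signatures, the `retryable` field name, and the `isRetryable` helper are assumptions made for illustration.

```typescript
class AIProviderError extends Error {
  readonly retryable: boolean

  constructor(message: string, retryable = false) {
    super(message)
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype)
    this.name = new.target.name
    this.retryable = retryable
  }
}

class NetworkError extends AIProviderError {
  constructor(message: string) {
    super(message, true)
  }
}

class TimeoutError extends AIProviderError {
  constructor(message: string) {
    super(message, true)
  }
}

class ValidationError extends AIProviderError {
  constructor(message: string) {
    super(message, false)
  }
}

class InvalidProviderError extends AIProviderError {
  constructor(provider: string) {
    super(`Unknown provider: ${provider}`, false)
  }
}

// Hypothetical helper: anything that is not a Kit provider error is treated as non-retryable.
function isRetryable(error: unknown): boolean {
  return error instanceof AIProviderError && error.retryable
}

console.log(isRetryable(new TimeoutError('request timed out'))) // true
console.log(isRetryable(new InvalidProviderError('acme'))) // false
```

Carrying retryability on the error object lets callers decide whether to surface a failure or try again without inspecting message strings.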