AI Integration Overview

Multi-provider AI system with OpenAI, Anthropic, Google, and xAI — architecture, feature flags, and directory structure

Kit ships with a production-ready AI system that supports four LLM providers (Anthropic, OpenAI, Google, xAI) routed through the Vercel AI Gateway. The system includes two chat modes, streaming SSE responses, pgvector-powered RAG, and a three-layer cost management system.
This page covers the architecture and core concepts. For provider setup, see AI Providers. For the chat system, see Chat System. For knowledge base search, see RAG System. For rate limiting and credits, see Cost Management.

How It Works

Every AI request in Kit flows through the same pipeline — from the React hook to the provider and back:
User types message
    |
    v
React Hook (useAIChat / useAICompletion)
    |--- Manages message history
    |--- Handles streaming state
    |--- Triggers credit animation
    |
    v
API Route (/api/ai/stream or /api/ai/chat)
    |--- 1. Feature guard (is chat mode enabled?)
    |--- 2. Authentication (Clerk → DB user)
    |--- 3. Rate limit check (global burst + credit balance)
    |--- 4. Credit deduction (BEFORE processing)
    |--- 5. Zod request validation
    |
    v
AI Service (ai-service.ts)
    |--- Resolves the model (AI_MODEL or DEFAULT_MODELS)
    |--- Builds the provider/model routing string (toGatewayModelString)
    |--- Calls the SDK with the Gateway-routed model
    |
    v
Vercel AI Gateway (provider/model string)
    |--- Routes to the target provider (OpenAI / Anthropic / Google / xAI)
    |--- Applies server-side failover (AI_GATEWAY_FALLBACK_MODELS)
    |--- Streams response chunks via SSE
    |
    v
Response flows back
    |--- Usage tracked to database (non-blocking)
    |--- Credit balance invalidated in TanStack Query cache
    |--- Message displayed in chat UI
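
The ordering of the API route stage can be illustrated with a small sketch. It is not Kit's route handler: the `Check`, `Rejection`, and `RequestContext` types and both functions are hypothetical, and only the 404 for a disabled feature is taken from this page (the 401 and 429 status codes are illustrative choices). The point it shows is that checks run in a fixed order and the first rejection stops the request before any later step, including credit deduction, can run.

```typescript
// Hypothetical shapes: the real route handlers are not shown on this page.
type Rejection = { status: number; error: string }
type Check = { name: string; run: () => Rejection | null }

interface RequestContext {
  chatEnabled: boolean   // feature flag state
  userId: string | null  // null when the caller is not signed in
  credits: number        // current credit balance
  cost: number           // credits the operation would consume
}

// Runs checks in order and stops at the first rejection,
// so later steps never execute for a rejected request.
function runChecks(checks: Check[]): { passed: string[]; rejection: Rejection | null } {
  const passed: string[] = []
  for (const check of checks) {
    const rejection = check.run()
    if (rejection !== null) return { passed, rejection }
    passed.push(check.name)
  }
  return { passed, rejection: null }
}

// Mirrors the first three steps of the diagram. Only the 404 for a disabled
// feature comes from this page; 401 and 429 are illustrative choices.
function buildChecks(ctx: RequestContext): Check[] {
  return [
    { name: 'feature-guard', run: () => (ctx.chatEnabled ? null : { status: 404, error: 'Not found' }) },
    { name: 'auth', run: () => (ctx.userId !== null ? null : { status: 401, error: 'Unauthorized' }) },
    { name: 'credits', run: () => (ctx.credits >= ctx.cost ? null : { status: 429, error: 'Insufficient credits' }) },
  ]
}

// An unauthenticated caller passes the feature guard and is rejected at auth.
const anonymous = runChecks(buildChecks({ chatEnabled: true, userId: null, credits: 50, cost: 10 }))
```

Keeping the cheap checks first means a disabled feature or an unauthenticated caller never reaches the credit logic.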

Provider Architecture

Kit routes AI providers through the Vercel AI Gateway using provider/model strings, so switching providers requires zero code changes — only an environment variable update.
You select a model with a provider/model string (e.g. anthropic/claude-haiku-4-5-20251001), and the Gateway routes it to the matching provider. Each provider has a preconfigured default model, optimized for cost-efficiency:
src/lib/ai/config.ts — Default Models
export const DEFAULT_MODELS: Record<AIProvider, string> = {
  openai: 'gpt-5-nano',
  anthropic: 'claude-haiku-4-5-20251001',
  google: 'gemini-2.5-flash',
  xai: 'grok-4-1-fast-reasoning',
}
| Provider | Default Model | Context Window | Best For |
| --- | --- | --- | --- |
| Anthropic | claude-haiku-4-5 | 200K tokens (1M beta) | Primary: nuanced reasoning, long context |
| OpenAI | gpt-5-nano | 400K tokens | General purpose, RAG embeddings |
| Google | gemini-2.5-flash | 1M tokens | Large documents, cost efficiency |
| xAI | grok-4-1-fast-reasoning | 2M tokens | Real-time data, conversational |
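
As a rough illustration of how a provider/model string could be derived from the environment, here is a minimal sketch. It is not the actual `toGatewayModelString` from Kit: `resolveGatewayModel`, `isProvider`, and the `Env` type are hypothetical names, and passing an already prefixed `AI_MODEL` through unchanged is an assumption. The defaults (`anthropic` as the fallback provider, `DEFAULT_MODELS` as the fallback model) follow what this page describes.

```typescript
type AIProvider = 'openai' | 'anthropic' | 'google' | 'xai'
type Env = Record<string, string | undefined>

// Copied from the DEFAULT_MODELS listing above.
const DEFAULT_MODELS: Record<AIProvider, string> = {
  openai: 'gpt-5-nano',
  anthropic: 'claude-haiku-4-5-20251001',
  google: 'gemini-2.5-flash',
  xai: 'grok-4-1-fast-reasoning',
}

// Type guard: true only for the four known provider slugs.
function isProvider(value: string | undefined): value is AIProvider {
  return value !== undefined && Object.prototype.hasOwnProperty.call(DEFAULT_MODELS, value)
}

// Hypothetical helper, not Kit's implementation.
function resolveGatewayModel(env: Env): string {
  const rawProvider = env['AI_PROVIDER']
  // Unknown or missing provider falls back to anthropic, the documented default.
  const provider: AIProvider = isProvider(rawProvider) ? rawProvider : 'anthropic'
  // AI_MODEL overrides the per-provider default when it is set and non-empty.
  const override = (env['AI_MODEL'] ?? '').trim()
  const model = override !== '' ? override : DEFAULT_MODELS[provider]
  // Assumption: a value that already looks like provider/model is passed through.
  return model.includes('/') ? model : `${provider}/${model}`
}

const defaultModel = resolveGatewayModel({})
const openaiModel = resolveGatewayModel({ AI_PROVIDER: 'openai' })
```

With an empty environment the sketch resolves to `anthropic/claude-haiku-4-5-20251001`; setting only `AI_PROVIDER=openai` yields `openai/gpt-5-nano`.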

Two Chat Modes

Kit provides two distinct chat experiences, each with its own route, API, and UI:
| Aspect | LLM Chat | RAG Chat |
| --- | --- | --- |
| Route | /dashboard/chat-llm | /dashboard/chat-rag |
| API | /api/ai/stream, /api/ai/chat | /api/ai/rag/ask |
| Hook | useAIChat() | Custom RAG hook |
| Context | Direct LLM conversation | Knowledge base + LLM |
| Feature Flag | NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | NEXT_PUBLIC_AI_RAG_CHAT_ENABLED |
| Token Usage | Full conversation history | ~3-5K tokens (RAG context) |
| Best For | Open-ended conversation, coding help | Product support, FAQ |

Feature Flags

Seven environment variables control which AI features are available. Each feature is enabled by default and is turned off only when its variable is set to the exact string false:
src/lib/ai/feature-flags.ts — Feature Configuration
export const AI_CHAT_FEATURES = {
  /**
   * RAG Chat (Modern UI)
   * Routes: /dashboard/chat-rag, /api/ai/rag/*
   * Features: Modern chat UI, Knowledge Base integration, Source Attribution
   */
  ragChat: process.env.NEXT_PUBLIC_AI_RAG_CHAT_ENABLED !== 'false',

  /**
   * LLM Chat (Direct Chat)
   * Routes: /dashboard/chat-llm, /api/ai/chat, /api/ai/stream
   * Features: Modern chat UI, Direct LLM conversation, Streaming
   */
  llmChat: process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false',

  /**
   * Vision Chat (Image Analysis in LLM Chat)
   * Extends LLM Chat with image upload and analysis capabilities.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: Drag & Drop, Paste, File picker, Base64 image transport
   */
  visionChat:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_VISION_ENABLED !== 'false',

  /**
   * PDF Chat (Document Analysis in LLM Chat)
   * Extends LLM Chat with PDF upload and text extraction capabilities.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: Drag & Drop, File picker, server-side text extraction, all providers
   */
  pdfChat:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_PDF_CHAT_ENABLED !== 'false',

  /**
   * Audio Input (Speech-to-Text for all AI Chats)
   * Adds microphone recording and Whisper transcription to any AI input field.
   * Standalone feature — works with LLM Chat, RAG Chat, and Image Gen.
   * Features: MediaRecorder, Whisper STT, editable transcript in input field
   */
  audioInput: process.env.NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED !== 'false',

  /**
   * Image Generation (Text-to-Image)
   * Routes: /dashboard/image-gen, /api/ai/image-gen
   * Features: GPT Image models, multiple sizes/qualities/formats, transparent backgrounds
   * Standalone feature — does NOT require LLM Chat to be enabled.
   */
  imageGen: process.env.NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED !== 'false',

  /**
   * Content Generator (Template-based Text Generation)
   * Routes: /dashboard/content, /api/ai/generate-content
   * Features: 5 templates (Email, Product, Blog, Social, Marketing), tone/language/length controls, streaming output
   * Standalone feature — does NOT require LLM Chat to be enabled.
   */
  contentGen: process.env.NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED !== 'false',

} as const
| Variable | Default | Controls |
| --- | --- | --- |
| NEXT_PUBLIC_AI_RAG_CHAT_ENABLED | true | RAG Chat on /dashboard/chat-rag |
| NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | true | LLM Chat on /dashboard/chat-llm |
| NEXT_PUBLIC_AI_VISION_ENABLED | true | Image analysis in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED | true | Voice input via speech-to-text (standalone; works with LLM Chat, RAG Chat, and Image Gen) |
| NEXT_PUBLIC_AI_PDF_CHAT_ENABLED | true | PDF analysis in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED | true | Image Generation on /dashboard/image-gen |
| NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED | true | Content Generator on /dashboard/content |
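
The `!== 'false'` comparison has a consequence that is easy to miss: only the exact lowercase string `false` disables a feature, so values such as `0`, `FALSE`, or an empty string leave it enabled. The sketch below reproduces that comparison for three of the flags. `isFlagEnabled`, `resolveFlags`, and the `Env` parameter are hypothetical and exist only to make the behavior easy to check.

```typescript
type Env = Record<string, string | undefined>

// Same comparison as in AI_CHAT_FEATURES: only the exact string 'false' disables.
function isFlagEnabled(value: string | undefined): boolean {
  return value !== 'false'
}

// Reduced to three flags to show how visionChat depends on llmChat,
// while imageGen is evaluated on its own.
function resolveFlags(env: Env): { llmChat: boolean; visionChat: boolean; imageGen: boolean } {
  const llmChat = isFlagEnabled(env['NEXT_PUBLIC_AI_LLM_CHAT_ENABLED'])
  return {
    llmChat,
    visionChat: llmChat && isFlagEnabled(env['NEXT_PUBLIC_AI_VISION_ENABLED']),
    imageGen: isFlagEnabled(env['NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED']),
  }
}

// Disabling LLM Chat also disables Vision Chat, but not Image Generation.
const withoutLlmChat = resolveFlags({ NEXT_PUBLIC_AI_LLM_CHAT_ENABLED: 'false' })
```

In the real file the variables are read as literal `process.env.NEXT_PUBLIC_...` expressions, which Next.js inlines into the client bundle at build time; passing the environment as a parameter here is only to keep the sketch testable.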
When Vision Chat is enabled, users can attach images to LLM Chat messages via drag & drop, clipboard paste, or file picker. Images are sent as ContentPart[] (Base64 data URIs) to /api/ai/stream, which auto-selects the image_analysis credit operation (30 credits). See Chat System for details.
When Audio Input is enabled, a microphone button appears in the LLM Chat input area. Users can record voice messages (up to 120 seconds) which are transcribed via the Whisper API at /api/ai/speech-to-text (20 credits per transcription). The transcribed text is inserted into the chat input field. See Chat System for details.
When Image Generation is enabled, the /dashboard/image-gen route provides a text-to-image interface using OpenAI's GPT Image models (gpt-image-1, gpt-image-1.5, gpt-image-1-mini). Users can configure size, quality, format, and background transparency. Generated images are stored in session history (up to 10 entries). Unlike chat features, Image Generation is a standalone feature — it does NOT require LLM Chat to be enabled.
When Content Generator is enabled, the /dashboard/content route provides a template-based text generation interface with five templates (email, product description, blog outline, social media, marketing copy). Users can configure tone, language, and length. The generator uses SSE streaming to deliver results progressively. Like Image Generation, the Content Generator is a standalone feature — it does NOT require LLM Chat to be enabled.
Feature flags are checked at two levels:
  1. Page level: shouldShowRAGChat() / shouldShowLLMChat() / shouldShowImageGen() / shouldShowContentGen() guard functions call notFound() if the feature is disabled
  2. API level: guardRAGChat() / guardLLMChat() / guardAudioInput() / guardImageGen() / guardContentGen() return 404 responses for disabled features
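
The two levels can be pictured as two small functions reading the same flag object. This is a sketch under assumptions, not the contents of feature-flags.ts or route-guards.ts: `pageDecision`, `apiGuard`, and the types are hypothetical, and the page-level function returns a value where the real guards call notFound().

```typescript
type FeatureName = 'ragChat' | 'llmChat' | 'imageGen' | 'contentGen'
type Flags = Record<FeatureName, boolean>
type ApiRejection = { status: 404; body: { error: string } }

// Page level: decide whether to render the page or fall through to the 404 page.
// (The real guards call notFound(); returning a value keeps this sketch testable.)
function pageDecision(flags: Flags, feature: FeatureName): 'render' | 'not-found' {
  return flags[feature] ? 'render' : 'not-found'
}

// API level: null means "continue with the request"; otherwise answer with a 404.
function apiGuard(flags: Flags, feature: FeatureName): ApiRejection | null {
  return flags[feature] ? null : { status: 404, body: { error: 'Not found' } }
}

const flags: Flags = { ragChat: false, llmChat: true, imageGen: true, contentGen: true }
const ragPage = pageDecision(flags, 'ragChat')
const ragApi = apiGuard(flags, 'ragChat')
```

Checking at both levels matters because hiding a page alone would leave its endpoint callable by anyone who knows the URL.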

Directory Structure

Core AI logic lives in apps/boilerplate/src/lib/ai/, with API routes in apps/boilerplate/src/app/api/ai/ and the related hooks and UI components under hooks/ and components/ai/:
apps/boilerplate/src/
├── lib/
│   └── ai/
│       ├── config.ts            # Default models + OpenAI key resolver
│       ├── types.ts             # Shared TypeScript types (Message, Provider, etc.)
│       ├── feature-flags.ts     # AI_CHAT_FEATURES, guard functions
│       ├── route-guards.ts      # API + page guards for feature flags
│       ├── ai-service.ts        # High-level service (routes via the AI Gateway)
│       ├── gateway.ts           # Gateway adapter — settings, reasoning-safety, fallback
│       ├── model-registry.ts    # Model catalog, pricing, reasoning flags, routing strings
│       ├── rag-service.ts       # RAG pipeline (search → context → answer)
│       ├── rag-search.ts        # pgvector similarity search
│       ├── rate-limiter.ts      # Global burst + tier-based limiting
│       ├── usage-tracker.ts     # Token/cost tracking to database
│       ├── image-gen/
│       │   ├── config.ts        # Model configs, sizes, quality options
│       │   ├── service.ts       # OpenAI image generation service
│       │   └── types.ts         # Image generation TypeScript types
│       ├── content-gen/
│       │   ├── config.ts        # Template definitions, prompt builder, UI labels
│       │   ├── service.ts       # Content generation AI service wrapper
│       │   └── types.ts         # Content generator TypeScript types
│       ├── sse-parser.ts        # Shared SSE stream parser with error handling
│       ├── quick-prompts.ts     # Configurable suggestion buttons
│       └── errors.ts            # Error class hierarchy
├── hooks/
│   ├── use-ai.ts               # React hooks (useAIChat, useAICompletion, etc.)
│   ├── use-image-gen.ts        # Image generation hook with history
│   ├── use-content-generator.ts # Content generator hook with SSE streaming
│   └── use-audio-recorder.ts   # Audio recording hook (MediaRecorder API)
├── app/
│   └── api/
│       └── ai/
│           ├── stream/route.ts          # POST — SSE streaming endpoint
│           ├── chat/route.ts            # POST — Synchronous chat endpoint
│           ├── speech-to-text/route.ts  # POST — Audio transcription (Whisper)
│           ├── image-gen/route.ts       # POST — Image generation endpoint
│           ├── generate-content/route.ts # POST — Content generation endpoint
│           ├── usage/route.ts           # GET — Usage statistics endpoint
│           └── rag/
│               ├── ask/route.ts         # POST — RAG question answering
│               └── conversations/       # CRUD for conversation history
└── components/
    └── ai/
        ├── chat/                # Chat UI components (12 components)
        ├── image-gen/           # Image generation UI (4 components)
        └── content-gen/         # Content generator UI (5 components)

Environment Variables

| Variable | Required | Purpose |
| --- | --- | --- |
| AI_GATEWAY_API_KEY | Yes* | Vercel AI Gateway key: one key for all providers on the chat path (provider/model routing). Keyless OIDC works on Vercel. |
| AI_PROVIDER | No | Default provider slug for Gateway routing (openai, anthropic, google, xai; default anthropic) |
| AI_MODEL | No | Override the default model (otherwise DEFAULT_MODELS[AI_PROVIDER]) |
| AI_GATEWAY_FALLBACK_MODELS | No | Comma-separated provider/model list for Gateway-side failover |
| OPENAI_API_KEY | Yes† | OpenAI key for the OpenAI-direct paths (RAG embeddings, Whisper STT, image generation) |
| AI_API_KEY | No | Fallback for the OpenAI-direct paths when AI_PROVIDER=openai (not read on the chat path) |
| AI_EMBEDDING_MODEL | No | Embedding model for RAG (default: text-embedding-3-small) |
| NEXT_PUBLIC_AI_RAG_CHAT_ENABLED | No | Enable RAG Chat (default: true) |
| NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | No | Enable LLM Chat (default: true) |
| NEXT_PUBLIC_AI_VISION_ENABLED | No | Enable image analysis in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED | No | Enable voice input via speech-to-text (default: true) |
| NEXT_PUBLIC_AI_PDF_CHAT_ENABLED | No | Enable PDF analysis in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED | No | Enable Image Generation (default: true) |
| NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED | No | Enable Content Generator (default: true) |
| UPSTASH_REDIS_REST_URL | No | Redis URL for rate limiting |
| UPSTASH_REDIS_REST_TOKEN | No | Redis token for rate limiting |
*Chat requires AI_GATEWAY_API_KEY (or keyless OIDC on Vercel). †OPENAI_API_KEY is required only for the OpenAI-direct paths (RAG embeddings, Whisper STT, image generation). To run chat on your own provider credentials, use Vercel BYOK in the Gateway dashboard instead of per-provider env vars.
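
Because AI_GATEWAY_FALLBACK_MODELS is a comma-separated list, it has to be split and cleaned before use. The sketch below shows one way that could look. `parseFallbackModels` is a hypothetical helper, and skipping malformed entries and removing duplicates are assumptions rather than documented behavior of gateway.ts.

```typescript
// Hypothetical helper: turns "a/b, c/d" into ['a/b', 'c/d'].
function parseFallbackModels(raw: string | undefined): string[] {
  if (raw === undefined || raw.trim() === '') return []

  const seen = new Set<string>()
  const models: string[] = []

  for (const entry of raw.split(',')) {
    const model = entry.trim()
    // Assumption: a valid entry has exactly one slash with text on both sides.
    const parts = model.split('/')
    if (parts.length !== 2 || !parts[0] || !parts[1]) continue
    // Assumption: duplicates are dropped so each fallback is tried once, in order.
    if (seen.has(model)) continue
    seen.add(model)
    models.push(model)
  }
  return models
}

const fallbacks = parseFallbackModels(' openai/gpt-5-nano, google/gemini-2.5-flash ,, bad, openai/gpt-5-nano')
```

For the input above the sketch keeps `openai/gpt-5-nano` and `google/gemini-2.5-flash`, in that order, and drops the empty entry, the malformed `bad`, and the duplicate.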

Key Files

| File | Purpose |
| --- | --- |
| apps/boilerplate/src/lib/ai/config.ts | Default model catalog + OpenAI key resolver (chat runs gateway-only) |
| apps/boilerplate/src/lib/ai/feature-flags.ts | Feature flag definitions and guard functions |
| apps/boilerplate/src/lib/ai/ai-service.ts | High-level AI service (routes via the AI Gateway, calculates costs) |
| apps/boilerplate/src/lib/ai/gateway.ts | Gateway adapter: request settings, reasoning-safety, fallback models |
| apps/boilerplate/src/lib/ai/model-registry.ts | Model catalog: pricing, reasoning flags, Gateway routing strings |
| apps/boilerplate/src/lib/ai/rag-service.ts | RAG pipeline: query rewriting, search, context assembly, answer generation |
| apps/boilerplate/src/lib/ai/rag-search.ts | pgvector similarity search with OpenAI embeddings |
| apps/boilerplate/src/lib/ai/rate-limiter.ts | Two-layer rate limiting (global burst, tier-based) |
| apps/boilerplate/src/lib/credits/credit-costs.ts | Per-operation credit costs (21 operation types) |
| apps/boilerplate/src/hooks/use-ai.ts | React hooks: useAIChat, useAICompletion, useAIQuery, useAIStream |
| apps/boilerplate/src/app/api/ai/stream/route.ts | SSE streaming endpoint with full cost management pipeline |