v0.3.3

AI Vision & PDF Chat

Image analysis via multimodal models and PDF document chat with automatic content chunking.

Vision capabilities

Users can upload images directly into the chat and receive analysis from multimodal models. The image is sent as a base64-encoded payload alongside the text prompt, letting the model reason about visual content — diagrams, screenshots, charts, handwritten notes. No separate vision API or pre-processing pipeline required.
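The base64 flow above can be sketched as a small helper that pairs a text prompt with an inline image. This is a hypothetical sketch assuming an OpenAI-style multimodal message shape; the function and parameter names are illustrative, not the app's actual API.

```typescript
// Hypothetical helper (not the app's real code): build a multimodal chat
// message that carries the prompt and a base64-encoded image together.
function buildVisionMessage(prompt: string, imageBytes: Buffer, mimeType: string) {
  // Encode the raw image bytes as a data URL so no separate upload or
  // pre-processing step is needed.
  const dataUrl = `data:${mimeType};base64,${imageBytes.toString("base64")}`;
  return {
    role: "user" as const,
    content: [
      { type: "text", text: prompt },
      { type: "image_url", image_url: { url: dataUrl } },
    ],
  };
}
```

The same message object works for diagrams, screenshots, or handwritten notes; only the MIME type changes.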

PDF document chat

PDF chat follows a three-step pipeline: upload, parse, and converse. Documents are parsed server-side using pdf-parse, then split into chunks sized for the model's context window. The chunked content is injected as context for subsequent chat messages, allowing users to ask questions about specific sections without re-uploading.
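The chunking step of the pipeline can be sketched as a simple overlapping splitter. The function name, chunk size, and overlap below are illustrative assumptions, not the app's actual values; the real sizes would be tuned to the target model's context window.

```typescript
// Hypothetical chunker (illustrative sizes): split parsed PDF text into
// overlapping chunks so each fits in the model's context window while
// sentences spanning a boundary still appear whole in one chunk.
function chunkText(text: string, chunkSize = 2000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    // Step forward by chunkSize minus overlap so consecutive chunks share
    // a window of text.
    start += chunkSize - overlap;
  }
  return chunks;
}
```

In the real pipeline the input string would come from the pdf-parse result, and the chunks would be injected as context for subsequent chat messages.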

Credits are deducted after pre-processing (parsing and chunking) succeeds but before streaming begins. Users are charged only for successfully processed documents, and charging before the stream starts prevents abuse through repeated upload attempts.
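The charge point described above can be sketched as a handler whose call order is the contract. All names here are hypothetical stand-ins for the app's real services; only the ordering (parse, then deduct, then stream) reflects the behavior described.

```typescript
// Hypothetical dependencies (illustrative names, not the app's real API).
type Deps = {
  parseAndChunk: (file: Buffer) => Promise<string[]>;
  deductCredits: (userId: string, amount: number) => Promise<void>;
  streamAnswer: (chunks: string[], question: string) => Promise<string>;
};

// Sketch of the charge ordering: if parsing throws, no credits are taken;
// once the charge succeeds, streaming begins.
async function handlePdfQuestion(
  deps: Deps,
  userId: string,
  file: Buffer,
  question: string
): Promise<string> {
  const chunks = await deps.parseAndChunk(file); // may throw: no charge on failure
  await deps.deductCredits(userId, 1);           // charge point: after parse, before stream
  return deps.streamAnswer(chunks, question);
}
```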

Turbopack compatibility

The pdf-parse library depends on pdfjs-dist, whose dynamic worker imports break Turbopack's static analysis. A createRequire loader-module pattern wraps the native require() call in a separate file that Turbopack does not trace, avoiding runtime failures without falling back to Webpack.
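The loader-module pattern can be sketched as below. This is a minimal illustration, not the app's actual file: the real loader would anchor createRequire at import.meta.url next to the PDF route, whereas this sketch anchors it at the working directory so it runs in both CommonJS and ESM contexts.

```typescript
import { createRequire } from "node:module";
import { join } from "node:path";

// Hypothetical loader module. createRequire gives us Node's native
// require() even from ESM code; the anchor path only controls module
// resolution, the file itself need not exist.
const nativeRequire = createRequire(join(process.cwd(), "loader.js"));

// Turbopack does not trace calls made through createRequire, so the
// dynamic worker imports inside pdfjs-dist never reach its static
// analysis and the bundle builds cleanly.
export function loadNativeModule<T = unknown>(name: string): T {
  return nativeRequire(name) as T;
}

// In the PDF route this would be used as, e.g.:
//   const pdfParse = loadNativeModule<(buf: Buffer) => Promise<unknown>>("pdf-parse");
```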

Contributors

Sascha Rahn