Edge-ready AI with persistent memory
AI at the edge, memory in the cloud
The Problem
The Vercel AI SDK is how modern apps do AI. Streaming responses, edge deployment, provider-agnostic design. Your chatbot runs at the edge, milliseconds from your users, scaling to millions without you thinking about infrastructure.
But edge functions are stateless by design.
Every request to your AI route starts fresh. The conversation your user had this morning? Gone. The preferences they expressed last week? Unknown. The project context that would make responses actually useful? Not available at the edge.
You've built a beautiful AI-powered app. The streaming UI is smooth, the latency is low, the UX is excellent. Users are impressed—until they realize the AI has the memory of a goldfish.
"As I mentioned earlier..." they type. But you didn't mention it to this function. You mentioned it to a previous function invocation that has since been garbage collected in some edge location you'll never see.
Edge computing solved the latency problem. It didn't solve the memory problem. Your AI is fast. It's also perpetually confused about who it's talking to and why.
How Stompy Helps
Stompy gives your edge AI functions the memory that serverless can't provide.
Your Vercel AI routes gain persistent intelligence:

- **Cross-request memory**: Conversations span sessions; user preferences persist; project context is always available
- **Edge + cloud architecture**: Stompy's memory layer is optimized for the latency patterns edge functions need
- **Provider-agnostic**: Whether you're using Anthropic, OpenAI, or open models via the AI SDK, Stompy adds the same memory layer
- **Streaming-compatible**: Context enrichment happens before streaming starts, so your smooth UX isn't affected
The pattern is elegant: your edge function fetches relevant context from Stompy, includes it in the prompt, and streams the response. Users get the speed of edge deployment with the intelligence of contextual awareness.
Edge performance meets persistent memory.
Integration Walkthrough
Set up context-aware AI routes
Create a utility for fetching Stompy context, then use it in your AI routes.
```typescript
// lib/stompy.ts - Shared context utilities
const STOMPY_URL = 'https://mcp.stompy.ai/sse';
const STOMPY_TOKEN = process.env.STOMPY_TOKEN!;

export async function getProjectContext(topic: string): Promise<string> {
  const response = await fetch(STOMPY_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${STOMPY_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ tool: 'recall_context', topic }),
  });
  const data = await response.json();
  return data.content || '';
}

export async function searchContext(query: string, limit = 3): Promise<string[]> {
  const response = await fetch(STOMPY_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${STOMPY_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ tool: 'context_search', query, limit }),
  });
  const data = await response.json();
  return data.contexts?.map((c: any) => c.content) || [];
}
```
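Because edge routes are latency-sensitive, you may also want to cap how long a context fetch can take and fall back to an empty result instead of failing the whole request. Here is a minimal sketch of that idea; the `getProjectContextSafe`/`searchContextSafe` wrappers, the `withTimeout` helper, and the 1.5-second budget are illustrative assumptions, not part of Stompy's API.

```typescript
// lib/stompy-safe.ts - Hypothetical wrappers that bound context-fetch latency.
// If Stompy is slow or unreachable, the chat route degrades to "no context"
// instead of erroring out.
import { getProjectContext, searchContext } from './stompy';

export async function getProjectContextSafe(topic: string, timeoutMs = 1500): Promise<string> {
  try {
    return await withTimeout(getProjectContext(topic), timeoutMs);
  } catch {
    return ''; // degrade gracefully: respond without project rules
  }
}

export async function searchContextSafe(query: string, limit = 3, timeoutMs = 1500): Promise<string[]> {
  try {
    return await withTimeout(searchContext(query, limit), timeoutMs);
  } catch {
    return []; // degrade gracefully: respond without recalled context
  }
}

// Race the lookup against a timer so a slow memory call can't stall streaming.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    promise,
    new Promise<T>((_, reject) => setTimeout(() => reject(new Error('Stompy timeout')), ms)),
  ]);
}
```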
Context-enriched streaming chat
Fetch relevant context before streaming, giving your AI project awareness without sacrificing performance.
```typescript
// app/api/chat/route.ts
import { streamText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { getProjectContext, searchContext } from '@/lib/stompy';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const lastMessage = messages[messages.length - 1]?.content || '';

  // Fetch context in parallel for speed
  const [projectRules, relevantContext] = await Promise.all([
    getProjectContext('project_rules'),
    searchContext(lastMessage, 3),
  ]);

  // Build context-aware system prompt
  const systemPrompt = `You are a helpful assistant for this project.

PROJECT RULES (always follow these):
${projectRules}

RELEVANT CONTEXT (from previous work):
${relevantContext.join('\n\n')}

Respond helpfully while respecting our project's established patterns and decisions.`;

  const result = await streamText({
    model: anthropic('claude-sonnet-4-20250514'),
    system: systemPrompt,
    messages,
  });

  return result.toDataStreamResponse();
}

export const runtime = 'edge'; // Still runs at the edge!
```
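On the client, nothing changes: the AI SDK's `useChat` hook consumes this route like any other streaming endpoint, because context enrichment happens entirely server-side. A minimal sketch, assuming AI SDK v4 where the hook ships in `@ai-sdk/react` (earlier versions export it from `ai/react`); the page path is an arbitrary example.

```typescript
// app/chat/page.tsx - Minimal client for the context-aware route above.
'use client';

import { useChat } from '@ai-sdk/react'; // 'ai/react' in older AI SDK versions

export default function Chat() {
  // Streams from /api/chat; this component is identical to one without Stompy.
  const { messages, input, handleInputChange, handleSubmit } = useChat({ api: '/api/chat' });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} placeholder="Ask about the project..." />
      </form>
    </div>
  );
}
```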
Save conversation insights
When important decisions happen in chat, save them to Stompy so future conversations benefit.
```typescript
// app/api/save-insight/route.ts
import { NextResponse } from 'next/server';

const STOMPY_URL = 'https://mcp.stompy.ai/sse';

export async function POST(req: Request) {
  const { topic, content, tags } = await req.json();

  const response = await fetch(STOMPY_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.STOMPY_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      tool: 'lock_context',
      topic,
      content,
      tags,
    }),
  });

  const result = await response.json();
  return NextResponse.json(result);
}

// Client-side: Call this when user confirms a decision in chat
// await fetch('/api/save-insight', {
//   method: 'POST',
//   body: JSON.stringify({
//     topic: 'authentication_approach',
//     content: 'Decided to use NextAuth.js with Google OAuth',
//     tags: 'auth,decisions,nextauth'
//   })
// });
```
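If you'd rather not rely on the user clicking a confirm button, another option is to let the model decide when something is worth remembering by exposing saving as a tool in the chat route. This is an optional sketch, not part of the walkthrough above: the `saveInsight` tool name is hypothetical, it reuses the same `lock_context` call as the route we just defined, and it assumes the AI SDK v4 `tool` helper with Zod `parameters` (newer releases rename that option).

```typescript
// Hedged sketch: let the model call a tool to persist important decisions.
import { tool } from 'ai';
import { z } from 'zod';

export const saveInsight = tool({
  description: 'Save an important project decision to long-term memory',
  parameters: z.object({
    topic: z.string().describe('Short topic key, e.g. authentication_approach'),
    content: z.string().describe('The decision or insight to remember'),
    tags: z.string().describe('Comma-separated tags'),
  }),
  execute: async ({ topic, content, tags }) => {
    // Same lock_context call used by /api/save-insight above.
    const res = await fetch('https://mcp.stompy.ai/sse', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.STOMPY_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ tool: 'lock_context', topic, content, tags }),
    });
    return res.json();
  },
});

// Then pass `tools: { saveInsight }` to the streamText call in app/api/chat/route.ts.
```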
What You Get
- Edge + memory: Your AI routes stay at the edge while context lives in the cloud—best of both
- Parallel context fetch: Get project context and relevant memories simultaneously, minimizing latency
- Provider-agnostic: Works with Anthropic, OpenAI, Google, or any AI SDK provider (see the sketch after this list)
- Streaming preserved: Context enrichment happens before streaming—smooth UX unchanged
- Progressive enhancement: Add memory to existing AI routes without rewrites
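Swapping providers touches only the model argument; the Stompy context fetch and enriched system prompt stay the same. A minimal sketch of that claim, where `streamWithProvider` is a hypothetical helper and `gpt-4o` is just an example model:

```typescript
// Provider swap sketch: the Stompy-enriched prompt is provider-agnostic,
// so only the `model` argument differs between providers.
import { streamText, type CoreMessage } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { openai } from '@ai-sdk/openai';

export function streamWithProvider(
  provider: 'anthropic' | 'openai',
  system: string,            // the same Stompy-enriched system prompt
  messages: CoreMessage[],
) {
  const model = provider === 'anthropic'
    ? anthropic('claude-sonnet-4-20250514')
    : openai('gpt-4o');

  return streamText({ model, system, messages });
}
```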
Ready to give the Vercel AI SDK a memory?
Join the waitlist and be the first to know when Stompy is ready. Your Vercel AI SDK projects will never forget again.