Edge-ready AI with persistent memory
AI at the edge, memory in the cloud
The Problem
The Vercel AI SDK is how modern apps do AI. Streaming responses, edge deployment, provider-agnostic design. Your chatbot runs at the edge, milliseconds from your users, scaling to millions without you thinking about infrastructure.
But edge functions are stateless by design.
Every request to your AI route starts fresh. The conversation your user had this morning? Gone. The preferences they expressed last week? Unknown. The project context that would make responses actually useful? Not available at the edge.
You've built a beautiful AI-powered app. The streaming UI is smooth, the latency is low, the UX is excellent. Users are impressed—until they realize the AI has the memory of a goldfish.
"As I mentioned earlier..." they type. But you didn't mention it to this function. You mentioned it to a previous function invocation that has since been garbage collected in some edge location you'll never see.
Edge computing solved the latency problem. It didn't solve the memory problem. Your AI is fast. It's also perpetually confused about who it's talking to and why.
How Stompy Helps
Stompy gives your edge AI functions the memory that serverless can't provide.
Your Vercel AI routes gain persistent intelligence:

- **Cross-request memory**: Conversations span sessions; user preferences persist; project context is always available
- **Edge + cloud architecture**: Stompy's memory layer is optimized for the latency patterns edge functions need
- **Provider-agnostic**: Whether you're using Anthropic, OpenAI, or open models via the AI SDK, Stompy adds the same memory layer
- **Streaming-compatible**: Context enrichment happens before streaming starts, so your smooth UX isn't affected
The pattern is elegant: your edge function fetches relevant context from Stompy, includes it in the prompt, and streams the response. Users get the speed of edge deployment with the intelligence of contextual awareness.
Edge performance meets persistent memory.
Integration Walkthrough
Set up context-aware AI routes
Create a utility for fetching Stompy context, then use it in your AI routes.
```typescript
// lib/stompy.ts - Shared context utilities
const STOMPY_URL = 'https://mcp.stompy.ai/sse';
const STOMPY_TOKEN = process.env.STOMPY_TOKEN!;

export async function getProjectContext(topic: string): Promise<string> {
  const response = await fetch(STOMPY_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${STOMPY_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ tool: 'recall_context', topic }),
  });
  const data = await response.json();
  return data.content || '';
}

export async function searchContext(query: string, limit = 3): Promise<string[]> {
  const response = await fetch(STOMPY_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${STOMPY_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ tool: 'context_search', query, limit }),
  });
  const data = await response.json();
  return data.contexts?.map((c: any) => c.content) || [];
}
```
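Because edge routes are latency-sensitive, you may also want to cap how long a context fetch can take and fall back to an empty result instead of failing the whole request. Here is a minimal sketch of that idea; the `getProjectContextSafe`/`searchContextSafe` wrappers, the `withTimeout` helper, and the 1.5-second budget are illustrative assumptions, not part of Stompy's API.

```typescript
// lib/stompy-safe.ts - Hypothetical wrappers that bound context-fetch latency.
// If Stompy is slow or unreachable, the chat route degrades to "no context"
// instead of erroring out.
import { getProjectContext, searchContext } from './stompy';

export async function getProjectContextSafe(topic: string, timeoutMs = 1500): Promise<string> {
  try {
    return await withTimeout(getProjectContext(topic), timeoutMs);
  } catch {
    return ''; // degrade gracefully: respond without project rules
  }
}

export async function searchContextSafe(query: string, limit = 3, timeoutMs = 1500): Promise<string[]> {
  try {
    return await withTimeout(searchContext(query, limit), timeoutMs);
  } catch {
    return []; // degrade gracefully: respond without recalled context
  }
}

// Race the lookup against a timer so a slow memory call can't stall streaming.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    promise,
    new Promise<T>((_, reject) => setTimeout(() => reject(new Error('Stompy timeout')), ms)),
  ]);
}
```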
Context-enriched streaming chat
Fetch relevant context before streaming, giving your AI project awareness without sacrificing performance.
```typescript
// app/api/chat/route.ts
import { streamText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { getProjectContext, searchContext } from '@/lib/stompy';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const lastMessage = messages[messages.length - 1]?.content || '';

  // Fetch context in parallel for speed
  const [projectRules, relevantContext] = await Promise.all([
    getProjectContext('project_rules'),
    searchContext(lastMessage, 3),
  ]);

  // Build context-aware system prompt
  const systemPrompt = `You are a helpful assistant for this project.

PROJECT RULES (always follow these):
${projectRules}

RELEVANT CONTEXT (from previous work):
${relevantContext.join('\n\n')}

Respond helpfully while respecting our project's established patterns and decisions.`;

  const result = await streamText({
    model: anthropic('claude-sonnet-4-20250514'),
    system: systemPrompt,
    messages,
  });

  return result.toDataStreamResponse();
}

export const runtime = 'edge'; // Still runs at the edge!
```
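On the client, nothing changes: the AI SDK's `useChat` hook consumes this route like any other streaming endpoint, because context enrichment happens entirely server-side. A minimal sketch, assuming AI SDK v4 where the hook ships in `@ai-sdk/react` (earlier versions export it from `ai/react`); the page path is an arbitrary example.

```typescript
// app/chat/page.tsx - Minimal client for the context-aware route above.
'use client';

import { useChat } from '@ai-sdk/react'; // 'ai/react' in older AI SDK versions

export default function Chat() {
  // Streams from /api/chat; this component is identical to one without Stompy.
  const { messages, input, handleInputChange, handleSubmit } = useChat({ api: '/api/chat' });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} placeholder="Ask about the project..." />
      </form>
    </div>
  );
}
```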
Save conversation insights
When important decisions happen in chat, save them to Stompy so future conversations benefit.
```typescript
// app/api/save-insight/route.ts
import { NextResponse } from 'next/server';

const STOMPY_URL = 'https://mcp.stompy.ai/sse';

export async function POST(req: Request) {
  const { topic, content, tags } = await req.json();

  const response = await fetch(STOMPY_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.STOMPY_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      tool: 'lock_context',
      topic,
      content,
      tags,
    }),
  });

  const result = await response.json();
  return NextResponse.json(result);
}

// Client-side: Call this when user confirms a decision in chat
// await fetch('/api/save-insight', {
//   method: 'POST',
//   body: JSON.stringify({
//     topic: 'authentication_approach',
//     content: 'Decided to use NextAuth.js with Google OAuth',
//     tags: 'auth,decisions,nextauth'
//   })
// });
```
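If you'd rather not rely on the user clicking a confirm button, another option is to let the model decide when something is worth remembering by exposing saving as a tool in the chat route. This is an optional sketch, not part of the walkthrough above: the `saveInsight` tool name is hypothetical, it reuses the same `lock_context` call as the route we just defined, and it assumes the AI SDK v4 `tool` helper with Zod `parameters` (newer releases rename that option).

```typescript
// Hedged sketch: let the model call a tool to persist important decisions.
import { tool } from 'ai';
import { z } from 'zod';

export const saveInsight = tool({
  description: 'Save an important project decision to long-term memory',
  parameters: z.object({
    topic: z.string().describe('Short topic key, e.g. authentication_approach'),
    content: z.string().describe('The decision or insight to remember'),
    tags: z.string().describe('Comma-separated tags'),
  }),
  execute: async ({ topic, content, tags }) => {
    // Same lock_context call used by /api/save-insight above.
    const res = await fetch('https://mcp.stompy.ai/sse', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.STOMPY_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ tool: 'lock_context', topic, content, tags }),
    });
    return res.json();
  },
});

// Then pass `tools: { saveInsight }` to the streamText call in app/api/chat/route.ts.
```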
What You Get
- Edge + memory: Your AI routes stay at the edge while context lives in the cloud—best of both
- Parallel context fetch: Get project context and relevant memories simultaneously, minimizing latency
- Provider-agnostic: Works with Anthropic, OpenAI, Google, or any AI SDK provider (see the sketch after this list)
- Streaming preserved: Context enrichment happens before streaming—smooth UX unchanged
- Progressive enhancement: Add memory to existing AI routes without rewrites
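Swapping providers touches only the model argument; the Stompy context fetch and enriched system prompt stay the same. A minimal sketch of that claim, where `streamWithProvider` is a hypothetical helper and `gpt-4o` is just an example model:

```typescript
// Provider swap sketch: the Stompy-enriched prompt is provider-agnostic,
// so only the `model` argument differs between providers.
import { streamText, type CoreMessage } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { openai } from '@ai-sdk/openai';

export function streamWithProvider(
  provider: 'anthropic' | 'openai',
  system: string,            // the same Stompy-enriched system prompt
  messages: CoreMessage[],
) {
  const model = provider === 'anthropic'
    ? anthropic('claude-sonnet-4-20250514')
    : openai('gpt-4o');

  return streamText({ model, system, messages });
}
```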
Ready to give the Vercel AI SDK a memory?
Join the waitlist and be the first to know when Stompy is ready. Your Vercel AI SDK projects will never forget again.