Scalable vectors with scalable memory
Billion-scale vectors, persistent context
The Problem
Milvus handles scale that makes other databases nervous. Billions of vectors across distributed nodes, sharded and replicated for enterprise reliability. When your RAG corpus is measured in terabytes, Milvus is where vectors go to live.
But enterprise scale doesn't mean enterprise intelligence.
Your Milvus cluster can search billions of vectors in milliseconds. It doesn't know that the documents it's returning describe deprecated APIs. It doesn't remember that your enterprise decided on microservices last quarter, making the monolith docs irrelevant. It doesn't understand that legal approved only certain approaches for PII handling.
Every query to your billion-vector index is a fresh start. Perfect similarity search at massive scale, with zero institutional memory.
You've invested in enterprise infrastructure: Kubernetes operators, monitoring dashboards, automated scaling. Your vector search handles peak traffic like a dream. But "handling traffic" isn't the same as "understanding context." At enterprise scale, the cost of irrelevant results multiplies fast.
Big data without big context is just big confusion.
How Stompy Helps
Stompy adds enterprise memory to match Milvus's enterprise scale.
Your billion-scale search gains organizational intelligence:
- **Multi-tenant context**: Different teams, different projects, different memory spaces, all managed cleanly
- **Institutional knowledge**: Enterprise decisions, approved patterns, and compliance requirements, all searchable alongside documents
- **Cross-project learning**: Insights from one team's queries can inform another's (when appropriate)
- **Audit-friendly**: Track what context informed which decisions, supporting enterprise governance
The combination serves enterprise needs: Milvus handles the scale of your document corpus; Stompy handles the scale of your organizational knowledge. Both are built for the reliability and performance enterprises demand.
Enterprise vectors with enterprise memory.
Integration Walkthrough
Connect Milvus and Stompy for enterprise RAG
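Set up both systems with enterprise-grade configuration: an authenticated, TLS-secured Milvus connection and a token-authenticated Stompy client. Connection pooling and fault tolerance depend on your deployment and are not covered by this snippet.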
    import os

    import httpx
    from pymilvus import Collection, connections, utility

    # Milvus: enterprise-scale vector search
    connections.connect(
        alias="enterprise",
        host=os.environ["MILVUS_HOST"],
        port="19530",
        user=os.environ["MILVUS_USER"],
        password=os.environ["MILVUS_PASSWORD"],
        secure=True,
    )

    # Verify the collection exists on the "enterprise" connection, then load it.
    # Without using="enterprise", pymilvus would look for a "default" connection.
    if not utility.has_collection("enterprise_docs", using="enterprise"):
        raise RuntimeError("Collection 'enterprise_docs' not found")
    collection = Collection("enterprise_docs", using="enterprise")
    collection.load()

    # Stompy: enterprise project memory
    async def get_enterprise_context(project: str, topic: str) -> str:
        """Retrieve context with project isolation."""
        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://mcp.stompy.ai/sse",
                headers={"Authorization": f"Bearer {os.environ['STOMPY_TOKEN']}"},
                json={
                    "tool": "recall_context",
                    "topic": topic,
                    "project": project,  # Multi-tenant isolation
                },
            )
            response.raise_for_status()  # Fail loudly on HTTP errors
            return response.json().get("content", "")
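One common piece of fault tolerance is retrying transient failures. The following is a minimal sketch, not part of Stompy or pymilvus: `with_retries` is a hypothetical helper, and the attempt count, delays, and default exception types are assumptions you would tune for your own deployment.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Await call(), retrying on the listed exception types with exponential backoff.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error
            # Double the delay on each retry and add a little jitter
            # so that many clients do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

With httpx, one option is to pass `retry_on=(httpx.TransportError,)` and wrap the Stompy call in a lambda, for example `await with_retries(lambda: get_enterprise_context(project, topic), retry_on=(httpx.TransportError,))`. Only idempotent calls such as context reads are safe to retry blindly.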
Multi-tenant RAG with project context
Different teams get different context while sharing the same vector infrastructure.
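Isolation in this step rests on two names derived from IDs: a Stompy project key of the form `team_<team>_<project>` and a Milvus partition named `team_<team>`. As far as we know, Milvus restricts partition names to letters, digits, and underscores, so an ID such as `platform-team` would need normalizing first; check the rules for your Milvus version. A minimal sketch, with hypothetical helper names:

```python
import re

# Assumption: only letters, digits, and underscores are safe in names.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_]")


def safe_name_part(raw: str) -> str:
    """Replace characters outside [A-Za-z0-9_] with underscores."""
    cleaned = _INVALID_CHARS.sub("_", raw.strip())
    if not cleaned:
        raise ValueError("identifier is empty after sanitizing")
    return cleaned


def partition_name(team_id: str) -> str:
    """Milvus partition name used for team-level isolation."""
    return f"team_{safe_name_part(team_id)}"


def project_key(team_id: str, project_id: str) -> str:
    """Stompy project key used for project-level memory isolation."""
    return f"team_{safe_name_part(team_id)}_{safe_name_part(project_id)}"
```

Normalizing can make distinct IDs collide (`platform-team` and `platform_team` map to the same name, and the underscore separator is ambiguous when IDs contain underscores), so a real deployment would probably keep an explicit mapping from IDs to names.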
    async def enterprise_rag(user_query: str, team_id: str, project_id: str):
        """RAG with team and project isolation."""
        # embed() and filter_by_compliance() are assumed to be defined elsewhere.

        # Get team-specific project context
        project_context = await get_enterprise_context(
            project=f"team_{team_id}_{project_id}",
            topic="architecture_decisions",
        )
        compliance_rules = await get_enterprise_context(
            project=f"team_{team_id}_{project_id}",
            topic="compliance_requirements",
        )

        # Search Milvus with partition (team isolation)
        search_params = {"metric_type": "IP", "params": {"nprobe": 16}}
        results = collection.search(
            data=[embed(user_query)],
            anns_field="embeddings",
            param=search_params,
            limit=10,
            partition_names=[f"team_{team_id}"],  # Team-level isolation
            output_fields=["text", "doc_type", "last_updated"],
        )

        # Filter results based on compliance context
        filtered_docs = filter_by_compliance(results, compliance_rules)

        return {
            "documents": filtered_docs,
            "project_context": project_context,
            "team_id": team_id,
        }
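The walkthrough leaves `filter_by_compliance` undefined, and the format of the compliance context stored in Stompy is not specified. As one possible shape, the sketch below assumes the context is plain text with lines such as `block_doc_type: legacy_spec`, and that search hits have already been flattened into dicts. Both the rule syntax and the function names are hypothetical.

```python
from typing import Dict, List, Set


def parse_blocked_doc_types(compliance_rules: str) -> Set[str]:
    """Collect doc types from lines of the form 'block_doc_type: <name>'.

    The rule syntax is an assumption made for this sketch.
    """
    blocked: Set[str] = set()
    for line in compliance_rules.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "block_doc_type" and value.strip():
            blocked.add(value.strip())
    return blocked


def filter_by_doc_type(docs: List[Dict], compliance_rules: str) -> List[Dict]:
    """Drop documents whose 'doc_type' is blocked by the compliance rules."""
    blocked = parse_blocked_doc_types(compliance_rules)
    return [doc for doc in docs if doc.get("doc_type") not in blocked]
```

To feed it Milvus results, one approach is to flatten the hits for the single query first, for example `[{"text": hit.entity.get("text"), "doc_type": hit.entity.get("doc_type")} for hit in results[0]]`.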
Track enterprise search analytics
Save each team's search analytics to its own Stompy project, so insights can be aggregated across teams later while respecting isolation boundaries.
    import asyncio
    import os

    import httpx

    async def save_enterprise_insights(team_id: str, project_id: str, insights: dict):
        """Save team-specific insights to Stompy."""
        content = (
            "Enterprise Search Analytics:\n"
            f"Team: {team_id}\n"
            f"Project: {project_id}\n"
            f"Period: {insights['period']}\n"
            f"Top Query Categories: {insights['top_categories']}\n"
            f"Most Retrieved Doc Types: {insights['top_doc_types']}\n"
            f"Average Relevance Score: {insights['avg_relevance']}\n"
            f"Recommendation: {insights['recommendation']}\n"
        )
        async with httpx.AsyncClient() as client:
            await client.post(
                "https://mcp.stompy.ai/sse",
                headers={"Authorization": f"Bearer {os.environ['STOMPY_TOKEN']}"},
                json={
                    "tool": "lock_context",
                    "topic": "search_analytics",
                    "project": f"team_{team_id}_{project_id}",
                    "content": content,
                    "tags": "enterprise,analytics,milvus",
                },
            )

    # Example: weekly analytics save.
    # asyncio.run() is needed because await is not valid at module level.
    asyncio.run(
        save_enterprise_insights(
            team_id="platform-team",
            project_id="auth-service",
            insights={
                "period": "2024-W48",
                "top_categories": ["authentication", "authorization", "SSO"],
                "top_doc_types": ["api_spec", "architecture_decision"],
                "avg_relevance": 0.82,
                "recommendation": "Add more OAuth2 flow documentation",
            },
        )
    )
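The `insights` dict above is written by hand. One way it might be produced is from a log of individual searches. The sketch below assumes each log record is a dict with `category`, `doc_type`, and `relevance` keys; that record shape and the `summarize_searches` name are hypothetical, and only the output keys come from the example above.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List


def summarize_searches(
    records: List[Dict],
    period: str,
    recommendation: str = "",
    top_n: int = 3,
) -> Dict:
    """Aggregate per-search records into the insights dict used above."""
    if not records:
        raise ValueError("no search records to summarize")
    categories = Counter(record["category"] for record in records)
    doc_types = Counter(record["doc_type"] for record in records)
    return {
        "period": period,
        # most_common() orders by count, highest first
        "top_categories": [name for name, _ in categories.most_common(top_n)],
        "top_doc_types": [name for name, _ in doc_types.most_common(top_n)],
        "avg_relevance": round(mean(record["relevance"] for record in records), 2),
        "recommendation": recommendation,
    }
```

The result can be passed directly as the `insights` argument of `save_enterprise_insights`. The recommendation stays a free-text field because nothing in the walkthrough says how it is derived.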
What You Get
- Multi-tenant memory: Project isolation that matches your team structure and Milvus partitions
- Compliance-aware: Enterprise rules and requirements inform search results automatically
- Audit trail: Track which context influenced which responses for governance requirements
- Cross-project insights: Learn from patterns across teams while respecting boundaries
- Enterprise-grade: Both Milvus and Stompy are built for the reliability enterprises demand
Ready to give Milvus a memory?
Join the waitlist and be the first to know when Stompy is ready. Your Milvus projects will never forget again.