Scalable vectors with scalable memory

Billion-scale vectors, persistent context

The Problem

Milvus handles scale that makes other databases nervous. Billions of vectors across distributed nodes, sharded and replicated for enterprise reliability. When your RAG corpus is measured in terabytes, Milvus is where vectors go to live.

But enterprise scale doesn't mean enterprise intelligence.

Your Milvus cluster can search billions of vectors in milliseconds. It doesn't know that the documents it's returning describe deprecated APIs. It doesn't remember that your enterprise decided on microservices last quarter, making the monolith docs irrelevant. It doesn't understand that legal approved only certain approaches for PII handling.

Every query to your billion-vector index is a fresh start. Perfect similarity search at massive scale, with zero institutional memory.

You've invested in enterprise infrastructure: Kubernetes operators, monitoring dashboards, automated scaling. Your vector search handles peak traffic like a dream. But "handling traffic" isn't the same as "understanding context." At enterprise scale, the cost of irrelevant results multiplies fast.

Big data without big context is just big confusion.

How Stompy Helps

Stompy adds enterprise memory to match Milvus's enterprise scale.

Your billion-scale search gains organizational intelligence:

- **Multi-tenant context**: Different teams, different projects, different memory spaces, all managed cleanly
- **Institutional knowledge**: Enterprise decisions, approved patterns, and compliance requirements, all searchable alongside documents
- **Cross-project learning**: Insights from one team's queries can inform another's (when appropriate)
- **Audit-friendly**: Track what context informed which decisions, supporting enterprise governance

The combination serves enterprise needs: Milvus handles the scale of your document corpus; Stompy handles the scale of your organizational knowledge. Both are built for the reliability and performance enterprises demand.

Enterprise vectors with enterprise memory.

Integration Walkthrough

Connect Milvus and Stompy for enterprise RAG

Set up both systems with authenticated, TLS-secured connections. Connection pooling and fault tolerance depend on your deployment and build on this baseline.

from pymilvus import connections, Collection, utility
import httpx
import os

# Milvus: Enterprise-scale vector search
connections.connect(
    alias="enterprise",
    host=os.environ["MILVUS_HOST"],
    port="19530",
    user=os.environ["MILVUS_USER"],
    password=os.environ["MILVUS_PASSWORD"],
    secure=True
)

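# Verify the collection exists on the "enterprise" connection and fail fast if not.
# Without using="enterprise", pymilvus looks for a connection named "default".
if not utility.has_collection("enterprise_docs", using="enterprise"):
    raise RuntimeError("Milvus collection 'enterprise_docs' not found")
collection = Collection("enterprise_docs", using="enterprise")
collection.load()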

# Stompy: Enterprise project memory
async def get_enterprise_context(project: str, topic: str) -> str:
    """Retrieve context with project isolation."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://mcp.stompy.ai/sse",
            headers={"Authorization": f"Bearer {os.environ['STOMPY_TOKEN']}"},
            json={
                "tool": "recall_context",
                "topic": topic,
                "project": project  # Multi-tenant isolation
            }
        )
        response.raise_for_status()  # surface auth and server errors instead of parsing an error body
        return response.json().get("content", "")
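
The Stompy call above makes a single attempt, so one transient network error fails the whole request. One way to add basic fault tolerance is retrying with exponential backoff. This is a minimal sketch, not part of either product: `with_retries` and its parameters are hypothetical names introduced here for illustration.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (Exception,),
) -> T:
    """Hypothetical helper: await call(), retrying with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            # Delay doubles each attempt (base, 2x, 4x, ...) plus up to 10% jitter
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay * (1 + random.random() * 0.1))
    raise RuntimeError("unreachable")  # loop always returns or raises
```

To wrap the context lookup, pass a zero-argument callable, for example `await with_retries(lambda: get_enterprise_context(project, topic), retry_on=(httpx.TransportError,))`. Restricting `retry_on` to transport errors avoids retrying requests that failed for reasons a retry cannot fix, such as a bad token.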

Multi-tenant RAG with project context

Different teams get different context while sharing the same vector infrastructure.

async def enterprise_rag(
    user_query: str,
    team_id: str,
    project_id: str
):
    """RAG with team and project isolation."""

    # Get team-specific project context
    project_context = await get_enterprise_context(
        project=f"team_{team_id}_{project_id}",
        topic="architecture_decisions"
    )
    compliance_rules = await get_enterprise_context(
        project=f"team_{team_id}_{project_id}",
        topic="compliance_requirements"
    )

    # Search Milvus with partition (team isolation).
    # embed() is your own embedding function; it is not defined in this walkthrough.
    search_params = {"metric_type": "IP", "params": {"nprobe": 16}}
    results = collection.search(
        data=[embed(user_query)],
        anns_field="embeddings",
        param=search_params,
        limit=10,
        partition_names=[f"team_{team_id}"],  # Team-level isolation
        output_fields=["text", "doc_type", "last_updated"]
    )

    # Filter results based on compliance context.
    # filter_by_compliance() is an application-specific helper that you supply.
    filtered_docs = filter_by_compliance(results, compliance_rules)

    return {
        "documents": filtered_docs,
        "project_context": project_context,
        "team_id": team_id
    }
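
`filter_by_compliance` is called above but never defined, so here is one possible shape for it. This is a sketch under strong assumptions, not Stompy's or Milvus's behavior: it assumes the hits have already been flattened into plain dicts carrying the `doc_type` and `last_updated` output fields, that `last_updated` is an ISO date string, and that the compliance context uses `key: value` lines with the hypothetical keys `blocked_doc_type` and `min_last_updated`.

```python
from datetime import date


def parse_rules(compliance_rules: str) -> dict:
    """Parse 'key: value' lines (an assumed format) into a rules dict."""
    rules = {"blocked_doc_types": set(), "min_last_updated": None}
    for line in compliance_rules.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if not sep or not value:
            continue  # skip free text and empty values
        if key == "blocked_doc_type":
            rules["blocked_doc_types"].add(value)
        elif key == "min_last_updated":
            rules["min_last_updated"] = date.fromisoformat(value)
    return rules


def filter_by_compliance(docs: list[dict], compliance_rules: str) -> list[dict]:
    """Drop docs whose type is blocked or that predate the cutoff."""
    rules = parse_rules(compliance_rules)
    cutoff = rules["min_last_updated"]
    kept = []
    for doc in docs:
        if doc.get("doc_type") in rules["blocked_doc_types"]:
            continue
        if cutoff and date.fromisoformat(doc["last_updated"]) < cutoff:
            continue
        kept.append(doc)
    return kept
```

Lines that do not match a known key are ignored, so free-form notes in the stored context do not break filtering. If your compliance context is unstructured prose, a rule format like this would have to be agreed on when the context is saved.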

Track enterprise search analytics

Aggregate insights across teams while respecting isolation boundaries.

async def save_enterprise_insights(
    team_id: str,
    project_id: str,
    insights: dict
):
    """Save team-specific insights to Stompy."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://mcp.stompy.ai/sse",
            headers={"Authorization": f"Bearer {os.environ['STOMPY_TOKEN']}"},
            json={
                "tool": "lock_context",
                "topic": "search_analytics",
                "project": f"team_{team_id}_{project_id}",
                "content": f"""Enterprise Search Analytics:
Team: {team_id}
Project: {project_id}
Period: {insights['period']}

Top Query Categories: {insights['top_categories']}
Most Retrieved Doc Types: {insights['top_doc_types']}
Average Relevance Score: {insights['avg_relevance']}

Recommendation: {insights['recommendation']}""",
                "tags": "enterprise,analytics,milvus"
            }
        )
        response.raise_for_status()  # fail loudly if the save was rejected

# Example: Weekly analytics save
# A bare top-level `await` only works in a notebook or async REPL,
# so a script needs asyncio.run().
import asyncio

asyncio.run(save_enterprise_insights(
    team_id="platform-team",
    project_id="auth-service",
    insights={
        "period": "2024-W48",
        "top_categories": ["authentication", "authorization", "SSO"],
        "top_doc_types": ["api_spec", "architecture_decision"],
        "avg_relevance": 0.82,
        "recommendation": "Add more OAuth2 flow documentation"
    }
))
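
The example IDs above contain hyphens. As far as we know, Milvus restricts partition names to letters, digits, and underscores, so an ID like `platform-team` would likely be rejected if used directly in a partition name; check the naming rules for your Milvus version. One way to keep the Milvus partition and the Stompy project key aligned is to derive both from the same sanitized IDs. `tenant_names` below is a hypothetical helper, not part of either API.

```python
import re


def _slug(value: str) -> str:
    """Lowercase and replace runs of other characters with an underscore."""
    slug = re.sub(r"[^0-9a-z_]+", "_", value.lower()).strip("_")
    if not slug:
        raise ValueError(f"cannot derive a name from {value!r}")
    return slug


def tenant_names(team_id: str, project_id: str) -> tuple[str, str]:
    """Return (milvus_partition_name, stompy_project_key) for a tenant."""
    team, project = _slug(team_id), _slug(project_id)
    return f"team_{team}", f"team_{team}_{project}"


partition, project_key = tenant_names("platform-team", "auth-service")
print(partition)    # team_platform_team
print(project_key)  # team_platform_team_auth_service
```

Sanitizing can map distinct IDs to the same name (`a-b` and `a_b` both become `a_b`), so a real deployment would want a uniqueness check or a stable mapping table alongside it.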

What You Get

Multi-tenant memory: Project isolation that matches your team structure and Milvus partitions
Compliance-aware: Enterprise rules and requirements inform search results automatically
Audit trail: Track which context influenced which responses for governance requirements
Cross-project insights: Learn from patterns across teams while respecting boundaries
Enterprise-grade: Both Milvus and Stompy are built for the reliability enterprises demand

Ready to give Milvus a memory?

Join the waitlist and be the first to know when Stompy is ready. Your Milvus projects will never forget again.
