Memory (RAG)

Simpler’s RAG (Retrieval-Augmented Generation) system uses MongoDB’s vector search capabilities to store knowledge and retrieve it by cosine similarity, so each query is matched against the most relevant stored information. When documents are uploaded, they are converted into vector embeddings and stored in the MongoDB vector database. API connections with enterprise systems produce dynamic vectors that are updated in real time whenever the source data changes.
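To make the ranking step concrete, here is a minimal sketch of cosine similarity over a handful of stored chunks. It is written in plain Python for illustration only: the function names, the toy three-dimensional embeddings, and the sample chunk texts are all assumptions, and in the system described here the equivalent search would run inside MongoDB rather than in application code.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every chunk against the query and return the k best matches."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical chunks with toy 3-dimensional embeddings.
# Real embeddings typically have hundreds of dimensions or more.
CHUNKS = [
    ("Refund policy: 30 days", [0.9, 0.1, 0.0]),
    ("Office opening hours", [0.0, 0.2, 0.9]),
    ("Returns require a receipt", [0.8, 0.3, 0.1]),
]

if __name__ == "__main__":
    for text, score in top_k([1.0, 0.0, 0.0], CHUNKS, k=2):
        print(f"{score:.3f}  {text}")
```

Cosine similarity compares direction rather than magnitude, so two texts of very different length can still score as close matches when their embeddings point the same way.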

The RAG memory operates through a tool-use pattern rather than loading all available context into every message. The AI sends a request to the vector database only when specific information is needed, instead of pre-loading massive contexts that would slow down responses. When a query requires knowledge retrieval, the system converts the query into a vector and runs a cosine similarity search in MongoDB to identify the most relevant document fragments (chunks) or API data points.
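The flow described above can be sketched roughly as follows. This is not Simpler's implementation: the tool name `search_memory`, the shape of `model_output`, the index name `vector_index`, and the field name `embedding` are all hypothetical. The sketch also assumes MongoDB Atlas Vector Search and its `$vectorSearch` aggregation stage, where the similarity function (cosine here) is set in the index definition rather than in the query.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence

# Hypothetical tool schema advertised to the model (illustrative only).
SEARCH_MEMORY_TOOL: Dict[str, Any] = {
    "name": "search_memory",
    "description": "Look up relevant chunks in the knowledge base.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def build_vector_search_pipeline(
    query_vector: Sequence[float], limit: int = 5
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline around the $vectorSearch stage.

    The index name and embedding field are assumptions for this sketch.
    """
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",      # assumed index name
                "path": "embedding",          # assumed embedding field
                "queryVector": list(query_vector),
                "numCandidates": limit * 20,  # candidates examined before ranking
                "limit": limit,
            }
        },
        {"$project": {"_id": 0, "text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


def handle_model_turn(
    model_output: Dict[str, Any],
    embed: Callable[[str], Sequence[float]],
    run_pipeline: Callable[[List[Dict[str, Any]]], List[Dict[str, Any]]],
) -> Optional[List[Dict[str, Any]]]:
    """Query the vector store only when the model requested the tool."""
    call = model_output.get("tool_call")
    if not call or call.get("name") != SEARCH_MEMORY_TOOL["name"]:
        return None  # no retrieval needed for this turn
    query_vector = embed(call["arguments"]["query"])
    return run_pipeline(build_vector_search_pipeline(query_vector))
```

With PyMongo, `run_pipeline` could be something like `lambda p: list(collection.aggregate(p))`, and `embed` would wrap whichever embedding model produced the stored vectors. Passing both in as callables keeps the dispatch logic testable without a live database.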

This approach lets the system maintain large knowledge repositories, drawn from both uploaded files and live API integrations, without loading them into the AI model’s context window. The tool-use mechanism retrieves only the relevant information and adds it to the conversation at the moment it is needed, which keeps responses efficient while giving access to knowledge sources far larger than the context window could hold.
