Recipe: Building Your Own RAG Assistant with LFS Workflow, pgvector, and Ollama

The AI assistant on our website is not just a demo. It's a real example of how you can use LFS Workflow to build powerful, intelligent tools inside your company. Today we're sharing our 'recipe' and showing how our platform's architecture is perfectly suited to building modern RAG (Retrieval-Augmented Generation) systems.
What is RAG and Why is it Important?
A Large Language Model (LLM) by itself knows nothing about your business. RAG is an architecture that lets you 'ground' the LLM in your own private data. Instead of putting your question to a generic ChatGPT, you ask your own AI, one that knows your internal regulations, contracts, and knowledge bases.
The Three Pillars of Our Implementation:
- Knowledge Base (`pgvector`): We used the `pgvector` extension for PostgreSQL, which stores documents not just as text but also as mathematical vectors (embeddings). This makes it possible to instantly find the most relevant pieces of information by the query's meaning, not just its keywords (see the schema sketch after this list).
- The Brain (`Ollama + LLM`): We use Ollama to run large language models (like Llama 3) locally, on our own servers. This ensures that confidential data never leaves our perimeter.
- The Nervous System (`LFS Workflow + n8n`): This is the key element that ties everything together into a single process.
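
To make the first pillar concrete, here is a minimal Python sketch of what the `pgvector` side can look like. It is illustrative, not our production schema: the `docs` table, the connection string, and the 768-dimension size (typical for embedding models such as nomic-embed-text) are all assumptions you would adjust to your own setup.

```python
import psycopg2

# Connection string is illustrative; point it at your own database.
conn = psycopg2.connect("dbname=knowledge_base user=postgres")

with conn, conn.cursor() as cur:
    # One-time setup (requires sufficient privileges): enable pgvector
    # and create a table that keeps each fragment's text next to its
    # embedding. 768 dimensions matches models like nomic-embed-text;
    # adjust to whatever your embedding model produces.
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id        serial PRIMARY KEY,
            content   text NOT NULL,
            embedding vector(768)
        )
    """)

def top_fragments(query_vector, k=5):
    """Return the k fragments whose embeddings are closest to the
    query vector, using pgvector's cosine-distance operator <=>."""
    literal = "[" + ",".join(str(x) for x in query_vector) + "]"
    with conn, conn.cursor() as cur:
        cur.execute(
            "SELECT content FROM docs ORDER BY embedding <=> %s::vector LIMIT %s",
            (literal, k),
        )
        return [row[0] for row in cur.fetchall()]
```

Searching by cosine distance over embeddings is what lets the assistant match 'How do dynamic roles work?' to a fragment about role assignment even when the wording differs.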
How Our AI Assistant 'Dasha' Works
When you ask a question in the chat on our website, the following process, orchestrated by LFS Workflow, is launched:
- Query: Your question ('How do dynamic roles work?') arrives at a Webhook in n8n.
- Retrieval: n8n passes your question to our embedding service, which converts it into a vector. This vector then goes to the `pgvector` database, which finds the 3-5 most relevant text fragments from our LFS Workflow knowledge base.
- Augmentation: n8n formulates a final prompt for the LLM, which looks something like this: 'Based ONLY on this context: [inserts the found fragments], answer the user's question: [inserts your question]'.
- Generation: The augmented prompt is sent to Ollama, and the LLM generates a response based exclusively on the relevant data it was given; the sketch below strings these four steps together.
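
In production these steps run as n8n nodes orchestrated by LFS Workflow, but the same logic fits in a short Python sketch. Treat it as an illustration under assumptions: the model names, the prompt wording, and the `top_fragments()` helper from the previous sketch are placeholders, while the `/api/embeddings` and `/api/generate` calls follow Ollama's standard HTTP API.

```python
import requests

OLLAMA = "http://localhost:11434"  # Ollama's default local port

def answer(question: str) -> str:
    # Retrieval: convert the question into a vector with a local
    # embedding model, then pull the closest fragments from pgvector
    # (top_fragments() is the helper from the previous sketch).
    emb = requests.post(
        f"{OLLAMA}/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": question},
    ).json()["embedding"]
    context = "\n---\n".join(top_fragments(emb, k=5))

    # Augmentation: wrap the fragments and the question in a prompt
    # that instructs the model to stay inside the given context.
    prompt = (
        f"Based ONLY on this context:\n{context}\n\n"
        f"Answer the user's question: {question}"
    )

    # Generation: ask the local LLM for a single, non-streamed reply.
    resp = requests.post(
        f"{OLLAMA}/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
    )
    return resp.json()["response"]

print(answer("How do dynamic roles work?"))
```

Because both the embedding model and the LLM run behind Ollama on your own servers, no part of the question or the retrieved fragments ever leaves your perimeter.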
Why is LFS Workflow the Ideal Base for This?
You can build such a system yourself, but LFS Workflow gives you a ready, reliable infrastructure: role-based access control, a secure execution environment, a built-in archive that logs every query and response, and the flexibility to integrate with any tools. You can embed such an assistant not just on a website but directly into your internal business processes, giving your employees a powerful tool for decision-making.
Turn Theory into Practice
Learn how LFS Workflow can solve your specific business challenges. Request a free demo and get a personal consultation.
Request a Demo