Experience Report: Two Months with Our Local AI System

User avatar
Otto
Posts: 6430
Joined: Fri Oct 07, 2005 7:07 pm
Has thanked: 48 times
Been thanked: 15 times
Contact:

Experience Report: Two Months with Our Local AI System

Post by Otto »

Hello friends,
Two months after launching our local AI setup – based on Ollama running on Ubuntu with a RAG (Retrieval-Augmented Generation) extension – it’s clear: our daily work with documents has significantly changed for the better.

Our Document Management System (DMS) follows a deliberately simple structure: HTML files, JSON metadata, and organized folder hierarchies. Nothing fancy or overly complex. The key difference now is the new AI integration: all contents are automatically transferred into a vector database (Chroma), where they’re indexed and stored with context. The locally installed AI can access this data – and provides clear, accurate answers.

What has changed in practice:
The DMS became a knowledge source: Instead of just storing documents, the AI “understands” their content and can respond meaningfully.

Context-based answers: The AI doesn’t just look for keywords – it intelligently searches across documents and combines relevant information.

Fully automated integration: Every new document automatically becomes part of the knowledge network – without extra effort.

High-quality responses: The answers are precise, easy to understand, and far superior to traditional full-text search.

100% local operation: No cloud, no API fees, no internet dependency – and full control over data privacy.
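A rough sketch of what the automatic integration step looks like. The actual indexing script isn't shown here, so the folder walk, the `.json` sidecar convention, and the function name are illustrative guesses; `collection` is a Chroma collection like the one created in our RAG code:

```python
import json
from pathlib import Path

def index_documents(doc_root: str, collection) -> int:
    """Feed every text file under doc_root into the vector database.

    `collection` is a Chroma collection; a JSON file with the same base
    name, if present, is attached as metadata.
    """
    count = 0
    for doc_path in sorted(Path(doc_root).rglob("*.txt")):
        meta_path = doc_path.with_suffix(".json")
        metadata = json.loads(meta_path.read_text()) if meta_path.exists() else {}
        metadata["source"] = str(doc_path)
        # upsert, so re-indexing an edited document replaces the old entry
        collection.upsert(
            ids=[str(doc_path)],
            documents=[doc_path.read_text()],
            metadatas=[metadata],
        )
        count += 1
    return count
```

Run over the DMS export folder once at startup (or from a file-watcher), every document becomes queryable without anyone doing anything extra.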

A surprising bonus:
What really stands out is the practical side effect: Everyone in the company – whether at the front desk, in the kitchen, or in maintenance – contributes to the AI knowledge base simply by writing into the DMS. Without even realizing it. It’s what we’d call effortless knowledge management.

For developers:
Integration is refreshingly simple. You send a text request via the local HTTP API and receive a JSON response. That’s it. Existing applications can easily tap into the system without major changes – but with a huge gain in functionality.
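For example, a client call could look like this. The `/ask` endpoint path and the JSON field names (`frage`/`antwort`) are assumptions, since the server's main.py isn't shown here; adapt them to whatever your ragserver actually exposes:

```python
import json
import urllib.request

def ask(question: str, base_url: str = "http://127.0.0.1:8000") -> str:
    # Endpoint path and field names are illustrative guesses; check the
    # ragserver's main.py for the real ones.
    payload = json.dumps({"frage": question}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["antwort"]
```

From Harbour/FiveWin or any other client it's the same idea: POST a small JSON body, read a JSON answer back.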

Conclusion:
For us, combining a local DMS with AI is far from a hype. It’s a practical, forward-thinking step – especially for small businesses that want to actively use their internal knowledge instead of just storing it.

Best regards,
Otto
User avatar
Antonio Linares
Site Admin
Posts: 42844
Joined: Thu Oct 06, 2005 5:47 pm
Location: Spain
Has thanked: 181 times
Been thanked: 124 times
Contact:

Re: Experience Report: Two Months with Our Local AI System

Post by Antonio Linares »

Dear Otto,

Would you be so kind as to explain to us how to use Chroma, with some examples?

Congratulations! :)
regards, saludos

Antonio Linares
www.fivetechsoft.com
User avatar
cnavarro
Posts: 6661
Joined: Wed Feb 15, 2012 8:25 pm
Location: España
Has thanked: 9 times
Been thanked: 12 times

Re: Experience Report: Two Months with Our Local AI System

Post by cnavarro »

Dear Otto,

Reading comments like yours helps encourage us all to delve deeper into the topic of AI.
Thank you and congratulations.
Cristobal Navarro
There are two kinds of people: those who make you waste time and those who make you lose track of time
The secret of happiness is not doing what you like, but liking what you do
User avatar
leandro
Posts: 1779
Joined: Wed Oct 26, 2005 2:49 pm
Location: Colombia
Has thanked: 53 times
Been thanked: 16 times
Contact:

Re: Experience Report: Two Months with Our Local AI System

Post by leandro »

Great, Otto :D

Which version of Ollama are you using? And what machine configuration do you have to achieve this? With that configuration, does it respond quickly?

Thanks in advance for your reply.
Regards
LEANDRO AREVALO
Bogotá (Colombia)
https://hymlyma.com
https://hymplus.com/
leandroalfonso111@gmail.com
leandroalfonso111@hotmail.com

[ Turbo Incremental Link64 6.98 Embarcadero 7.70 ] [ FiveWin 24.09 ] [ xHarbour 64 bits ]
User avatar
Otto
Posts: 6430
Joined: Fri Oct 07, 2005 7:07 pm
Has thanked: 48 times
Been thanked: 15 times
Contact:

Re: Experience Report: Two Months with Our Local AI System

Post by Otto »

Dear Antonio, dear Cristobal,
thank you for the kind words.
It's not so easy for me – I'm a beginner – to explain this here, as I don't fully understand it myself yet.
But I have documented all the points.

I think, in principle, this one Python function is sufficient.

Best regards,
Otto

Code: Select all | Expand

my raglogic.py

nano ~/ollama-rag/ragserver/raglogic.py

import chromadb
from chromadb.utils import embedding_functions
import json
import requests

# Prepare Chroma
chroma_client = chromadb.Client()
collection = chroma_client.get_or_create_collection(
    name="hotelwissen",
    embedding_function=embedding_functions.DefaultEmbeddingFunction()
)


def frage_beantworten(frage: str) -> str:
    # Retrieve the three most relevant documents for the question
    result = collection.query(query_texts=[frage], n_results=3)
    kontext = " ".join(result["documents"][0])

    payload = {
        "model": "mistral",
        "prompt": f"Answer the following question based on this context:\n\n{kontext}\n\nQuestion: {frage}",
    }

    # Ollama streams one JSON object per line; collect all "response" chunks
    response = requests.post(
        "http://localhost:11434/api/generate", json=payload, stream=True
    )
    antwort = []
    for line in response.iter_lines():
        if line:
            data = json.loads(line.decode("utf-8"))
            antwort.append(data.get("response", ""))
            if data.get("done"):
                break
    return "".join(antwort) or "No answer received"

start.sh

cd ~/ollama-rag/ragserver
source venv/bin/activate
uvicorn main:app --host 127.0.0.1 --port 8000

otto-atzwanger@KI-Server:~$ nano ~/ollama-rag/ragserver/raglogic.py
otto-atzwanger@KI-Server:~$ cd ~/ollama-rag/ragserver
source venv/bin/activate
uvicorn main:app --host 127.0.0.1 --port 8000
INFO:     Started server process [6402]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
User avatar
Otto
Posts: 6430
Joined: Fri Oct 07, 2005 7:07 pm
Has thanked: 48 times
Been thanked: 15 times
Contact:

Re: Experience Report: Two Months with Our Local AI System

Post by Otto »

Dear Leandro,
I use the following hardware. The responses are really fast. I also asked the AI some questions, because we want to deploy the system directly at customer sites – including how it performs when everything is installed in a VM on existing customer systems.

✅ **CPU**: AMD Ryzen 9 7950X (16 cores / 32 threads)
- Ideal for parallel AI inference (e.g., handling multiple hotel queries simultaneously).

✅ **GPU**: RTX 3090 (24 GB VRAM)
- **Large VRAM**: comfortably fits 7B–13B models; substantially larger models (e.g., 70B) would need aggressive quantization plus CPU offloading.
- *Tip:* With 7B-class models like `mistral` or `llama3`, several models can even fit in VRAM simultaneously.

✅ **RAM**: 64 GB DDR5-6000
- Sufficient for CPU inference in case the GPU is under load.

✅ **SSD**: 2 TB PCIe 4.0
- Fast loading of large models (e.g., `nougat` for invoice recognition).

Since the **RTX 4090 was not available**, I went with the RTX 3090.

Here’s my question to the AI:

How do you install VirtualBox for Windows on a Linux PC? I need Windows Server.

The AI's answer:

Sure, I'll explain it step by step – so you want to run a virtual machine with Windows Server on a Linux computer (e.g., Ubuntu), for testing or as an RDP target. Here's the guide:

🧰 Requirements

A Linux PC with sufficient RAM (at least 8 GB recommended)

Installed Linux operating system (e.g., Ubuntu, Debian, Fedora …)

Windows Server ISO file (e.g., Windows Server 2019, 2022)

VirtualBox + Extension Pack

🔧 Step 1: Install VirtualBox
For Ubuntu/Debian:

sudo apt update
sudo apt install virtualbox
For Fedora/RHEL-based:

sudo dnf install virtualbox
Alternatively, you can get the latest version directly from Oracle’s website.

🧩 Step 2: Install Extension Pack
The Extension Pack is needed for USB 2.0/3.0, RDP, etc.

Download here: https://www.virtualbox.org/wiki/Downloads

Then install:

sudo VBoxManage extpack install <path-to-.vbox-extpack-file>
Or via the GUI: Open VirtualBox → File → Preferences → Extensions.

💿 Step 3: Create a VM for Windows Server

Start VirtualBox

Create new VM:

Name: e.g., WinServer2022

Type: Microsoft Windows

Version: Windows 2022 (64-bit), or depending on ISO

RAM: e.g., 4096 MB (or more, depending on system)

Hard disk: new virtual disk (e.g., 50 GB, VDI, dynamically allocated)

Then: Select VM → Settings → Storage → Insert ISO file as virtual CD/DVD

▶️ Step 4: Install Windows Server

Start the VM

Install Windows Server as you would on a real PC

At the end, activate Windows Server, set up RDP, etc.

🔐 Optional: Set up RDP access to the VM
If you want to access the VM via RDP:

Enable RDP in Windows Server

Set the network mode in VM settings to Bridged Adapter (so the VM gets its own IP address on the network)

Question: What is the minimum requirement for the customer's system to work reasonably well?

Answer: For reasonably smooth operation of a VirtualBox VM with Windows Server (e.g., for testing or RDP access), the host system (i.e., the customer’s Linux PC) should meet at least the following requirements:

✅ Minimum Requirements for Host PC (for productive use)

CPU: quad-core processor, e.g., Intel i5 / AMD Ryzen 5 or better

RAM: at least 16 GB (so that 4–8 GB can be allocated to the VM)

Storage: SSD with at least 100 GB free, ideally NVMe

GPU (optional): not required unless for graphics-heavy applications

Network: Gigabit LAN or stable Wi-Fi for RDP connections

BIOS/UEFI: virtualization enabled (Intel VT-x / AMD-V)

📦 Virtual Machine Configuration (minimum for Windows Server 2019/2022)

RAM (VM): 4 GB (better: 6–8 GB)

CPU (VM): 2 virtual CPUs

HDD (VM): 50–80 GB, dynamically allocated

Network mode: Bridged (for LAN-based RDP access)
⚡ What else is important?

SSD is mandatory – HDDs are too slow for virtual systems.

Cooling system: Ensure good airflow for continuous operation.

Power consumption: Mini-PCs (e.g., Intel NUC, Ryzen Mini PCs) are great if quiet and energy-efficient operation is desired.

🧪 Practical Tip
For small companies with 1–2 simultaneous RDP users, a low-cost mini-PC with Ryzen 5, 16 GB RAM, and 1 TB SSD is often sufficient. Windows Server runs stably in VirtualBox on such a system.
User avatar
Otto
Posts: 6430
Joined: Fri Oct 07, 2005 7:07 pm
Has thanked: 48 times
Been thanked: 15 times
Contact:

Re: Experience Report: Two Months with Our Local AI System

Post by Otto »

Hello friends,
There is a large .dbf file containing around 200,000 forum posts. The posts are stored in memo fields and have so far been used for a traditional full-text search.

Now these contents are to be fed into a local RAG system (Retrieval-Augmented Generation).

Each post has been exported as a separate .txt file – one file per post to allow independent processing of the texts.

Planned steps:
Automatic cleanup of the texts:
Removal of HTML markup and BBCode elements

Code: Select all | Expand

([code], <span>, &#40;, etc.)
Saving the cleaned content as plain text files

Transferring the cleaned files into the RAG document folder

Running load_docs.py to index the content using Chroma
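The cleanup step could look something like this – a minimal sketch covering only the patterns listed above (BBCode tags, HTML tags, character entities); real forum posts are messier, so the regexes would need tuning:

```python
import re

def clean_post(text: str) -> str:
    """Strip BBCode, HTML tags, and character entities, collapse whitespace."""
    text = re.sub(r"\[/?\w+(=[^\]]*)?\]", " ", text)  # BBCode like [code], [/b]
    text = re.sub(r"<[^>]+>", " ", text)              # HTML tags like <span>
    text = re.sub(r"&#\d+;", " ", text)               # numeric entities like &#40;
    text = re.sub(r"&[a-zA-Z]+;", " ", text)          # named entities like &amp;
    return re.sub(r"\s+", " ", text).strip()

print(clean_post('[code]<span>x = 1;</span>[/code] &#40;test&#41;'))  # → x = 1; test
```

Each exported .txt file would be run through this before it goes into the RAG document folder.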

I'm really curious to see the quality of the responses and how fast it will work.

Best regards,
Otto