Ahnlich Architecture V2

Status: Alpha / testing – subject to breaking changes.

Ahnlich is split into two independent, network‑accessible services that work in tandem:

  • ahnlich‑ai – the Intelligence Layer
  • ahnlich‑db – the Vector Store Layer

Clients can speak to either layer through gRPC/HTTP or the bundled CLI/SDKs. The AI layer adds automated embedding and model management on top of the raw vector store exposed by the DB layer.

📦 1. High‑Level Design

Analogy to Kafka

| Kafka | Ahnlich |
| --- | --- |
| Producer | AI Client / DB Client |
| Broker | ahnlich‑ai & ahnlich‑db services |
| Topic / Partition | Store (logical namespace) |
| Message | Vector + metadata |
| Consumer | Client fetching GetSimN |

2. Key Components

2.1 ahnlich‑ai – Intelligence Layer

| Sub‑component | Responsibility |
| --- | --- |
| AI Client API | External gRPC/HTTP endpoints – accepts raw documents (text, images…) & metadata. |
| Store Handler | Maps incoming requests to a Store; maintains per‑store configuration (models, preprocess pipeline). |
| Store | Logical namespace. Each holds a pair of ONNX models (Index & Query) plus preprocessing logic. |
| Model Node | Executes preprocessing → model inference → produces embedding. |
| Optional Persistence | Periodic snapshots of store metadata & model cache to disk. |
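The Model Node pipeline (preprocess → inference → embedding) can be sketched like this; the hash-based "model" is a stand-in so the example runs without an ONNX file, and all names here are illustrative, not Ahnlich's actual code:

```python
import hashlib

import numpy as np


def preprocess(text: str) -> str:
    # Illustrative cleanup: lowercase and collapse whitespace.
    return " ".join(text.lower().split())


def embed(text: str, dim: int = 8) -> np.ndarray:
    # Stand-in for ONNX inference: a deterministic, hash-seeded
    # pseudo-embedding so the sketch runs without a model file.
    seed = int.from_bytes(hashlib.sha256(preprocess(text).encode()).digest()[:4], "big")
    rng = np.random.default_rng(seed)
    vec = rng.normal(size=dim)
    # Unit-normalise, as is typical before cosine-similarity search.
    return vec / np.linalg.norm(vec)
```

In the real service, the `embed` step would invoke the store's Index or Query ONNX model instead of the hash trick.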

2.2 ahnlich‑db – Vector Store Layer

| Sub‑component | Responsibility |
| --- | --- |
| DB Client API | Accepts vector‑level commands: SET, GETSIMN, CREATESTORE, etc. |
| Store Handler | Routes to the correct Store; enforces isolation; coordinates concurrent reads/writes. |
| Store (Vector Index) | In‑memory index (brute‑force or KD‑Tree) plus metadata map. Supports cosine & Euclidean similarity. |
| Filter Engine | Applies boolean predicates on metadata during query. |
| Optional Persistence | Snapshots vectors & metadata to an on‑disk binary file for warm restarts. |
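The two supported metrics are straightforward to state; here is a minimal reference implementation (not Ahnlich's internal code):

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    # Higher is more similar; 1.0 for vectors pointing the same way.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def euclidean_distance(a, b) -> float:
    # Lower is more similar; 0.0 for identical vectors.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))
```

Note the opposite orientations: cosine ranks by descending score, Euclidean by ascending distance.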

3. Data Flow

3.1 Indexing (Write) Path

  1. Client ➜ AI Layer – Sends raw document + metadata.
  2. Preprocessing & Embedding – AI layer cleans input, runs Index Model to yield vector.
  3. AI ➜ DB – Issues SET carrying vector & metadata.
  4. DB Store – Writes vector into index, stores metadata.
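The four steps above can be sketched end‑to‑end with in‑memory stand‑ins (no network hop; `VectorStore`, `index_document`, and the `embed` callback are illustrative names, not Ahnlich's API):

```python
class VectorStore:
    """Minimal stand-in for an ahnlich-db Store: index + metadata map."""

    def __init__(self):
        self.vectors = {}   # key -> vector
        self.metadata = {}  # key -> metadata dict

    def set(self, key, vector, metadata):
        # Step 4: write the vector into the index, store the metadata.
        self.vectors[key] = vector
        self.metadata[key] = metadata


def index_document(store, key, document, metadata, embed):
    # Steps 1-3: raw document in, embedding out, SET issued to the DB.
    vector = embed(document)          # preprocessing + Index Model
    store.set(key, vector, metadata)  # the AI layer's SET call
```

In production the `store.set` call is a gRPC SET from ahnlich‑ai to ahnlich‑db rather than an in‑process method call.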

3.2 Similarity Query Path

  1. Client ➜ AI Layer – Provides search text/image.
  2. Embedding – AI layer runs Query Model to create search vector.
  3. AI ➜ DB (GETSIMN) – Vector + algorithm + optional predicate.
  4. DB – Computes distance, applies metadata filter, returns Top‑N IDs & scores.
  5. AI Layer – (Optional) post‑processes or joins additional metadata before responding to client.
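Steps 3–4 can be sketched as follows; the function and argument names are illustrative, and the real GETSIMN is a gRPC call into ahnlich‑db, not a local function:

```python
import numpy as np


def cosine(a, b) -> float:
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def get_sim_n(vectors, metadata, query_vec, n, predicate=None):
    """Stand-in for GETSIMN: score every vector, filter on metadata,
    return the top-N (key, score) pairs, highest score first."""
    scored = [
        (key, cosine(query_vec, vec))
        for key, vec in vectors.items()
        if predicate is None or predicate(metadata[key])
    ]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:n]
```

The predicate runs before ranking, so filtered‑out entries never compete for a Top‑N slot.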

3.3 Direct DB Access

Advanced users can bypass AI and push pre‑computed vectors directly into ahnlich‑db for maximum control.

4. Persistence & Durability

  • Opt‑in via --enable-persistence.
  • Snapshot interval configurable (--persistence-interval, default 300 s).
  • DB writes a flat binary file; AI persists model cache & store manifests.
  • On startup each service checks for the snapshot file and hydrates memory before accepting traffic.
  • No replication yet; Ahnlich currently targets single‑node or shared‑nothing sharded deployments.
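The snapshot‑then‑hydrate cycle can be sketched as below; the file format here is Python pickle purely for illustration, whereas the real DB writes its own flat binary format:

```python
import os
import pickle


def snapshot(path, vectors, metadata):
    # Periodic snapshot: serialise the index + metadata, writing to a
    # temp file first and renaming so a crash never leaves a torn file.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump({"vectors": vectors, "metadata": metadata}, f)
    os.replace(tmp, path)


def hydrate(path):
    # On startup: restore state before accepting traffic.
    # A missing snapshot just means a cold (empty) start.
    if not os.path.exists(path):
        return {}, {}
    with open(path, "rb") as f:
        state = pickle.load(f)
    return state["vectors"], state["metadata"]
```

Anything written after the last snapshot interval is lost on crash, which is why durability is described as snapshot‑level only.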

5. Scaling & Deployment Topologies

| Pattern | How it works | When to use |
| --- | --- | --- |
| Single‑Node | One ahnlich‑ai & one ahnlich‑db container (shown in the README Compose file). | Prototyping, local dev. |
| Vertical Scaling | Give the DB more RAM/CPU; mount an NVIDIA GPU for the AI layer. | Medium workloads where a single node still fits in memory. |
| Store‑Level Sharding | Run multiple DB instances, each owning a subset of Stores; fronted by one AI layer. | Multi‑tenant SaaS or very large corpora. |
| Function Sharding | Isolate heavy NLP / image pipelines by model type: one AI instance per model group. | Heterogeneous workloads, GPU scheduling. |
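One way store‑level sharding could be routed, assuming a simple hash‑modulo scheme (hypothetical; the roadmap's consistent hashing would replace this to avoid reshuffling stores when shards are added):

```python
import hashlib


def shard_for(store_name: str, num_shards: int) -> int:
    # Deterministic store -> DB instance mapping. The AI layer (or a
    # thin router in front of it) uses this to pick which ahnlich-db
    # instance owns a given Store.
    digest = hashlib.sha256(store_name.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Because a Store is the unit of ownership, all vectors in one Store land on the same DB instance, keeping GETSIMN a single‑node operation.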

Roadmap: cluster‑wide replication & consistent hashing for transparent sharding.

6. Observability

  • Both services are instrumented with OpenTelemetry; enable with --enable-tracing and export spans to a tracing backend such as Jaeger.
  • Internal metrics: query latency, index size, RAM usage, model inference time.

7. Extensibility

  • Add a new similarity metric – implement SimAlgorithm trait in ahnlich‑db.
  • Bring your own model – point ahnlich‑ai to an ONNX file or HuggingFace repo via --supported-models.
  • Custom predicates – extend the predicate DSL to support regex or full‑text.
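The actual SimAlgorithm trait lives in Rust inside ahnlich‑db; in Python terms, the shape of such a plugin is roughly the following (names and the example metric are illustrative, not part of Ahnlich):

```python
from typing import Protocol, Sequence


class SimAlgorithm(Protocol):
    """Python analogue of a one-method similarity trait."""

    def score(self, a: Sequence[float], b: Sequence[float]) -> float: ...


class Manhattan:
    """Example new metric: L1 distance (lower = more similar)."""

    def score(self, a: Sequence[float], b: Sequence[float]) -> float:
        return sum(abs(x - y) for x, y in zip(a, b))
```

A new metric only has to produce a comparable score; the store's ranking logic stays untouched.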

8. Security Considerations

There is currently no built‑in authentication. We recommend placing Ahnlich behind an API gateway or reverse proxy that enforces:

  • JWT / OAuth 2 bearer tokens.
  • Mutual TLS between AI ⇄ DB if running across hosts.

9. Limitations (July 2025)

  • No distributed consensus – durability limited to local snapshots.
  • Single‑writer per Store lock may become a bottleneck under heavy concurrent writes.
  • Model hot‑swap requires store recreation.

🔍 Summary

Ahnlich decouples vector intelligence (embedding generation, model lifecycle) from vector persistence & retrieval. This split allows you to scale and tune each layer independently while keeping a simple mental model—much like Kafka separates producers, brokers, and consumers around an immutable log.