system dossier

Kairos

Graph-guided retrieval engine for multi-hop reasoning over personal knowledge.

System Brief

Kairos keeps semantic text recall and structural graph traversal as separate primitives. The current system stores source text and vectors in PostgreSQL/pgvector, writes graph structure to Neo4j, and runs Personalized PageRank (PPR) from the Neo4j Graph Data Science (GDS) library to expand outward from dense passage seeds.

System Signals

ONNX Runtime and DJL tokenizers compute chunk, concept, and query embeddings on the JVM.

Gemini Flash extracts triples during ingestion so the graph grows with the source material.

The retrieval layer reads dense candidates and graph structure without merging those concerns into one database model.

signal

Dense retrieval and graph traversal run together because cosine similarity alone misses multi-hop questions.

Storage Model: 2

pgvector for dense search and Neo4j for graph traversal

Embedding Space: 384

all-MiniLM-L6-v2 vectors shared by chunks, concepts, and queries

Retrieval Core: PPR

Personalized PageRank seeded by dense passage candidates and hydrated through PostgreSQL

Signal Path

protected API -> context engine -> dual stores -> graph retrieval

Architecture Board

Kairos is a protected Spring Boot monolith around a context engine core. AI indexing feeds PostgreSQL/pgvector and Neo4j separately, while online retrieval combines dense anchors with graph expansion.

01 protected API

02 context engine

03 dual stores

04 graph retrieval

constraint

Dense passage recall gives useful anchors, but current retrieval still needs graph expansion for multi-hop context.

result

Kairos separates semantic storage and graph structure, then combines both signals at retrieval time.

Architecture

The main architecture is a Spring Boot monolith with identity, protected source APIs, a context engine core, AI indexing adapters, and two persistence surfaces. Ingestion feeds both stores; retrieval combines pgvector anchors with Neo4j GDS expansion before PostgreSQL hydrates the final chunk payloads.
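The retrieval flow described above (dense anchors, graph expansion, then hydration) can be sketched as three narrow interfaces composed in one method. This is a minimal sketch, not the Kairos code: `DenseIndex`, `GraphExpander`, `ChunkStore`, and `ScoredId` are hypothetical names introduced here, standing in for the pgvector query, the Neo4j GDS call, and the PostgreSQL lookup respectively.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class RetrievalPipelineSketch {

    /** A chunk id with a relevance score. Hypothetical type for illustration. */
    record ScoredId(String id, double score) {}

    /** Assumed stand-in for the pgvector nearest-neighbour query. */
    interface DenseIndex {
        List<ScoredId> nearestChunks(float[] queryVector, int k);
    }

    /** Assumed stand-in for graph expansion seeded by the dense anchors. */
    interface GraphExpander {
        List<ScoredId> expand(List<ScoredId> seeds, int limit);
    }

    /** Assumed stand-in for loading chunk text from PostgreSQL. */
    interface ChunkStore {
        Map<String, String> loadText(List<String> chunkIds);
    }

    static List<String> retrieve(float[] queryVector,
                                 DenseIndex dense,
                                 GraphExpander graph,
                                 ChunkStore chunks,
                                 int seedCount,
                                 int resultCount) {
        // 1. Dense recall supplies the anchors.
        List<ScoredId> seeds = dense.nearestChunks(queryVector, seedCount);
        // 2. Graph expansion re-ranks and adds multi-hop neighbours.
        List<ScoredId> ranked = graph.expand(seeds, resultCount);
        // 3. Hydrate the final ids into text, keeping the graph ranking order.
        List<String> ids = ranked.stream().map(ScoredId::id).toList();
        Map<String, String> text = chunks.loadText(ids);
        return ids.stream().map(text::get).filter(Objects::nonNull).toList();
    }
}
```

Keeping the three steps behind separate interfaces mirrors the dual-store split: each store can be swapped or stubbed in tests without touching the composition.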

Key Decisions

Dual-store retrieval over one generalized database

One generalized store forces either graph traversal or dense similarity search into a weaker execution model. Kairos keeps pgvector for vector search and Neo4j for graph traversal because HippoRAG 2 needs both primitives natively.

HippoRAG 2 over cosine-only retrieval

Cosine-only retrieval misses questions that require a chain of concepts spread across different passages. HippoRAG 2 seeds Personalized PageRank from semantic anchors, so relevance can move through the graph before the final ranker chooses context.
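To make the seeding idea concrete, here is a small in-memory power-iteration sketch of Personalized PageRank. It is an illustration only: Kairos runs PPR through Neo4j GDS, and the class name, the damping value, and the toy graph in the usage note are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PersonalizedPageRankSketch {

    /**
     * graph: node -> outgoing neighbours.
     * seeds: node -> restart weight (any positive scale, normalized here).
     */
    static Map<String, Double> rank(Map<String, List<String>> graph,
                                    Map<String, Double> seeds,
                                    double damping,
                                    int iterations) {
        double seedTotal = 0.0;
        for (double w : seeds.values()) seedTotal += w;
        if (seedTotal <= 0.0) throw new IllegalArgumentException("seeds must carry positive weight");

        // Collect every node, including ones that only appear as targets or seeds.
        Set<String> nodes = new HashSet<>(graph.keySet());
        for (List<String> targets : graph.values()) nodes.addAll(targets);
        nodes.addAll(seeds.keySet());

        // The restart distribution concentrates all teleport mass on the seeds.
        Map<String, Double> restart = new HashMap<>();
        for (String n : nodes) restart.put(n, seeds.getOrDefault(n, 0.0) / seedTotal);

        Map<String, Double> scores = new HashMap<>(restart);
        for (int i = 0; i < iterations; i++) {
            Map<String, Double> next = new HashMap<>();
            for (String n : nodes) next.put(n, 0.0);
            double dangling = 0.0;
            for (String n : nodes) {
                List<String> out = graph.getOrDefault(n, List.of());
                double s = scores.get(n);
                if (out.isEmpty()) {
                    dangling += s; // nodes without out-edges return their mass to the seeds
                    continue;
                }
                double share = s / out.size();
                for (String target : out) next.merge(target, share, Double::sum);
            }
            for (String n : nodes) {
                double walked = next.get(n) + dangling * restart.get(n);
                next.put(n, (1.0 - damping) * restart.get(n) + damping * walked);
            }
            scores = next;
        }
        return scores;
    }
}
```

With a toy chain `alice -> acme -> berlin` seeded only at `alice`, `berlin` receives a nonzero score even though it is two hops from the seed, while a disconnected node stays at zero. That is the behaviour cosine-only ranking cannot produce.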

ONNX Runtime on the JVM over a Python sidecar

A Python sidecar adds another runtime, another deployment unit, and another failure boundary to every embedding call. ONNX Runtime plus DJL tokenizers keeps chunk, node, and query embeddings inside the JVM with one semantic space.
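One practical consequence of a single semantic space is that chunk, concept, and query vectors can be compared directly. The sketch below shows that comparison with plain arrays; it deliberately does not call ONNX Runtime or DJL, and the class name and the dimension guard are assumptions for illustration (the 384 figure comes from the embedding space stated earlier).

```java
final class SharedEmbeddingSpaceSketch {

    /** Dimensionality of the shared space, as stated for all-MiniLM-L6-v2 vectors. */
    static final int DIMENSIONS = 384;

    /** Cosine similarity between two vectors that live in the same space. */
    static double cosine(float[] a, float[] b) {
        if (a.length != DIMENSIONS || b.length != DIMENSIONS) {
            throw new IllegalArgumentException("expected " + DIMENSIONS + "-dimensional vectors");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < DIMENSIONS; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // a zero vector has no direction, so treat it as unrelated
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

The dimension guard is the cheap part of the guarantee; the more important part, that every vector comes from the same model and tokenizer, is what keeping inference inside one JVM code path is meant to protect.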

LLM-powered OpenIE over manual relation entry

Manual relation entry does not scale with passive knowledge accumulation and leaves the graph stale unless the user curates it constantly. Gemini Flash extracts triples during ingestion so the graph grows with the content itself.
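How extracted triples make the graph grow can be sketched with an in-memory adjacency map. This is only an illustration under stated assumptions: the `Triple` record, the lowercase normalization, and the undirected adjacency are hypothetical choices for this example, and the actual Gemini Flash call and the Neo4j write are not shown.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TripleGraphSketch {

    /** One extracted (subject, predicate, object) statement. Hypothetical shape. */
    record Triple(String subject, String predicate, String object) {}

    /** Assumed normalization so "Acme" and " acme " resolve to the same concept node. */
    static String normalize(String concept) {
        return concept.trim().toLowerCase(Locale.ROOT);
    }

    /** Adds each triple as an undirected edge between its two concepts. */
    static Map<String, Set<String>> merge(Map<String, Set<String>> graph, List<Triple> triples) {
        for (Triple t : triples) {
            String subject = normalize(t.subject());
            String object = normalize(t.object());
            if (subject.isEmpty() || object.isEmpty()) {
                continue; // skip malformed extractions instead of creating blank nodes
            }
            graph.computeIfAbsent(subject, k -> new TreeSet<>()).add(object);
            graph.computeIfAbsent(object, k -> new TreeSet<>()).add(subject);
        }
        return graph;
    }
}
```

Merging the triples from two separate notes, one mentioning "Alice works at Acme" and another "Acme is based in Berlin", joins them at the shared `acme` node. That shared node is what later lets graph traversal connect passages that never mention each other.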

Trade-offs Accepted

Two persistence stores with separate operational concerns

pgvector and Neo4j need different backup, monitoring, and tuning paths. Kairos accepts that operational split because retrieval quality depends on having both stores do the work they are good at.

Longer ingestion time from graph construction and triple embedding

Graph creation adds extraction and embedding work before a note is fully indexed. That delay stays acceptable because Kairos optimizes for query quality after ingestion, not instant writes.

External extraction dependency during ingestion

Gemini Flash adds a network dependency and explicit failure modes to triple extraction. The trade-off remains acceptable because the extracted graph is materially better than the local alternatives tested so far.
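One way to keep those failure modes explicit is to return an outcome value from each extraction attempt instead of letting exceptions escape into the ingestion path. The sketch below is an assumption about how that could look, not a description of the Kairos implementation: the `Outcome` types, the retryable flag, and the choice of `UncheckedIOException` as the network signal are all hypothetical, and the extractor is passed in as a plain function so no real Gemini client API is implied.

```java
import java.io.UncheckedIOException;
import java.util.List;
import java.util.function.Function;

final class ExtractionOutcomeSketch {

    record Triple(String subject, String predicate, String object) {}

    /** Result of one extraction attempt. Hypothetical types for illustration. */
    interface Outcome {}
    record Extracted(List<Triple> triples) implements Outcome {}
    record Failed(String reason, boolean retryable) implements Outcome {}

    /** Runs the supplied extractor and converts exceptions into explicit outcomes. */
    static Outcome attempt(String chunkText, Function<String, List<Triple>> extractor) {
        try {
            return new Extracted(List.copyOf(extractor.apply(chunkText)));
        } catch (UncheckedIOException e) {
            // Assumed convention: I/O problems are transient and worth retrying.
            return new Failed("network: " + e.getMessage(), true);
        } catch (RuntimeException e) {
            return new Failed("extractor: " + e.getMessage(), false);
        }
    }

    /** Turns an outcome into a short status line for logs. */
    static String describe(Outcome outcome) {
        if (outcome instanceof Extracted extracted) {
            return "indexed " + extracted.triples().size() + " triples";
        }
        if (outcome instanceof Failed failed && failed.retryable()) {
            return "retry later: " + failed.reason();
        }
        if (outcome instanceof Failed failed) {
            return "skipped graph update: " + failed.reason();
        }
        return "unknown outcome";
    }
}
```

Under this assumption, a failed extraction delays only the graph update for that chunk; whether the text and vector writes proceed independently is a separate design choice the source does not specify.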

Stack

active: Java 21, Spring Boot, PostgreSQL, pgvector, Neo4j, ONNX Runtime, DJL, Gemini Flash, Docker