RAGTAG · how we got here

Taxxa’s chat, then ours.

A short story. RAGTAG (Retrieval Augmented Graph Tax Answer Generator) answers Finnish tax questions with citations across Finlex, Vero and KHO.

i. the constraint

It started with a list.

Before any code, the Taxxa team sat us down on 23.05. Seven points, each a hard constraint.

  1. €60 per user per month. Queries have to be cheap.
  2. Connect Finlex, Vero, case law. EU-lex is out of scope.
  3. Case law refers to Finlex. Vero is just an interpreter. Case law can override Vero.
  4. Can’t bring 1,000 chunks per question. 25M-page DB.
  5. Timeline matters. Active now, not then, not later.
  6. Run RAG locally first. DeepSeek is good and cheap.
  7. Reference extraction by regex / NLP is a good idea. They aren’t doing it.

ii. the cathedral

We tried to build too much.

Our first sketch: a bitemporal knowledge graph on Neo4j with the SAT-Graph RAG schema (arXiv:2505.00039). BGE-M3 hybrid retrieval, ColBERT (SIGIR 2020), Reciprocal Rank Fusion (SIGIR 2009). A courtroom debate from AgenticSimLaw (arXiv:2601.21936). A 63k-node GPU constellation. SPARQL fallback, CRAG-style (arXiv:2401.15884), with Self-RAG reflection tokens (ICLR 2024).

Two days in, zero answers. The cathedral lost against chat #01, #03 and #04 before it ever resolved a conflict.

iii. the pivot

Chat #07 was the unlock.

Taxxa said reference extraction by regex was a good idea and they weren’t doing it. We flipped the build order. Three deterministic passes over HTML: structural (heading tree), anchor (<a href>) and regex (text citations). The graph fell out automatically; one Verifier comparing an integer rank replaced three agents arguing.
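The three passes can be sketched with the Python standard library alone. This is a minimal illustration, not the project's extractor: the `PassCollector` class, the `run_passes` function and the citation pattern are all assumptions made for the example, and real Finnish citation formats vary far more than this one regex covers.

```python
import re
from html.parser import HTMLParser

# Hypothetical pattern: matches short citations such as "EVL 8 §" or "TVL 31 a §".
CITATION_RE = re.compile(r"\b([A-ZÅÄÖ][A-Za-zÅÄÖåäö]{1,10})\s+(\d+\s?[a-z]?)\s?§")

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PassCollector(HTMLParser):
    """Collects headings, link targets and plain text in one parse."""

    def __init__(self):
        super().__init__()
        self.headings = []      # structural pass: (level, heading text)
        self.links = []         # anchor pass: href values
        self.text_parts = []    # input for the regex pass
        self._level = None      # heading level currently open, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._buffer = []
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._buffer).strip()))
            self._level = None

    def handle_data(self, data):
        self.text_parts.append(data)
        if self._level is not None:
            self._buffer.append(data)


def run_passes(html):
    """Run the structural, anchor and regex passes over one HTML document."""
    collector = PassCollector()
    collector.feed(html)
    collector.close()
    text = " ".join(collector.text_parts)
    citations = [
        (m.group(1), m.group(2).replace(" ", ""))
        for m in CITATION_RE.finditer(text)
    ]
    return {
        "structural": collector.headings,
        "anchor": collector.links,
        "regex": citations,
    }
```

Each pass yields candidate edges without calling a model: headings give parent/child structure, hrefs give explicit links, and regex hits give textual citations that still need resolving to a section.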

iv. ragtag

Ten small pieces.

Graph in SQLite (1.97M nodes, 2.18M edges). Chunks in LanceDB on the filesystem, embedded by Voyage voyage-3-large (1024-dim multilingual). Section-anchored chunking, six-preset strategy router, bounded BFS with hub-skip caps, bge-reranker-v2-m3 cross-encoder. Temporal correctness is a per-section version_chain plus a deterministic check_temporal_mismatches built on difflib. Authority is one integer: Finlex 100, Treaty 90, KHO 80, Vero 60. Generation runs on DeepSeek-V4-Pro via Featherless, with query rewrites cached in process.
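The "authority is one integer" idea is small enough to show in full. The sketch below uses the four ranks given above; the `Claim` record, the lowercase source keys and the `resolve_conflict` function are hypothetical names chosen for the example, not the project's Verifier.

```python
from dataclasses import dataclass

# Ranks as stated in the text; the lowercase keys are an assumption.
AUTHORITY = {"finlex": 100, "treaty": 90, "kho": 80, "vero": 60}


@dataclass(frozen=True)
class Claim:
    """Hypothetical record for one retrieved statement."""
    source: str
    section_id: str
    text: str


def resolve_conflict(claims):
    """Return (winner, overruled) by comparing integer authority ranks."""
    ranked = sorted(claims, key=lambda c: AUTHORITY[c.source], reverse=True)
    return ranked[0], ranked[1:]
```

For example, a KHO ruling (80) paired with a conflicting Vero guidance note (60) resolves to the ruling, which matches constraint 3 in the list from the chat.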

v. the architecture

How it actually fits together.

  Sources: Finlex · Vero · KHO · Treaty (HTML)
  Three deterministic passes: structural · anchor · regex (no model in this batch path)
  SQLite graph (nodes · edges): 1.97M nodes · 2.18M edges
  LanceDB vectors: Voyage voyage-3-large · 402k chunks · 1024-dim
  Graph and vectors are joined by section_id · available to query

  Question: Finnish or English
  Embed & search: Voyage encodes → LanceDB returns top seeds
  Cross-encoder rerank: bge-reranker-v2-m3 · 0.6 CE + 0.3 cos + 0.1 meta
  Strategy router + BFS: six presets · 1 to 2 hops over the SQLite graph
  Hub-skip caps: interprets_in > 30, cites_out > 15, parent_of_in > 50
  Draft & answer
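Two steps of the query path are simple enough to sketch: the blended rerank score and the bounded BFS with hub-skip caps. The weights and caps come from the diagram above; everything else is an assumption for illustration, including the function names, the in-memory edge list (the real graph lives in SQLite), the choice to walk edges in both directions, and the reading of "hub-skip" as "reach the hub but do not expand through it".

```python
from collections import Counter, defaultdict, deque

# Caps from the diagram: (edge type, direction) -> maximum degree before skipping.
HUB_CAPS = {("interprets", "in"): 30, ("cites", "out"): 15, ("parent_of", "in"): 50}


def blended_score(ce, cos, meta):
    """Weighted blend: 0.6 cross-encoder + 0.3 cosine + 0.1 metadata."""
    return 0.6 * ce + 0.3 * cos + 0.1 * meta


def bounded_bfs(edges, seeds, max_hops=2):
    """Expand seeds up to max_hops, never expanding through hub nodes.

    edges is an iterable of (src, edge_type, dst) triples.
    Returns a dict mapping each reached node to its hop distance.
    """
    degree = Counter()
    neighbours = defaultdict(set)
    for src, edge_type, dst in edges:
        degree[(src, edge_type, "out")] += 1
        degree[(dst, edge_type, "in")] += 1
        neighbours[src].add(dst)
        neighbours[dst].add(src)  # assumption: traverse edges in both directions

    def is_hub(node):
        return any(
            degree[(node, edge_type, direction)] > cap
            for (edge_type, direction), cap in HUB_CAPS.items()
        )

    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] >= max_hops or is_hub(node):
            continue  # hop budget spent, or node fans out too widely
        for nxt in sorted(neighbours[node]):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops
```

The caps are what keep a two-hop expansion small: a node that cites hundreds of sections would otherwise pull most of the corpus into the context for every question that touches it.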

Built from scratch on purpose. The unique part is the architecture, not any one model. Each layer is small enough to debug and replace.

vi. ai act ready, by accident

The graph is auditable by construction.

A future EU AI Act review asks “how did the system reach this answer?” RAGTAG answers that without extra work. Every cite ships a RetrievalPath. Every stale chunk ships an AmendmentCaveat. Every conflict ships an integer-rank resolution. None of this is live on Taxxa today; the demo is what a compliance-ready future could look like.
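As a rough idea of how a stale chunk could produce a caveat, here is a sketch built on `difflib.SequenceMatcher`. It is a guess at the general shape only: the fields of `AmendmentCaveat`, the `check_stale` function and the 0.98 threshold are assumptions for the example and are not taken from the project's check_temporal_mismatches.

```python
import difflib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AmendmentCaveat:
    """Hypothetical caveat attached to a citation whose text has changed."""
    section_id: str
    similarity: float
    note: str


def check_stale(section_id: str, chunk_text: str, current_text: str,
                threshold: float = 0.98) -> Optional[AmendmentCaveat]:
    """Compare the retrieved chunk with the text currently in force.

    Returns None when the two are near-identical, otherwise a caveat.
    The threshold is an illustrative value, not a tuned one.
    """
    ratio = difflib.SequenceMatcher(None, chunk_text, current_text).ratio()
    if ratio >= threshold:
        return None
    return AmendmentCaveat(
        section_id=section_id,
        similarity=round(ratio, 3),
        note="Cited text differs from the version in force; check the amendment history.",
    )
```

Because the comparison is deterministic, the same pair of texts always yields the same caveat, which is what makes it usable as audit evidence.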

EU-lex itself is out of scope today (chat #02), but adding it later is new corpus, not new infrastructure. The transposes edge type is already in the schema, the authority lattice extends with one number, and the strategy router treats it as another cross_source route.

vii. two things we learned

The graph paid off twice.

Mojibake recovered through the graph

About 1.7% of chunks were double-encoded; the HTML sniffer mis-read UTF-8 as Latin-1. We traced RAG hits back to source files, forced UTF-8 at the parse layer, and re-embedded only the affected slice.
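The repair itself is a standard round trip: re-encode the text as Latin-1 to recover the original bytes, then decode them as UTF-8. The sketch below shows that trick and a helper for picking out the affected slice; the function names and the `(chunk_id, text)` shape are assumptions, and this is not the content of the script referenced below.

```python
def repair_mojibake(text):
    """Undo UTF-8 that was mis-decoded as Latin-1; return text unchanged if clean."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Latin-1, or the bytes are not
        # valid UTF-8. In both cases it was not double-encoded this way.
        return text


def find_corrupted(chunks):
    """Return (chunk_id, repaired_text) for chunks that change under repair.

    chunks is an iterable of (chunk_id, text) pairs (hypothetical shape).
    """
    repaired = []
    for chunk_id, text in chunks:
        fixed = repair_mojibake(text)
        if fixed != text:
            repaired.append((chunk_id, fixed))
    return repaired
```

Correctly encoded Finnish text passes through untouched, because a lone Latin-1 byte such as the one for "ä" is not valid UTF-8 and the decode step fails. Only the chunks returned by `find_corrupted` would need re-embedding.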

scripts/reingest_corrupted_chunks.py

Not every tax question is in the law

Eval question N49 asks for the common account-number range for myyntisaamiset and ostovelat. Our system returned the correct legal answer: no universally binding range exists. The question-bank reference traces to KILA practice, not Finlex.

eval/questions.json · question N49

viii. burning questions

What we get asked most.