# Architecture
Vectorless transforms documents into hierarchical semantic trees and uses LLM-powered reasoning to navigate them. This page describes the end-to-end pipeline.
## High-Level Flow
```
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Document   │────▶│    Index     │────▶│   Storage    │
│  (PDF/MD)    │     │   Pipeline   │     │    (Disk)    │
└──────────────┘     └──────────────┘     └──────┬───────┘
                            ┌────────────────────┘
┌──────────────┐     ┌──────▼───────┐
│    Result    │◀────│  Retrieval   │
│  (Evidence)  │     │   Pipeline   │
└──────────────┘     └──────────────┘
```
## Index Pipeline
The indexing pipeline processes documents through ordered stages:
| Stage | Priority | Description |
|---|---|---|
| Parse | 10 | Parse document into raw nodes (Markdown headings, PDF pages) |
| Build | 20 | Construct arena-based tree with thinning and content merge |
| Validate | 22 | Tree integrity checks |
| Split | 25 | Split oversized leaf nodes (>4000 tokens) |
| Enhance | 30 | Generate LLM summaries (Full, Selective, or Lazy strategy) |
| Enrich | 40 | Calculate metadata, page ranges, resolve cross-references |
| Reasoning Index | 45 | Build keyword-to-node mappings, synonym expansion, summary shortcuts |
| Navigation Index | 50 | Build NavEntry + ChildRoute data for agent navigation |
| Optimize | 60 | Final tree optimization |
Each stage is independently configurable. The pipeline supports incremental re-indexing via content fingerprinting.
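The stage ordering and fingerprint-based skip can be sketched as follows. This is an illustrative model, not the actual implementation: the names `Stage`, `run_pipeline`, and the dict-based state are assumptions; only the priority ordering and the "unchanged content skips re-indexing" behavior come from the description above.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    priority: int                    # stages run in ascending priority order
    run: Callable[[dict], dict]      # transforms the in-progress index state

def fingerprint(content: str) -> str:
    # content fingerprint used to detect unchanged documents
    return hashlib.sha256(content.encode()).hexdigest()

def run_pipeline(stages: list[Stage], content: str, cache: dict) -> dict:
    fp = fingerprint(content)
    if cache.get("fingerprint") == fp:
        return cache                 # incremental re-index: nothing changed
    state = {"content": content, "fingerprint": fp}
    for stage in sorted(stages, key=lambda s: s.priority):
        state = stage.run(state)
    return state

# Stages may be registered in any order; priority decides execution order.
stages = [
    Stage("Build", 20, lambda s: {**s, "tree": "built"}),
    Stage("Parse", 10, lambda s: {**s, "nodes": "parsed"}),
]
result = run_pipeline(stages, "# Doc", cache={})
```

Re-running with the previous result as cache returns it unchanged, which is the essence of fingerprint-driven incremental indexing.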
## Tree Structure
Each node in the tree contains:
```
TreeNode
├── title          — Section heading
├── content        — Raw text (leaf nodes)
├── summary        — LLM-generated summary
├── structure      — Hierarchical index (e.g., "1.2.3")
├── depth          — Tree depth (root = 0)
├── references[]   — Resolved cross-references ("see Section 2.1" → NodeId)
├── token_count    — Estimated token count
└── page_range     — Start/end page (PDF)
```
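The node layout above can be modeled as a small data type. This is a sketch for illustration only; the field list follows the diagram, but the concrete types (and the `children` field, added here to make it a usable tree) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TreeNode:
    title: str                       # section heading
    structure: str                   # hierarchical index, e.g. "1.2.3"
    depth: int                       # root = 0
    content: str = ""                # raw text (leaf nodes only)
    summary: str = ""                # LLM-generated summary
    references: list[int] = field(default_factory=list)  # resolved NodeIds
    token_count: int = 0             # estimated token count
    page_range: Optional[tuple[int, int]] = None  # (start, end) page for PDFs
    children: list["TreeNode"] = field(default_factory=list)

root = TreeNode(title="Architecture", structure="1", depth=0)
leaf = TreeNode(title="Flow", structure="1.1", depth=1,
                content="raw section text", token_count=12)
root.children.append(leaf)
```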
## Retrieval Pipeline
The retrieval pipeline is a supervisor loop driven entirely by LLM reasoning. Every decision — which documents to query, how to navigate, whether evidence is sufficient — is made by the model, not by heuristics.
### Principles
- Reason, don't vector. — Every retrieval decision is an LLM decision.
- Model fails, we fail. — No silent degradation. No heuristic fallbacks.
- No thought, no answer. — Only LLM-reasoned output counts as an answer.
### Flow
```
Engine.ask()
  → Dispatcher
  → Query Understanding (LLM) → QueryPlan (intent, concepts, strategy)
  → Orchestrator (always — single or multi-doc)
      → Analyze (LLM reviews DocCards, selects documents + tasks)
      → Supervisor Loop:
          Dispatch Workers → Evaluate (LLM sufficiency check)
          → if insufficient → Replan (LLM) → loop
      → Rerank (dedup → BM25 score → evidence formatting)
```
### Query Understanding
Every query first passes through LLM-based understanding:
| Field | Description |
|---|---|
| Intent | Factual, Analytical, Navigational, or Summary |
| Strategy Hint | focused, exploratory, comparative, or summary |
| Key Concepts | LLM-extracted concepts (distinct from keywords) |
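A plausible shape for the resulting QueryPlan, based only on the table above. The field and enum names are assumptions for illustration; the real structure may differ.

```python
from dataclasses import dataclass
from enum import Enum

class Intent(Enum):
    FACTUAL = "factual"
    ANALYTICAL = "analytical"
    NAVIGATIONAL = "navigational"
    SUMMARY = "summary"

@dataclass
class QueryPlan:
    intent: Intent
    strategy_hint: str        # "focused" | "exploratory" | "comparative" | "summary"
    key_concepts: list[str]   # LLM-extracted concepts, not raw keywords

# Example output of query understanding for a factual question
plan = QueryPlan(Intent.FACTUAL, "focused", ["arena allocation", "tree thinning"])
```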
### Orchestrator (Supervisor)
The Orchestrator is the central coordinator. It always runs — even for single-document queries. Its supervisor loop:
- Analyze — LLM reviews DocCards (lightweight metadata) and selects relevant documents with specific tasks
- Dispatch — Fan-out Workers in parallel (one per document)
- Evaluate — LLM checks if collected evidence is sufficient to answer the query
- Replan (if insufficient) — LLM identifies missing information and dispatches additional Workers
When the user specifies document IDs directly, the Orchestrator skips the analysis phase and dispatches Workers immediately.
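The four phases above can be sketched as a loop. This is a minimal model, assuming hypothetical callables for each LLM-backed step (`analyze`, `evaluate`, `replan`) and a `run_worker` fan-out; only the control flow mirrors the description.

```python
def supervisor_loop(query, doc_cards, llm, run_worker, max_rounds=3):
    # Analyze: LLM selects documents + tasks from lightweight DocCards
    tasks = llm.analyze(query, doc_cards)
    evidence = []
    for _ in range(max_rounds):
        # Dispatch: one Worker per selected document (serial here for clarity)
        evidence += [ev for task in tasks for ev in run_worker(task)]
        # Evaluate: LLM sufficiency check on collected evidence
        if llm.evaluate(query, evidence)["sufficient"]:
            break
        # Replan: LLM targets the missing information with new tasks
        tasks = llm.replan(query, evidence, doc_cards)
    return evidence

# Scripted stand-in for the LLM, to show the loop's shape
class FakeLLM:
    def analyze(self, q, cards): return ["doc-1"]
    def evaluate(self, q, ev): return {"sufficient": len(ev) >= 2}
    def replan(self, q, ev, cards): return ["doc-2"]

evidence = supervisor_loop("q", [], FakeLLM(),
                           run_worker=lambda t: [f"evidence from {t}"])
```

Note that a real dispatcher would fan Workers out in parallel; the sequential comprehension here just keeps the sketch short.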
### Worker (Evidence Collector)
Each Worker navigates a single document's tree to collect evidence through a command-based loop:
- Bird's-eye — `ls` the root for an overview
- Plan — LLM generates a navigation plan based on keyword index hits
- Navigate — Loop: LLM selects command → execute → observe result → repeat
- Return — Collected evidence only — no answer synthesis
#### Available Commands
| Command | Description |
|---|---|
| `ls` | List children at current position (with summaries and leaf counts) |
| `cd <name>` | Enter a child node |
| `cd ..` | Go back to parent |
| `cat <name>` | Read node content (automatically collected as evidence) |
| `head <name>` | Preview first N lines (does NOT collect evidence) |
| `find <keyword>` | Search the document's ReasoningIndex for a keyword |
| `findtree <pattern>` | Search for nodes by title pattern (case-insensitive) |
| `grep <pattern>` | Regex search across content in current subtree |
| `wc <name>` | Show content size (lines, words, chars) |
| `pwd` | Show current navigation path |
| `check` | Evaluate if collected evidence is sufficient |
| `done` | End navigation |
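The command loop can be sketched over a toy tree. This is an assumption-laden illustration: a scripted `policy` callable stands in for the LLM's command selection, and only a few of the commands above are implemented, enough to show the dispatch shape and that `cat` auto-collects evidence.

```python
def worker(root, policy, max_steps=10):
    node, evidence = root, []
    for _ in range(max_steps):
        cmd, arg = policy(node, evidence)   # LLM picks the next command
        if cmd == "ls":
            # observe children (titles + summaries); result goes back to the LLM
            _ = [(c["title"], c.get("summary", "")) for c in node["children"]]
        elif cmd == "cd":
            node = next(c for c in node["children"] if c["title"] == arg)
        elif cmd == "cat":
            child = next(c for c in node["children"] if c["title"] == arg)
            evidence.append(child["content"])   # cat auto-collects evidence
        elif cmd == "done":
            break
    return evidence        # evidence only; no answer synthesis

tree = {"title": "root", "children": [
    {"title": "Intro", "content": "overview text", "children": []},
]}
script = iter([("ls", None), ("cat", "Intro"), ("done", None)])
evidence = worker(tree, policy=lambda node, ev: next(script))
```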
#### Navigation Strategy
Workers prioritize keyword-based navigation over manual exploration:
- When keyword index hits are available, Workers use `find` with the exact keyword to jump directly to relevant sections
- Workers use `ls` when no keyword hints exist or when discovering unknown structure
- Workers use `findtree` when the section title pattern is known but not the exact name
#### Dynamic Re-planning
After a `check` command finds insufficient evidence, the Worker triggers a re-plan — the LLM generates a new navigation plan based on what's missing. This allows the Worker to adapt its strategy mid-navigation.
### Rerank Pipeline
After all Workers complete, the Orchestrator runs the final pipeline:
- Dedup — Remove duplicate and low-quality evidence
- BM25 Scoring — Rank evidence by keyword relevance
- Evidence Formatting — Return original document text with source attribution
The system returns raw evidence text — no LLM synthesis or paraphrasing. This ensures the user sees the exact document content that matches their query.
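The dedup-then-score stages can be illustrated with exact-text dedup and a textbook BM25 formula (parameters k1=1.5, b=0.75). The production pipeline's exact scoring and dedup criteria are not specified in this doc, so treat this as a sketch of the idea only.

```python
import math
from collections import Counter

def rerank(query, evidence, k1=1.5, b=0.75):
    docs = list(dict.fromkeys(evidence))        # dedup, preserving order
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)

    def idf(term):
        df = sum(term in t for t in tokenized)  # document frequency
        return math.log(1 + (n - df + 0.5) / (df + 0.5))

    def score(tokens):
        tf = Counter(tokens)
        return sum(
            idf(t) * tf[t] * (k1 + 1)
            / (tf[t] + k1 * (1 - b + b * len(tokens) / avgdl))
            for t in query.lower().split()
        )

    return sorted(docs, key=lambda d: score(d.lower().split()), reverse=True)

ranked = rerank("tree navigation", [
    "tree navigation uses commands",
    "unrelated storage notes",
    "tree navigation uses commands",   # exact duplicate, removed by dedup
])
```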
## DocCard Catalog
When multiple documents are indexed, Vectorless maintains a lightweight `catalog.bin` containing DocCard metadata for each document. This allows the Orchestrator to analyze and select relevant documents without loading the full document trees — a significant optimization for workspaces with many documents.
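A plausible DocCard shape, holding just enough metadata for document selection without loading trees. All field names here are assumptions; the cheap keyword pre-filter is likewise illustrative (the doc says the LLM itself does the selection).

```python
from dataclasses import dataclass

@dataclass
class DocCard:
    doc_id: str
    title: str
    keywords: frozenset   # document-level keyword set
    token_count: int      # rough size, no tree loaded

catalog = [
    DocCard("d1", "Architecture", frozenset({"tree", "pipeline"}), 4200),
    DocCard("d2", "Changelog", frozenset({"release"}), 900),
]

def candidates(query_terms: set, catalog: list) -> list:
    # keyword overlap as a cheap pre-filter before the LLM's analyze step
    return [c.doc_id for c in catalog if query_terms & c.keywords]

picked = candidates({"pipeline"}, catalog)
```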
## Cross-Document Graph
When multiple documents are indexed, Vectorless automatically builds a relationship graph based on shared keywords and Jaccard similarity. The graph is constructed as a background task after each indexing operation.
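Jaccard similarity over shared keywords is |A ∩ B| / |A ∪ B|. A minimal sketch of edge construction, assuming a hypothetical 0.2 threshold and per-document keyword sets (the real builder's threshold and keyword source are not stated here):

```python
def jaccard(a: set, b: set) -> float:
    # |A ∩ B| / |A ∪ B|; 0.0 for two empty sets
    return len(a & b) / len(a | b) if a | b else 0.0

def build_graph(doc_keywords: dict, threshold: float = 0.2):
    ids = sorted(doc_keywords)
    return [
        (x, y, jaccard(doc_keywords[x], doc_keywords[y]))
        for i, x in enumerate(ids)
        for y in ids[i + 1:]
        if jaccard(doc_keywords[x], doc_keywords[y]) >= threshold
    ]

edges = build_graph({
    "a.md": {"tree", "index", "llm"},
    "b.md": {"tree", "llm", "worker"},
    "c.md": {"storage"},
})
# a.md and b.md share {tree, llm} out of 4 distinct keywords → 0.5
```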
## Zero Infrastructure
The entire system requires only an LLM API key. No vector database, no embedding models, no additional infrastructure. Trees and metadata are persisted to the local filesystem in the workspace directory.