Welcome to Vectorless
Vectorless is a document understanding engine for AI. It compiles documents into structured trees of meaning, then dispatches multiple agents to reason through headings, sections, and paragraphs — evaluating how each part relates to the whole. The problem it solves is not "where to look", but "what does this mean in context". Every answer is a reasoning act, not a retrieval result.
How It Works
- Parse — Documents (Markdown, PDF) are parsed into hierarchical semantic trees, preserving structure and relationships between sections.
- Compile — Trees are stored with metadata, keywords, and summaries. The pipeline resolves cross-references ("see Section 2.1") and expands keywords with LLM-generated synonyms for improved recall. Incremental compiling skips unchanged files via content fingerprinting.
- Ask — An LLM-powered agent navigates the tree to find the most relevant sections. The Orchestrator coordinates multi-document queries, dispatching Workers that use
  ls, cd, cat, find, and grep commands to explore the tree and collect evidence.
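
As a rough illustration of the three steps above, the sketch below parses Markdown headings into a tree, fingerprints the source with SHA-256 so an unchanged document can be skipped, and offers toy ls, cd, cat, and grep helpers for walking the tree. It is not Vectorless's implementation: Node, parse_markdown, compile_document, and the helper functions are hypothetical names chosen for this example, and the sample document is invented for the demo.

```python
import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One heading plus the text directly beneath it (hypothetical structure)."""
    title: str
    level: int
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def parse_markdown(source: str) -> Node:
    """Build a heading tree from '#'-style Markdown headings."""
    root = Node(title="/", level=0)
    stack = [root]
    for line in source.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            node = Node(title=match.group(2).strip(), level=len(match.group(1)))
            # Pop until the top of the stack is a shallower heading (the parent).
            while stack[-1].level >= node.level:
                stack.pop()
            stack[-1].children.append(node)
            stack.append(node)
        elif line.strip():
            stack[-1].text += line.strip() + "\n"
    return root


# Assumed cache layout: document name -> (fingerprint, parsed tree).
_cache: dict[str, tuple[str, Node]] = {}


def compile_document(name: str, source: str) -> bool:
    """Parse and cache a document; return False if it was skipped as unchanged."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cached = _cache.get(name)
    if cached is not None and cached[0] == digest:
        return False
    _cache[name] = (digest, parse_markdown(source))
    return True


def ls(node: Node) -> list[str]:
    return [child.title for child in node.children]


def cd(node: Node, title: str) -> Node:
    for child in node.children:
        if child.title == title:
            return child
    raise KeyError(title)


def cat(node: Node) -> str:
    return node.text


def grep(node: Node, pattern: str, path: str = "") -> list[str]:
    """Return the paths of all descendant sections whose own text matches."""
    hits = []
    for child in node.children:
        child_path = f"{path}/{child.title}"
        if re.search(pattern, child.text, re.IGNORECASE):
            hits.append(child_path)
        hits.extend(grep(child, pattern, child_path))
    return hits


# Invented sample document, used only for this demo.
DOC = """# Report
Annual summary.
## Revenue
Total revenue was 12 units.
## Costs
Total costs were 7 units.
"""

print(compile_document("report.md", DOC))   # True: first compile
print(compile_document("report.md", DOC))   # False: fingerprint unchanged
tree = _cache["report.md"][1]
report = cd(tree, "Report")
print(ls(report))                           # ['Revenue', 'Costs']
print(cat(cd(report, "Revenue")).strip())   # Total revenue was 12 units.
print(grep(tree, r"revenue"))               # ['/Report/Revenue']
```

In this toy version the evidence collected by grep is just a list of section paths. An agent-driven worker would presumably decide which command to issue next based on what each call returns.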
Quick Start
import asyncio

from vectorless import Engine


async def main():
    engine = Engine(
        api_key="sk-...",
        model="gpt-4o",
    )

    # Compile a document
    result = await engine.compile(path="./report.pdf")
    doc_id = result.doc_id

    # Ask a question
    response = await engine.ask("What is the total revenue?", doc_ids=[doc_id])
    print(response.single().content)


asyncio.run(main())
Using a Custom Endpoint
engine = Engine(
    api_key="sk-...",
    model="gpt-4o",
    endpoint="https://api.your-provider.com/v1",
)
From Environment Variables
engine = Engine.from_env()
From Config File
engine = Engine.from_config_file("./config.toml")