How it works

Reasoning, not retrieval.

Most tools answer questions about your data by retrieving chunks of it, or by looping a model over your files one tool call at a time. RLM does something different: the model writes a program, runs it in a sandbox over your data, and only the answer comes back.

vs. coding agents

Agents loop. RLM writes code.

Coding agents (Claude Code, Codex, Pi) run a tool-call loop: read a file, grep, run a command, with each result streamed back into the model's context window. That's powerful for editing a codebase, but the context window is the ceiling: the agent can't reason over more data than fits in it.
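To make the ceiling concrete, here is a toy model of a tool-call loop in which every result is appended to the context. It is a sketch, not how any particular agent is implemented: the 16,000-token window, the per-result token counts, and the run_tool_loop helper are all illustrative assumptions.

```python
# Toy model of a tool-call loop: every tool result is appended to the context.
# The window size and token counts below are illustrative assumptions.
CONTEXT_WINDOW_TOKENS = 16_000  # hypothetical window size


def run_tool_loop(tool_results, window=CONTEXT_WINDOW_TOKENS):
    """Append results in order; stop at the first one that would overflow."""
    used = 0
    seen = []
    for name, tokens in tool_results:
        if used + tokens > window:
            # `name` is the first result that no longer fits in the window.
            return seen, used, name
        used += tokens
        seen.append(name)
    return seen, used, None


results = [
    ("file1.md", 4_000),
    ("grep results", 2_000),
    ("file2.md", 6_000),
    ("file3.md", 5_000),  # assumed size; pushes the total past the window
]
seen, used, dropped = run_tool_loop(results)
print(seen, used, dropped)
```

In this sketch the loop stops at file3.md: nothing after that point can be reasoned over unless something earlier is evicted or summarized.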

RLM writes a program that is the loop. It filters and aggregates in the sandbox and fans out sub-calls (llm_query) over slices—so the data never has to fit in one context window.
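The kind of program described above might look roughly like the following. This is a minimal sketch under stated assumptions: llm_query is replaced by a local stub with an assumed one-string signature (the real sandbox function may differ), and slices and answer_question are hypothetical helpers invented for illustration.

```python
# Sketch of a program that "is the loop": filter locally, fan out one
# sub-call per slice, then reduce. llm_query is a local stand-in with an
# ASSUMED signature, not the real sandbox function.
from concurrent.futures import ThreadPoolExecutor


def llm_query(prompt: str) -> str:
    """Hypothetical stub: a real sub-call would send `prompt` to a model."""
    return f"summary({len(prompt)} chars)"


def slices(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def answer_question(records, keyword, slice_size=100):
    # 1. Filter in the sandbox: only matching records go any further.
    hits = [r for r in records if keyword in r]
    # 2. Fan out: each sub-call sees one slice, never the whole data set.
    prompts = ["Summarize:\n" + "\n".join(s) for s in slices(hits, slice_size)]
    with ThreadPoolExecutor(max_workers=8) as pool:
        partials = list(pool.map(llm_query, prompts))
    # 3. Reduce: one final sub-call over the short partial summaries.
    final = llm_query("Combine:\n" + "\n".join(partials))
    return final, len(hits), len(partials)


records = [
    f"ticket {i}: {'refund' if i % 4 == 0 else 'login'} issue"
    for i in range(1000)
]
final, n_hits, n_calls = answer_question(records, "refund")
print(n_hits, n_calls, final)
```

The point of the shape is that no single prompt ever holds all 1,000 records: each sub-call gets at most one slice, and the final call sees only the partial summaries.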

[Diagram: Tool-loop agent (Claude Code · Codex · Pi). The LLM calls read_file, grep, and bash, and every result returns to the context window: file1.md … (4k tokens), grep results … (2k), file2.md … (6k), file3.md … ⚠ window full.]

Data over the window size can't be reasoned over—it never fits.

[Diagram: RLM (ModelRelay). The LLM writes a program that runs in the sandbox:]

    # runs in sandbox
    rows = query("…")
    hits = grep(docs, q)
    parts = llm_batch(hits)
    answer["ready"] = summarize(parts)

[Only the answer returns. Intermediate data stayed in the sandbox, out of the context window.]

Scales past the context window: the code does the reduction.

vs. RAG & search

No retrieval pipeline.

Search and RAG tools (vector DBs, hybrid search like qmd) answer by retrieving: chunk everything, embed it, rank the top matches, and hope the answer is in there. You build and maintain that pipeline, and it only ever returns snippets.
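For readers who have not built one, the retrieval steps named above can be sketched in a few lines. This is a deliberately simplified toy, not a real RAG stack: word counts stand in for a learned embedding model, and chunk, embed, cosine, and retrieve are hypothetical helpers.

```python
# Toy retrieval pipeline: chunk, "embed" as word counts, rank top-k by cosine.
# Bag-of-words stands in for a real embedding model (an assumption for brevity).
import math
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Stand-in embedding: a sparse vector of lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(docs, question, k=2):
    """Return the k chunks most similar to the question."""
    chunks = [c for d in docs for c in chunk(d)]
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)[:k]


docs = [
    "refunds are issued within five days of approval",
    "login requires a verified email address",
]
top = retrieve(docs, "when are refunds issued", k=1)
print(top)
```

Note what comes back: a ranked list of chunks. Turning those snippets into an answer is a separate step, and anything outside the top k is never seen.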

RLM skips it. The model queries and reads your data directly, follows links and structure, and reasons across what it finds—no chunking, no embeddings, no reranking to maintain.
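As a rough illustration of "queries and reads your data directly", the sketch below inspects a database schema and then runs an aggregate query against it. It uses Python's standard sqlite3 module with an in-memory database; the orders table, its rows, and the inspect_schema helper are invented for the example and say nothing about how RLM connects to real data sources.

```python
# Sketch: inspect the schema first, then query only what matters.
# The table and its contents are made-up example data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO orders (region, amount)
    VALUES ('EU', 120.0), ('EU', 80.0), ('US', 50.0);
    """
)


def inspect_schema(conn):
    """Map each table name to its list of column names."""
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
    ]
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return {
        t: [col[1] for col in conn.execute(f"PRAGMA table_info({t})")]
        for t in tables
    }


schema = inspect_schema(conn)
# Aggregate inside the database so only the small result leaves it.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region"))
print(schema, totals)
```

There is no index to build or keep fresh here: the schema is read at question time, and the query result, not the raw rows, is what gets reasoned over.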

RAG / search pipeline
1. Chunk every document
2. Embed → vector DB
3. Retrieve top-k matches
4. Re-rank
5. Stuff into prompt
Returns snippets, not answers. You own the pipeline.

RLM
1. Inspect schema / files
2. Query & read what matters
3. Reason across results
4. Recurse with sub-calls
Returns an answer. Nothing to build or maintain.

At a glance

Where RLM fits.

RLM isn't a coding agent, a search engine, or a BI tool. It's a runtime you embed to reason over your data—across any model, in your own boundary.

                     | Coding agents          | RAG / search         | Text-to-SQL      | RLM
Examples             | Claude Code, Codex, Pi | qmd, vector DBs      | Cortex, Genie    | ModelRelay
Job                  | Act on a codebase      | Retrieve snippets    | One-shot query   | Reason & answer
How                  | Tool-call loop         | Chunk + embed + rank | NL → SQL → rows  | Writes & runs code
Bigger than context? | No, window-bound       | Only top-k           | N/A              | Yes, code reduces it
Models               | Single lab             | Varies               | Warehouse-locked | Any lab
Shape                | App you use            | Library / index      | BI tool          | Embeddable runtime

These tools aren't competitors so much as different jobs—RLM can even call a search index or run SQL as one tool inside its loop.

Reason over your data.

Connect a database or a folder of docs and ask anything.
