All Posts
Jeff Hooton · 23 min read

Building Scry: An Agent-First Code Index That Replaces Read and Grep

If you watch a Claude Code session for ten minutes, you’ll notice how much of the work is not actually work. The agent greps for a function name, gets back forty results across a dozen files, reads each one, throws away most of it, and finally has the context it needs to make a one-line change. Then the next question comes in and it does the whole dance again. I got tired of watching it, so I built scry to cut it out.

Scry is a local daemon that pre-computes a semantic index of every repo you work in (symbols, references, definitions, call graphs, implementations) and exposes it as a millisecond-latency API over a Unix socket. It’s a single static Go binary, it registers itself as an MCP server with Claude Code, and when you ask “where is processOrder used” it answers in 6-7 milliseconds instead of 5-10 seconds of Read and Grep churn.

This is how I thought about the design, what the numbers actually are, and a few of the uglier bits where reality diverged from the spec.

Why not just use LSP

The first question everyone asks is “isn’t this just LSP?” No, and the reason is important.

LSP was designed for a 60-fps editor. The whole protocol is built around a stateful client (your editor) asking a stateful server positional questions like “what’s at line 47 column 12 right now?” It’s excellent at hover tooltips and autocomplete dropdowns because those are the problems it was built for. Humans look at one position at a time.

Agents don’t. An agent wants a batch answer to a structural question. “Give me every call site of processOrder with one line of context for each.” “What does OrderService implement, and what other classes implement the same interface?” “If I change the signature of handlePayment, what’s the full set of files that will break?” LSP can technically answer some of these, but the latency target is wrong (it was tuned for keypress cadence, not tool-call cadence), the output is wrong (pretty hover cards instead of structured JSON), and the per-language-server-per-editor deployment model is wrong for a CLI agent.

The way I ended up thinking about it: LSP is the interactive cousin, scry is the batch cousin. They’re solving different problems even though they touch the same data. An LSP server lazily computes what you ask for, because you’re one human clicking around at 100ms cadence. Scry eagerly pre-computes everything, because you’re an agent making hundreds of queries per session at millisecond cadence and the only way to hit that latency is to have the answer ready before the question lands.

The thesis

The developer toolchain is being rewritten for agents, and the rewrites are 10-100x faster because they don’t carry the human-UI tax. Playwright is human-first, trawl is agent-first. LSP is human-first, scry is agent-first. The pattern is always the same. Figure out what the agent actually needs, pre-compute it once, store it in an embedded KV, and answer queries in single-digit milliseconds.

Concretely, for a typical Claude Code session, the Read + Grep + Glob cycle burns 30-50% of tool calls on “find the relevant code” before anything useful happens. Each round is 5-15 seconds of wall clock and a few thousand tokens of mostly-irrelevant context that the agent immediately discards. Multiply that across a session and you’re looking at a real tax on both cost and capability. If you can turn that into a single millisecond-scale lookup that returns exactly the lines that matter, you get a better agent for less money.

How it works

Scry is a single Go binary that speaks two modes. Run it as scry refs processOrder and it acts as a thin CLI client. Run it as scry start and it becomes a daemon listening on ~/.scry/scryd.sock. The client auto-spawns the daemon on first call, so from the user’s perspective there’s no mode switch. It just works.

The indexing pipeline leans on Sourcegraph’s SCIP format as the common currency. For each language, there’s a dedicated indexer that produces a .scip file (a Protobuf-encoded semantic index), and scry’s job is to parse that file, normalize the symbols and occurrences, and write them into a per-repo BadgerDB store at ~/.scry/repos/<sha256>/. Queries then hit BadgerDB, not the source tree.

scry CLI

    │  JSON-RPC 2.0 over Unix socket

scryd
    ├── JSON-RPC dispatcher
    ├── Query engine (refs, defs, callers, callees, impls)
    ├── Store registry (one BadgerDB per repo)
    ├── File watcher (fsnotify, 300ms debounce)
    └── Index builder
         ├── scip-typescript (npm)
         ├── scip-go (auto-downloaded, SHA256-verified)
         └── scip-php (embedded in binary)

For TypeScript and JavaScript, I lean on scip-typescript. For Go, scip-go gets auto-downloaded on first use into ~/.scry/bin/, pinned to a specific version with SHA256 verification. For PHP, I ended up embedding scip-php as a vendored directory tree inside the scry binary itself, for reasons I’ll get to in a minute. TypeScript is P0, Go is P1, PHP is P1 engine and P2 Laravel post-processor. Python and Ruby are on deck because Sourcegraph ships first-party SCIP indexers for both.

The query engine is pretty boring and that’s intentional. Refs and defs are direct BadgerDB lookups. Callers and callees are built from SCIP’s enclosing_range metadata at index time, so querying the call graph is another direct lookup rather than a walk over occurrences. Implementations come from SCIP’s Relationships.is_implementation edges. Everything that can be pre-computed at index time is pre-computed at index time.
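None of that needs anything fancier than a flat key space. A toy sketch of what the pre-computed layout could look like, with a plain map standing in for BadgerDB and entirely made-up index entries:

```go
package main

import "fmt"

// A hypothetical key layout. BadgerDB is a flat key-value store, so each
// query class gets its own prefix and the value is the fully materialized
// answer, written once at index time.
type kv map[string][]string

func buildIndex() kv {
	return kv{
		"refs/processOrder":    {"src/checkout.ts:41", "src/api/orders.ts:88"},
		"defs/processOrder":    {"src/orders.ts:12"},
		"callers/processOrder": {"handleCheckout", "retryOrder"},
	}
}

// query is a direct lookup: no occurrence walk, no type resolution at
// query time. Everything expensive already happened during indexing.
func query(db kv, kind, symbol string) []string {
	return db[kind+"/"+symbol]
}

func main() {
	fmt.Println(query(buildIndex(), "callers", "processOrder"))
}
```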

Why a daemon

Pre-computation is the entire game, and pre-computation requires a warm process. A typical 100k-LOC TypeScript repo has thousands of symbols and tens of thousands of references. Computing those on every query (LSP-style) is impossibly slow. Computing them once at index time, storing them in an embedded KV, and then answering queries via index lookup is single-digit milliseconds. But that only works if the index stays warm between calls. A fresh CLI invocation has to re-open the BadgerDB store, re-mmap the index files, re-build any in-memory caches. The daemon avoids all of that.

The cost of connecting to a Unix socket and sending a JSON-RPC request is around 50 microseconds. The cost of spawning a new process and parsing command-line arguments is around 50 milliseconds. An agent makes hundreds of queries per session, so that’s the difference between “imperceptible” and “noticeable enough to slow the whole session down.”

The daemon is auto-spawned on first CLI call and runs until you stop it. One daemon per user, many indexed repos. No daemon manager, no systemd unit, no launchd plist. Same story as trawl.

The numbers

I measured everything against a real TypeScript codebase (~/herd/advocates, 400 files, 55k LOC) and a real Laravel codebase (hoopless_crm, larger).

Metric                                           Target              Actual
Daemon cold spawn (CLI exits, daemon listening)  <500ms              ~17ms
scry refs <symbol> end-to-end (warm)             <10ms p50           6-7ms
Cold index build, 100k-LOC TS repo               <60s                9.9s
Query unavailability during reindex              (originally 3-15s)  12ms swap, no failed queries

The daemon cold spawn came in thirty times faster than the spec target, which surprised me. I’d budgeted half a second because I assumed BadgerDB would want to do some work on open, but in practice the store opens lazily on first query and the bare daemon startup is just “bind the socket and wait.”

The 6-7ms query number is the one that matters. That’s the full wall-clock from scry refs processOrder hitting the CLI to JSON coming back on stdout. Inside that, the JSON-RPC round trip is sub-millisecond, the BadgerDB lookup is sub-millisecond, and most of the time is actually spent in Go’s startup and flag parsing on the client side. The daemon-side query is effectively free.

Cold index build was the biggest question mark going in. Ten seconds for 100k LOC is fast enough that you can re-index from scratch on demand, which opens up a bunch of nice operational patterns (no need for manual invalidation, no cache coherence bugs, just nuke it and rebuild).

The reindex-during-serve problem

The ugliest bit of the whole project was the watcher. The first cut was naive: fsnotify tells you a file changed, you debounce for 300ms, then you re-run the indexer and overwrite the store. Simple. Except that during the reindex window, every query returns nothing, because the store is being rewritten underneath them. On a mid-sized repo that’s a 3-15 second blackout every time someone hits save.

The fix took a couple of iterations but ended up clean. Instead of rewriting in place, the watcher calls index.BuildIntoTemp which writes the new index to a sibling directory at <storage>/index.db.next/. The live store keeps serving queries the whole time. When the build finishes, Registry.SwapNext performs a single close-plus-rename-plus-open dance that takes around 12ms, and from the next query forward the new index is live. There’s never a moment where the store is broken.

I measured this on a real reindex against hoopless_crm: 1449 successful queries during a 48-second reindex window, zero failures, slowest query 84ms. The blackout went from 3-15 seconds to 12 milliseconds, and none of those in-flight queries ever noticed.

The spec’s “<200ms incremental update” target turned out to be unreachable for a different reason, though. The target assumed that single-file SCIP indexing exists. It doesn’t. scip-typescript and scip-go are both project-wide, type-resolution-driven, and offer no single-file mode, because type checking doesn’t work that way. So the watcher always re-indexes the whole project on any change. Realistic numbers are around 600ms for a tiny project, 3 seconds for a trawl-class project, 10-15 seconds for an advocates-class project. The long-term fix is probably a tree-sitter overlay for the 95% of queries where syntactic precision is good enough, but I haven’t needed it yet.

PHP, because Laravel is a primary user stack

Most code intelligence tools treat PHP as an afterthought. I actively build in Laravel, so I wasn’t willing to ship scry without solid PHP support, and the PHP work ended up being the most interesting post-processor pipeline in the project.

The engine is davidrjenni/scip-php, which produces a passable SCIP index for plain PHP. But Laravel has a bunch of dynamic patterns that no static indexer can resolve on its own, and “passable” isn’t enough if you want scry refs DB::table to actually find the call sites. I ended up building four post-processors that run after scip-php and synthesize the missing edges.

First, a non-PSR-4 file walker. Laravel has a bunch of PHP files outside the PSR-4 autoload paths, specifically routes/, config/, database/migrations/, and bootstrap/. scip-php walks PSR-4 namespaces, so those directories are invisible to it, which means about 1300 ::class references per real Laravel codebase get dropped on the floor. The walker scans those directories after scip-php runs, finds the ::class references, and binds them to scip-php’s symbol IDs. On hoopless_crm it recovered 1254 of 1283 references, about 98%.

Second, a facade resolver. Laravel’s facades are magic: Auth::user() resolves at runtime to Illuminate\Auth\AuthManager::user() or Illuminate\Contracts\Auth\Guard::user() depending on the binding. Static analysis can’t follow this. I hardcoded a map of all 31 Illuminate facades to their backing manager and contract classes, and the post-processor emits synthetic reference edges for every facade call site. So scry refs AuthManager::user finds every Auth::user() call. On hoopless_crm this synthesized about 5,000 edges.

Third and fourth, a string-ref walker for view('foo.bar') and config('foo.bar') calls. These are even more magic. view('users.show') resolves to resources/views/users/show.blade.php, and config('services.dataforseo.login') resolves to config/services.php at the dataforseo.login key. Neither scip-php nor any general PHP indexer knows about this. The post-processor walks every .php file in the project, finds the string references, and emits synthetic edges to the blade files and config keys as if they were first-class symbols. On hoopless_crm it recovered 7 view refs and 280 config refs, and it means you can run scry refs services.dataforseo.login and get every config call site with file and line.

One gnarly detail: I originally planned to distribute scip-php as a PHAR (a self-contained PHP archive), because that’s the standard PHP distribution format. It didn’t work. php-scoper choked on PHP 8.4 keyword shims, the PHAR build broke in ways that were going to take a week to chase, and I was already behind schedule on PHP. I ended up vendoring scip-php as a raw directory tree embedded into the scry Go binary via embed.FS, extracted on first use into ~/.scry/bin/scip-php-<sha>/. Ugly, but it works, and it means users never have to install anything PHP-related themselves as long as they already have php on their PATH. The trade-off is that the scry binary is larger than it would otherwise be, but that’s a price I’m willing to pay.

The UTF-8 bug that ate a whole afternoon

Somewhere in the middle of the PHP post-processor work, scry’s hand-rolled PHP scanner hung forever on a real Laravel command file. Not slow. Actually infinite. I had to kill -9 it. And since the walker ran on every PHP file in the project, indexing hoopless_crm was completely stuck.

The file was app/Console/Commands/BackfillOrphanedScans.php, 5009 bytes, nothing exotic about it. I started bisecting by byte count. 5009 hung, 2500 hung, 1250 hung, 1100 worked, 1200 worked, 1250 hung. Then byte by byte against truncated copies with go test -count=1 -timeout 3s. 1248 worked, 1249 hung. Exactly one byte was the difference between a parser that finished in milliseconds and one that spun until I killed it.

The byte was \xE2. That’s the first byte of → in UTF-8, three bytes total (\xE2\x86\x92). The arrow was inside an interpolated string in a log call:

$this->line("  LOC {$location->id} ({$location->name}) → BIZ {$business->id}")

My first guess was that I was doing something wrong with UTF-8 decoding. That guess was wrong in a more interesting way than I expected.

The parser’s UTF-8 handling was actually fine. The identifier reader correctly called utf8.DecodeRune on byte slices, handled truncated sequences, returned early on RuneError. I’d written tests for all of that. The bug was one layer up, in the main dispatch loop, which looked like this:

// Broken.
for s.pos < len(s.src) {
    c := s.src[s.pos]
    switch {
    case isIdentStart(rune(c)):
        s.scanIdentifierOrKeyword(&res)
    // ...
    }
}

See the trap? c is a byte. rune(c) zero-extends it to a 32-bit rune. Byte \xE2 becomes rune 0x00E2, which is â, Latin-1 lowercase a-circumflex. unicode.IsLetter('â') returns true, because it genuinely is a letter. So the scanner routed a raw UTF-8 continuation byte into the identifier reader as if it were the start of an identifier.

Inside the identifier reader, the proper UTF-8 decode kicked in. utf8.DecodeRune(s.src[s.pos:]) on a lone \xE2 returns (RuneError, 1). The identifier loop breaks after zero iterations because RuneError isn’t a letter. The empty-identifier guard fires: if ident == "" { return }. And s.pos never advances. Main loop iterates, dispatches on \xE2 again, calls the identifier reader again, returns without advancing. Forever.

There was a second ingredient I haven’t explained yet: how the scanner got inside the string in the first place. PHP literal strings are handled by a separate state, so "hello → world" should never reach the main loop byte by byte. The walker’s identifier path has a special case for extracting view('users.show') and config('mail.from') call targets, so after reading an identifier it peeks ahead for ( followed by a string literal. When it saw line( followed by ", it called consumeStringLiteral, which returns ("", false) on interpolated strings (strings containing $ aren’t literals for config-key extraction). But the consume function left s.pos parked inside the string at the $, not advanced past the closing quote. Main loop resumed from there and started processing the string body as if it were code. The \xE2 was just sitting there waiting.

Both ingredients were needed. A function call with an interpolated string argument, to trip the consume helper into returning early with s.pos still inside the string, AND a multibyte character later in that same string. Neither was present in any of my hand-written test cases. I didn’t discover it until the walker hit a real codebase with real Laravel logging containing one real Unicode arrow.

The fix was in two places. First, a proper byte-aware helper that decodes UTF-8 before deciding:

func isIdentStartByte(b byte, tail []byte) bool {
    if b < 0x80 {
        return b == '_' || (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
    }
    r, size := utf8.DecodeRune(tail)
    if size == 0 || r == utf8.RuneError {
        return false
    }
    return unicode.IsLetter(r)
}

Second, a belt-and-suspenders check in the main loop: if scanIdentifierOrKeyword returns with s.pos unchanged, force-advance one byte. Even if some future code path reintroduces the same class of bug, the loop can never spin in place. I left a comment on that force-advance because it looks wrong at first glance, and I didn’t want to lose the reasoning.
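The guard is a few lines. A self-contained sketch with a deliberately bug-shaped identifier reader, showing that the force-advance keeps the loop terminating even when the sub-scanner makes no progress on a lone continuation byte:

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

type scanner struct {
	src []byte
	pos int
}

// scanIdentifier reproduces the buggy shape: on a lone continuation byte
// it decodes RuneError and returns without moving pos at all.
func (s *scanner) scanIdentifier() {
	for s.pos < len(s.src) {
		r, size := utf8.DecodeRune(s.src[s.pos:])
		if r == utf8.RuneError || !unicode.IsLetter(r) {
			return
		}
		s.pos += size
	}
}

// run is the main dispatch loop with the belt-and-suspenders check.
func (s *scanner) run() int {
	steps := 0
	for s.pos < len(s.src) {
		steps++
		before := s.pos
		s.scanIdentifier()
		// If the sub-scanner made no progress, force-advance one byte so
		// the loop can never spin in place, whatever the byte was.
		if s.pos == before {
			s.pos++
		}
	}
	return steps
}

func main() {
	s := &scanner{src: []byte("ab \xe2 cd")} // contains a lone \xE2
	fmt.Println("terminated after", s.run(), "iterations")
}
```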

The takeaway I’ll carry forward, and probably grep for every time I write a Go parser: in Go, byte → rune is a zero-extend, not a decode. There is no valid ASCII-or-UTF-8 context in which dispatching on rune(byte) is correct for non-ASCII input. Any place you’re tempted to write that pattern, you probably wanted utf8.DecodeRune(slice[i:]) instead. The compiler won’t warn you. The type system doesn’t know the semantic difference. And the bug stays invisible until the first real codebase with an emoji in a logging call, or a smart quote in a comment, hits your walker. In my case it was one → inside one log line.

External symbol synthesis

A subtler gap: when scip-typescript or scip-php emit a reference to a symbol that was never declared in any document (vendor libraries, stdlib types, framework classes), the SCIP file contains the occurrence but not the SymbolInformation record. The first cut of scry’s parser dropped those on the floor, which meant queries like scry refs DB returned zero results even though DB was the most-referenced facade in the codebase.

The fix was to have the SCIP parser synthesize a SymbolRecord for any occurrence whose symbol ID wasn’t declared as SymbolInformation anywhere. It’s not ideal (you lose the kind information and the docstring), but it closes the general “vendor and framework references aren’t queryable by name” gap, and it was a one-page change. Now scry refs DB returns the facade symbol with all of its occurrences, and the Laravel facade resolver can do its thing on top.
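The synthesis logic is a one-pass merge. A sketch under assumed record shapes (the real SCIP types are protobuf messages; these structs are simplified stand-ins):

```go
package main

import "fmt"

type occurrence struct {
	symbol, file string
	line         int
}

type symbolRecord struct {
	id          string
	kind        string // empty for synthesized externals
	occurrences []occurrence
}

// collect builds symbol records from a SCIP-style stream: declared symbols
// keep their metadata, and any occurrence whose symbol was never declared
// gets a synthesized record so it stays queryable by name.
func collect(declared map[string]string, occs []occurrence) map[string]*symbolRecord {
	out := map[string]*symbolRecord{}
	for id, kind := range declared {
		out[id] = &symbolRecord{id: id, kind: kind}
	}
	for _, o := range occs {
		rec, ok := out[o.symbol]
		if !ok {
			// Synthesized: no kind, no docstring, but all occurrences.
			rec = &symbolRecord{id: o.symbol}
			out[o.symbol] = rec
		}
		rec.occurrences = append(rec.occurrences, o)
	}
	return out
}

func main() {
	recs := collect(
		map[string]string{"App\\Services\\OrderService": "class"},
		[]occurrence{
			{"App\\Services\\OrderService", "app/Jobs/Ship.php", 9},
			{"Illuminate\\Support\\Facades\\DB", "app/Jobs/Ship.php", 14},
		},
	)
	db := recs["Illuminate\\Support\\Facades\\DB"]
	fmt.Println(db.id, len(db.occurrences), db.kind == "")
}
```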

Claude Code integration

The whole point of scry is for Claude Code to use it instead of Read and Grep. That integration ended up being a two-sided thing.

The daemon speaks JSON-RPC over a Unix socket, which isn’t directly consumable by Claude Code. Claude Code wants MCP (Model Context Protocol) servers, which speak JSON-RPC over stdio with a specific schema. So scry ships a scry mcp subcommand that runs a thin MCP stdio server and forwards each tool call to the scry daemon via the existing RPC client. Six tools are exposed: scry_refs, scry_defs, scry_callers, scry_callees, scry_impls, scry_status. The MCP layer also parses compound symbol forms like DB::table, auth->user, and client.Connect so you can query them the way you’d actually write them in code.

scry setup is the command that wires all of this up. It installs a skill file at ~/.claude/skills/scry/SKILL.md with routing instructions for Claude Code (basically “use scry for symbol queries, fall back to Grep for string and regex searches”) and registers scry as a user-scope MCP server via claude mcp add. The skill file is the nudge that makes Claude actually use scry. Without it, the MCP server just sits there and Claude keeps reaching for Read and Grep out of habit. The skill says “for symbol lookups route through scry, it’s 100x faster and more precise.” In practice that tip is enough to change the behavior almost immediately.

Getting the setup command right took longer than getting the server right, and the reason was a silent-failure bug where every check passed.

The config file that was the wrong config file

First iteration of scry setup wrote the MCP server config to ~/.claude/settings.json under an mcpServers key. The file existed, the key looked right, the JSON was valid. I ran scry setup and it printed success. Opened a fresh Claude Code session and the MCP manager UI showed scry · ✔ connected under User MCPs. I piped a test initialize request into scry mcp by hand and got back a valid response with serverInfo, capabilities, and tools/list returning all six tools. I piped a scry_refs call and it returned 252 DB facade call sites on hoopless_crm in under 10ms.

Then I asked Claude “where is DB::table used?” and it ran Search(pattern: "DB::table"). Raw Grep, 34-file result. Not scry.

I blamed session state, told myself to restart Claude Code. Same thing. I thought maybe the MCP server was crashing on load and Claude was silently falling back. Added logging on the server side. It wasn’t crashing. It wasn’t even being asked.

The moment it cracked open was when I used ToolSearch inside the session to search my own tool list for scry. One match, and it was from Figma (because one of Figma’s parameters happened to mention the scry project path). I searched select:mcp__scry__scry_refs and got “no matching deferred tools found.” The tools genuinely weren’t in the session’s tool registry at all. Claude was never routing to scry because scry wasn’t there.

Root cause: Claude Code doesn’t read MCP config from settings.json. At all. MCP servers live in ~/.claude.json at the home root (not inside .claude/), which is a ~185 KB file with around 60 top-level keys for session state, project list, tool usage, OAuth, and, under one of those keys, mcpServers. The canonical way to register an MCP server is claude mcp add --scope user --transport stdio scry -- /path/to/scry mcp, which handles the JSON editing for you. settings.json is for hooks and plugins and tool permissions. It never consults MCP config, and nothing in Claude Code reads the mcpServers key I was writing there.

Everything that looked right was looking at the wrong file. The “connected” UI in the MCP manager was polling some independent piece of state and wasn’t disagreeing with me because it wasn’t watching what I’d written. The piped test worked because scry mcp speaks the protocol correctly when you talk to it directly. The whole setup was a Potemkin success.

There’s a subtler thing in here too: MCP tools in Claude Code are loaded at process startup, not hot-reloaded. Even after I eventually fixed the config file, an already-running session still didn’t have the scry tools in its tool registry. A full claude process restart was required. The MCP manager UI showing “connected” was misleading because it polls independently of whatever the current session actually loaded at startup.

The fix was to rip out the JSON editing entirely and delegate to the official CLI. scry setup now shells out to claude mcp add via exec.Command, with a claude mcp get scry check upfront to decide skip-versus-replace. On install it also scans ~/.claude/settings.json for any leftover mcpServers.scry key from the earlier buggy iteration, backs up the file, and strips the stale entry. Best-effort cleanup; the install doesn’t fail if it can’t clean up.

The rule I wrote into docs/DECISIONS.md and want to carry into any future host integration: never silently edit a config file for a tool the user didn’t explicitly target. Use the host tool’s official CLI when one exists. If you have to hand-edit, ask first. Writing valid JSON to the wrong file is worse than writing invalid JSON to the right file, because there’s nothing to catch it. This is the load-bearing reason I haven’t rushed into multi-host MCP registration for Cursor, Codex, Continue, and Zed. Each one is its own config-file footgun, and I’d rather wait until I’m actively using a second host daily before building against its surface area.

Verified green after the fix: claude mcp list returns scry: /Users/jhoot/go/bin/scry mcp - ✓ Connected. Fresh session, same question. Claude runs scry - scry_refs (MCP)(symbol: "DB::table") and returns 92 real call sites filtered to the Illuminate facade class only, not the 276 unrelated table() methods, and not the 34-file Grep.

The postmortem lesson in one line: surface-level green is not the same as real green, especially when the “real” path requires the host tool to have read the right file at the right time.

What works today

TypeScript and JavaScript indexing is full coverage. Go is full coverage for refs and defs, partial coverage for callers and callees because scip-go only populates enclosing_range on some declarations. PHP and Laravel is feature-complete including the four post-processors. The daemon auto-spawns, handles its own lifecycle, watches for file changes, reindexes in the background without blocking queries, and cleans up on stop. The MCP integration is wired into Claude Code via scry setup. The curl | sh install script works on darwin and linux for both amd64 and arm64, with SHA256 verification against the GoReleaser artifacts.

What doesn’t work yet is spelled out just as plainly in the README, because honesty is more useful than spin.

The stack

The whole thing is a single static binary. Drop it on your PATH, run scry setup, and the next Claude Code session will start routing symbol queries through it. The first time you see a 6ms response to scry refs <something you care about> followed by the agent just writing the correct change, it feels a little bit like cheating. Which is kind of the point.

github.com/jeffdhooton/scry