v0.1 · AGPL-3.0 · local-first · deterministic
A forensic instrument for codebases.
Measurements. Not opinions.
entropyx scans your git repository and projects its temporal, structural, and authorship trajectory onto seven physical axes. It surfaces the files that will break, the commits that carry blast radius, and the ownership patterns nobody's been tracking.
Why this exists
A room full of senior engineers. A graph that just spiked. Nobody can answer the two questions that matter.
This is absurd. The truth is already in the repository. Every commit, every diff, every authorship shift, every rename, every file that grew a surface and never grew a test — recorded, timestamped, bitwise-deterministic. We just hadn't built an instrument that reads it back to us as measurement instead of folklore.
entropyx is that instrument.
The physics
Consumers — human or AI — never receive a bare scalar. Every composite score comes with the full seven-component breakdown and a deterministic provenance chain back to the commits.
| Sym | Axis | What it captures | Range |
|---|---|---|---|
| D_n | Change density | Energy absorbed per commit touching this file. | [0, 1] |
| H_a | Author dispersion | Shannon entropy across who's modified it. Bus factor, inverted. | [0, 1] |
| V_t | Temporal volatility | Burstiness of activity: calm file vs. panic file. | [0, 1) |
| C_s | Coupling stress | How much the rest of the system moves when this file moves. | [0, 1] |
| B_y | Blame youth | Fraction of code written in the repo's most-recent quarter. | [0, 1] |
| S_n | Semantic drift | Public-API delta: the surface changing, not just the body. | [0, 1] |
| T_c | Test co-evolution | A discount. Tested change is healthier change. | [0, 1] |
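To make two of the axes concrete, here is a minimal Python sketch of how an author-dispersion score and a volatility score could be computed so that they land in the ranges listed in the table. It is an illustration under assumptions, not entropyx's implementation: the function names are hypothetical, the normalization by the maximum entropy is assumed, and the burstiness coefficient (σ − μ)/(σ + μ) over inter-commit gaps is a standard formula chosen here because, clamped at zero, it stays inside [0, 1).

```python
import math
from collections import Counter
from statistics import mean, pstdev


def author_dispersion(authors):
    """Normalized Shannon entropy of commit authorship, in [0, 1].

    0.0 means one author made every change; 1.0 means commits are
    spread evenly across all authors seen. (Hypothetical helper.)
    """
    counts = Counter(authors)
    if len(counts) < 2:
        return 0.0  # zero or one author: nothing to disperse
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Divide by the largest entropy possible for this many authors.
    return entropy / math.log2(len(counts))


def burstiness(timestamps):
    """Burstiness of inter-commit gaps, clamped to [0, 1).

    Assumed formula: (sigma - mu) / (sigma + mu) over the gaps between
    consecutive commits. Evenly spaced commits score 0.0.
    """
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return 0.0
    mu, sigma = mean(gaps), pstdev(gaps)
    if mu + sigma == 0:
        return 0.0  # all commits share one timestamp
    return max(0.0, (sigma - mu) / (sigma + mu))
```

A file touched only by one person scores 0 on dispersion, which is the "bus factor, inverted" reading: the lower the score, the more the file depends on a single author.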
Six signal classes
Not predictions. Labels for dynamics that are already true. The classifier is rule-based and deterministic — no ML models, no version drift.
Bursts of temporal volatility clustered around fix: / hotfix: commits. The firefighting zone.
Small files with systemic blast radius. The innocuous 80-line helper that owns the whole stack.
Semantic drift rising, authorship narrowing, test coverage rising. A deliberate redesign in progress.
Public-API churning without test co-evolution. Silent interface rot.
Authorship spreading without a density drop. Team reorg or bus-factor erosion.
Low everything, old blame, no tests touching it. Rot hiding as stability.
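A rule-based classifier of this kind can be sketched in a few lines. The sketch below is hypothetical throughout: the label names (`api_rot`, `dormant`), the thresholds, and the `Axes` type are invented for illustration, and only two of the six descriptions above are encoded because those two can be read directly off the seven axes. The point is the shape: fixed rules, fixed order, no learned state, so the same input always yields the same labels.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Axes:
    d_n: float  # change density
    h_a: float  # author dispersion
    v_t: float  # temporal volatility
    c_s: float  # coupling stress
    b_y: float  # blame youth
    s_n: float  # semantic drift
    t_c: float  # test co-evolution


# Hypothetical thresholds; the real cut-offs are not given in the source.
HIGH = 0.7
LOW = 0.2


def classify(a: Axes) -> List[str]:
    """Apply fixed rules in a fixed order and return the matching labels."""
    labels = []
    # Public API churning while tests do not move with it.
    if a.s_n >= HIGH and a.t_c <= LOW:
        labels.append("api_rot")
    # Every axis quiet: low activity, old blame, no test co-evolution.
    if max(a.d_n, a.h_a, a.v_t, a.c_s, a.b_y, a.s_n, a.t_c) <= LOW:
        labels.append("dormant")
    return labels
```

For example, `classify(Axes(0.5, 0.5, 0.5, 0.5, 0.5, 0.9, 0.1))` returns `["api_rot"]`, and a file with every axis at or below 0.2 returns `["dormant"]`.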
Design philosophy
Most AI tooling bends the tool to fit the LLM. entropyx inverts it: one rigid, typed, self-describing contract — and the AI adapts to that contract once and forever.
stdin, stdout, exit codes, JSON. The most boring, most universal, most LLM-friendly interface there is.
entropyx describe returns the whole contract as JSON. Claude calls it once and knows how to use everything else.
Compact Summary up front. Fetch evidence by Handle on demand. Tokens are money; entropyx respects that.
tq1 envelope with a JSON Schema pinned to CONTRACT_VERSION. Validate, generate bindings, or just trust the shape.
No ML, no wall-clock reads, no nondeterminism. Run it twice — bitwise identical output. An AI that can't trust its instruments is just hallucinating with extra steps.
No API keys, no rate limits, no org admin. Runs on the dev's laptop or in CI against a cloned repo. Zero network dependency.
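The "run it twice, bitwise identical output" claim is easy to check from outside the tool, because the interface is just a process writing to stdout. The following Python sketch hashes the raw stdout of two runs and compares the digests. The helper names are hypothetical, and the demo uses a stand-in command so the sketch runs without entropyx installed.

```python
import hashlib
import subprocess
import sys


def stdout_digest(cmd):
    """Run cmd and return the SHA-256 hex digest of its raw stdout bytes."""
    # check=True raises CalledProcessError on a non-zero exit code.
    result = subprocess.run(cmd, capture_output=True, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


def is_deterministic(cmd, runs=2):
    """Return True if every run of cmd produces byte-identical stdout."""
    digests = {stdout_digest(cmd) for _ in range(runs)}
    return len(digests) == 1


if __name__ == "__main__":
    # Stand-in command; swap in the real CLI invocation to test the tool.
    demo = [sys.executable, "-c", "print('ok')"]
    print(is_deterministic(demo))
```

To check the tool itself, pass the scan invocation from the quick start, for example `is_deterministic(["entropyx", "scan", "/path/to/repo"])`, against a clone whose checkout does not change between runs.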
How we know it works
Every claim here was verified by running entropyx against production open-source repos and checking whether the signals matched known ground truth. Dogfooding the tool on real repos also found two bugs we then fixed — the instrument keeps getting more honest because we keep running it.
Top three hits on ripgrep (Rust): ignore/walk.rs, printer/standard.rs,
searcher/mod.rs. These are ripgrep's known complexity centers;
any contributor recognizes them on sight. The tool found them without being told.
rich/console.py surfaced at the top: 567 commits, 76% by the creator,
V_t saturated. When the calibrator was told to weight tests as "hot", ridge regression
pushed 98% of the weight onto S_n. That is genuine forensic truth: tests are where
APIs get defined.
Top hits: document.rb, site.rb, commands/serve.rb.
Verified against the Ruby parser directly: 55 defs in document.rb,
9 of them in a private section. The parser captured exactly 46 public methods, zero leakage.
Top hits: dfa.cc, parse.cc, regexp.cc (the DFA
engine, parser, and representation). Anyone who's worked on re2 will tell you those
are the three hardest files. C++ access-specifier filter verified: 3 private
methods on the outer RE2 class, none leaked.
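The visibility checks above reduce to simple arithmetic: for document.rb, 55 defs minus 9 private ones leaves the 46 public methods the parser reported. A naive cross-check of that kind can be sketched in Python as below. This is deliberately not a parser and not how entropyx extracts methods; the `count_defs` helper is hypothetical and only understands a bare `private` line, ignoring forms such as `private :name`, `private def`, `public`, and nested classes.

```python
import re

# A method definition at the start of a line (after indentation).
DEF = re.compile(r"^\s*def\s")
# A bare `private` keyword on its own line.
PRIVATE = re.compile(r"^\s*private\s*$")


def count_defs(ruby_source):
    """Return (total_defs, private_defs) for one Ruby source string.

    Every def that appears after a bare `private` line is counted
    as private. Naive line scan, for cross-checking only.
    """
    total = 0
    private = 0
    in_private = False
    for line in ruby_source.splitlines():
        if PRIVATE.match(line):
            in_private = True
        elif DEF.match(line):
            total += 1
            if in_private:
                private += 1
    return total, private


sample = """\
class Doc
  def a; end
  def b
  end

  private

  def c
  end
end
"""

total, private = count_defs(sample)
public = total - private  # same subtraction as 55 - 9 = 46
```

On the sample above the scan finds three defs, one of them private, so `public` is 2. Agreement between a crude count like this and the parser's output is evidence, not proof, that no private method leaked into the public surface.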
Quick start
Install from crates.io, point it at a real repo. That's the whole loop.
    # install from crates.io
    cargo install entropyx-cli

    # scan a repo → tq1 Summary JSON
    entropyx scan /path/to/repo > summary.json

    # drill into a file by its blob-hash handle
    entropyx explain /path/to/repo "file:0bdce769874b"

    # emit the typed protocol schema for your LLM
    entropyx schema > tq1-schema.json
Prefer source? git clone the repo and cargo build --release.
Five commands total: describe, scan, explain, calibrate, schema.
See the README for the full contract and the CLAUDE.md for the engineering detail.
The promise
The next time a production system falls over, nobody in the room should have to guess at what changed. The answer is in the repository. The repository already knows. entropyx just reads it back.