entropyx

v0.1 · AGPL-3.0 · local-first · deterministic

A quantifiable way
to tell the truth.

A forensic instrument for codebases.
Measurements. Not opinions.

entropyx scans your git repository and projects its temporal, structural, and authorship trajectory onto seven physical axes. It surfaces the files that will break, the commits that carry blast radius, and the ownership patterns nobody's been tracking.

bitwise deterministic · no network · 2.4s on ripgrep · 7 languages · typed tq1 protocol

Why this exists

Every incident starts
with the same scene.

A room full of senior engineers. A graph that just spiked. Nobody can answer the questions that matter.

What shipped yesterday?
I don't know, let me check Slack.
Who was working on that file?
Ask the on-call.
Was anything flagged as risky?
I think Jim had a PR open, but I'm not sure.

This is absurd. The truth is already in the repository. Every commit, every diff, every authorship shift, every rename, every file that grew a surface and never grew a test — recorded, timestamped, bitwise-deterministic. We just hadn't built an instrument that reads it back to us as measurement instead of folklore.

entropyx is that instrument.

The physics

Seven axes. Every score decomposes.

Consumers — human or AI — never receive a bare scalar. Every composite score comes with the full seven-component breakdown and a deterministic provenance chain back to the commits.

Sym | Axis | What it captures | Range
D_n | Change density | Energy absorbed per commit touching this file. | [0, 1]
H_a | Author dispersion | Shannon entropy across who's modified it. Bus factor, inverted. | [0, 1]
V_t | Temporal volatility | Burstiness of activity: calm file vs. panic file. | [0, 1)
C_s | Coupling stress | How much the rest of the system moves when this file moves. | [0, 1]
B_y | Blame youth | Fraction of code written in the repo's most-recent quarter. | [0, 1]
S_n | Semantic drift | Public-API delta: the surface changing, not just the body. | [0, 1]
T_c | Test co-evolution | A discount. Tested change is healthier change. | [0, 1]
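
As a rough illustration of how an axis like H_a can land in [0, 1], the sketch below computes Shannon entropy over per-author commit counts and divides by the maximum possible entropy, log2 of the number of distinct authors. The function name `author_dispersion` and the normalization are assumptions made for illustration; the exact formula entropyx uses is not stated here.

```python
import math
from collections import Counter


def author_dispersion(authors):
    """Normalized Shannon entropy of an author list, in [0, 1].

    `authors` holds one entry per commit touching a file. Dividing by
    log2(number of distinct authors) is an assumed normalization, not
    necessarily the one entropyx applies.
    """
    counts = Counter(authors)
    if len(counts) < 2:
        # Zero or one author: no dispersion at all.
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))


# A file owned almost entirely by one person scores low...
solo = author_dispersion(["ana"] * 9 + ["raj"])
# ...while evenly shared ownership sits at the top of the range.
shared = author_dispersion(["ana", "raj", "li", "sam"])
print(round(solo, 3), round(shared, 3))
```

Because the result depends only on the commit list, the same history always yields the same value, which is the property the rest of the page leans on.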

Six signal classes

Patterns the repository already carries.

Not predictions. Labels for dynamics that are already true. The classifier is rule-based and deterministic — no ML models, no version drift.

IncidentAftershock

Bursts of temporal volatility clustered around fix: / hotfix: commits. The firefighting zone.

CoupledAmplifier

Small files with systemic blast radius. The innocuous 80-line helper that owns the whole stack.

RefactorConvergence

Semantic drift rising, authorship narrowing, test co-evolution rising. A deliberate redesign in progress.

ApiDrift

Public-API churning without test co-evolution. Silent interface rot.

OwnershipFragmentation

Authorship spreading without a density drop. Team reorg or bus-factor erosion.

FrozenNeglect

Low everything, old blame, no tests touching it. Rot hiding as stability.
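
A rule-based classifier of this kind can be sketched as plain threshold checks over the seven axis values. Everything specific below is hypothetical: the `Axes` type, the `classify` function, and every threshold are invented for illustration and cover only two of the six classes. The actual rules entropyx applies are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axes:
    """The seven axis values for one file (field names mirror the table)."""
    d_n: float  # change density
    h_a: float  # author dispersion
    v_t: float  # temporal volatility
    c_s: float  # coupling stress
    b_y: float  # blame youth
    s_n: float  # semantic drift
    t_c: float  # test co-evolution


def classify(a: Axes) -> list[str]:
    """Return matching labels. All thresholds are made-up placeholders."""
    labels = []
    # ApiDrift: public surface churning while tests do not keep up.
    if a.s_n > 0.6 and a.t_c < 0.2:
        labels.append("ApiDrift")
    # FrozenNeglect: low activity on every axis, old blame, no tests.
    quiet = max(a.d_n, a.h_a, a.v_t, a.c_s, a.s_n) < 0.1
    if quiet and a.b_y < 0.05 and a.t_c < 0.05:
        labels.append("FrozenNeglect")
    return labels


drifting = Axes(d_n=0.5, h_a=0.3, v_t=0.4, c_s=0.2, b_y=0.6, s_n=0.8, t_c=0.1)
frozen = Axes(d_n=0.02, h_a=0.0, v_t=0.01, c_s=0.05, b_y=0.0, s_n=0.0, t_c=0.0)
print(classify(drifting), classify(frozen))
```

The point of the shape, rather than the numbers, is that `classify` is a pure function of its inputs: no model weights, no randomness, so identical axis values always produce identical labels.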

Design philosophy

Built for my buddy Claude.

Most AI tooling bends the tool to fit the LLM. entropyx inverts it: one rigid, typed, self-describing contract — and the AI adapts to that contract once and forever.

CLI, not SDK

stdin, stdout, exit codes, JSON. The most boring, most universal, most LLM-friendly interface there is.

Self-describing

entropyx describe returns the whole contract as JSON. Claude calls it once and knows how to use everything else.

Handle-addressable

Compact Summary up front. Fetch evidence by Handle on demand. Tokens are money; entropyx respects that.

Typed protocol

tq1 envelope with a JSON Schema pinned to CONTRACT_VERSION. Validate, generate bindings, or just trust the shape.

Deterministic forever

No ML, no wall-clock reads, no nondeterminism. Run it twice — bitwise identical output. An AI that can't trust its instruments is just hallucinating with extra steps.

Local-first

No API keys, no rate limits, no org admin. Runs on the dev's laptop or in CI against a cloned repo. Zero network dependency.
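
Two of the properties above, the self-describing contract and bitwise determinism, are easy to check from the consumer side. The sketch below is one way a caller might do that with nothing but the standard library. The helper names (`run_stdout`, `is_deterministic`, `load_contract`) are hypothetical, and invoking `entropyx describe` with no further arguments is an assumption based on the description above. Nothing is assumed about the keys inside the returned JSON.

```python
import hashlib
import json
import subprocess


def run_stdout(cmd):
    """Run a command and return its raw stdout bytes; raise on nonzero exit."""
    result = subprocess.run(cmd, capture_output=True, check=True)
    return result.stdout


def is_deterministic(cmd):
    """Run the same command twice and compare SHA-256 digests of stdout."""
    first = hashlib.sha256(run_stdout(cmd)).hexdigest()
    second = hashlib.sha256(run_stdout(cmd)).hexdigest()
    return first == second


def load_contract(cmd=("entropyx", "describe")):
    """Parse the JSON that a self-describing CLI prints on stdout."""
    return json.loads(run_stdout(list(cmd)))


# Example usage (assumes entropyx is installed and on PATH; the repo
# path is a placeholder):
#   contract = load_contract()
#   stable = is_deterministic(["entropyx", "scan", "/path/to/repo"])
```

Comparing digests of the raw bytes, rather than parsed JSON, is deliberate: it is the stricter check, and it matches the "bitwise identical" claim instead of a looser "semantically equal" one.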

How we know it works

We pointed it at real codebases.

Every claim here was verified by running entropyx against production open-source repos and checking whether the signals matched known ground truth. Dogfooding the tool on real repos also found two bugs we then fixed — the instrument keeps getting more honest because we keep running it.

ripgrep

Rust
235 files · 92 authors · 300 commits · 2.4s scan

Top three hits: ignore/walk.rs, printer/standard.rs, searcher/mod.rs. These are ripgrep's known complexity centers — any contributor recognizes them on sight. The tool found them without being told.

rich

Python
616 files · 299 authors · 3,830 commits · 5.8 yr history

rich/console.py surfaced at the top: 567 commits, 76% by the creator, V_t saturated. When the calibrator was told to treat tests as "hot", ridge regression pushed 98% of the weight onto S_n. That is a genuine forensic finding: tests are where APIs get defined.

Jekyll

Ruby
817 files · 117 authors · 500 commits · 9.4s scan

Top hits: document.rb, site.rb, commands/serve.rb. Verified against the Ruby parser directly: document.rb has 55 defs, 9 of them in a private section, and the parser captured exactly 46 public methods with zero leakage.

re2

C++
158 files · 17 authors · 300 commits · 74 events

Top hits: dfa.cc, parse.cc, regexp.cc (the DFA engine, parser, and representation). Anyone who's worked on re2 will tell you those are the three hardest files. C++ access-specifier filter verified: 3 private methods on the outer RE2 class, none leaked.

Quick start

One install. Three commands.

Install from crates.io, point it at a real repo. That's the whole loop.

# install from crates.io
cargo install entropyx-cli

# scan a repo → tq1 Summary JSON
entropyx scan /path/to/repo > summary.json

# drill into a file by its blob-hash handle
entropyx explain /path/to/repo "file:0bdce769874b"

# emit the typed protocol schema for your LLM
entropyx schema > tq1-schema.json

Prefer source? git clone the repo and cargo build --release.

Five commands total: describe, scan, explain, calibrate, schema. See the README for the full contract and CLAUDE.md for the engineering detail.

The promise

Stop guessing.
Start measuring.

The next time a production system falls over, nobody in the room should have to guess at what changed. The answer is in the repository. The repository already knows. entropyx just reads it back.