anomalyx
Contract-first anomaly detection over arbitrary corpora.
anomalyx is a deterministic Rust CLI built on the thesis of “AI Tools Need Contracts, Not Prompts”: the executable is the contract. Point it at ~30 formats (logs, security telemetry, packet captures, flow records, observability streams, spreadsheets, and data-lake files; see Supported formats for the full set) and it normalizes each into one typed record model, runs a battery of typed anomaly detectors, and returns a dense, versioned, machine-readable envelope that an agent (or a human) can trust, rather than pretty text that has to be scraped.
$ printf 'id,amount\n1,10\n2,11\n3,9\n4,10\n5,12\n6,11\n7,10\n8,9\n9,9999\n' | anomalyx scan
{"protocol":"anomalyx/tq1",...,"rows":[[0,1,2,1.0,3,4544.43,4]],...,"exit":1}
$ ... | anomalyx explain cell:amount:8
{"evidence":{"kind":"cell","column":"amount","row":8,"value":{"t":"int","v":9999}},"findings":[...]}
Why it exists
Humans paper over vague tools with context and memory; agents can’t. A tool whose behavior lives in prose, convention, and tribal knowledge is one an agent will eventually step on. anomalyx is shaped as an executable contract:
- A minimal, discoverable surface: four verbs, describe, schema, scan, explain.
- Typed, dense output: a versioned tq1 JSON envelope with a dictionary-pinned string table and stable evidence handles, not prose.
- Determinism as UX: same input + same config fingerprint yields byte-identical output. No wall-clock, no RNG in the measurement path.
- Honest absence: a detector that can’t run says so; it never fabricates a clean result. Exit codes are committed: 0 clean, 1 anomalies, 2 error.
What makes it trustworthy
- Nine detectors across a seven-class taxonomy — point, distributional, structural, multivariate, contextual, collective, and cadence anomalies.
- Any corpus: CSV, TSV, NDJSON, JSON, Parquet, Arrow IPC, and the other supported formats, all lowered to one engine-independent record model.
- Proven correct — the statistical core is validated against the NIST Statistical Reference Datasets (certified to 15 digits), and every crate passes a 0-surviving-mutant mutation gate on top of property-based tests.
Start with Install, then the four-verb contract.
Install
From crates.io
cargo install anomalyx
This installs the anomalyx binary. It pulls in the library crates
(anomalyx-core, anomalyx-normalize, anomalyx-detect) automatically.
From source
git clone https://github.com/copyleftdev/anomalyx
cd anomalyx
cargo install --path crates/anomalyx
Feature flags
Binary columnar formats (Parquet, Arrow IPC) are read through the Polars
backbone, behind the default-on polars feature of anomalyx-normalize. A
lean, text-only build drops that heavy dependency:
# text formats only (CSV / TSV / NDJSON / JSON), no Polars
cargo build -p anomalyx-normalize --no-default-features
Without the feature, a Parquet/Arrow input fails cleanly with an explicit “requires the ‘polars’ feature” error rather than misbehaving — honest absence at the build level.
Using the libraries
The detection engine is usable as a library. The crates.io packages are
namespaced (anomalyx-*) but expose conventional module names:
[dependencies]
anomalyx-core = "0.1"
anomalyx-detect = "0.1"
anomalyx-normalize = "0.1"
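use ax_detect::{DetectConfig, Registry, ScanContext};

// Read the raw bytes of the corpus, then normalize them into a RecordSet.
let bytes = std::fs::read("data.csv")?;
let rs = ax_normalize::normalize("data.csv", &bytes)?;
// Run the default detector set over the single (no-baseline) corpus.
let report = Registry::default_set()
    .run(&ScanContext::single(&rs), &DetectConfig::default());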
The four-verb contract
anomalyx exposes a deliberately small, discoverable surface. An agent can answer “what is this, what does it produce, what did it find, and why” with four verbs.
anomalyx describe Protocol metadata
anomalyx schema JSON Schema of scan output
anomalyx scan [--baseline B] [--period N] [--cadence COL] [PATH]
anomalyx explain <HANDLE> [--baseline B] [--period N] [--cadence COL] [PATH]
Input is a PATH or stdin (-). Exit codes are part of the contract:
| code | meaning |
|---|---|
| 0 | clean (no anomalies) |
| 1 | anomalies found |
| 2 | tool error (bad input, unresolved handle, …) |
describe — what this is
Emits protocol metadata: the supported input formats, the registered detectors,
the anomaly classes, the exit-code semantics, and the current deterministic
config fingerprint. Everything is derived from the same registries scan uses,
so the description can’t drift from behavior.
schema — the shape of the output
Emits a JSON Schema (draft 2020-12) pinning the tq1 envelope. Validate against
it instead of reverse-engineering field names. See The tq1 envelope.
scan — normalize, then detect
Reads the corpus, normalizes it to the internal record model, runs every
detector, and prints one dense tq1 envelope.
$ anomalyx scan sales.csv
{"protocol":"anomalyx/tq1", ... ,"exit":1}
explain — drill into a finding
Findings carry a stable handle (e.g. cell:amount:8, dist:score,
row:42, range:ts:20:40). explain resolves one back to its underlying
evidence, and re-attaches any findings pointing at it. An unresolvable handle
fails cleanly with exit 2 — never a fabricated hit.
$ anomalyx explain cell:amount:8 sales.csv
{"protocol":"anomalyx/tq1","handle":"cell:amount:8",
"evidence":{"kind":"cell","column":"amount","row":8,"value":{"t":"int","v":9999}},
"findings":[{"detector":"point.modz","class":"point","confidence":1.0, ... }]}
Stability (1.0)
As of 1.0, the tq1 contract is stable and committed. An agent can rely on
these without pinning a patch version:
- the protocol id anomalyx/tq1 (envelope::PROTOCOL);
- the exit codes: 0 clean, 1 anomalies found, 2 error;
- the dense finding-row layout ([detector, class, handle, confidence, severity, score, reason]) and the dictionary-pinned string table;
- the handle forms (column: / cell: / row: / range: / dist:) and their canonical string shapes;
- the envelope’s required fields and the severity ladder (info < low < medium < high < critical).
Breaking any of these requires a major bump and a PROTOCOL change — they
will not change quietly under 1.x.
What still evolves additively under 1.x: new detectors, new input formats,
new optional CLI flags, and new optional envelope fields (consumers must ignore
unknown fields). Anything that changes detector output for a given input —
a new threshold default, a recalibration — moves the config_version
fingerprint, so “the tool changed” stays distinguishable from “the data changed.”
Determinism remains absolute: same input + same config_version ⇒ byte-identical
output.
Anomaly taxonomy
“Anomaly” is not one thing. anomalyx classifies every finding into one of seven classes, so you reason about the kind of deviation, not just that “something is off.” Nine detectors implement the taxonomy today.
| Class | What it catches | Detector(s) |
|---|---|---|
| point | a single value far from its column’s distribution | point.modz |
| distributional | the distribution shifted vs. a baseline | dist.ks, dist.psi, dist.chi2 |
| structural | schema / type / null-rate / cardinality violations | struct.schema |
| multivariate | a row that breaks the joint structure across columns | mv.mahalanobis |
| contextual | a value anomalous only in context (seasonal) | ctx.seasonal |
| collective | a subsequence that is jointly anomalous (level shift) | coll.cusum |
| cadence | timing too regular to be organic (automation) | cad.regularity |
Every detector is deterministic — no RNG, no wall-clock — which is what lets anomalyx meet its byte-reproducibility guarantee. Where an off-the-shelf method would fight that (an isolation forest’s RNG, for instance), anomalyx uses a deterministic equivalent.
point — point.modz
Per-column univariate outliers via the Iglewicz–Hoaglin modified z-score,
M = 0.6745·(x − median)/MAD. MAD (median absolute deviation) is robust: a few
wild values don’t inflate the spread and mask each other. Falls back to mean/σ
when MAD collapses; a truly constant column flags nothing. Emits a cell handle.
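As a rough illustration of the modified z-score above, the following standalone Rust sketch scores one numeric column and flags cells past a cutoff. The function names are hypothetical and are not the anomalyx-detect API; the sketch also assumes NaN-free input and simply returns no scores when MAD is zero instead of implementing the mean/σ fallback.

```rust
/// Median of a slice (copies and sorts; assumes no NaN values).
fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut v = values.to_vec();
    v.sort_by(|a, b| a.total_cmp(b));
    let n = v.len();
    Some(if n % 2 == 1 { v[n / 2] } else { (v[n / 2 - 1] + v[n / 2]) / 2.0 })
}

/// Modified z-score per value: M = 0.6745 * (x - median) / MAD.
/// Returns None for empty input or when MAD is zero (assumption: the
/// mean/sigma fallback described in the text is not sketched here).
fn modified_z_scores(values: &[f64]) -> Option<Vec<f64>> {
    let med = median(values)?;
    let abs_dev: Vec<f64> = values.iter().map(|x| (x - med).abs()).collect();
    let mad = median(&abs_dev)?;
    if mad == 0.0 {
        return None;
    }
    Some(values.iter().map(|x| 0.6745 * (x - med) / mad).collect())
}

/// Row indices whose |M| exceeds `threshold`.
fn flag_outliers(values: &[f64], threshold: f64) -> Vec<usize> {
    match modified_z_scores(values) {
        Some(scores) => scores
            .iter()
            .enumerate()
            .filter(|(_, m)| m.abs() > threshold)
            .map(|(i, _)| i)
            .collect(),
        None => Vec::new(),
    }
}
```

Applied to the amount column from the opening example (10, 11, 9, 10, 12, 11, 10, 9, 9999) with a cutoff of 3.5, only row 8 is flagged. The raw value this sketch computes is not meant to reproduce the score field the tool reports.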
distributional — dist.ks / dist.psi / dist.chi2
Compare the current corpus against a --baseline:
- dist.ks: two-sample Kolmogorov–Smirnov on numeric columns (shape/location shift), with an asymptotic p-value.
- dist.psi: Population Stability Index over baseline-quantile bins (how much mass moved); the binned cousin of KL divergence.
- dist.chi2: chi-square over category frequencies for categorical columns; also surfaces brand-new categories.
Without a baseline these report honest absence. Emit dist handles.
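The statistic behind a two-sample Kolmogorov–Smirnov test is the largest gap between the two empirical CDFs. The sketch below computes that gap in standalone Rust. It is illustrative only: the function name is hypothetical, the asymptotic p-value is omitted, NaN-free input is assumed, and an empty sample returns None, mirroring the "cannot run" case.

```rust
/// Two-sample KS statistic D = max |F_a(x) - F_b(x)| over all observed x.
/// Returns None if either sample is empty.
fn ks_statistic(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.is_empty() || b.is_empty() {
        return None;
    }
    let mut a = a.to_vec();
    let mut b = b.to_vec();
    a.sort_by(|x, y| x.total_cmp(y));
    b.sort_by(|x, y| x.total_cmp(y));

    let (n, m) = (a.len() as f64, b.len() as f64);
    let (mut i, mut j) = (0usize, 0usize);
    let mut d = 0.0_f64;
    while i < a.len() && j < b.len() {
        // Next distinct value in the merged order.
        let x = if a[i] <= b[j] { a[i] } else { b[j] };
        // Advance both cursors past every value <= x (handles ties).
        while i < a.len() && a[i] <= x {
            i += 1;
        }
        while j < b.len() && b[j] <= x {
            j += 1;
        }
        // Gap between the two empirical CDFs evaluated at x.
        let gap = (i as f64 / n - j as f64 / m).abs();
        if gap > d {
            d = gap;
        }
    }
    Some(d)
}
```

Identical samples give D = 0, fully separated samples give D = 1, and swapping the arguments does not change the result.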
structural — struct.schema
Shape, not values. Single-corpus: columns with conflicting cell types (Mixed)
and columns whose null fraction exceeds a threshold. With a --baseline: a
schema diff — columns added, dropped, or whose inferred type changed. Emits
col handles.
multivariate — mv.mahalanobis
A row can be unremarkable on every axis yet a glaring joint outlier — e.g.
it breaks the correlation the rest of the data obeys. The Mahalanobis
distance measures distance from the centroid in units that account for each
feature’s spread and the covariance between features. Squared distance ~ χ²(d),
so a principled per-row p-value falls out. Own deterministic Cholesky solve, no
RNG. Emits a row handle.
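To make the "unremarkable on every axis, glaring jointly" idea concrete, here is a small standalone sketch of squared Mahalanobis distances computed through a Cholesky factorization. The names are hypothetical, the sample covariance (divide by n − 1) is an assumption, and the χ² p-value step is left out; it is not the detector's actual code.

```rust
/// Lower-triangular Cholesky factor of a symmetric positive-definite matrix.
/// Returns None if the matrix is not positive definite.
fn cholesky(a: &[Vec<f64>]) -> Option<Vec<Vec<f64>>> {
    let d = a.len();
    let mut l = vec![vec![0.0_f64; d]; d];
    for i in 0..d {
        for j in 0..=i {
            let mut s = a[i][j];
            for k in 0..j {
                s -= l[i][k] * l[j][k];
            }
            if i == j {
                if s <= 0.0 {
                    return None;
                }
                l[i][j] = s.sqrt();
            } else {
                l[i][j] = s / l[j][j];
            }
        }
    }
    Some(l)
}

/// Squared Mahalanobis distance of every row from the centroid.
/// Assumes all rows have the same length and at least two rows exist.
fn squared_mahalanobis(rows: &[Vec<f64>]) -> Option<Vec<f64>> {
    let n = rows.len();
    if n < 2 {
        return None;
    }
    let d = rows[0].len();
    let mut mean = vec![0.0_f64; d];
    for r in rows {
        for c in 0..d {
            mean[c] += r[c] / n as f64;
        }
    }
    // Sample covariance matrix (assumption: n - 1 denominator).
    let mut cov = vec![vec![0.0_f64; d]; d];
    for r in rows {
        for i in 0..d {
            for j in 0..d {
                cov[i][j] += (r[i] - mean[i]) * (r[j] - mean[j]) / (n as f64 - 1.0);
            }
        }
    }
    let l = cholesky(&cov)?;
    // Forward-substitute L y = (x - mean); then d^2 = y . y.
    Some(
        rows.iter()
            .map(|r| {
                let mut y = vec![0.0_f64; d];
                for i in 0..d {
                    let mut s = r[i] - mean[i];
                    for k in 0..i {
                        s -= l[i][k] * y[k];
                    }
                    y[i] = s / l[i][i];
                }
                y.iter().map(|v| v * v).sum::<f64>()
            })
            .collect(),
    )
}
```

With six rows on the line y = x (1,1 through 6,6) plus the row (2, 6), each coordinate of the extra row is within the range of the others, yet it receives the largest distance because it breaks the correlation.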
contextual — ctx.seasonal
A daytime traffic level at 3am; a weekday volume on a Sunday. Given a declared
period (--period N), each point is scored only against its own phase (row mod N),
that is, its seasonal peers, using the same robust modified z-score. Seasonality is never
guessed: without a period it reports honest absence.
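The phase-peer idea can be sketched as follows: group rows by row mod N, then apply the modified z-score inside each group. This is a minimal standalone illustration with hypothetical names; returning None for a period below 2 stands in for honest absence, and phases whose MAD is zero are skipped, which is a simplifying assumption.

```rust
/// Median of a non-empty slice (copies and sorts; assumes no NaN values).
fn median(values: &[f64]) -> f64 {
    let mut v = values.to_vec();
    v.sort_by(|a, b| a.total_cmp(b));
    let n = v.len();
    if n % 2 == 1 { v[n / 2] } else { (v[n / 2 - 1] + v[n / 2]) / 2.0 }
}

/// Flags row indices that are outliers relative to their own phase
/// (row mod period). Returns None when no usable period is declared.
fn seasonal_flags(series: &[f64], period: usize, threshold: f64) -> Option<Vec<usize>> {
    if period < 2 {
        return None; // no period declared: decline to run, do not guess
    }
    let mut flagged = Vec::new();
    for phase in 0..period {
        // Rows sharing this phase are the point's seasonal peers.
        let idx: Vec<usize> = (phase..series.len()).step_by(period).collect();
        if idx.is_empty() {
            continue;
        }
        let peers: Vec<f64> = idx.iter().map(|&i| series[i]).collect();
        let med = median(&peers);
        let abs_dev: Vec<f64> = peers.iter().map(|x| (x - med).abs()).collect();
        let mad = median(&abs_dev);
        if mad == 0.0 {
            continue; // assumption: skip degenerate phases in this sketch
        }
        for (&i, &x) in idx.iter().zip(&peers) {
            if (0.6745 * (x - med) / mad).abs() > threshold {
                flagged.push(i);
            }
        }
    }
    flagged.sort_unstable();
    Some(flagged)
}
```

In a period-3 series whose phases sit near 0, 50, and 100, a value of 50 landing in phase 0 is flagged even though 50 is a perfectly ordinary value for the series as a whole.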
collective — coll.cusum
A sustained shift in level is the canonical collective anomaly. CUSUM finds the
change point that maximizes the cumulative deviation from the mean; when the
standardized two-segment shift is large, the post-change segment is flagged as a
range handle.
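A minimal sketch of the change-point search described above: accumulate deviations from the overall mean, take the index where the absolute cumulative sum peaks, then standardize the difference between the two segment means. The function name and the pooled-standard-deviation scaling are assumptions made for illustration, not the detector's exact statistic.

```rust
/// Returns (k, shift): the split index k that maximizes the absolute
/// cumulative deviation from the overall mean, and the difference between
/// the means of x[..k] and x[k..] in units of the pooled standard deviation.
/// Returns None for series shorter than 4 points or with no deviation at all.
fn cusum_change_point(x: &[f64]) -> Option<(usize, f64)> {
    let n = x.len();
    if n < 4 {
        return None;
    }
    let mean = x.iter().sum::<f64>() / n as f64;

    let mut s = 0.0_f64;
    let mut best = 0.0_f64;
    let mut k = 0usize;
    // Stop one short of the end so the right-hand segment is never empty.
    for (i, v) in x.iter().enumerate().take(n - 1) {
        s += v - mean;
        if s.abs() > best {
            best = s.abs();
            k = i + 1;
        }
    }
    if k == 0 {
        return None;
    }

    let (left, right) = x.split_at(k);
    let ml = left.iter().sum::<f64>() / left.len() as f64;
    let mr = right.iter().sum::<f64>() / right.len() as f64;
    let ss = left.iter().map(|v| (v - ml).powi(2)).sum::<f64>()
        + right.iter().map(|v| (v - mr).powi(2)).sum::<f64>();
    let sd = (ss / (n as f64 - 2.0)).sqrt();
    if sd == 0.0 {
        return Some((k, f64::INFINITY));
    }
    Some((k, (mr - ml).abs() / sd))
}
```

For eight points near 10 followed by eight points near 20, the split lands at index 8, so the post-change segment would correspond to rows 8 through 15.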
cadence — cad.regularity
The inverse of every other detector: timing too regular to be organic — the
metronomic signature of automation. On a column named by --cadence COL, the
inter-arrival intervals’ coefficient of variation (CV = σ/μ) near zero is the
tell. Opt-in, because which column means “time” is never guessed.
Input & normalization
“Given any corpus of information regardless of its format, we’ll normalize it.”
anomalyx meets your data where it already lives. Every supported format —
whether a packet capture, a SIEM event stream, a Kubernetes manifest, or a
data-lake file — is lowered to one engine-independent record model, a
RecordSet of named, typed columns, and the detectors only ever see that. The
contract stays stable while the backend underneath it changes.
Supported formats
32 built-in parsers across five domains. Each is an independent plugin
(crates/ax-normalize/src/parsers/); adding one doesn’t touch the others.
Tabular & structured data
| Format | Extensions | Notes |
|---|---|---|
| CSV / TSV | .csv, .tsv, .tab | lean deterministic reader |
| NDJSON / JSON | .ndjson, .jsonl, .json | array, object, or one-record-per-line |
| YAML | .yaml, .yml | Kubernetes / CI manifests; multi-document |
| TOML / INI | .toml, .ini, .cfg, .conf | config drift via struct.schema |
| XML | .xml, .nessus | Nessus/OpenVAS, SOAP; repeated element → rows |
Columnar, data-lake & databases
| Format | Extensions | Backend |
|---|---|---|
| Parquet | .parquet, .pq | Polars / Arrow |
| Arrow IPC | .arrow, .ipc, .feather | Polars / Arrow |
| Avro | .avro | apache-avro |
| ORC | .orc | orc-rust → Arrow |
| Excel / ODS | .xlsx, .xls, .xlsb, .ods | calamine (first sheet) |
| SQLite | .db, .sqlite, .sqlite3, .db3 | rusqlite (first table, in-memory deserialize) |
Logs & observability
| Format | Detected by | Anomaly angle |
|---|---|---|
| logfmt | key=value shape | structured app logs |
| Web access logs (Combined/Common) | [time] "request" status | status-mix dist, latency point, bursts coll |
| syslog (RFC 3164 / 5424) | <PRI> header | event-rate dist, off-hours contextual |
| systemd journal | journalctl -o json | event-rate cadence/coll, rare-unit dist |
| Prometheus / OpenMetrics | exposition lines | per-series point spikes, dist drift |
| OpenTelemetry (OTLP/JSON) | resourceSpans | span-duration point, error-rate dist, emit cadence |
Security telemetry
| Format | Detected by | Anomaly angle |
|---|---|---|
| Zeek (conn.log family) | #separator header | connection analytics |
| CEF / LEEF | CEF: / LEEF: prefix | signature/category mix shift via dist.chi2 |
| auditd | msg=audit( | exec/syscall mix dist, bursty activity coll |
| EVTX (Windows Event Log) | ElfFile magic | rare event-ID point, logon dist, off-hours contextual |
| Suricata/Zeek EVE | event_type + timestamp | alert-type drift via dist.chi2; new classes surface |
| osquery results | hostIdentifier + columns/snapshot | fleet-posture drift via structural/dist |
| AWS CloudTrail | Records[].eventName | off-hours contextual/cadence, rare-API dist |
Network
| Format | Detected by | Anomaly angle |
|---|---|---|
| PCAP / PCAPNG | libpcap / SHB magic | beaconing/C2 via cadence on inter-arrival times |
| NetFlow / IPFIX (nfdump CSV) | nfdump header | exfil via mv.mahalanobis on (bytes, packets, duration) |
| AWS VPC Flow Logs | srcaddr dstaddr dstport header | same flow anomalies, zero new infra |
| DNS query logs (dnsmasq) | query[TYPE] … from | DGA/exfil via point on name entropy/length + cadence |
Several parsers compute the features the detectors want rather than just
extracting fields — DNS query-name Shannon entropy and length, flow duration
(end - start), span durationNanos, normalized epoch timestamps — and rename
cryptic source fields to a canonical schema (e.g. nfdump ibyt→bytes,
td→duration).
Resolution
Format is resolved by file extension first, then by content sniff —
binary magic numbers (PAR1, ORC, SQLite format 3\0, …) are checked at high
confidence, then distinctive text signatures, then a CSV last-resort fallback.
Resolution is deterministic: the highest-confidence match wins, ties break by
registration order. An unrecognized stream is an explicit error, never a silent
guess.
Several formats deliberately claim no extension (Zeek, syslog content,
journald, EVE, osquery, auditd, DNS, NetFlow, VPC) because their files are
generically *.log/*.json; pipe them on stdin and the content signature
routes them.
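The resolution order described above (extension first, then highest-confidence sniff, ties broken by registration order, explicit error otherwise) can be sketched like this. Everything here is hypothetical: the Parser struct, the confidence numbers, and the three toy sniffers are illustrative stand-ins, not the FormatParser or ParserRegistry types in ax-normalize.

```rust
/// Hypothetical parser descriptor: id, claimed extensions, and a content
/// sniff that returns a confidence (0 means "not mine").
struct Parser {
    id: &'static str,
    extensions: &'static [&'static str],
    sniff: fn(&[u8]) -> u8,
}

fn sniff_parquet(b: &[u8]) -> u8 {
    if b.starts_with(b"PAR1") { 100 } else { 0 } // binary magic: high confidence
}

fn sniff_json(b: &[u8]) -> u8 {
    if b.starts_with(b"{") || b.starts_with(b"[") { 60 } else { 0 }
}

fn sniff_csv(b: &[u8]) -> u8 {
    if b.contains(&b',') { 10 } else { 0 } // last-resort fallback
}

fn demo_registry() -> Vec<Parser> {
    vec![
        Parser { id: "parquet", extensions: &["parquet", "pq"], sniff: sniff_parquet },
        Parser { id: "json", extensions: &["json"], sniff: sniff_json },
        Parser { id: "csv", extensions: &["csv"], sniff: sniff_csv },
    ]
}

/// Resolve a format id: file extension first, then the highest-confidence
/// sniff. The strict `>` keeps the earliest-registered parser on ties.
fn resolve(registry: &[Parser], path: Option<&str>, bytes: &[u8]) -> Result<&'static str, String> {
    if let Some(ext) = path.and_then(|p| p.rsplit_once('.')).map(|(_, e)| e) {
        if let Some(p) = registry
            .iter()
            .find(|p| p.extensions.iter().any(|e| e.eq_ignore_ascii_case(ext)))
        {
            return Ok(p.id);
        }
    }
    let mut best: Option<(&Parser, u8)> = None;
    for p in registry {
        let confidence = (p.sniff)(bytes);
        if confidence > 0 && best.map_or(true, |(_, b)| confidence > b) {
            best = Some((p, confidence));
        }
    }
    best.map(|(p, _)| p.id)
        .ok_or_else(|| "unrecognized format".to_string())
}
```

A JSON object containing commas matches both the JSON and CSV sniffers in this toy registry; the higher confidence wins, so it resolves as JSON. Bytes that match nothing produce an error instead of a guess.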
Feature flags & the lean build
The binary and heavyweight parsers sit behind default-on feature flags, so a
default build reads everything but a --no-default-features build is a lean,
text-only normalizer with no binary dependencies:
| Feature | Parsers |
|---|---|
| polars | Parquet, Arrow IPC |
| evtx | EVTX |
| pcap | PCAP / PCAPNG |
| xlsx | Excel / ODS |
| sqlite | SQLite |
| datalake | Avro, ORC |
The record model
A RecordSet is named columns of equal length, each with an inferred type:
Int, Float, Bool, Str, Unknown, or Mixed (conflicting concrete types
— itself a structural signal). Values collapse into a small closed set, and
absence is explicit: a missing cell is Null, never a sentinel 0.0 that
would skew a mean.
amount,tier → column "amount": Int [10, 11, 9, …]
10,a column "tier": Str ["a", "b", "c", …]
11,b
Binary and library-backed formats live entirely behind this boundary: a Polars
DataFrame, an Arrow RecordBatch, a calamine sheet, or a SQLite row is
converted to a RecordSet (integers fold to i64, floats to f64 with
non-finite → Null, unsupported logical types preserved as their string form),
so no library type ever reaches a detector. Text formats touch none of it.
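One way to picture column type inference over text cells is the sketch below. It is an assumption-heavy illustration, not the ax-core implementation: the enum and function names are hypothetical, an empty cell stands in for Null, and any two different concrete types in one column (including Int next to Float) are treated as Mixed, which may differ from the tool's actual folding rules.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColType {
    Int,
    Float,
    Bool,
    Str,
    Unknown,
    Mixed,
}

/// Concrete type of one text cell; None models an explicit Null.
fn cell_type(cell: &str) -> Option<ColType> {
    let c = cell.trim();
    if c.is_empty() {
        return None;
    }
    if c.parse::<i64>().is_ok() {
        return Some(ColType::Int);
    }
    if c.parse::<f64>().is_ok() {
        return Some(ColType::Float);
    }
    if c == "true" || c == "false" {
        return Some(ColType::Bool);
    }
    Some(ColType::Str)
}

/// Column type: the single concrete type seen, Mixed on conflict,
/// Unknown when every cell is Null.
fn infer_column(cells: &[&str]) -> ColType {
    let mut seen: Option<ColType> = None;
    for t in cells.iter().filter_map(|c| cell_type(c)) {
        match seen {
            None => seen = Some(t),
            Some(prev) if prev != t => return ColType::Mixed,
            _ => {}
        }
    }
    seen.unwrap_or(ColType::Unknown)
}
```

Null cells never vote, so a column of integers with gaps still infers as Int, and a column with no values at all stays Unknown instead of being assigned a default type.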
Scan modes
A plain scan runs the single-corpus detectors (point, structural shape
checks). Three flags activate the rest; when a flag is absent, the detectors it
would enable report honest absence rather than guessing. A
further pair of flags, --columns / --exclude, narrows which columns are
analyzed at all.
--baseline B — drift & schema diff
Compares the current corpus against baseline B. Activates the distributional
detectors (dist.ks, dist.psi, dist.chi2) and the schema-diff half of
struct.schema.
$ anomalyx scan --baseline last_week.parquet this_week.parquet
# flags columns whose distribution shifted, plus added/dropped/type-changed columns
The envelope gains a baseline field recording the comparison source.
--period N — seasonal / contextual
Treats rows as an ordered time series of period N and runs ctx.seasonal,
comparing each point to its phase peers (row mod N).
$ anomalyx scan --period 7 daily_metrics.csv # weekly seasonality
A value can be perfectly ordinary globally yet wrong for its phase — e.g. a
50 where phase 0 normally sits near 0. Without --period, ctx.seasonal is
honestly absent; seasonality is never inferred.
--cadence COL — metronomic timing
Reads column COL as event times and runs cad.regularity, flagging
suspiciously regular inter-arrival intervals (automation).
$ anomalyx scan --cadence ts events.csv
# flags COL if its inter-arrival coefficient of variation is near zero
Organic streams are ragged; a metronome is a tell. Opt-in, because which column means “time” is never guessed.
The regularity bar is the inter-arrival coefficient of variation (CV =
stddev / mean); cad.regularity fires when CV is below a threshold (default
0.05). Tune it with --cad-max-cv F:
$ anomalyx scan --cadence timestamp beacon.pcap # default 0.05
$ anomalyx scan --cadence timestamp --cad-max-cv 0.15 beacon.pcap # catch jittered beacons
A perfectly periodic beacon has CV ≈ 0; real C2 channels add timing jitter to
evade exactly this kind of test. A ~10% jitter (CV ≈ 0.10) slips past the
default but is caught at --cad-max-cv 0.15 — at the cost of flagging more
merely-regular traffic. The threshold is folded into the envelope’s
config_version (cdcv=), so a non-default bar is a versioned, reproducible
choice, never a hidden one.
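The coefficient-of-variation test is simple enough to sketch directly. The standalone Rust below uses hypothetical function names and the population standard deviation (an assumption; the text does not say which estimator the detector uses), and treats timestamps as plain seconds.

```rust
/// Coefficient of variation (stddev / mean) of inter-arrival gaps.
/// Returns None with fewer than three timestamps or a non-positive mean gap.
fn inter_arrival_cv(times: &[f64]) -> Option<f64> {
    if times.len() < 3 {
        return None;
    }
    let gaps: Vec<f64> = times.windows(2).map(|w| w[1] - w[0]).collect();
    let n = gaps.len() as f64;
    let mean = gaps.iter().sum::<f64>() / n;
    if mean <= 0.0 {
        return None;
    }
    // Population variance (assumption for this sketch).
    let var = gaps.iter().map(|g| (g - mean).powi(2)).sum::<f64>() / n;
    Some(var.sqrt() / mean)
}

/// True when the timing is more regular than `max_cv` allows.
fn is_metronomic(times: &[f64], max_cv: f64) -> Option<bool> {
    inter_arrival_cv(times).map(|cv| cv < max_cv)
}
```

A beacon firing exactly every 60 seconds has CV = 0 and trips the default bar of 0.05. Gaps alternating between 54 and 66 seconds (about 10% jitter) give CV = 0.10, which passes at 0.05 but is caught at 0.15.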
Rows are treated in their given order as the time axis. If your data isn’t already time-ordered, sort it first.
Column roles (and --no-column-roles)
Every scanned column is classified into a role — measurement, identifier,
categorical, sequence, or constant — and the full map ships in the envelope’s
roles array. Detectors consult it to skip columns where their statistic is
meaningless: the point detector ignores identifier and sequence columns,
because a “large process-id” or the endpoint of a monotonic counter is not an
anomaly.
$ anomalyx scan app.log # roles on (default)
$ anomalyx scan --no-column-roles app.log # report roles, but skip nothing
Identifiers are recognized by name (*_id, uid, gid, pid, tid,
session, uuid, …) — the only reliable signal, since a process-id column is
statistically indistinguishable from a discrete measurement. A continuous
measurement (fare, durationNanos, DAYS_LOST) is never named like an id, so
it is never skipped. Cardinality is deliberately not used to call a numeric
column categorical — a column that is one value with a few wild outliers has low
cardinality yet is exactly what point detection should catch.
This is heuristic, but never silent: the role of every column is in the
envelope (audit it), and --no-column-roles disables the skipping entirely. On a
real 20k-entry journald capture it cuts point findings from ~12,500 to ~240 by
skipping the _PID/_UID/JOB_ID/timestamp columns, while leaving genuine measurements
untouched. The setting is part of config_version (cr=).
--set KEY=VALUE — tune detector config
Every detector threshold is a field of the config that describe reports.
--set overrides any of them by name (repeatable):
$ anomalyx scan --set point_threshold=4.0 --set dist_alpha=0.01 data.csv
$ anomalyx describe | jq .config # the settable keys + their defaults
An unknown key or a value that doesn’t fit the field is a hard error (exit 2).
Overrides flow into config_version, so a tuned run is just as reproducible and
self-describing as a default one — the knob is never hidden. (The common knobs
also have dedicated flags: --fdr, --cad-max-cv, --period, --cadence.)
--top N / --min-severity S — output scoping
Detection can surface tens of thousands of findings on a large corpus. These two
flags scope what scan emits without touching what it detects:
$ anomalyx scan --top 50 big.parquet # the 50 most severe
$ anomalyx scan --min-severity high big.parquet # only high/critical
$ anomalyx scan --fdr 0.01 --min-severity high --top 25 big.parquet # compose
--top N keeps the N most severe findings (the row list is already sorted
severity-first); --min-severity S keeps findings at or above S
(info < low < medium < high < critical).
The scoping is honest. summary (total, by_class, max_severity) and
the exit code always describe everything detected — so filtering the view
can never make anomalies look absent or flip exit 1→0. When findings are
withheld, the envelope gains a scope block recording the filter and the
detected / emitted / dropped counts; rows carries only the emitted
subset. Without these flags the block is absent and rows is complete.
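The separation between what is detected and what is emitted can be sketched as below. The types and the function are hypothetical, findings are reduced to their severity for brevity, the input is assumed to be already sorted severity-first, and emitting the scope block only when something was actually dropped is an assumption about the tool's behavior.

```rust
// Variant order defines the ladder: info < low < medium < high < critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Info,
    Low,
    Medium,
    High,
    Critical,
}

#[derive(Debug, PartialEq)]
struct Scope {
    detected: usize,
    emitted: usize,
    dropped: usize,
}

#[derive(Debug, PartialEq)]
struct Scoped {
    rows: Vec<Severity>,  // only the emitted subset
    total: usize,         // summary: always counts everything detected
    exit: i32,            // exit code: also based on everything detected
    scope: Option<Scope>, // present only when findings were withheld
}

/// Apply --min-severity and --top to the view without touching the
/// summary or the exit code. `findings` must be sorted severity-first.
fn scope_findings(findings: &[Severity], top: Option<usize>, min_severity: Option<Severity>) -> Scoped {
    let mut rows = findings.to_vec();
    if let Some(min) = min_severity {
        rows.retain(|s| *s >= min);
    }
    if let Some(n) = top {
        rows.truncate(n);
    }
    let detected = findings.len();
    let emitted = rows.len();
    let scope = if emitted < detected {
        Some(Scope { detected, emitted, dropped: detected - emitted })
    } else {
        None
    };
    Scoped { rows, total: detected, exit: if detected > 0 { 1 } else { 0 }, scope }
}
```

Even a filter that hides every finding leaves total and exit describing the full detection result, which is the property the paragraph above calls honest scoping.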
This is the volume complement to --fdr (which controls correctness): FDR makes findings statistically defensible, output scoping makes the list consumable. Together: “the top N, ≥ severity S, among the FDR-significant set.”
--fdr Q — false-discovery-rate control (point detector)
By default the point detector flags every cell whose modified z-score clears a
fixed cutoff. With thousands of cells, a fixed cutoff has no notion of how many
cells were tested. --fdr Q converts each cell’s score to a two-sided p-value
and applies the Benjamini–Hochberg procedure within each column, bounding the
expected proportion of false flags at Q:
$ anomalyx scan --fdr 0.05 events.parquet # ≤5% expected false discoveries
This is principled, not arbitrary: a column that is really just noise stops
contributing chance flags, and the same outlier can be significant in a small
column yet not in a large one (the per-rank bar (k/m)·Q shrinks with the number
of cells m). The threshold is folded into config_version (pfdr=), so a
non-default level is a versioned, reproducible choice.
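The Benjamini–Hochberg step-up rule itself is short. The sketch below takes one column's p-values as given (converting modified z-scores to two-sided p-values is not shown) and returns the indices that survive at level Q. The function name is hypothetical and this is not the detector's code.

```rust
/// Benjamini-Hochberg: sort p-values ascending, find the largest rank k
/// (1-based) with p_(k) <= (k / m) * q, and reject ranks 1..=k.
/// Returns the original indices of rejected hypotheses, ascending.
fn benjamini_hochberg(p_values: &[f64], q: f64) -> Vec<usize> {
    let m = p_values.len();
    let mut order: Vec<usize> = (0..m).collect();
    // Deterministic order: by p-value, ties broken by original index.
    order.sort_by(|&a, &b| p_values[a].total_cmp(&p_values[b]).then(a.cmp(&b)));

    let mut cutoff = 0usize;
    for (rank0, &idx) in order.iter().enumerate() {
        let bar = (rank0 + 1) as f64 / m as f64 * q;
        if p_values[idx] <= bar {
            cutoff = rank0 + 1; // keep the largest rank that clears its bar
        }
    }
    let mut rejected = order[..cutoff].to_vec();
    rejected.sort_unstable();
    rejected
}
```

The dependence on column size shows up directly: a p-value of 0.02 is significant at Q = 0.05 in a two-cell column, where the rank-1 bar is 0.025, but not in a ten-cell column, where the rank-1 bar shrinks to 0.005.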
--fdr controls correctness, not output volume. On genuinely heavy-tailed data it can flag more cells than the fixed cutoff, because those cells really are significant at Q. To cap volume, pair it with --columns / --exclude or with the --top / --min-severity output scoping.
--columns C,.. / --exclude C,.. — column scope
Restrict detection to a chosen set of columns (--columns, an allowlist) or to
everything but a set (--exclude, a denylist). The two are mutually exclusive.
The projection is applied before any detector runs, and to the --baseline too,
so drift comparison stays consistent.
# focus a wide log on the columns that carry signal
$ journalctl -o json | anomalyx scan --columns PRIORITY,_SYSTEMD_UNIT
# or keep everything except journald's identifier/counter/timestamp noise
$ journalctl -o json | anomalyx scan \
--exclude JOB_ID,_PID,__MONOTONIC_TIMESTAMP,__REALTIME_TIMESTAMP,N_RESTARTS
This is the answer to identifier noise on wide corpora. The point detector
will dutifully flag statistical outliers in every numeric column — including
JOB_ID, PIDs, monotonic timestamps and restart counters, where an “outlier” is
real but meaningless. On a raw 20k-entry journald capture that’s ~10k findings of
noise; excluding those fields collapses it to a couple hundred that matter.
The scope is explicit, never heuristic. anomalyx will not auto-guess which
columns are “interesting” — that would be a guess, and the obvious guess
(drop near-unique columns) would wrongly discard exactly the near-unique numeric
measurements the marquee detectors depend on (packet durationNanos, span
durations, latencies). You name the scope; the result stays deterministic.
A column named in --columns / --exclude that doesn’t exist in the corpus is a hard error (exit 2), so a typo can’t silently scope a scan down to nothing and read as “clean”. (The baseline is projected leniently: it’s a different corpus and need not carry every scoped column.)
The tq1 envelope
scan emits a single JSON object — the tq1 envelope. It is dense and typed,
not pretty text: a dictionary-pinned string table with findings encoded as
fixed-shape rows that reference it. Changing any field is an API change and is
guarded by a contract test. Run anomalyx schema for the machine-readable
JSON Schema.
{
"protocol": "anomalyx/tq1",
"config_version": "anomalyx-cfg/5;pt=3.5000;...",
"source": "sales.csv",
"format": "csv",
"baseline": "last_week.csv", // present only in --baseline mode
"rows_scanned": 9,
"dict": ["point.modz", "point", "cell:amount:8", "critical", "amount = 9999 …"],
"columns": ["detector","class","handle","confidence","severity","score","reason"],
"rows": [ [0, 1, 2, 1.0, 3, 4544.43, 4] ],
"absent": [ {"detector":"dist.ks","reason":"no baseline provided …"} ],
"summary": { "total": 1, "max_severity": "critical", "by_class": [ … ] },
"exit": 1
}
Fields
- protocol: "anomalyx/tq1". Bumps on any breaking envelope change.
- config_version: a fingerprint of every setting that affects output. Same input + same fingerprint ⇒ byte-identical output. Lets you tell “the data changed” from “the configuration changed.”
- dict: the string table. Every repeated string (detector ids, class tokens, handles, severities, reasons) appears once here; rows reference it by index. No magic constants.
- columns: the fixed column order of each dense finding row.
- rows: one array per finding, aligned to columns: [detector_idx, class_idx, handle_idx, confidence, severity_idx, score, reason_idx]. confidence is calibrated to one scale across every detector: a logistic of how far the detector’s statistic sits past its firing threshold, measured relatively (so units cancel), equal to 0.5 at the threshold and rising toward 1.0. A finding “2× past threshold” earns the same confidence whether it came from a modified z-score, a KS p-value, a PSI, or a cadence CV, so severity (derived from confidence) ranks findings from different detectors on one scale. score is the detector’s raw statistic (uncalibrated), for drill-down.
- absent: detectors that declined to run, each with a machine-readable reason. See honest absence.
- summary: total count, max severity, and per-class counts for at-a-glance triage.
- exit: the committed exit code, mirrored into the envelope.
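A consumer turns a dense row back into a readable finding by looking its indices up in dict. The sketch below does that in standalone Rust without a JSON library, so the row is modeled as seven f64 values; the Finding struct and decode_row are hypothetical names, not part of the anomalyx crates.

```rust
#[derive(Debug, PartialEq)]
struct Finding<'a> {
    detector: &'a str,
    class: &'a str,
    handle: &'a str,
    confidence: f64,
    severity: &'a str,
    score: f64,
    reason: &'a str,
}

/// Decode one dense row laid out as
/// [detector_idx, class_idx, handle_idx, confidence, severity_idx, score, reason_idx].
/// Returns None if any index is negative, fractional, or out of range.
fn decode_row<'a>(dict: &[&'a str], row: &[f64; 7]) -> Option<Finding<'a>> {
    let lookup = |v: f64| -> Option<&'a str> {
        if v < 0.0 || v.fract() != 0.0 {
            return None;
        }
        dict.get(v as usize).copied()
    };
    Some(Finding {
        detector: lookup(row[0])?,
        class: lookup(row[1])?,
        handle: lookup(row[2])?,
        confidence: row[3],
        severity: lookup(row[4])?,
        score: row[5],
        reason: lookup(row[6])?,
    })
}
```

Fed the dict and the single row from the example envelope above, this yields detector point.modz, class point, handle cell:amount:8, and severity critical. A real consumer would validate against the output of anomalyx schema first and read the column order from columns.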
Handles
Findings are compact but drill-able. Each carries a stable handle whose
canonical string is consistent across runs, so an agent can cache it and later
explain it:
| Handle | Form | Used by |
|---|---|---|
| column | col:<name> | structural |
| cell | cell:<column>:<row> | point |
| range | range:<column>:<start>:<end> | collective |
| dist | dist:<column> | distributional |
| row | row:<n> | multivariate |
Findings are sorted deterministically (severity desc, then class, handle, detector), so the envelope is stable regardless of the order detectors ran.
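For an agent that caches handles, parsing them back into structured form might look like the sketch below. The Handle enum and parse_handle are hypothetical; the prefixes follow the Form column of the table above, and splitting from the right so that a column name may itself contain a colon is an assumption about the canonical shape.

```rust
#[derive(Debug, PartialEq)]
enum Handle {
    Column(String),
    Cell { column: String, row: usize },
    Range { column: String, start: usize, end: usize },
    Dist(String),
    Row(usize),
}

/// Parse a canonical handle string; None if it matches no known form.
fn parse_handle(s: &str) -> Option<Handle> {
    let (kind, rest) = s.split_once(':')?;
    if rest.is_empty() {
        return None;
    }
    match kind {
        "col" => Some(Handle::Column(rest.to_string())),
        "dist" => Some(Handle::Dist(rest.to_string())),
        "row" => rest.parse().ok().map(Handle::Row),
        "cell" => {
            // Split from the right: the trailing field is the row index.
            let (column, row) = rest.rsplit_once(':')?;
            Some(Handle::Cell { column: column.to_string(), row: row.parse().ok()? })
        }
        "range" => {
            let (head, end) = rest.rsplit_once(':')?;
            let (column, start) = head.rsplit_once(':')?;
            Some(Handle::Range {
                column: column.to_string(),
                start: start.parse().ok()?,
                end: end.parse().ok()?,
            })
        }
        _ => None,
    }
}
```

An unknown prefix or a malformed index yields None, which mirrors how explain treats an unresolvable handle: an explicit failure, never a guessed match.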
Determinism & honest absence
Two principles run through the whole tool. Both exist because the primary consumer is an agent, and an agent can’t paper over surprises the way a human can.
Determinism is UX
“Determinism is not just a testing preference. It is user experience for agents.”
Same input + same config_version ⇒ byte-identical output. Concretely:
- Order-independent reductions. Floating-point addition is commutative but not associative, so a naïve sum over more than two values depends on the order of accumulation. Every reduction (mean, variance, MAD, quantiles, PSI, …) sorts its inputs by total order and accumulates with compensated (Neumaier) summation, so the same multiset of values yields the same bits regardless of arrangement. This is exercised on real NIST data under reversal and rotation.
- No wall-clock, no RNG in the measurement path. Detectors that elsewhere rely on randomness (e.g. isolation forests) are replaced with deterministic equivalents (Mahalanobis distance).
- Stable interning and sorting. The envelope’s string table and finding order are deterministic, so two runs diff cleanly.
- A config fingerprint. Any threshold that could change output also changes config_version, so you can always tell “the data changed” from “the tool’s configuration changed.”
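The order-independent reduction in the first bullet can be sketched as sort-then-Neumaier. The function name below is hypothetical (it is not the tool's det_sum), and NaN handling is left out.

```rust
/// Order-independent sum: sort by IEEE total order, then accumulate with
/// Neumaier (improved Kahan) compensated summation.
fn det_sum_sketch(values: &[f64]) -> f64 {
    let mut v = values.to_vec();
    // Canonical order: any permutation of the same multiset sorts identically.
    v.sort_by(|a, b| a.total_cmp(b));

    let mut sum = 0.0_f64;
    let mut comp = 0.0_f64; // running compensation for lost low-order bits
    for x in v {
        let t = sum + x;
        if sum.abs() >= x.abs() {
            comp += (sum - t) + x; // low-order bits of x were lost
        } else {
            comp += (x - t) + sum; // low-order bits of sum were lost
        }
        sum = t;
    }
    sum + comp
}
```

Sorting removes the dependence on input order, and the compensation term recovers digits a plain running sum would drop: for the inputs 1e100, 1.0, and -1e100 the sketch returns exactly 1.0, in any arrangement.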
Honest absence
“An AI-first instrument should not try to sound intelligent.”
A detector that cannot meaningfully run says so — it never fabricates a clean
result. Absences are first-class, recorded in the envelope’s absent array with
a machine-readable reason:
"absent": [
{"detector":"dist.ks","reason":"no baseline provided; distributional drift requires --baseline"},
{"detector":"ctx.seasonal","reason":"contextual detection needs a declared period ≥ 2 (pass --period N)"},
{"detector":"mv.mahalanobis","reason":"needs at least 2 numeric columns for a multivariate distance"}
]
The same honesty appears at every level:
- A missing cell is Null, never 0.0.
- An unavailable detector contributes nothing, not an implied “looks fine.”
- An unresolved explain handle fails with exit 2, not a fabricated hit.
- A build without Polars support rejects Parquet explicitly.
Validation against NIST
Every detector rests on a small set of deterministic reductions (mean, standard deviation, …). “anomalyx is mathematically correct” is therefore a checked claim, not an assertion: those reductions are validated against the NIST Statistical Reference Datasets (StRD) — the canonical, certified-to-15-digits truth for univariate summary statistics. The datasets are vendored offline, so validation is reproducible with no network.
Results are scored by NIST’s own metric, the log relative error (the number of correct significant digits):
- mean reproduces every certified value to ≥ 15 digits.
- std_dev reaches ≥ 13 digits on well-conditioned data.
The precision proof
The NumAcc3 / NumAcc4 datasets are torture tests: a mean near 10⁶–10⁷ with
a standard deviation of exactly 0.1. The textbook one-pass variance
(Σx² − (Σx)²/n) suffers catastrophic cancellation here. anomalyx’s compensated
two-pass reduction does not:
| dataset | anomalyx std (correct digits) | naïve one-pass |
|---|---|---|
| NumAcc3 | 9.46 | 1.14 |
| NumAcc4 | 8.25 | 0.00 (zero correct digits) |
| Michelson | 13.84 | 8.28 |
On NumAcc4 the textbook formula gets nothing right; anomalyx tracks NIST to
~8 digits — the ceiling imposed by the f64 representation of the inputs
themselves, which is all NIST expects. This is a checked demonstration that the
determinism-and-precision design is load-bearing, not decorative.
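The cancellation effect is easy to reproduce on a small synthetic series. The sketch below builds a NumAcc4-like input (an assumption modeled on the description above, not the vendored NIST file: 1001 values around 10^7 whose true standard deviation is 0.1) and compares the textbook one-pass formula with a plain two-pass one. The two-pass version here uses ordinary summation, so it is simpler than the compensated reduction the tool describes.

```rust
/// Textbook one-pass sample standard deviation:
/// sqrt((sum(x^2) - sum(x)^2 / n) / (n - 1)). Prone to cancellation.
fn std_one_pass(x: &[f64]) -> f64 {
    let n = x.len() as f64;
    let sum: f64 = x.iter().sum();
    let sum_sq: f64 = x.iter().map(|v| v * v).sum();
    ((sum_sq - sum * sum / n) / (n - 1.0)).sqrt()
}

/// Two-pass sample standard deviation: compute the mean first, then sum
/// squared deviations from it.
fn std_two_pass(x: &[f64]) -> f64 {
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    let ss: f64 = x.iter().map(|v| (v - mean).powi(2)).sum();
    (ss / (n - 1.0)).sqrt()
}

/// Synthetic series: one value at 10000000.2, then 500 pairs of
/// 10000000.1 and 10000000.3. Exact mean 10000000.2, exact std 0.1.
fn numacc4_like() -> Vec<f64> {
    let mut x = vec![10_000_000.2];
    for _ in 0..500 {
        x.push(10_000_000.1);
        x.push(10_000_000.3);
    }
    x
}
```

On this input the two sums in the one-pass formula are both near 10^17, where adjacent f64 values are 16 apart, while the quantity of interest (their difference) is 10. The subtraction therefore cannot land near the true value, whereas the two-pass result stays close to 0.1.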
Stress tests
Beyond certified values, the harness verifies behavior against known ground truth:
- Ground-truth recovery: planted outliers are flagged exactly, with no false positives or negatives.
- Order independence: det_sum is bit-identical under reversal and rotation on real 5000-point NIST data.
- Reproducibility at scale: a 40k-row scan serializes identically across runs.
Architecture
A small workspace of focused crates. The guiding rule: the contract is engine-independent, so the heavy machinery can change without the output shape moving.
crates/
ax-core contract types: RecordSet, the anomaly taxonomy, the tq1
envelope, evidence handles, deterministic reductions.
Deliberately no heavy deps — keeps the contract independent
and the mutation gate fast. (crate: anomalyx-core)
ax-normalize any input format → RecordSet. CSV/TSV/NDJSON/JSON via a lean
deterministic reader; Parquet/Arrow IPC via the Polars
backbone, behind the default-on `polars` feature.
(crate: anomalyx-normalize)
ax-detect the Detector trait + registry; the nine detectors and their
math (assembled from statrs, not reinvented).
(crate: anomalyx-detect)
anomalyx the four-verb CLI surface — the installable binary.
ax-validate NIST StRD validation + stress harness (publish = false).
Engine independence
Polars lives only inside ax-normalize’s binary-format reader. It reads a
DataFrame and lowers it to a RecordSet; no Polars type ever reaches a
detector, the envelope, or the contract. That’s what lets the text-only build
drop Polars entirely, and what keeps ax-core — where the taxonomy and envelope
live — a tiny, dependency-light crate that the mutation gate can sweep quickly.
Adding a format (the parser plugin system)
ax-normalize is a parser-plugin registry. Each format is an independent
FormatParser (id, extensions, content sniff, parse) living in its own
file under crates/ax-normalize/src/parsers/. The ParserRegistry resolves a
byte stream by file extension first, then by the highest-confidence sniff
(deterministic: confidences are registered in descending order). Adding a format
is a new parsers/<fmt>.rs plus one register(...) line in default_registry —
no central match to edit. See the open format issues for the backlog.
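A minimal sketch of the resolution order described above: extension first, then the strongest content sniff. The trait methods, the `Confidence` ordering, the three toy parsers, and the first-registered-wins tie-break are all assumptions made for illustration; the real FormatParser also carries a parse step, which is omitted here.

```rust
/// Sniff confidence, weakest to strongest. The tier names echo those used
/// in the changelog; their exact ordering here is an assumption.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Confidence { Fallback, Text, Strong, Magic }

/// Hypothetical plugin shape: id, extensions, content sniff.
trait FormatParser {
    fn id(&self) -> &'static str;
    fn extensions(&self) -> &'static [&'static str];
    fn sniff(&self, bytes: &[u8]) -> Option<Confidence>;
}

struct Csv;
impl FormatParser for Csv {
    fn id(&self) -> &'static str { "csv" }
    fn extensions(&self) -> &'static [&'static str] { &["csv"] }
    fn sniff(&self, bytes: &[u8]) -> Option<Confidence> {
        bytes.contains(&b',').then_some(Confidence::Fallback)
    }
}

struct Json;
impl FormatParser for Json {
    fn id(&self) -> &'static str { "json" }
    fn extensions(&self) -> &'static [&'static str] { &["json"] }
    fn sniff(&self, bytes: &[u8]) -> Option<Confidence> {
        let first = bytes.iter().copied().find(|b| !b.is_ascii_whitespace());
        matches!(first, Some(b'{' | b'[')).then_some(Confidence::Text)
    }
}

struct Sqlite;
impl FormatParser for Sqlite {
    fn id(&self) -> &'static str { "sqlite" }
    fn extensions(&self) -> &'static [&'static str] { &["db", "sqlite"] }
    fn sniff(&self, bytes: &[u8]) -> Option<Confidence> {
        bytes.starts_with(b"SQLite format 3\0").then_some(Confidence::Magic)
    }
}

struct ParserRegistry { parsers: Vec<Box<dyn FormatParser>> }

impl ParserRegistry {
    fn register(&mut self, p: Box<dyn FormatParser>) { self.parsers.push(p); }

    /// Extension first; otherwise the highest-confidence sniff. Ties go to
    /// the earliest-registered parser, so the choice is deterministic.
    fn resolve(&self, file_name: Option<&str>, bytes: &[u8]) -> Option<&'static str> {
        let ext = file_name.and_then(|n| n.rsplit_once('.')).map(|(_, e)| e);
        if let Some(ext) = ext {
            let by_ext = self.parsers.iter()
                .find(|p| p.extensions().iter().any(|e| *e == ext));
            if let Some(p) = by_ext {
                return Some(p.id());
            }
        }
        let mut best: Option<(Confidence, &'static str)> = None;
        for p in &self.parsers {
            if let Some(c) = p.sniff(bytes) {
                if best.map_or(true, |(b, _)| c > b) {
                    best = Some((c, p.id()));
                }
            }
        }
        best.map(|(_, id)| id)
    }
}

/// Adding a format is one more `register(...)` line here.
fn default_registry() -> ParserRegistry {
    let mut reg = ParserRegistry { parsers: Vec::new() };
    reg.register(Box::new(Csv));
    reg.register(Box::new(Json));
    reg.register(Box::new(Sqlite));
    reg
}
```

With this shape, `default_registry().resolve(Some("data.csv"), bytes)` picks the CSV parser by extension regardless of content, while a nameless stream falls through to the sniff.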
The detector contract
A Detector is itself a contract. Given a ScanContext { current, baseline }
it either runs and emits Findings, or declares honest Absence. The
Registry runs the set deterministically and merges everything into one
Report, which the CLI turns into a tq1 envelope. Adding a detector is:
implement the trait, register it, and gate it.
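The run-or-declare-absence shape can be sketched as follows. Every type, field, and method below is a simplified stand-in chosen for illustration (including the toy `MeanShift` detector and the `column:<name>` handle spelling); none of it is the real ax-core or ax-detect API.

```rust
/// Simplified stand-ins for the contract types (hypothetical).
struct RecordSet { column: String, values: Vec<f64> }

struct ScanContext<'a> {
    current: &'a RecordSet,
    baseline: Option<&'a RecordSet>,
}

#[derive(Debug, PartialEq)]
struct Finding { detector: &'static str, handle: String, statistic: f64 }

/// A detector either ran (possibly finding nothing) or says why it could not.
#[derive(Debug, PartialEq)]
enum Outcome {
    Findings(Vec<Finding>),
    Absence { detector: &'static str, reason: &'static str },
}

trait Detector {
    fn id(&self) -> &'static str;
    fn run(&self, ctx: &ScanContext<'_>) -> Outcome;
}

fn mean(v: &[f64]) -> f64 {
    v.iter().sum::<f64>() / v.len() as f64
}

/// Toy drift detector: it needs a baseline, and says so when there is none.
struct MeanShift { threshold: f64 }

impl Detector for MeanShift {
    fn id(&self) -> &'static str { "toy.meanshift" }

    fn run(&self, ctx: &ScanContext<'_>) -> Outcome {
        let Some(baseline) = ctx.baseline else {
            return Outcome::Absence { detector: self.id(), reason: "no baseline supplied" };
        };
        if baseline.values.is_empty() || ctx.current.values.is_empty() {
            return Outcome::Absence { detector: self.id(), reason: "empty column" };
        }
        let shift = (mean(&ctx.current.values) - mean(&baseline.values)).abs();
        let mut findings = Vec::new();
        if shift > self.threshold {
            findings.push(Finding {
                detector: self.id(),
                handle: format!("column:{}", ctx.current.column),
                statistic: shift,
            });
        }
        Outcome::Findings(findings)
    }
}

/// Run detectors in registration order so the merged result is deterministic.
fn run_all(detectors: &[Box<dyn Detector>], ctx: &ScanContext<'_>) -> Vec<Outcome> {
    detectors.iter().map(|d| d.run(ctx)).collect()
}
```

The point of the two-variant `Outcome` is that "ran and found nothing" (`Findings(vec![])`) and "could not run" (`Absence`) are never conflated.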
Naming
The crates.io packages are namespaced under the brand (anomalyx-core,
anomalyx-normalize, anomalyx-detect) because the short ax-* names were
taken; the in-source module/import names remain ax_core etc. via Cargo’s
dependency-rename, so the code reads cleanly.
Quality gates
Two load-bearing test gates back every change. Both run locally via scripts/gates.sh;
CI runs the property tests on every push, while the mutation gate stays local (see CI below).
Property-based testing
Invariants are pinned across all inputs with proptest, not just hand-picked
cases — for example:
- the point detector is shift-, scale-, and permutation-invariant;
- KS is symmetric and lies in [0, 1]; PSI is non-negative;
- Mahalanobis flagging is translation-invariant;
- reductions are order-independent and reproducible.
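To make the first invariant concrete, here is a std-only sketch of the Iglewicz-Hoaglin modified z-score with hand-picked shift, scale, and permutation checks. The repository pins these properties with proptest over generated inputs; the function names and the 3.5 cutoff (the commonly cited textbook value) are assumptions of this sketch, not anomalyx's configured defaults.

```rust
/// Median of an already-sorted, non-empty slice.
fn median(sorted: &[f64]) -> f64 {
    let n = sorted.len();
    if n % 2 == 1 {
        sorted[n / 2]
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
    }
}

/// Modified z-scores: 0.6745 * (x - median) / MAD.
/// Returns None for an empty column or when MAD is zero (no robust scale).
fn modified_z(values: &[f64]) -> Option<Vec<f64>> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let med = median(&sorted);

    let mut dev: Vec<f64> = values.iter().map(|x| (x - med).abs()).collect();
    dev.sort_by(|a, b| a.total_cmp(b));
    let mad = median(&dev);
    if mad == 0.0 {
        return None;
    }
    Some(values.iter().map(|x| 0.6745 * (x - med) / mad).collect())
}

/// Row indices whose |modified z| exceeds `threshold`.
fn flagged(values: &[f64], threshold: f64) -> Vec<usize> {
    match modified_z(values) {
        Some(z) => z
            .iter()
            .enumerate()
            .filter(|(_, score)| score.abs() > threshold)
            .map(|(i, _)| i)
            .collect(),
        None => Vec::new(),
    }
}
```

Shifting every value by a constant moves the median by the same constant and leaves MAD alone; scaling by a positive factor scales both numerator and MAD, so the score and the flagged set are unchanged. Permuting the rows only permutes the flagged indices.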
Mutation testing
Property tests are only as good as their teeth. cargo-mutants mutates the
source and checks that some test fails for each change. The gate is zero
surviving mutants across the workspace.
Getting there surfaced — and killed — real test gaps, and forced exact-value
pins (e.g. validating reductions against NIST). A handful of mutants are
genuinely equivalent (they cannot change observable behavior for any input —
a measure-zero p == α boundary, or a sign flip that the Σ(deviations) == 0
identity cancels); those are documented individually in .cargo/mutants.toml,
never blanket-suppressed. Loop-bound mutations that hang are detected as
timeouts (a hang is caught, not a survivor), so the gate is precisely “no
mutant survives.”
CI
.github/workflows/ci.yml runs the fast gates on every push and pull request:
cargo fmt --check, cargo clippy -D warnings, the full test suite, and the
text-only --no-default-features build.
The mutation gate runs locally, not in CI: cargo mutants consumes far too many
runner minutes on hosted runners. It is enforced before pushing via:
./scripts/gates.sh # fmt · clippy · test · mutation (0 surviving mutants)
and is the contributor’s responsibility (the gate workflow can fan it out
per-crate). Treat a green local mutation run as part of “done.”
Worked examples
The repository’s examples/
directory holds small, runnable programs that use anomalyx on real data. They
exist to demonstrate one thing the contract makes possible: an agent (or a
30-line script) can consume the tq1 envelope directly — parse the dense
finding rows and the dict-pinned string table, then map each
handle back to a row, cell, or timestamp — rather than scraping
human-readable text.
They live outside the Cargo workspace and shell out to the installed anomalyx
binary, so they have no effect on the build or the quality gates.
Each mirrors anomalyx’s exit code (0 clean, 1 anomalies, 2 error).
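The repository's examples are Python scripts; the same shell-out pattern can be sketched in Rust with only the standard library. It assumes the anomalyx binary is installed and on PATH, and that `scan` reads the corpus from stdin as in the quick-start at the top of this README. The names `ScanStatus`, `classify_exit`, and `run_scan` are hypothetical.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Outcome of a scan, following the committed exit codes (0 / 1 / 2).
#[derive(Debug, PartialEq)]
enum ScanStatus { Clean, Anomalies, Error }

fn classify_exit(code: Option<i32>) -> ScanStatus {
    match code {
        Some(0) => ScanStatus::Clean,
        Some(1) => ScanStatus::Anomalies,
        // 2, any other code, or termination by signal: treat as an error.
        _ => ScanStatus::Error,
    }
}

/// Pipe `input` to `anomalyx scan` and return (status, raw envelope text).
/// Fine for small inputs; a large corpus would need the write on its own
/// thread so the child's stdout pipe cannot fill up and stall both sides.
fn run_scan(input: &[u8]) -> std::io::Result<(ScanStatus, String)> {
    let mut child = Command::new("anomalyx")
        .arg("scan")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // The temporary handle is dropped at the end of this statement, which
    // closes stdin and lets the child see end-of-input.
    child.stdin.take().expect("stdin was piped").write_all(input)?;
    let out = child.wait_with_output()?;
    let envelope = String::from_utf8_lossy(&out.stdout).into_owned();
    Ok((classify_exit(out.status.code()), envelope))
}
```

A caller would hand the returned envelope to its JSON parser of choice and exit with the same status it received, so the wrapper composes in a pipeline exactly like the binary does.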
The examples
| Example | Data | What it surfaces |
|---|---|---|
| stock_anomalies.py | Yahoo Finance daily history | anomalous trading days; distributional drift vs. another ticker |
| journal_anomalies.py | journalctl -o json (systemd) | rare priorities, bursts, per-unit content spikes; drift between two windows |
| polymarket_anomalies.py | Polymarket public APIs | information shocks (point/mv) and odds regime shifts (coll.cusum) |
| synergy_market.py | Yahoo Finance + agent-calc | anomalyx finds; the exact-math kernel proves (tail probability, a t-test across the regime break, exact correlations) |
Each maps the handle in every finding back to a calendar date / timestamp, so the output reads as “this day, this column, this kind of deviation”.
Contracts composing with contracts
synergy_market.py is the clearest illustration of why a machine-readable
contract matters. anomalyx is descriptive and assumption-free — it reports
which days and regimes broke the pattern (point.modz, mv.mahalanobis,
coll.cusum), never assuming a distribution. Its findings then flow, as typed
JSON, straight into agent-calc —
a sibling contract-first CLI that does exact statistics: the return
distribution’s fat-tailed kurtosis, the worst day’s tail probability under a
fitted Gaussian (routinely one-in-millions — i.e. the naive risk model is what
is broken), a two-sample t-test across the detected regime break (a real shift
in the mean, or only the trajectory?), and exact correlations across a basket.
Two executables, two contracts, no prose and no float drift in between — which is the whole thesis: the executable is the contract.
See examples/README.md
for the exact commands and prerequisites.
Changelog
All notable changes to this project are documented here. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
1.1.2 - 2026-06-01
No library or contract changes — the tq1 envelope, exit codes, and
config_version are byte-for-byte identical to 1.1.1. This is a
documentation/examples release; it also wires the anomalyx crate’s README so
the crates.io landing page finally renders it.
Examples
- examples/synergy_market.py pairs anomalyx with agent-calc (a sibling contract-first exact math CLI) on the live market: anomalyx finds the anomalous days and the price regime shift, then agent-calc computes the exact return distribution, the worst day’s tail probability under a fitted Gaussian, a two-sample t-test across the detected CUSUM break, and exact Pearson r of each basket name to the market. Two typed-JSON contracts chained end to end.
- examples/polymarket_anomalies.py finds information shocks in a Polymarket prediction market: it pulls a market’s price history from Polymarket’s public APIs (read-only, no key), enriches it with the per-step probability change, and scans for sharp probability jumps (point/mv) and sustained regime shifts in the odds (coll.cusum), each mapped back to its UTC timestamp.
Documentation
- README’s Examples section now lists all four worked examples (stock, journal, polymarket, synergy) with the agent-calc synergy called out; the journal example is also listed in examples/README.md.
- New mdbook page “Worked examples” (docs/src/examples.md) framing the examples as consuming the tq1 contract.
- The anomalyx binary crate now sets readme = "../../README.md", so the crates.io page renders the project README (it had none before).
1.1.1 - 2026-06-01
Fixed
- Timestamp columns are now recognized as sequences and skipped by the value detectors. Role::Sequence required strict monotonicity, but real clock columns (journald’s __REALTIME_TIMESTAMP/__MONOTONIC_TIMESTAMP, a pcap timestamp) tie or regress just often enough to fail it, so they were treated as measurements, and coll.cusum flagged their “level shift” (time advancing) and point their jumps. A timestamp/ts name token now classifies a column as sequence, kept deliberately narrow so response_time-style measurements (which you do want outliers on) are unaffected. Surfaced by the new journald example. No config_version change: this is a classifier refinement, like 1.0.1’s procid.
Examples
- examples/journal_anomalies.py: find anomalies in the systemd journal, either point / structural / collective within one capture (e.g. CPU-usage spikes per unit) or distributional drift of _SYSTEMD_UNIT/PRIORITY between two windows (--baseline-since). Pipes journald JSON on stdin (so it sniffs as journal, not plain JSON) and maps findings back to timestamp / unit / message.
- examples/stock_anomalies.py: fetch a ticker’s daily history from Yahoo Finance and find its anomalous trading days (point / multivariate / collective), or its distributional drift against another ticker (--baseline). A worked example of consuming the tq1 envelope: it parses the dense JSON contract and maps each finding’s handle back to a calendar date.
- Both live outside the Cargo workspace (they shell out to the installed binary), so they don’t affect the build or gates.
1.1.0 - 2026-06-01
Changed
- Column roles now gate every value-distribution detector, not just point. ctx.seasonal, coll.cusum, dist.ks/dist.psi/dist.chi2, and mv.mahalanobis now skip identifier and sequence columns (and exclude them from the Mahalanobis feature space). A seasonal subseries, level-shift, drift test, or joint distance over arbitrary ids or a monotonic ramp is noise, not signal; this fixes, e.g., coll.cusum flagging a shift in a syslog procid. A shared Role::skips_value_detection() keeps the rule in one place. (struct.schema stays role-agnostic, since null-rate/schema-diff are meaningful for any column; cad.regularity only ever uses the explicit --cadence column.)
- This changes detector output when column_roles = true, so the config_version fingerprint is bumped (anomalyx-cfg/9). Envelope shape and PROTOCOL are unchanged; --no-column-roles restores the pre-roles behavior across all detectors.
Testing
- Scoped the parser-robustness harness’s magic-prefixed fuzz test to formats whose decode allocation anomalyx bounds (sqlite). The binary container decoders (parquet/arrow, avro, orc, evtx, pcap) delegate to crates that trust the file’s internal length fields and can attempt a large allocation on adversarial input. That is a property of binary-format parsing, now documented rather than asserted (it surfaced as an intermittent CI OOM). Those parsers are still fuzzed with arbitrary bytes (rejected at the magic check).
1.0.1 - 2026-06-01
Fixed
- Syslog: the PRI-less file format now parses. rsyslog/syslog-ng write /var/log/syslog without the <PRI> wire header (an ISO-8601 or BSD timestamp, then host and tag), but the parser’s sniff required a <PRI>, so a real /var/log/syslog was misdetected as ini and collapsed to a single garbage row. It is now recognized (timestamp + host + app) and parses one row per line; facility/severity are present only when a <PRI> is. Found by dogfooding the host’s real syslog (50k lines → ini/1 row, now → syslog/50k rows).
- Column roles: procid is recognized as an identifier. The syslog procid (process id) column was classed as a measurement, so PIDs were flagged as point outliers (~18.5k noise findings on a 50k-line syslog). procid joins the identifier name set, so it is skipped like other ids (→ 1 finding).
1.0.0 - 2026-06-01
First stable release. No code changes from 0.9.0 — this commits the contract.
Stable
- The tq1 contract is now stable: the protocol id anomalyx/tq1, the exit codes (0/1/2), the dense finding-row layout, the handle forms (column:/cell:/row:/range:/dist:), the required envelope fields, and the severity ladder. Breaking any of these requires a major bump and a PROTOCOL change; they will not change quietly under 1.x. See the contract’s Stability section.
- Continues to evolve additively under 1.x: new detectors, formats, optional flags, and optional envelope fields. Output-affecting config changes move the config_version fingerprint; determinism (same input + same config_version ⇒ byte-identical output) is absolute. The golden-envelope tests guard all of this against accidental drift.
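Stable handle forms are what let a consumer map a finding back to its evidence without scraping text. A minimal sketch of parsing them, assuming the `cell:<column>:<row>` layout seen in the quick-start (`cell:amount:8`); the payload layout of the other four forms is not spelled out here, so the sketch keeps them opaque. The `Handle` type and `parse_handle` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Handle {
    Cell { column: String, row: usize },
    /// column: / row: / range: / dist: handles, kept opaque in this sketch.
    Other { kind: String, rest: String },
}

fn parse_handle(s: &str) -> Option<Handle> {
    let (kind, rest) = s.split_once(':')?;
    match kind {
        "cell" => {
            // Split on the LAST ':' so a column name containing ':' survives.
            let (column, row) = rest.rsplit_once(':')?;
            Some(Handle::Cell {
                column: column.to_string(),
                row: row.parse().ok()?,
            })
        }
        "column" | "row" | "range" | "dist" => Some(Handle::Other {
            kind: kind.to_string(),
            rest: rest.to_string(),
        }),
        _ => None, // unknown prefix: refuse rather than guess
    }
}
```

For example, `parse_handle("cell:amount:8")` yields the column `amount` and row `8`, which a script can then look up in its own copy of the data.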
0.9.0 - 2026-06-01
Added
- scan/explain gain --set KEY=VALUE (repeatable): override any detector-config field by name (--set point_threshold=4.0, --set dist_alpha=0.01, --set column_roles=false, …). The settable keys and their defaults are exactly what describe’s config object lists. An unknown key, or a value that doesn’t fit the field’s type, is a hard error (exit 2). Overrides flow into config_version, so a tuned run stays reproducible and self-describing; tuning is never silent. (The common knobs keep their dedicated flags: --fdr, --cad-max-cv, --period, --cadence.)
- Implemented as a JSON round-trip over the serialized DetectConfig, so every field is settable with no per-field code; no envelope/PROTOCOL change.
Testing
- Golden-envelope snapshot tests (anomalyx/tests/golden.rs). Run the actual binary and pin its byte-exact stdout for schema, describe, and a representative scan envelope against committed goldens, so any accidental contract drift (renamed field, changed dense-row layout, shifted config_version, recalibrated confidence) fails CI as a visible diff. Regenerate intentional changes with BLESS=1.
- Million-row scale test (ax-validate): a 1,000,000-row scan must be byte-identical across runs and recover exactly the injected outliers. Determinism and correctness are verified at scale, not just on toy inputs.
0.8.0 - 2026-06-01
Changed
- Unified confidence calibration across all detectors. Confidence was computed three incompatible ways (1 − p for the distributional/multivariate detectors, a logistic-over-threshold for point/contextual/collective/PSI, and a linear map for cadence), so a 0.9 meant different things depending on which detector produced it, and severity (and --top/--min-severity) couldn’t rank across detectors. Now every detector routes through one shared function: confidence is a logistic of how far its statistic sits past its firing threshold, measured relatively so units cancel. At the threshold → 0.5, rising toward 1.0; a finding “2× past threshold” earns the same confidence on any detector. New ax_detect::calibrate module (from_exceedance/from_undercut); the duplicated shift_confidence/psi_confidence/robustz::confidence helpers are gone.
- This recalibrates every published confidence and severity. The config_version fingerprint is bumped (anomalyx-cfg/8) so the change is visible to agents. The envelope shape and PROTOCOL are unchanged.
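A sketch of what a unit-free logistic calibration can look like. The function names come from the entry above, but their signatures, the steepness constant, and the exact definition of "relative distance" are assumptions of this sketch; thresholds are taken to be positive.

```rust
/// Assumed steepness of the logistic; the real constant is not given here.
const STEEPNESS: f64 = 4.0;

fn logistic(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Confidence for a statistic that fires when it EXCEEDS `threshold`.
/// Dividing by the threshold makes the distance unit-free.
fn from_exceedance(statistic: f64, threshold: f64) -> f64 {
    logistic(STEEPNESS * (statistic - threshold) / threshold)
}

/// Confidence for a statistic that fires when it falls BELOW `threshold`
/// (for example a p-value against alpha).
fn from_undercut(statistic: f64, threshold: f64) -> f64 {
    logistic(STEEPNESS * (threshold - statistic) / threshold)
}
```

Because only the ratio to the threshold enters the logistic, a score of 7.0 against a cutoff of 3.5 and a score of 0.5 against a cutoff of 0.25 land on the same confidence, which is what makes ranking across detectors meaningful.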
Testing
- Parser robustness harness (ax-normalize/tests/robustness.rs). Property tests assert that no parser panics, hangs, or over-allocates on arbitrary, magic-prefixed-garbage, or truncated byte streams (fed both through auto-detection and straight to every registered parser) and that normalization is deterministic over fuzz inputs. Untrusted-input hardening: a malformed file must fail cleanly, never crash.
0.7.0 - 2026-06-01
Added
- Column roles. Every scanned column is classified into a role (measurement/identifier/categorical/sequence/constant) and the full map ships in the envelope’s new roles array. The point detector skips identifier and sequence columns (a “large process-id” or a counter’s endpoint is not an anomaly), attacking noise at the detection layer. On a real 20k journald capture this cuts point findings from ~12,500 to ~240 while leaving genuine measurements (e.g. a parquet’s heavily-skewed DAYS_LOST) untouched.
- --no-column-roles disables role-based skipping (roles are still reported). The setting is part of the config_version fingerprint (cr=).
Design
- Identifiers are recognized by name (*_id, uid, gid, pid, tid, session, uuid, …), the only reliable signal, since a process-id column is statistically indistinguishable from a discrete measurement. Cardinality is deliberately not used to call a numeric column categorical (a near-constant column with a few outliers has low cardinality yet is exactly what point detection should catch). Heuristic, but never silent: every role is in the envelope and the skipping is one flag away from off.
- New ax_core::roles module (Role, ColumnRole, Column::role); roles added to the envelope and schema. Additive; PROTOCOL unchanged.
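Name-based recognition can be sketched as a token match on the column name. This is an illustration only: the token lists are a small subset drawn from this changelog (including the later procid and timestamp/ts additions), the matching rule is assumed, and the value-based roles (categorical, constant) are left out.

```rust
#[derive(Debug, PartialEq)]
enum Role { Measurement, Identifier, Sequence }

fn is_in(list: &[&str], token: &str) -> bool {
    list.iter().any(|name| *name == token)
}

/// Classify a column from its name alone (hypothetical rule):
/// any `timestamp`/`ts` token means sequence; a trailing identifier token
/// (which also covers `*_id`) means identifier; everything else is measured.
fn role_from_name(name: &str) -> Role {
    const ID_NAMES: [&str; 8] =
        ["id", "uid", "gid", "pid", "tid", "session", "uuid", "procid"];
    const SEQ_NAMES: [&str; 2] = ["timestamp", "ts"];

    let lower = name.to_ascii_lowercase();
    let tokens: Vec<&str> = lower
        .split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|t| !t.is_empty())
        .collect();

    if tokens.iter().any(|t| is_in(&SEQ_NAMES, t)) {
        return Role::Sequence;
    }
    if tokens.last().map_or(false, |t| is_in(&ID_NAMES, t)) {
        return Role::Identifier;
    }
    Role::Measurement
}
```

Matching whole tokens rather than substrings is what keeps a name like `response_time` a measurement while `__REALTIME_TIMESTAMP` becomes a sequence.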
0.6.0 - 2026-06-01
Added
- scan gains output scoping: --top N and --min-severity S. --top N emits only the N most severe findings; --min-severity S emits only findings at or above S (info/low/medium/high/critical). This is the volume complement to --fdr: on a large corpus it shrinks the envelope dramatically (a real 127k-row parquet: ~3 MB → ~5.6 KB with --top 25) while keeping the full picture in summary.
- Honest truncation. summary (total, by_class, max_severity) and the exit code always describe everything detected, never the scoped view, so filtering can’t make anomalies look absent or flip exit 1→0. When findings are withheld, the envelope gains a scope block with the applied filter and detected/emitted/dropped counts; rows carries only the emitted subset. Absent when no scoping was applied (default output unchanged).
Changed
- The envelope summary.total now reports the number of findings detected (unchanged when no output scoping is applied, since detected == emitted then). rows.len() equals scope.emitted when scoping is active. The scope field and updated schema are additive; PROTOCOL is unchanged.
0.5.0 - 2026-05-31
Added
- scan/explain gain --fdr Q: false-discovery-rate control for the point detector via the Benjamini–Hochberg procedure, applied per column. When set, each cell’s modified z-score is converted to a two-sided p-value and the fixed point_threshold is replaced by a multiplicity-aware cutoff that bounds the expected proportion of false flags at Q (e.g. --fdr 0.05). Opt-in: omitted, the detector behaves exactly as before. The level is part of the config_version fingerprint (pfdr=), so it is a versioned, reproducible choice.
- New ax_detect::fdr module: two_sided_p (normal-tail p-value via erfc) and benjamini_hochberg (deterministic step-up cutoff), each property/exact tested and mutation-gated.
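The step-up cutoff itself is short. A sketch of the Benjamini–Hochberg procedure over one column's p-values follows; the function name matches the module entry above, but the signature and return type are assumptions, and the p-value conversion (two_sided_p) is left out because the standard library has no erfc.

```rust
/// Benjamini-Hochberg step-up: returns the largest p-value that may still
/// be rejected at FDR level `q`, or None when nothing is rejected.
/// Every cell whose p-value is <= the returned cutoff gets flagged.
fn benjamini_hochberg(p_values: &[f64], q: f64) -> Option<f64> {
    let mut sorted = p_values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let m = sorted.len() as f64;

    let mut cutoff = None;
    for (i, &p) in sorted.iter().enumerate() {
        // Ranks are 1-based: compare p_(k) with (k / m) * q.
        if p <= (i as f64 + 1.0) / m * q {
            cutoff = Some(p); // keep scanning: a later rank may still pass
        }
    }
    cutoff
}
```

The "step-up" part matters: with p-values 0.02, 0.03, 0.04 at `q = 0.05`, the smallest one fails its own rank test (0.02 > 0.0167), yet all three are rejected because the largest passes (0.04 ≤ 0.05).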
Notes
- FDR is a correctness control, not a volume knob. It replaces an arbitrary fixed cutoff with a principled error-rate guarantee and adapts to how many cells were tested (a noise column stops contributing chance flags; the same outlier can be significant in a small column yet not a large one). On genuinely heavy-tailed data it may flag more cells than the old fixed threshold: those cells really are significant at the chosen Q; the fixed cutoff was simply stringent in an uncalibrated way. To cap output volume, combine with column scoping (--columns/--exclude) and the planned severity / top-N output scoping.
- The p-value uses the consistent-σ standardized deviation (x − center)/scale (≈ N(0, 1) under the null), not robustz’s display-scaled modified z-score.
0.4.1 - 2026-05-31
Fixed
- SQLite: WAL-mode databases now read. The parser loads a database from its main-file byte image via SQLite’s read-only deserialize. A database in WAL journal mode carries read-version 2 in its file header (byte 19), and SQLite refuses to open such an image read-only without the -wal companion (which never travels in a byte stream), failing with unable to open database file (SQLITE_CANTOPEN). Since the main image of a checkpointed WAL database is a complete, valid database, the parser now reinterprets it as legacy (read-version 1) on a private copy and reads its checkpointed state. This unblocks the common case: most production .db files (browsers, peewee, and countless apps) default to WAL. Found by dogfooding real on-disk databases.
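The header rewrite described above touches a single byte. A sketch, assuming the standard SQLite file layout (a 100-byte header that begins with the 16-byte magic string, with the read version at offset 19); `as_legacy_image` is a hypothetical name, and handing the copy to SQLite's deserialize is out of scope here.

```rust
const SQLITE_MAGIC: &[u8; 16] = b"SQLite format 3\0";
const HEADER_LEN: usize = 100;
const READ_VERSION_OFFSET: usize = 19; // 1 = legacy rollback journal, 2 = WAL

/// Return a private copy of a database image with a WAL read-version
/// rewritten to legacy. The caller's bytes are never modified.
/// Returns None if the bytes do not look like a SQLite database.
fn as_legacy_image(image: &[u8]) -> Option<Vec<u8>> {
    if image.len() < HEADER_LEN || !image.starts_with(SQLITE_MAGIC) {
        return None;
    }
    let mut copy = image.to_vec();
    if copy[READ_VERSION_OFFSET] == 2 {
        copy[READ_VERSION_OFFSET] = 1;
    }
    Some(copy)
}
```

Only the checkpointed state is visible through such a copy; anything still sitting in an unshipped -wal file is, by construction, not part of the byte stream.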
0.4.0 - 2026-05-31
Added
- scan/explain gain --cad-max-cv F: the maximum inter-arrival coefficient of variation below which cad.regularity flags a column as metronomic (automated) timing. Defaults to 0.05 (unchanged behavior). Raise it to catch jittered beacons: a C2 channel with ~10% timing jitter (CV ≈ 0.10) slips past the default but is caught at --cad-max-cv 0.15.
- The threshold is part of the config_version fingerprint (cdcv=), so overriding it is a visible, versioned change in the envelope, not a silent knob. Same input + same config_version still yields byte-identical output.
Notes
- Validated against a deterministic jitter sweep: at the default 0.05 the detector fires up to CV ≈ 0.0494 and goes quiet at ≈ 0.0504 (it uses the sample/Bessel-corrected standard deviation); raising the threshold shifts that boundary exactly as expected.
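The statistic behind this threshold is the coefficient of variation of the gaps between consecutive events. A sketch using the sample (Bessel-corrected) standard deviation mentioned above; the function names, the strict `<` comparison, and the handling of degenerate inputs are assumptions of this sketch.

```rust
/// CV of the gaps between consecutive timestamps: sample std-dev / mean.
/// Returns None with fewer than two gaps or a non-positive mean gap.
fn inter_arrival_cv(timestamps: &[f64]) -> Option<f64> {
    let gaps: Vec<f64> = timestamps.windows(2).map(|w| w[1] - w[0]).collect();
    if gaps.len() < 2 {
        return None;
    }
    let n = gaps.len() as f64;
    let mean = gaps.iter().sum::<f64>() / n;
    if mean <= 0.0 {
        return None;
    }
    // Bessel's correction: divide by n - 1, not n.
    let var = gaps.iter().map(|g| (g - mean).powi(2)).sum::<f64>() / (n - 1.0);
    Some(var.sqrt() / mean)
}

/// Timing is "metronomic" when the CV falls below `max_cv`.
fn is_metronomic(timestamps: &[f64], max_cv: f64) -> bool {
    inter_arrival_cv(timestamps).map_or(false, |cv| cv < max_cv)
}
```

A beacon whose gaps alternate between 54 s and 66 s has a CV of about 0.115: quiet at the default 0.05, flagged at 0.15.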
0.3.0 - 2026-05-31
Column scoping — focus detection on the columns that matter in a wide corpus, deterministically and without guessing.
Added
- scan/explain gain --columns C,.. (analyze only these columns) and --exclude C,.. (analyze every column except these). The two are mutually exclusive. Projection is applied before detection and to the baseline as well, so drift comparison stays consistent.
- Column scoping is explicit, never heuristic. anomalyx will not guess which columns are “interesting”: a silent auto-skip would itself be a guess, and would wrongly drop exactly the near-unique numeric measurements the marquee detectors rely on (packet durationNanos, span durations, latencies). You name the scope; the result stays deterministic and reproducible.
- An unknown column name in --columns/--exclude on the primary corpus is a hard error (exit 2), so a typo can never silently scope a scan down to nothing and read as “clean”. The baseline is projected leniently (it is a different corpus and need not carry every scoped column).
Notes
- This directly tames wide, identifier-heavy corpora. On a real 20k-entry journalctl -o json capture, scan emits ~10k mostly-noise point findings across journald’s many ID/counter/timestamp fields; scan --exclude of those fields (or --columns of the meaningful ones) collapses that to a couple hundred focused findings without touching detector configuration.
- New RecordSet::select/RecordSet::without projection primitives in ax-core. No envelope or config_version change: column scope is an input-side projection, so the determinism contract is unchanged.
0.2.2 - 2026-05-31
Fixed
- A plain-text stream that merely starts with [ or { (e.g. an Apache error_log) was grabbed by the JSON parser’s cheap content sniff and then failed with a misleading failed to parse json input. Now a parse failure under a weak (TEXT/FALLBACK) content guess is reported honestly as UnknownFormat: “I don’t recognize this” rather than “your JSON is broken”. A format identified confidently (by file extension, or a MAGIC/STRONG signature) still surfaces a genuine malformed-file parse error as before.
0.2.1 - 2026-05-31
Fixed
- describe advertised only the original six input_formats (csv/tsv/ndjson/json/parquet/arrow), a stale literal that never tracked the 26 parsers added since. It now derives the list from the live parser registry, so it reflects exactly what the build reads (all 32 with default features; fewer under --no-default-features). A guard test asserts describe’s formats equal the registry, so it can’t drift again.
Added
- anomalyx --version (-V/version) prints the crate version.
0.2.0 - 2026-05-31
Format explosion — anomalyx now normalizes ~30 formats spanning logs, security telemetry, network captures, observability streams, spreadsheets, and data-lake files, all behind the same record-model boundary and detector taxonomy.
Added
- Logs & observability parsers: logfmt, web access logs (Combined/Common), syslog (RFC 3164/5424), systemd journal (journalctl -o json), Prometheus/OpenMetrics, and OpenTelemetry (OTLP/JSON traces).
- Security telemetry parsers: CEF/LEEF, Linux auditd, EVTX (Windows Event Log), Suricata/Zeek EVE JSON, osquery results, and AWS CloudTrail.
- Network parsers: PCAP/PCAPNG (beaconing/C2 via cadence), NetFlow / IPFIX (nfdump CSV), AWS VPC Flow Logs, and DNS query logs (DGA/exfil via point on query-name entropy/length).
- Structured-data parsers: YAML, TOML/INI, and XML (Nessus/OpenVAS/SOAP).
- Columnar, data-lake & database parsers: Avro, ORC, Excel/ODS (xlsx/xls/xlsb), and SQLite, joining the existing Parquet/Arrow.
- Several parsers compute detection features (DNS name entropy/length, flow duration, span durations, normalized epoch timestamps) and rename source fields to a canonical schema.
- Binary/heavyweight parsers sit behind default-on feature flags (evtx, pcap, xlsx, sqlite, datalake, polars), so --no-default-features is a lean text-only normalizer.
Notes
- 32 parser plugins total; each ships its own property/exact tests and passes the workspace-wide 0-surviving-mutant gate.
0.1.0 - 2026-05-30
Initial release — a contract-first anomaly-detection CLI over arbitrary corpora.
Added
- Contract surface (anomalyx): the four discoverable verbs describe, schema, scan, explain; a dense, versioned tq1 JSON envelope with a dictionary-pinned string table and stable evidence handles; committed exit codes (0 clean / 1 anomalies / 2 error); honest absence for detectors that cannot run.
- Normalization (ax-normalize): CSV, TSV, NDJSON and JSON via a lean deterministic reader; Parquet and Arrow IPC via the Polars backbone (behind the default-on polars feature). Every format is lowered to one engine-independent RecordSet, so detectors never see a Polars type.
- Detectors (ax-detect), nine across the full seven-class taxonomy:
  - point.modz: Iglewicz–Hoaglin modified z-score (robust MAD).
  - dist.ks: two-sample Kolmogorov–Smirnov drift.
  - dist.psi: Population Stability Index over baseline-quantile bins.
  - dist.chi2: chi-square over category frequencies (surfaces new categories).
  - struct.schema: mixed-type and high-null-rate columns; added / dropped / type-changed columns against a baseline.
  - mv.mahalanobis: multivariate Mahalanobis distance (own deterministic Cholesky solve; chi-square p-value).
  - ctx.seasonal: contextual seasonal-subseries modified z-score (--period).
  - coll.cusum: collective CUSUM level-shift detection.
  - cad.regularity: metronomic-cadence (inter-arrival CV) detection (--cadence).
- Modes: single-corpus scan; --baseline B for distributional drift and schema diff; --period N for seasonal/contextual; --cadence COL for timing.
- Determinism: order-independent (Neumaier-compensated) reductions, no RNG or wall-clock in the measurement path, and a config-version fingerprint. Same input + same fingerprint yields byte-identical output.
- Validation (ax-validate): the math core is checked against the NIST Statistical Reference Datasets (certified to 15 digits), plus stress tests for ground-truth anomaly recovery and reproducibility at scale.
- Quality gates: property-based tests (proptest) and a cargo-mutants 0-surviving-mutant gate across the workspace; GitHub Actions CI runs the same gates on every push.
- Dual-licensed under MIT OR Apache-2.0.