# Benchmarks
Ripple’s performance claims are backed by a benchmark suite that runs as part of the build process: no claim without measurement. This page covers the benchmark results, how to run them, the pre-commit gate, and the regression policy.
## Benchmark Results
| ID | Benchmark | Target | Measured | Status |
|---|---|---|---|---|
| B-01 | Incremental stabilization (1K symbols, single change) | <= 10 us | ~250 ns | PASS |
| B-01 | Incremental stabilization (10K symbols, single change) | <= 3 us | ~250 ns | PASS |
| B-02 | Trade sexp roundtrip | >= 500K/sec | ~12M/sec | PASS |
| B-02 | Trade bin_prot roundtrip | >= 500K/sec | ~12M/sec | PASS |
| B-03 | Delta diff (1 field changed on Trade) | measure | ~200 ns | PASS |
| B-05 | Replay 100K events (2K symbols) | extrapolate <= 30s for 6M | ~2.1s | PASS |
| B-05 | Replay 100K events (10K symbols) | extrapolate <= 30s for 6M | varies | CONDITIONAL |
| B-06 | Schema compatibility check | <= 10 us | ~130 ns | PASS |
| B-06 | Schema fingerprint | measure | ~100 ns | PASS |
## Reading the Results
- Target: the performance requirement from the architecture design.
- Measured: actual measurement on the test hardware.
- PASS: measured value meets or exceeds the target.
- CONDITIONAL: meets target under some conditions but not all. Requires mitigation (e.g., limiting symbols per worker to 2,000).
- FAIL: does not meet target. Blocks any RFC that depends on this benchmark.
## How to Run
### Full Benchmark Suite
```sh
# Using make
make bench

# Direct invocation
dune exec bench/run_benchmarks.exe
```
This runs all benchmarks via Jane Street’s core_bench framework, which performs multiple iterations with warm-up to produce statistically meaningful results.
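A minimal runner has roughly this shape. This is a sketch, not the actual contents of bench/run_benchmarks.ml; it assumes core_bench's `Bench.make_command` entry point and the `Command_unix` module from the core_unix package:

```ocaml
open Core
open Core_bench

(* Each benchmark is registered as a Bench.Test.t; core_bench handles
   warm-up, iteration counts, and the statistics reported in the table. *)
let () =
  Command_unix.run
    (Bench.make_command
       [ Bench.Test.create ~name:"B-02: Trade bin_prot roundtrip" (fun () ->
             (* benchmark body elided *)
             ())
       ])
```

Wrapping the suite in a `Command.t` is what makes the `-name`, `-quota`, and `-save` flags below available for free.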
### Individual Benchmarks
core_bench supports filtering by benchmark name:
```sh
# Run only B-01 benchmarks
dune exec bench/run_benchmarks.exe -- -name 'B-01'

# Run only B-06 benchmarks
dune exec bench/run_benchmarks.exe -- -name 'B-06'

# Show detailed statistics
dune exec bench/run_benchmarks.exe -- -quota 10 -ci-absolute
```
### Benchmark Configuration
| Flag | Default | Description |
|---|---|---|
| `-quota` | 10 | Seconds per benchmark (more = more stable results) |
| `-ci-absolute` | off | Show absolute confidence intervals |
| `-name` | all | Filter by benchmark name regex |
| `-save` | off | Save results to a file for comparison |
## Benchmark Descriptions
### B-01: Incremental Stabilization Throughput
Measures the time to stabilize a VWAP graph after changing a single leaf. This is the most important benchmark: it validates that incremental computation costs O(R), where R is the number of nodes that must recompute, not O(N) in the total graph size.
```ocaml
(* Setup: build graph with N symbol leaves + map + incr_fold *)
let _, leaves, _ = Test_graph.build_vwap_graph ~n_symbols:1000 in
let leaf = leaves.(500) in

(* Benchmark: change one leaf, stabilize *)
Bench.Test.create ~name:"B-01: stabilize 1K symbols" (fun () ->
  Graph.set_leaf leaf (next_value ());
  ignore (Graph.stabilize leaf.graph))
```
At 1K symbols, stabilization touches 3 nodes (leaf + map + incr_fold), taking ~250 ns regardless of graph size. This confirms O(R) behavior.
### B-02: Serialization Throughput
Measures sexp and bin_prot roundtrip speed for the Trade type. Both must exceed 500K roundtrips/sec.
```ocaml
(* bin_prot roundtrip: write the trade into a preallocated buffer, read it back *)
Bench.Test.create ~name:"B-02: Trade bin_prot roundtrip" (fun () ->
  let _pos = Trade.bin_write_t buf ~pos:0 trade in
  let _trade = Trade.bin_read_t buf ~pos_ref:(ref 0) in
  ())
```
bin_prot achieves ~12M roundtrips/sec, 24x above the 500K/sec target. This confirms that serialization is not a bottleneck.
### B-03: Delta Diff
Measures the cost of computing a delta between two Trade values differing in one field. Validates that field-level diffing is practical at high throughput.
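The benchmark body is not reproduced here. As a minimal sketch of what field-level diffing means, assuming a hypothetical `trade` record and `delta` type (the real Trade type and delta representation live in the Ripple codebase):

```ocaml
(* Hypothetical Trade record and per-field delta, for illustration only. *)
type trade = { symbol : string; price : float; qty : int }

type delta =
  | Price_changed of float
  | Qty_changed of int

(* Compare two trades field by field, emitting one delta per changed field.
   No allocation happens for unchanged fields, which is what keeps the
   single-field case around a few hundred nanoseconds. *)
let diff a b =
  let ds = [] in
  let ds = if a.price <> b.price then Price_changed b.price :: ds else ds in
  let ds = if a.qty <> b.qty then Qty_changed b.qty :: ds else ds in
  ds

let () =
  let t1 = { symbol = "AAPL"; price = 101.0; qty = 10 } in
  let t2 = { t1 with price = 101.5 } in
  assert (diff t1 t2 = [ Price_changed 101.5 ])
```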
### B-05: Replay Recovery
Simulates crash recovery: replay 100K events through a stabilize loop. Extrapolates to 6M events to check the 30-second recovery target.
At 2K symbols (the mitigated limit), 100K events complete in ~2.1 seconds. Extrapolating to 6M: 2.1s * 60 = ~126s. This exceeds the 30-second target for 6M events, but with the 10-second checkpoint interval, worst-case replay is ~1M events = ~21 seconds, which passes.
At 10K symbols, performance degrades due to larger graph traversal. This is why the architecture mandates <= 2,000 symbols per worker.
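The extrapolation above can be restated as a few lines of arithmetic. The values are copied from the table; the ~100K events/s ingest rate implied by the checkpoint argument is an assumption:

```ocaml
(* Replay extrapolation from the measured B-05 numbers. *)
let measured_events = 100_000
let measured_seconds = 2.1

(* Full cold replay of 6M events: 2.1 * 60 = 126 s, which misses the
   30 s target on its own. *)
let full_replay_seconds =
  measured_seconds *. float_of_int 6_000_000 /. float_of_int measured_events

(* With a 10 s checkpoint interval at ~100K events/s ingest (assumed),
   worst-case replay is ~1M events: 2.1 * 10 = 21 s, inside the budget. *)
let worst_case_seconds =
  measured_seconds *. float_of_int 1_000_000 /. float_of_int measured_events

let () =
  assert (full_replay_seconds > 30.0);
  assert (worst_case_seconds < 30.0)
```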
### B-06: Schema Validation
Measures backward compatibility checking between two schema versions. Must complete in <= 10 us.
At ~130 ns, schema validation is 77x faster than the target.
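To make "backward compatibility checking" concrete, here is an illustrative sketch only: a schema as a list of named, typed fields, compatibility as "every old field is still present with the same type", and a structural fingerprint. The real check and fingerprint in Ripple may differ; all names here are hypothetical:

```ocaml
(* Hypothetical schema representation, for illustration. *)
type field = { name : string; ty : string }
type schema = field list

(* New schema is backward compatible if it preserves every old field. *)
let backward_compatible ~old_s ~new_s =
  List.for_all
    (fun f -> List.exists (fun g -> g.name = f.name && g.ty = f.ty) new_s)
    old_s

(* Cheap structural fingerprint: equal fingerprints imply (with high
   probability) an identical schema, so the full check can be skipped. *)
let fingerprint (s : schema) =
  Hashtbl.hash (List.map (fun f -> (f.name, f.ty)) s)

let () =
  let v1 = [ { name = "price"; ty = "float" } ] in
  let v2 = { name = "qty"; ty = "int" } :: v1 in
  assert (backward_compatible ~old_s:v1 ~new_s:v2);
  (* Dropping a field breaks backward compatibility. *)
  assert (not (backward_compatible ~old_s:v2 ~new_s:v1));
  assert (fingerprint v1 <> fingerprint v2)
```

A list scan like this is linear in the field count, which is consistent with a sub-microsecond check for small record types.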
## Pre-Commit Hook
The pre-commit hook runs `dune runtest`, which executes all inline expect tests. This includes tests that verify benchmark-relevant properties (e.g., selective recomputation, cutoff behavior, idempotent stabilize).
```sh
# Install the hook
make install-hooks

# Hook runs automatically on git commit:
#   1. dune runtest (all expect tests)
#   2. If any test fails, commit is rejected
```
The hook does not run the full benchmark suite (that would be too slow for every commit). Full benchmarks are run in CI and before releases.
## Regression Gate
A benchmark moving from PASS to REGRESSED (a confirmed failure against its recorded baseline) blocks any RFC that depends on that benchmark:
| Status Transition | Action |
|---|---|
| UNTESTED -> PASS | First measurement, record baseline |
| UNTESTED -> FAIL | Design issue, must be addressed before RFC proceeds |
| PASS -> PASS | Normal, no action |
| PASS -> REGRESSED | Investigate regression. If confirmed, blocks dependent RFCs |
| FAIL -> MITIGATED | Architectural change addresses root cause (e.g., B-05 mitigation: limit to 2K symbols) |
| CONDITIONAL -> PASS | Additional validation confirms full compliance |
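The blocking rule in the table can be encoded directly. This is a sketch with hypothetical names, not code from the Ripple repository:

```ocaml
(* Benchmark statuses from the gate table above. *)
type status = Untested | Pass | Fail | Regressed | Mitigated | Conditional

(* Does this status transition block dependent RFCs? *)
let blocks_rfc ~before ~after =
  match before, after with
  | Untested, Fail -> true (* design issue: fix before the RFC proceeds *)
  | Pass, Regressed -> true (* confirmed regression blocks dependent RFCs *)
  | _, _ -> false

let () =
  assert (blocks_rfc ~before:Pass ~after:Regressed);
  assert (not (blocks_rfc ~before:Pass ~after:Pass));
  assert (not (blocks_rfc ~before:Conditional ~after:Pass))
```

Keeping the gate as a total function over a closed variant type means adding a new status forces every transition to be reconsidered at compile time.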
### Detecting Regressions
Compare current results against the baseline:
```sh
# Save baseline
dune exec bench/run_benchmarks.exe -- -save baseline.bench

# After changes, compare
dune exec bench/run_benchmarks.exe -- -save current.bench

# Manual comparison of results
```
Regressions of more than 2x from the baseline require investigation. Common causes:
- Accidental allocation on the hot path
- Hash table lookup replacing array index
- Additional indirection in recompute function
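The first cause above is easy to introduce by accident in OCaml, because small conveniences allocate. A minimal illustration (hypothetical helper names, not Ripple code):

```ocaml
(* Allocates a fresh tuple on every call: harmless in cold code, a
   measurable regression on a per-event hot path. *)
let minmax_alloc a b = if a < b then (a, b) else (b, a)

(* Allocation-free alternative: write results into preallocated refs
   owned by the caller. *)
let minmax_into ~lo ~hi a b =
  if a < b then begin lo := a; hi := b end
  else begin lo := b; hi := a end

let () =
  let lo = ref 0 and hi = ref 0 in
  minmax_into ~lo ~hi 7 3;
  assert ((!lo, !hi) = (3, 7));
  assert (minmax_alloc 7 3 = (3, 7))
```

Running the suite with `-save` before and after such a change is what surfaces the 2x threshold described above.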