Effect Injection
Ripple requires deterministic replay for crash recovery: given the same input sequence from a checkpoint, the system must produce exactly the same output. This means all non-determinism must be injectable through a single interface. This page explains the EFFECT module type, its two implementations, and why this design is mandatory.
The Problem
Consider a node that timestamps its output:
(* BAD: direct call to Time_ns.now *)
let f input =
  { data = compute input; timestamp = Time_ns.now () }
This function is non-deterministic. During replay after a crash, Time_ns.now() returns a different value than during the original computation. The replayed state diverges from the original, violating the determinism invariant.
The same problem applies to:
- Random number generation (used for sampling, jitter)
- Network I/O (reading from sockets)
- File I/O (reading configuration)
- System calls (getpid, hostname)
The EFFECT Module Type
All non-determinism flows through a single module signature:
module type S = sig
  val now : unit -> Time_ns.t
  val random_int : int -> int
end
Every component that needs time or randomness takes now or random_int as a parameter rather than calling Time_ns.now or Random.int directly.
The graph engine is parameterized by now:
let create ~now =
  { nodes = Array.create ~len:1024 (Obj.magic ())
  ; node_count = 0
  ; dirty_heap = Dirty_heap.create ~capacity:1024
  ; ...
  ; now (* injected, not called directly *)
  }
The tracing system is parameterized by random_int:
let create_root ~random_int =
  { trace_id = gen_trace_id ~random_int
  ; span_id = gen_span_id ~random_int
  ; ...
  }
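Applied to the timestamping node from The Problem, injection might look like the minimal sketch below. To stay runnable without Core, time is modeled as a plain int of nanoseconds instead of Time_ns.t. The output record, the compute function, and fixed_clock are hypothetical stand-ins, not Ripple code.

```ocaml
(* Sketch only: int nanoseconds stand in for Time_ns.t. *)
type output = { data : int; timestamp : int }

(* Hypothetical pure computation. *)
let compute input = input * 2

(* GOOD: the clock is a parameter, so the caller decides what "now" means. *)
let f ~now input = { data = compute input; timestamp = now () }

(* A fixed clock, as a test or replay harness might supply. *)
let fixed_clock () = 1_000_000

(* Two runs with the same input and the same injected clock. *)
let first = f ~now:fixed_clock 21
let replayed = f ~now:fixed_clock 21
```

Because f no longer reads the wall clock itself, the two runs produce identical records.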
Two Implementations
Live: Production
module Live : S = struct
  let now = Time_ns.now
  let random_int = Random.int
end
Used by the worker binary and CLI. Provides real wall-clock time and pseudorandom numbers.
Test: Deterministic Simulation
module Test : sig
  include S
  val advance_time : Time_ns.Span.t -> unit
  val set_time : Time_ns.t -> unit
  val seed_random : int -> unit
end = struct
  let current_time = ref Time_ns.epoch
  let rng = ref (Random.State.make [| 42 |])
  let now () = !current_time
  let random_int bound = Random.State.int !rng bound
  let advance_time span =
    current_time := Time_ns.add !current_time span
  let set_time t = current_time := t
  let seed_random seed = rng := Random.State.make [| seed |]
end
Used by all tests, benchmarks, and the deterministic simulation harness. Time only advances when explicitly stepped. Random sequences are reproducible from a seed.
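The two properties of the Test implementation (time frozen between explicit steps, random sequences reproducible from a seed) can be checked with a stdlib-only sketch. This is an approximation: Test_sketch mirrors the module above, but uses int nanoseconds in place of Time_ns.t and Time_ns.Span.t, and the draw helper is hypothetical.

```ocaml
(* Sketch only: int nanoseconds stand in for Time_ns.t / Time_ns.Span.t. *)
module Test_sketch = struct
  let current_time = ref 0
  let rng = ref (Random.State.make [| 42 |])
  let now () = !current_time
  let random_int bound = Random.State.int !rng bound
  let advance_time span_ns = current_time := !current_time + span_ns
  let set_time t = current_time := t
  let seed_random seed = rng := Random.State.make [| seed |]
end

(* Time stays frozen between explicit steps. *)
let () = Test_sketch.set_time 0
let t0 = Test_sketch.now ()
let t1 = Test_sketch.now ()
let () = Test_sketch.advance_time 1_000_000
let t2 = Test_sketch.now ()

(* Reseed, then draw n values below 100. *)
let draw ~seed n =
  Test_sketch.seed_random seed;
  List.init n (fun _ -> Test_sketch.random_int 100)

(* The same seed yields the same sequence. *)
let run_a = draw ~seed:7 5
let run_b = draw ~seed:7 5
```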
Usage Patterns
In Tests
let%expect_test "stabilization timing is deterministic" =
  Test.set_time Time_ns.epoch;
  let g = Graph.create ~now:Test.now in
  (* ... build graph ... *)
  Test.advance_time (Time_ns.Span.of_ms 1.0);
  let _ = Graph.stabilize g in
  (* last_stabilization_ns is exactly 1_000_000 ns *)
  ()
In the Worker Binary
let run ~worker_id ~partition_id =
  let now = Live.now in (* real wall-clock time, via the Live implementation *)
  let worker = Worker.create ~worker_id ~partition_id ~now in
  ...
In Deterministic Simulation
let simulate ~seed ~ticks =
  Test.seed_random seed;
  Test.set_time Time_ns.epoch;
  let g = Graph.create ~now:Test.now in
  for _ = 1 to ticks do
    Test.advance_time (Time_ns.Span.of_us 100.0);
    (* inject events, stabilize, check invariants *)
  done
Why This Design Is Mandatory
Deterministic Replay
The checkpoint-and-replay recovery protocol depends on determinism:
1. Load checkpoint (leaf values + input offsets)
2. Rebuild graph structure
3. Restore leaf values from checkpoint
4. Replay input log from checkpoint's offset
5. Result: same graph state as before crash
Step 5 only holds if every computation produces the same result given the same inputs. If any node calls Time_ns.now() directly, the replayed state diverges at that node and all its descendants.
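The recovery steps above can be sketched as a fold over the input log. This is a toy model, not Ripple's recovery code: the checkpoint record, the integer state, the step function, and the drop helper are all hypothetical, chosen only to show why a pure step function makes step 5 hold.

```ocaml
(* Toy model: the state is a running total and the log is a list of ints. *)
type checkpoint = { value : int; offset : int }

(* Pure step: the result depends only on the state and the input. *)
let step state input = state + input

(* Drop the first n elements of a list. *)
let rec drop n xs =
  match xs with
  | _ :: rest when n > 0 -> drop (n - 1) rest
  | _ -> xs

(* Steps 1-4: start from the checkpointed value, replay the log suffix. *)
let recover checkpoint log =
  List.fold_left step checkpoint.value (drop checkpoint.offset log)

let log = [ 3; 1; 4; 1; 5; 9 ]

(* State just before the simulated crash: every input applied. *)
let before_crash = List.fold_left step 0 log

(* Checkpoint taken after the first two inputs (3 + 1 = 4). *)
let checkpoint = { value = 4; offset = 2 }

(* Step 5: replay reaches the pre-crash state. *)
let recovered = recover checkpoint log
```

If step consulted a real clock or an unseeded random source, the original run and the replay would see different values and this equality would no longer be guaranteed.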
Simulation Testing
The deterministic simulation harness (inspired by TigerBeetle’s approach) runs millions of simulated operations with injected failures. Each simulation is parameterized by a seed. When a bug is found, the seed reproduces the exact failure sequence.
This is impossible if the system has any direct sources of non-determinism.
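As a rough illustration of seed-based reproduction, the sketch below runs a toy simulation twice with the same seed and compares the traces. The outcome type, the one-in-four failure rate, and this standalone simulate function are assumptions made for the example; they are not the Ripple harness.

```ocaml
(* Toy simulation: each tick either applies an event or injects a failure.
   All randomness comes from one generator created from the seed. *)
type outcome = Applied of int | Dropped

let simulate ~seed ~ticks =
  let rng = Random.State.make [| seed |] in
  let random_int bound = Random.State.int rng bound in
  List.init ticks (fun _ ->
    (* Assumed failure rate for illustration: about one tick in four. *)
    if random_int 4 = 0 then Dropped else Applied (random_int 1000))

(* Same seed, same trace: a failing run can be reproduced from its seed. *)
let trace_a = simulate ~seed:1234 ~ticks:50
let trace_b = simulate ~seed:1234 ~ticks:50
```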
The Rule
From the codebase:
“No module in Ripple may call Time_ns.now(), Random.int, or perform direct I/O. All such operations go through the EFFECT interface. Violations break deterministic replay and are considered bugs.”
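This rule is enforced by code review, not by the type system: OCaml's types do not track effects, so nothing prevents a module from calling Time_ns.now. A possible future direction is to route these operations through OCaml 5 effect handlers. Note that OCaml 5 effects are themselves untyped, so this would centralize interception at runtime rather than provide a type-level guarantee.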
Extending the Interface
When adding new sources of non-determinism, extend the S signature:
module type S = sig
val now : unit -> Time_ns.t
val random_int : int -> int
(* Future additions: *)
(* val hostname : unit -> string *)
(* val getpid : unit -> int *)
end
Both Live and Test must be updated. The Test implementation must return deterministic values controllable by the test harness.
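A minimal sketch of such an extension, assuming hostname is the new operation. It is stdlib-only, so int nanoseconds stand in for Time_ns.t, and the Live side is omitted because it would need a real system call (for example Unix.gethostname from the unix library). The set_hostname control and the Banner functor are hypothetical names introduced for illustration.

```ocaml
(* Sketch only: int nanoseconds stand in for Time_ns.t. *)
module type S = sig
  val now : unit -> int
  val random_int : int -> int
  val hostname : unit -> string
end

module Test : sig
  include S
  val set_hostname : string -> unit (* hypothetical test control *)
end = struct
  let current_time = ref 0
  let rng = ref (Random.State.make [| 42 |])
  let current_hostname = ref "test-host"
  let now () = !current_time
  let random_int bound = Random.State.int !rng bound
  let hostname () = !current_hostname
  let set_hostname h = current_hostname := h
end

(* A component written against S cannot tell Live from Test. *)
module Banner (E : S) = struct
  let render () = Printf.sprintf "%s at t=%d" (E.hostname ()) (E.now ())
end

module B = Banner (Test)

(* The harness controls the value the component observes. *)
let () = Test.set_hostname "worker-7"
let banner = B.render ()
```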