System Overview
Ripple is a distributed incremental stream processing system. This page describes the overall topology, the components within each worker node, the cluster-level services, and how data flows through the system.
System Topology
```
                  ┌─────────────────────┐
                  │     Coordinator     │
                  │   (stateless, HA)   │
                  │                     │
                  │ - Hash ring         │
                  │ - Partition assign  │
                  │ - Heartbeat detect  │
                  │ - Checkpoint trigger│
                  └──────────┬──────────┘
                             │ gRPC :9200
         ┌───────────────────┼───────────────────┐
         │                   │                   │
┌────────v───────┐  ┌────────v───────┐  ┌────────v───────┐
│    Worker-0    │  │    Worker-1    │  │    Worker-2    │
│   partitions   │  │   partitions   │  │   partitions   │
│   [0,1,2,3]    │  │   [4,5,6,7]    │  │  [8,9,10,11]   │
│                │  │                │  │                │
│ ┌────────────┐ │  │      ...       │  │      ...       │
│ │ Incr Graph │ │  │                │  │                │
│ └────────────┘ │  │                │  │                │
└───┬────────┬───┘  └────────────────┘  └────────────────┘
    │        │
┌───v───┐ ┌──v────┐
│ Kafka │ │ MinIO │
│(input)│ │(ckpt) │
└───────┘ └───────┘
```
Ports per worker:

- :9100 — HTTP health (/health, /ready)
- :9101 — Async RPC (delta exchange)
- :9102 — Prometheus metrics (/metrics)
Per-Node Components
Each worker contains the following components:
Incremental Graph Engine
The core computation engine. Holds the node array, dirty heap, and stabilization loop. Each worker has exactly one graph instance processing its assigned partitions.
See: The Graph Engine
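The engine is documented in detail on its own page, but the shape of the stabilization loop can be illustrated with a small self-contained OCaml sketch (illustrative names and representations, not Ripple's actual implementation). Dirty nodes are recomputed in ascending height order so each node fires at most once per stabilization, and a changed value re-dirties its dependents; a sorted list stands in for the dirty heap.

```ocaml
(* Toy stabilization loop: not Ripple's engine, just the core idea. *)
module Graph = struct
  type node = {
    height : int;                  (* leaves are 0; a node sits above its deps *)
    deps : int list;               (* ids of the nodes this node reads *)
    f : int array -> int;          (* recompute from the current value array *)
  }

  type t = {
    nodes : node array;
    values : int array;
    mutable dirty : (int * int) list;  (* (height, id), kept sorted: toy heap *)
  }

  let mark_dirty g id =
    let entry = (g.nodes.(id).height, id) in
    if not (List.mem entry g.dirty) then
      g.dirty <- List.sort compare (entry :: g.dirty)

  (* Toy version scans all nodes for dependents; a real engine keeps
     reverse edges. *)
  let dirty_dependents g id =
    Array.iteri (fun j n -> if List.mem id n.deps then mark_dirty g j) g.nodes

  let set_leaf g id v =
    if g.values.(id) <> v then begin
      g.values.(id) <- v;
      dirty_dependents g id
    end

  let rec stabilize g =
    match g.dirty with
    | [] -> ()
    | (_, id) :: rest ->
      g.dirty <- rest;
      let v' = g.nodes.(id).f g.values in
      if v' <> g.values.(id) then begin
        g.values.(id) <- v';
        dirty_dependents g id      (* change propagates upward *)
      end;
      stabilize g
end
```

Because recomputation runs in height order, a diamond-shaped dependency only recomputes its join node once even when both of its inputs change in the same batch.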
Connector Layer
Sources and sinks that connect the graph to external systems:
| Connector | Role |
|---|---|
| Kafka_connector | Reads trade events from Kafka topics |
| File_source | Reads from CSV files or stdin (demo/testing) |
| File_sink | Writes CSV output to files or stdout |
Delta Transport
Handles cross-partition communication:
| Component | Role |
|---|---|
| Delta_buffer | Buffers incoming deltas, deduplicates by sequence number |
| Delta_rpc | Async RPC client/server for delta exchange between workers |
| Remote_node | Graph leaf that receives values from a remote worker |
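The deduplication rule for Delta_buffer can be sketched in a few lines of OCaml (hypothetical names; the record layout is illustrative): a delta is applied only if its sequence number exceeds the last one seen from that sender.

```ocaml
(* Toy sequence-number dedup for incoming deltas. *)
type delta = { sender : string; seq : int; payload : float }

(* Map from sender id to last applied sequence number. *)
let make_buffer () : (string, int) Hashtbl.t = Hashtbl.create 16

(* Returns [Some payload] for a fresh delta, [None] for a duplicate or an
   already-superseded sequence number. *)
let offer buf d =
  let last = Option.value (Hashtbl.find_opt buf d.sender) ~default:(-1) in
  if d.seq > last then begin
    Hashtbl.replace buf d.sender d.seq;
    Some d.payload
  end
  else None
```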
Checkpoint Manager
Manages periodic snapshots of graph state:
| Component | Role |
|---|---|
| Checkpoint | Serializable snapshot (leaf values + input offsets) |
| Store | Pluggable backend (In_memory, Local_disk, S3) |
Window Manager
Tracks event-time windows and watermarks:
| Component | Role |
|---|---|
| Window | Time interval with assigner (tumbling/sliding/session) |
| Watermark | Tracks completeness across sources (min of all) |
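Two of the rules in this table are easy to state directly (hypothetical helper names, not Ripple's API): a tumbling window assigns an event at time t to the window starting at t - t mod size, and the global watermark is the minimum of the per-source watermarks, each of which only moves forward.

```ocaml
(* Tumbling-window start for a non-negative timestamp [t] (e.g. epoch ms). *)
let window_start ~size t = t - (t mod size)

(* Per-source watermarks never regress: stale updates are ignored. *)
let advance watermarks ~source t =
  let cur = Option.value (Hashtbl.find_opt watermarks source) ~default:min_int in
  Hashtbl.replace watermarks source (max cur t)

(* Completeness across sources is the minimum over all of them. *)
let global_watermark watermarks =
  Hashtbl.fold (fun _ t acc -> min t acc) watermarks max_int
```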
Observability
| Component | Role |
|---|---|
| Metrics | Counters, gauges, histograms (Prometheus-compatible) |
| Trace | W3C distributed tracing with adaptive sampling |
| Introspect | Graph snapshot to sexp or DOT format |
| Alert | Rule-based alert evaluation and notification |
Cluster-Level Services
Coordinator
The coordinator is a stateless service that manages the cluster topology:
- Hash Ring: consistent hashing for partition-to-worker assignment
- Worker Registration: workers register on startup, receive partition assignments
- Heartbeat Detection: detects failed workers via heartbeat timeout (default 30s)
- Checkpoint Triggers: coordinates cluster-wide checkpoint at configurable intervals (default 10s)
- Rebalancing: reassigns partitions when workers join or leave
The coordinator runs as a Kubernetes Deployment with 2 replicas for HA. It is stateless: all persistent state lives in the checkpoint store (S3).
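The hash-ring bullet above has a useful stability property: when a worker leaves, only the partitions it owned move. A toy OCaml ring illustrates this (virtual-node count, hash choice, and names are all illustrative, not the coordinator's actual implementation):

```ocaml
(* Toy consistent-hash ring for partition-to-worker assignment. *)
let hash s = Hashtbl.hash s land 0xFFFF      (* 16-bit toy ring positions *)

(* Each worker contributes [vnodes] points on the ring. *)
let ring workers =
  let vnodes = 64 in
  List.concat_map
    (fun w -> List.init vnodes (fun i -> (hash (w ^ "#" ^ string_of_int i), w)))
    workers
  |> List.sort compare

(* A partition is owned by the first ring point clockwise from its hash,
   wrapping around to the smallest point. *)
let owner ring partition =
  let h = hash (string_of_int partition) in
  match List.find_opt (fun (p, _) -> p >= h) ring with
  | Some (_, w) -> w
  | None -> snd (List.hd ring)
```

Removing a worker deletes only that worker's points, so any partition whose successor point belonged to a surviving worker keeps its owner; this is what makes rebalancing cheap when workers join or leave.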
Kafka / Redpanda
The input event stream. Ripple reads from Kafka-compatible brokers (Redpanda recommended for development). Topics are partitioned, and Ripple’s partitioning aligns with Kafka partition IDs for locality.
MinIO / S3
Checkpoint storage. Checkpoints are written atomically (single-object PUT for S3, temp+rename for local disk) to ensure crash safety.
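The temp+rename path can be sketched in a few lines of OCaml (function name is illustrative): the snapshot is fully written and closed under a temporary name, then renamed into place, so a crash mid-write never leaves a partial checkpoint visible under the final path.

```ocaml
(* Toy local-disk checkpoint write using the temp+rename pattern. *)
let write_checkpoint ~path data =
  let tmp = path ^ ".tmp" in
  let oc = open_out_bin tmp in
  output_string oc data;
  close_out oc;                    (* flush and close before the rename *)
  Sys.rename tmp path              (* atomic replace on POSIX filesystems *)
```

A reader that opens `path` therefore sees either the previous checkpoint or the new one in full, mirroring the single-object PUT semantics used for S3.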
Data Flow
```
Kafka topic: "trades"
        │
        │  (partitioned by symbol hash)
        │
        v
┌─ Worker-0 ───────────────────────────────────────────┐
│                                                      │
│  1. Read batch from Kafka partition                  │
│  2. For each event:                                  │
│     a. Assign to window (tumbling 60s)               │
│     b. Update leaf node (set_leaf)                   │
│     c. Advance watermark                             │
│  3. Stabilize graph                                  │
│  4. Collect changed outputs                          │
│  5. Emit deltas to downstream workers (Async RPC)    │
│  6. Write output to Kafka topic: "vwap-output"       │
│                                                      │
│  Periodically:                                       │
│    - Checkpoint graph state to S3                    │
│    - Send heartbeat to coordinator                   │
│    - Export metrics to Prometheus                    │
└──────────────────────────────────────────────────────┘
```
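The value emitted in step 6 is a volume-weighted average price per window. The actual Vwap.compute is not shown in this document, but what such a fold computes is simply sum(price × qty) / sum(qty); a standalone OCaml stand-in:

```ocaml
(* Illustrative VWAP over one window's trades (not the real Vwap.compute). *)
type trade = { price : float; qty : float }

let vwap trades =
  let pv, v =
    List.fold_left
      (fun (pv, v) t -> (pv +. (t.price *. t.qty), v +. t.qty))
      (0., 0.) trades
  in
  if v = 0. then None else Some (pv /. v)   (* None: no volume in the window *)
```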
Cross-Partition Data Flow
When a worker’s output is consumed by another worker (e.g., a second-level aggregation), the delta transport handles the communication:
```
Worker-0 (partitions 0-3)          Worker-1 (partitions 4-7)
┌──────────────────────┐           ┌──────────────────────┐
│       [graph]        │           │       [graph]        │
│          |           │           │          |           │
│          v           │           │          v           │
│  [output: vwap_p0]   │─delta────>│  [remote: vwap_p0]   │
│                      │    RPC    │          |           │
└──────────────────────┘           │          v           │
                                   │ [fold: global_vwap]  │
                                   └──────────────────────┘
```
The remote node (add_remote) behaves identically to a local leaf from the graph’s perspective. The delta transport applies incoming deltas and sets the remote node’s value, which triggers incremental recomputation in the downstream graph.
Pipeline Topology
Pipelines are defined in OCaml code, not YAML:
```ocaml
let pipeline =
  Topology.create ~name:"vwap-60s"
  |> Topology.source ~name:"trades" ~topic:"raw-trades"
  |> Topology.partition ~name:"by-symbol" ~key_name:"symbol"
  |> Topology.window ~name:"1min"
       ~config:(Tumbling { size = Time_ns.Span.of_sec 60.0 })
  |> Topology.fold ~name:"vwap" ~f_name:"Vwap.compute"
  |> Topology.sink ~name:"output" ~topic:"vwap-results"
```
The topology builder validates the pipeline at construction time (source must be first, sink must be last, no duplicate names) and produces a Topology.t that the coordinator uses for partition assignment and deployment planning.
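Those construction-time checks can be sketched over a simplified step representation (the real Topology.t carries far more than names): the first step must be a source, the last a sink, and step names must be unique.

```ocaml
(* Toy model of the builder's validation rules, not the real Topology.t. *)
type step = Source of string | Sink of string | Stage of string

let step_name = function Source n | Sink n | Stage n -> n

let valid steps =
  match steps with
  | Source _ :: _ ->                       (* source must be first *)
    (match List.rev steps with
     | Sink _ :: _ ->                      (* sink must be last *)
       let names = List.map step_name steps in
       (* no duplicate names *)
       List.length (List.sort_uniq compare names) = List.length names
     | _ -> false)
  | _ -> false
```

Catching these mistakes at construction time, rather than at deployment, is the reason pipelines are plain OCaml values instead of YAML.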