Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Monitoring

GasHammer exposes Prometheus metrics from both the hive and edge processes. This page describes how to set up monitoring with Prometheus and Grafana.

Prometheus Configuration

Add GasHammer targets to your prometheus.yml:

scrape_configs:
  - job_name: 'gashammer-hive'
    static_configs:
      - targets: ['hive:9091']

  - job_name: 'gashammer-edge'
    static_configs:
      - targets: ['edge-1:9091', 'edge-2:9091', 'edge-3:9091']

With Kubernetes and the ServiceMonitor from the Helm chart, Prometheus Operator discovers targets automatically.

Key Metrics to Watch

During a Run

MetricAlert ThresholdMeaning
gashammer_tx_submitted_total ratedrops to 0Edge stopped generating
gashammer_tx_failed_total rate>5% of submittedHigh failure rate
gashammer_tx_latency_seconds p99>10sSevere latency degradation
gashammer_events_dropped_totalany increaseTelemetry backpressure
gashammer_edges_activedrops below expectedEdge disconnection
gashammer_oracle_violations_totalany increaseCorrectness issue detected

System Health

MetricAlert ThresholdMeaning
gashammer_feed_messages_total ratedrops to 0Feed connection lost
Edge heartbeat age>30sEdge may be stale
Hive memory usage>80% of limitRisk of OOM

Grafana Dashboard

Import the GasHammer Grafana dashboard from docs/grafana/dashboard.json or use these example panels:

Transaction Throughput

rate(gashammer_tx_submitted_total[1m])
rate(gashammer_tx_confirmed_total[1m])

Latency Percentiles

histogram_quantile(0.50, rate(gashammer_tx_latency_seconds_bucket[1m]))
histogram_quantile(0.90, rate(gashammer_tx_latency_seconds_bucket[1m]))
histogram_quantile(0.99, rate(gashammer_tx_latency_seconds_bucket[1m]))

Gas Rate

rate(gashammer_gas_submitted_total[1m])

Error Rate

rate(gashammer_tx_failed_total[1m]) / rate(gashammer_tx_submitted_total[1m])

Alerting

Recommended Prometheus alerting rules:

groups:
  - name: gashammer
    rules:
      - alert: GasHammerEdgeDown
        expr: gashammer_edges_active < 1
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "No GasHammer edges connected"

      - alert: GasHammerHighErrorRate
        expr: rate(gashammer_tx_failed_total[5m]) / rate(gashammer_tx_submitted_total[5m]) > 0.05
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "GasHammer error rate exceeds 5%"

      - alert: GasHammerOracleViolation
        expr: increase(gashammer_oracle_violations_total[1m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "Correctness oracle detected a violation"

Structured Logging

Both hive and edge emit structured JSON logs via tracing. Configure your log aggregation system to parse these fields:

FieldDescription
timestampISO 8601 timestamp
levelLog level (TRACE, DEBUG, INFO, WARN, ERROR)
targetRust module path
spanCurrent tracing span
run_idAssociated run (if in a run context)
edge_idEdge identifier (edge logs only)
messageHuman-readable message