MVCC Snapshots

Snapshots are the read interface to the storage engine. They provide a frozen, immutable view of the graph at a point in time. Readers acquire a snapshot once and hold it for the duration of a read operation. Readers never block writers, and writers never modify the data a reader is holding.

What a Snapshot Is

pub struct Snapshot {
    /// Immutable reference to MemTable at the time of snapshot creation.
    /// The MemTable is never mutated after the snapshot is published.
    memtable: Arc<MemTable>,

    /// Ordered list of segment references (oldest to newest).
    segments: Arc<Vec<SegmentRef>>,
}

The Snapshot is wrapped in Arc<Snapshot> so multiple readers can share the same snapshot cheaply. Acquiring a snapshot is an Arc::clone: one atomic increment of the reference count, with no copy of the underlying data.
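The sharing described above can be sketched with the standard library alone. This is a minimal illustration, not the engine's code: MemTable and SegmentRef are empty stand-in types, the fields are made public for the demo, and share_snapshot is a hypothetical helper.

```rust
use std::sync::Arc;

// Stand-in types: the real MemTable and SegmentRef are not shown in this section.
#[derive(Debug, Default)]
pub struct MemTable;
#[derive(Debug)]
pub struct SegmentRef;

pub struct Snapshot {
    pub memtable: Arc<MemTable>,
    pub segments: Arc<Vec<SegmentRef>>,
}

/// Hypothetical helper: hands out `n` reader handles to one snapshot.
/// Returns (strong count while the readers are alive,
///          whether every handle points at the same allocation).
pub fn share_snapshot(n: usize) -> (usize, bool) {
    let published = Arc::new(Snapshot {
        memtable: Arc::new(MemTable),
        segments: Arc::new(Vec::new()),
    });

    // Each reader gets its own handle; no Snapshot data is copied.
    let readers: Vec<Arc<Snapshot>> = (0..n).map(|_| Arc::clone(&published)).collect();

    let count = Arc::strong_count(&published);
    let same_allocation = readers.iter().all(|r| Arc::ptr_eq(r, &published));
    (count, same_allocation)
}
```

With three readers the count is four (the published handle plus three clones), and all handles alias a single Snapshot.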

Snapshot Manager

The SnapshotManager maintains the current published snapshot using arc-swap for lock-free atomic updates:

use std::sync::Arc;

use arc_swap::ArcSwap;

pub struct SnapshotManager {
    current: ArcSwap<Snapshot>,
}

impl SnapshotManager {
    /// Called by readers. O(1), no lock.
    pub fn load(&self) -> Arc<Snapshot> {
        self.current.load_full()
    }

    /// Called by the writer after every commit. O(1) atomic swap.
    pub fn publish(&self, snapshot: Snapshot) {
        self.current.store(Arc::new(snapshot));
    }
}

arc-swap guarantees that:

A reader loading the snapshot always gets a consistent, complete view.
There is no moment when the snapshot pointer is null or partially updated.
No reader needs to hold a lock to read the snapshot.
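The first two guarantees can be observed with a small std-only sketch. To stay self-contained it uses RwLock<Arc<Snapshot>> as a stand-in for ArcSwap, so it does take a brief lock and does not demonstrate the lock-free property. LockedManager, the two Snapshot fields, and the invariant entity_count == version * 10 are all assumptions made for the demo.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Demo snapshot with an invariant a torn read would break:
// entity_count is always version * 10.
pub struct Snapshot {
    pub version: u64,
    pub entity_count: u64,
}

/// Stand-in for SnapshotManager, built on RwLock instead of ArcSwap.
pub struct LockedManager {
    current: RwLock<Arc<Snapshot>>,
}

impl LockedManager {
    pub fn new(initial: Snapshot) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    pub fn load(&self) -> Arc<Snapshot> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    pub fn publish(&self, snapshot: Snapshot) {
        // The snapshot is fully built before the pointer is replaced.
        let next = Arc::new(snapshot);
        *self.current.write().unwrap() = next;
    }
}

/// Four reader threads load repeatedly while the writer publishes.
/// Returns true if every loaded snapshot satisfied the invariant.
pub fn readers_see_whole_snapshots(publishes: u64) -> bool {
    let manager = LockedManager::new(Snapshot { version: 0, entity_count: 0 });
    thread::scope(|scope| {
        let mut readers = Vec::new();
        for _ in 0..4 {
            readers.push(scope.spawn(|| {
                (0..1_000).all(|_| {
                    let snap = manager.load();
                    snap.entity_count == snap.version * 10
                })
            }));
        }
        for version in 1..=publishes {
            manager.publish(Snapshot { version, entity_count: version * 10 });
        }
        readers.into_iter().all(|reader| reader.join().unwrap())
    })
}
```

Because a Snapshot is immutable once it is wrapped in an Arc, a reader either gets the previous snapshot or the next one, never a mix of the two.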

Snapshot Lifetime

// Acquiring: O(1), no lock, no allocation
let snap = engine.snapshot();   // = manager.load()

// Using: reads go through the snapshot, so they all see one consistent view
let entity = snap.get_entity(id);
let hosts = snap.entities_by_class(&EntityClass::new_unchecked("Host"));

// Releasing: one atomic decrement; the data is freed only if this was the last reference
drop(snap);

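When a snapshot handle is dropped, the reference count on the Arc<Snapshot> is decremented. If that was the last reference, the Snapshot itself is freed and releases its own references to the MemTable and the segment list. An old MemTable is freed once no remaining snapshot refers to it.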
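A small std-only sketch makes the release order concrete. It uses a Weak pointer as a probe to check whether the MemTable is still allocated. The types are simplified stand-ins (a MemTable holding a Vec<u64>, a Snapshot without segments), and memtable_lifetime is a hypothetical helper.

```rust
use std::sync::{Arc, Weak};

// Simplified stand-ins for the engine's types.
pub struct MemTable {
    pub rows: Vec<u64>,
}

pub struct Snapshot {
    pub memtable: Arc<MemTable>,
}

/// Returns whether the MemTable was still alive
/// (a) while one reader handle remained, and
/// (b) after the last handle was dropped.
pub fn memtable_lifetime() -> (bool, bool) {
    let memtable = Arc::new(MemTable { rows: vec![1, 2, 3] });
    // A Weak pointer observes the allocation without keeping it alive.
    let probe: Weak<MemTable> = Arc::downgrade(&memtable);

    // The snapshot takes over the only strong reference to the MemTable.
    let snap_a = Arc::new(Snapshot { memtable });
    let snap_b = Arc::clone(&snap_a);

    // One reader finishes; the other still pins the snapshot.
    drop(snap_a);
    let alive_with_reader = probe.upgrade().is_some();

    // The last reader finishes; Snapshot and MemTable are freed.
    drop(snap_b);
    let alive_after_last = probe.upgrade().is_some();

    (alive_with_reader, alive_after_last)
}
```

The practical consequence is that a long-lived snapshot handle pins the MemTable it was created from for as long as the handle exists.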

Snapshot Query Methods

// Signatures only; method bodies are omitted.
impl Snapshot {
    // Point lookups (O(1) MemTable, O(n) segment scan)
    pub fn get_entity(&self, id: EntityId) -> Option<&Entity>;
    pub fn get_relationship(&self, id: RelationshipId) -> Option<&Relationship>;

    // Index-accelerated scans
    pub fn entities_by_type(&self, t: &EntityType) -> Vec<&Entity>;
    pub fn entities_by_class(&self, c: &EntityClass) -> Vec<&Entity>;
    pub fn entities_by_source(&self, connector_id: &str) -> Vec<&Entity>;
    pub fn relationships_by_source(&self, connector_id: &str) -> Vec<&Relationship>;
    pub fn all_entities(&self) -> impl Iterator<Item = &Entity> + '_;

    // Adjacency
    pub fn outgoing(&self, id: EntityId) -> Vec<&Relationship>;
    pub fn incoming(&self, id: EntityId) -> Vec<&Relationship>;

    // Stats
    pub fn entity_count(&self) -> usize;
    pub fn relationship_count(&self) -> usize;
}

Consistency Guarantees

Read-your-writes: Within the same StorageEngine instance, a snapshot acquired after a write() call returns always sees the written data.

Snapshot isolation: A snapshot acquired at time T will never see writes committed after T, even if those writes happen on the same thread.

No dirty reads: A snapshot only contains data from committed WriteBatches. Data that has been written to the WAL but not yet applied to the MemTable is not visible.
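These three guarantees can be illustrated with a toy copy-on-write engine. This is a sketch under stated assumptions, not the real write path: ToyEngine is hypothetical, the MemTable is a plain BTreeMap, there is no WAL, a single writer is assumed, and RwLock stands in for the lock-free pointer swap.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

// Toy MemTable: entity id -> name.
type MemTable = BTreeMap<u64, String>;

/// Hypothetical single-writer engine that publishes a new table per commit.
pub struct ToyEngine {
    current: RwLock<Arc<MemTable>>,
}

impl ToyEngine {
    pub fn new() -> Self {
        Self { current: RwLock::new(Arc::new(MemTable::new())) }
    }

    /// Returns a handle to the table that is current right now.
    pub fn snapshot(&self) -> Arc<MemTable> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Commits a batch: copy the current table, apply the batch, publish.
    /// Until the final assignment, `next` is private to the writer,
    /// so readers cannot observe a half-applied batch.
    pub fn write(&self, batch: &[(u64, &str)]) {
        let mut next: MemTable = (*self.snapshot()).clone();
        for (id, name) in batch {
            next.insert(*id, (*name).to_string());
        }
        *self.current.write().unwrap() = Arc::new(next);
    }
}

/// Returns (snapshot isolation held, read-your-writes held).
pub fn check_guarantees() -> (bool, bool) {
    let engine = ToyEngine::new();
    engine.write(&[(1, "host-a")]);

    let before = engine.snapshot(); // taken at time T
    engine.write(&[(2, "host-b")]); // committed after T, same thread
    let after = engine.snapshot();

    let isolated = before.len() == 1 && before.get(&2).is_none();
    let read_your_writes = after.get(&1).is_some() && after.get(&2).is_some();
    (isolated, read_your_writes)
}
```

Copying the whole table on every commit is only reasonable in a toy. It is used here because it keeps the visibility rule obvious: nothing becomes readable until the new table is published.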

Using Snapshots in async Code

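Snapshot holds only Arc references, so it is Send + Sync as long as the MemTable and SegmentRef types are. However, GraphReader<'snap> borrows the snapshot and cannot cross await points. The recommended pattern is to do all snapshot work in a synchronous block and carry only owned data across the await: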

async fn my_handler(engine: Arc<Mutex<StorageEngine>>) -> Vec<Entity> {
    // Synchronous block: acquire lock, snapshot, compute, release
    let results = {
        let engine = engine.lock().unwrap();
        let snap = engine.snapshot();
        // All computation happens here, with no await
        snap.entities_by_class(&EntityClass::new_unchecked("Host"))
            .into_iter()
            .filter(|e| !e._deleted)
            .cloned()
            .collect::<Vec<_>>()
        // snap dropped, lock released
    };

    // Now you can await freely with the owned Vec<Entity>
    process_results(results).await
}

Alternatively, clone the Arc<Snapshot> and pass it to a spawn_blocking task for CPU-intensive graph operations that would otherwise block the async runtime.
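The shape of that alternative can be sketched without an async runtime. In the version below, std::thread::spawn stands in for spawn_blocking; both require a Send + 'static closure, which is why the task takes its own Arc<Snapshot> rather than a borrow. Entity, Snapshot, count_class, and count_hosts_off_thread are simplified, hypothetical stand-ins.

```rust
use std::sync::Arc;
use std::thread;

// Simplified stand-ins for the engine's types.
pub struct Entity {
    pub id: u64,
    pub class: String,
}

pub struct Snapshot {
    pub entities: Vec<Entity>,
}

/// CPU-bound work that owns its snapshot handle,
/// so it borrows nothing from the caller's stack.
pub fn count_class(snap: Arc<Snapshot>, class: &'static str) -> usize {
    snap.entities.iter().filter(|e| e.class == class).count()
}

/// Runs the scan on another thread and waits for the result.
/// In async code the join would be an await on the blocking task instead.
pub fn count_hosts_off_thread(snap: &Arc<Snapshot>) -> usize {
    // Clone the handle, not the data: one atomic increment.
    let owned = Arc::clone(snap);
    thread::spawn(move || count_class(owned, "Host"))
        .join()
        .expect("worker thread panicked")
}
```

The cloned handle is dropped when the task finishes, so the snapshot is pinned only for the duration of the scan.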
