MemTable

The MemTable is the in-memory write buffer. Every write is applied here immediately after the WAL fsync. Reads check the MemTable first, then fall back to segment files.

Structure

#![allow(unused)]
fn main() {
pub struct MemTable {
    // Primary storage
    entities:      BTreeMap<EntityId, Entity>,
    relationships: BTreeMap<RelationshipId, Relationship>,

    // Secondary indices (maintained in sync with primary)
    type_index:    HashMap<EntityType, Vec<EntityId>>,
    class_index:   HashMap<EntityClass, Vec<EntityId>>,
    source_index:  HashMap<CompactString, Vec<EntityId>>,   // connector_id → entities
    adjacency:     HashMap<EntityId, (Vec<RelationshipId>,  // outgoing
                                      Vec<RelationshipId>)>, // incoming
}
}

Operations

Upsert

#![allow(unused)]
fn main() {
pub fn upsert_entity(&mut self, entity: Entity) {
    let id = entity.id;
    let entity_type = entity._type.clone();
    let entity_class = entity._class.clone();
    let connector_id = entity.source.connector_id.clone();

    // Update primary store
    self.entities.insert(id, entity);

    // Update all secondary indices
    self.type_index.entry(entity_type).or_default().push(id);
    self.class_index.entry(entity_class).or_default().push(id);
    self.source_index.entry(connector_id).or_default().push(id);
    // adjacency is updated by upsert_relationship
}
}

Tombstone (Soft Delete)

When the sync protocol determines that an entity was removed from a source:

#![allow(unused)]
fn main() {
pub fn delete_entity(&mut self, id: EntityId) {
    if let Some(entity) = self.entities.get_mut(&id) {
        entity._deleted = true;  // soft delete — remains for snapshot visibility
    }
}
}

Soft-deleted entities are invisible to queries (INV-S08) but remain in memory until the next compaction cycle removes them.

Adjacency Index

The adjacency index enables O(1) neighbor lookups:

#![allow(unused)]
fn main() {
pub fn upsert_relationship(&mut self, rel: Relationship) {
    let rel_id = rel.id;
    let from_id = rel.from_id;
    let to_id = rel.to_id;

    self.relationships.insert(rel_id, rel);

    // Both endpoints track this relationship
    self.adjacency.entry(from_id).or_default().0.push(rel_id); // outgoing
    self.adjacency.entry(to_id).or_default().1.push(rel_id);   // incoming
}
}

This is the index that makes graph traversal fast. Without it, every hop would require a full scan of all relationships.

Flush to Segment

When memtable.approx_bytes() > config.memtable_flush_size (default: 64MB), StorageEngine::maybe_flush() runs:

#![allow(unused)]
fn main() {
fn maybe_flush(&mut self) -> Result<(), StoreError> {
    if self.memtable.approx_bytes() <= self.config.memtable_flush_size {
        return Ok(());
    }

    // Write current MemTable contents to a new .pxs segment
    let segment_path = self.next_segment_path();
    Segment::write(&segment_path, &self.memtable)?;

    // Drain the MemTable: clears entity/rel data, preserves adjacency index
    let new_segment = SegmentRef::open(segment_path)?;
    let drained = self.memtable.drain_to_flush();
    self.segments.push(new_segment);

    // Publish new snapshot pointing to empty MemTable + new segment
    self.publish_snapshot();
    Ok(())
}
}

The drain_to_flush() operation is carefully designed:

  • Entity and relationship data moves to the segment file
  • The adjacency index is preserved (rebuilt from segments during recovery)
  • Secondary indices are cleared (rebuilt from segment scans as needed)

Memory Accounting

#![allow(unused)]
fn main() {
pub fn approx_bytes(&self) -> usize {
    // Rough estimate: sum of entity and relationship sizes
    self.entities.values().map(|e| std::mem::size_of_val(e)).sum::<usize>()
        + self.relationships.values().map(|r| std::mem::size_of_val(r)).sum::<usize>()
}
}

This is an approximation — it counts stack sizes of the structs but not heap-allocated strings. For memory budgeting, assume 2-4× the struct size per entity due to CompactString heap allocations for long strings.

Query Methods

The MemTable exposes index-accelerated query methods used by Snapshot:

#![allow(unused)]
fn main() {
// O(1) lookup
pub fn get_entity(&self, id: EntityId) -> Option<&Entity>;
pub fn get_relationship(&self, id: RelationshipId) -> Option<&Relationship>;

// Index-accelerated scans
pub fn entities_by_type(&self, t: &EntityType) -> Vec<&Entity>;
pub fn entities_by_class(&self, c: &EntityClass) -> Vec<&Entity>;
pub fn entities_by_source(&self, connector_id: &str) -> Vec<&Entity>;
pub fn all_entities(&self) -> impl Iterator<Item = &Entity>;

// Adjacency (O(1) for the lookup, O(degree) for iteration)
pub fn outgoing_relationships(&self, id: EntityId) -> Vec<&Relationship>;
pub fn incoming_relationships(&self, id: EntityId) -> Vec<&Relationship>;
}