State Recording & Replay Infrastructure

The sim’s snapshottable design (D010) enables a StateRecorder/Replayer pattern for asynchronous background recording — inspired by Valve’s Source Engine StateRecorder/StateReplayer pattern (see research/valve-github-analysis.md § 2.2). The game loop records orders and periodic state snapshots to a background writer; the replay system replays them through the same Simulation::apply_tick() path.

StateRecorder (Recording Side)

#![allow(unused)]
fn main() {
/// Orchestrates background recording of game state to `.icrep` files.
/// Tick-order frames and keyframe snapshot blobs are sent as separate
/// messages to a `BackgroundReplayWriter` (see `network-model-trait.md`
/// § Background Replay Writer). The game thread produces per-tick
/// `ReplayTickFrame`s (cheap — just orders + hash) and periodic
/// keyframe blobs (more expensive — see cost model below).
///
/// Lives in ic-game (I/O concern, not sim concern — Invariant #1).
pub struct StateRecorder {
    /// Background writer that drains tick frames and keyframe blobs,
    /// performs LZ4 compression, and appends to the `.icrep` file.
    /// Crash-safe: Fossilize append-safe pattern (see D010).
    writer: BackgroundReplayWriter,
    /// Keyframe cadence: a keyframe blob is produced every this many ticks
    /// (default: 300 — ~20 seconds at the Slower default of ~15 tps).
    keyframe_interval: u64,
    /// Full snapshot cadence: every Nth keyframe is a full `SimSnapshot`
    /// instead of a `DeltaSnapshot` (default: N=10, i.e., every 3000 ticks).
    full_keyframe_every_n: u64,
    /// Baseline for delta keyframes — the last full SimCoreSnapshot.
    last_full_snapshot: SimCoreSnapshot,
    /// Last-known campaign state for delta comparison (None if no campaign active).
    last_campaign_state: Option<CampaignState>,
    /// Last-known script state for delta comparison (None before first keyframe).
    last_script_state: Option<ScriptState>,
}
}

Per-tick recording (every tick): ic-game constructs a ReplayTickFrame { tick, state_hash, orders } and calls writer.record_tick(frame). Cost: negligible (the frame is a small struct; the background writer handles serialization and I/O).

Keyframe recording (every keyframe_interval ticks): ic-game produces the keyframe on the game thread in two steps — (1) Simulation::delta_snapshot(&self.last_full_snapshot) extracts a SimCoreDelta, (2) ic-game compares current campaign/script state against the recorder’s last_campaign_state / last_script_state baselines and composes the full DeltaSnapshot (or SimSnapshot at full-keyframe cadence), including campaign/script state only if changed since the respective baselines. The composed snapshot is serialized to Vec<u8> and passed to writer.record_keyframe(tick, is_full, blob). LZ4 compression and file I/O happen asynchronously on the background writer thread. After recording, the recorder updates its baselines: on full keyframes, all three baselines reset; on delta keyframes, only last_campaign_state and last_script_state are updated if they were included.

Game-thread cost model: Keyframe production costs ~1–1.5 ms per delta keyframe (every 300 ticks / ~20 seconds at Slower default) and ~2–3 ms per full keyframe (every 3000 ticks / ~200 seconds at Slower default) for a 500-unit game. This is well within the 67 ms tick budget (Slower default). The per-tick record_tick() call adds < 0.1 ms. LZ4 compression and disk I/O are fully async. See formats/save-replay-formats.md § Keyframe serialization threading for the full three-phase breakdown.

Per-Field Change Tracking (from Source Engine CNetworkVar)

To support delta snapshots efficiently, the sim uses per-field change tracking — inspired by Source Engine’s CNetworkVar system (see research/valve-github-analysis.md § 2.2). Each ECS component that participates in snapshotting is annotated with a #[track_changes] derive macro. The macro generates a companion bitfield that records which fields changed since the last snapshot. Delta serialization then skips unchanged fields entirely.

#![allow(unused)]
fn main() {
/// Derive macro that generates per-field change tracking for a component.
/// Each field gets a corresponding bit in a compact `ChangeMask` bitfield.
/// When a field is modified through its setter, the bit is set.
/// Delta serialization reads the mask to skip unchanged fields.
///
/// Components with SPROP_CHANGES_OFTEN (position, health, facing) are
/// checked first during delta computation — improves cache locality
/// by touching hot data before cold data. See `10-PERFORMANCE.md`.
#[derive(Component, Serialize, Deserialize, TrackChanges)]
pub struct Mobile {
    pub position: WorldPos,        // changes every tick during movement
    pub facing: FixedAngle,        // changes every tick during turning
    pub speed: FixedPoint,         // changes occasionally
    pub locomotor_type: Locomotor, // rarely changes
}

// Generated by #[derive(TrackChanges)]:
// impl Mobile {
//     pub fn set_position(&mut self, val: WorldPos) {
//         self.position = val;
//         self.change_mask |= 0b0001;
//     }
//     pub fn change_mask(&self) -> u8 { self.change_mask }
//     pub fn clear_changes(&mut self) { self.change_mask = 0; }
// }
}

SPROP_CHANGES_OFTEN priority (from Source Engine): Components that change frequently (position, health, ammunition) are tagged and processed first during delta encoding. This isn’t a correctness concern — it’s a cache locality optimization. By processing high-churn components first, the delta encoder touches frequently-modified memory regions while they’re still in L1/L2 cache. See 10-PERFORMANCE.md for performance impact analysis.

Crash-Time State Capture

When a desync is detected (hash mismatch via report_sync_hash()), the system automatically captures a full state snapshot before any error handling or recovery:

#![allow(unused)]
fn main() {
/// Called by ic-game when a sync hash mismatch is detected.
/// Captures full composite state immediately — before the sim advances
/// further — so the exact divergence point is preserved for offline
/// analysis, including script-managed state that might be the root cause.
fn on_desync_detected(
    sim: &Simulation,
    script_engine: &ScriptEngine,   // ic-script handle (owned by ic-game)
    campaign: &CampaignState,
    tick: u64,
    local_hash: u64,
    remote_hash: u64,
) {
    // 1. Immediate sim core snapshot (SimCoreSnapshot — sim-internal state)
    let core = sim.snapshot();
    // 2. Collect external state so the dump includes script/campaign data.
    //    This mirrors the full SimSnapshot composition path used by saves
    //    and replay keyframes — if divergence is rooted in script state,
    //    the dump captures it.
    let script_state = script_engine.snapshot_all();
    let full = SimSnapshot {
        core,
        campaign_state: Some(campaign.clone()),
        script_state: Some(script_state),
    };
    // 3. Write to crash dump file (same Fossilize append-safe pattern)
    write_crash_dump(tick, local_hash, remote_hash, &full);
    // 4. If Merkle tree is available, capture the tree for
    //    logarithmic desync localization (see 03-NETCODE.md)
    if let Some(tree) = sim.merkle_tree() {
        write_merkle_dump(tick, &tree);
    }
    // 5. Continue with normal desync handling (reconnect, notify user, etc.)
}
}

This ensures desync debugging always has a snapshot at the exact point of divergence — not N ticks later when the developer gets around to analyzing it. The pattern comes from Valve’s Fossilize (crash-safe state capture, see research/valve-github-analysis.md § 3.1) and OpenTTD’s periodic desync snapshot naming convention (desync_{seed}_{tick}.snap).

ML Training Data Extraction

The same keyframe snapshots and order streams that power replay playback also serve as the foundation for AI training data extraction. The StateRecorder’s output (.icrep files with tick-order frames + periodic keyframe snapshots) can be processed offline into structured training datasets:

Keyframe snapshots → game state observations. Each SimCoreSnapshot or DeltaSnapshot provides the full game state at a point in time. The extraction pipeline applies fog-of-war filtering (same FogFilteredView as live gameplay) to produce per-player observations.
Tick-order frames → action labels. The ReplayTickFrame { tick, state_hash, orders } recorded every tick provides the ground truth for “what the player did at tick T.”
Analysis events → temporal context. The optional analysis event stream (HAS_EVENTS flag) provides unit lifecycle, economy snapshots, and camera tracking between keyframes — enriching training samples with recent event history.
Keyframe seeking → arbitrary tick reconstruction. The keyframe index enables binary-search seeking to any point in the match, then forward simulation to the exact target tick. This means training data can be extracted at any stride without re-simulating from tick 0.

Training extraction operates read-only on .icrep files — no sim modification, no recording changes. The extraction pipeline is a consumer of the same replay format that powers spectator replay, desync debugging, and save games.

See research/ml-training-pipeline-design.md for the complete training pair schema, Parquet export format, headless self-play generation, and the ic training generate / ic training export CLI specification.

Keyboard shortcuts

Iron Curtain — Design Documentation