Property-Based Testing Specifications (proptest)

Each property is a formal invariant verified across thousands of randomly generated inputs. Properties that fail produce a minimal counterexample for debugging.

| Property | Generator | Invariant Assertion | Shrink Target | Tier |
|---|---|---|---|---|
| Sim determinism | Random seed × random order sequence (up to 200 orders over 500 ticks) | Two runs with identical seed+orders produce identical `state_hash()` at every tick | Minimal divergent tick + minimal order sequence | T3 |
| Order validation purity | Random `PlayerOrder` × random `SimState` | `validate_order()` never mutates sim state (hash before == hash after) | Minimal order type that causes mutation | T3 |
| Order validation totality | Random `PlayerOrder` with arbitrary field values | `validate_order()` always returns `OrderValidity`; never panics, never hangs | Minimal panicking order | T3 |
| Snapshot round-trip identity | Random sim state after N random ticks | `restore(snapshot(state))` produces `state_hash()` identical to original | Minimal divergent component | T3 |
| Delta snapshot correctness | Random sim state + random mutations | `sim.apply_delta(&sim.delta_snapshot(&baseline))` on a clone restored from baseline produces `state_hash()` identical to current state | Minimal mutation set that breaks delta | T3 |
| Composite snapshot round-trip (GameRunner) | Random sim state + random `CampaignState` + random `ScriptState` after N ticks | `GameRunner::restore_full(SimSnapshot { core, campaign, script })` produces identical `state_hash()`, identical campaign graph, and script VMs return same values via `on_serialize()` | Minimal divergent composite field (campaign flag, Lua variable) | T3 |
| Composite delta round-trip (GameRunner) | Random sim state + random campaign/script mutations across tick ranges | `GameRunner::apply_full_delta(DeltaSnapshot { core, campaign, script })` on top of a restored full snapshot produces state identical to the original, verified across all three sub-states | Minimal composite delta that fails to reconstruct | T3 |
| Autosave composite fidelity | Random game state with active campaign + Lua scripts, autosave triggered | `.icsave` file loaded via `GameRunner::restore_full()` produces identical sim hash, campaign state, and script state as the game thread at the autosave tick | Minimal campaign/script state that diverges after save-load | T2 |
| Fixed-point arithmetic closure | Random `FixedPoint` × `FixedPoint` for add/sub/mul/div | Result stays within `i32` range; no silent overflow; division by zero returns error | Minimal overflow pair | T3 |
| Pathfinding completeness | Random map topology × random start/end where path exists | Pathfinder always returns a path if one exists (checked against BFS ground truth) | Minimal topology where pathfinder fails | T3 |
| Pathfinding determinism | Random map × random start/end × two runs | Identical path output for identical input | Minimal map where paths diverge | T3 |
| Workshop dependency resolution termination | Random dependency graphs (1–100 packages, 0–10 deps each) | Resolver terminates within bounded time; returns valid order or error; no infinite loop | Minimal graph that causes non-termination | T3 |
| Campaign DAG validity | Random mission graphs (1–50 missions, 1–5 outcomes each) | `CampaignGraph::new()` accepts iff acyclic, fully reachable, no dangling refs | Minimal invalid graph accepted or valid graph rejected | T3 |
| UnitTag generation safety | Random pool operations (alloc/free sequences, 10K ops) | No two live units ever share the same `UnitTag`; stale tags always resolve to `None` | Minimal sequence producing tag collision | T3 |
| Chat scope isolation | Random chat messages × random scope assignments | `ChatMessage<TeamScope>` is never delivered to non-team recipients | Minimal routing violation | T2 |
| BoundedVec overflow safety | Random push/pop sequences against `BoundedVec<T, N>` | Length never exceeds N; push beyond N returns `Err`; no panic | Minimal violating sequence | T1 |
| BoundedCvar range enforcement | Random `set()` calls with values across full T range | `get()` always returns value within [min, max]; no value escapes bounds | Minimal value that escapes bounds | T1 |
| Merkle tree consistency | Random component mutations × tree rebuild | Root hash changes iff at least one leaf changed; unchanged leaves produce same hash | Minimal mutation where root hash is wrong | T3 |
| Weather schedule determinism | Random weather configurations × two sim instances | Weather state identical at every tick across instances with same seed | Minimal divergent config | T2 |
| Anti-cheat NaN pipeline guard | Random `f64` sequences (incl. NaN, Inf, subnormal) fed to all anti-cheat scoring paths (EWMA, `behavioral_score`, `TrustFactors`, `PopulationBaseline`) | No output field is ever NaN or Inf; NaN inputs produce fail-closed sentinel values (1.0 for suspicion scores, population median for trust factors) | Minimal input that produces NaN in any output field | T3 |
| WASM timing oracle resistance | Random spatial query inputs × random fog configurations (0–100% fogged entities in query region) | `ic_query_units_in_range()` execution time does not vary beyond ±5% based on fogged entity count (measured over 1000 iterations per configuration; timer resolution ≥ microsecond) | Minimal fog configuration where timing variance exceeds threshold | T3 |
| Replay network isolation | Random replay file × random embedded YAML with external URLs | During `SelfContained` replay playback, zero network I/O syscalls are issued; all external asset references resolve to placeholder | Minimal replay content that triggers network access | T2 |
| Key rotation sequence monotonicity | Random concurrent rotation attempts × random timing | `rotation_sequence_number` is strictly monotonically increasing; no two rotations share a sequence number; cooldown-violating rotations are rejected except `Emergency` | Minimal concurrent rotation pair that violates monotonicity | T2 |
| TOFU connection policy correctness | Random key state (match/mismatch/first-connect/rotation-chain) × random match context (ranked/unranked/LAN) | Ranked rejects key mismatch without valid rotation chain; ranked first-connect requires seed list or manual trust; unranked TOFU-accepts with warning; LAN always warns; valid rotation chain updates cache | Minimal context where wrong connection policy is applied | T2 |
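
As an illustration of the "Anti-cheat NaN pipeline guard" property, the following is a minimal sketch of a fail-closed EWMA scoring path. The function name `ewma_suspicion`, the constant `FAIL_CLOSED_SUSPICION`, and the choice to treat Inf the same way as NaN are assumptions made for this example; the actual scoring code may be structured differently.

```rust
/// Sentinel returned when the input cannot be trusted (1.0 = maximally suspicious).
/// The value follows the table above; the constant name is hypothetical.
pub const FAIL_CLOSED_SUSPICION: f64 = 1.0;

/// Hypothetical EWMA suspicion score over per-event samples in [0, 1].
/// Any non-finite sample or smoothing factor yields the fail-closed sentinel,
/// so the result is never NaN or Inf.
pub fn ewma_suspicion(samples: &[f64], alpha: f64) -> f64 {
    if !alpha.is_finite() || !(0.0..=1.0).contains(&alpha) {
        return FAIL_CLOSED_SUSPICION;
    }
    let mut score = 0.0_f64;
    for &x in samples {
        if !x.is_finite() {
            // Assumption: Inf is handled like NaN (fail closed).
            return FAIL_CLOSED_SUSPICION;
        }
        // Clamp so a single out-of-range sample cannot push the score outside [0, 1].
        score = alpha * x.clamp(0.0, 1.0) + (1.0 - alpha) * score;
    }
    if score.is_finite() { score } else { FAIL_CLOSED_SUSPICION }
}
```

A property test over this function would feed arbitrary `f64` sequences and assert `result.is_finite()` for every case.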

proptest configuration: 256 cases per property in T1/T2 (PR gate speed), 10,000 cases in T3 (nightly thoroughness). Regression files are committed to the repository, so every discovered failure is replayed in T1 from then on.
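
To make the shape of such a property concrete, here is a dependency-free sketch of the "BoundedVec overflow safety" check. It is not the proptest harness itself: a small xorshift generator stands in for proptest's strategies, there is no shrinking, and the `BoundedVec` shown is a hypothetical stand-in for the real type. The case count is a parameter so the same loop could run 256 cases in a PR gate and 10,000 nightly.

```rust
/// Hypothetical fixed-capacity vector; the real `BoundedVec<T, N>` may differ.
pub struct BoundedVec<T, const N: usize> {
    items: Vec<T>,
}

impl<T, const N: usize> BoundedVec<T, N> {
    pub fn new() -> Self {
        Self { items: Vec::with_capacity(N) }
    }

    /// Hands the value back in `Err` instead of panicking when the vector is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.items.len() >= N {
            return Err(value);
        }
        self.items.push(value);
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        self.items.pop()
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}

/// Tiny xorshift64 generator standing in for proptest's input strategies.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        Self(seed.max(1)) // xorshift state must never be zero
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Runs `cases` random push/pop sequences against a capacity-8 vector.
/// Returns the first seed that violates the invariant, or `None` if all pass.
pub fn check_bounded_vec(cases: u64, ops_per_case: usize) -> Option<u64> {
    for seed in 1..=cases {
        let mut rng = XorShift::new(seed);
        let mut v: BoundedVec<u32, 8> = BoundedVec::new();
        for _ in 0..ops_per_case {
            let was_full = v.len() == 8;
            if rng.next_u64() % 3 != 0 {
                let pushed = v.push(rng.next_u64() as u32);
                // Invariant: push succeeds exactly when there was room.
                if pushed.is_ok() == was_full {
                    return Some(seed);
                }
            } else {
                v.pop();
            }
            // Invariant: length never exceeds the capacity.
            if v.len() > 8 {
                return Some(seed);
            }
        }
    }
    None
}
```

Returning the failing seed plays the role of a committed regression file: calling the check again with that seed replays the same operation sequence.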

API Misuse Test Matrix

Systematic tests derived from the API misuse analysis in architecture/api-misuse-defense.md. Each test verifies that a specific misuse vector is blocked by either the type system (compile-time) or runtime validation.

Compile-Time Defense Verification

These defenses do not require runtime tests. Some are enforced directly by the Rust type system (borrow checker, !Sync auto-trait); others rely on code review and monitoring to ensure invariants are not weakened by a refactor. The “Monitoring” column specifies how each defense is maintained — only defenses monitored by cargo check or clippy will produce automatic CI failures if removed.

| Defense | Mechanism | What Would Break It | Monitoring |
|---|---|---|---|
| S5: `ReconcilerToken` prevents unauthorized corrections | `_private: ()` field | Making field `pub` or adding `Default` derive | Code review checklist |
| S8: `Simulation` is `!Sync` | Contains Bevy `World` (`!Sync` via `UnsafeCell`) | Adding `unsafe impl Sync` or replacing `World` with a `Sync` container | clippy + code review |
| O6: `OrderBudget` unconstructible externally | `_private: ()` field | Making inner fields `pub` | Code review checklist |
| O7: `Verified<PlayerOrder>` restricted construction | `pub(crate)` on `new_verified()` | Changing to `pub` | Code review checklist |
| O7b: `StructurallyChecked<T>` restricted construction | `pub(crate)` on `new()` + `_private: ()` | Making `new()` `pub` or adding `Default` derive | Code review checklist |
| W1: `WasmTerminated` has no `execute()` | Typestate pattern | Adding `execute()` to terminated state | Code review + trait audit |
| W7: `FsReadCapability` unconstructible externally | `_private: ()` field | Making field `pub` | Code review checklist |
| P1: Workshop `extract()` requires `PkgVerifying` | Typestate consumes `self` | Adding `extract()` to `PkgDownloading` | Code review + trait audit |
| C1: `MissionLoading` has no `complete()` | Typestate pattern | Adding `complete()` to loading state | Code review + trait audit |
| B4: Read buffer immutability | `read()` returns `&T` | Returning `&mut T` from `read()` | Code review checklist |
| N7: `SyncHash` ≠ `StateHash` | Distinct newtypes, no `From` impl | Adding `From<SyncHash> for StateHash` | clippy + code review |
| M1: Chat scope branding | `ChatMessage<TeamScope>` ≠ `ChatMessage<AllScope>` | Adding `From<ChatMessage<TeamScope>> for ChatMessage<AllScope>` | Code review checklist |
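
The two mechanisms that recur in this table, the private `_private: ()` field and the typestate pattern, can be sketched as follows. The type names are borrowed from rows W7 and P1, but the fields, the `grant()` constructor, the length check used in place of real verification, and the idea of passing the capability to `extract()` are all assumptions made for this example.

```rust
mod workshop {
    /// The private unit field blocks struct-literal construction outside this module.
    pub struct FsReadCapability {
        _private: (),
    }

    impl FsReadCapability {
        /// Hypothetical constructor: only code inside this crate can mint a capability.
        pub(crate) fn grant() -> Self {
            Self { _private: () }
        }
    }

    /// Typestate: each stage of the package pipeline is its own type.
    pub struct PkgDownloading {
        pub bytes: Vec<u8>,
    }

    pub struct PkgVerifying {
        bytes: Vec<u8>, // private, so the only way to get one is `verify()`
    }

    pub struct PkgExtracted {
        pub byte_count: usize,
    }

    impl PkgDownloading {
        /// Consumes the downloading state, so it cannot be reused after verification.
        /// A length check stands in for real hash verification here.
        pub fn verify(self, expected_len: usize) -> Result<PkgVerifying, String> {
            if self.bytes.len() == expected_len {
                Ok(PkgVerifying { bytes: self.bytes })
            } else {
                Err(format!(
                    "length mismatch: expected {expected_len}, got {}",
                    self.bytes.len()
                ))
            }
        }
    }

    impl PkgVerifying {
        /// `extract()` exists only on the verified state; `PkgDownloading` has no such method.
        pub fn extract(self, _cap: &FsReadCapability) -> PkgExtracted {
            PkgExtracted { byte_count: self.bytes.len() }
        }
    }
}
```

With this layout, `workshop::PkgDownloading { bytes }.extract(..)` and `workshop::FsReadCapability { _private: () }` written outside the module both fail to compile, which is why the table lists no runtime test for these rows.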

Runtime Defense Test Specifications

Tests verifying runtime defenses against misuse vectors. Each test has a specific assertion, exact pass/fail criteria, and measurement metric.

| ID | Misuse Vector | Test Method | Exact Assertion | Measurement Metric | Tier |
|---|---|---|---|---|---|
| S1 | Future-tick orders | Call `apply_tick(tick=N+2)` when sim is at tick N | Debug: panics (`debug_assert`). Release: returns `Err(SimError::TickMismatch { expected: N, got: N+2 })` | Panic in debug build; `Err` variant + field values in release | T1 |
| S2 | Duplicate orders in one tick | Replay with same order injected twice in one `TickOrders` batch | Second copy rejected by in-sim order validation (e.g., duplicate build on same cell); `ValidatedOrder` consumed once | Second order has no effect; sim state identical to single-order run | T2 |
| S3 | Cross-game snapshot restore | `Simulation::restore()` with snapshot from different seed | Returns `Err(SimError::ConfigMismatch)` when `game_seed` or `map_hash` don’t match | `Err` variant returned, sim `state_hash()` unchanged | T2 |
| S4 | Corrupted save file | Flip random byte in serialized `.icsave` payload, load via GameRunner’s file-loading layer | File-loading layer detects `payload_hash` mismatch, returns `Err` before reaching `Simulation::restore()` | 100 random bit-flips, 100% detection rate at file-loading layer | T3 |
| S6 | Float field in sim crate | Attempt to add `f32`/`f64` field to any `ic-sim` struct | `clippy::disallowed_types` lint fails CI; post-deser range validation rejects out-of-bounds `FixedPoint` values | CI lint blocks compilation; fuzz: no panics from random bytes | T3 |
| S7 | Unknown player order | `inject_orders()` with non-existent `PlayerId(999)` | Order rejected with `OrderRejectionCategory::Ownership` (D012); specific variant is implementation-defined | Rejection fires; telemetry includes player ID | T1 |
| S9 | Out-of-bounds coordinates | Move order to `WorldPos { x: 999999, y: 999999, z: 0 }` | Order rejected with `OrderRejectionCategory::Placement` (D012); error includes position and map bounds | Rejection fires; position and bounds available in error | T1 |
| S10 | Divergent-baseline delta | `Simulation::apply_delta()` with delta whose `baseline_tick`/`baseline_hash` don’t match current state | Returns `Err(SimError::BaselineMismatch)`; sim state unchanged | `Err` variant returned, sim `state_hash()` unchanged | T2 |
| O1 | Stale UnitTag after death | Kill unit, send attack order targeting dead unit’s tag | Order rejected with `OrderRejectionCategory::Targeting` (D012); error includes stale tag and current generation | Generation mismatch detected; stale tag not resolved | T1 |
| O2 | Order rate limit | Send 201 orders in one tick (budget=200) | First 200 accepted, 201st returns `Err(BudgetExhausted)` | Exact count: accepted=200, rejected=1 | T2 |
| O3 | Timestamp manipulation | `sub_tick_time = 999999999` (far future) | Relay clamps to envelope max (e.g., 66667µs) | Clamped value ≤ `tick_window_us`; telemetry event fires | T2 |
| O8 | Oversized unit selection | Move order with 100 UnitTags (max=40) | Order rejected with `OrderRejectionCategory::Custom` (D012, game-module-defined selection cap); error includes count and max | Both count and max available in error | T1 |
| N2 | Handshake replay | Capture challenge response, replay on new connection | Connection terminated with `AuthError::NonceReused` | Connection drops within 100ms of replay | T2 |
| N6 | Half-open connection flood | Open 10,000 TCP connections, don’t complete handshake | All time out within configured window (default: 5s); relay accepts new connections after cleanup | Peak memory < 50MB during flood; recovery < 1s | T3 |
| W3 | WASM memory bomb | `memory.grow(65536)` from WASM module | Growth denied; module receives trap; host continues | Host memory unchanged; module terminated cleanly | T3 |
| W5 | WASM infinite loop | `loop {}` in WASM entry point | Fuel exhausted; module trapped; host continues | Execution terminates within fuel budget; game tick completes | T3 |
| L1 | Lua string bomb | `string.rep("a", 2^30)` | Memory limit hit; script receives error; host continues | Host memory unchanged; script terminated | T3 |
| L2 | Lua infinite loop | `while true do end` | Instruction limit hit; script terminated | Script terminates within instruction budget | T3 |
| L3 | Lua system access | Call `os.execute("rm -rf /")` | Returns nil (function not registered) | No side effects on host filesystem | T1 |
| L5 | Lua UnitTag forgery | Script creates tag value for enemy unit, calls host API | `SandboxError::OwnershipViolation { tag, caller, owner }` | Error includes all three IDs | T3 |
| U1 | Stale UnitTag resolution | Alloc tag, free slot, resolve original tag | `UnitPool::resolve()` returns `None` | Generation mismatch, no panic | T1 |
| U2 | Pool exhaustion | Allocate units beyond pool capacity (2049 for RA1) | `UnitPoolError::PoolExhausted` after 2048th | Exact count: 2048 succeed, 2049th fails | T2 |
| F1 | Negative health YAML | `health: { max: -100 }` in unit definition | `SchemaError::InvalidValue { field: "health.max", value: "-100", constraint: "> 0" }` | Error includes file path + line number | T1 |
| F2 | Circular YAML inheritance | A inherits B inherits A | `RuleLoadError::CircularInheritance { chain: "A → B → A" }` | Chain string matches cycle path | T1 |
| F3 | Unknown TOML key | `unknwon_feld = true` in `config.toml` | `DeserializationError::UnknownField { field: "unknwon_feld", valid: [...] }` | Error lists available fields | T1 |
| A1 | Zip Slip in `.oramap` | Entry path `../../etc/passwd` in archive | `PathBoundaryError::EscapeAttempt { path, boundary }` | Extract produces zero files outside boundary | T3 |
| A2 | Truncated `.mix` | Header claims 47 files, data for 31 | `MixParseError::FileCountMismatch { declared: 47, actual: 31 }` | Both counts in error | T1 |
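
Rows U1 and U2 rely on generational tags. The sketch below shows one common way to get that behaviour, a slot array with a per-slot generation counter and a free list. It is an assumed layout for illustration only: `UnitTag`, `UnitPool`, and `UnitPoolError::PoolExhausted` are named in the table, but their fields, `with_capacity()`, and the return type of `resolve()` are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct UnitTag {
    index: u32,
    generation: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum UnitPoolError {
    PoolExhausted,
}

struct Slot {
    generation: u32,
    live: bool,
}

pub struct UnitPool {
    slots: Vec<Slot>,
    free_list: Vec<u32>,
    capacity: usize,
}

impl UnitPool {
    pub fn with_capacity(capacity: usize) -> Self {
        Self { slots: Vec::new(), free_list: Vec::new(), capacity }
    }

    /// Reuses a freed slot if one exists; otherwise grows until `capacity` is reached.
    pub fn alloc(&mut self) -> Result<UnitTag, UnitPoolError> {
        if let Some(index) = self.free_list.pop() {
            let slot = &mut self.slots[index as usize];
            slot.live = true;
            return Ok(UnitTag { index, generation: slot.generation });
        }
        if self.slots.len() >= self.capacity {
            return Err(UnitPoolError::PoolExhausted);
        }
        self.slots.push(Slot { generation: 0, live: true });
        Ok(UnitTag { index: (self.slots.len() - 1) as u32, generation: 0 })
    }

    /// Frees the slot and bumps its generation so outstanding tags stop resolving.
    /// Returns `false` for tags that are already stale.
    pub fn free(&mut self, tag: UnitTag) -> bool {
        if self.resolve(tag).is_none() {
            return false;
        }
        let slot = &mut self.slots[tag.index as usize];
        slot.live = false;
        slot.generation = slot.generation.wrapping_add(1);
        self.free_list.push(tag.index);
        true
    }

    /// Returns the slot index only when the tag's generation matches a live slot.
    pub fn resolve(&self, tag: UnitTag) -> Option<u32> {
        let slot = self.slots.get(tag.index as usize)?;
        (slot.live && slot.generation == tag.generation).then_some(tag.index)
    }
}
```

In this sketch a stale tag keeps returning `None` even after its slot has been reallocated, because the new occupant carries a higher generation.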

Integration Scenario Matrix

End-to-end scenarios testing multiple systems interacting. Each scenario has explicit setup, action sequence, and verification points.

| Scenario | Systems Under Test | Setup | Action Sequence | Verification Points | Tier |
|---|---|---|---|---|---|
| Full match lifecycle | sim + net + replay | 2-player game, relay network, 5-min scenario | Lobby → loading → 1000 ticks → surrender → post-game | (1) Replay file exists, (2) replay hash matches live hash, (3) post-game stats match sim query | T2 |
| Reconnection mid-combat | sim + net + snapshot | 2-player game, combat in progress at tick 300 | P2 disconnects → 200 ticks → P2 reconnects with snapshot → 500 more ticks | (1) Snapshot accepted, (2) state hashes match after reconnect, (3) no combat resolution errors | T2 |
| Mod load with conflicts | modding + YAML + sim | Two mods overriding `rifle_infantry.cost` with different values | Load profile with explicit priority → start game → build rifle infantry | (1) Conflict detected and logged, (2) higher-priority mod wins, (3) cost in game matches winner, (4) fingerprint identical across clients | T3 |
| Workshop install → gameplay | Workshop + sim + modding | Package with new unit type, dependency on base content | Install package → resolve deps → load mod → start game → build new unit | (1) Deps installed in order, (2) unit definition loaded, (3) unit buildable in game, (4) unit stats match YAML | T4 |
| Campaign transition with roster | campaign + sim + snapshot | Campaign with 2 missions, transition on victory | Play M1 → win with 5 units → transition → verify roster in M2 | (1) 5 units in M2 roster, (2) health/veterancy preserved, (3) story flags accessible | T2 |
| Chat scope in multiplayer | chat + net + relay | 4-player team game (2v2) | P1 sends team chat → P1 sends all-chat → verify delivery | (1) Team chat: P1+P2 receive, P3+P4 do not, (2) all-chat: all 4 receive, (3) observer sees all-chat only | T2 |
| WASM mod with sandbox limits | WASM + sim + modding | Malicious mod attempting memory bomb + file access + infinite loop | Load mod → trigger `memory.grow` → trigger file access → trigger loop | (1) Memory growth denied, (2) file access denied, (3) loop terminated by fuel, (4) game continues normally | T3 |
| Desync detection → diagnosis | sim + net + Merkle tree | 2-player game, deliberate single-archetype mutation at tick 500 | Run to tick 500 → corrupt one archetype on P2 → run to tick 510 | (1) Desync detected within 10 ticks, (2) Merkle tree identifies exact archetype, (3) diagnosis payload < 1KB | T2 |
| Anti-cheat → trust score flow | sim + net + telemetry + ranking | Player with 10 clean games, then 1 flagged game | Play 10 games cleanly → play 1 game with known-cheat replay pattern | (1) Trust score starts high, (2) flagged game triggers score drop, (3) subsequent clean games recover slowly | T4 |
| Save/load during weather | sim + weather + snapshot | Game with active blizzard at tick 300 | Save at tick 300 → load → run 500 more ticks | (1) Weather state matches, (2) terrain surface conditions match, (3) state hash at tick 800 matches fresh run | T3 |
| Console dev-mode flagging | console + replay + ranking | Ranked game, player issues `/god_mode` | Start ranked → exec dev command → complete match → check replay + ranking | (1) Dev flag set, (2) replay metadata shows dev-mode, (3) match excluded from ranked standings | T2 |
| Foreign replay import | replay + sim + format | `.orarep` file from OpenRA | Import → play back via `ForeignReplayPlayback` → check divergence | (1) Import succeeds, (2) playback runs to completion, (3) divergences logged with tick+archetype detail | T3 |

Measurement & Metrics Framework

Every automated test produces structured output beyond pass/fail. These metrics feed into the release-readiness dashboard.

Performance Metrics (collected per benchmark run)

| Metric | Collection Method | Storage | Alert Threshold |
|---|---|---|---|
| Tick time (p50, p95, p99) | criterion statistical analysis | Benchmark history DB (SQLite) | p99 exceeds budget by >10% |
| Heap allocations per tick | Custom global allocator wrapper counting alloc calls | Per-benchmark counter | Any allocation in designated zero-alloc path |
| L1 cache miss rate | `perf stat` / platform performance counters | Benchmark log | > 5% in hot tick loop |
| Peak RSS during scenario | `/proc/self/status` sampling at 10ms intervals | Benchmark log | > 2× expected for unit count |
| Pathfinding nodes expanded | Internal counter in pathfinder | Per-benchmark metric | > 2× optimal for known map |
| Serialization throughput | Bytes/second for snapshot and replay frame writes | Benchmark log | Regression > 15% |
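
The tick-time alert in the first row can be expressed as a short check. This is a sketch only: it uses a plain nearest-rank percentile rather than criterion's statistical analysis, and the function names and the microsecond unit are assumptions.

```rust
/// Nearest-rank percentile over an unsorted sample; `p` must lie in (0, 100].
/// Returns `None` for an empty sample or an out-of-range `p`.
pub fn percentile(samples_us: &[u64], p: f64) -> Option<u64> {
    if samples_us.is_empty() || !(p > 0.0 && p <= 100.0) {
        return None;
    }
    let mut sorted = samples_us.to_vec();
    sorted.sort_unstable();
    // Multiply before dividing so whole-number ranks stay exact in floating point.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}

/// True when p99 exceeds the budget by more than 10% (integer form of p99 > 1.1 × budget).
pub fn p99_alert(samples_us: &[u64], budget_us: u64) -> bool {
    match percentile(samples_us, 99.0) {
        Some(p99) => p99 * 10 > budget_us * 11,
        None => false,
    }
}
```

The comparison is done in integers so that a p99 of exactly 110% of budget does not trip the alert because of floating-point rounding.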

Correctness Metrics (collected per test suite run)

| Metric | Collection Method | Storage | Alert Threshold |
|---|---|---|---|
| Determinism violations | Hash comparison failures across repeated runs | Test result DB | Any violation is a P0 bug |
| False positive rate (anti-cheat) | `flagged_clean / total_clean` on labeled corpus | Corpus evaluation log | > 0.1% (V54 threshold) |
| False negative rate (anti-cheat) | `missed_cheat / total_cheat` on labeled corpus | Corpus evaluation log | > 5% (V54 threshold) |
| Order rejection accuracy | Correct rejection category rate across exhaustive matrix | Test result DB | < 100% is a bug (categories per D012) |
| Fuzz coverage (edge/line) | `cargo fuzz coverage` over the fuzz corpus | Fuzz coverage report | < 80% line coverage in target module |
| Property test case count | proptest runner statistics | Test log | < configured minimum (256 for T1/T2, 10,000 for T3) |
| Snapshot round-trip state identity | `state_hash()` comparison: snapshot → restore → `state_hash()` | Test result DB | Any hash difference is a P0 bug |
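
The two anti-cheat rates are simple ratios over a labeled corpus. The sketch below computes them and checks the alert thresholds listed above (false positives > 0.1%, false negatives > 5%). The struct and function names are hypothetical; only the four counter names come from the table.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CorpusCounts {
    pub flagged_clean: u64, // clean games the detector flagged
    pub total_clean: u64,
    pub missed_cheat: u64, // cheat games the detector did not flag
    pub total_cheat: u64,
}

/// flagged_clean / total_clean, or `None` when the corpus has no clean games.
pub fn false_positive_rate(c: &CorpusCounts) -> Option<f64> {
    (c.total_clean > 0).then(|| c.flagged_clean as f64 / c.total_clean as f64)
}

/// missed_cheat / total_cheat, or `None` when the corpus has no cheat games.
pub fn false_negative_rate(c: &CorpusCounts) -> Option<f64> {
    (c.total_cheat > 0).then(|| c.missed_cheat as f64 / c.total_cheat as f64)
}

/// True when either rate is strictly above its alert threshold.
/// Integer comparisons avoid rounding at the exact boundary:
/// FP > 0.1% is flagged_clean * 1000 > total_clean, and
/// FN > 5% is missed_cheat * 20 > total_cheat.
pub fn breaches_thresholds(c: &CorpusCounts) -> bool {
    c.flagged_clean * 1000 > c.total_clean || c.missed_cheat * 20 > c.total_cheat
}
```

For example, 1 flagged game in 1,000 clean games sits exactly on the 0.1% line and does not alert, while 2 in 1,000 does.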

Security Metrics (collected per security test suite run)

| Metric | Collection Method | Storage | Alert Threshold |
|---|---|---|---|
| Sandbox escape attempts blocked | Counter in WASM/Lua host | Security test log | Any unblocked attempt is a P0 bug |
| Path traversal attempts blocked | `StrictPath` rejection counter during fuzz | Fuzz log | Any unblocked traversal is a P0 bug |
| Replay tampering detection rate | Tampered frames detected / total tampered frames | Security test log | < 100% is a P0 bug |
| SCR replay attack detection rate | Replayed credentials detected / total replays | Security test log | < 100% is a P0 bug |
| Rate limit enforcement accuracy | Orders dropped when budget exhausted / orders sent beyond budget | Test log | < 100% is a bug |
| Half-open connection cleanup time | Time from flood to full recovery | Stress test log | > 5 seconds is a bug |
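
As a worked instance of the rate-limit enforcement accuracy metric, the sketch below replays the O2 scenario (201 orders against a budget of 200) and computes the ratio of dropped orders to over-budget orders. The `OrderBudget` internals, `per_tick()`, `try_spend()`, and the helper functions are hypothetical; the real type is unconstructible outside its crate and may expose a different API.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct BudgetExhausted;

/// Hypothetical per-tick order budget.
pub struct OrderBudget {
    remaining: u32,
}

impl OrderBudget {
    pub fn per_tick(limit: u32) -> Self {
        Self { remaining: limit }
    }

    /// Spends one order slot, or reports exhaustion without panicking.
    pub fn try_spend(&mut self) -> Result<(), BudgetExhausted> {
        if self.remaining == 0 {
            return Err(BudgetExhausted);
        }
        self.remaining -= 1;
        Ok(())
    }
}

/// Sends `sent` orders against a fresh budget; returns (accepted, rejected).
pub fn run_tick(limit: u32, sent: u32) -> (u32, u32) {
    let mut budget = OrderBudget::per_tick(limit);
    let (mut accepted, mut rejected) = (0, 0);
    for _ in 0..sent {
        match budget.try_spend() {
            Ok(()) => accepted += 1,
            Err(BudgetExhausted) => rejected += 1,
        }
    }
    (accepted, rejected)
}

/// Orders dropped / orders sent beyond budget; `None` when nothing exceeded the budget.
pub fn enforcement_accuracy(limit: u32, sent: u32, rejected: u32) -> Option<f64> {
    let over_budget = sent.saturating_sub(limit);
    (over_budget > 0).then(|| f64::from(rejected) / f64::from(over_budget))
}
```

With `limit = 200` and `sent = 201`, `run_tick` yields 200 accepted and 1 rejected, and the accuracy is 1.0; any value below 1.0 would mean an over-budget order slipped through.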