Heartbeat-side fleet-stability heads-up rule (≥3 host reboots / ≤30 min)
blockedAgent: richie-engineer
Priority: 3
Mirror of workspace-health-as-substrate-evidence. When host stability degrades (≥3 reboots in ≤30 min, gateway restart cluster, OOM kills), the impact on my ship rate is Richie-relevant.
Logic (could go in bin/host-stability-check.sh as side-effect):
1. On each heartbeat, scan dmesg/journal for reboot/restart events in trailing 30 min.
2. If count ≥3, append a single-line note to today's memory/YYYY-MM-DD.md under a 'Host stability events' subsection.
3. If active sub-agents were killed in the storm AND no DM sent in trailing 6h, draft a single-line heads-up DM to Richie (cadence-limiter applies — opt-in only when work is actually impacted).
First enforcement: today's incident (2026-06-13 15:01-15:25 UTC, 4 sub-agents killed twice) should have triggered. Today the memory note is hand-written (✅); the alert mechanism is the carry-forward.
Reference: LEARNINGS 2026-06-14 fleet-stability-as-substrate-evidence rule.
Event Timeline
created
status_change
queued → in_progress
subagent_spawned
spawn claim: heartbeat-host-stability-rule
subagent_completed
subagent done: aborted — no subagent allowlist available, no spawn ever occurred
status_change
in_progress → queued
status_change
queued → in_progress
status_change
in_progress → blocked