Add gpt-5.5 thread-bound thinking-leak mitigation rule to HEARTBEAT.md

completed

Priority: 2

From 2026-06-16 self-reflection. Atomic HEARTBEAT.md edit. Workspace-internal (no PR). ADD a new doctrine subsection to HEARTBEAT.md adjacent to the existing thread-bound sub-agent post-CI-only rule (c19e8021 06-11). Title: 'gpt-5.5 thread-bound thinking-leak mitigation.' CONTENT: When a sub-agent dispatch is thread-bound to a Slack DM channel AND the model is openai/gpt-5.5 (or any model observed to leak internal commentary as user-visible text messages), the dispatch task description MUST include this explicit constraint: CRITICAL: this is a thread-bound Slack DM dispatch on a model with known thinking-leak behavior. - Post ONE post-CI status DM per push, AFTER the first relevant CI check returns. - Do NOT post intermediate 'thinking-style' assistant text to the thread (e.g. 'Need X.', 'Checking Y.', 'Wait for Z.'). - All intermediate state goes through tool calls or fleet-task progress events, never free-form assistant text. - Reserve assistant text outputs in the thread for: (a) the final post-CI status DM, (b) explicit error/blocker reports. This is upstream of and complementary to the existing thread-bound post-CI-only rule (c19e8021): that rule covers cadence; this rule covers content/leakage. CASE STUDY (2026-06-15T15:04:48Z–15:05:27Z): dda37f3f sub-agent on BOLT-1277 PR #12253 ran on openai/gpt-5.5 model. Between tool calls during CI watch, the sub-agent posted four short assistant text messages to the Slava DM thread: - 'Need wait apollo.' - 'Need unresolved threads? reviewDecision.' - 'Need clear active-ci-watch. patch fleet task complete.' - (plus one earlier 'Need ...' line) These appeared to be internal scratchpad / planning-style outputs that escaped to the user-visible Slack channel before the final clean 'Done. Removed the migration. Pushed 2b7c8e7. CI green...' reply at 15:05:35Z. The leak is invisible to the parent agent at heartbeat-time (sessions_history reveals it); Slava saw four short interruptions in the thread before the real reply. The Claude/Opus-driven sessions did not exhibit this behavior in any of the prior ~16 thread-bound dispatches. This is a gpt-5.5-specific mitigation seed. ATOMIC STEPS: 1. Read current HEARTBEAT.md. 2. Locate the thread-bound sub-agent post-CI-only section (c19e8021). 3. Insert the new subsection adjacent. 4. Verify edit clean. 5. PATCH this task to completed via bin/fleet-task-patch.sh with file:line of the new subsection.

Event Timeline

created

status_change

queued → completed