Add gpt-5.5 thread-bound thinking-leak mitigation rule to HEARTBEAT.md
completed
Agent: slava-agent
Priority: 2
From 2026-06-16 self-reflection. Atomic HEARTBEAT.md edit. Workspace-internal (no PR).
ADD a new doctrine subsection to HEARTBEAT.md adjacent to the existing thread-bound sub-agent post-CI-only rule (c19e8021 06-11). Title: 'gpt-5.5 thread-bound thinking-leak mitigation.'
CONTENT:
When a sub-agent dispatch is thread-bound to a Slack DM channel AND the model is openai/gpt-5.5 (or any model observed to leak internal commentary as user-visible text messages), the dispatch task description MUST include this explicit constraint:
CRITICAL: this is a thread-bound Slack DM dispatch on a model with known thinking-leak behavior.
- Post ONE post-CI status DM per push, AFTER the first relevant CI check returns.
- Do NOT post intermediate 'thinking-style' assistant text to the thread (e.g. 'Need X.', 'Checking Y.', 'Wait for Z.').
- All intermediate state goes through tool calls or fleet-task progress events, never free-form assistant text.
- Reserve assistant text outputs in the thread for: (a) the final post-CI status DM, (b) explicit error/blocker reports.
This is upstream of and complementary to the existing thread-bound post-CI-only rule (c19e8021): that rule covers cadence; this rule covers content/leakage.
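A minimal sketch of how a dispatcher might apply the rule above, assuming the task description is assembled as a plain string before dispatch. The names `LEAK_PRONE_MODELS`, `LEAK_CONSTRAINT`, and `build_task_description` are hypothetical and not part of any existing fleet tooling; only the constraint text itself comes from this task.

```python
# Hypothetical helper: the set name, constant name, and function name are
# illustrative assumptions, not existing fleet APIs.
LEAK_PRONE_MODELS = {"openai/gpt-5.5"}  # extend when another model is observed leaking

LEAK_CONSTRAINT = """\
CRITICAL: this is a thread-bound Slack DM dispatch on a model with known thinking-leak behavior.
- Post ONE post-CI status DM per push, AFTER the first relevant CI check returns.
- Do NOT post intermediate 'thinking-style' assistant text to the thread (e.g. 'Need X.', 'Checking Y.', 'Wait for Z.').
- All intermediate state goes through tool calls or fleet-task progress events, never free-form assistant text.
- Reserve assistant text outputs in the thread for: (a) the final post-CI status DM, (b) explicit error/blocker reports."""


def build_task_description(base: str, model: str, thread_bound: bool) -> str:
    """Append the constraint only when BOTH conditions of the rule hold."""
    if thread_bound and model in LEAK_PRONE_MODELS:
        return f"{base.rstrip()}\n\n{LEAK_CONSTRAINT}"
    return base
```

Keeping the leak-prone models in one set means the "or any model observed to leak" clause can be honored by adding an entry rather than editing the dispatch logic.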
CASE STUDY (2026-06-15T15:04:48Z–15:05:27Z): the dda37f3f sub-agent on BOLT-1277 PR #12253 ran on the openai/gpt-5.5 model. Between tool calls during the CI watch, the sub-agent posted four short assistant text messages to the Slava DM thread:
- 'Need wait apollo.'
- 'Need unresolved threads? reviewDecision.'
- 'Need clear active-ci-watch. patch fleet task complete.'
- (plus one earlier 'Need ...' line)
These appeared to be internal scratchpad / planning-style outputs that escaped to the user-visible Slack channel before the final clean 'Done. Removed the migration. Pushed 2b7c8e7. CI green...' reply at 15:05:35Z. The leak is not visible to the parent agent at heartbeat time; it shows up only in sessions_history. Slava saw four short interruptions in the thread before the real reply.
The Claude/Opus-driven sessions did not exhibit this behavior in any of the prior ~16 thread-bound dispatches. This is a gpt-5.5-specific mitigation seed.
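Because the leak only shows up in the session history, a parent agent could scan that history after the fact. The sketch below is one possible heuristic, assuming the history can be flattened into role/text pairs; the `Message` type, the word limit, and the pattern are all hypothetical choices modeled on the case-study messages, not an existing sessions_history API.

```python
import re
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # e.g. "assistant" or "tool" (assumed shape, not a real schema)
    text: str


# Assumption: leaked lines are short and open with a planning-style word,
# matching the 'Need X.' / 'Checking Y.' / 'Wait for Z.' examples in the rule.
LEAK_PATTERN = re.compile(r"^(need|checking|wait)\b", re.IGNORECASE)
MAX_LEAK_WORDS = 12


def find_leaks(messages: list[Message]) -> list[str]:
    """Return assistant texts that look like leaked scratchpad output."""
    assistant = [m.text.strip() for m in messages if m.role == "assistant"]
    candidates = assistant[:-1]  # the last assistant message is the intended status DM
    return [
        t for t in candidates
        if len(t.split()) <= MAX_LEAK_WORDS and LEAK_PATTERN.match(t)
    ]
```

This is deliberately conservative: explicit error or blocker reports, which the rule allows, are not flagged unless they happen to match the planning-style pattern.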
ATOMIC STEPS:
1. Read current HEARTBEAT.md.
2. Locate the thread-bound sub-agent post-CI-only section (c19e8021).
3. Insert the new subsection adjacent to it.
4. Verify the edit is clean.
5. PATCH this task to completed via bin/fleet-task-patch.sh with file:line of the new subsection.
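Steps 2 through 4 can be sketched as a small text transformation. This is an illustration only, assuming HEARTBEAT.md uses `#`-style Markdown headings and that the existing rule's section can be located by a marker string such as its commit id; `insert_adjacent` is a hypothetical helper, and the call to bin/fleet-task-patch.sh is left out because its arguments are not described here.

```python
def insert_adjacent(text: str, anchor: str, subsection: str) -> tuple[str, int]:
    """Insert `subsection` right after the section containing `anchor`.

    Returns the new text and the 1-based line number where the subsection starts.
    """
    lines = text.splitlines()
    start = next((i for i, line in enumerate(lines) if anchor in line), None)
    if start is None:
        raise ValueError(f"anchor {anchor!r} not found")
    # Assumption: the anchored section ends at the next heading line, or at
    # end of file. A '#' inside a fenced code block would be misread as a heading.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("#")),
        len(lines),
    )
    new_lines = lines[:end] + subsection.splitlines() + [""] + lines[end:]
    return "\n".join(new_lines) + "\n", end + 1
```

The returned line number can be formatted as `HEARTBEAT.md:<line>` for the file:line reference in step 5. Raising on a missing anchor keeps the edit atomic: nothing is written if the existing section cannot be found.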
Event Timeline
created
status_change: queued → completed