OAuth P2: refresh-job circuit breaker per manufacturer
failed
Implement BOLT-938: add a per-manufacturer refresh circuit breaker that opens on consecutive failures from distinct users, pauses refresh jobs, and emits an OEMRefreshCircuitOpen event/metric. Recover through half-open probes with backoff, then replay paused jobs with jitter. Branch: talos/oauth-refresh-circuit-breaker.
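A minimal in-memory sketch of the breaker described above, for orientation only; it is not the code on the branch. The class name, method names, threshold, and cooldown parameters are all hypothetical, and the `onOpen` callback merely stands in for wherever OEMRefreshCircuitOpen would be emitted. State lives in a Map here purely for illustration.

```typescript
type CircuitState = "closed" | "open" | "half_open";

interface Circuit {
  state: CircuitState;
  failedUserIds: Set<string>; // distinct users that failed since the last success
  openedAt: number;           // ms timestamp of the most recent open
  openCount: number;          // opens since the last success; drives backoff
}

// Hypothetical sketch: one circuit per manufacturer, keyed by name.
class RefreshCircuitBreaker {
  private readonly circuits = new Map<string, Circuit>();
  private readonly threshold: number;
  private readonly baseCooldownMs: number;
  private readonly maxCooldownMs: number;
  private readonly onOpen: (manufacturer: string) => void;

  constructor(
    threshold: number,
    baseCooldownMs: number,
    maxCooldownMs: number,
    onOpen: (manufacturer: string) => void,
  ) {
    this.threshold = threshold;
    this.baseCooldownMs = baseCooldownMs;
    this.maxCooldownMs = maxCooldownMs;
    this.onOpen = onOpen;
  }

  // True if a refresh job for this manufacturer may run now.
  canRefresh(manufacturer: string, now: number): boolean {
    const c = this.circuit(manufacturer);
    if (c.state === "closed") return true;
    if (c.state === "half_open") return false; // a probe is already in flight
    if (now - c.openedAt < this.cooldownMs(c)) return false;
    c.state = "half_open"; // cooldown elapsed: admit exactly one probe
    return true;
  }

  // Any success closes the circuit and resets the consecutive-failure window.
  recordSuccess(manufacturer: string): void {
    const c = this.circuit(manufacturer);
    c.state = "closed";
    c.failedUserIds.clear();
    c.openCount = 0;
  }

  recordFailure(manufacturer: string, userId: string, now: number): void {
    const c = this.circuit(manufacturer);
    if (c.state === "open") return;
    if (c.state === "half_open") {
      this.open(manufacturer, c, now); // failed probe: reopen with longer cooldown
      return;
    }
    c.failedUserIds.add(userId); // repeat failures from one user count once
    if (c.failedUserIds.size >= this.threshold) this.open(manufacturer, c, now);
  }

  private open(manufacturer: string, c: Circuit, now: number): void {
    c.state = "open";
    c.openedAt = now;
    c.openCount += 1;
    this.onOpen(manufacturer); // stand-in for emitting OEMRefreshCircuitOpen
  }

  // Exponential backoff on consecutive opens, capped at maxCooldownMs.
  private cooldownMs(c: Circuit): number {
    return Math.min(this.baseCooldownMs * 2 ** (c.openCount - 1), this.maxCooldownMs);
  }

  private circuit(manufacturer: string): Circuit {
    let c = this.circuits.get(manufacturer);
    if (!c) {
      c = { state: "closed", failedUserIds: new Set(), openedAt: 0, openCount: 0 };
      this.circuits.set(manufacturer, c);
    }
    return c;
  }
}
```

Counting distinct user IDs rather than raw failures keeps one user's revoked token from tripping the breaker for a whole manufacturer.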
Event Timeline
created
progress
Migrated task tracking from XENG to BOLT-938 per Victor: XENG board is off-limits/deletion-bound; active tracking is BOLT only.
status_change
queued → in_progress
failed
lease expired — re-queued for retry
in_progress → queued
status_change
queued → in_progress
failed
lease expired — re-queued for retry
in_progress → queued
status_change
queued → in_progress
failed
lease expired — max retries reached, marking failed (poison pill)
in_progress → failed
progress
Heartbeat 2026-05-27 01:50 UTC: resuming BOLT-938 after the poison-pill lease expiry. Previous checkpoint says gates are green and the worktree is uncommitted and unpushed; final review, commit, and push in progress.
progress
Heartbeat 2026-05-27 01:54 UTC: opened BOLT-938 PR #11197 after local lint, focused queue/worker jest (40/40), and connect-subgraph build passed. Commit bab6490657 pushed; active CI watch set.
progress
Heartbeat 2026-05-27 02:01 UTC: PR #11197 CI is all green (build/test/Apollo/path-filter pass; CodeRabbit pass/skipped; Cursor Bugbot/claude skipped). Moving to awaiting review/merge.
progress
Heartbeat 2026-05-27 02:08 UTC: fixed PR #11197 CodeRabbit/Cursor findings by persisting manufacturer failure IDs in Redis, delaying rather than removing paused jobs, isolating circuit-open errors from termination, and rescheduling failed half-open probes. Local lint, 43/43 focused tests, and build pass; pushed 8d9ec274eb and CI watch active.
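A rough sketch of what "delaying rather than removing paused jobs" with jitter could look like, assuming paused jobs are replayed at a fixed spacing plus a random offset. `PausedJob`, `planReplay`, `spacingMs`, and `jitterMs` are hypothetical names; no real queue API is shown, and the actual change in PR #11197 may be structured differently.

```typescript
interface PausedJob {
  id: string;
}

interface ReplaySlot {
  id: string;
  delayMs: number;
}

// Assign each paused job a delay instead of dropping it: jobs are spaced
// spacingMs apart, and each gets up to jitterMs of random offset so replays
// do not hit the manufacturer's endpoint in lockstep.
function planReplay(
  jobs: PausedJob[],
  spacingMs: number,
  jitterMs: number,
  random: () => number = Math.random, // injectable for deterministic tests
): ReplaySlot[] {
  return jobs.map((job, index) => ({
    id: job.id,
    delayMs: index * spacingMs + Math.floor(random() * jitterMs),
  }));
}
```

Passing the random source as a parameter keeps the jitter testable with a fixed function.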