ALL-681: SCADA/grid enrichment for monitoring index

completed

Priority: 0

Branch: wintermute/all-681-scada-monitoring-enrichment

PR: #11239

Implement SCADA/grid telemetry monitoring enrichment assuming PR #11228 merges. Create branch wintermute/all-681-scada-monitoring-enrichment stacked on seph/monitor-detected-priority-topics if needed. Add SCADA identity to telemetry.ingest, preserve device-backed telemetry, index SCADA telemetry without texture-devices match, enrich from grid-elements/ScadaGridNode where available, add mapping/tests, open PR assigned to sephcoster. Slack thread 1779915195.603589.

Event Timeline

created

progress

PR #11239 initial push (7c8e8b3) went red ~01:07Z: test-oem-adapter failed with TS2322 on vecScadaHandler.ts:451 because the published @texturehq/event-bus 10.22.0 (what adapters/oem resolves to) does not yet include the new scadaNode.gsGuid field; only the in-repo source schema does. DeepSource JS + coverage also red as downstream. Heartbeat (13:15Z) detected the red state, dropped scadaNode.gsGuid from the producer (device-consumer already falls back to event.externalId, so enrichment unchanged), confirmed local yarn lint clean + yarn tsc -p tsconfig.prod.json --noEmit clean, pushed 8c094f8. New CI run kicked off; resuming watch.
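
A minimal sketch of the mismatch and the fallback described in this entry, under assumed type shapes. `PublishedScadaNode`, `SourceScadaNode`, `ScadaEvent`, `resolveElementGuid`, and the `name` field are hypothetical names for illustration, not the actual event-bus or device-consumer API. Only `scadaNode.gsGuid` and `event.externalId` come from the log.

```typescript
// Hypothetical shape of ScadaNode as published in the released package:
// gsGuid is not part of it yet (assumption for illustration).
interface PublishedScadaNode {
  name: string;
}

// Hypothetical in-repo source schema that already carries the new field.
interface SourceScadaNode extends PublishedScadaNode {
  gsGuid?: string;
}

interface ScadaEvent {
  externalId: string;
  scadaNode: SourceScadaNode;
}

// Producer side: when compiled against the published type, an object literal
// that sets gsGuid is rejected by the compiler, so the field is left out.
const producedNode: PublishedScadaNode = { name: "feeder-12" };

// Consumer side: use gsGuid when present, otherwise fall back to externalId.
function resolveElementGuid(event: ScadaEvent): string {
  return event.scadaNode.gsGuid ?? event.externalId;
}

const producedEvent: ScadaEvent = { externalId: "ext-001", scadaNode: producedNode };
const resolvedGuid = resolveElementGuid(producedEvent);
```

Because the consumer already tolerates a missing gsGuid, dropping the field from the producer leaves the enrichment key unchanged in this sketch.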

progress

Heartbeat 13:53Z verified PR #11239 CI state after the 13:13Z push (8c094f8). Outcome: all GitHub Actions gates green (test, test-device-domain, apollo), Cursor Bugbot ✅, CodeRabbit ✅. The only remaining red is DeepSource (JavaScript + Test coverage on domains/device) — both report 'Blocking issues or failing metrics found' but the PR inline report card is grade A across all axes with 0 inline issues and 90.7% new-code coverage. Verified the same two gates are failing on PR #11228 base tip (b457aec) with the identical pattern, so the DeepSource red is inherited from the base branch and not introduced by #11239. Posted honest 12h-of-red retrospective into Slack thread 1779915195.603589 (msg ts 1779976638.526339). PR state OPEN, mergeStateStatus UNSTABLE, awaiting #11228 merge. Stopping CI watch.

progress

Pushed c4554a4 addressing Seph review items M2/M3/M6: gsGuid fallback warn+metric, toElementGuid normalize+tests, event-bus patch bump. CI green on the GH Actions side (test-device-domain, test/test, build, apollo). DeepSource red is the inherited pattern from the #11228 base, already tolerated on main. Posted Slack reply in #wintermute-seph thread 1779915195.603589 with recommendations: critical=(a) fail-closed/backfill workspaceId; M1=sketched mixed-batch isolation (awaiting go-ahead); M4=triage done, one real coupling found in resolveSeriesIds.ts; M5=not in scope, will file follow-up. Awaiting decisions.

progress

Heartbeat 16:31Z: Slack thread 1779915195.603589 advanced 16:13–16:21Z with Seph agreeing to the critical (a) workspace-scoping fix. Plan: (1) optional workspaceId on event-bus ScadaNode + VEC handler config + telemetry.ingest.scadaNode schema, (2) monitor consumer prefers scadaNode.workspaceId, then the grid-elements fallback, then fail-closed drop+metric, (3) tests, (4) ship in PR #11239. Parent committed at 16:21:45Z. Spawned sub-agent all_681_workspace_id_patch (runId 49d76682) to execute on branch wintermute/all-681-scada-monitoring-enrichment from c4554a4. Sub-agent will run lint+tsc before pushing, push an additive commit, watch CI, post one thread status reply, and log a fleet-task progress event with the final SHA. M1/M4/M5 deferred per parent decision.
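
The resolution order in step (2) of the plan could look roughly like the sketch below. All names here (`ScadaNode`, `GridElement`, `Resolution`, `resolveWorkspaceId`, `droppedUnresolved`) are hypothetical, and the counter is a stand-in for whatever metrics client the consumer actually uses. Only the order (event workspaceId, then grid-elements, then fail closed with a metric) comes from the plan.

```typescript
// Hypothetical types; only the workspaceId field and the lookup order
// are taken from the plan.
interface ScadaNode {
  workspaceId?: string;
}

interface GridElement {
  workspaceId?: string;
}

type Resolution =
  | { kind: "resolved"; workspaceId: string; source: "event" | "grid-elements" }
  | { kind: "dropped" };

// Hypothetical in-memory counter standing in for a real metrics client.
const droppedUnresolved = { count: 0 };

function resolveWorkspaceId(
  node: ScadaNode,
  gridElement: GridElement | undefined,
): Resolution {
  // 1. Prefer the workspaceId carried on the event itself.
  if (node.workspaceId) {
    return { kind: "resolved", workspaceId: node.workspaceId, source: "event" };
  }
  // 2. Otherwise fall back to the grid-elements lookup result, if any.
  if (gridElement?.workspaceId) {
    return { kind: "resolved", workspaceId: gridElement.workspaceId, source: "grid-elements" };
  }
  // 3. Fail closed: drop the document and count it, rather than indexing
  //    telemetry that has no workspace scope.
  droppedUnresolved.count += 1;
  return { kind: "dropped" };
}
```

Returning a tagged result instead of throwing keeps the drop decision explicit at the call site, which is one way to make the fail-closed path easy to test.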

progress

Pushed 7d3364b576 to PR #11239: optional SCADA workspaceId propagation. Changes: event-bus ScadaNode.workspaceId optional, VEC S3 target handler/admin UI optional workspaceId, monitoring consumer propagates it, SCADA monitor docs prefer event workspaceId then grid-elements workspaceId and fail closed with metric if unresolved. Local checks from sub-agent: event-bus build; device targeted jest 36 passed + build; adapters/oem vecScadaHandler jest 5 passed + tsc; apps/admin S3PollDialogs vitest 20 passed + tsc. Post-push CI started and is pending.

progress

Heartbeat 17:08Z: Sub-agent all_681_workspace_id_patch returned completed but left a dangling commitment. Sequence on PR #11239: 7d3364b (workspaceId propagation) → 2a29fa1 (harden coverage) → 706c275 (normalize fallback ids) → Seph asked 17:02:04Z about splitting handler changes into a separate PR → sub-agent agreed at 17:02:35Z and pushed 0025d4f7 reverting the workspaceId propagation out of #11239 → sub-agent ended without opening the promised stacked PR. Current #11239 HEAD is the revert; CI re-running. Spawned follow-up sub-agent all_681_open_stacked_workspaceid_pr (runId pending) to create wintermute/all-681-scada-workspaceid-propagation stacked on #11239, apply revert-the-revert, push, open PR assigned to sephcoster, post single Slack reply with PR URL, and log progress event.

progress

Split optional VEC SCADA workspaceId handler/config changes out of PR #11239 per Seph request. Pushed revert commit 0025d4f77a to #11239 so it is back to grid-elements enrichment scope. Created stacked branch wintermute/all-681-scada-workspaceid-config and PR #11274 assigned to sephcoster, base wintermute/all-681-scada-monitoring-enrichment. Validation after split: yarn --cwd domains/device jest monitoring bulk-writer/document-builder/telemetry-ingest tests => 26 passed.

progress

Split complete: reverted workspaceId propagation commits from #11239 (0025d4f77a), opened stacked PR #11275 (wintermute/all-681-scada-workspaceid-propagation) with the inverse — adds optional workspaceId to event-bus schema, VEC SCADA S3 handler, and monitoring enrichment. All lint/tsc clean. CI running. PR: https://github.com/TextureHQ/mono/pull/11275

progress

Split optional VEC SCADA workspaceId handler/config propagation out of PR #11239 per Seph request. #11239 head 0025d4f77a is now core SCADA grid-elements monitoring enrichment only; all GitHub Actions and Cursor Bugbot green, reviewDecision REVIEW_REQUIRED / mergeState BLOCKED for review. New stacked PR #11274 head 954f065260 carries event-bus scadaNode.workspaceId, VEC handler/admin config, monitor workspaceId preference/fail-closed; all GitHub Actions and Cursor Bugbot green. Slack thread update from subagent was blocked by cross-context messaging after prior thread update ts 1779988129.271689.

progress

fix(scada): aligned the gsGuid fallback operator in scadaElementGuid (?? -> ||) so that an empty-string gsGuid also falls back. Commit 9aabe30266 pushed to wintermute/all-681-scada-monitoring-enrichment. Jest: 3 passed/3 total (gridElementsEnricher.test.ts, including new empty-string gsGuid fallback test). Lint: clean. TSC: no errors in modified files (pre-existing unrelated errors in integration test remain). CI gates: push complete, awaiting GH Actions / DeepSource / Cursor Bugbot evaluation on new commit.
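
For reference, the behavioral difference between the two operators can be shown with a small sketch. `guidWithNullish` and `guidWithOr` are hypothetical stand-ins for scadaElementGuid, and the use of an externalId as the fallback value is an assumption for illustration.

```typescript
// Hypothetical stand-ins for scadaElementGuid before and after the change.
function guidWithNullish(gsGuid: string | undefined, externalId: string): string {
  // ?? falls back only on null or undefined, so an empty string is kept.
  return gsGuid ?? externalId;
}

function guidWithOr(gsGuid: string | undefined, externalId: string): string {
  // || falls back on any falsy value, including the empty string.
  return gsGuid || externalId;
}

const keptEmpty = guidWithNullish("", "ext-001"); // ""
const fellBack = guidWithOr("", "ext-001"); // "ext-001"
const sameForUndefined = guidWithOr(undefined, "ext-001"); // "ext-001"
```

The two variants agree for undefined and for non-empty strings; they differ only when gsGuid is the empty string.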

progress

CI update for commit 9aabe30266 on PR #11239. GH Actions: ALL PASS (Lint, Type Check, Build, Unit Tests, test-device-domain, test/test x2, deploy/Deploy admin, build-docker x2, apollo-schema-check). DeepSource JS (apps/admin, domains/device): fail - pre-existing from base PR #11228 pattern, not introduced. DeepSource test coverage domains/device: fail - pre-existing. Cursor Bugbot: still pending review. Jest locally: 3/3 passed. Fix is clean and minimal - single operator change ?? -> || in scadaElementGuid.

progress

Addressed latest Cursor Bugbot comment on PR #11239 (grid enrichment errors abort flush of unrelated device events). Pushed 064a079e26: texture-devices enrichment remains critical path; grid-elements enrichment only runs for SCADA-keyed batches and degrades to empty grid map on MonitoringGridEnrichmentLookupError. Device-state batches no longer call grid enrichment; SCADA events with workspaceId write fallback docs, otherwise fail closed. Local validation: yarn --cwd domains/device jest src/monitoring/es/bulk-writer.test.ts --runInBand --detectOpenHandles --forceExit => 19 passed; yarn --cwd domains/device lint src/monitoring/es/bulk-writer.ts src/monitoring/es/bulk-writer.test.ts => clean. CI restarted.
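
A rough sketch of the degradation behavior described in this entry, under stated assumptions. Only the error name MonitoringGridEnrichmentLookupError comes from the log. Its constructor, the `GridInfo` shape, the `GridLookup` signature, and `loadGridMap` are hypothetical and do not reflect the actual bulk-writer code.

```typescript
// Error name is from the log; the class body is an assumption.
class MonitoringGridEnrichmentLookupError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "MonitoringGridEnrichmentLookupError";
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical enrichment record and lookup signature.
interface GridInfo {
  elementGuid: string;
  workspaceId?: string;
}
type GridMap = Map<string, GridInfo>;
type GridLookup = (guids: string[]) => Promise<GridMap>;

async function loadGridMap(scadaGuids: string[], lookup: GridLookup): Promise<GridMap> {
  // Batches without SCADA keys skip grid enrichment entirely.
  if (scadaGuids.length === 0) {
    return new Map();
  }
  try {
    return await lookup(scadaGuids);
  } catch (err) {
    // A grid lookup failure degrades to an empty map so the flush can proceed.
    if (err instanceof MonitoringGridEnrichmentLookupError) {
      return new Map();
    }
    // Any other error is rethrown unchanged.
    throw err;
  }
}
```

In this sketch only the named lookup error is swallowed, so unexpected failures still surface instead of being silently treated as "no enrichment available".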

progress

Terminal CI state for PR #11239 head a391fefc1b (build-docker TS2339 fix). GitHub Actions: ALL PASS (Lint, Type Check, Unit Tests, test-device-domain, test/test x2, Build, build-docker x2, deploy/Deploy admin, apollo-schema-check). CodeRabbit: pass (skipped). Cursor Bugbot: pass. DeepSource: JS (apps/admin) fail, JS (domains/device) fail, coverage (domains/device) fail — same inherited pattern as base PR #11228 + main (already disclosed in Slack thread 1779915195.603589 at 13:57Z and 19:21Z; PR inline DeepSource report card still grade A, 0 inline issues, 90.7% new-code coverage). PR state OPEN, mergeStateStatus BLOCKED, reviewDecision REVIEW_REQUIRED — awaiting Seph review. No new external info → no Slack post per PR-thread gate. Stopping CI watch.

progress

Investigated admin CI failure on PR #11239. The failing admin Type Check reported that ../../packages/app-shell/src/AppShellWrapper.tsx cannot find next/navigation. Cause: the branch included the legacy apps/website workspace removal from a prior merge while origin/main still has apps/website, so lockfile/workspace drift broke the admin typecheck. Rebased #11239 cleanly onto current origin/main and reran local checks: yarn --cwd apps/admin typecheck passed; yarn --cwd domains/device jest src/monitoring/es/bulk-writer.test.ts --runInBand --detectOpenHandles --forceExit => 19 passed; yarn --cwd domains/device lint src/monitoring/es/bulk-writer.ts src/monitoring/es/bulk-writer.test.ts => clean. Force-pushed rebased branch at 6a3b767624; CI restarted.

progress

Admin CI was still failing at apps/admin Type Check with packages/app-shell/src/AppShellWrapper.tsx cannot find next/navigation. Reproduced in a clean CI-style checkout: root install + apps/admin install + apps/admin typecheck fails because TypeScript resolves @texturehq/app-shell source from packages/app-shell, and next was only installed under apps/admin/node_modules. Fix: commit 12b26bd7e5 adds next 15.5.9 as app-shell devDependency and updates root yarn.lock. Local verification: yarn --cwd apps/admin typecheck passed, then yarn --cwd apps/admin tsc --noEmit passed. Pushed; CI restarted.

status_change

in_progress → completed