Summary
Fixes stale frontend workflow execution state when fast workflows complete before all socket events are reflected in the workflow editor.
This is best done with a formal state model, so this PR adds a small workflow execution state model for queue and invocation events. Socket events now pass through the model before mutating node execution state, and completed workflow queue items are reconciled from the authoritative persisted session.
| Area | State | Event | Result |
| --- | --- | --- | --- |
| Queue | null | queue_item_status_changed: pending | Apply pending |
| Queue | null | queue_item_status_changed: in_progress | Apply in_progress |
| Queue | null | queue_item_status_changed: completed | Apply completed |
| Queue | null | queue_item_status_changed: failed | Apply failed |
| Queue | null | queue_item_status_changed: canceled | Apply canceled |
| Queue | pending | queue_item_status_changed: in_progress | Apply in_progress |
| Queue | pending | queue_item_status_changed: completed | Apply completed |
| Queue | pending | queue_item_status_changed: failed | Apply failed |
| Queue | pending | queue_item_status_changed: canceled | Apply canceled |
| Queue | in_progress | queue_item_status_changed: completed | Apply completed |
| Queue | in_progress | queue_item_status_changed: failed | Apply failed |
| Queue | in_progress | queue_item_status_changed: canceled | Apply canceled |
| Queue | completed | queue_item_status_changed: pending | Ignore stale event |
| Queue | completed | queue_item_status_changed: in_progress | Ignore stale event |
| Queue | failed | queue_item_status_changed: pending | Ignore stale event |
| Queue | failed | queue_item_status_changed: in_progress | Ignore stale event |
| Queue | canceled | queue_item_status_changed: pending | Ignore stale event |
| Queue | canceled | queue_item_status_changed: in_progress | Ignore stale event |
| Invocation | unknown | invocation_started | Apply in_progress |
| Invocation | unknown | invocation_progress | Apply in_progress |
| Invocation | unknown | invocation_complete | Apply completed |
| Invocation | unknown | invocation_error | Apply failed |
| Invocation | in_progress | invocation_complete | Apply completed |
| Invocation | in_progress | invocation_error | Apply failed |
| Invocation | completed | invocation_started | Ignore stale event |
| Invocation | completed | invocation_progress | Ignore stale event |
| Invocation | completed | invocation_error | Ignore stale event |
| Invocation | failed | invocation_started | Ignore stale event |
| Invocation | failed | invocation_progress | Ignore stale event |
| Invocation | failed | invocation_complete | Ignore stale event |
| Queue completed | invocation not terminal | invocation_complete | Apply completed |
| Queue completed | any invocation | invocation_started | Ignore stale event |
| Queue completed | any invocation | invocation_progress | Ignore stale event |
| Queue completed | any invocation | invocation_error | Ignore stale event |
| Queue failed | invocation not terminal | invocation_error | Apply failed |
| Queue failed | any invocation | invocation_started | Ignore stale event |
| Queue failed | any invocation | invocation_progress | Ignore stale event |
| Queue failed | any invocation | invocation_complete | Ignore stale event |
| Queue canceled | any invocation | any invocation event | Ignore stale event |
| Reconciliation | completed queue item | completed_session_reconciled | Mark queue completed; mark persisted prepared invocation IDs completed; rebuild node outputs from session |
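To make the table easier to check against code, here is a minimal TypeScript sketch of a pure transition function that follows the queue and invocation rows. It is an illustration under assumptions, not the code in this PR: `WorkflowState`, `WorkflowEvent`, and `transition` are hypothetical names, a missing invocation key stands in for `unknown`, and cases the table does not list (for example a terminal queue status followed by another terminal status) are simply applied.

```typescript
// Hypothetical names throughout; a sketch of the table above, not the PR's actual code.
type QueueStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'canceled';
type InvocationStatus = 'in_progress' | 'completed' | 'failed';

type WorkflowState = {
  queueStatus: QueueStatus | null;
  // A missing key stands for the table's "unknown" invocation state.
  invocations: Record<string, InvocationStatus>;
};

type InvocationEventType =
  | 'invocation_started'
  | 'invocation_progress'
  | 'invocation_complete'
  | 'invocation_error';

type WorkflowEvent =
  | { type: 'queue_item_status_changed'; status: QueueStatus }
  | { type: InvocationEventType; invocationId: string };

type Transition = { state: WorkflowState; shouldApply: boolean };

const isTerminalQueueStatus = (status: QueueStatus | null): boolean =>
  status === 'completed' || status === 'failed' || status === 'canceled';

const nextInvocationStatus = (type: InvocationEventType): InvocationStatus => {
  if (type === 'invocation_complete') return 'completed';
  if (type === 'invocation_error') return 'failed';
  return 'in_progress';
};

const transition = (state: WorkflowState, event: WorkflowEvent): Transition => {
  const ignore: Transition = { state, shouldApply: false };

  if (event.type === 'queue_item_status_changed') {
    // A terminal queue item never moves back to pending or in_progress.
    if (isTerminalQueueStatus(state.queueStatus) && !isTerminalQueueStatus(event.status)) {
      return ignore;
    }
    // Assumption: every other queue transition, including unlisted ones, is applied.
    return { state: { ...state, queueStatus: event.status }, shouldApply: true };
  }

  const current: InvocationStatus | undefined = state.invocations[event.invocationId];
  // A terminal invocation ignores every later event.
  if (current === 'completed' || current === 'failed') return ignore;
  // A canceled queue item ignores every invocation event.
  if (state.queueStatus === 'canceled') return ignore;
  // A completed queue item accepts only a late invocation_complete.
  if (state.queueStatus === 'completed' && event.type !== 'invocation_complete') return ignore;
  // A failed queue item accepts only a late invocation_error.
  if (state.queueStatus === 'failed' && event.type !== 'invocation_error') return ignore;

  return {
    state: {
      ...state,
      invocations: { ...state.invocations, [event.invocationId]: nextInvocationStatus(event.type) },
    },
    shouldApply: true,
  };
};
```

Keeping the function pure (state in, state plus `shouldApply` out) means each table row can be asserted directly in a unit test without a socket or store.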
Related Issues / Discussions
This attempts to finally resolve the execution state issues that were only partially resolved in #9043 and others.
QA Instructions
Run tests from invokeai/frontend/web:
pnpm exec vitest run src/services/events/workflowExecutionState.test.ts src/services/events/nodeExecutionState.test.ts src/services/events/invocationTracking.test.ts
Merge Plan
Normal merge.
Checklist
The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
❗Changes to a redux slice have a corresponding migration
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)
High: invokeai/frontend/web/src/services/events/setEventListeners.tsx:544-566 introduces an async reconciliation fetch for completed workflow queue items, but its upsertExecutionState side effect is keyed only by nodeId (the source-node id) in the global $nodeExecutionStates store, with no guard that the reconciled item is still the current run. Scenario: queue item 1 (workflow) reaches completed, the reconciliation dispatch(queueApi.endpoints.getQueueItem.initiate(item_id, { forceRefetch: true, subscribe: false })) is dispatched but not yet resolved. Queue item 2 for the same workflow then starts; on queue_item_status_changed with status === 'in_progress', invokeai/frontend/web/src/services/events/setEventListeners.tsx:516-528 resets all node states to PENDING, and subsequent invocation_started events transition source nodes to IN_PROGRESS. The pending fetch for item 1 finally resolves and calls upsertExecutionState for each source node in item 1's session with status COMPLETED and item 1's outputs, blowing away item 2's IN_PROGRESS (or partial progress) state. Evidence chain: setEventListeners.tsx:559-561 -> invokeai/frontend/web/src/features/nodes/hooks/useNodeExecutionState.ts:40-47 (upsertExecutionState does { ...state, ...updates } with no item id check). The previous code never wrote node state from an async reconciliation, so this is a regression introduced by this branch.
To expose this issue, add a test that drives the listener through (1) queue_item_status_changed(item_id=1, status=completed, origin=workflows) with a deferred getQueueItem response, (2) queue_item_status_changed(item_id=2, status=in_progress) followed by invocation_started for the same source nodes, then (3) resolve item 1's getQueueItem mock with completed results, and assert $nodeExecutionStates still reflects item 2's IN_PROGRESS state rather than item 1's COMPLETED outputs.
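One possible shape for the missing guard, as a minimal sketch: the listener records which queue item currently owns the node states, and the reconciliation callback drops its result when a newer item has taken over. Every name here (`nodeStates`, `activeItemId`, `applyReconciliation`) is hypothetical and stands in for the real store and handlers; the deferred fetch is modeled as a plain function call so the ordering is explicit.

```typescript
// Hypothetical store and helpers; the real store is $nodeExecutionStates.
type NodeStatus = 'PENDING' | 'IN_PROGRESS' | 'COMPLETED';
type NodeExecutionState = { status: NodeStatus; outputs: string[] };

const nodeStates = new Map<string, NodeExecutionState>();
// Assumption: the listener tracks which queue item currently owns the node states.
let activeItemId: number | null = null;

const onItemInProgress = (itemId: number, nodeIds: string[]): void => {
  activeItemId = itemId;
  for (const nodeId of nodeIds) {
    nodeStates.set(nodeId, { status: 'PENDING', outputs: [] });
  }
};

const onInvocationStarted = (itemId: number, nodeId: string): void => {
  if (itemId !== activeItemId) return;
  nodeStates.set(nodeId, { status: 'IN_PROGRESS', outputs: [] });
};

// Called when the deferred getQueueItem response for `itemId` resolves.
const applyReconciliation = (itemId: number, outputsByNode: Record<string, string[]>): boolean => {
  if (itemId !== activeItemId) {
    return false; // A newer run owns the store; drop the stale result.
  }
  for (const [nodeId, outputs] of Object.entries(outputsByNode)) {
    nodeStates.set(nodeId, { status: 'COMPLETED', outputs });
  }
  return true;
};

// The scenario from the finding: item 1's fetch resolves after item 2 has started.
onItemInProgress(1, ['node-a']);
onItemInProgress(2, ['node-a']);
onInvocationStarted(2, 'node-a');
const staleApplied = applyReconciliation(1, { 'node-a': ['item-1-image'] });
```

With the guard in place, `staleApplied` is `false` and `node-a` keeps item 2's IN_PROGRESS state.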
High: invokeai/frontend/web/src/services/events/workflowExecutionState.ts:76-101 now gates invocation_complete through the state machine, returning shouldApply: false when either the per-invocation status is already terminal or the queue status is failed/canceled. invokeai/frontend/web/src/services/events/setEventListeners.tsx:213-224 short-circuits before calling onInvocationComplete, which is the only place that runs addImagesToGallery, clearCanvasWorkflowIntegrationProcessing, and $lastProgressEvent.set(null) (invokeai/frontend/web/src/services/events/onInvocationComplete.tsx:247-275). Scenario A: a workflow item enters failed (one invocation errored) but other sibling invocations had already produced images; if their invocation_complete events arrive after the queue_item_status_changed(failed) event (which is a documented race the previous LRU cache was sized to handle), the new gate drops them and the generated images are never inserted into the gallery, reflected in board totals, or auto-switched to. The previous handler explicitly excluded invocation_complete from the finished-item filter (the removed shouldIgnoreFinishedQueueItemInvocationEvent returned false for invocation_error and never even saw invocation_complete), so the regression is direct. Scenario B: when reconciliation marks invocations completed in the state machine before a late invocation_complete arrives for the same prepared id (e.g., a fast getQueueItem resolution), the per-invocation terminal gate at workflowExecutionState.ts:76-79 drops the late event and the image still never reaches the gallery (reconciliation only calls upsertExecutionState, never the gallery path).
To expose this issue, add a test that fires queue_item_status_changed(item_id=1, status=failed) (or pre-applies completed_session_reconciled) and then invocation_complete for a sibling invocation with an ImageOutput result, and asserts that onInvocationComplete ran (boards/image cache updates, last-progress cleared) instead of being swallowed.
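One way to avoid swallowing the gallery path, sketched under assumptions: gate only the node execution state update and let the gallery side effect run regardless of the state machine's verdict. The names below (`handleInvocationComplete`, `gallery`, `nodeOutputs`) are hypothetical stand-ins, not the real onInvocationComplete or addImagesToGallery, and the gate is passed in as a plain predicate.

```typescript
// Hypothetical names; a sketch of keeping side effects outside the state gate.
type CompleteEvent = { itemId: number; invocationId: string; imageName: string | null };

// Stand-ins for the gallery cache and the node execution state store.
const gallery: string[] = [];
const nodeOutputs = new Map<string, string>();

// Stand-in for the state machine's shouldApply verdict on this event.
type Gate = (event: CompleteEvent) => boolean;

const handleInvocationComplete = (event: CompleteEvent, shouldApplyToNodeState: Gate): void => {
  // Gallery insertion does not depend on the editor's execution state, so it always runs.
  // The includes() check keeps a re-delivered event from adding the image twice.
  if (event.imageName !== null && !gallery.includes(event.imageName)) {
    gallery.push(event.imageName);
  }
  // Only the node execution state update is gated.
  if (event.imageName !== null && shouldApplyToNodeState(event)) {
    nodeOutputs.set(event.invocationId, event.imageName);
  }
};
```

For example, calling `handleInvocationComplete(event, () => false)` for an event with an image still adds the image to `gallery` while leaving `nodeOutputs` untouched.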
Medium: invokeai/frontend/web/src/services/events/setEventListeners.tsx:544 only triggers reconciliation when status === 'completed' && origin === 'workflows'. For partial-success runs that end in failed or canceled, the persisted session.results may still contain completed prepared invocations whose invocation_complete events were dropped by the gate described in the previous finding. Those results are not reconciled at all, so successful sibling outputs disappear from the UI on any race-affected failed run. The branch's stated goal is "workflow execution state reconciliation", yet the failure path it most needs to cover is excluded.
To expose this issue, add a test that simulates a workflow where some invocations succeed and one fails with the invocation_complete events arriving after queue_item_status_changed(failed), and asserts the surviving node outputs are reconciled (either via the same getQueueItem path or via not gating invocation_complete in failed state).
Medium: invokeai/frontend/web/src/services/events/setEventListeners.tsx:435-444 plus invokeai/frontend/web/src/services/events/workflowExecutionState.ts:55-62 only short-circuit queue_item_status_changed when the cached status was terminal AND the new status is non-terminal. Two consecutive terminal events (e.g., a re-delivered completed, or a backend that transitions completed -> failed) both return shouldApply: true. That re-runs the entire handler: another getQueueItem force-refetch, another full tag invalidation set, and another reconciliation pass. A completed -> failed flip can also stomp queueStatus to failed, and a duplicate completed will then flip it back again. The previous finishedQueueItemIds.has(...) check rejected any repeat terminal event outright. This is a behavioral change with no test.
To expose this issue, add a test that fires two queue_item_status_changed events with status: 'completed' for the same item id and asserts the reconciliation dispatch and tag invalidations only happen once.
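A minimal sketch of the stricter behavior described above, where any event for an already-finished item is rejected: the names (`finishedItemIds`, `onQueueItemStatusChanged`, `reconciliationCount`) are hypothetical, a plain `Set` is used in place of the LRU cache to keep the example short, and the counter stands in for the reconciliation dispatch and tag invalidations.

```typescript
// Hypothetical helper; models the "reject any repeat terminal event" behavior described above.
type QueueStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'canceled';

const TERMINAL_STATUSES: ReadonlySet<QueueStatus> = new Set<QueueStatus>([
  'completed',
  'failed',
  'canceled',
]);

// Assumption: a plain Set instead of an LRU cache, so nothing is ever evicted here.
const finishedItemIds = new Set<number>();
// Stand-in for the getQueueItem reconciliation dispatch and tag invalidations.
let reconciliationCount = 0;

const onQueueItemStatusChanged = (itemId: number, status: QueueStatus): boolean => {
  if (finishedItemIds.has(itemId)) {
    // Already terminal: reject repeats and terminal-to-terminal flips alike.
    return false;
  }
  if (TERMINAL_STATUSES.has(status)) {
    finishedItemIds.add(itemId);
    if (status === 'completed') {
      reconciliationCount += 1;
    }
  }
  return true;
};
```

Firing `completed` twice for the same item id then applies the first event, rejects the second, and leaves `reconciliationCount` at 1.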
Medium: invokeai/frontend/web/src/services/events/setEventListeners.tsx:79 sets the cache size at max: 100, the same as the previous finished-id cache, but each entry now stores a full WorkflowExecutionState including a Record<string, InvocationStatus> keyed by every prepared invocation id seen. For workflows with many prepared nodes (iterate/batch expansion) this is materially larger memory-per-entry than a boolean, and there is no scheduled cleanup once a queue item reaches a terminal state. invokeai/frontend/web/src/services/events/setEventListeners.tsx:530 only clears completedInvocationKeysByItemId, not workflowExecutionStates. Long-lived sessions with large batched workflows can grow this map until LRU eviction kicks in, and there is no test demonstrating memory bounds for the new structure.
Medium: invokeai/frontend/web/src/services/events/setEventListeners.tsx:544-566 calls getQueueItem.initiate(..., { subscribe: false }). With no subscription added, RTK Query may evict the entry before the rest of the app sees it; more importantly, no unsubscribe/abort is wired up, so if the user disconnects/reconnects mid-reconciliation the resolved callback will still call transitionWorkflowEvent and upsertExecutionState against the new socket session's state. Combined with the cross-queue-item race in the first finding, this widens the window where stale reconciliation results win.
Low: invokeai/frontend/web/src/services/events/workflowExecutionState.ts:64-74's completed_session_reconciled branch unconditionally forces queueStatus = 'completed' whenever it is applied. The only invocation site is invokeai/frontend/web/src/services/events/setEventListeners.tsx:554-558, immediately after a completed status event, so today this is a no-op redundancy. But the function is exported and named generically; any future caller that fires completed_session_reconciled for a non-terminal state will silently mark the queue completed. The state machine should at least verify state.queueStatus === 'completed' before promoting.
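The suggested precondition could look roughly like the sketch below. The state shape and the function name `applyCompletedSessionReconciled` are assumptions for illustration; the point is only that the branch refuses to run unless the queue status is already completed, rather than promoting it.

```typescript
// Hypothetical state shape and function name; only the guard itself is the point.
type QueueStatus = 'pending' | 'in_progress' | 'completed' | 'failed' | 'canceled';
type InvocationStatus = 'in_progress' | 'completed' | 'failed';
type State = { queueStatus: QueueStatus | null; invocations: Record<string, InvocationStatus> };
type Result = { state: State; shouldApply: boolean };

const applyCompletedSessionReconciled = (state: State, preparedIds: string[]): Result => {
  // Refuse to promote: reconciliation is only valid for an already-completed queue item.
  if (state.queueStatus !== 'completed') {
    return { state, shouldApply: false };
  }
  // Mark every persisted prepared invocation id as completed without mutating the input.
  const invocations: Record<string, InvocationStatus> = { ...state.invocations };
  for (const id of preparedIds) {
    invocations[id] = 'completed';
  }
  return { state: { ...state, invocations }, shouldApply: true };
};
```

A caller that fires the event for an in-progress item gets `shouldApply: false` and an unchanged state instead of a silently completed queue.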
Low: invokeai/frontend/web/src/services/events/nodeExecutionState.ts:87-110's getNodeExecutionStatesFromCompletedSession skips nodes that have no result rows, but does not handle the case where a session has stored a result for a NON-source prepared id (e.g., subsequent edits to source_prepared_mapping). It iterates Object.entries(session.source_prepared_mapping), so any persisted result whose prepared id is missing from the mapping is silently dropped from reconciliation. There is no test for that branch.
To expose this issue, add a test where session.results contains a prepared id not present in source_prepared_mapping and assert it is either reconciled or explicitly ignored as intended.
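A small helper along these lines could make the dropped results visible in such a test. This is a sketch only: `SessionLike` models just the two fields named in the finding, its exact shape (source node id to a list of prepared ids, and prepared id to a result) is an assumption, and `findUnmappedPreparedIds` is a hypothetical name.

```typescript
// Assumed session shape; only the two fields named in the finding are modeled.
type SessionLike = {
  // Assumption: source node id -> prepared invocation ids created from it.
  source_prepared_mapping: Record<string, string[]>;
  // Assumption: prepared invocation id -> persisted result.
  results: Record<string, unknown>;
};

// Returns prepared ids that have a persisted result but no entry in the mapping.
const findUnmappedPreparedIds = (session: SessionLike): string[] => {
  const mapped = new Set<string>();
  for (const preparedIds of Object.values(session.source_prepared_mapping)) {
    for (const id of preparedIds) {
      mapped.add(id);
    }
  }
  return Object.keys(session.results).filter((id) => !mapped.has(id));
};
```

A test can then assert either that this list is empty for real sessions or that reconciliation deliberately ignores whatever it contains.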
Low: invokeai/frontend/web/src/services/events/setEventListeners.tsx:189-210 now gates invocation_error through the state machine. invokeai/frontend/web/src/services/events/workflowExecutionState.ts:89-95 drops the event when state.queueStatus is 'completed' or 'canceled'. The previous code unconditionally ran the error handler, which dispatches canvasWorkflowIntegrationProcessingCompleted() at setEventListeners.tsx:207-209. After this branch, a late invocation_error for a canvas_workflow_integration origin that arrives after a canceled/completed queue status will leave the canvas modal stuck on its loading spinner.
To expose this issue, add a test that fires queue_item_status_changed(canceled) and then invocation_error(origin=canvas_workflow_integration) and asserts canvasWorkflowIntegrationProcessingCompleted is still dispatched.
Low: invokeai/frontend/web/src/services/events/workflowExecutionState.test.ts has good unit coverage of the pure reducer but no integration test exercises invokeai/frontend/web/src/services/events/setEventListeners.tsx's actual socket-handler wiring. None of the regressions in the High/Medium findings above can be caught by the current vitest suite. Frontend repo policy (invokeai/frontend/web/CLAUDE.md) excludes DOM tests, but these are pure state-machine + listener interactions and could be covered with vitest against a mocked Socket plus dispatch.
Open Questions
Is the backend currently emitting queue_item_status_changed events strictly after all invocation_complete events for the same item, or can they be interleaved on the wire? The previous comment at the deleted setEventListeners.tsx:75-78 explicitly asserted that out-of-order delivery does occur, which is why all three High/Medium findings rely on it. If the server now guarantees ordering this branch's regressions narrow significantly; if it does not, every finding above is reachable in production.
The reconciliation fetch uses forceRefetch: true, subscribe: false. Was subscribe: true (with explicit unsubscribe) considered, so the cache result is retained for other consumers and the request can be aborted on disconnect?