Skip to content

fix(knowledge): fix document processing stuck in processing state#3857

Merged
waleedlatif1 merged 5 commits intostagingfrom
waleedlatif1/fix-document-processing
Mar 31, 2026
Merged

fix(knowledge): fix document processing stuck in processing state#3857
waleedlatif1 merged 5 commits intostagingfrom
waleedlatif1/fix-document-processing

Conversation

@waleedlatif1
Copy link
Copy Markdown
Collaborator

Summary

  • Fix documents getting stuck in "Processing" state when processed via Trigger.dev sync (inline fallback was fire-and-forget, worker died before processing completed)
  • Enable Trigger.dev retry by re-throwing errors in processDocumentAsync (maxAttempts: 3 was dead code)
  • Reset stale embeddings and processing state before retrying stuck documents in sync engine
  • Add 60s timeout to embedding API fetch to prevent hanging requests
  • Fix false "Additional scopes required" warning when connector credential belongs to different user
  • Replace OAuthRequiredModal with ConnectCredentialModal in KB connectors for consistent credential naming UX
  • Fix OAuth token refresh error logging (Error objects serialize to {})

Type of Change

  • Bug fix

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link
Copy Markdown

vercel bot commented Mar 31, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Mar 31, 2026 3:30am

Request Review

@cursor
Copy link
Copy Markdown

cursor bot commented Mar 31, 2026

PR Summary

Medium Risk
Touches the knowledge ingestion pipeline (dispatch, retries, embedding calls) and connector credential flows; mistakes could cause reprocessing loops, missed jobs, or incorrect credential/scope prompts.

Overview
Fixes KB document processing getting stuck by awaiting direct processing, propagating failures (rethrow in processDocumentAsync), and improving dispatch behavior to log partial failures and hard-fail only when all dispatches fail.

Improves connector re-sync reliability by resetting stuck documents back to pending and clearing their embeddings/processing counters before re-queuing them, plus adds a 60s timeout to embedding API requests to avoid hung workers.

Cleans up OAuth/credential UX across knowledge connectors and workflows: KB connector flows now use ConnectCredentialModal, credential lists refetch on open/visibility triggers, stale OAuth return context is consumed on close, and “additional scopes required” warnings no longer trigger when the referenced credential can’t be found (e.g., different user). Also improves OAuth refresh error logging to record messages instead of {}.

Written by Cursor Bugbot for commit 897b761. Configure here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 31, 2026

Greptile Summary

This PR addresses a cluster of related reliability and UX bugs in the knowledge-base document processing pipeline and OAuth credential flow. The changes are well-scoped and the root causes are correctly diagnosed.

Core processing fixes (service.ts, sync-engine.ts, embeddings.ts)

  • Converts the inline-fallback path from fire-and-forget (void) to await, so the calling worker can't exit before processing finishes.
  • Re-throws from processDocumentAsync's catch block to make Trigger.dev's maxAttempts: 3 effective instead of dead code.
  • Deletes stale embeddings and resets document state to 'pending' before re-queuing stuck documents — this is a 3-step non-transactional operation, but acceptable since a partial failure leaves documents in 'pending' and the next sync cycle retries them naturally.
  • Adds a per-attempt AbortController with a 60 s timeout around embedding API calls; .finally() correctly clears the timer on both success and abort paths.

OAuth credential UX fixes

  • Replaces OAuthRequiredModal with ConnectCredentialModal in KB connectors for consistent credential naming, and delegates draft-credential creation + return-context writing to the modal itself.
  • Fixes a false "Additional scopes required" warning by guarding getMissingRequiredScopes when the found credential is undefined (credential belongs to a different user).
  • Tightens ConnectCredentialModal props to a discriminated union (workflowId | knowledgeBaseId) following up on the previous review comment.
  • Fixes Error objects serialising to {} in the OAuth token refresh logger.

Key findings:

  • processDocumentsWithQueue only throws when ALL dispatches fail; partial failures are logged but not surfaced to callers — documents silently land in 'failed' until the next stuck-doc retry cycle.
  • preCount: 0 is hardcoded in the copilot-tool OAuth event handler (workflow.tsx); if the user already has credentials for that provider the post-redirect detection could produce a false positive.

Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style/reliability suggestions that don't affect the primary fix paths.

The core document-processing bug (fire-and-forget inline fallback, dead retry code) is correctly fixed. The OAuth false-positive and error-serialization fixes are straightforward and well-tested locally per the PR. The two flagged concerns (partial-failure surfacing and hardcoded preCount) are edge cases that do not block the primary use cases addressed by this PR. No P0/P1 issues found.

apps/sim/lib/knowledge/documents/service.ts (partial-failure handling) and apps/sim/app/workspace/[workspaceId]/w/[workflowId]/workflow.tsx (hardcoded preCount)

Important Files Changed

Filename Overview
apps/sim/lib/knowledge/documents/service.ts Core fix: changes fire-and-forget inline fallback to await, re-throws errors to enable Trigger.dev retries, uses Promise.allSettled with partial-failure logging — only throws when ALL dispatches fail
apps/sim/lib/knowledge/connectors/sync-engine.ts Resets stale embeddings and document state before retrying stuck documents; non-transactional but safe since documents land in 'pending' and are retried on next sync
apps/sim/lib/knowledge/embeddings.ts Adds per-attempt AbortController with 60 s timeout; .finally() correctly clears the timer on both success and failure paths
apps/sim/app/workspace/[workspaceId]/knowledge/[id]/components/connectors-section/connectors-section.tsx Fixes false 'Additional scopes required' warning by guarding getMissingRequiredScopes when credential is not found; splits modal rendering correctly by credentialId presence
apps/sim/app/workspace/[workspaceId]/knowledge/[id]/components/add-connector-modal/add-connector-modal.tsx Replaces OAuthRequiredModal with ConnectCredentialModal, removes manual draft-credential creation, delegates OAuth context writing to the modal; simplification is clean
apps/sim/app/workspace/[workspaceId]/w/[workflowId]/components/panel/components/editor/components/sub-block/components/credential-selector/components/connect-credential-modal.tsx Discriminated union type (workflowId
apps/sim/lib/oauth/oauth.ts Fixes silent Error serialization bug in catch block — { error } serialises Error objects to {}, now correctly extracts .message
apps/sim/app/workspace/[workspaceId]/w/[workflowId]/workflow.tsx Writes OAuth return context before opening the OAuthRequiredModal from copilot events; adds consumeOAuthReturnContext on close; dependency array updated correctly

Sequence Diagram

sequenceDiagram
    participant SE as SyncEngine
    participant SVC as DocumentService
    participant TRG as Trigger.dev
    participant DB as Database
    participant EMB as EmbeddingAPI

    SE->>DB: Find stuck documents (processing > cutoff)
    DB-->>SE: stuckDocs[]
    SE->>DB: DELETE embeddings WHERE documentId IN stuckDocIds
    SE->>DB: UPDATE documents SET status='pending', reset counts
    SE->>SVC: processDocumentsWithQueue(stuckDocs)

    alt Trigger.dev available
        SVC->>TRG: dispatch job (maxAttempts:3)
        TRG-->>SVC: enqueued
        Note over TRG,DB: Trigger.dev worker picks up job
        TRG->>SVC: processDocumentAsync()
        SVC->>DB: SET status='processing'
        SVC->>EMB: fetch embeddings (60s timeout)
        EMB-->>SVC: embeddings[]
        SVC->>DB: SET status='completed'
        Note over TRG: On error: re-throw → Trigger.dev retries up to 3x
    else Inline fallback
        SVC->>SVC: await processDocumentAsync()
        SVC->>DB: SET status='processing'
        SVC->>EMB: fetch embeddings (60s timeout)
        EMB-->>SVC: embeddings[]
        SVC->>DB: SET status='completed'
        Note over SVC: On error: SET status='failed', re-throw → allSettled catches
    end
Loading

Reviews (2): Last reviewed commit: "upgrade turbo" | Re-trigger Greptile

@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/fix-document-processing branch from 2852c32 to b65d041 Compare March 31, 2026 02:40
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/fix-document-processing branch from b65d041 to 12be44d Compare March 31, 2026 02:47
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/fix-document-processing branch from 12be44d to 068fa59 Compare March 31, 2026 02:53
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/fix-document-processing branch from 068fa59 to df314fe Compare March 31, 2026 02:57
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/fix-document-processing branch from df314fe to 991e2b8 Compare March 31, 2026 02:59
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/fix-document-processing branch from 991e2b8 to 6c4ac15 Compare March 31, 2026 03:04
…Copilot OAuth context

- Change Promise.all to Promise.allSettled in processDocumentsWithQueue so
  one failed dispatch doesn't abort the entire batch
- Add writeOAuthReturnContext before showing LazyOAuthRequiredModal from
  Copilot tools so useOAuthReturnForWorkflow can handle the return
- Add consumeOAuthReturnContext on modal close to clean up stale context
Pass empty string instead of undefined for connectorProviderId fallback
to match the hook's string parameter type.
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

…reshTriggers call

Same string narrowing fix as add-connector-modal — pass empty string
fallback for providerId.
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

@waleedlatif1 waleedlatif1 merged commit 7d4dd26 into staging Mar 31, 2026
11 checks passed
@waleedlatif1 waleedlatif1 deleted the waleedlatif1/fix-document-processing branch March 31, 2026 03:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant