Add transient-error retry to SalesforceBulkOperator #64574
Closed
nagasrisai wants to merge 4 commits into apache:main
Introduces max_retries, retry_delay, and transient_error_codes parameters. When max_retries > 0, records that fail with a transient Salesforce error (UNABLE_TO_LOCK_ROW, API_TEMPORARILY_UNAVAILABLE by default) are re-submitted after retry_delay seconds, up to max_retries times. Only the failed records are re-submitted, not the entire payload. Related to apache#64519
While wrapping up #64519, eladkal pointed to `BigQueryInsertJobOperator` as a reference for handling transient errors. This is that follow-up.

The problem is that Salesforce Bulk errors live at the record level, not the API level. When Salesforce is under concurrent write load you can get `UNABLE_TO_LOCK_ROW` back on individual records, but the API call itself succeeded, so you just end up with `success=False` entries in the result list. There's no built-in way to deal with that today without unwrapping the XCom, filtering for those codes, building a new payload, and calling the operator again yourself.

This adds three optional parameters:

- `max_retries` (default `0`): how many times to re-submit records that come back with a transient error. Defaults to zero so existing behaviour is unchanged.
- `retry_delay` (default `5.0`): seconds to wait before each retry.
- `transient_error_codes`: which Salesforce `statusCode` values count as transient. Defaults to `{"UNABLE_TO_LOCK_ROW", "API_TEMPORARILY_UNAVAILABLE"}`.

When retrying, only the failing records are re-submitted, not the whole payload. The retry results slot back into the original positions because Salesforce always returns results in the same order as the input. Permanent errors like `INVALID_FIELD` are not in the default set and are never retried.

Related to #64519
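For reviewers who want to see the retry behaviour in isolation, here is a minimal sketch of the idea described above. It is not the code from this PR. `submit_with_retry`, `is_transient`, and `fake_submit` are hypothetical names, and the per-record result shape (`success` plus a list of `errors` each carrying a `statusCode`) is an assumption about what the Bulk call returns. The parameter names and defaults mirror the ones listed in the description.

```python
import time
from typing import Callable, Iterable

# Default transient codes, as listed in the PR description.
DEFAULT_TRANSIENT_CODES = frozenset({"UNABLE_TO_LOCK_ROW", "API_TEMPORARILY_UNAVAILABLE"})


def is_transient(result: dict, codes: Iterable[str]) -> bool:
    """True if a failed per-record result carries at least one transient statusCode."""
    if result.get("success"):
        return False
    # Assumed result shape: {"success": bool, "errors": [{"statusCode": str, ...}]}
    return any(err.get("statusCode") in codes for err in result.get("errors", []))


def submit_with_retry(
    submit: Callable[[list], list],  # hypothetical stand-in for the Bulk API call
    payload: list,
    max_retries: int = 0,
    retry_delay: float = 5.0,
    transient_error_codes: Iterable[str] = DEFAULT_TRANSIENT_CODES,
) -> list:
    codes = set(transient_error_codes)
    results = list(submit(payload))
    for _ in range(max_retries):
        # Positions in the ORIGINAL payload that failed with a transient code.
        retry_idx = [i for i, r in enumerate(results) if is_transient(r, codes)]
        if not retry_idx:
            break
        time.sleep(retry_delay)
        # Re-submit only the failing records, not the whole payload.
        retried = submit([payload[i] for i in retry_idx])
        # Results come back in input order, so they map back by position.
        for i, r in zip(retry_idx, retried):
            results[i] = r
    return results


# Tiny demo with a fake backend: "locked" fails once with a transient code,
# "bad" always fails with a permanent code.
calls = []


def fake_submit(records: list) -> list:
    calls.append(list(records))
    out = []
    for rec in records:
        if rec["Name"] == "locked" and len(calls) == 1:
            out.append({"success": False, "errors": [{"statusCode": "UNABLE_TO_LOCK_ROW"}]})
        elif rec["Name"] == "bad":
            out.append({"success": False, "errors": [{"statusCode": "INVALID_FIELD"}]})
        else:
            out.append({"success": True, "errors": []})
    return out


payload = [{"Name": "ok"}, {"Name": "locked"}, {"Name": "bad"}]
results = submit_with_retry(fake_submit, payload, max_retries=2, retry_delay=0)
```

In the demo, the second call to `fake_submit` receives only the `locked` record, and the `INVALID_FIELD` failure stays in its original position without being retried. With `max_retries=0` the loop body never runs, which matches the "existing behaviour is unchanged" default.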