…eference

With ReferencePolicy.DYNAMIC, Felix SCR does not hold the component state lock when binding/unbinding the service, eliminating the lock contention that caused "Service factory exception: Could not obtain lock" SEVERE log entries during startup. The volatile keyword is required by the OSGi spec for dynamic references to ensure JMM visibility.

Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenIDM/sessions/16cad345-7cd4-4aa4-8648-2dcad4f8c0d8
Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix deadlock issue in SecurityManager during startup" to "Fix SCR deadlock in SecurityManager by making repoService a dynamic reference" on May 6, 2026
maximthomas approved these changes on May 6, 2026
On startup, `SecurityManager` (`immediate=true`) could deadlock with `RepositoryService` activation: Felix SCR held the component state lock while binding the static mandatory `@Reference`, which contended with `RepositoryService`'s own factory lock in another thread. This surfaced as `SEVERE: FrameworkEvent ERROR … Service factory exception: Could not obtain lock`, failing CI log checks.

Changes

`SecurityManager.java`: change `repoService` from a static to a dynamic reference.

With `ReferencePolicy.DYNAMIC`, Felix SCR no longer holds the component state lock during bind/unbind, eliminating the lock contention. `volatile` is required by the OSGi spec for JMM visibility on dynamic field references. The field is only consumed during `@Activate` (passed by value into provider constructors), so there are no functional regressions.

Original prompt
Problem

During startup, `SecurityManager` (declared `immediate = true`) holds the Apache Felix SCR component state lock while trying to bind its mandatory static `@Reference` to `RepositoryService`. If `RepositoryService` is still being activated in another thread (also holding its own SCR lock), a classic deadlock occurs.

This manifests in the startup log as `SEVERE` `FrameworkEvent ERROR` entries reporting `Service factory exception: Could not obtain lock`.

These `SEVERE` log lines cause the CI workflow (`build.yml`) checks to fail.

Fix

In `openidm-security/src/main/java/org/forgerock/openidm/security/SecurityManager.java`, use `ReferencePolicy` to change the `repoService` field from a static to a dynamic reference and add `volatile`.
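As a rough illustration of the change described in this section (not the PR's actual diff), the sketch below shows a dynamic, `volatile` reference field and a provider that keeps the service instance it was constructed with. To stay compilable without an OSGi dependency, `Reference` and `ReferencePolicy` are declared locally as stand-ins for the real annotation types, and `RepositoryService`, `Provider`, and `SecurityManagerSketch` are hypothetical, simplified types; the real `SecurityManager` will differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class DynamicReferenceSketch {

    // Local stand-ins (assumption) for the component annotation types,
    // declared here only so the sketch compiles on its own.
    enum ReferencePolicy { STATIC, DYNAMIC }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Reference {
        ReferencePolicy policy() default ReferencePolicy.STATIC;
    }

    // Hypothetical, minimal stand-in for the repository service.
    interface RepositoryService {
        String name();
    }

    // Hypothetical provider: captures the service instance at construction time.
    static final class Provider {
        private final RepositoryService repo;

        Provider(RepositoryService repo) {
            this.repo = repo;
        }

        String repoName() {
            return repo.name();
        }
    }

    static final class SecurityManagerSketch {
        // Dynamic reference: the field may be rebound from another thread
        // while the component is active, hence volatile.
        @Reference(policy = ReferencePolicy.DYNAMIC)
        private volatile RepositoryService repoService;

        private Provider provider;

        void bind(RepositoryService service) {
            repoService = service;
        }

        void unbind(RepositoryService service) {
            // Clear the field only if it still points at the departing service.
            if (repoService == service) {
                repoService = null;
            }
        }

        void activate() {
            // The current field value is copied into the provider here.
            provider = new Provider(repoService);
        }

        Provider provider() {
            return provider;
        }
    }

    public static void main(String[] args) {
        SecurityManagerSketch sm = new SecurityManagerSketch();
        RepositoryService first = () -> "repo-A";
        sm.bind(first);
        sm.activate();
        sm.unbind(first); // the field is cleared, the provider is unaffected
        System.out.println(sm.provider().repoName()); // prints "repo-A"
    }
}
```

The provider holds its own copy of the reference, so clearing or rebinding the field afterwards does not change what the provider sees.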
Why this works

With `ReferencePolicy.DYNAMIC`, Felix SCR does not hold the component state lock when binding/unbinding the service, eliminating the lock contention deadlock. The `volatile` keyword is required by the OSGi spec for dynamic references to ensure JMM visibility.

The `repoService` reference is passed by value into the resource provider constructors inside `@Activate`, so those providers retain a valid reference regardless of later bind/unbind cycles on the field. There are no functional regressions.

The following is the prior conversation context from the user's chat exploration (may be truncated):
User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.
Assistant: Sure! Let me fetch the logs for that job right away.
Assistant:

## Root Cause

The job fails at the "Start OpenIDM" step (lines 139–140) in the `ui-smoke-tests` job (running with `samples/workflow`). OpenIDM does start successfully ("OpenIDM ready" is found), but then the strict log checks fail because of non-fatal OSGi `FrameworkEvent ERROR` entries.

These are known transient OSGi lifecycle events related to service lock contention in `org.forgerock.openidm.security` during startup. They don't prevent OpenIDM from becoming ready and are not true application errors.

The same strict checks also exist in the `build-maven` "Test on Unix" step at lines 58–59.

## Fix

Filter out the known benign `FrameworkEvent ERROR` lines from both checks in both steps:

- `ui-smoke-tests`: "Start OpenIDM" step (lines 139–140)
- `build-maven`: "Test on Unix" step (lines 58–59)

## How it works

- `grep "SEVERE" openidm/logs/openidm0.log.0 | grep -v "FrameworkEvent ERROR"`: filters out the known benign OSGi framework event lines, leaving only genuine SEVERE messages.
- `! (pipeline)`: the exit code of the pipeline is the last command's exit code (`grep -v`). If no lines remain (all were benign), `grep -v` exits `1`, negated to `0` ✅. If real SEVERE lines remain, `grep -v` exits `0`, negated to `1` ❌.

This preserves the intent of the checks (catching real errors) while ignoring the known non-fatal OSGi `FrameworkEvent ERROR` noise from the security and util bundles.

User:

```
2026-05-06 15:19:20:880 WARNI Bundle: org.openidentityplatform.openidm.security [149] bundle org.openidentityplatform.openidm.security:7.0.3.SNAPSHOT (149)[org.forgerock.openidm.security(17)] : Could not get service from ref [org.forgerock.openidm.repo.RepositoryService, org.forgerock.json.resource....
```
This pull request was created from Copilot chat.