Creating draft PR with potential fix by jcastro-dotcms · Pull Request #36092 · dotCMS/core

jcastro-dotcms · 2026-06-09T21:57:45Z

Proposed Changes

change 1
change 2

Checklist

Tests
Translations
Security Implications Contemplated (add notes if applicable)

Additional Info

** any additional useful context or info **

Screenshots

Original	Updated
original screenshot	updated screenshot

github-actions · 2026-06-09T21:57:54Z

❌ Issue Linking Required

This PR could not be linked to an issue. All PRs must be linked to an issue for tracking purposes.

How to fix this:

Option 1: Add keyword to PR body (Recommended - auto-removes this comment)
Edit this PR description and add one of these lines:

This PR fixes #123 or Fixes: #123
This PR closes #123 or Closes: #123
This PR resolves #123 or Resolves: #123
Other supported keywords: fix, fixed, close, closed, resolve, resolved
Option 2: Link via GitHub UI (Note: won't clear the failed check)

Go to the PR → Development section (right sidebar)
Click "Link issue" and select an existing issue
Push a new commit or re-run the workflow to clear the failed check
Option 3: Use branch naming
Create a new branch with one of these patterns:

123-feature-description (number at start)
issue-123-feature-description (issue-number at start)
feature-issue-123 (issue-number anywhere)

Why is this required?

Issue linking ensures proper tracking, documentation, and helps maintain project history. It connects your code changes to the problem they solve.---

This comment was automatically generated by the issue linking workflow

claude · 2026-06-09T21:58:40Z

Claude finished @jcastro-dotcms's task in 3m 37s —— View job

Code Review

Gathering context
Reviewing changed files
Posting findings

Summary

The PR rewrites the hot path for saving related content: instead of fetching and hydrating every related contentlet one-by-one, it reads raw identifiers from the tree table, deletes/inserts in bulk, and only re-indexes parents whose relationships actually changed. The approach is sound, but there are several issues worth addressing before merge.

Issues

1. SQL injection via dynamic column names — Semgrep findings are real (`TreeFactory.java`)

getRelatedIds and getNextTreeOrder build SQL by string-concatenating caller-controlled column names (selectColumn, whereColumn) directly into the statement:

// TreeFactory.java ~L459, ~L510
.setSQL("SELECT " + selectColumn + " FROM tree WHERE " + whereColumn + " = ? ...")
.setSQL("... WHERE " + whereColumn + " = ? AND relation_type = ?")

Both private methods are only ever called with the literals "parent" or "child", so there is no current injection vector. However, the pattern is fragile: there is no allowlist guard, and any future caller passing unsanitised input would be vulnerable. The Semgrep rule is correct to flag this.

Fix: add an allowlist assert/throw at the top of each private method, or collapse the two branches into two named methods that hard-code the column name. Fix this →

2. Non-atomic delete-then-insert in `insertTrees` (`TreeFactory.java` ~L610)

new DotConnect().executeBatch("DELETE FROM tree WHERE ...", deleteParams);
new DotConnect().executeBatch("INSERT INTO tree ...", insertParams);

executeBatch is @WrapInTransaction, meaning each call gets its own transaction. Between the DELETE batch completing and the INSERT batch starting there is a window where the rows do not exist. A concurrent read of tree during that window (e.g. getRelatedIdsByParentAndRelationType or a search index refresh triggered from another thread) will see no related content. Under concurrent saves of the same parent or a high-traffic node this can cause:

transient empty relationship reads;
redundant re-index of unchanged parents (because the index sees an empty relationship set);
a lost INSERT if the two executeBatch calls end up on different connections.

Both calls should be wrapped in a single explicit transaction (or the delete should be done inline before each insert row using an INSERT ... ON CONFLICT DO UPDATE / MERGE if the DB supports it).

3. `isExemptFromPermissionFiltering` may produce wrong results for `null` users in the admin role check (`ESContentletAPIImpl.java`)

private boolean isExemptFromPermissionFiltering(final User user) throws DotDataException {
    if (user == null) {
        return false;
    }
    ...
    return APILocator.systemUser().getUserId().equals(user.getUserId())
            || roleAPI.doesUserHaveRole(user, roleAPI.loadCMSAdminRole());
}

Returning false for a null user sends the call into deleteReadableRelatedContent, which then calls permissionAPI.filterCollection(…, user). The PermissionAPI contract for null user is not universally guaranteed — some implementations throw, some treat it as anonymous. This was not the behaviour before the refactor (the old code always hydrated content first), so any existing callers that pass a null user would hit a different code path now. At minimum, document the null-user contract; consider throwing IllegalArgumentException so the contract is explicit.

4. `deleteAllRelatedContent` re-indexes removed parents but skips the `invalidateRelatedContentCache` for the non-relationship-field case (`ESContentletAPIImpl.java` ~L4660)

In deleteReadableRelatedContent, invalidateRelatedContentCache(contentlet, …) is called on the hydrated contentlet for every related item. In the new fast path (deleteAllRelatedContent), the cache-invalidation loop uses the identifier-only overload, which correctly bails out early when !relationship.isRelationshipField(). This is fine for that guard, but non-relationship-field cache eviction that the old code handled via the contentlet-level invalidateRelatedContentCache is now skipped entirely in the fast path. If the Contentlet version of the method does anything beyond clearing RelationshipCache, that logic is silently dropped.

Recommendation: audit both overloads to confirm they're functionally equivalent for all relationship types before merging.

5. Scroll fallback in `dependenciesLeftToReindex` is triggered incorrectly (`ESMappingAPIImpl.java` ~L1268)

if (oldSearchResults instanceof PaginatedArrayList
        && ((PaginatedArrayList<?>) oldSearchResults).getTotalResults() > oldSearchResults.size()) {
    // fall back to a scroll by passing MAX_LIMIT + 1
    oldSearchResults = contentletAPI.searchIndex(relatedQuery,
            ESContentletAPIImpl.MAX_LIMIT + 1, 0, null, ...);
}

MAX_LIMIT + 1 (10001) triggers the scroll path internally (limit > MAX_LIMIT), but the method name says "falling back to a scroll search". The scroll API returns a List<ContentletSearch> that is NOT a PaginatedArrayList, so the check above would not re-trigger. However, nothing prevents a second page of results — if a single relationship type has >10 000 indexed documents the scroll approach is correct, but the fallback fires on every call that returns exactly MAX_LIMIT results even when the total is only marginally above the limit. This is an inefficiency (unconditional scroll for edge cases). More importantly, searchIndex with a limit above MAX_LIMIT is not documented and depends on internal branching that could change. Consider using the dedicated scroll API directly.

6. `relateContent` tree position starts at `getNextTreeOrderByParent…` but `treePosition++` increments it without being used in the `child` branch (`ESContentletAPIImpl.java` ~L5160)

int treePosition = related.isHasParent()
        ? TreeFactory.getNextTreeOrderByParentAndRelationType(...)
        : TreeFactory.getNextTreeOrderByChildAndRelationType(...);
...
for (final Contentlet relatedContent : related.getRecords()) {
    if (child) {
        // positionInParent is used here — from contentParents, not treePosition
        newTree = new Tree(..., positionInParent);
    } else {
        newTree = new Tree(..., treePosition);
    }
    positionInParent++;
    if (uniqueRelationshipSet.add(newTree)) {
        treesToInsert.add(newTree);
        treePosition++;   // <-- incremented for both branches
    }
}

When child == true, treePosition is fetched from the DB but only incremented — it is never assigned to any Tree. The DB read is wasted. Minor, but worth cleaning up.

7. No transaction boundary around the delete-then-insert pair in `relateContent` (`ESContentletAPIImpl.java`)

deleteRelatedContent and the subsequent TreeFactory.insertTrees call (in relateContent) run in two separate calls on the same thread but without an explicit wrapping transaction. Each uses @WrapInTransaction / @CloseDBIfOpened independently. If the insertTrees fails after deleteRelatedContent has committed, the relationship is permanently lost. The outer checkin flow may wrap this in a transaction, but relateContent is also a public API called standalone — this should be guarded.

…ent-performance-improvements

semgrep-code-dotcms-test · 2026-06-11T15:56:59Z

Semgrep found 10 CUSTOM_INJECTION-2 findings:

dotCMS/src/main/java/com/dotmarketing/factories/TreeFactory.java
- L413-414 - Triage
- L413 - Triage
- L413 - Triage
- L413 - Triage
- L466-468 - Triage
- L466-468 - Triage
- L466-467 - Triage
- L466-467 - Triage
- L466 - Triage
- L466 - Triage

The method identified is susceptible to injection. The input should be validated and properly
escaped.

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

…ent-performance-improvements

semgrep-code-dotcms-test · 2026-06-11T21:57:21Z

+                .setSQL("SELECT COALESCE(MAX(tree_order), 0) + 1 AS next_position FROM tree"
+                        + " WHERE " + whereColumn + " = ? AND relation_type = ?")


Semgrep identified a blocking 🔴 issue in your code:
The method identified is susceptible to injection. The input should be validated and properly
escaped.

Why this might be safe to ignore:

This query concatenates only the column name, and in this method that value is constrained by internal callers to the fixed literals "parent" or "child" while user-controlled values are still passed as parameters. The rule matched a generic string-built SQL pattern, but in this context there is no realistic injection path to the database.

To resolve this comment:

🔧 No guidance has been designated for this issue. Fix according to your organization's approved methods.

💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

/fp <comment> for false positive

/ar <comment> for acceptable risk

/other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by CUSTOM_INJECTION-2.

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

_{You can view more details about this finding in the Semgrep AppSec Platform.}

semgrep-code-dotcms-test · 2026-06-11T21:57:21Z

+					"SELECT * FROM contentlet_version_info WHERE identifier IN ("
+							+ DotConnect.createParametersPlaceholder(chunk.size()) + ")");


Semgrep identified a blocking 🔴 issue in your code:
The method identified is susceptible to injection. The input should be validated and properly
escaped.

Why this might be safe to ignore:

This query builds only the number of parameter placeholders for an IN clause, and the actual identifier values are passed separately with addParam, so user input is not concatenated into SQL. The rule matched a SQL string concatenation pattern, but in this context it does not create a meaningful injection risk.

To resolve this comment:

🔧 No guidance has been designated for this issue. Fix according to your organization's approved methods.

💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

/fp <comment> for false positive

/ar <comment> for acceptable risk

/other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by CUSTOM_INJECTION-2.

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

_{You can view more details about this finding in the Semgrep AppSec Platform.}

semgrep-code-dotcms-test · 2026-06-11T21:57:22Z

Semgrep found 3 CUSTOM_INJECTION-2 findings:

dotCMS/src/main/java/com/dotmarketing/factories/TreeFactory.java
- L510-511 - Triage
- L510-511 - Triage
dotCMS/src/main/java/com/dotmarketing/business/VersionableFactoryImpl.java
- L476-477 - Triage

The method identified is susceptible to injection. The input should be validated and properly
escaped.

If this is a critical or high severity finding, please also link this issue in the #security channel in Slack.

jcastro-dotcms · 2026-06-11T22:09:01Z

Regarding Semgrep's analysis, all three are false positives:

Finding analysis

Findings 842408061 + 842408062 — TreeFactory L510–511 (both point at the same two lines, in getNextTreeOrder):

  .setSQL("SELECT COALESCE(MAX(tree_order), 0) + 1 AS next_position FROM tree"
          + " WHERE " + whereColumn + " = ? AND relation_type = ?")

The rule fires on the string concatenation, but the only concatenated variable is whereColumn, which is a private method parameter whose only two callers pass the hardcoded literals "parent" and "child". Everything user-controllable (id, relationType) goes through addParam bound parameters. No injection path exists.

Finding 842408064 — VersionableFactoryImpl L476–477:

  .setSQL("SELECT * FROM contentlet_version_info WHERE identifier IN ("
          + DotConnect.createParametersPlaceholder(chunk.size()) + ")");

The concatenated value is the output of createParametersPlaceholder(int) — a generated ?,?,?... string derived from a list size, never from content. All identifiers are bound via addParam. This is also the established house pattern: ContentletIndexAPIImpl.SELECT_CONTENTLET_VERSION_INFO and ESContentFactoryImpl build chunked IN clauses with the exact same helper.

Creating draft PR with potential fix

e2ecab5

github-project-automation Bot added this to dotCMS - Product Planning Jun 9, 2026

github-actions Bot added the Area : Backend PR changes Java/Maven backend code label Jun 9, 2026

claude Bot added the AI: Safe To Rollback label Jun 9, 2026

jcastro-dotcms added 2 commits June 11, 2026 09:51

Adding more potential changes

80cd1c9

Merge branch 'main' into support-36004-save-content-with-related-cont…

ec92fa6

…ent-performance-improvements

jcastro-dotcms marked this pull request as ready for review June 11, 2026 15:52

github-actions Bot removed the AI: Safe To Rollback label Jun 11, 2026

claude Bot added the AI: Safe To Rollback label Jun 11, 2026

Merge branch 'main' into support-36004-save-content-with-related-cont…

6b8cb38

…ent-performance-improvements

github-actions Bot removed the AI: Safe To Rollback label Jun 11, 2026

claude Bot added the AI: Safe To Rollback label Jun 11, 2026

Merge branch 'main' into support-36004-save-content-with-related-cont…

9144940

…ent-performance-improvements

github-actions Bot removed the AI: Safe To Rollback label Jun 11, 2026

claude Bot added the AI: Safe To Rollback label Jun 11, 2026

Adding more potential changes

f1946e1

github-actions Bot removed the AI: Safe To Rollback label Jun 11, 2026

claude Bot added the AI: Safe To Rollback label Jun 11, 2026

semgrep-code-dotcms-test Bot reviewed Jun 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Creating draft PR with potential fix#36092

Creating draft PR with potential fix#36092
jcastro-dotcms wants to merge 6 commits into
mainfrom
support-36004-save-content-with-related-content-performance-improvements

jcastro-dotcms commented Jun 9, 2026

Uh oh!

github-actions Bot commented Jun 9, 2026

Uh oh!

claude Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

semgrep-code-dotcms-test Bot commented Jun 11, 2026

Uh oh!

semgrep-code-dotcms-test Bot Jun 11, 2026

Uh oh!

semgrep-code-dotcms-test Bot Jun 11, 2026

Uh oh!

semgrep-code-dotcms-test Bot commented Jun 11, 2026

Uh oh!

jcastro-dotcms commented Jun 11, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		.setSQL("SELECT COALESCE(MAX(tree_order), 0) + 1 AS next_position FROM tree"
		+ " WHERE " + whereColumn + " = ? AND relation_type = ?")

		"SELECT * FROM contentlet_version_info WHERE identifier IN ("
		+ DotConnect.createParametersPlaceholder(chunk.size()) + ")");

Conversation

jcastro-dotcms commented Jun 9, 2026

Proposed Changes

Checklist

Additional Info

Screenshots

Uh oh!

github-actions Bot commented Jun 9, 2026

❌ Issue Linking Required

How to fix this:

Why is this required?

Uh oh!

claude Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

Summary

Issues

1. SQL injection via dynamic column names — Semgrep findings are real (TreeFactory.java)

2. Non-atomic delete-then-insert in insertTrees (TreeFactory.java ~L610)

3. isExemptFromPermissionFiltering may produce wrong results for null users in the admin role check (ESContentletAPIImpl.java)

4. deleteAllRelatedContent re-indexes removed parents but skips the invalidateRelatedContentCache for the non-relationship-field case (ESContentletAPIImpl.java ~L4660)

5. Scroll fallback in dependenciesLeftToReindex is triggered incorrectly (ESMappingAPIImpl.java ~L1268)

6. relateContent tree position starts at getNextTreeOrderByParent… but treePosition++ increments it without being used in the child branch (ESContentletAPIImpl.java ~L5160)

7. No transaction boundary around the delete-then-insert pair in relateContent (ESContentletAPIImpl.java)

Uh oh!

semgrep-code-dotcms-test Bot commented Jun 11, 2026

Uh oh!

semgrep-code-dotcms-test Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

semgrep-code-dotcms-test Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

semgrep-code-dotcms-test Bot commented Jun 11, 2026

Uh oh!

jcastro-dotcms commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Finding analysis

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 9, 2026 •

edited

Loading

1. SQL injection via dynamic column names — Semgrep findings are real (`TreeFactory.java`)

2. Non-atomic delete-then-insert in `insertTrees` (`TreeFactory.java` ~L610)

3. `isExemptFromPermissionFiltering` may produce wrong results for `null` users in the admin role check (`ESContentletAPIImpl.java`)

4. `deleteAllRelatedContent` re-indexes removed parents but skips the `invalidateRelatedContentCache` for the non-relationship-field case (`ESContentletAPIImpl.java` ~L4660)

5. Scroll fallback in `dependenciesLeftToReindex` is triggered incorrectly (`ESMappingAPIImpl.java` ~L1268)

6. `relateContent` tree position starts at `getNextTreeOrderByParent…` but `treePosition++` increments it without being used in the `child` branch (`ESContentletAPIImpl.java` ~L5160)

7. No transaction boundary around the delete-then-insert pair in `relateContent` (`ESContentletAPIImpl.java`)

jcastro-dotcms commented Jun 11, 2026 •

edited

Loading