Skip to content

fix: validate /api/set-workspace path — reject traversal, symlink escape, non-existent (closes #15)#22

Merged
bradjin8 merged 7 commits intomasterfrom
fix/set-workspace-path-validation-15
May 8, 2026
Merged

fix: validate /api/set-workspace path — reject traversal, symlink escape, non-existent (closes #15)#22
bradjin8 merged 7 commits intomasterfrom
fix/set-workspace-path-validation-15

Conversation

@timon0305
Copy link
Copy Markdown
Collaborator

@timon0305 timon0305 commented May 7, 2026

Problem

POST /api/set-workspace accepted any string in body.path, did ~/ expansion, and stored the result in a module-global consumed by every subsequent file lookup. A client (anyone who reaches the dashboard, including a hostile JS payload — see related XSS issue #11) could post { "path": "/etc" } or { "path": "/home/user/.ssh" } and the app would then serve those files as Cursor data. Symlink escape (e.g. /tmp/cursor-link → /) bypassed any naive startswith check.

Change

  • utils/path_validation.py (new) — validate_workspace_path(raw) with WorkspacePathError. os.path.realpath() collapses .. AND resolves symlinks in one step; rejects non-string/empty, non-existent, non-directory, no-Cursor-markers; returns the canonical real path so the override is stored canonically, not as the caller sent it. Lives outside api/ so tests don't pull Flask into scope (existing test-suite convention).
  • api/config_api.py — handler calls the validator. Returns 400 { error: "<reason>" } on rejection (was silently 200); on accept returns 200 { success: true, path: "<canonical>" }.
  • tests/test_workspace_path_validation.py (new) — 11 regression cases.

Test plan

Unit (python3 -m unittest discover tests): 148/148 OK (was 137; +11 new).

Live HTTP smoke against the running app, all 9 cases:

Case Payload Expected Got
A empty body 400 path is required
B empty path string 400 path is required
C non-existent path 400 path does not exist
D file /etc/passwd 400 path is not a directory
E dir /etc (no markers) 400 no Cursor workspaceStorage
F traversal …/storage/../.. lands at /tmp 400 (.. collapsed, then markers reject)
G symlink → / 400 (realpath unmasked it)
H real fake workspace (state.vscdb in subdir) 200 { path: "<canonical>", success: true }
I symlink → real workspace 200 { path: "<realpath>" } (canonicalised — input was the symlink path)

Case I is the meaningful one: even though the operator sent …/good-link, the override stored is the realpath. Every subsequent read goes through canonical, not the link — a future swap of the link target can't redirect reads.

Closes #15.

Summary by CodeRabbit

  • Improvements
    • Stricter workspace path validation: rejects invalid/traversal inputs and non-workspace locations, resolves canonical real paths, and returns canonical path plus workspace count on validation.
    • Setting a workspace now validates input and returns the canonical path on success; malformed/non-object JSON yields clear 400 errors.
  • Tests
    • Added regression tests covering validation, canonicalization, symlink behavior, and API responses.
  • Documentation
    • README clarified workspace-path validation and operator escape hatch.
  • Bug Fixes
    • UI shows server-provided validation error messages.

timon0305 added 2 commits May 5, 2026 16:33
…ape, non-existent (closes #15)

POST /api/set-workspace accepted any string in body.path, ran tilde
expansion, and stored it in a module-global override consumed by every
subsequent file lookup. Anyone reaching the endpoint (including a
hostile JS payload — see XSS issue #11) could repoint the app at /etc,
/, ~/.ssh, or any other directory; the dashboard would then serve files
from there as if they were Cursor chat data. Symlink-based escape
(e.g. /tmp/cursor-link → /) bypassed any naive startswith-style check.

New utils/path_validation.py with validate_workspace_path(raw):

- Expands ~/.
- os.path.realpath() — collapses `..` AND resolves symlinks in one
  step. Both classes of escape become equivalent to the canonical real
  path on disk; downstream marker check then operates on the truth.
- Rejects with WorkspacePathError if the path doesn't exist, isn't a
  directory, or contains no immediate subdirectory with a state.vscdb
  marker (the same heuristic /api/validate-path already uses).
- Returns the canonical real path so the override is stored in
  canonical form, not whatever the caller sent.

api/config_api.py:set_workspace now calls the validator and returns
400 { error: "<reason>" } on rejection (was silently 200), and stores
the canonical path on success: 200 { success: true, path: "<canonical>" }.

Lives outside api/ so the test suite can import without Flask in scope
(tests/test_cli_args.py convention).

Tests: tests/test_workspace_path_validation.py — 11 cases covering:
  - happy path with marker file present
  - canonical path returned (`..` collapsed)
  - empty / whitespace / non-string rejected
  - non-existent / file-not-directory rejected
  - directory without Cursor markers rejected
  - traversal lands outside workspace → rejected on markers
  - symlink → / rejected on markers (POSIX-only)
  - symlink → real workspace canonicalised + accepted (POSIX-only)

Full suite: 148/148 OK (was 137; +11 new). Live HTTP smoke against the
running app verified all 9 documented behaviours (200 with canonical
path on accept; 400 with the documented reason on each reject).
…n PR #16)

Two CodeRabbit follow-ups:

1. api/config_api.py:set_workspace — when get_json(silent=True) returned
   a non-dict (JSON array, string, number, null), the truthy fallback
   `or {}` was bypassed and `body.get("path", "")` raised AttributeError,
   which the outer Exception handler mis-reported as a 500 server error.
   Added isinstance(body, dict) guard that returns
   400 { error: "request body must be a JSON object" } directly. Diverged
   from CodeRabbit's literal patch in one way: they had it raise
   WorkspacePathError("path is required"), but the actual problem here is
   a malformed body shape — the error message should match the cause so
   the client can fix it.

2. tests/test_workspace_path_validation.py — the traversal test escaped
   to /tmp itself, which is shared and could be flipped by any other
   test / process creating <something>/state.vscdb there. Pinned the
   escape target to an isolated root inside self.tmp.

Also added 4 API-layer regression tests (TestSetWorkspaceApi) using
Flask test_client: JSON array / string / number return 400 (not 500),
plus a sanity end-to-end with a valid {path} body returning 200 and the
canonical realpath.

Full suite: 152/152 OK (was 148; +4 new API tests).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 50c706cf-cd7a-4bd6-a98a-d2ad42b0a74b

📥 Commits

Reviewing files that changed from the base of the PR and between 2aa40b7 and f3ac9b9.

📒 Files selected for processing (2)
  • api/config_api.py
  • tests/test_workspace_path_validation.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • api/config_api.py
  • tests/test_workspace_path_validation.py

📝 Walkthrough

Walkthrough

Adds a realpath-based workspace path validator with Cursor marker detection, integrates it into POST /api/set-workspace and POST /api/validate-path to return canonical paths and map validation failures to HTTP 400, updates docs/templates, and adds tests covering validator and endpoint behavior.

Changes

Workspace Path Validation & API Integration

Layer / File(s) Summary
Path Validation Contract
utils/path_validation.py
Introduces module docs and WorkspacePathError used to surface validation failures as client-facing errors.
Path Validation Logic
utils/path_validation.py
Implements _has_cursor_workspace_markers() and validate_workspace_path() to expand ~, canonicalize via os.path.realpath, verify existence and directory-ness, enforce Cursor workspace markers, and return the canonical real path or raise WorkspacePathError.
API Endpoint Integration
api/config_api.py
Imports validator and exception; /api/validate-path and /api/set-workspace now call validate_workspace_path(), /api/set-workspace requires JSON object bodies (400 on non-dict), maps WorkspacePathError to 400 (or valid:false for validate-path), and returns {"success": True, "path": <canonical>} or {"valid": True, "workspaceCount": N, "path": <canonical>} on success.
Tests & Coverage
tests/test_workspace_path_validation.py
Adds helper to create Cursor-like workspace markers; unit tests cover valid inputs, canonicalization, traversal and symlink scenarios; API tests verify 400 for non-dict bodies and 200 with success + canonical path for valid dict payloads; validate-path API returns valid plus workspaceCount and canonical path.
Docs & Template
README.md, utils/workspace_path.py, templates/config.html
README documents API validation behavior; resolve_workspace_path() docstring expanded; config.html now displays data.error on validation failure when present.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related issues

Poem

🐰 I hopped down a path both tangled and neat,

Realpath smoothed the turns beneath my feet.
Markers lit burrows where workspaces hide,
Canonical maps keep the rabbit inside.
Hooray—safe paths and carrots for guide.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 19.23% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately reflects the main change: adding validation to /api/set-workspace to reject traversal, symlinks, and non-existent paths, with reference to the closed issue #15.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/set-workspace-path-validation-15

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_workspace_path_validation.py (1)

161-187: ⚡ Quick win

Add explicit JSON null body regression case.

You already pin array/string/number non-dict bodies; adding null closes the same class completely and protects the shape-guard contract.

Suggested test
+    def test_non_dict_json_null_returns_400(self):
+        resp = self.client.post(
+            "/api/set-workspace",
+            data="null",
+            content_type="application/json",
+        )
+        self.assertEqual(resp.status_code, 400)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_workspace_path_validation.py` around lines 161 - 187, Add a
regression test to assert that a JSON null body returns 400 like other non-dict
bodies: create a test function (e.g., test_non_dict_json_null_returns_400) in
tests/test_workspace_path_validation.py mirroring the existing tests
(test_non_dict_json_array_returns_400_not_500,
test_non_dict_json_string_returns_400, test_non_dict_json_number_returns_400)
that posts data="null" with content_type="application/json" to
"/api/set-workspace" and asserts resp.status_code == 400 (and optionally that
resp.get_json() contains an "error" key to match the array case).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/config_api.py`:
- Around line 93-100: The call to set_workspace_path_override(canonical) can
raise and currently falls outside the validate_workspace_path() try/except,
causing an unstructured framework 500; wrap
set_workspace_path_override(canonical) in its own try/except that catches
Exception (or specific persistence errors if available) and returns a JSON error
response (e.g., jsonify({"error": "Failed to persist workspace path", "details":
str(e)}), 500), while optionally logging the exception; keep existing handling
for WorkspacePathError from validate_workspace_path() intact.

---

Nitpick comments:
In `@tests/test_workspace_path_validation.py`:
- Around line 161-187: Add a regression test to assert that a JSON null body
returns 400 like other non-dict bodies: create a test function (e.g.,
test_non_dict_json_null_returns_400) in tests/test_workspace_path_validation.py
mirroring the existing tests (test_non_dict_json_array_returns_400_not_500,
test_non_dict_json_string_returns_400, test_non_dict_json_number_returns_400)
that posts data="null" with content_type="application/json" to
"/api/set-workspace" and asserts resp.status_code == 400 (and optionally that
resp.get_json() contains an "error" key to match the array case).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7a3c5e99-e2df-4ce1-a18a-18556c9b9dfe

📥 Commits

Reviewing files that changed from the base of the PR and between f8b3cb3 and a63297f.

📒 Files selected for processing (3)
  • api/config_api.py
  • tests/test_workspace_path_validation.py
  • utils/path_validation.py

Comment thread api/config_api.py
@timon0305 timon0305 self-assigned this May 7, 2026
@timon0305 timon0305 requested a review from clean6378-max-it May 7, 2026 22:02
…deRabbit on PR #22)

validate_workspace_path() failures were already returning structured
JSON, but set_workspace_path_override(canonical) was outside the try
block — a persistence failure would have surfaced as Flask's HTML 500
page instead of {"error": "..."}. Wraps the call in its own try/except
so the response shape stays structured for any consumer parsing JSON.
@timon0305 timon0305 requested review from bradjin8 and removed request for clean6378-max-it May 8, 2026 15:00
… API tests

Two test-hygiene follow-ups from a structured re-review of PR #22; no
production code changes.

1. tests/test_workspace_path_validation.py — test_returns_canonical_path_collapsing_dotdot
   The canonical-path assertion was a substring check against `..` on
   the raw realpath string. That would spuriously fail if the OS-supplied
   tempdir name ever embedded `..` in a folder component — substring
   presence is the wrong primitive for the invariant we actually care
   about, which is "no `..` *segment* in the canonical path." Switched
   to `Path(result).parts`, which handles `\` vs `/` natively and
   asserts on path components.

2. tests/test_workspace_path_validation.py — TestSetWorkspaceApi.setUp
   The 200-path test mutates the module-global _workspace_path_override
   in utils/workspace_path.py via the API, and the tempdir it then
   points at is rmtree'd by the existing cleanup. Without an explicit
   reset, the global stays pinned at a now-deleted path across tests.
   Added addCleanup(set_workspace_path_override, None) so any future
   sibling test inspecting the override sees a clean None.

Full suite: 152/152 OK (skipped=2 POSIX-only symlink tests on Windows).
No behaviour change; ReadLints clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Collaborator

@bradjin8 bradjin8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utils/workspace_path.py 61–67

WORKSPACE_PATH env is still applied in resolve_workspace_path() without the same checks as the API. That is a normal “trusted operator” escape hatch but differs from XSS-driven POST abuse in #15. Optional: document in README, or apply the same validator when the env is read (stricter).

utils/path_validation.py 30–48

Marker heuristic only inspects immediate children for state.vscdb. Matches /api/validate-path and the issue; if Cursor ever nests storage differently, both endpoints could false-negative. Acceptable as stated.

Comment thread tests/test_workspace_path_validation.py
Comment thread utils/path_validation.py
Comment thread api/config_api.py
bradjin8 and others added 2 commits May 8, 2026 16:07
- POST /api/validate-path now uses the same realpath + marker checks as
  set-workspace; returns canonical path and structured errors on failure.
- README documents WORKSPACE_PATH as trusted-operator tilde expansion only.
- Config page shows server error text when validation fails.
- Docstrings + symlink-test CI note; TOCTOU comment after realpath.

Co-authored-by: Cursor <cursoragent@cursor.com>
@bradjin8 bradjin8 requested a review from wpak-ai May 8, 2026 20:12
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
templates/config.html (1)

47-61: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't treat /api/set-workspace failures as a successful save.

Both save flows ignore the /api/set-workspace response. Because fetch() does not throw on HTTP 400/500, the page can still update localStorage, show success, and redirect even when the server rejected or failed to persist the override. Please gate the success path on response.ok/success, and store the canonical path returned by the API instead of the raw input.

Proposed fix for the manual save flow
     if (data.valid) {
-      localStorage.setItem('workspacePath', path);
-      await fetch('/api/set-workspace', {
+      const saveRes = await fetch('/api/set-workspace', {
         method: 'POST',
         headers: { 'Content-Type': 'application/json' },
         body: JSON.stringify({ path })
       });
+      const saveData = await saveRes.json();
+      if (!saveRes.ok || !saveData.success) {
+        statusEl.className = 'alert alert-danger';
+        statusEl.textContent = saveData.error || 'Failed to save configuration.';
+        statusEl.style.display = 'block';
+        return;
+      }
+      localStorage.setItem('workspacePath', saveData.path || data.path || path);
       statusEl.className = 'alert alert-success';
       statusEl.textContent = `Found ${data.workspaceCount} workspaces in the specified location`;
       statusEl.style.display = 'block';
       setTimeout(() => { window.location.href = '/'; }, 1000);

Apply the same response handling in the auto-detect branch at Lines 47-61.

Also applies to: 86-99

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@templates/config.html` around lines 47 - 61, The auto-detect branch currently
ignores the POST to '/api/set-workspace' and treats the operation as successful;
change the flow in the block handling 'detected' so you await and inspect the
response (call it e.g. setRes) from fetch('/api/set-workspace'), check setRes.ok
(or parse JSON and check a success flag), and only then update
localStorage.setItem('workspacePath', ...) / show success / redirect;
additionally parse the canonical path from the set-workspace response body and
store that canonical path (not the raw 'detected' input). Apply the same
response.ok + canonical-path handling to the manual save flow that also calls
'/api/set-workspace'.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/config_api.py`:
- Around line 55-60: The endpoint reads the JSON body with
request.get_json(silent=True) but doesn't guard against non-object JSON (e.g., a
string), so body.get("path", "") can raise; mirror the check used in
set_workspace(): after body = request.get_json(silent=True) or {}, verify
isinstance(body, dict) and if not return jsonify({"valid": False, "error":
"invalid JSON body", "workspaceCount": 0}); then use body.get("path", "") and
proceed to call validate_workspace_path(raw) and catch WorkspacePathError as
before to preserve the existing error response shape.

---

Outside diff comments:
In `@templates/config.html`:
- Around line 47-61: The auto-detect branch currently ignores the POST to
'/api/set-workspace' and treats the operation as successful; change the flow in
the block handling 'detected' so you await and inspect the response (call it
e.g. setRes) from fetch('/api/set-workspace'), check setRes.ok (or parse JSON
and check a success flag), and only then update
localStorage.setItem('workspacePath', ...) / show success / redirect;
additionally parse the canonical path from the set-workspace response body and
store that canonical path (not the raw 'detected' input). Apply the same
response.ok + canonical-path handling to the manual save flow that also calls
'/api/set-workspace'.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b77ae31b-364d-4422-acbe-56b463fc8095

📥 Commits

Reviewing files that changed from the base of the PR and between b7ba220 and 2aa40b7.

📒 Files selected for processing (6)
  • README.md
  • api/config_api.py
  • templates/config.html
  • tests/test_workspace_path_validation.py
  • utils/path_validation.py
  • utils/workspace_path.py
✅ Files skipped from review due to trivial changes (2)
  • README.md
  • utils/workspace_path.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_workspace_path_validation.py

Comment thread api/config_api.py
Avoid AttributeError on truthy JSON scalars/arrays (same class as PR #16).
Return valid:false + workspaceCount:0 to match validation error shape.

Co-authored-by: Cursor <cursoragent@cursor.com>
@bradjin8 bradjin8 merged commit 95d3140 into master May 8, 2026
6 checks passed
@bradjin8 bradjin8 deleted the fix/set-workspace-path-validation-15 branch May 8, 2026 20:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Security: validate /api/set-workspace path (reject traversal, symlink escape, non-existent paths)

3 participants