fix: prevent path traversal in knowledge source file paths (#4547)#4548
Open
devin-ai-integration[bot] wants to merge 1 commit intomainfrom
Open
fix: prevent path traversal in knowledge source file paths (#4547)#4548devin-ai-integration[bot] wants to merge 1 commit intomainfrom
devin-ai-integration[bot] wants to merge 1 commit intomainfrom
Conversation
Add path boundary validation in convert_to_path() to ensure resolved paths stay within the knowledge directory. This prevents attackers from using '../' sequences to read arbitrary files outside the knowledge dir. Fixes: - BaseFileKnowledgeSource.convert_to_path() - ExcelKnowledgeSource.convert_to_path() - CrewDoclingSource.validate_content() inline path construction Added tests covering path traversal rejection and valid path acceptance. Co-Authored-By: João <joao@crewai.com>
Contributor
Author
|
Prompt hidden (unlisted session) |
Contributor
Author
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
fix: prevent path traversal in knowledge source file paths (#4547)
Summary
Adds path boundary validation to prevent path traversal attacks in knowledge source file path handling. Previously,
convert_to_path()simply concatenated user-provided strings withKNOWLEDGE_DIRECTORY(e.g.,Path("knowledge/" + path)), allowing../sequences to escape the knowledge directory and read arbitrary files on the system.The fix uses
Path.resolve()+Path.is_relative_to()to validate that the final resolved path remains within the knowledge directory, raisingValueErrorif it doesn't. Applied to three locations:BaseFileKnowledgeSource.convert_to_path()ExcelKnowledgeSource.convert_to_path()CrewDoclingSource.validate_content()(inline path construction)Review & Testing Checklist for Human
Pathobject (not a string) is passed, the traversal check is skipped entirely — only string inputs are validated. Verify this is acceptable for your threat model, sincePath("../../etc/passwd")would not be caught.KNOWLEDGE_DIRECTORYis"knowledge"(relative).Path("knowledge").resolve()resolves against CWD. Verify this behaves correctly in all deployment contexts (e.g., different working directories, symlinks).Pathobject, so it exercises the non-validated branch. Consider whether a test with a valid string path (e.g.,"test_valid.txt"whereknowledge/test_valid.txtexists) would better cover the happy path through the new validation logic.doclingpackage — thecrew_docling_source.pyfix may not be tested in CI.Suggested manual test: instantiate a
TextFileKnowledgeSource(file_paths=["../../../etc/passwd"])and confirm it raisesValueError.Notes