Skip to content

fix: skip binary files in content search#3006

Open
dgageot wants to merge 1 commit into
docker:mainfrom
dgageot:board/9d4587b8d0900d39
Open

fix: skip binary files in content search#3006
dgageot wants to merge 1 commit into
docker:mainfrom
dgageot:board/9d4587b8d0900d39

Conversation

@dgageot
Copy link
Copy Markdown
Member

@dgageot dgageot commented Jun 5, 2026

The search_files_content tool now sniffs file headers to detect and skip binary files before attempting to read them. This prevents slow reads, unnecessary memory usage, and potential encoding errors when encountering binary data during search operations.

The implementation detects ELF binaries, PNG images, PDF files, and other common binary formats by checking magic bytes at the start of files. Tests cover all major detection patterns and verify end-to-end behavior with mixed text and binary files in a directory.

@dgageot dgageot requested a review from a team as a code owner June 5, 2026 09:32
Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟢 APPROVE

This PR correctly implements binary file detection for search_files_content by sniffing the first 512 bytes of each file. The implementation is clean, consistent with existing patterns, and well-tested.

Approach: Checks for null bytes first (fast path for ELF, ZIP, etc.), then falls back to http.DetectContentType for magic-byte-based detection (PDF, PNG, and other formats). Silent skip on detection errors is consistent with how the rest of the walk handles I/O failures.

Test coverage: Unit tests cover empty input, plain text, UTF-8 text, ELF, PNG, PDF, and an end-to-end integration test with mixed text/binary files in a directory — adequate for the scope of change.

No bugs found in the changed code.

@aheritier aheritier added area/tools For features/issues/fixes related to the usage of built-in and MCP tools kind/fix PR fixes a bug (maps to fix: commit prefix) labels Jun 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/tools For features/issues/fixes related to the usage of built-in and MCP tools kind/fix PR fixes a bug (maps to fix: commit prefix)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants