fix: clean up temp files and fix image encoding path mismatch#10
Conversation
|
✅ @khasinski — Superconductor finished — View implementation Standing by for instructions. |
There was a problem hiding this comment.
Pull request overview
This PR attempts to fix two issues: temp file cleanup for uploaded files and image path encoding mismatch. However, the implementation introduces a critical bug that will prevent the agent from functioning properly.
Changes:
- Added
ensureblocks to all FileAnalyzerAgent action methods to callcleanup_temp_file - Added
cleanup_temp_filemethod that deletes files in/tmp/upload_*pattern - Added
encode_imagedispatch method that triesfile_pathfirst, then falls back toattachment_slug - Renamed
encode_imagemethod toencode_image_to_base64to avoid naming collision - Added two new tests for the cleanup functionality
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
app/agents/file_analyzer_agent.rb |
Added ensure blocks for temp file cleanup, new encode_image dispatch method, renamed encode_image_to_base64, added cleanup_temp_file method |
test/agents/file_analyzer_agent_test.rb |
Added tests for cleanup_temp_file behavior and updated test for renamed method |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
app/agents/file_analyzer_agent.rb
Outdated
| # Clean up temp files created by the controller for file uploads | ||
| def cleanup_temp_file | ||
| return unless @file_path | ||
| return unless @file_path.to_s.include?("/tmp/upload_") |
There was a problem hiding this comment.
The path matching logic uses a string literal "/tmp/upload_" which assumes a Unix-style path separator. On Windows, paths use backslashes. Consider using File.join or checking for both separators, or use a more robust pattern like checking if the filename contains "upload_" regardless of the full path structure.
For example: @file_path.to_s.include?("#{File::SEPARATOR}tmp#{File::SEPARATOR}upload_") or check the basename: File.basename(@file_path.to_s).start_with?("upload_")
| return unless @file_path.to_s.include?("/tmp/upload_") | |
| return unless @file_path.to_s.include?("#{File::SEPARATOR}tmp#{File::SEPARATOR}upload_") |
| test "cleanup_temp_file deletes upload temp files" do | ||
| temp_file = Rails.root.join("tmp", "upload_abc123_test.jpg") | ||
| File.write(temp_file, "fake image") | ||
|
|
||
| agent = FileAnalyzerAgent.new | ||
| agent.instance_variable_set(:@file_path, temp_file.to_s) | ||
| agent.send(:cleanup_temp_file) | ||
|
|
||
| assert_not File.exist?(temp_file) | ||
| end | ||
|
|
||
| test "cleanup_temp_file does not delete non-upload files" do | ||
| temp_file = @temp_dir.join("important.txt") | ||
| File.write(temp_file, "keep me") | ||
|
|
||
| agent = FileAnalyzerAgent.new | ||
| agent.instance_variable_set(:@file_path, temp_file.to_s) | ||
| agent.send(:cleanup_temp_file) | ||
|
|
||
| assert File.exist?(temp_file) | ||
| end |
There was a problem hiding this comment.
The test suite passes but doesn't catch the critical timing bug with ensure blocks and async jobs. The current tests only verify the controller returns a stream_id but don't test that the agent can actually access the files when the async job runs. Consider adding integration tests that verify the full workflow including async job execution, or run the test suite with jobs executed inline to catch these issues.
| def analyze_pdf | ||
| @file_path = params[:file_path] | ||
| # Read PDF content (would need pdf-reader gem) | ||
| @content = extract_pdf_content(@file_path) if @file_path | ||
|
|
||
| setup_context_and_prompt | ||
| ensure | ||
| cleanup_temp_file | ||
| end |
There was a problem hiding this comment.
The ensure block will execute immediately when the action method returns, but before the async job (triggered by generate_later) runs. This means temp files will be deleted before the agent can process them. The file will be deleted before analyze_pdf can read it in the async job.
Consider moving the cleanup logic to happen after the agent processing completes, or make the cleanup conditional on synchronous execution. One approach is to add an after_generate callback or perform cleanup in a job completion callback.
| def analyze_image | ||
| Rails.logger.info "[FileAnalyzer] analyze_image called with params: #{params.inspect}" | ||
|
|
||
| @file_path = params[:file_path] | ||
| @description_detail = params[:description_detail] || "medium" | ||
|
|
||
| # Encode image before calling prompt so it can be passed as an option | ||
| encode_image_from_attachment | ||
|
|
||
| encode_image | ||
| setup_context_and_prompt_with_image | ||
| ensure | ||
| cleanup_temp_file | ||
| end |
There was a problem hiding this comment.
The ensure block will execute immediately when the action method returns, but before the async job (triggered by generate_later) runs. This means temp files will be deleted before the agent can process them. The file will be deleted before analyze_image can encode and process it in the async job.
Consider moving the cleanup logic to happen after the agent processing completes, or make the cleanup conditional on synchronous execution. One approach is to add an after_generate callback or perform cleanup in a job completion callback.
| def extract_image_text | ||
| Rails.logger.info "[FileAnalyzer] extract_image_text called with params: #{params.inspect}" | ||
|
|
||
| @file_path = params[:file_path] | ||
| @extraction_focus = params[:extraction_focus] || "all" | ||
|
|
||
| # Encode image before calling prompt | ||
| encode_image_from_attachment | ||
|
|
||
| encode_image | ||
| setup_context_and_prompt_with_image | ||
| ensure | ||
| cleanup_temp_file | ||
| end |
There was a problem hiding this comment.
The ensure block will execute immediately when the action method returns, but before the async job (triggered by generate_later) runs. This means temp files will be deleted before the agent can process them. The file will be deleted before extract_image_text can encode and process it in the async job.
Consider moving the cleanup logic to happen after the agent processing completes, or make the cleanup conditional on synchronous execution. One approach is to add an after_generate callback or perform cleanup in a job completion callback.
| def extract_text | ||
| @file_path = params[:file_path] if params[:file_path] | ||
| @content = extract_file_content(@file_path) if @file_path | ||
|
|
||
| setup_context_and_prompt | ||
| ensure | ||
| cleanup_temp_file | ||
| end |
There was a problem hiding this comment.
The ensure block will execute immediately when the action method returns, but before the async job (triggered by generate_later) runs. This means temp files will be deleted before the agent can process them. The file will be deleted before extract_text can read it in the async job.
Consider moving the cleanup logic to happen after the agent processing completes, or make the cleanup conditional on synchronous execution. One approach is to add an after_generate callback or perform cleanup in a job completion callback.
| def summarize_document | ||
| @file_path = params[:file_path] | ||
| @content = extract_file_content(@file_path) if @file_path | ||
|
|
||
| setup_context_and_prompt | ||
| ensure | ||
| cleanup_temp_file | ||
| end |
There was a problem hiding this comment.
The ensure block will execute immediately when the action method returns, but before the async job (triggered by generate_later) runs. This means temp files will be deleted before the agent can process them. The file will be deleted before summarize_document can read it in the async job.
Consider moving the cleanup logic to happen after the agent processing completes, or make the cleanup conditional on synchronous execution. One approach is to add an after_generate callback or perform cleanup in a job completion callback.
Two issues fixed: 1. Temp file leak: analyze_file and image_caption controller actions saved uploads to tmp/ but only cleaned up on error. Files accumulated indefinitely on success. Agent actions now clean up via ensure blocks. 2. Image path mismatch: controller passed file_path for uploaded images but analyze_image/extract_image_text only tried attachment_slug (via encode_image_from_attachment), silently failing to encode. Added smart dispatch in encode_image that uses file_path when available, falling back to attachment_slug for Active Storage.
- Use File.join for tmp/upload_ path matching instead of hardcoded separator - Add comment explaining why ensure blocks are safe with generate_later (entire action runs inside background job, not before enqueue)
1b398c1 to
5b6a1ea
Compare
Summary
Two issues fixed:
analyze_fileandimage_captionsaved uploads totmp/but only cleaned up on error. Files accumulated on success. Agent actions now clean up viaensureblocks, scoped toupload_*files only.file_pathfor uploaded images butanalyze_image/extract_image_textonly triedattachment_slug(viaencode_image_from_attachment), silently failing. Added smart dispatch that uses file path when available, falling back to attachment slug for Active Storage.Test plan