Skip to content

Conversation

@gahr
Copy link

@gahr gahr commented Sep 22, 2025

What is the purpose of the change

StreamTests uses the test_str.bin file. This is problematic in our setup because we build and test 32-bit and 64-bit versions in parallel. When the 32-bit and 64-bit StreamTest happen to run concurrently, one ends up deleting the file that the other is trying to read.

My solution here is to use a temporary file instead of the hard-coded test_str.bin.

This fixes AVRO-4185.

Verifying this change

This change is already covered by existing tests, such as StreamTests.

Documentation

  • Does this pull request introduce a new feature? (no)

StreamTests uses the test_str.bin file. This is problematic in our setup because we build and test 32-bit and 64-bit versions in parallel. When the 32-bit and 64-bit StreamTest happen to run concurrently, one ends up deleting the file that the other is trying to read.

My solution here is to use a temporary file instead of the hard-coded test_str.bin.
@github-actions github-actions bot added the C++ Pull Requests for C++ binding label Sep 22, 2025
@martin-g martin-g requested a review from Copilot September 22, 2025 21:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a concurrency issue in StreamTests where parallel execution of 32-bit and 64-bit test builds would interfere with each other by sharing the same hard-coded test file. The solution replaces the fixed filename with temporary files generated at runtime.

  • Replaces hard-coded "test_str.bin" filename with dynamically generated temporary files
  • Refactors FileRemover struct to FileGuard for better encapsulation of temporary file management
  • Ensures each test run uses its own unique temporary file to prevent race conditions

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

explicit FileRemover(const char *fn) : file(fn) {}
~FileRemover() { std::filesystem::remove(file); }
struct FileGuard {
const std::filesystem::path path{ std::tmpnam(nullptr) };
Copy link

Copilot AI Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using std::tmpnam(nullptr) is unsafe as it can cause race conditions and security vulnerabilities. Consider using std::filesystem::temp_directory_path() with a unique filename generator or platform-specific secure temporary file functions instead.

Copilot uses AI. Check for mistakes.
Copy link
Author

@gahr gahr Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these concerns apply to this test code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't tmpnam deprecated ?

Avro uses C++17, so we can use something like the following instead:

#include <random>
...
std::filesystem::path path = std::filesystem::temp_directory_path() / 
    ("avro_test_" + std::to_string(std::random_device{}()));

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't tmpnam deprecated ?

I can't find references to its deprecation.

I don't think using std::random_device is a good idea, as it can be implemented using a PRNG, i.e., two tests running at the same time would produce the same number.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2010/hs3e7355(v=vs.100)
In the example code there is // Note: tmpnam is deprecated; consider using tmpnam_s instead

Also in the build logs we can see:

2025-10-23T08:00:00.3420284Z [ 68%] Linking CXX executable StreamTests
2025-10-23T08:00:00.4955456Z /usr/bin/ld: CMakeFiles/StreamTests.dir/test/StreamTests.cc.o: in function `avro::stream::FileGuard::FileGuard()':
2025-10-23T08:00:00.4957287Z /home/runner/work/avro/avro/lang/c++/test/StreamTests.cc:138:(.text._ZN4avro6stream9FileGuardC2Ev[_ZN4avro6stream9FileGuardC5Ev]+0x2a): warning: the use of `tmpnam' is dangerous, better use `mkstemp'
2025-10-23T08:00:00.6147334Z [ 68%] Built target StreamTests

https://productionresultssa3.blob.core.windows.net/actions-results/abfc1b86-599c-48bd-a249-acda8f187020/workflow-job-run-670f0204-095a-5c23-a7d3-1d13fabc5e69/logs/job/job-logs.txt?rsct=text%2Fplain&se=2025-10-23T08%3A12%3A13Z&sig=9G%2FOy2gomdBnbYY%2FXTGGg%2Fza%2BjuKKXvqJVAL4xpemfg%3D&ske=2025-10-23T17%3A45%3A39Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2025-10-23T05%3A45%3A39Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2025-11-05&sp=r&spr=https&sr=b&st=2025-10-23T08%3A02%3A08Z&sv=2025-11-05

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a more standard way, sorry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since random_device() is meant to be used for seeding a PNRG and not directly for generating random numbers, we can do something like:

static std::random_device rdev;
static std::mt19937 rng(rdev());

struct FileGuard {
    const std::filesystem::path path{ rng() };
    ...
 }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't address my concern: two processes can still generate the same sequence.

Copy link
Contributor

@thiru-mg thiru-mg Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember the code is in test. The conflict happens when two people run the run the exactly at the same sub-microsecond. How likely is it? Even if happens, who cares? Just rerun the test. It is anyway better than what we have now.
If we are still paranoid, let's xor the value of random_device() with pid.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the spec that say that random_device must be implemented in such a way? In the standard I read it can generate the same sequence each time.

explicit FileRemover(const char *fn) : file(fn) {}
~FileRemover() { std::filesystem::remove(file); }
struct FileGuard {
const std::filesystem::path path{ std::tmpnam(nullptr) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't tmpnam deprecated ?

Avro uses C++17, so we can use something like the following instead:

#include <random>
...
std::filesystem::path path = std::filesystem::temp_directory_path() / 
    ("avro_test_" + std::to_string(std::random_device{}()));

~FileRemover() { std::filesystem::remove(file); }
struct FileGuard {
const std::filesystem::path path{ std::tmpnam(nullptr) };
~FileGuard() { std::filesystem::remove(path); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we care about failures in remove() here ?
Or since it is a test we don't care ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not care, and I wouldn't know how to recover from such an error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not recover, but log it.
But this is minor. We can ignore it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we fail to remove(), it is not catastrophic, but it would point to a bug somewhere. At lease we should report the error.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bug, or some temporary condition.

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ Pull Requests for C++ binding

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants