-
Notifications
You must be signed in to change notification settings - Fork 1.7k
AVRO-4185: [C++] Use a temp file in StreamTests #3496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
StreamTests uses the test_str.bin file. This is problematic in our setup because we build and test 32-bit and 64-bit versions in parallel. When the 32-bit and 64-bit StreamTest happen to run concurrently, one ends up deleting the file that the other is trying to read. My solution here is to use a temporary file instead of the hard-coded test_str.bin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a concurrency issue in StreamTests where parallel execution of 32-bit and 64-bit test builds would interfere with each other by sharing the same hard-coded test file. The solution replaces the fixed filename with temporary files generated at runtime.
- Replaces hard-coded "test_str.bin" filename with dynamically generated temporary files
- Refactors FileRemover struct to FileGuard for better encapsulation of temporary file management
- Ensures each test run uses its own unique temporary file to prevent race conditions
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| explicit FileRemover(const char *fn) : file(fn) {} | ||
| ~FileRemover() { std::filesystem::remove(file); } | ||
| struct FileGuard { | ||
| const std::filesystem::path path{ std::tmpnam(nullptr) }; |
Copilot
AI
Sep 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using std::tmpnam(nullptr) is unsafe as it can cause race conditions and security vulnerabilities. Consider using std::filesystem::temp_directory_path() with a unique filename generator or platform-specific secure temporary file functions instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these concerns apply to this test code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't tmpnam deprecated ?
Avro uses C++17, so we can use something like the following instead:
#include <random>
...
std::filesystem::path path = std::filesystem::temp_directory_path() /
("avro_test_" + std::to_string(std::random_device{}()));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't
tmpnamdeprecated ?
I can't find references to its deprecation.
I don't think using std::random_device is a good idea, as it can be implemented using a PRNG, i.e., two tests running at the same time would produce the same number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2010/hs3e7355(v=vs.100)
In the example code there is // Note: tmpnam is deprecated; consider using tmpnam_s instead
Also in the build logs we can see:
2025-10-23T08:00:00.3420284Z [ 68%] Linking CXX executable StreamTests
2025-10-23T08:00:00.4955456Z /usr/bin/ld: CMakeFiles/StreamTests.dir/test/StreamTests.cc.o: in function `avro::stream::FileGuard::FileGuard()':
2025-10-23T08:00:00.4957287Z /home/runner/work/avro/avro/lang/c++/test/StreamTests.cc:138:(.text._ZN4avro6stream9FileGuardC2Ev[_ZN4avro6stream9FileGuardC5Ev]+0x2a): warning: the use of `tmpnam' is dangerous, better use `mkstemp'
2025-10-23T08:00:00.6147334Z [ 68%] Built target StreamTests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a more standard way, sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since random_device() is meant to be used for seeding a PNRG and not directly for generating random numbers, we can do something like:
static std::random_device rdev;
static std::mt19937 rng(rdev());
struct FileGuard {
const std::filesystem::path path{ rng() };
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't address my concern: two processes can still generate the same sequence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember the code is in test. The conflict happens when two people run the run the exactly at the same sub-microsecond. How likely is it? Even if happens, who cares? Just rerun the test. It is anyway better than what we have now.
If we are still paranoid, let's xor the value of random_device() with pid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the spec that say that random_device must be implemented in such a way? In the standard I read it can generate the same sequence each time.
| explicit FileRemover(const char *fn) : file(fn) {} | ||
| ~FileRemover() { std::filesystem::remove(file); } | ||
| struct FileGuard { | ||
| const std::filesystem::path path{ std::tmpnam(nullptr) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't tmpnam deprecated ?
Avro uses C++17, so we can use something like the following instead:
#include <random>
...
std::filesystem::path path = std::filesystem::temp_directory_path() /
("avro_test_" + std::to_string(std::random_device{}()));
| ~FileRemover() { std::filesystem::remove(file); } | ||
| struct FileGuard { | ||
| const std::filesystem::path path{ std::tmpnam(nullptr) }; | ||
| ~FileGuard() { std::filesystem::remove(path); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we care about failures in remove() here ?
Or since it is a test we don't care ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not care, and I wouldn't know how to recover from such an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not recover, but log it.
But this is minor. We can ignore it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we fail to remove(), it is not catastrophic, but it would point to a bug somewhere. At lease we should report the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bug, or some temporary condition.
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
What is the purpose of the change
StreamTests uses the test_str.bin file. This is problematic in our setup because we build and test 32-bit and 64-bit versions in parallel. When the 32-bit and 64-bit StreamTest happen to run concurrently, one ends up deleting the file that the other is trying to read.
My solution here is to use a temporary file instead of the hard-coded test_str.bin.
This fixes AVRO-4185.
Verifying this change
This change is already covered by existing tests, such as StreamTests.
Documentation