-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Avro-4193: [c] segfault cannot handle empty byte array #3527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Please rebase! |
When avro-c tries to encode an empty byte array it will avro_malloc(0) which on some architectures will return NULL. Make sure this is not interpreted as an error or or dereferenced causing a segfault.
ea27931 to
89fb9db
Compare
|
@martin-g sorry for that, rebased on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a segmentation fault that occurs when avro-c attempts to encode empty byte arrays. The issue arises because avro_malloc(0) can return NULL on some architectures, which was incorrectly treated as an allocation error and caused null pointer dereferences.
Key changes:
- Updated NULL checks to distinguish between allocation failures and valid zero-size allocations
- Added null pointer guard in the test allocator's deallocation path
- Extended test coverage to validate empty byte array handling
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| lang/c/src/datum.c | Modified NULL checks in avro_bytes() and avro_bytes_set() to only treat NULL as an error when size is non-zero |
| lang/c/src/value-json.c | Updated NULL checks in encode_utf8_bytes() and avro_value_to_json_t() to handle zero-length allocations correctly |
| lang/c/tests/test_avro_data.c | Added null pointer guard in test allocator and introduced test_empty_bytes() to verify empty byte array handling |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| static int test_empty_bytes(void) | ||
| { | ||
| char bytes[] = { }; |
Copilot
AI
Oct 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty array initializer { } is a GNU extension and not standard C. Consider using char *bytes = NULL; or char bytes[1]; for better portability across compilers.
| char bytes[] = { }; | |
| char *bytes = NULL; |
| avro_datum_decref(datum); | ||
| avro_datum_decref(expected_datum); | ||
|
|
||
| avro_schema_decref(writer_schema); |
Copilot
AI
Oct 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double free: writer_schema is already decremented at line 230, so this second decrement at line 243 will cause a use-after-free error.
| avro_schema_decref(writer_schema); |
| { | ||
| char *bytes_copy = (char *) avro_malloc(size); | ||
| if (!bytes_copy) { | ||
| if (!bytes_copy && size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen now if size is 0?
It looks to me that no matter what is the value of bytes_copy it will never enter the if and it will call memcpy(bytes_copy, bytes, 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the there must be a check for size == 0 as a first step in this function that returns early and avoids any mallocs and memcpys.
| } | ||
|
|
||
| json_t *result = json_stringn_nocheck((const char *) encoded, encoded_size); | ||
| json_t *result = json_stringn_nocheck((const char *) encoded ? encoded : "", encoded_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| json_t *result = json_stringn_nocheck((const char *) encoded ? encoded : "", encoded_size); | |
| json_t *result = json_stringn_nocheck((const char *) encoded ? encoded : "", encoded ? encoded_size : 0); |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a similar code for FIXED at https://github.com/steven-aerts/avro/blob/89fb9db03478b793be7188c5683c6560c2eff115/lang/c/src/value-json.c#L244
Does it need the same improvements ?
| { | ||
| char *bytes_copy = (char *) avro_malloc(size); | ||
| if (!bytes_copy) { | ||
| if (!bytes_copy && size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the there must be a check for size == 0 as a first step in this function that returns early and avoids any mallocs and memcpys.
| int rval; | ||
| char *bytes_copy = (char *) avro_malloc(size); | ||
| if (!bytes_copy) { | ||
| if (!bytes_copy && size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
What is the purpose of the change
When avro-c tries to encode an empty byte array it will avro_malloc(0)
which on some architectures will return NULL. Make sure this is not
interpreted as an error or or dereferenced causing a segfault.
Verifying this change
Extended unit tests to validate this case
Documentation