Skip to content

Conversation

@rroesch1
Copy link
Contributor

@rroesch1 rroesch1 commented Aug 5, 2024

What is the purpose of the change

JsonEncoder uses special string values to represent NaN, Infinity and -Infinity values for float and double values, but JsonDecoder does not accept these string values. This change adds support for these special values to JsonDecoder.

Verifying this change

This change added tests and can be verified as follows:

  • The change adds an unit tests which verifies all 6 special cases:
    • NaN for float fields
    • NaN for double fields
    • Infinity for float fields
    • Infinity for double fields
    • -Infinity for float fields
    • -Infinity for double fields
  • These values are defined in Jackson.
    StdDeserializer

Documentation

  • This PR doesn't introduce a new feature but implements a behavior which is not documented at all.
  • The Avro spec doesn't mention how these special cases are handled in JSON (RFC 8259 does not define numeric literals for NaN, Infinity and -Infinity)
  • JsonEncoder/JsonDecoder resemble the behavior of Jackson

@github-actions github-actions bot added the Java Pull Requests for Java binding label Aug 5, 2024
@martin-g
Copy link
Member

martin-g commented Aug 6, 2024

Similar PR for the Rust SDK: #3051
Note: it also supports inf next to infinity.

@martin-g
Copy link
Member

martin-g commented Aug 6, 2024

Note: it also supports inf next to infinity.

I see that the implementation here also supports Inf. The difference with the Rust PR is that here the values are case-sensitive.

@rroesch1
Copy link
Contributor Author

rroesch1 commented Aug 6, 2024

@martin-g

@rroesch1
Copy link
Contributor Author

rroesch1 commented Aug 6, 2024

fyi: looks like the C++ implementations also uses the literals "NaN", "Infinity", and "-Infinity" (see: JsonCodec.cc )

Copy link
Member

@xxchan xxchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Avro spec doesn't mention how these special cases are handled in JSON

Indeed. I think we should fix that AVRO-4025.
Perhaps we could add some tests to demonstrate the current behavior (the de-facto standard) of schema / JsonEncoder, and them document the behavior into the spec. I will try to have a PR for that soon.

@xxchan
Copy link
Member

xxchan commented Aug 6, 2024

Please note this PR is just about the JSON encoding for records, not about the encoding for schemas

Indeed. But "Except for unions, the JSON encoding is the same as is used to encode field default values (in schema)", so I think they are essentially the same (and SHOULD have the same behavior).

https://avro.apache.org/docs/1.11.1/specification/_print/#json-encoding

@zcsizmadia
Copy link
Contributor

zcsizmadia commented Aug 7, 2024

The same PR for C# #3070. We should specify the accepted strings. NaN, Infinity, -Infinity makes sense, however if if we need to support INF and -INF and/or case insensitivity, we should add all the supprted strings to the doc. I added tests for all of those cases.

martin-g added a commit that referenced this pull request Aug 7, 2024
…Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
martin-g added a commit that referenced this pull request Aug 7, 2024
…Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
martin-g added a commit that referenced this pull request Aug 7, 2024
…Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
martin-g added a commit that referenced this pull request Aug 7, 2024
…Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
martin-g added a commit that referenced this pull request Aug 7, 2024
…g numbers with Java and C# (#3073)

* AVRO-4024: [Rust] Improve the error messages when parsing unknown String into Value::Float/Double

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

* AVRO-4024: [Rust] Accept only "Nan", "INF", "-INF", "Infinity" and "-Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

---------

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
martin-g added a commit that referenced this pull request Aug 7, 2024
…g numbers with Java and C# (#3073)

* AVRO-4024: [Rust] Improve the error messages when parsing unknown String into Value::Float/Double

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

* AVRO-4024: [Rust] Accept only "Nan", "INF", "-INF", "Infinity" and "-Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

---------

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
(cherry picked from commit dffc13a)
martin-g added a commit to apache/avro-rs that referenced this pull request Sep 23, 2024
…g numbers with Java and C# (#3073)

* AVRO-4024: [Rust] Improve the error messages when parsing unknown String into Value::Float/Double

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

* AVRO-4024: [Rust] Accept only "Nan", "INF", "-INF", "Infinity" and "-Infinity"

This is what the Java SDK (via Jackson library) supports (apache/avro#3066).
This is what the C# SDK also would support (apache/avro#3070)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

---------

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@legal-s
Copy link

legal-s commented Nov 22, 2024

Hello,

Do you know when this pull request will be merged because it would fix a data deserialization problem with an entity containing the NaN value that i have?

Thanks

@gbecan
Copy link

gbecan commented Mar 20, 2025

We are also impacted by this issue and would love to see this PR merged.
The Rust and C# PRs are merged. Is there a reason why the Java one is stuck?
Happy to help if something is missing 🙏

opwvhk pushed a commit to opwvhk/avro that referenced this pull request Sep 5, 2025
…g numbers with Java and C# (apache#3073)

* AVRO-4024: [Rust] Improve the error messages when parsing unknown String into Value::Float/Double

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

* AVRO-4024: [Rust] Accept only "Nan", "INF", "-INF", "Infinity" and "-Infinity"

This is what the Java SDK (via Jackson library) supports (apache#3066).
This is what the C# SDK also would support (apache#3070)

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>

---------

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
@ajross
Copy link

ajross commented Oct 17, 2025

We'd love to see this PR merged. Is there anything we can do to help to get this over the line?

JsonEncoder uses special string values to represent NaN, Infinity and
-Infinity values for float and double values, but JsonDecoder does not
accept these string values. This change adds support for these special
values to JsonDecoder.
@woppa684
Copy link

woppa684 commented Nov 10, 2025

We are also impacted by this issue and would love to see this PR merged. The Rust and C# PRs are merged. Is there a reason why the Java one is stuck? Happy to help if something is missing 🙏

Same question indeed, why not merge this fix for Java? @martin-g or @xxchan is that something you can answer?

@ajross
Copy link

ajross commented Dec 2, 2025

@RyanSkraba I saw you'd been able to get several PRs merged to this repo recently. Any chance you can help get this approved or point us to someone that can complete this PR?

@RyanSkraba
Copy link
Contributor

Hello! I'm keen to have this straightened out, but my solution would be to remove JSON decoding from Avro entirely 😆 There's no reliable and portable way to represent floating point numbers in "pure" JSON, and everybody has their own unhappy workarounds and assumptions (looking at Jackson).

Seriously though, let's get this merged into Avro Java (and cherry-pick it into the next minor releases ASAP) so we can round-trip our data through a JSON encoding and decoding cycle without errors!

I created AVRO-4217 to track this discussion for the next major release.

@RyanSkraba RyanSkraba merged commit 8a22cef into apache:main Dec 3, 2025
9 checks passed
RyanSkraba pushed a commit that referenced this pull request Dec 3, 2025
…#3066)

JsonEncoder uses special string values to represent NaN, Infinity and
-Infinity values for float and double values, but JsonDecoder does not
accept these string values. This change adds support for these special
values to JsonDecoder.
RyanSkraba pushed a commit that referenced this pull request Dec 3, 2025
…#3066)

JsonEncoder uses special string values to represent NaN, Infinity and
-Infinity values for float and double values, but JsonDecoder does not
accept these string values. This change adds support for these special
values to JsonDecoder.
@RyanSkraba
Copy link
Contributor

Cherry-picked to branch-1.12.
Cherry-picked to branch-1.11.

@ajross
Copy link

ajross commented Dec 4, 2025

Thanks very much @RyanSkraba !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Java Pull Requests for Java binding

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants