Skip to content

Conversation

@rschmitt
Copy link
Contributor

These test cases are intermittently failing in the build environment on macos-latest using JDK11, because some of the calls seem to be taking slightly more than twice as long as they ought to before timing out. I'm not able to reproduce the issue, but it's safe to increase the upper bound: we'll still know if one of the socket timeout config options stops working, since the assertThrows assertion will fail.

These test cases are intermittently failing in the build environment on
macos-latest using JDK11, because some of the calls seem to be taking
slightly more than twice as long as they ought to before timing out. I'm
not able to reproduce the issue, but it's safe to increase the upper
bound: we'll still know if one of the socket timeout config options
stops working, since the `assertThrows` assertion will fail.
@ok2c
Copy link
Member

ok2c commented Jun 19, 2025

@rschmitt This still feels a bit random. If all we want is to test whether a certain timeout setting takes precedence over other settings, could not we set all timeout to, say, 1 minute, and the timeout being tested to 10s and assert that the time taken for the the message exchange to get interrupted is less than 1 minute?

@rschmitt
Copy link
Contributor Author

That's not too different from what we're doing now, though, but I have a few observations:

  1. We can't set all the other timeouts to one minute. In order to test lower-precedence timeouts, the higher-precedence timeouts have to be unset. Otherwise the ResponseTimeout will always win.
  2. The specific values are much lower than the ones you suggested because I hate slow tests. I want them to succeed as fast as possible, and ideally to fail quickly as well in the event of a regression. Ideally I'd like to add an integration test for the default socket timeout (which is one minute), but that would more than double the time it takes to run ./mvnw verify. Hence, I haven't added such a test, even though I feel it's very important that the client in its default configuration never hang indefinitely when making a call.
  3. One thing I like about the stricter upper bound (stricter than "less than the far longer timeouts we configured") is that it can detect other issues, such as changes in internal retry behavior, or the blocking SSLSocket close behavior on JDK11 that was detected by TestTlsHandshakeTimeout. I'd rather be able to detect these things and then decide whether we want to ignore them or not than ignore them automatically because our assertions are so lax. There's a balance here that has to be maintained over time: we want our tests to be sensitive, but not flaky. (Flaky tests should not be tolerated.)

@ok2c
Copy link
Member

ok2c commented Jun 19, 2025

@rschmitt I likewise prefer test cases to succeed fast, but I do not mind some of them fail slow if that helps make them more reliable. Anyways, I have nothing against magic numbers as song as everything works.

@rschmitt rschmitt merged commit 7897744 into apache:master Jun 19, 2025
10 checks passed
@rschmitt rschmitt deleted the fix-sotimeout branch June 19, 2025 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants