Skip to content

Conversation

@arturobernalg
Copy link
Member

Changes TLS handshake timeout fallback from connectTimeout to socketTimeout when no explicit timeout is configured in TlsConfig.

Default to socketTimeout (not connectTimeout) for TLS handshakes
// TLS handshake timeout precedence:
// 1. Explicitly configured handshake timeout from TlsConfig
// 2. Current socket timeout of the connection (if set)
// 3. Falls back to connectTimeout if neither is specified (handled later)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg I am not sure this is correct. Connect timeout should no longer have any effect.

// 1. Explicitly configured handshake timeout from TlsConfig
// 2. Current socket timeout of the connection (if set)
// 3. Falls back to connectTimeout if neither is specified (handled later)
final Timeout handshakeTimeout = tlsConfig.getHandshakeTimeout() != null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rschmitt Would this change be OK with you? This should fix an inconsistency in the behavior of the classic and async transports.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should default to the connectTimeout. The whole point of having separate TLS handshake timeout configuration is that defaulting to the socketTimeout (which is what would naturally happen) means that TLS handshakes take at least an order of magnitude longer to time out than is sensible. Response data can take an arbitrarily long amount of time to come back, whereas a TLS handshake should take roughly 2*RTT irrespective of the nature of the request being sent. If there's an inconsistency here between classic and async then it sounds like the classic behavior is wrong.

It also sounds like we could use an integration test here similar to the one for socket timeouts, which can also be set in a variety of ways. Such coverage could be added to the existing integration tests for socket timeouts. I added these tests in order to prevent precisely these kinds of regressions which I've had to deal with in the past, and it's not good that a change like the one in this PR doesn't break any test cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rschmitt Fair enough. Please take a look at an alternative #699 and corrects the behavior of the classic connection operator instead.

With regards to the connect timeout integration tests they consistently fail for me locally (Windows MINGW64_NT-10.0-26100). I need to get them fixed first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're doing Windows development in mingw? Not WSL2 or native Windows?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rschmitt Ubuntu (or Debian) Linux is my primary development environment but sometimes I have to use a client issued laptop that runs Windows. I run Maven inside WinGit console. It is an old habit.

@arturobernalg
Copy link
Member Author

Closing in favor of 88c19c0

@lethinker
Copy link

lethinker commented Aug 7, 2025

@ok2c @rschmitt can you help me with the problem when I use the httpAsyncClient( httpclient 5.4.3 、jdk 21)
First, I set a 100-second sleep on the server side to ensure that no response is returned within 100 seconds.
Second, I use the RequestConfig.setResponseTimeout(Timeout.ofMilliseconds(50)) to set the response timeout value on the client side.

I have modified the timeout multiple times and obtained the following results.
[async]failed is 1 MILLISECONDS, time usage is 1033(1 MILLISECONDS means I set the ResponseTimeout as 1ms, time usage means that client side timeout after 1033ms)

[async]failed is 10 MILLISECONDS, time usage is 1010
[async]failed is 500 MILLISECONDS, time usage is 1015
[async]failed is 1000 MILLISECONDS, time usage is 1013 This is the expected value.

[async]failed is 1020 MILLISECONDS, time usage is 2019
......

It seems that the effected ResponseTimeout is being rounded up to the nearest second, and milliseconds are not taking effect.

In addition, I used a synchronous httpclient, which works well. I set the ResponseTimeout = 50ms. [sync]failed is Read timed out, time usage is 63

@ok2c
Copy link
Member

ok2c commented Aug 7, 2025

It seems that the effected ResponseTimeout is being rounded up to the nearest second, and milliseconds are not taking effect.

@lethinker Socket timeout granularity of async connections is one second (can be reduced at the cost of the i/o reactor running in a tighter loop and waking up more often, and causing higher CPU utilization). Socket timeout granularity of blocking connections is approximately 10 ms.

Timeouts in ms make no sense.

@lethinker
Copy link

It seems that the effected ResponseTimeout is being rounded up to the nearest second, and milliseconds are not taking effect.

@lethinker Socket timeout granularity of async connections is one second (can be reduced at the cost of the i/o reactor running in a tighter loop and waking up more often, and causing higher CPU utilization). Socket timeout granularity of blocking connections is approximately 10 ms.

Timeouts in ms make no sense.

Thank you for your help.
Is there any plan to support 10ms granularity timeout control in async connections in the fulture?
I just have the business scenario to use a timeout interval of 100ms, If the response time exceeds 100ms, I want to discard the message.

@ok2c
Copy link
Member

ok2c commented Aug 8, 2025

Is there any plan to support 10ms granularity timeout control in async connections in the fulture?

@lethinker No, there is not.

Please note that the solution mentioned by @rschmitt will cause the JRE run with near 100% CPU utilization. This is not a problem with a short integration test but may be a problem when running in PROD

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants