Skip to content

Make NUT OpenSSL handshake more reliable#3344

Open
jimklimov wants to merge 2 commits intonetworkupstools:masterfrom
jimklimov:nit-ssl
Open

Make NUT OpenSSL handshake more reliable#3344
jimklimov wants to merge 2 commits intonetworkupstools:masterfrom
jimklimov:nit-ssl

Conversation

@jimklimov
Copy link
Member

Following up from findings in PR #3330 to address issue #1711, this PR would hopefully make the handshake more reliable on all systems supported by NUT (CI farm at least). Apparently Linux mostly handles the SSL handshake quickly enough to succeed by the first time we ask for results, and other platforms may need a few more cycles. Tests with OpenSSL in that PR failed so frequently, they had to be neutered by default.

Prepared with AI (Claude Sonnet 4.6) as a test to see how it fares... quite well, despite not seeing failure logs from CI server and mostly relying on test case descriptors which convey compilers, OSes, etc. Had troubles preparing a diff-patch file though.

Findings seem to match what @clepple proposed but could not commit to coding the fix quickly - review welcome. I think this addresses a similar problem to that explored in https://github.com/networkupstools/nut/compare/ssl_accept_nbio_v285?expand=1 (variant of an older branch, rebased over recent NUT master code to pre-resolve merge conflicts), but at least that branch did not fully solve the issues the NIT test reproduced all too well across the board. I'd love a comment from @clepple on whether that fix may bring some other benefits (e.g. by setting socket options), or may be abandoned if this PR works?

@jimklimov jimklimov added this to the 2.8.5 milestone Mar 8, 2026
@jimklimov jimklimov requested a review from clepple March 8, 2026 14:17
@jimklimov jimklimov self-assigned this Mar 8, 2026
@jimklimov jimklimov added bug Windows OpenBSD FreeBSD issues related to FreeBSD and its derivatives (including pfSense) macOS Solaris/illumos Solaris and illumos systems (OpenIndiana, OmniOS, SmartOS, TribbliX...) SSL/NSS Issues and PRs about SSL, TLS and other crypto-related matters NetBSD Linux Some issues are specific to Linux as a platform portability We want NUT to build and run everywhere possible AI For good or bad, machine tools are upon us. Humans are still the responsible ones. labels Mar 8, 2026
@AppVeyorBot
Copy link

…ng sockets [networkupstools#3331]

The TLS handshake functions `SSL_accept()` (server) and `SSL_connect()`
(client) require a retry loop when the underlying socket is in
non-blocking mode.  When either function returns '-1' with an error code
of `SSL_ERROR_WANT_READ` or `SSL_ERROR_WANT_WRITE`, it is signalling a
non-fatal "not done yet" condition: the handshake needs more I/O turns
to complete.  The correct response is to wait for the socket fd to
become ready in the indicated direction and then call the same function
again with the same SSL object (this is explicitly documented in OpenSSL
for every version since 0.9.x and the API contract has never changed).

Previously both call sites used a single-shot switch/case pattern that
treated these non-fatal WANT_READ/WANT_WRITE returns as fatal errors,
tearing down the connection immediately.

On Linux the loopback socket is fast enough that the handshake *nearly
always* completes in the first call, masking the bug.  On BSD (FreeBSD,
OpenBSD, NetBSD), macOS, and illumos-based systems (OmniOS, OpenIndiana)
the loopback socket behaves differently and WANT_READ/WANT_WRITE are
returned regularly, causing CI failures on all of those platforms
regardless of compiler, C standard dialect, or OpenSSL version used.

The fix is identical for all supported OpenSSL versions (0.9.x / 1.0.x /
1.1.x / 3.x): the `SSL_ERROR_WANT_*` codes and the required retry
semantics have been stable since the library's inception.

No NSS changes are needed: NSS uses `SSL_ForceHandshake()` which is
already blocking-by-design through NSPR.

Fixes issues (re-)discovered and/or confirmed by work on PR networkupstools#3330 for
issue networkupstools#1711

Tested-on: Linux (passes), macOS 12, FreeBSD 12, NetBSD 11, OpenBSD 6.5,
           OmniOS (amd64+i386), OpenIndiana (amd64+i386) - all compilers                    and C standard dialects in the CI matrix.
Co-authored-by: Claude Sonnet 4.6
Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…s, work on actual code may be needed to rectify it after all [networkupstools#3331, networkupstools#1711]"

This reverts commit af5da88
due to added support for WANT_READ/WANT_WRITE in OpenSSL handshake.

Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
@AppVeyorBot
Copy link

@vibingarthur
Copy link

CI was failing (shellcheck). Attempt #1 to fix.

Skipping: External CI (ci.networkupstools.org) requires external access to diagnose. Please check the shellcheck errors directly in the CI logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI For good or bad, machine tools are upon us. Humans are still the responsible ones. bug FreeBSD issues related to FreeBSD and its derivatives (including pfSense) Linux Some issues are specific to Linux as a platform macOS NetBSD OpenBSD portability We want NUT to build and run everywhere possible Solaris/illumos Solaris and illumos systems (OpenIndiana, OmniOS, SmartOS, TribbliX...) SSL/NSS Issues and PRs about SSL, TLS and other crypto-related matters Windows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants