Switch from reqwest to bitreq HTTP client #56
Conversation
Replace the `reqwest` dependency with `bitreq`, a lighter HTTP(S) client that reduces code bloat and dependency overhead.

Key changes:
- Use `bitreq::Client` for connection pooling and reuse
- Update all HTTP request handling to use bitreq's API
- Remove `reqwest::header::HeaderMap` in favor of `HashMap<String, String>`
- Simplify `LnurlAuthToJwtProvider::new()` to no longer return a `Result`
- Use `serde_json::from_slice()` directly for JSON parsing
- Build script uses bitreq's blocking `send()` method

Co-Authored-By: HAL 9000
Signed-off-by: Elias Rohrer <dev@tnull.de>
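For orientation, the new request flow roughly follows the pattern below. This is a simplified sketch pieced together from the snippets quoted in the review threads; the exact bitreq signatures, error types, and the `LnurlAuthResponse` fields are assumptions, not the actual implementation.

```rust
use std::collections::HashMap;

const DEFAULT_TIMEOUT_SECS: u64 = 10;

// Illustrative response type; the real struct lives in the crate and has different fields.
#[derive(serde::Deserialize)]
struct LnurlAuthResponse {
	tag: Option<String>,
}

// Sketch of the bitreq-based request flow used throughout this PR: build the request
// with the fluent API, send it asynchronously, and parse JSON straight from the bytes.
async fn fetch_lnurl(
	url: &str, default_headers: HashMap<String, String>,
) -> Result<LnurlAuthResponse, Box<dyn std::error::Error + Send + Sync>> {
	let request = bitreq::get(url)
		.with_headers(default_headers)
		.with_timeout(DEFAULT_TIMEOUT_SECS);
	let response = request.send_async().await?;
	// Mirrors what reqwest::Response::json did internally: deserialize from the raw bytes.
	let parsed: LnurlAuthResponse = serde_json::from_slice(&response.into_bytes())?;
	Ok(parsed)
}
```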
👋 Thanks for assigning @tankyleo as a reviewer!
let auth_response =
	auth_request.send_async().await.map_err(VssHeaderProviderError::from)?;
let lnurl_auth_response: LnurlAuthResponse =
	serde_json::from_slice(&auth_response.into_bytes()).map_err(|e| {
We could use bitreq's json-using-serde which goes through UTF-8 parsing first, but this seemed simpler and followed what the reqwest::Response::json method did (cf https://docs.rs/reqwest/latest/src/reqwest/async_impl/response.rs.html#269-273). Not sure if anyone has an opinion here?
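To illustrate the trade-off being discussed (a hypothetical standalone snippet; `bytes` stands in for the response body), the two options differ only in whether a separate UTF-8 validation pass happens before deserialization:

```rust
use serde::de::DeserializeOwned;

// Option A: validate UTF-8 first, then parse (roughly what a str-based JSON helper does).
fn parse_via_str<T: DeserializeOwned>(bytes: &[u8]) -> Result<T, Box<dyn std::error::Error>> {
	let text = std::str::from_utf8(bytes)?;
	Ok(serde_json::from_str(text)?)
}

// Option B: hand the bytes straight to serde_json, as this PR does
// (and as reqwest::Response::json effectively did).
fn parse_from_slice<T: DeserializeOwned>(bytes: &[u8]) -> Result<T, serde_json::Error> {
	serde_json::from_slice(bytes)
}
```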
either is fine to me 🤷♂️
@tnull sounds to me like this is something we can upstream to bitreq no ? ie pass the Vec<u8> body straight into serde_json::from_slice, and skip the utf8 parsing, which seems a net plus.
> @tnull sounds to me like this is something we can upstream to bitreq no ? ie pass the `Vec<u8>` body straight into `serde_json::from_slice`, and skip the utf8 parsing, which seems a net plus.
Yeah, I considered that too. Just haven't fully made my mind up whether not validating UTF8 is okay. Maybe it's here, but not necessarily in the general case?
IIUC i would expect serde_json::from_slice to validate UTF-8 as necessary during deserialization...
> IIUC i would expect `serde_json::from_slice` to validate UTF-8 as necessary during deserialization...
I'm not quite sure? The Deserializer::deserialize_bytes docs state:
> The behavior of serde_json is specified to fail on non-UTF-8 strings when deserializing into Rust UTF-8 string types such as String, and succeed with the bytes representing the WTF-8 encoding of code points when deserializing using this method.
and then go ahead and use from_slice in the example further below. Might need to run a test to validate either way.
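A quick check along those lines (a standalone sketch using only serde_json, not part of this PR) suggests `from_slice` does reject invalid UTF-8 once the destination is a Rust `String`:

```rust
fn main() {
	// 0xFF is not valid UTF-8, but the bytes still form a lexically valid JSON string literal.
	let invalid_utf8: &[u8] = b"\"\xff\"";
	let as_string: Result<String, _> = serde_json::from_slice(invalid_utf8);
	assert!(as_string.is_err(), "deserializing into String should reject invalid UTF-8");

	// Well-formed UTF-8 parses fine without an explicit str::from_utf8 step.
	let ok: String = serde_json::from_slice(b"\"hello\"").unwrap();
	assert_eq!(ok, "hello");
}
```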
Right so it seems to me that flow is used when the destination is some &[u8] ?
When the destination is some &str, we call fn deserialize_str, which calls Read::parse_str, which for SliceRead (the pub fn from_slice deserializer), calls SliceRead::parse_str_bytes with the as_str function as the result parameter, and this as_str calls str::from_utf8 on the full byte slice.
> Right so it seems to me that flow is used when the destination is some `&[u8]`?
>
> When the destination is some `&str`, we call `fn deserialize_str`, which calls `Read::parse_str`, which for `SliceRead` (the `pub fn from_slice` deserializer), calls `SliceRead::parse_str_bytes` with the `as_str` function as the result parameter, and this `as_str` calls `str::from_utf8` on the full byte slice.
Ah, that sounds about right. Mind making the change and opening a PR towards bitreq (ofc. also providing the same rationale)?
tnull left a comment
Honestly not sure why the idna crate suddenly started failing to build under 1.75.0, and we aim to soon replace the whole url dependency with our own Url type. For now we can also just fix the issue by bumping MSRV to 1.85, since LDK Node already requires that anyways and we aim to soon also update LDK to that.
Wait, what? This is news to me. Can we just remove the cursed
No, Matt, we discussed this plan for hours at the Rust Summit. We landed on: most ecosystem crates (BDK, etc.) would update ~right away given their release cycles, but LDK would defer bumping until after Ubuntu 26.04 is out, which will of course happen this April.
Yeah, working on it as we speak. Just did a
Right, April is a ways off though, and we likely don't want to bump the day a new release comes out (I'm kinda tired of us rushing to bump the day something comes out for the sake of it without a specific motivating Rust feature, at least now that we've stripped out the stupid HTTP deps that keep causing dep-MSRV bumps), and I thought we wanted to upstream long before then: lightningdevkit/rust-lightning#4323
src/client.rs (Outdated)
const APPLICATION_OCTET_STREAM: &str = "application/octet-stream";
const DEFAULT_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
const DEFAULT_TIMEOUT_SECS: u64 = 10;
const MAX_RESPONSE_BODY_SIZE: usize = 500 * 1024 * 1024; // 500 MiB
Now bumped this considerably, as of course 10MB would be much too small. Let me know if you have an opinion on a better value here @TheBlueMatt @tankyleo
Mmm, yea, good question. On the server side we defaulted to 1G because it's postgres' limit. Not that "just because it's postgres' limit" is a good reason to do anything, though. 500 seems fine enough to me, honestly. As long as it's documented I don't feel super strongly.
As 10 MB is def too small.
15114d4 to ef17d83 (compare)
.with_body(request_body)
.with_timeout(DEFAULT_TIMEOUT_SECS)
.with_max_body_size(Some(MAX_RESPONSE_BODY_SIZE))
.with_pipelining();
Hmm bitreq says "Note that because pipelined requests may be replayed in case of failure, you should only set this on idempotent requests", and our PutObjectRequest is not idempotent.
> and our `PutObjectRequest` is not idempotent
Why is it not idempotent? If the request fails, we'd expect the HTTP client to retry anyway? This is essentially the same, no?
hmm right i was thinking if we replay the same versioned put object request, the server will return an error instead of ok, and hence that call is not idempotent.
the concern here is request A is actually successful, version increments server side, but for some reason bitreq considers it a failure, replays the request, and becomes request B.
request B fails because of a version mismatch, and kicks off the retry policy client side, and eventually results in a failure getting bubbled back up to LDK-Node even though request A succeeded.
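To spell that out (purely illustrative pseudocode, not actual VSS server logic): a conditional put that already succeeded once will fail when its replay arrives, because the stored version has moved on.

```rust
// Hypothetical server-side check for a versioned put; names are made up for illustration.
fn conditional_put(stored_version: &mut i64, request_version: i64) -> Result<(), &'static str> {
	if request_version != *stored_version {
		return Err("version mismatch");
	}
	*stored_version += 1;
	Ok(())
}

fn main() {
	let mut stored_version = 5;
	// Request A: applied server-side, but the response is lost, so the client replays it.
	assert!(conditional_put(&mut stored_version, 5).is_ok());
	// Request B (the replay) still carries version 5 and is rejected,
	// even though the write it represents already happened.
	assert!(conditional_put(&mut stored_version, 5).is_err());
}
```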
> the concern here is request A is actually successful, version increments server side, but for some reason bitreq considers it a failure, replays the request, and becomes request B.
Well, a) this shouldn't happen, it should only replay if it actually failed to deliver it and b) why would it result in a version mismatch? The version field wouldn't magically change when replayed, no?
> a) this shouldn't happen, it should only replay if it actually failed to deliver it
Then why the warning that pipelined requests should be idempotent? Sounds to me it is warning that the server could receive two copies of the same request, hence each request should be idempotent. Do we need a documentation update on bitreq's side?
> why would it result in a version mismatch? The version field wouldn't magically change when replayed, no?
The version field in the request does not change, but the server increments the version after request A succeeds. Request B did not increment the version, so that request will fail on "version mismatch"
First impression, I would add a `with_pipelining: bool` in the `async fn post_request` parameters that sets whether this request should be pipelined.
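Something like the following, perhaps (a rough sketch reusing the builder calls from this PR's diff; the `post_request` signature, constants, and error handling here are placeholders, not the crate's actual code):

```rust
use std::collections::HashMap;

const DEFAULT_TIMEOUT_SECS: u64 = 10;
const MAX_RESPONSE_BODY_SIZE: usize = 1024 * 1024 * 1024;

async fn post_request(
	url: &str, headers: HashMap<String, String>, request_body: Vec<u8>, allow_pipelining: bool,
) -> Result<Vec<u8>, Box<dyn std::error::Error + Send + Sync>> {
	let mut request = bitreq::post(url)
		.with_headers(headers)
		.with_body(request_body)
		.with_timeout(DEFAULT_TIMEOUT_SECS)
		.with_max_body_size(Some(MAX_RESPONSE_BODY_SIZE));
	// Only idempotent calls (e.g. gets/lists) opt in; versioned puts skip pipelining so
	// a replay can't surface as a spurious "version mismatch" error.
	if allow_pipelining {
		request = request.with_pipelining();
	}
	let response = request.send_async().await?;
	Ok(response.into_bytes())
}
```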
> The version field in the request does not change, but the server increments the version after request A succeeds. Request B did not increment the version, so that request will fail on "version mismatch"
I'm confused: isn't that the intended behavior, and even without pipelining we can easily run into a scenario where the server updates but the success response never reaches us? (And btw, currently this is not true as we don't use key-level versions, i.e., always set -1, i.e., always go the non-conditional upsert path, without incrementing versions).
> Indeed, there's some risk that the server sends a `Connection: close` after it has processed as many requests as it wants and suddenly we have to resend a request. The server shouldn't have processed the first request, but we can't guarantee that so IMO we should avoid setting pipelining on non-idempotent requests (sorry, I know I requested it!).
Huh, but that is also true if we just lose connectivity or the server closes/resets the connection just after a write. In that case the client gets a bitreq::IoError which will be converted generically to a VssClient::InternalError, which won't get retried. I do not see how pipelining is anything special here - connectivity is always fallible, but - just as with raw TCP - you don't know what happened until you get positive acknowledgements.
I think all these side effects can only be resolved once we (finally) make proper use of key-level versions on the client side, and leave conflict resolution to the service side.
Yea, I mean requests really should always be idempotent in an RPC call, for the reasons you specified. It's not entirely clear to me why writes in VSS aren't at the protocol level (even if our current usage is), but fixing that seems like a somewhat larger project.
In theory we could let PUTs with the "non-conditional upsert path" pipeline but ISTM we really can't in the "non-idempotent request" pipeline? If we get a non-idempotent request, failing due to connectivity is one thing, but failing because the server did a normal HTTP protocol thing and we handled that by retrying which led to a spurious failure even though we did write seems very strange to me.
> In theory we could let PUTs with the "non-conditional upsert path" pipeline but ISTM we really can't in the "non-idempotent request" pipeline? If we get a non-idempotent request, failing due to connectivity is one thing, but failing because the server did a normal HTTP protocol thing and we handled that by retrying which led to a spurious failure even though we did write seems very strange to me.
Hmm, but the client would only replay the pipelined request in case of a connectivity failure, essentially? So it's the same failure case, and both result in the write potentially going through and we're getting an error back?
I mean, happy to drop pipelining if you guys think it's a substantial improvement, I'm just pointing out that we might run into exactly the same thing with 'normal'/non-pipelined operation.
Yea kinda. A pipelined request in bitreq can be replayed because an earlier unrelated request on the same connection timed out (which is kinda connectivity failure indeed but we're def more susceptible to) or because the server (or HTTP reverse proxy) was somewhat buggy (processing requests after the request which it responded to with a Connection: close) which we generally assume isn't the case here.
Still, I think the point is that the error becomes wrong - if a client is looking at version counter errors as indication of multi-client or something like that they'll get confused when in fact it was just an HTTP timeout.
Pipelining is probably a pretty nontrivial gain in performance on the read side (especially since we only have one client), though, so its probably worth trying to keep at least in some cases.
@tnull let's finish this so we can ship
Well, I think we want to wait for the next
Addressed comments so far, let me know if I can squash.
I don't think we need to bump the bitreq minor version - the URL changes I believe are all internal/new APIs so we should be able to do it in an 0.3.2. Thus ISTM we don't need to hold this (unless it wants to use the URL stuff directly, I haven't looked)
As mentioned above, yes, we don't need to hold this specific PR, but we'll likely want to wait with the release to include the version bump, given that we might not want to immediately do another release in 2 days and it would be preferable to not immediately start out with 0.3 and 0.4 in the dependency tree if possible.
I'm confused, I don't see a reason why
Ah, sorry, rightfully so. I misread, and on top thought that we made other changes since 0.3.1. Indeed it seems we could get away with adding
@tnull You might have addressed them but not pushed them? In any case feel free to push the squashed result :)
There is the pipelining thread above we still need to resolve.
ef39744 to ffbb682 (compare)
Ah, whoops, indeed: > git diff-tree -U2 ef39744 ffbb682
diff --git a/src/client.rs b/src/client.rs
index c5f21c3..4c17fca 100644
--- a/src/client.rs
+++ b/src/client.rs
@@ -17,6 +17,7 @@ use crate::util::KeyValueVecKeyPrinter;
const APPLICATION_OCTET_STREAM: &str = "application/octet-stream";
+const CONTENT_TYPE: &str = "content-type";
const DEFAULT_TIMEOUT_SECS: u64 = 10;
-const MAX_RESPONSE_BODY_SIZE: usize = 500 * 1024 * 1024; // 500 MiB
+const MAX_RESPONSE_BODY_SIZE: usize = 1024 * 1024 * 1024; // 1GB
const DEFAULT_CLIENT_CAPACITY: usize = 10;
@@ -199,5 +200,5 @@ impl<R: RetryPolicy<E = VssError>> VssClient<R> {
let http_request = bitreq::post(url)
- .with_header("content-type", APPLICATION_OCTET_STREAM)
+ .with_header(CONTENT_TYPE, APPLICATION_OCTET_STREAM)
.with_headers(headers)
.with_body(request_body)
diff --git a/src/headers/lnurl_auth_jwt.rs b/src/headers/lnurl_auth_jwt.rs
index 190d9f8..63c6fc8 100644
--- a/src/headers/lnurl_auth_jwt.rs
+++ b/src/headers/lnurl_auth_jwt.rs
@@ -27,4 +27,6 @@ const KEY_QUERY_PARAM: &str = "key";
// The authorization header name.
const AUTHORIZATION: &str = "Authorization";
+// The maximum body size we allow for requests.
+const MAX_RESPONSE_BODY_SIZE: usize = 16 * 1024 * 1024; // 16 MiB
#[derive(Debug, Clone)]
@@ -88,5 +90,6 @@ impl LnurlAuthToJwtProvider {
let lnurl_request = bitreq::get(&self.url)
.with_headers(self.default_headers.clone())
- .with_timeout(DEFAULT_TIMEOUT_SECS);
+ .with_timeout(DEFAULT_TIMEOUT_SECS)
+ .with_max_body_size(Some(MAX_RESPONSE_BODY_SIZE));
let lnurl_response =
lnurl_request.send_async().await.map_err(VssHeaderProviderError::from)?;
@@ -101,5 +104,6 @@ impl LnurlAuthToJwtProvider {
let auth_request = bitreq::get(&signed_lnurl)
.with_headers(self.default_headers.clone())
- .with_timeout(DEFAULT_TIMEOUT_SECS);
+ .with_timeout(DEFAULT_TIMEOUT_SECS)
+ .with_max_body_size(Some(MAX_RESPONSE_BODY_SIZE));
let auth_response =
auth_request.send_async().await.map_err(VssHeaderProviderError::from)?;
Replace the `reqwest` dependency with `bitreq`. Co-Authored-By: HAL 9000.