fix: Limit length of response body read to 4mb #2080
kaylareopelle wants to merge 1 commit into open-telemetry:main
Conversation
Limiting the read size may help prevent memory exhaustion exploits when the configured collector endpoint is attacker-controlled.
```ruby
truncated = false

response.read_body do |chunk|
  if body.bytesize + chunk.bytesize <= RESPONSE_BODY_LIMIT
```
Before starting to read the response body, should we not first read the Content-Length or Transfer-Encoding header and short-circuit if the body exceeds the limit?
IIUC the Go collector switches to chunked transfer encoding for responses that exceed the 2 KiB default buffer size. In cases where the response is less than 2 KiB, or the backend supports larger buffer sizes, we may get Content-Length responses and can exit early.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Length
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Transfer-Encoding
I like the idea of using the headers to make this more performant!
Just to make sure I'm on the same page, are you proposing:
- Use the transfer-encoding / content-length headers to determine body size
- Skip chunking when the body is less than the limit and just return the body
- Truncate the response using chunking for bodies that exceed the limit
Or are you suggesting something else?
My understanding is that it's either/or.
The response will have either a Content-Length header or Transfer-Encoding: chunked.
If Content-Length is present and exceeds the limit, discard the response body and exit immediately.
If it's a chunked response, use the code you've introduced here.
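The either/or flow described above can be sketched as a small helper. The names `RESPONSE_BODY_LIMIT` and the `[body, truncated]` shape mirror the PR; taking `content_length` and `chunks` as plain arguments is a simplification for illustration, not the exporter's real interface.

```ruby
RESPONSE_BODY_LIMIT = 4 * 1024 * 1024 # constant name assumed from the PR

# content_length: Integer or nil (nil implies Transfer-Encoding: chunked).
# chunks: any Enumerable of String chunks.
def limited_body(content_length, chunks)
  # Content-Length present and too large: discard and exit immediately.
  return ['', true] if content_length && content_length > RESPONSE_BODY_LIMIT

  # Otherwise stream the chunks, capping how many bytes we keep.
  body = +''
  truncated = false
  chunks.each do |chunk|
    if body.bytesize + chunk.bytesize <= RESPONSE_BODY_LIMIT
      body << chunk
    else
      remaining = RESPONSE_BODY_LIMIT - body.bytesize
      body << chunk.byteslice(0, remaining) if remaining > 0
      truncated = true
      break
    end
  end
  [body, truncated]
end
```

The early return is the performance win: an honest Content-Length that exceeds the limit never touches the streaming path at all.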
In a hacky reproduction of the problem (some malicious or misbehaving OTLP receiver responds with a large body), I still saw the whole large response body loaded into memory. The exporter's @http.request(request) call reads the full body before read_response_body runs, so the 4 MB cap never activates against a real HTTP server. The existing tests pass because WebMock stubs don't have real socket behavior. 😭
```ruby
@http.read_timeout = remaining_timeout
@http.write_timeout = remaining_timeout
@http.start unless @http.started?
response = measure_request_duration { @http.request(request) }
```
@http.request(request) apparently reads the body into memory under the hood. 😱 Subsequent calls to read_body(), as in the new read_response_body() method, throw an IOError: read_body called twice exception (swallowed by the rescue in our new method).
We might need to restructure our send_bytes() method here a bit. I think we could wrap our case statement in @http.request(request) do |response| and choose how we're dealing with the body within the cases. Can even do @arielvalentin's header-checking suggestion within the block and before the cases.
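A runnable sketch of that restructure (not the exporter's actual code): pass a block to Net::HTTP#request so the body is only read inside the block, chunk by chunk. RESPONSE_BODY_LIMIT is shrunk here so the demo stays small, and the TCPServer thread stands in for a misbehaving collector.

```ruby
require 'net/http'
require 'socket'

RESPONSE_BODY_LIMIT = 1024 # tiny demo limit; the PR uses 4 MB

server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
Thread.new do
  client = server.accept
  loop { break if client.readline == "\r\n" } # skip request headers
  payload = 'x' * 4096 # four times the demo limit
  client.write("HTTP/1.1 200 OK\r\nContent-Length: #{payload.bytesize}\r\n" \
               "Connection: close\r\n\r\n#{payload}")
  client.close
end

body = +''
truncated = false
http = Net::HTTP.new('127.0.0.1', port)
# Block form: the response body has NOT been read when the block starts.
http.request(Net::HTTP::Get.new('/')) do |response|
  case response
  when Net::HTTPSuccess
    response.read_body do |chunk|
      next if truncated # keep draining without appending once capped
      if body.bytesize + chunk.bytesize <= RESPONSE_BODY_LIMIT
        body << chunk
      else
        remaining = RESPONSE_BODY_LIMIT - body.bytesize
        body << chunk.byteslice(0, remaining) if remaining > 0
        truncated = true
      end
    end
  end
end
```

Inside the block, `response['Content-Length']` is available before `read_body`, so the header-based short circuit suggested above fits naturally before the `case`.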
```diff
 case response
 when Net::HTTPSuccess
-  response.body # Read and discard body
+  response.read_body(nil) # Discard without reading into memory
```
For our happier-path cases where the body isn't too large, read_body(nil) apparently won't drain the socket, which will break keep-alive connections. Maybe for the cases where we don't care about the body content, we could call response.read_body { |_| } for a not-entirely-awful drain-and-discard?
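A runnable sketch of that drain-and-discard: `response.read_body { |_| }` consumes the body chunk by chunk without retaining it, leaving the socket clean for reuse. The tiny TCPServer below answers two requests over one keep-alive connection; the setup is illustrative, not exporter code.

```ruby
require 'net/http'
require 'socket'

server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
Thread.new do
  client = server.accept
  2.times do
    loop { break if client.readline == "\r\n" } # skip request headers
    payload = 'ok'
    # No Connection: close header, so HTTP/1.1 keep-alive stays in effect.
    client.write("HTTP/1.1 200 OK\r\nContent-Length: #{payload.bytesize}\r\n\r\n#{payload}")
  end
  client.close
end

statuses = []
Net::HTTP.start('127.0.0.1', port) do |http|
  2.times do
    http.request(Net::HTTP::Get.new('/')) do |response|
      statuses << response.code
      response.read_body { |_chunk| } # drain and discard; body never retained
    end
  end
end
```

If the first body were left unread on the socket, the second request on the same connection would see the leftover bytes; draining keeps the connection reusable.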
```ruby
remaining = RESPONSE_BODY_LIMIT - body.bytesize
body << chunk.byteslice(0, remaining) if remaining > 0
truncated = true
break
```
If read_body() hadn't thrown an exception and we got into this loop, the break would still result in nearly all of the response being read into memory because Net::HTTP reads the rest of the body after this block ends. ಠ_ಠ
```ruby
end
```

```ruby
def log_status(body)
  truncation_note = @body_truncated ? ' (body truncated due to size limit)' : ''
```
Unrelated to the body-read conundrum, I worry a little about an instance variable on the exporter here. Maybe a second return value from read_response_body() that could be passed to a second param for log_status()?
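A sketch of that signature change: read_response_body returns truncation state as a second value, and log_status takes it as a parameter instead of reading an instance variable. The method bodies are simplified stand-ins and the message format is hypothetical.

```ruby
# Simplified stand-in: streams chunks, returns [body, truncated].
def read_response_body(chunks, limit)
  body = +''
  truncated = false
  chunks.each do |chunk|
    if body.bytesize + chunk.bytesize <= limit
      body << chunk
    else
      body << chunk.byteslice(0, limit - body.bytesize)
      truncated = true
      break
    end
  end
  [body, truncated]
end

# Truncation state arrives as an argument, so no state lingers on the exporter.
def log_status(body, truncated)
  truncation_note = truncated ? ' (body truncated due to size limit)' : ''
  "response body#{truncation_note}: #{body}" # hypothetical message format
end

body, truncated = read_response_body(['hello ', 'world'], 8)
log_status(body, truncated)
# => "response body (body truncated due to size limit): hello wo"
```

This keeps the exporter stateless between exports, which matters if one exporter instance ever serves concurrent callers.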
Adds a failing test that exercises the OTLP exporter's response body limit against a real TCPServer instead of WebMock stubs. WebMock patches Net::HTTP's read_body internals, so the existing stub-based tests pass even though the chunked reader doesn't work against real HTTP.

With WebMock's adapter fully disabled, we see that Net::HTTP#request eagerly reads the full body before read_response_body runs, causing IOError: "read_body called twice". The IOError is the symptom: the actual problem is that the full oversized response body is already in memory by that point, defeating the 4 MB cap.

Reproduces the problem described in review comments on PR open-telemetry#2080.
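A minimal reproduction of that symptom against a real socket, outside the test suite: the block-less Net::HTTP#request reads the whole body eagerly, so a later streaming read_body raises IOError. The TCPServer stands in for an OTLP receiver; nothing here is exporter code.

```ruby
require 'net/http'
require 'socket'

server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
Thread.new do
  client = server.accept
  loop { break if client.readline == "\r\n" } # skip request headers
  payload = 'x' * 100
  client.write("HTTP/1.1 200 OK\r\nContent-Length: #{payload.bytesize}\r\n" \
               "Connection: close\r\n\r\n#{payload}")
  client.close
end

http = Net::HTTP.new('127.0.0.1', port)
response = http.request(Net::HTTP::Get.new('/')) # body fully read right here
error = begin
  response.read_body { |_chunk| } # attempt to stream: too late
  nil
rescue IOError => e
  e # "read_body called twice"
end
```

By the time the rescue fires, `response.body` already holds all 100 bytes, which is exactly why the cap never engages against a real server.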
Our exporters read HTTP response bodies without any limit. A misconfigured or misbehaving server could send an arbitrarily large response, causing the exporter to read it all into memory.
This implements a 4 MB response body limit also implemented by:
Based on https://cwe.mitre.org/data/definitions/789.html
Fixes #2079