Make `UntypedBody` be able to extract to a `BufList` by sunshowers · Pull Request #542 · oxidecomputer/dropshot

sunshowers · 2023-01-05T20:27:35Z

(This is a version of #541 that breaks backwards compatibility, per
Adam's suggestion. I do agree that this is overall a better approach.)

The current UntypedBody extractor writes data into a single Vec<u8>.
Consider what happens if the body is large (e.g. 100MB, which can happen
if uploading an artifact over HTTP). As each chunk (typically 10-100KB)
comes in, we'll have to both copy data from the incoming Bytes, and
reallocate the Vec over and over.
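
To make that cost concrete, here is a minimal std-only sketch. It is not dropshot code: `accumulate_into_vec` is a hypothetical helper that mimics appending fixed-size chunks to a single `Vec<u8>` and counts how many bytes get copied and how often the capacity changes. Each capacity change is a reallocation that may also move the bytes already buffered.

```rust
/// Hypothetical helper (not part of dropshot): appends `num_chunks` chunks of
/// `chunk_len` bytes to one `Vec<u8>` and returns
/// `(bytes copied out of chunks, number of capacity changes observed)`.
pub fn accumulate_into_vec(num_chunks: usize, chunk_len: usize) -> (usize, usize) {
    // Stand-in for an incoming body chunk.
    let chunk = vec![0u8; chunk_len];
    let mut body: Vec<u8> = Vec::new();
    let mut bytes_copied = 0;
    let mut capacity_changes = 0;

    for _ in 0..num_chunks {
        let before = body.capacity();
        // Every byte of the chunk is copied into the Vec's own storage.
        body.extend_from_slice(&chunk);
        bytes_copied += chunk.len();
        // A capacity change means the Vec had to reallocate to make room.
        if body.capacity() != before {
            capacity_changes += 1;
        }
    }

    (bytes_copied, capacity_changes)
}
```

The exact number of capacity changes depends on the growth strategy of `Vec`, which the standard library does not specify, so the sketch reports it rather than assuming a value.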

To avoid this issue, Eliza Weisman and I wrote buf-list, which
represents a list (really a queue) of chunks that can be operated on
using standard Tokio and other abstractions:
https://crates.io/crates/buf-list.
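
As a rough illustration of the "queue of chunks" idea, here is a std-only sketch. `ChunkList` is a hypothetical stand-in written for this example and is NOT the buf-list API; it only shows that incoming chunks can be queued by moving them, and that the queue can still be consumed through a standard abstraction (`std::io::Read` here, in place of the Tokio traits).

```rust
use std::collections::VecDeque;
use std::io::{self, Read};

/// Hypothetical stand-in for a chunk queue. This is NOT the buf-list API.
#[derive(Debug, Default)]
pub struct ChunkList {
    chunks: VecDeque<Vec<u8>>,
    /// Read offset into the front chunk.
    front_pos: usize,
}

impl ChunkList {
    pub fn new() -> Self {
        Self::default()
    }

    /// Queues a chunk by taking ownership of it: the chunk's bytes are moved
    /// into the queue, not copied into one big buffer. Empty chunks are skipped.
    pub fn push_chunk(&mut self, chunk: Vec<u8>) {
        if !chunk.is_empty() {
            self.chunks.push_back(chunk);
        }
    }

    pub fn num_chunks(&self) -> usize {
        self.chunks.len()
    }

    /// Number of bytes that have not been read yet.
    pub fn remaining(&self) -> usize {
        let total: usize = self.chunks.iter().map(Vec::len).sum();
        total - self.front_pos
    }
}

impl Read for ChunkList {
    /// Copies bytes out of the queued chunks in order, dropping each chunk
    /// once it has been fully consumed.
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        let mut written = 0;
        while written < out.len() {
            let front = match self.chunks.front() {
                Some(front) => front,
                None => break,
            };
            let front_len = front.len();
            let available = &front[self.front_pos..];
            let n = available.len().min(out.len() - written);
            out[written..written + n].copy_from_slice(&available[..n]);
            written += n;
            self.front_pos += n;
            if self.front_pos == front_len {
                self.chunks.pop_front();
                self.front_pos = 0;
            }
        }
        Ok(written)
    }
}
```

Bytes are copied only when a consumer reads them out, so a consumer that can work chunk by chunk never needs the whole body in one contiguous allocation.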

Make the UntypedBody extractor represent a bytestream that hasn't been
read yet. This allows us to extract the body as a Bytes, a BufList,
or any other stream one chooses.

One other change I made was to remove the nonexistent type parameter J
from UntypedBody<J> suggestions -- that didn't look right.

One consideration here is that BufList needs to be exposed as a type.
It's currently at 0.1 -- I could release a 1.0 if that would be helpful
as far as exposing in the API goes. What do you think?

Created using spr 1.3.4

sunshowers · 2023-01-06T06:20:46Z

It occurred to me that we could make into_stream return a CappedBytesStream adapter, which implements Stream and caps the body size at the max request bytes. To ignore the limit, you could just call into_inner() on the adapter. I think that might be the best way to do this.

In a followup we could then switch the current into_buf_list and into_bytes impls over to using that.
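
A minimal sketch of that adapter shape, under stated assumptions: std has no `Stream` trait, so this uses a synchronous `Iterator` as a stand-in, and `CappedChunks` and `CapExceeded` are hypothetical names made up for this example rather than the types in this PR. The point is only the shape: the wrapper enforces the cap by default, and `into_inner()` is the explicit opt-out.

```rust
/// Hypothetical error yielded once the running total exceeds the cap.
#[derive(Debug, PartialEq, Eq)]
pub struct CapExceeded {
    pub cap: usize,
}

/// Hypothetical adapter: a synchronous stand-in for a capped body stream.
pub struct CappedChunks<I> {
    inner: I,
    cap: usize,
    seen: usize,
    failed: bool,
}

impl<I> CappedChunks<I>
where
    I: Iterator<Item = Vec<u8>>,
{
    pub fn new(inner: I, cap: usize) -> Self {
        Self { inner, cap, seen: 0, failed: false }
    }

    /// Opts out of the limit by handing back the wrapped source.
    pub fn into_inner(self) -> I {
        self.inner
    }
}

impl<I> Iterator for CappedChunks<I>
where
    I: Iterator<Item = Vec<u8>>,
{
    type Item = Result<Vec<u8>, CapExceeded>;

    fn next(&mut self) -> Option<Self::Item> {
        // After reporting the error once, the adapter yields nothing more.
        if self.failed {
            return None;
        }
        let chunk = self.inner.next()?;
        self.seen = self.seen.saturating_add(chunk.len());
        if self.seen > self.cap {
            self.failed = true;
            return Some(Err(CapExceeded { cap: self.cap }));
        }
        Some(Ok(chunk))
    }
}
```

Making the capped form the default and the uncapped form an explicit call is what makes the API harder to misuse: a caller has to write `into_inner()` to skip the limit.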

sunshowers · 2023-01-06T18:01:30Z

I switched over to CappedBodyStream/UncappedBodyStream, which I think expresses the intent of the API better and in a more misuse-resistant fashion. I also removed the read_http_body methods in favor of just using the body streams as a building block.

davepacheco

Sorry -- I'd written most of this up before your latest change.

davepacheco · 2023-01-06T17:39:40Z

dropshot/src/handler.rs

 #[derive(Debug)]
 pub struct UntypedBody {
-    content: Bytes,
+    request: Arc<Mutex<Request<Body>>>,


Is it okay to take this lock at the point where we take it? Previously, we read this whole thing earlier. I get why we don't do this now, but that means we're taking a lock later in the request processing. I thought we'd be holding the lock for longer, too, but I don't think that's true.

davepacheco · 2023-01-06T17:56:40Z

dropshot/src/http_util.rs

+/// # Errors
+///
+/// Errors if the body length exceeds the given cap.
+pub async fn http_read_body_bytes<T>(


Should we just use BufList internally and only turn that into a String (or Bytes I guess) if we really need it? I think the main use case where we want a String is because we're going to parse it with serde. Is it a lot more efficient to read it to a Bytes first than a BufList?

ahl · 2023-01-06T19:28:55Z

dropshot/src/handler.rs

+            let mut request = self.request.lock().await;
+            let body = request.body_mut();
+
+            'outer: {


I know I said I'd reserve my feedback until you go through @davepacheco's comments, but free advice on this: it might be less work to defer the use of relatively new structures rather than potentially impacting some dropshot consumer at Oxide that's on an older Rust version. My personal threshold is about 4-6 months in terms of my willingness to just hope that folks are up-to-date.

sunshowers · 2023-01-06T20:37:00Z

Discussing this with @davepacheco and @ahl, will mark this as draft until we come to a consensus.

ahl · 2023-03-13T15:55:43Z

should we close this PR?

sunshowers · 2023-03-13T19:21:22Z

Yes, will re-do this per RFD 353 later.

sunshowers · 2023-03-22T18:37:45Z

(re-did in #617)

sunshowers added 2 commits January 5, 2023 12:27

[𝘀𝗽𝗿] initial version

0072158

[𝘀𝗽𝗿] changes to main this commit is based on

bb88bc8

sunshowers requested review from ahl, davepacheco and smklein January 5, 2023 20:27

sunshowers mentioned this pull request Jan 5, 2023

Add a new ChunkedBody extractor #541

Closed

1 task

sunshowers added 2 commits January 5, 2023 15:19

Use async-stream rather than tokio-stream

cc65ece

Read trailers

a654e72

Simplify implementation a little

e953952

sunshowers added 2 commits January 6, 2023 10:03

[𝘀𝗽𝗿] changes introduced through rebase

a63ffee

Rebase

4d70a95

sunshowers changed the base branch from sunshowers/spr/main.make-untypedbody-be-able-to-extract-to-a-buflist to main January 6, 2023 18:03

davepacheco reviewed Jan 6, 2023

Fix rebase

01ec9c0

ahl reviewed Jan 6, 2023

sunshowers marked this pull request as draft January 6, 2023 20:36

sunshowers added 3 commits January 6, 2023 13:13

Putting this up for discussion

d3d93a4

Simplify into_buf_list

41667a8

Remove UncappedBodyStream, rename Capped to UntypedBodyStream

3f77402

This was referenced Jan 6, 2023

Test that max request body size is enforced for chunked encoding #545

Open

[draft] straw proposal to remove the request from RequestContext #548

Closed

davepacheco mentioned this pull request Jan 10, 2023

provide access to raw hyper::Request via an extractor #555

Closed

sunshowers closed this Mar 13, 2023

sunshowers deleted the sunshowers/spr/make-untypedbody-be-able-to-extract-to-a-buflist branch March 22, 2023 18:37
