Should serving one range entail serving all requested ranges? #445

Closed
quasicomputational opened this issue Sep 1, 2020 · 7 comments · Fixed by #509

@quasicomputational

https://httpwg.org/http-core/draft-ietf-httpbis-semantics-latest.html#rfc.section.9.3

> If all of the preconditions are true, the server supports the Range header field for the target resource, and the specified range(s) are valid and satisfiable (as defined in Section 7.1.4.2), the server SHOULD send a 206 (Partial Content) response with a payload containing one or more partial representations that correspond to the satisfiable ranges requested.

This seems to place a requirement on the server to serve enough of the representation that all satisfiable ranges requested are fulfilled.

But that's actually a burden and makes range requests not as useful as they could be: a server might well not want to implement multipart/byteranges, which is an awkward format that can't be streamed without prior knowledge of what strings are safe to use as separators; responding with a single range is much simpler to implement.

If all satisfiable ranges must be served, then, without multipart/byteranges, the only compliant options for responding to a range request with large distances between the requested ranges involve sending all of the bytes in between.
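
To illustrate the separator problem mentioned above, here is a minimal sketch of assembling a multipart/byteranges body. The function name `multipart_byteranges` and its parameters are hypothetical, chosen for illustration only, and the collision check is deliberately conservative. The point is that every part has to be scanned for the delimiter before it can be emitted safely.

```python
def multipart_byteranges(data, ranges, media_type, boundary):
    """Build a multipart/byteranges body for inclusive (first, last) ranges.

    Hypothetical helper: `data` is the complete representation as bytes and
    `boundary` is an ASCII string chosen by the caller.
    """
    total = len(data)
    delimiter = b"--" + boundary.encode("ascii")
    body = bytearray()
    for first, last in ranges:
        chunk = data[first:last + 1]
        # Conservative check: reject any occurrence of "--boundary" in the
        # payload. This requires seeing the whole chunk before sending it.
        if delimiter in chunk:
            raise ValueError("boundary collides with payload bytes")
        body += delimiter + b"\r\n"
        body += f"Content-Type: {media_type}\r\n".encode("ascii")
        body += f"Content-Range: bytes {first}-{last}/{total}\r\n\r\n".encode("ascii")
        body += chunk + b"\r\n"
    body += delimiter + b"--\r\n"  # closing delimiter
    return bytes(body)
```

A server that picks a long random boundary makes a collision very unlikely, but it cannot rule one out without inspecting the bytes it is about to send.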

Even reading the text as meaning that servers MUST NOT serve only a portion of the satisfiable ranges, client behaviour when faced with a misbehaving server is under-specified; the sensible options are re-requesting unfulfilled ranges, retrying without ranges, or signalling an error. Retrying without ranges may be wasteful, and raising an error seems overly fussy - and, if either of those is the better or even necessary option, that definitely ought to be called out. I'd be surprised if retrying without ranges or erroring was needed, because section 10.3.7.3 (Combining Parts) already deals with combining multiple 206 responses together.

So, I think it'd make sense to make two changes here:

  • An explicit SHOULD on clients receiving a 206 to check the ranges they actually received and to make further range requests for any ranges they still need which weren't included despite being satisfiable. Clients can determine this because Content-Range includes enough information to tell which ranges were unsatisfiable and which were simply not fulfilled.
  • Explicitly say that servers MAY serve only a portion of the satisfiable ranges.
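
The client-side check proposed in the first bullet could look roughly like the sketch below. The names `unfulfilled` and `range_header` are hypothetical, and the sketch assumes the requested ranges have already been resolved to absolute, satisfiable byte positions (for example, using the complete length reported in Content-Range).

```python
def unfulfilled(requested, received):
    """Return the parts of `requested` not covered by `received`.

    Both arguments are lists of inclusive (first, last) byte positions.
    """
    missing = []
    for first, last in requested:
        cursor = first  # first byte of this range not yet known to be covered
        for r_first, r_last in sorted(received):
            if r_last < cursor or r_first > last:
                continue  # no overlap with the part still outstanding
            if r_first > cursor:
                missing.append((cursor, r_first - 1))
            cursor = max(cursor, r_last + 1)
            if cursor > last:
                break
        if cursor <= last:
            missing.append((cursor, last))
    return missing


def range_header(ranges):
    """Format inclusive byte ranges as a Range header field value."""
    return "bytes=" + ",".join(f"{first}-{last}" for first, last in ranges)
```

For example, if the client asked for 10-100, 500-1000 and 5000-6000 and the 206 response only carried 10-100, the follow-up request would use `range_header(unfulfilled(requested, received))`.
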
@royfielding
Member

It says "one or more" for exactly that reason. A requirement to send all would have to say ALL, so this is clearly not the case.

@quasicomputational
Author

Wouldn't the text then be

> a payload containing partial representations that correspond to one or more of the satisfiable ranges requested.

or something very similar? As presently written, I can't come up with a parse that unambiguously means that some of the satisfiable requested ranges may not be fulfilled - and, in any case, I do think the client behaviour on ranges not being fulfilled needs to be specced a bit more.

@mnot
Member

mnot commented Sep 2, 2020

I read the current text as implying that in normal operation, the server will send all of the requested data back in some form; it might be in one range or multiple ranges, depending on its preferences and capabilities.

@quasicomputational AIUI you want to rewrite this so that when a client sends (for example) requests for ranges 10-100, 500-1000 and 5000-6000, the server can just choose to send one (or two, in theory) of those ranges back, rather than all of them.

That might be easier for the server in a number of ways, but it relies on the client recognising that the missing ranges need to be re-requested, one-by-one; effectively, it's a new protocol.

As such, if you want to enable this new pattern, I think you'd need to signal support for it in some fashion from the client. That's out of scope for this spec effort; it would be an extension.
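
For reference, the example request above would carry `Range: bytes=10-100,500-1000,5000-6000`. The following is a rough sketch of how a server might resolve such a header against a representation of known length. The function name `parse_byte_ranges` is hypothetical, and the sketch only handles the `bytes` unit.

```python
def parse_byte_ranges(header, complete_length):
    """Resolve a Range header value into satisfiable inclusive byte ranges.

    Hypothetical helper: returns a list of (first, last) tuples, clamped to
    the representation, with unsatisfiable ranges dropped.
    """
    unit, sep, spec = header.partition("=")
    if not sep or unit.strip().lower() != "bytes":
        raise ValueError("expected a 'bytes=' range specifier")
    satisfiable = []
    for part in spec.split(","):
        first, dash, last = part.strip().partition("-")
        if not dash:
            raise ValueError(f"malformed range spec: {part!r}")
        if first == "":
            # Suffix form "-N": the final N bytes of the representation.
            start = max(0, complete_length - int(last))
            end = complete_length - 1
        else:
            start = int(first)
            if last and int(last) < start:
                raise ValueError("last position precedes first position")
            # An open-ended "N-" or an overlong end is clamped to the last byte.
            end = min(int(last), complete_length - 1) if last else complete_length - 1
        if start <= end:
            satisfiable.append((start, end))
    return satisfiable
```

Whether the server then answers with every range in that list or only some of them is exactly the question under discussion here.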

@quasicomputational
Author

> @quasicomputational AIUI you want to rewrite this so that when a client sends (for example) requests for ranges 10-100, 500-1000 and 5000-6000, the server can just choose to send one (or two, in theory) of those ranges back, rather than all of them.

That is indeed what I'd like to have. I think there's definitely still room for at least clarifying the situation, given that two editors here read the text in two contradictory ways!

> That might be easier for the server in a number of ways, but it relies on the client recognising that the missing ranges need to be re-requested, one-by-one; effectively, it's a new protocol.
>
> As such, if you want to enable this new pattern, I think you'd need to signal support for it in some fashion from the client. That's out of scope for this spec effort; it would be an extension.

Hmm, I don't quite agree. A well-behaved server would want to see support signalled before leaving some ranges unfulfilled and I agree that that would be a new protocol, but a malfunctioning or badly-behaved server might be ignoring the send-all-ranges requirement - I think that's especially likely to be happening in the wild given that the spec here is hard to interpret. Clients will still have to deal with that, and the spec can at least list reasonable options (re-request satisfiable-but-unfulfilled ranges, re-request the whole thing, error).

Here's a new straw suggestion that (I think) doesn't change the required behaviours, just formalising and clarifying exactly what's required:

  • An explicit MUST on servers to return enough of the representation to fulfil all satisfiable ranges.

  • An explicit SHOULD on clients to check the Content-Range values sent back, in case a server is misbehaving, with the three sensible options after detecting misbehaviour listed as MAY.

@royfielding
Member

In normal operation, a server will try to send all of the requested information just for the sake of avoiding another request, unless it doesn't want to for its own reason that we have no need to describe.

A 206 response is self-descriptive. A client cannot interpret the response differently than how the response describes itself. The client cannot assume it has received all the ranges it requested. There is no need to require that a server send all requested parts. There is no reason for a client to assume it would receive all of the parts without looking at the response. Hence, this is not subject to an interop requirement.

The text I'd consider adding is a note at the end of that paragraph, like

   This does not imply that a server will send all requested ranges: the
   client &MUST; inspect a 206 response's Content-Type and
   Content-Range field(s) to determine what parts are enclosed.
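
The inspection described in that note could be sketched as follows. The names `parse_content_range` and `enclosed_ranges` are hypothetical, and for the multipart case the sketch assumes some multipart parser (not shown) has already split the body and produced one header dict per part.

```python
import re

# "bytes first-last/complete-length", where the length may be "*" if unknown.
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$", re.IGNORECASE)


def parse_content_range(value):
    """Return (first, last, complete_length or None) from a Content-Range value."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    complete = None if match.group(3) == "*" else int(match.group(3))
    return first, last, complete


def enclosed_ranges(headers, parts=()):
    """List the inclusive byte ranges a 206 response says it encloses.

    `headers` maps lowercase field names to values. `parts` is only used for
    multipart/byteranges and holds one such dict per body part (assumed to
    come from a separate multipart parser).
    """
    media_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if media_type == "multipart/byteranges":
        return [parse_content_range(p["content-range"])[:2] for p in parts]
    return [parse_content_range(headers["content-range"])[:2]]
```

The result can then be compared with what the client asked for, rather than assuming the response mirrors the request.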

@royfielding
Member

Alternatively, just add that requirement to the section on 206.

mnot added the semantics label Sep 3, 2020

@MikeBishop
Contributor

> > @quasicomputational AIUI you want to rewrite this so that when a client sends (for example) requests for ranges 10-100, 500-1000 and 5000-6000, the server can just choose to send one (or two, in theory) of those ranges back, rather than all of them.
>
> That is indeed what I'd like to have. I think there's definitely still room for at least clarifying the situation, given that two editors here read the text in two contradictory ways!

I think the most compliant server response, if it doesn't support multipart/byteranges, is to respond with the range 10-6000. Servers are allowed to send different ranges than the client asked for, but I think a server omitting some of the requested ranges can't really claim to be "successfully fulfilling a range request." The server's response needs to include the bytes the client asked for.

That said, I think something like @royfielding's suggested text would be helpful. I've seen clients get tripped up before when the server produces a different set of ranges than those requested, either by extending a requested range or by coalescing ranges. The suggested text looks like a rewording of the text at the end of 10.3.7.2, which would be appropriate to either replicate in 10.3.7.1 (you can't expect exactly the range you requested) or move up to 10.3.7.
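
As a rough illustration of the single-range response suggested above, the sketch below computes the covering range and how many unrequested bytes it carries. The names `covering_range` and `extra_bytes` are hypothetical, and the byte count assumes the requested ranges do not overlap.

```python
def covering_range(ranges):
    """Smallest single inclusive range containing every requested range."""
    first = min(a for a, _ in ranges)
    last = max(b for _, b in ranges)
    return first, last


def extra_bytes(ranges):
    """Bytes sent beyond what was requested when answering with one range.

    Assumes the requested ranges are inclusive and non-overlapping.
    """
    first, last = covering_range(ranges)
    sent = last - first + 1
    wanted = sum(b - a + 1 for a, b in ranges)
    return sent - wanted
```

For the ranges 10-100, 500-1000 and 5000-6000, the covering range is 10-6000: 5991 bytes are sent to deliver the 1593 that were asked for, which is the cost the original report was worried about.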

4 participants