#822 new defect

Web API should use a more reliable, out-of-band means of reporting errors (such as a server connection being lost) during a download

Reported by: davidsarah Owned by:
Priority: major Milestone: soon
Component: code-frontend-web Version: 1.5.0
Keywords: integrity error http download Cc:
Launchpad Bug:

Description

The discussion of bug #698 (which turned out to be a Firefox bug) turned up a potential integrity problem that can occur if a server connection is lost in the middle of a download via the WUI:

http://allmydata.org/trac/tahoe/ticket/698#comment:1

The first thing that comes to mind is that a server connection could have been lost in the middle of the download (in this case, after we've retrieved the UEB and some of the hashes, but before we've retrieved the first data block). The web server has to commit to success (200) or failure (404 or 500 or something) before it starts sending any of the plaintext, but it doesn't want to store the entire file either. So it bases the HTTP response code upon the initial availability of k servers, and hopes they'll stick around for the whole download.

When we get a "late failure" (i.e. one of the servers disconnects in the middle), the webapi doesn't have a lot of choices. At the moment, it emits a brief error message (attached to whatever partial content has already been written out), then drops the HTTP connection, and hopes that the client is observant enough to notice that the number of received bytes does not match the previously-sent Content-Length header, and then announce an error on the client side.

If the application doing the fetch (perhaps the browser, perhaps TiddlyWiki itself?) doesn't strictly check the Content-Length header, then it could get partial content without an error message.
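As a rough illustration of the strict check described above, here is a minimal Python sketch of a client that counts the bytes it actually receives and compares them against the advertised Content-Length. The names `read_body_strict` and `ShortRead` are hypothetical and are not part of Tahoe or of any HTTP library; `response` is assumed to be any file-like object with a `read()` method.

```python
import io


class ShortRead(Exception):
    """Raised when the body length differs from the advertised Content-Length."""


def read_body_strict(response, content_length, bufsize=4096):
    """Read a body until EOF and insist that its length matches content_length.

    `response` is any file-like object with read(); this is a hypothetical
    helper for illustration, not an existing client API.
    """
    received = bytearray()
    while True:
        block = response.read(bufsize)
        if not block:
            break  # EOF: the peer closed the connection or the body ended
        received.extend(block)
    if len(received) != content_length:
        raise ShortRead(
            "expected %d bytes, received %d" % (content_length, len(received))
        )
    return bytes(received)
```

A client that skips the final comparison would return the truncated bytes as if they were the whole file, which is the failure mode this paragraph describes.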

There are two directions to fix this:

  • change the webapi to use "Chunked Encoding", basically delivering data one segment at a time, possibly giving the server a chance to emit an error header in between segments: this would let us respond better to these errors
  • fix the other download-should-be-better tickets (#193, #287) to tolerate lost servers better, which might reduce the rate at which these errors occur
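To make the first option more concrete, the following Python sketch shows the framing that HTTP/1.1 chunked transfer coding uses: each chunk is a hexadecimal size line followed by the payload, and a zero-length chunk marks the end of the body. If the gateway hits a late failure and closes the connection without sending that final chunk, a client that parses the framing can tell the body is incomplete. The helper names (`encode_chunk`, `decode_chunked`, `TruncatedBody`) are hypothetical, and the decoder is simplified: it ignores trailer headers and assumes the whole stream is already in memory.

```python
LAST_CHUNK = b"0\r\n\r\n"  # zero-length chunk: marks a complete body


class TruncatedBody(Exception):
    """Raised when the stream ends before the terminating zero-length chunk."""


def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: hex size, CRLF, payload, CRLF."""
    return b"%x\r\n%s\r\n" % (len(data), data)


def decode_chunked(stream: bytes) -> bytes:
    """Decode a chunked body held in memory.

    Simplified sketch: trailers after the last chunk are ignored.
    """
    out = []
    pos = 0
    while True:
        eol = stream.find(b"\r\n", pos)
        if eol < 0:
            raise TruncatedBody("stream ended before a chunk-size line")
        # Drop any chunk extensions after ';' and parse the hex size.
        size = int(stream[pos:eol].split(b";")[0], 16)
        pos = eol + 2
        if size == 0:
            return b"".join(out)  # saw the terminator: body is complete
        chunk = stream[pos:pos + size]
        if len(chunk) < size or stream[pos + size:pos + size + 2] != b"\r\n":
            raise TruncatedBody("stream ended inside a chunk")
        out.append(chunk)
        pos += size + 2
```

For example, `decode_chunked(encode_chunk(b"segment") + LAST_CHUNK)` returns the payload, while the same stream without `LAST_CHUNK` raises `TruncatedBody`. Whether typical browsers surface that condition to the user is a separate question that this sketch does not answer.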

As pointed out in http://allmydata.org/pipermail/tahoe-dev/2009-May/001724.html , it is possible that the length delivered so far plus the length of the error message coincidentally equals the expected file length. So even for a web client that diligently checks the Content-Length, there might not be enough information to detect an error. An attacker might try to force this situation (I don't know what their chance of success would be, but probably much higher than trying to attack the crypto).
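The coincidence can be shown with a small Python sketch. It models the current behaviour as "partial plaintext followed by an in-band error message" and picks the failure offset at which the total length equals the advertised Content-Length. The function names, the error message text, and the 1000-byte file are all made up for illustration; the real message the webapi emits may differ.

```python
def gateway_body(plaintext: bytes, fail_at: int, error_msg: bytes) -> bytes:
    """Bytes the client sees if the download fails after fail_at bytes and
    the gateway appends error_msg in-band before dropping the connection."""
    return plaintext[:fail_at] + error_msg


def length_check_passes(body: bytes, content_length: int) -> bool:
    """The only check a diligent client can make without out-of-band help."""
    return len(body) == content_length


plaintext = bytes(1000)  # hypothetical 1000-byte file of zero bytes
error_msg = b"download failed: lost connection to server"  # hypothetical text
content_length = len(plaintext)

# A failure at exactly this offset makes the corrupted body the right length.
bad_offset = content_length - len(error_msg)
body = gateway_body(plaintext, bad_offset, error_msg)
```

Here `length_check_passes(body, content_length)` is true even though `body` differs from `plaintext`, so a length check alone cannot distinguish this response from a successful download.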

In any case, the WUI is currently using in-band error reporting, which is problematic because the error message will be treated as data of whatever format the client thinks the content has. This is an integrity issue because the download from the gateway to the client has no cryptographic integrity checking.

To close this bug, find and implement some way to make typical web clients reliably report an error when a download fails part-way through. Alternatively, prove that it isn't possible, and document this as an inherent limitation of the WUI.

Change History (10)

comment:1 Changed at 2009-10-29T05:58:59Z by davidsarah

An alternative to Chunked Encoding that is worth considering is to use HTTP-over-TLS for the WUI (since TLS does have the ability to report errors out-of-band that clients "MUST" pay attention to, although I don't know how these are displayed to the user).

Note that the TLS handshake would only occur once per client of a given gateway, so the main performance impact would only be the session encryption/MAC. OTOH, we'd probably have to use self-signed certificates, which throw up very ugly warnings in recent browsers.

comment:2 Changed at 2009-10-29T06:14:05Z by davidsarah

Using SSL/TLS might in principle also have helped with #127 (cap leakage via Referer headers), but see http://allmydata.org/trac/tahoe/ticket/127#comment:13

comment:3 Changed at 2009-11-02T08:17:53Z by warner

Yeah, I think chunked encoding is the way to go.

comment:4 Changed at 2009-12-23T20:35:32Z by davidsarah

  • Summary changed from WUI should use a more reliable, out-of-band means of reporting errors when a server connection is lost during a download to Web API should use a more reliable, out-of-band means of reporting errors when a server connection is lost during a download

comment:5 Changed at 2009-12-23T20:43:19Z by davidsarah

  • Priority changed from major to critical

Bumping to critical -- I hadn't realized when I reported this ticket that it applies to all front-ends, including the CLI (since it depends on the webapi), not just the WUI.

We should probably do the TLS alert thing anyway when the webapi is running over TLS. Does twisted's TLS API make that reasonably easy?

Last edited at 2011-07-21T19:41:02Z by davidsarah

comment:6 Changed at 2009-12-24T21:51:44Z by warner

I've got no idea. Most of the TLS glue code in Python is synchronous, so Twisted has to do some weird acrobatics to present it in a more async form to the higher layers. So there might be places where, e.g., a TLS alert might not get reported until some other (normal) piece of data came through behind it. I also haven't ever seen an interface for triggering or reading a TLS alert.

I don't actually know what the TLS alerts are meant for, or what sorts of information they could convey. In this case, we've committed to a successful response with a certain length, and after the fact, we discover that we can't provide data beyond a certain point. What sort of TLS alert would be appropriate in this case? How would a typical client (i.e. web browser) react to this? And how easy is it for, say, libopenssl- or gnutls-using code to see these alerts? The latter question translates into "how many TLS-speaking applications are likely to react correctly?". Ideally something like 'wget' or the tahoe CLI tools should get an error indication and, e.g., delete their temporary output file instead of closing it normally.
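The "delete the temporary output file instead of closing it normally" behaviour could look roughly like the Python sketch below. It assumes the download layer raises an exception when it detects a failure (by whatever mechanism this ticket settles on); `save_download` and the `.partial-` prefix are hypothetical names and do not describe the existing tahoe CLI code.

```python
import os
import tempfile


def save_download(chunks, dest_path):
    """Write chunks to a temporary file next to dest_path.

    On success the temporary file is renamed over dest_path; if iterating
    over `chunks` raises (e.g. a detected download failure), the temporary
    file is deleted and the exception propagates, so no partial file is
    left under the final name. Hypothetical helper for illustration.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".partial-")
    try:
        with os.fdopen(fd, "wb") as f:
            for chunk in chunks:
                f.write(chunk)
        os.replace(tmp_path, dest_path)  # only reached on a complete download
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

This only helps if the failure actually reaches the client as an error, which is the open question in this ticket.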

I don't know that using SSL on a localhost connection is likely to work very well in practice. Self-signed certificates are a mess, and there's no CA that will sign for "localhost", and you really want to be running the tahoe client node locally. I still think that 127.0.0.1 is the right place for the webapi, in which case there's no point in encryption (anyone who can read packets off the loopback interface can also read your memory), and without a CA-signed certificate we can't get authentication out of a typical web browser. So, I don't see how we could really use it.

comment:7 Changed at 2010-05-01T23:44:26Z by davidsarah

  • Keywords error http download added
  • Summary changed from Web API should use a more reliable, out-of-band means of reporting errors when a server connection is lost during a download to Web API should use a more reliable, out-of-band means of reporting errors (such as a server connection being lost) during a download

Lost servers aren't the only reason why a download may fail.

comment:8 Changed at 2010-11-30T01:04:24Z by davidsarah

http://bugs.python.org/issue6312 is a bug in httplib (fixed in Python 2.7) that occurs when a HEAD response uses chunked encoding. Therefore we should only serve chunked encoding for GET requests.
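A minimal sketch of that rule in Python, assuming the file size is known up front: choose chunked framing only for GET and keep a plain Content-Length for everything else (notably HEAD). The function `framing_headers` is a hypothetical helper, not part of the webapi or of Twisted.

```python
def framing_headers(method: str, content_length: int) -> dict:
    """Pick body-framing headers for a download response.

    Chunked encoding is used only for GET, so that HEAD responses keep a
    plain Content-Length and do not trigger the httplib bug mentioned above.
    """
    if method.upper() == "GET":
        return {"Transfer-Encoding": "chunked"}
    return {"Content-Length": str(content_length)}
```

With this choice a GET response no longer carries Content-Length, so clients that want the size in advance would need to issue a HEAD request or obtain it some other way.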

comment:9 Changed at 2011-07-21T19:42:15Z by davidsarah

  • Milestone changed from undecided to soon

comment:10 Changed at 2012-11-13T23:26:47Z by zooko

  • Priority changed from critical to major