
Extend external interfaces for operation monitoring.

Reported by:	nejucomo	Owned by:	nejucomo
Priority:	minor	Milestone:	undecided
Component:	code	Version:	0.6.1
Keywords:		Cc:
Launchpad Bug:

Description

I'd like to see the external interfaces (REST, XML-RPC, foolscap) support operation monitoring, so that external clients can display operation progress and perhaps control the operations (cancel, for instance).

The REST API could include an operation id in response headers for new operations. Separate URLs could be used to query progress or list current operations.
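
To make the proposal concrete, here is a minimal sketch of the bookkeeping a node might keep behind such URLs. Everything in it is hypothetical: the `OperationRegistry` class, the `op-N` id format, and the field names are illustrations only and do not describe any existing Tahoe interface.

```python
import itertools
import threading


class OperationRegistry:
    """Hypothetical in-memory table of running operations, keyed by operation id."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = itertools.count(1)
        self._ops = {}

    def start(self, kind, total_bytes):
        """Register a new operation; the returned id could go in a response header."""
        with self._lock:
            op_id = "op-%d" % next(self._next_id)
            self._ops[op_id] = {
                "kind": kind,
                "done": 0,
                "total": total_bytes,
                "cancelled": False,
            }
        return op_id

    def update(self, op_id, done_bytes):
        """Record how many bytes the operation has processed so far."""
        with self._lock:
            self._ops[op_id]["done"] = done_bytes

    def cancel(self, op_id):
        """Mark the operation as cancelled; the worker is expected to check this flag."""
        with self._lock:
            self._ops[op_id]["cancelled"] = True

    def progress(self, op_id):
        """Return a copy of the operation's state, as a progress URL might serve it."""
        with self._lock:
            return dict(self._ops[op_id])

    def list_current(self):
        """Return the ids of all known operations, as a listing URL might serve them."""
        with self._lock:
            return sorted(self._ops)
```

A progress URL would then map to `progress(op_id)` and a listing URL to `list_current()`.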

Change History (6)

comment:1 follow-up: ↓ 3 Changed at 2007-10-20T21:05:46Z by zooko

Owner changed from somebody to nejucomo
Version changed from 0.4.0 to 0.6.1

Currently you can see progress of e.g. an upload (tahoe put) by observing how much of the file you have been able to upload. The server (which ideally is running on your localhost anyway) will not accept the file faster than it can encrypt, encode, and distribute the shares.

Likewise, the progress of download is apparent by how much of the leading cleartext segments of the files have been delivered to you. :-)

What do you think?

Oh, cancellation is implemented by closing the HTTP connection before you've finished up/down load.

I'm pretty pleased with this design so far...
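
A rough sketch of what this in-band approach looks like from the client side, assuming the HTTP client pulls the request body from a file-like object in chunks. `ProgressReader` and `drain` are hypothetical names for illustration; `drain` only stands in for the HTTP client so the example runs without a network.

```python
import io


class ProgressReader:
    """Wrap a file-like object and report how many bytes have been read from it."""

    def __init__(self, fileobj, total, on_progress):
        self._f = fileobj
        self._total = total
        self._sent = 0
        self._on_progress = on_progress

    def read(self, size=-1):
        chunk = self._f.read(size)
        if chunk:
            # The client only asks for more data once the server has accepted
            # the previous chunk, so bytes read approximates upload progress.
            self._sent += len(chunk)
            self._on_progress(self._sent, self._total)
        return chunk


def drain(reader, chunk_size=4):
    """Stand-in for an HTTP client pulling the request body chunk by chunk."""
    while reader.read(chunk_size):
        pass


if __name__ == "__main__":
    body = io.BytesIO(b"0123456789")
    drain(ProgressReader(body, 10, lambda sent, total: print(sent, "/", total)))
```

Cancelling in this model needs no extra API: the client stops reading and closes the connection.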

comment:2 Changed at 2008-06-01T21:19:51Z by warner

Milestone changed from eventually to undecided

comment:3 in reply to: ↑ 1 Changed at 2009-12-13T01:47:26Z by davidsarah

Replying to zooko:

Currently you can see progress of e.g. an upload (tahoe put) by observing how much of the file you have been able to upload. ... Likewise, the progress of download is apparent by how much of the leading cleartext segments of the files have been delivered to you. :-) Oh, cancellation is implemented by closing the HTTP connection before you've finished up/down load.

+1. The request in this ticket seems like unnecessary complexity. I suggest wontfix.

(Note that #92 is about showing upload progress/completion to the user in the WUI; that would still be useful.)

comment:4 follow-up: ↓ 6 Changed at 2009-12-13T03:59:38Z by zooko

So I definitely would have preferred the simplicity of using in-band progress indicators and cancellation as described in comment:1, but Brian persuaded me that this just wasn't good enough. The part of his argument that I remember being unable to counter was that we have some operations that take longer than an HTTP connection can reliably last. For example, suppose you want to do a deep-verify-and-repair, which is going to walk a large directory structure and download every bit of every share of every file and, if necessary, upload replacement shares. This could take days or weeks or months, and if your control of the process is a single HTTP connection then you're quite likely to suffer a network glitch which closes your TCP connection, or to encounter some kind of stupid timeout in an HTTP proxy or something.

(The way I like to think of this is that the comms abstraction of TCP is insufficiently robust -- there isn't a widely understood and implemented way to force your HTTP transaction to outlive temporary disconnections of the underlying TCP connection. That means that HTTP, while a wonderful lingua franca for some protocols, can't be used for long-running operations or operations which cannot be safely retried when the first try might or might not have failed to get through.)

So, Brian went ahead and invented "operation handles", documented here: docs/frontends/webapi.txt@4112#L203.
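
The client side of the handle approach can be sketched as a polling loop in which every poll is its own short HTTP request, so no single connection has to survive the whole operation. This is only a sketch under assumptions: `fetch_status` is a hypothetical callable that performs one status request and returns the parsed JSON as a dict, and the `finished` key is assumed here rather than taken from the docs linked above.

```python
import time


def poll_until_finished(fetch_status, interval=1.0, max_polls=100, sleep=time.sleep):
    """Call fetch_status() repeatedly until it reports a finished operation.

    fetch_status: hypothetical callable doing one short status request and
                  returning a dict; a truthy "finished" key is an assumption.
    sleep:        injectable so the loop can be exercised without waiting.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("finished"):
            return status
        # A failed or dropped request here costs one poll, not the whole operation.
        sleep(interval)
    raise TimeoutError("operation did not finish within %d polls" % max_polls)
```

The operation keeps running on the node between polls, which is what lets it outlive any individual TCP connection.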

Hm, reading those docs again, I see this new text:

Many "slow" operations can begin to use unacceptable amounts of memory when
operating on large directory structures. The memory usage increases when the
ophandle is polled, as the results must be copied into a JSON string, sent
over the wire, then parsed by a client. So, as an alternative, many "slow"
operations have streaming equivalents. These equivalents do not use operation
handles. Instead, they emit line-oriented status results immediately. Client
code can cancel the operation by simply closing the HTTP connection.
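
A minimal sketch of consuming such a stream, assuming each status line is one JSON value terminated by a newline (the exact line format is an assumption, and `iter_status_lines` is a hypothetical helper). It takes any iterable of byte chunks, so it can be exercised without a network.

```python
import json


def iter_status_lines(chunks):
    """Yield one parsed JSON value per complete line from an iterable of byte chunks."""
    buf = b""
    for chunk in chunks:
        buf += chunk
        # A chunk boundary can fall mid-line, so only complete lines are parsed.
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            if line.strip():
                yield json.loads(line.decode("utf-8"))
    # Accept a final line that arrives without a trailing newline.
    if buf.strip():
        yield json.loads(buf.decode("utf-8"))
```

Because results are handled one line at a time, the client never holds the whole result set in memory. To cancel, the caller stops iterating and closes the underlying connection.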

Oh dear, so it appears that neither the operation-handles nor the single HTTP connection is really good enough in all dimensions. Hm.

So what shall we do with this ticket? I guess we'll close it as "fixed", and then maybe open a new ticket saying "Make operation-handle-querying use only a little memory" and maybe open a new ticket saying "Invent robust HTTP so that streaming operations handles can be used on operations that last longer than a TCP connection lasts".

I'm not actually going to open either of those two tickets right now. I just took painkillers for my knee (recuperating from surgery).

If Brian, Nathan, or David-Sarah (or anyone) have any ideas on how to follow up on this, by all means post to the list or comment on this or some other ticket.

comment:5 Changed at 2009-12-13T03:59:45Z by zooko

Resolution set to fixed
Status changed from new to closed

comment:6 in reply to: ↑ 4 Changed at 2009-12-13T08:07:53Z by davidsarah

Replying to zooko:

So what shall we do with this ticket? I guess we'll close it as "fixed", and then maybe open a new ticket saying "Make operation-handle-querying use only a little memory"

This is #857.

and maybe open a new ticket saying "Invent robust HTTP so that streaming operations handles can be used on operations that last longer than a TCP connection lasts".

Not a Tahoe bug :-)