#152 new enhancement

build "sharing slots" / use mutable files as primitives for sharing messages

Reported by: warner
Owned by:
Priority: major
Milestone: undecided
Component: code-frontend
Version: 0.6.1
Keywords: performance newcaps revocation
Cc:
Launchpad Bug:

Description

We were talking with Peter yesterday about what sort of sharing UI he'd like to use. In exchanging documents with a colleague, he said he'd like to take the spreadsheet that he's editing and push a button that says "Share This File", and immediately get a window with a string that he can IM or email to somebody. He doesn't want to wait for a file to finish uploading or even encoding, because he wants to be able to walk away from the process once he's IM'ed this string to his friend.

We can do this. The requirement is that his computer stays online until the upload finishes, and the caveat is that his friend might not be able to download the file right away (for example, if he uses the IM'ed string too quickly). If the download is not yet available, the friend should get an ETA or some sort of progress message to let them know when they should start downloading it, so that they can plan *their* time ("do I go get coffee, or go out to lunch, or come back tomorrow?").

To build this, I'm thinking we start with an SSK-based mutable slot. The "Share This File" button creates an SSK slot, fills it with some starting data, and displays the SSK URI to the originating user. The slot is filled with:

  • suggested filename
  • file length
  • one of:
    • "upload in progress"
      • bytes uploaded so far
      • ETA
    • "complete"
      • CHK URI

The originating client will modify the SSK slot every once in a while (perhaps once every 10 to 60 seconds?) to update the ETA, and will eventually fill in the URI.
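As a rough illustration of the slot contents listed above, here is a minimal Python sketch of the record the originating client might write on each update. The JSON encoding, the field names, and the `make_slot_record` helper are all assumptions made for this example, not an existing format; the ETA is a simple linear extrapolation from the average upload rate so far.

```python
import json

def make_slot_record(filename, file_length, bytes_uploaded,
                     started_at, now, chk_uri=None):
    """Build the text that would be written into the mutable slot.

    Field names and the JSON encoding are hypothetical.
    """
    record = {"filename": filename, "length": file_length}
    if chk_uri is not None:
        # Upload finished: publish the CHK URI and drop the progress fields.
        record["status"] = "complete"
        record["chk_uri"] = chk_uri
        return json.dumps(record, sort_keys=True)
    record["status"] = "upload in progress"
    record["bytes_uploaded"] = bytes_uploaded
    elapsed = now - started_at
    if bytes_uploaded > 0 and elapsed > 0:
        # Linear extrapolation from the average rate observed so far.
        rate = bytes_uploaded / elapsed
        record["eta_seconds"] = round((file_length - bytes_uploaded) / rate)
    else:
        record["eta_seconds"] = None  # no data yet, so no estimate
    return json.dumps(record, sort_keys=True)
```

The client would call this every 10 to 60 seconds with fresh counters and overwrite the slot, then call it one last time with the CHK URI once the upload completes.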

The recipient's GUI should accept an SSK URI (with some framing information to suggest that it is filled with data in this format) and read the slot to see whether the file is available yet or not. There should be a "Retrieve Shared File" button to which you can paste or drag the SSK URI, and it either produces a window with "waiting for upload to complete: NN%, ETA XX", or "downloading: NN%, ETA XX", or a file icon ready to be dragged somewhere.
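The recipient side could turn the slot contents into one of those three states with something like the sketch below. It assumes the slot holds a JSON record with hypothetical `status`, `length`, `bytes_uploaded`, `eta_seconds`, and `filename` fields; `describe_slot` is an illustrative name, and the download ETA is left out for brevity.

```python
import json

def describe_slot(slot_text, bytes_downloaded=None):
    """Map the slot contents to the message the recipient's GUI would show.

    The record layout is an assumption made for this sketch.
    """
    record = json.loads(slot_text)
    length = record["length"]
    if record["status"] == "upload in progress":
        percent = 100 * record["bytes_uploaded"] // length if length else 0
        eta = record.get("eta_seconds")
        eta_text = "unknown" if eta is None else "%ds" % eta
        return "waiting for upload to complete: %d%%, ETA %s" % (percent, eta_text)
    # Status is "complete": the CHK URI is available, so report download progress.
    if bytes_downloaded is None or bytes_downloaded < length:
        percent = 100 * (bytes_downloaded or 0) // length if length else 0
        return "downloading: %d%%" % percent
    return "ready: %s" % record["filename"]
```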

These SSK slots should expire after a while, maybe a week or a month (perhaps the "share this file" button should have an option somewhere to specify how long the file will be available). The CHK file needs to last at least the same duration, so perhaps it needs an extra purely-time-based lease (still accounted to the originator, but not cancelled if they remove the file from their vdrive (or never added it in the first place)).

Change History (7)

comment:1 Changed at 2007-09-27T21:11:16Z by zooko

How big are big spreadsheets? I have some small spreadsheets that are about 20 KB. If a file is less than a couple hundred KB, the upload of the file itself might complete faster than Peter can cut-and-paste the string and IM it to his friend. (Back-of-envelope 1s per file plus 23 KB/s, so maybe 2 seconds for a 40 KB file.)
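The back-of-envelope model above (a fixed per-file cost plus a constant transfer rate) is easy to play with. The function name and the sample sizes below are just for illustration; the 1 s overhead and 23 KB/s rate are the figures from this comment.

```python
def estimated_upload_seconds(size_kb, per_file_overhead_s=1.0, rate_kb_per_s=23.0):
    """Fixed per-file overhead plus transfer time at a constant rate."""
    return per_file_overhead_s + size_kb / rate_kb_per_s

# Print the estimate for a few file sizes (sizes chosen arbitrarily).
for size_kb in (20, 40, 200):
    print("%4d KB: %.1f s" % (size_kb, estimated_upload_seconds(size_kb)))
```

Under this model a 40 KB file takes a little under 3 seconds, and a couple hundred KB takes on the order of 10 seconds.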

But for sufficiently large files, this feature sounds cool.

Hm, actually, why doesn't his friend start downloading the file before Peter's computer has finished uploading the file? So the progress meter isn't telling you how far to go until you can start downloading, it is telling you how far to go until the file is completely downloaded. Also if the file is useful when incomplete (such as a movie or audio file), then the friend can start using it as soon as Peter's computer starts uploading it.

comment:2 Changed at 2007-09-28T01:56:39Z by warner

I guess I'm assuming that Microsoft products are incapable of creating any file smaller than a few megabytes. I'm also assuming slow consumer-grade ADSL uplinks.

I'd think that the user should be able to wait up to, say, 15 seconds (from the time they push the button to the time they get an IM-able string). If it's less than 2 seconds, then it will feel like their file is being instantly transmitted, at least from the sender's point of view. The burden of waiting is really being transferred to their friend, but most of that latency is hidden from both parties by their own natural sloth :-). (the longer they procrastinate before pushing the "download this file" link, the better we look).

If the only thing we need to do is to generate a unique string (like a Storage Index), then we can respond in a few milliseconds. I think we should evaluate this time in absolute terms rather than how long it takes Peter to subsequently cut-and-paste the string, since Peter is waiting on *us* before that point, and only on himself after that point. I.e., he can't blame us for how long it takes *him* to manipulate his IM client.

Starting the download before the upload finishes would be really slick. It also won't work at all for our current CHK format, unless we allow the recipient to download unverified data and keep it quarantined somewhere until the hashes are uploaded and downloaded and checked. The CHK format has only one place for verification data (the UEB hash inside the URI), and we can't generate it until the very end.

Doing download-before-upload on SSK would need some clever work too.. like signing each segment separately. Or, we could make the validation section contain a hash tree over just the segments that have been encoded thus far, with a signature on the root. As we encode more segments, we keep replacing this tree with a larger one that covers more segments. When we finish uploading, we'll have a bunch of segments, a complete merkle tree of hashes (covering all segments), and a single signature on the root.
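To make the growing-hash-tree idea concrete, here is a small Python sketch. It is not the SSK format: SHA-256, the duplicate-last-node padding, and the `PartialUpload` / `verify_prefix` names are assumptions for illustration, and the signature over the root is left out entirely. A real design would also want distinct leaf and interior-node hash tags and per-segment proofs.

```python
import hashlib

def _h(data):
    return hashlib.sha256(data).digest()

def merkle_root(segment_hashes):
    """Root of a binary hash tree over the given leaf hashes."""
    if not segment_hashes:
        raise ValueError("need at least one segment")
    level = list(segment_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels (a simplification)
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

class PartialUpload:
    """Uploader side: the root is recomputed as each segment is encoded."""
    def __init__(self):
        self.segment_hashes = []
        self.root = None

    def add_segment(self, segment):
        self.segment_hashes.append(_h(segment))
        # In the scheme described above, the uploader would now sign this
        # root and replace the previously published tree and signature.
        self.root = merkle_root(self.segment_hashes)
        return self.root

def verify_prefix(segments, published_root):
    """Downloader side: check the segments received so far against the root."""
    return merkle_root([_h(s) for s in segments]) == published_root
```

Each call to `add_segment` yields a new root covering one more segment, so a downloader holding the latest signed root can validate everything uploaded so far without waiting for the end of the file.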

If this is an important use case, we should keep it in mind when we design the SSK format. We've talked in the past about designing SSKs that can handle large amounts of data (using FEC instead of simple replication); if we also design them to handle partial-upload (with the merkle tree and a variable number of segments), then we can implement this very nifty feature. (and if we do this, then the "sharing slot" might just be the SSK itself.. this would require a place to store "expected file size" or "expected number of segments", and then we'd probably need to put the suggested file name in the metadata that wraps the SSK URI and gets pasted or IM'ed to the recipient).

comment:3 Changed at 2007-10-21T01:53:46Z by zooko

  • Version changed from 0.5.1 to 0.6.1

Now we're designing SSKs, and I still think that this is a valuable use case, so I'm posting this comment to remind us to think about this while designing SSKs.

comment:4 Changed at 2008-03-28T19:44:21Z by warner

  • Summary changed from build "sharing slots" to build "sharing slots" / use mutable files as primitives for sharing messages

We were chatting with Ping at the hackfest last night, explaining how I was guessing that sharing would work, specifically the idea of having a pair-wise directory: when Alice wants to give something to Bob, she creates a new directory, links its write-cap to "outbox/to-Bob" in her vdrive, puts the file/files she wants to share in the dir, then mails him the directory's read-cap. Bob links the read-cap to "inbox/from-Alice". Then Alice can "revoke" the grant by just deleting the file from that directory, and she has a record of what she's shared.

Ping was surprised by the idea that we'd re-use this directory. He suggested that we treat the directory like a one-time "Purse" (from the Mint example, either from erights.org or Tyler's IOU protocol). The specific thing that he thought would be confusing was that Bob might come to assume that the file would remain forever in that inbox (that he "owns" the inbox), and therefore he would be upset if Alice removed something from his space. Likewise Bob might be upset to think that Alice could add things to his vdrive at will. Using the same directory for multiple files would increase the utility of this inbox, increasing the chances that Bob would keep using things in-place rather than copying them elsewhere, increasing the surprise/upset.

The other realization we had was that the #217 elliptic-curve DSA-based mutable files would have smaller write-caps than read-caps: with some tricks, we could get them down to 96 bits (plus prefix), so about 15 characters of base-62. If we use a separate mutable file per act of sharing, then we could give the recipient the full write-cap instead of the (longer) read-cap. Then we wouldn't need to treat the gift as a directory at all; we could just use it as a "channel" that the two parties can use to communicate about this gift.

For example, we could define a human-shareable cap format (i.e. printable, short enough to avoid wrapping, and with an http prefix) specifically for sharing things, with a prefix character of "S" (as opposed to "D" for directory and "F" for file). The rest of the cap would be a mutable-file write-cap, but the "S" would indicate that we want to treat the contents specially.

The contents would contain a message from the giver to the recipient. It would include a list of file/directory caps (with names) and the nickname of the sender; heck, it could include the public key of the sender, and the rest of the body could be signed (allowing the recipient to assign a petname to the sender). Higher-level code would accept the gift, look up the mutable file, read and parse the contents, then offer the user the choice of what to do with the gift. The response channel could just be writing a timestamp and a short note into the slot, saying "got it.. thanks". The revocation action would be to have the writer erase the slot, replacing it with a type byte that says "this gift was revoked" or something.
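A possible shape for that higher-level message, sketched in Python. Everything here is hypothetical: the JSON encoding, the field names, the use of a "type" field in place of a type byte, and the function names. The sender's public key and signature are omitted, and the cap strings would be whatever cap format ends up being used.

```python
import json

def make_gift(sender_nickname, items):
    """Giver: build the message; items is a list of (name, cap) pairs."""
    return json.dumps({
        "type": "gift",
        "sender": sender_nickname,
        "items": [{"name": name, "cap": cap} for name, cap in items],
        "response": None,
    }, sort_keys=True)

def read_gift(slot_text):
    """Recipient: return the (name, cap) pairs, or None if the gift was revoked."""
    message = json.loads(slot_text)
    if message["type"] == "revoked":
        return None
    return [(item["name"], item["cap"]) for item in message["items"]]

def acknowledge(slot_text, note, timestamp):
    """Recipient: write a timestamp and a short note back into the slot."""
    message = json.loads(slot_text)
    if message["type"] != "gift":
        raise ValueError("nothing to acknowledge")
    message["response"] = {"timestamp": timestamp, "note": note}
    return json.dumps(message, sort_keys=True)

def revoke():
    """Giver: replace the slot contents with a revocation marker."""
    return json.dumps({"type": "revoked"})
```

Because both parties hold the write-cap, each of these functions just produces the next contents of the same mutable file; nothing here enforces who is allowed to write what.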

The key insight is to use mutable files as a primitive, and to use higher-level protocols to generate and interpret their contents.

comment:5 Changed at 2008-06-01T20:58:26Z by warner

  • Milestone changed from eventually to undecided

comment:6 Changed at 2009-12-18T00:09:34Z by davidsarah

  • Keywords performance newcaps added
  • Priority changed from minor to major

This doesn't have to be restricted to mutable files; the ability to generate a file cap before the file has been fully uploaded has also been discussed for immutable files in the new cap protocol. That is possible if we use public key crypto for immutable files (the integrity and confidentiality of the file would still only depend on symmetric crypto). See http://allmydata.org/pipermail/tahoe-dev/2009-October/002962.html

comment:7 Changed at 2012-09-10T19:53:48Z by zooko

  • Keywords revocation added