[tahoe-dev] Tahoe-LAFS key management, part 2: Tahoe-LAFS is like encrypted git

Zooko Wilcox-O'Hearn zooko at zooko.com
Wed Aug 19 08:28:45 PDT 2009


Okay, in today's installment I'll reply to my friend Kris Nuttycombe,  
who read yesterday's installment and then asked how the storage  
service provider could provide access to the files without being able  
to see their filehandles and thus decrypt them.

I replied that the handle could be stored in another file on the  
server, and therefore encrypted so that the server couldn't see it.   
You could imagine taking a bunch of these handles -- capabilities to  
read an immutable file -- and putting them into a new file and  
uploading to the Tahoe-LAFS grid.  Uploading it would encrypt it and  
give you a capability to that new file.  The storage service provider  
wouldn't be able to read the contents of that file, so it wouldn't be  
able to read the files that it references.  This forms a  
"Cryptographic Hash Function Directed Acyclic Graph" structure, which  
should be familiar to many readers as the underlying structure in git  
[*].  Git uses this same technique of combining identification and  
integrity-checking into one handle.

 From this perspective, Tahoe-LAFS can be seen as "like git, and use  
the handle for encryption in addition to integrity-checking and  
identification".

(There are many other differences.  For starters git has a high- 
performance compression scheme and it has a decentralized revision  
control tool built on top.  Tahoe-LAFS has erasure-coding and a  
distributed key-value store for a backend.)

Okay, the bus is arriving at work.

Oh, so then Kris asked "But what about the root of that tree?".  The  
answer is that the capability to the root of that tree is not stored  
on the servers.  It is held by the client, and never transmitted to  
the storage servers.  It turns out that storage servers don't need  
the capability to the file in order to serve up the ciphertext.   
(Technically, they *do* need an identifier, and ideally they would  
also have the integrity-checking part so that they could perform  
integrity checks on the file contents (in addition to clients  
performing that integrity check for themselves).  So the capability  
gets split into its component parts during the download protocol,  
when the client sends the identification and integrity-checking bits  
to the server but not the decryption key, and receives the ciphertext  
in reply.)

Therefore the next layer up, whether another program or a human user,  
needs to manage this single capability to the root of a tree.  Here  
the abstraction-piercing problem of availability versus  
confidentiality remains in force, and different programs and  
different human users have different ways to manage their caps.  I  
personally keep mine in my bookmarks in my web browser.  This is  
risky -- they could be stolen by malicious Javascript (probably) or I  
might accidentally leak them in an HTTP Referer header.  But it is  
very convenient.  For the files in question I value that convenience  
more than an extra degree of safety.  I know of other people who keep  
their Tahoe-LAFS caps more securely, on Unix filesystems, on  
encrypted USB keys, etc..

Regards,

Zooko

[*] Linus Torvalds got the idea of a Cryptographic Hash Function  
Directed Acyclic Graph structure from an earlier distributed revision  
control tool named Monotone.  He didn't go out of his way to give  
credit to Monotone, and many people mistakenly think that he invented  
the idea.


More information about the tahoe-dev mailing list