[Tahoe-dev] decentralization, economics, attack resistance (was: wmf's reply to my reply)

zooko at zooko.com
Fri Jun 1 15:22:41 PDT 2007

wmf wrote:
> Hey Zooko, I saw the announcement and looked into it a bit, but as a  
> p2p-critic rather than a Tahoe developer I didn't see anything new  
> there compared to the good old Mojo Nation days.

and in a later letter, he wrote:

> I am missing the big picture about how Tahoe ends up being a better  
> backup system; are you going to run all the storage nodes yourself or  
> are you going to try to use customers for storage? If the former, why  
> build a P2P system rather than something S3-like (although maybe S3  
> is P2P internally)? If the latter, is there any chance of it actually  
> working? I guess I am interested in economics first and peer  
> selection algorithms second. This goes back to the "decline of  
> decentralization" discussion.

Dear wmf:

I've thought carefully about this.  Just to be explicit, I'm writing this
letter with my "Tahoe Hacker" hat on, not my "Allmydata, Inc. employee" hat on.
Since the entire Tahoe source code base is GPL'ed (plus 12 months grace
period), I can implement radical ideas of the sort that I propose below even if
Allmydata, Inc. goes out of business tomorrow.

But before the radical ideas, a few words about the necessary foundation:

That foundation is a working, if simple, system with an active community of
happy users.  The Tahoe project's first, most important user is the Allmydata
corporation.  The current version of Tahoe is almost ready to be deployed in
the first of Allmydata's use cases:

use case 1: Allmydata operating a storage grid on corporately owned servers (*)

But to have a real community requires more than one user, and I'm actively
working on making Tahoe useful for two other kinds of user:

use case 2: groups of friends who want shared backup and file-sharing.  Tahoe
is currently useful for this.  After some testing (e-mail me for the furl to
my private storage grid) and improvements (see the trac tickets for
candidates), we'll make a second release of Tahoe and encourage people to use
it for this purpose.

use case 3: system administrators who want to experiment with it for backup;
Tahoe is almost ready to be used for this (**).

In all three of these use cases, all of the nodes participating in the Tahoe
grid belong to the same trust domain: they are all owned and operated by one
company or by friends and family.

In the long run, if Tahoe is successful, people will want to use it to share
storage and data with people that they do not entirely trust.  Eventually there
will need to be a way to incentivize participants to honestly contribute value
in return for the value they receive.  Mojo Nation attempted to solve that
problem with a digital cash system and an automated market to buy and sell
services.  This attempt was both too ambitious (the automated market part of it
remains an unsolved problem) and too timid (the centralized nature of Mojo
Nation's digital currency meant that the economy could be centrally manipulated
and that it could not outlive the founding company).

For Tahoe, I want to invent something else: a truly decentralized economic
mechanism.  Research that points in this direction includes the sub-field of
"algorithmic mechanism design" within economic game theory, some peer-to-peer
research such as GNUnet, Wei Dai's and Nick Szabo's ideas about "bit gold",
Nick Szabo's "smart contracts", and much more.  Another inspiration is
BitTorrent's tit-for-tat mechanism, which is decentralized and minimal, but
gets the job done within its limited problem domain.

The motivation for a truly decentralized incentive system is two-fold, and both
reasons are unavoidable for large-scale, distributed, reliable systems:

reason 1: Such a system will comprise selfish or mutually distrusting groups.
If the system is to be useful to more than a single organization, it must
enable such selfish or mutually distrusting groups to cooperate without being
vulnerable to one another.

reason 2: Such a system will be attacked by intelligent, malicious, adaptive
enemies (see "Brave New War" by John Robb [1]).

So here is use case #5:

use case 5: a selfish, mutually distrusting ecosystem of people who want to
share a reliable, attack-resistant storage and data service.

What about use case #4?  That is the one that you asked about:

use case 4: Allmydata and its customers operating a storage grid including the
customers' computers.

It turns out that so far, use case 4 has more in common with use case 3 than
with use case 5.  Whereas Mojo Nation's users would simply connect, look for
files to download, and then leave, and were therefore extremely unreliable [2],
Allmydata's users sign up for a long-term contract to protect their data
against future loss.  So far, they are much more reliable, and none have
attempted to cheat their way out of their end of the contract.

Okay, I hope that this letter answers your question about why Tahoe is being
developed the way that it is.  Stay tuned for the second release of Tahoe and
for new ideas about how to enable selfish, mutually distrusting groups to form
a shared, attack-resistant computing platform.



(*)  The following two tickets and perhaps others have to be done before
Allmydata can start deploying Tahoe on their servers:


(**) At least one of the following two tickets has to be done before system
administrators can start experimenting with Tahoe for backup:


[1] http://amazon.com/exec/obidos/ASIN/0471780790
[2] http://citeseer.ist.psu.edu/734934.html
