[tahoe-dev] converting roadblocks into speedbumps -- Re: "Elk Point" design for mutable, add-only, and immutable files

Zooko O'Whielacronx zookog at gmail.com
Tue Oct 6 11:10:02 PDT 2009


Folks:

I'm finally getting caught up on the Elk Point design.  (David-Sarah:
where were you when you first thought of it?  Because the tradition is
to name a design after the geographic location where it was
conceived.)

I have several things to contribute to the discussion.  This note is
about "how to convert roadblocks into speedbumps".  This is a possible
feature that could be added to the NewCap design to reduce the number of
"roadblock prevention bits" in caps.  This feature may not turn out to
be needed or to be worth its cost in complexity and efficiency, but I
want to document it as a tool in our design toolbox.

As you know, a "roadblock attack" is a kind of denial-of-service where
the attacker knows or guesses the storage index that you are going to
use to upload your file, and it persuades a storage server to put
something else into that storage index before you upload your share.
If the part of the storage index which the storage server can verify
(labelled 't' in the Elk Point diagrams [1]) is short enough, then an
attacker can use brute force to generate a share which matches t, and
the server won't be able to tell that it isn't a legitimate share for
that entire storage index (t, S).
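
To make the cost concrete, here is a toy Python sketch of the
attacker's job.  It assumes, purely for illustration, that t is just
the leading t bits of a hash over the uploaded bytes; the actual Elk
Point derivation of t may differ, but the 2^t work factor is the
point.

import hashlib, os

T_BITS = 20  # tiny, so the demo finishes; a real t would be much larger

def t_of(share_bytes, t_bits=T_BITS):
    # take the leading t_bits of a SHA-256 hash as the 't' value
    h = int.from_bytes(hashlib.sha256(share_bytes).digest(), "big")
    return h >> (256 - t_bits)

def forge_roadblock(target_t, t_bits=T_BITS):
    # brute-force junk whose 't' matches; expected ~2**t_bits attempts
    tries = 0
    while True:
        junk = os.urandom(32)
        tries += 1
        if t_of(junk, t_bits) == target_t:
            return junk, tries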

One can imagine dealing with a roadblock attack by choosing a
different key and re-uploading (or if there is a destroy cap 'D' then
choosing a different 'D' so that you don't have to re-encrypt with a
different key), but even if the uploader could win that race against
the roadblocker, a repairer would not have that option since the
repairer has to match the already-extant immutable file read cap.

Here is an extension to the upload/download protocol that converts
roadblocks to speedbumps: the storage server holds all shares that
have been uploaded under a given storage index (and share num), and it
stores them in the order that they arrived.  Downloaders can retrieve
and check the K1 and V values of each of the candidates in the order
that they were uploaded, and they can tell which one matches their
readcap.  Therefore, if an attacker goes to the trouble of setting up
a roadblock, it only inconveniences the downloader (and the storage
server) a tiny amount -- it is only a shallow speedbump.
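
For concreteness, here is a rough sketch of the downloader's side.
The names are hypothetical: get_k1_and_v() stands in for however the
downloader fetches those values from a candidate, and matches_readcap
stands in for the real Elk Point check against the readcap.

def pick_genuine_share(candidates, matches_readcap):
    # candidates: BucketReader-like objects, in upload order
    # matches_readcap: callable(k1, v) -> bool, standing in for the
    # real Elk Point verification against the holder's readcap
    for candidate in candidates:
        k1, v = candidate.get_k1_and_v()   # hypothetical accessor
        if matches_readcap(k1, v):
            return candidate               # first legitimate share wins
    return None                            # every candidate was a roadblock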

Here are the details.

The protocol to upload shares is that the client invokes the server's
"allocate_buckets()" method [2], passing a storage index and a set of
sharenums.  To download shares, the client invokes the server's
"get_buckets()" method [3] passing a storage index.  The storage
server returns a dict mapping from sharenum to BucketReader objects.

The extension to the protocol is that the storage server instead
returns a dict mapping from sharenum to a list of BucketReader
objects, sorted in the order in which they were uploaded.
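
In code, the change to the reply is roughly the following (a sketch
with hypothetical names, not the actual interfaces.py declarations):

def get_buckets_reply(candidates_by_sharenum, make_reader):
    # candidates_by_sharenum: {sharenum: [path, path, ...]} with each
    # list already in upload order; make_reader: whatever wraps a
    # stored candidate in a remote BucketReader reference.
    # Old reply: {sharenum: reader}.  New reply: {sharenum: [readers]}.
    return dict((sharenum, [make_reader(path) for path in paths])
                for sharenum, paths in candidates_by_sharenum.items())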

This doesn't sound like too much added complication, but perhaps
that's just because the Foolscap library is so Pythonic and
expressive.  If you think about how the storage server lays out its
share data on disk and how many network round-trips and bytes are
required for downloading, the cost of this improvement looks a bit
worse:

The storage server in Tahoe-LAFS v1.5 stores shares in its local
filesystem in the following layout:

storage/shares/$START/$STORAGEINDEX/$SHARENUM

Where "$START" denotes the first 10 bits worth of $STORAGEINDEX
(that's 2 base-32 chars).

You can see this in storage/server.py [4].
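
In other words (a sketch, assuming the storage index has already been
base-32 encoded):

import os

def share_path(storagedir, si_b32, sharenum):
    # the first 2 base-32 chars carry 2 * 5 = 10 bits -- that's $START
    start = si_b32[:2]
    return os.path.join(storagedir, "shares", start, si_b32,
                        "%d" % sharenum)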

The easiest way to extend this to handle multiple share candidates
would be something like

storage/shares/$START/$STORAGEINDEX/$SHARENUM_$CANDIDATENUM

Then the server could see all the candidates (just by inspecting the
$STORAGEINDEX directory -- not having to read the files themselves)
and return an ordered list of remote references to BucketReaders.
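
A sketch of that directory scan, recovering the candidate order from
the filenames alone (hypothetical helper, not code from server.py):

import os

def candidate_paths(si_dir, sharenum):
    # filenames follow the proposed $SHARENUM_$CANDIDATENUM convention,
    # e.g. "3_0", "3_1", ...; no share file needs to be opened
    prefix = "%d_" % sharenum
    names = [n for n in os.listdir(si_dir) if n.startswith(prefix)]
    names.sort(key=lambda n: int(n.split("_", 1)[1]))  # upload order
    return [os.path.join(si_dir, n) for n in names]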

Now if the server's reply to get_buckets() contained only the share
num and the remote references to the BucketReaders, then the
downloader would have to wait for at least one more network round trip
to fetch the K1 and V values for each BucketReader before it could
decide which candidate share matched its readcap.  On the other
hand, if the server's initial reply contained the K1 and V values for
each candidate, then the server would have to do at least one extra
disk seek per candidate to read those values out of each share.

We know from experience that a disk seek in the storage server is
typically one of the slowest operations in the whole grid.

So we could add a file to the layout which contains just the K1 and V
values, in order, for all of the candidates, thus making it possible
for the storage server to return all the K1 and V values without doing
a lot of seeks:

storage/shares/$START/$STORAGEINDEX/$SHARENUM_K1s_and_Vs
storage/shares/$START/$STORAGEINDEX/$SHARENUM_$CANDIDATENUM
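
Here is a sketch of how that small index file could work, assuming
fixed-width records (the K1 and V sizes below are placeholders, not
values from the Elk Point design):

import struct

K1_LEN, V_LEN = 16, 32   # placeholder sizes
RECORD = struct.Struct("%ds%ds" % (K1_LEN, V_LEN))

def append_candidate(index_path, k1, v):
    # one fixed-width record appended per accepted candidate, at upload
    with open(index_path, "ab") as f:
        f.write(RECORD.pack(k1, v))

def read_candidates(index_path):
    # one sequential read yields every (K1, V) pair, in upload order
    with open(index_path, "rb") as f:
        data = f.read()
    return [RECORD.unpack_from(data, off)
            for off in range(0, len(data), RECORD.size)]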


Okay, that's it.  Would this be a win?  Well, I think that depends on
the exact sizes of the caps.  I'm working on a table cataloguing
Everything That Can Go Wrong, including brute force attacks on various
elements of the crypto structure and failure of the various crypto
primitives that we use.  The Everything That Can Go Wrong table
includes the numbers for how much a brute force attack would cost,
which will let me understand exactly what size of crypto values I
would be comfortable with.  Once I have that table, then I'll
reconsider whether in Elk Point town we should build a
Roadblock-To-Speedbump Converter, or if we should just set such a high
tax on roadblocks that nobody can afford to build one (by including
enough t bits).


Regards,

Zooko

[1] http://jacaranda.org/tahoe/mutable-addonly-elkpoint-2.svg
[2] http://allmydata.org/trac/tahoe/browser/src/allmydata/interfaces.py?rev=4045#L94
[3] http://allmydata.org/trac/tahoe/browser/src/allmydata/interfaces.py?rev=4045#L160
[4] http://allmydata.org/trac/tahoe/browser/src/allmydata/storage/server.py?rev=3871

