Ticket #393: 393status29.dpatch

File 393status29.dpatch, 511.3 KB (added by kevan, at 2010-08-10T00:52:37Z)
Mon Aug  9 16:15:10 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/servermap.py: Alter the servermap updater to work with MDMF files
 
  These modifications were basically all to the end of having the
  servermap updater use the unified MDMF + SDMF read interface whenever
  possible -- this reduces the complexity of the code, making it easier to
  read and maintain. To do this, I needed to modify the process of
  updating the servermap a little bit.
 
  To support partial-file updates, I also modified the servermap updater
  to fetch the block hash trees and certain segments of files while it
  performed a servermap update (this can be done without adding any new
  roundtrips because of batch-read functionality that the read proxy has).
 
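The batch-read trick described above can be sketched roughly as follows. This is a minimal illustration of the idea, not the actual read proxy: `BatchingReadProxy` and its `read_vector` callback are hypothetical stand-ins for the real proxy and the storage server's vectorized `slot_readv`-style read.

```python
class BatchingReadProxy:
    """Illustrative sketch: queue byte-range requests, then satisfy them
    all with a single vectorized read (one network roundtrip)."""

    def __init__(self, read_vector):
        # read_vector([(offset, length), ...]) -> [bytes, ...]
        # stands in for a slot_readv-style remote call.
        self._read_vector = read_vector
        self._pending = []  # [(offset, length, callback)]
        self.roundtrips = 0

    def get_range(self, offset, length, callback):
        # Record the request; nothing is fetched yet.
        self._pending.append((offset, length, callback))

    def flush(self):
        # One remote call covers every queued request.
        readv = [(off, ln) for (off, ln, _) in self._pending]
        self.roundtrips += 1
        results = self._read_vector(readv)
        for (_, _, cb), data in zip(self._pending, results):
            cb(data)
        self._pending = []
```

Because the block hash tree and segment fetches are queued alongside the reads the updater was already making, flushing once keeps the roundtrip count unchanged.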

Mon Aug  9 16:20:25 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/publish.py: Modify the publish process to support MDMF
 
  The inner workings of the publishing process needed to be reworked to a
  large extent to cope with segmented mutable files, and with
  partial-file updates of mutable files. This patch does that. It also
  introduces wrappers for uploadable data, allowing the use of
  filehandle-like objects as data sources, in addition to strings. This
  reduces memory usage when dealing with large files through the
  webapi, and clarifies the update code there.
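The uploadable-wrapper idea can be sketched with a simple two-method interface (`get_size`/`read`); the class names here are illustrative, not the ones introduced by the patch:

```python
import io

class FileHandleUploadable:
    """Illustrative sketch of a filehandle-backed uploadable: data is
    pulled from the handle in chunks rather than being held in memory
    as one big string, which keeps memory usage bounded for large
    files."""

    def __init__(self, filehandle):
        self._f = filehandle

    def get_size(self):
        # Measure the file without disturbing the current position.
        pos = self._f.tell()
        self._f.seek(0, io.SEEK_END)
        size = self._f.tell()
        self._f.seek(pos)
        return size

    def read(self, length):
        # Return a list of chunks totalling at most `length` bytes.
        return [self._f.read(length)]

class StringUploadable(FileHandleUploadable):
    """Wrap a plain byte string in the same interface, so callers can
    treat strings and filehandles uniformly."""
    def __init__(self, data):
        FileHandleUploadable.__init__(self, io.BytesIO(data))
```

With both sources behind one interface, the publish code can stream from either without special cases.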

Mon Aug  9 16:23:12 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/retrieve.py: Modify the retrieval process to support MDMF
 
  The logic behind a mutable file download had to be adapted to work with
  segmented mutable files; this patch performs those adaptations. It also
  exposes some decoding and decrypting functionality to make partial-file
  updates a little easier, and supports efficient random-access downloads
  of parts of an MDMF file.
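The segment arithmetic behind such random-access reads can be illustrated as follows (a sketch only; the helper name and the empty-range convention are made up for the example):

```python
def segments_for_range(offset, length, segment_size):
    """Map a byte range onto the (first, last) segment indices that
    must be fetched to satisfy a random-access read of a segmented
    file. Only those segments need to be downloaded and decoded."""
    if length == 0:
        return (0, -1)  # hypothetical convention: nothing to fetch
    first = offset // segment_size
    last = (offset + length - 1) // segment_size
    return (first, last)
```

A read that stays inside one segment costs a single segment fetch, however large the file is.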

Mon Aug  9 16:25:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
 
  The checker and repairer required minimal changes to work with the MDMF
  modifications made elsewhere. The checker duplicated a lot of the code
  that was already in the downloader, so I modified the downloader
  slightly to expose this functionality to the checker and removed the
  duplicated code. The repairer only required a minor change to deal with
  data representation.

Mon Aug  9 16:27:41 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/filenode.py: add versions and partial-file updates to the mutable file node
 
  One of the goals of MDMF as a GSoC project is to lay the groundwork for
  LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
  multiple versions of a single cap on the grid. In line with this, there
  is now a distinction between an overriding mutable file (which can be
  thought of as corresponding to the cap/unique identifier for that
  mutable file) and versions of the mutable file (which we can download,
  update, and so on). All download, upload, and modification operations
  end up happening on a particular version of a mutable file, but there
  are shortcut methods on the object representing the overriding mutable
  file that perform these operations on the best version of the mutable
  file (which is what code should be doing until we have LDMF and better
  support for other paradigms).
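The file/version split and the shortcut methods can be sketched like this; the classes and method names are illustrative stand-ins, not the patch's actual API:

```python
class Version:
    """A single recoverable version of a mutable file (illustrative)."""
    def __init__(self, contents):
        self._contents = contents

    def download(self):
        return self._contents

class MutableFileNode:
    """Illustrative sketch of the cap-level ("overriding") object
    delegating to its best recoverable version."""

    def __init__(self, versions):
        # versions: {seqnum: Version}; real code would discover these
        # from a servermap rather than receive them directly.
        self._versions = versions

    def get_best_readable_version(self):
        # "best" here means the highest sequence number among the
        # recoverable versions.
        best_seqnum = max(self._versions)
        return self._versions[best_seqnum]

    def download_best_version(self):
        # Shortcut: operate on the best version so the caller doesn't
        # have to pick one explicitly.
        return self.get_best_readable_version().download()
```

Until LDMF exists, callers are expected to go through the best-version shortcuts rather than address versions directly.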
 
  Another goal of MDMF was to take advantage of segmentation to give
  callers more efficient partial file updates or appends. This patch
  implements methods that do that, too.
 

Mon Aug  9 16:32:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * interfaces.py: Add #993 interfaces

Mon Aug  9 16:35:35 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes

Mon Aug  9 16:36:23 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * nodemaker.py: Make nodemaker expose a way to create MDMF files

Mon Aug  9 16:37:55 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * web: Alter the webapi to get along with and take advantage of the MDMF changes
 
  The main benefit that the webapi gets from MDMF, at least initially, is
  the ability to do a streaming download of an MDMF mutable file. It also
  exposes a way (through the PUT verb) to append to or otherwise modify
  (in-place) an MDMF mutable file.
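One plausible way to picture the in-place update semantics is as a pure function over the file's contents (a sketch only; the actual webapi parameters and gap-handling behavior are not specified here):

```python
def apply_inplace_update(old_contents, new_data, offset=None):
    """Illustrative sketch of what an append/in-place PUT might mean:
    with no offset, replace the whole file; with an offset, overwrite
    starting there, extending the file if the write runs past the end.
    The zero-padding of gaps is an assumption made for this example."""
    if offset is None:
        # Plain PUT: replace the file wholesale.
        return new_data
    if offset > len(old_contents):
        # Hypothetical: pad a gap beyond EOF with zero bytes.
        old_contents = old_contents + b"\x00" * (offset - len(old_contents))
    return (old_contents[:offset] +
            new_data +
            old_contents[offset + len(new_data):])
```

An append is then just a write whose offset equals the current file size.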

Mon Aug  9 16:40:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/layout.py and interfaces.py: add MDMF writer and reader
 
  The MDMF writer is responsible for keeping state as plaintext is
  gradually processed into share data by the upload process. When the
  upload finishes, it will write all of its share data to a remote server,
  reporting its status back to the publisher.
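The writer's accumulate-then-write behavior might be pictured like this; a rough sketch under the assumption of a single final write, with hypothetical names throughout:

```python
class ShareWriter:
    """Illustrative sketch of a writer that accumulates share data as
    segments are processed, then pushes everything to a server in one
    final step. The write_to_server callback stands in for the remote
    storage call."""

    def __init__(self, write_to_server):
        # write_to_server([(offset, data), ...]) -> status
        self._write = write_to_server
        self._writev = []   # accumulated (offset, data) pairs
        self._offset = 0

    def put_block(self, data):
        # Keep state as blocks arrive from the upload process.
        self._writev.append((self._offset, data))
        self._offset += len(data)

    def finish_publishing(self):
        # Write all accumulated share data at once and report the
        # server's status back to the caller (the publisher).
        return self._write(self._writev)
```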
 
  The MDMF reader is responsible for abstracting an MDMF file as it sits
  on the grid from the downloader; specifically, by receiving and
  responding to requests for arbitrary data within the MDMF file.
 
  The interfaces.py file has also been modified to contain an interface
  for the writer.

Mon Aug  9 17:06:19 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one

Mon Aug  9 17:06:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * immutable/literal.py: implement the same interfaces as other filenodes

Mon Aug  9 17:07:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * tests:
 
      - A lot of existing tests relied on aspects of the mutable file
        implementation that were changed. This patch updates those tests
        to work with the changes.
      - This patch also adds tests for new features.
New patches:

[mutable/servermap.py: Alter the servermap updater to work with MDMF files
Kevan Carstensen <kevan@isnotajoke.com>**20100809231510
 Ignore-this: 26f95723688adc5d9457224ac006fd65
 
 These modifications were basically all to the end of having the
 servermap updater use the unified MDMF + SDMF read interface whenever
 possible -- this reduces the complexity of the code, making it easier to
 read and maintain. To do this, I needed to modify the process of
 updating the servermap a little bit.
 
 To support partial-file updates, I also modified the servermap updater
 to fetch the block hash trees and certain segments of files while it
 performed a servermap update (this can be done without adding any new
 roundtrips because of batch-read functionality that the read proxy has).
 
] {
hunk ./src/allmydata/mutable/servermap.py 7
 from itertools import count
 from twisted.internet import defer
 from twisted.python import failure
-from foolscap.api import DeadReferenceError, RemoteException, eventually
-from allmydata.util import base32, hashutil, idlib, log
+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
+                         fireEventually
+from allmydata.util import base32, hashutil, idlib, log, deferredutil
 from allmydata.storage.server import si_b2a
 from allmydata.interfaces import IServermapUpdaterStatus
 from pycryptopp.publickey import rsa
hunk ./src/allmydata/mutable/servermap.py 17
 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
      DictOfSets, CorruptShareError, NeedMoreDataError
 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
-     SIGNED_PREFIX_LENGTH
+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
 
 class UpdateStatus:
     implements(IServermapUpdaterStatus)
hunk ./src/allmydata/mutable/servermap.py 124
         self.bad_shares = {} # maps (peerid,shnum) to old checkstring
         self.last_update_mode = None
         self.last_update_time = 0
+        self.update_data = {} # shnum => [(verinfo, data)]
 
     def copy(self):
         s = ServerMap()
hunk ./src/allmydata/mutable/servermap.py 255
         """Return a set of versionids, one for each version that is currently
         recoverable."""
         versionmap = self.make_versionmap()
-
         recoverable_versions = set()
         for (verinfo, shares) in versionmap.items():
             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
hunk ./src/allmydata/mutable/servermap.py 340
         return False
 
 
+    def get_update_data_for_share_and_verinfo(self, shnum, verinfo):
+        """
+        I return the update data for the given shnum and verinfo.
+        """
+        update_data = self.update_data[shnum]
+        update_datum = [i[1] for i in update_data if i[0] == verinfo][0]
+        return update_datum
+
+
+    def set_update_data_for_share_and_verinfo(self, shnum, verinfo, data):
+        """
+        I record the update data for the given shnum and verinfo.
+        """
+        self.update_data.setdefault(shnum, []).append((verinfo, data))
+
+
 class ServermapUpdater:
     def __init__(self, filenode, storage_broker, monitor, servermap,
hunk ./src/allmydata/mutable/servermap.py 358
-                 mode=MODE_READ, add_lease=False):
+                 mode=MODE_READ, add_lease=False, update_range=None):
         """I update a servermap, locating a sufficient number of useful
         shares and remembering where they are located.
 
hunk ./src/allmydata/mutable/servermap.py 390
         #  * if we need the encrypted private key, we want [-1216ish:]
         #   * but we can't read from negative offsets
         #   * the offset table tells us the 'ish', also the positive offset
-        # A future version of the SMDF slot format should consider using
-        # fixed-size slots so we can retrieve less data. For now, we'll just
-        # read 2000 bytes, which also happens to read enough actual data to
-        # pre-fetch a 9-entry dirnode.
+        # MDMF:
+        #  * Checkstring? [0:72]
+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
+        #    the offset table will tell us for sure.
+        #  * If we need the verification key, we have to consult the offset
+        #    table as well.
+        # At this point, we don't know which we are. Our filenode can
+        # tell us, but it might be lying -- in some cases, we're
+        # responsible for telling it which kind of file it is.
         self._read_size = 4000
         if mode == MODE_CHECK:
             # we use unpack_prefix_and_signature, so we need 1k
hunk ./src/allmydata/mutable/servermap.py 410
         # to ask for it during the check, we'll have problems doing the
         # publish.
 
+        self.fetch_update_data = False
+        if mode == MODE_WRITE and update_range:
+            # We're updating the servermap in preparation for an
+            # in-place file update, so we need to fetch some additional
+            # data from each share that we find.
+            assert len(update_range) == 2
+
+            self.start_segment = update_range[0]
+            self.end_segment = update_range[1]
+            self.fetch_update_data = True
+
         prefix = si_b2a(self._storage_index)[:5]
         self._log_number = log.msg(format="SharemapUpdater(%(si)s): starting (%(mode)s)",
                                    si=prefix, mode=mode)
hunk ./src/allmydata/mutable/servermap.py 459
         self._queries_completed = 0
 
         sb = self._storage_broker
+        # All of the peers, permuted by the storage index, as usual.
         full_peerlist = sb.get_servers_for_index(self._storage_index)
         self.full_peerlist = full_peerlist # for use later, immutable
         self.extra_peers = full_peerlist[:] # peers are removed as we use them
hunk ./src/allmydata/mutable/servermap.py 466
         self._good_peers = set() # peers who had some shares
         self._empty_peers = set() # peers who don't have any shares
         self._bad_peers = set() # peers to whom our queries failed
+        self._readers = {} # peerid -> dict(sharereaders), filled in
+                           # after responses come in.
 
         k = self._node.get_required_shares()
hunk ./src/allmydata/mutable/servermap.py 470
+        # For what cases can these conditions work?
         if k is None:
             # make a guess
             k = 3
hunk ./src/allmydata/mutable/servermap.py 483
         self.num_peers_to_query = k + self.EPSILON
 
         if self.mode == MODE_CHECK:
+            # We want to query all of the peers.
             initial_peers_to_query = dict(full_peerlist)
             must_query = set(initial_peers_to_query.keys())
             self.extra_peers = []
hunk ./src/allmydata/mutable/servermap.py 491
             # we're planning to replace all the shares, so we want a good
             # chance of finding them all. We will keep searching until we've
             # seen epsilon that don't have a share.
+            # We don't query all of the peers because that could take a while.
             self.num_peers_to_query = N + self.EPSILON
             initial_peers_to_query, must_query = self._build_initial_querylist()
             self.required_num_empty_peers = self.EPSILON
hunk ./src/allmydata/mutable/servermap.py 501
             # might also avoid the round trip required to read the encrypted
             # private key.
 
-        else:
+        else: # MODE_READ, MODE_ANYTHING
+            # 2k peers is good enough.
             initial_peers_to_query, must_query = self._build_initial_querylist()
 
         # this is a set of peers that we are required to get responses from:
hunk ./src/allmydata/mutable/servermap.py 517
         # before we can consider ourselves finished, and self.extra_peers
         # contains the overflow (peers that we should tap if we don't get
         # enough responses)
+        # I guess that self._must_query is a subset of
+        # initial_peers_to_query?
+        assert set(must_query).issubset(set(initial_peers_to_query))
 
         self._send_initial_requests(initial_peers_to_query)
         self._status.timings["initial_queries"] = time.time() - self._started
hunk ./src/allmydata/mutable/servermap.py 576
         # errors that aren't handled by _query_failed (and errors caused by
         # _query_failed) get logged, but we still want to check for doneness.
         d.addErrback(log.err)
-        d.addBoth(self._check_for_done)
         d.addErrback(self._fatal_error)
hunk ./src/allmydata/mutable/servermap.py 577
+        d.addCallback(self._check_for_done)
         return d
 
     def _do_read(self, ss, peerid, storage_index, shnums, readv):
hunk ./src/allmydata/mutable/servermap.py 596
         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
         return d
 
+
+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
+        """
+        I am called when a remote server returns a corrupt share in
+        response to one of our queries. By corrupt, I mean a share
+        without a valid signature. I then record the failure, notify the
+        server of the corruption, and record the share as bad.
+        """
+        f = failure.Failure(e)
+        self.log(format="bad share: %(f_value)s", f_value=str(f),
+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
+        # Notify the server that its share is corrupt.
+        self.notify_server_corruption(peerid, shnum, str(e))
+        # By flagging this as a bad peer, we won't count any of
+        # the other shares on that peer as valid, though if we
+        # happen to find a valid version string amongst those
+        # shares, we'll keep track of it so that we don't need
+        # to validate the signature on those again.
+        self._bad_peers.add(peerid)
+        self._last_failure = f
+        # XXX: Use the reader for this?
+        checkstring = data[:SIGNED_PREFIX_LENGTH]
+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
+        self._servermap.problems.append(f)
+
+
+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
+        """
+        If one of my queries returns successfully (which means that we
+        were able to and successfully did validate the signature), I
+        cache the data that we initially fetched from the storage
+        server. This will help reduce the number of roundtrips that need
+        to occur when the file is downloaded, or when the file is
+        updated.
+        """
+        self._node._add_to_cache(verinfo, shnum, 0, data, now)
+
+
     def _got_results(self, datavs, peerid, readsize, stuff, started):
         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
                       peerid=idlib.shortnodeid_b2a(peerid),
hunk ./src/allmydata/mutable/servermap.py 641
                       level=log.NOISY)
         now = time.time()
         elapsed = now - started
-        self._queries_outstanding.discard(peerid)
-        self._servermap.reachable_peers.add(peerid)
-        self._must_query.discard(peerid)
-        self._queries_completed += 1
+        def _done_processing(ignored=None):
+            self._queries_outstanding.discard(peerid)
+            self._servermap.reachable_peers.add(peerid)
+            self._must_query.discard(peerid)
+            self._queries_completed += 1
         if not self._running:
             self.log("but we're not running, so we'll ignore it", parent=lp,
                      level=log.NOISY)
hunk ./src/allmydata/mutable/servermap.py 649
+            _done_processing()
             self._status.add_per_server_time(peerid, "late", started, elapsed)
             return
         self._status.add_per_server_time(peerid, "query", started, elapsed)
hunk ./src/allmydata/mutable/servermap.py 659
         else:
             self._empty_peers.add(peerid)
 
-        last_verinfo = None
-        last_shnum = None
+        ss, storage_index = stuff
+        ds = []
+
         for shnum,datav in datavs.items():
             data = datav[0]
hunk ./src/allmydata/mutable/servermap.py 664
-            try:
-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
-                last_verinfo = verinfo
-                last_shnum = shnum
-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
-            except CorruptShareError, e:
-                # log it and give the other shares a chance to be processed
-                f = failure.Failure()
-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
-                self.notify_server_corruption(peerid, shnum, str(e))
-                self._bad_peers.add(peerid)
-                self._last_failure = f
-                checkstring = data[:SIGNED_PREFIX_LENGTH]
-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
-                self._servermap.problems.append(f)
-                pass
+            reader = MDMFSlotReadProxy(ss,
+                                       storage_index,
+                                       shnum,
+                                       data)
+            self._readers.setdefault(peerid, dict())[shnum] = reader
+            # our goal, with each response, is to validate the version
+            # information and share data as best we can at this point --
+            # we do this by validating the signature. To do this, we
+            # need to do the following:
+            #   - If we don't already have the public key, fetch the
+            #     public key. We use this to validate the signature.
+            if not self._node.get_pubkey():
+                # fetch and set the public key.
+                d = reader.get_verification_key(queue=True)
+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
+                # XXX: Make self._pubkey_query_failed?
+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
+            else:
+                # we already have the public key.
+                d = defer.succeed(None)
 
hunk ./src/allmydata/mutable/servermap.py 687
-        self._status.timings["cumulative_verify"] += (time.time() - now)
+            # Neither of these two branches return anything of
+            # consequence, so the first entry in our deferredlist will
+            # be None.
 
hunk ./src/allmydata/mutable/servermap.py 691
-        if self._need_privkey and last_verinfo:
-            # send them a request for the privkey. We send one request per
-            # server.
-            lp2 = self.log("sending privkey request",
-                           parent=lp, level=log.NOISY)
-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
-             offsets_tuple) = last_verinfo
-            o = dict(offsets_tuple)
+            # - Next, we need the version information. We almost
+            #   certainly got this by reading the first thousand or so
+            #   bytes of the share on the storage server, so we
+            #   shouldn't need to fetch anything at this step.
+            d2 = reader.get_verinfo()
+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
+                self._got_corrupt_share(error, shnum, peerid, data, lp))
+            # - Next, we need the signature. For an SDMF share, it is
+            #   likely that we fetched this when doing our initial fetch
+            #   to get the version information. In MDMF, this lives at
+            #   the end of the share, so unless the file is quite small,
+            #   we'll need to do a remote fetch to get it.
+            d3 = reader.get_signature(queue=True)
+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
+                self._got_corrupt_share(error, shnum, peerid, data, lp))
+            #  Once we have all three of these responses, we can move on
+            #  to validating the signature
 
hunk ./src/allmydata/mutable/servermap.py 709
-            self._queries_outstanding.add(peerid)
-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
-            ss = self._servermap.connections[peerid]
-            privkey_started = time.time()
-            d = self._do_read(ss, peerid, self._storage_index,
-                              [last_shnum], readv)
-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
-                          privkey_started, lp2)
-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
-            d.addErrback(log.err)
-            d.addCallback(self._check_for_done)
-            d.addErrback(self._fatal_error)
+            # Does the node already have a privkey? If not, we'll try to
+            # fetch it here.
+            if self._need_privkey:
+                d4 = reader.get_encprivkey(queue=True)
+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
+                    self._privkey_query_failed(error, shnum, data, lp))
+            else:
+                d4 = defer.succeed(None)
+
+
+            if self.fetch_update_data:
+                # fetch the block hash tree and first + last segment, as
+                # configured earlier.
+                # Then set them in wherever we happen to want to set
+                # them.
+                ds = []
+                # XXX: We do this above, too. Is there a good way to
+                # make the two routines share the value without
+                # introducing more roundtrips?
+                ds.append(reader.get_verinfo())
+                ds.append(reader.get_blockhashes(queue=True))
+                ds.append(reader.get_block_and_salt(self.start_segment,
+                                                    queue=True))
+                ds.append(reader.get_block_and_salt(self.end_segment,
+                                                    queue=True))
+                d5 = deferredutil.gatherResults(ds)
+                d5.addCallback(self._got_update_results_one_share, shnum)
+            else:
+                d5 = defer.succeed(None)
 
hunk ./src/allmydata/mutable/servermap.py 741
+            dl = defer.DeferredList([d, d2, d3, d4, d5])
+            reader.flush()
+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
+                self._got_signature_one_share(results, shnum, peerid, lp))
+            dl.addErrback(lambda error, shnum=shnum, data=data:
+               self._got_corrupt_share(error, shnum, peerid, data, lp))
+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
+                self._cache_good_sharedata(verinfo, shnum, now, data))
+            ds.append(dl)
+        # dl is a deferred list that will fire when all of the shares
+        # that we found on this peer are done processing. When dl fires,
+        # we know that processing is done, so we can decrement the
+        # semaphore-like thing that we incremented earlier.
+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
+        # Are we done? Done means that there are no more queries to
+        # send, that there are no outstanding queries, and that we
+        # haven't received any queries that are still processing. If we
+        # are done, self._check_for_done will cause the done deferred
+        # that we returned to our caller to fire, which tells them that
+        # they have a complete servermap, and that we won't be touching
+        # the servermap anymore.
+        dl.addCallback(_done_processing)
+        dl.addCallback(self._check_for_done)
+        dl.addErrback(self._fatal_error)
         # all done!
         self.log("_got_results done", parent=lp, level=log.NOISY)
hunk ./src/allmydata/mutable/servermap.py 767
+        return dl
+
+
+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
+        if self._node.get_pubkey():
+            return # don't go through this again if we don't have to
+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
+        assert len(fingerprint) == 32
+        if fingerprint != self._node.get_fingerprint():
+            raise CorruptShareError(peerid, shnum,
+                                "pubkey doesn't match fingerprint")
+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
+        assert self._node.get_pubkey()
+
 
     def notify_server_corruption(self, peerid, shnum, reason):
         ss = self._servermap.connections[peerid]
hunk ./src/allmydata/mutable/servermap.py 787
         ss.callRemoteOnly("advise_corrupt_share",
                           "mutable", self._storage_index, shnum, reason)
 
547-    def _got_results_one_share(self, shnum, data, peerid, lp):
548+
549+    def _got_signature_one_share(self, results, shnum, peerid, lp):
550+        # It is our job to give versioninfo to our caller. We need to
551+        # raise CorruptShareError if the share is corrupt for any
552+        # reason, something that our caller will handle.
553         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
554                  shnum=shnum,
555                  peerid=idlib.shortnodeid_b2a(peerid),
556hunk ./src/allmydata/mutable/servermap.py 797
557                  level=log.NOISY,
558                  parent=lp)
559-
560-        # this might raise NeedMoreDataError, if the pubkey and signature
561-        # live at some weird offset. That shouldn't happen, so I'm going to
562-        # treat it as a bad share.
563-        (seqnum, root_hash, IV, k, N, segsize, datalength,
564-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
565-
566-        if not self._node.get_pubkey():
567-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
568-            assert len(fingerprint) == 32
569-            if fingerprint != self._node.get_fingerprint():
570-                raise CorruptShareError(peerid, shnum,
571-                                        "pubkey doesn't match fingerprint")
572-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
573-
574-        if self._need_privkey:
575-            self._try_to_extract_privkey(data, peerid, shnum, lp)
576-
577-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
578-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
579+        _, verinfo, signature, __, ___ = results
580+        (seqnum,
581+         root_hash,
582+         saltish,
583+         segsize,
584+         datalen,
585+         k,
586+         n,
587+         prefix,
588+         offsets) = verinfo[1]
589         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
590 
591hunk ./src/allmydata/mutable/servermap.py 809
592-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
593+        # XXX: This should be done for us in the method, so
594+        # presumably you can go in there and fix it.
595+        verinfo = (seqnum,
596+                   root_hash,
597+                   saltish,
598+                   segsize,
599+                   datalen,
600+                   k,
601+                   n,
602+                   prefix,
603                    offsets_tuple)
604hunk ./src/allmydata/mutable/servermap.py 820
605+        # This tuple uniquely identifies a share on the grid; we use it
606+        # to keep track of the ones that we've already seen.
607 
608         if verinfo not in self._valid_versions:
609hunk ./src/allmydata/mutable/servermap.py 824
610-            # it's a new pair. Verify the signature.
611-            valid = self._node.get_pubkey().verify(prefix, signature)
612+            # This is a new version tuple, and we need to validate it
613+            # against the public key before keeping track of it.
614+            assert self._node.get_pubkey()
615+            valid = self._node.get_pubkey().verify(prefix, signature[1])
616             if not valid:
617hunk ./src/allmydata/mutable/servermap.py 829
618-                raise CorruptShareError(peerid, shnum, "signature is invalid")
619+                raise CorruptShareError(peerid, shnum,
620+                                        "signature is invalid")
621 
622hunk ./src/allmydata/mutable/servermap.py 832
623-            # ok, it's a valid verinfo. Add it to the list of validated
624-            # versions.
625-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
626-                     % (seqnum, base32.b2a(root_hash)[:4],
627-                        idlib.shortnodeid_b2a(peerid), shnum,
628-                        k, N, segsize, datalength),
629-                     parent=lp)
630-            self._valid_versions.add(verinfo)
631-        # We now know that this is a valid candidate verinfo.
632+        # ok, it's a valid verinfo. Add it to the list of validated
633+        # versions.
634+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
635+                 % (seqnum, base32.b2a(root_hash)[:4],
636+                    idlib.shortnodeid_b2a(peerid), shnum,
637+                    k, n, segsize, datalen),
638+                 parent=lp)
639+        self._valid_versions.add(verinfo)
640+        # We now know that this is a valid candidate verinfo. Whether or
641+        # not this instance of it is valid is a matter for the next
642+        # statement; at this point, we just know that if we see this
643+        # version info again, that its signature checks out and that
644+        # we're okay to skip the signature-checking step.
645 
646hunk ./src/allmydata/mutable/servermap.py 846
647+        # (peerid, shnum) are bound in the method invocation.
648         if (peerid, shnum) in self._servermap.bad_shares:
649             # we've been told that the rest of the data in this share is
650             # unusable, so don't add it to the servermap.
651hunk ./src/allmydata/mutable/servermap.py 861
652         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
653         return verinfo
654 
655-    def _deserialize_pubkey(self, pubkey_s):
656-        verifier = rsa.create_verifying_key_from_string(pubkey_s)
657-        return verifier
658 
659hunk ./src/allmydata/mutable/servermap.py 862
660-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
661-        try:
662-            r = unpack_share(data)
663-        except NeedMoreDataError, e:
664-            # this share won't help us. oh well.
665-            offset = e.encprivkey_offset
666-            length = e.encprivkey_length
667-            self.log("shnum %d on peerid %s: share was too short (%dB) "
668-                     "to get the encprivkey; [%d:%d] ought to hold it" %
669-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
670-                      offset, offset+length),
671-                     parent=lp)
672-            # NOTE: if uncoordinated writes are taking place, someone might
673-            # change the share (and most probably move the encprivkey) before
674-            # we get a chance to do one of these reads and fetch it. This
675-            # will cause us to see a NotEnoughSharesError(unable to fetch
676-            # privkey) instead of an UncoordinatedWriteError . This is a
677-            # nuisance, but it will go away when we move to DSA-based mutable
678-            # files (since the privkey will be small enough to fit in the
679-            # write cap).
680+    def _got_update_results_one_share(self, results, share):
681+        """
682+        I record the update data (block hashes, start and end segments) in the servermap.
683+        """
684+        assert len(results) == 4
685+        verinfo, blockhashes, start, end = results
686+        (seqnum,
687+         root_hash,
688+         saltish,
689+         segsize,
690+         datalen,
691+         k,
692+         n,
693+         prefix,
694+         offsets) = verinfo
695+        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
696 
697hunk ./src/allmydata/mutable/servermap.py 879
698-            return
699+        # XXX: This rebuilding should be done for us by the method
700+        # that produces verinfo; fix it there.
701+        verinfo = (seqnum,
702+                   root_hash,
703+                   saltish,
704+                   segsize,
705+                   datalen,
706+                   k,
707+                   n,
708+                   prefix,
709+                   offsets_tuple)
710 
711hunk ./src/allmydata/mutable/servermap.py 891
712-        (seqnum, root_hash, IV, k, N, segsize, datalen,
713-         pubkey, signature, share_hash_chain, block_hash_tree,
714-         share_data, enc_privkey) = r
715+        update_data = (blockhashes, start, end)
716+        self._servermap.set_update_data_for_share_and_verinfo(share,
717+                                                              verinfo,
718+                                                              update_data)
719 
720hunk ./src/allmydata/mutable/servermap.py 896
721-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
722+
723+    def _deserialize_pubkey(self, pubkey_s):
724+        verifier = rsa.create_verifying_key_from_string(pubkey_s)
725+        return verifier
726 
727hunk ./src/allmydata/mutable/servermap.py 901
728-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
729 
730hunk ./src/allmydata/mutable/servermap.py 902
731+    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
732+        """
733+        Given a writekey from a remote server, I validate it against the
734+        writekey stored in my node. If it is valid, then I set the
735+        privkey and encprivkey properties of the node.
736+        """
737         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
738         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
739         if alleged_writekey != self._node.get_writekey():
740hunk ./src/allmydata/mutable/servermap.py 980
741         self._queries_completed += 1
742         self._last_failure = f
743 
744-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
745-        now = time.time()
746-        elapsed = now - started
747-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
748-        self._queries_outstanding.discard(peerid)
749-        if not self._need_privkey:
750-            return
751-        if shnum not in datavs:
752-            self.log("privkey wasn't there when we asked it",
753-                     level=log.WEIRD, umid="VA9uDQ")
754-            return
755-        datav = datavs[shnum]
756-        enc_privkey = datav[0]
757-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
758 
759     def _privkey_query_failed(self, f, peerid, shnum, lp):
760         self._queries_outstanding.discard(peerid)
761hunk ./src/allmydata/mutable/servermap.py 994
762         self._servermap.problems.append(f)
763         self._last_failure = f
764 
765+
766     def _check_for_done(self, res):
767         # exit paths:
768         #  return self._send_more_queries(outstanding) : send some more queries
769hunk ./src/allmydata/mutable/servermap.py 1000
770         #  return self._done() : all done
771         #  return : keep waiting, no new queries
772-
773         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
774                               "%(outstanding)d queries outstanding, "
775                               "%(extra)d extra peers available, "
776hunk ./src/allmydata/mutable/servermap.py 1205
777         self._servermap.last_update_time = self._started
778         # the servermap will not be touched after this
779         self.log("servermap: %s" % self._servermap.summarize_versions())
780+
781         eventually(self._done_deferred.callback, self._servermap)
782 
783     def _fatal_error(self, f):
784}
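The servermap hunks above track validated versions by adding each verinfo tuple to `self._valid_versions`, so the expensive signature check runs only once per version. A minimal standalone sketch of that dedup idea (`make_verinfo` and `seen_before` are hypothetical names for illustration, not functions from this patch):

```python
# Sketch: a verinfo tuple is hashable, so it can key a set and let us
# skip re-verifying signatures for versions we've already validated.

def make_verinfo(seqnum, root_hash, saltish, segsize, datalen, k, n,
                 prefix, offsets):
    # dicts aren't hashable, so offsets is flattened into a tuple,
    # as the patch does with offsets_tuple
    offsets_tuple = tuple(offsets.items())
    return (seqnum, root_hash, saltish, segsize, datalen, k, n,
            prefix, offsets_tuple)

_valid_versions = set()

def seen_before(verinfo):
    """Return False (and remember verinfo) the first time it is seen."""
    if verinfo in _valid_versions:
        return True
    _valid_versions.add(verinfo)
    return False
```

Two shares advertising the same version therefore produce equal tuples, and only the first triggers a public-key verification.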
785[mutable/publish.py: Modify the publish process to support MDMF
786Kevan Carstensen <kevan@isnotajoke.com>**20100809232025
787 Ignore-this: 1cbb42e34c5ecef9f5fcb2180c018965
788 
789 The inner workings of the publishing process needed to be reworked to a
790 large extent to cope with segmented mutable files, and to cope with
791 partial-file updates of mutable files. This patch does that. It also
792 introduces wrappers for uploadable data, allowing the use of
793 filehandle-like objects as data sources, in addition to strings. This
794 reduces memory inefficiency when dealing with large files through the
795 webapi, and clarifies update code there.
796] {
797hunk ./src/allmydata/mutable/publish.py 4
798 
799 
800 import os, struct, time
801+from StringIO import StringIO
802 from itertools import count
803 from zope.interface import implements
804 from twisted.internet import defer
805hunk ./src/allmydata/mutable/publish.py 9
806 from twisted.python import failure
807-from allmydata.interfaces import IPublishStatus
808+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
809+                                 IMutableUploadable
810 from allmydata.util import base32, hashutil, mathutil, idlib, log
811 from allmydata import hashtree, codec
812 from allmydata.storage.server import si_b2a
813hunk ./src/allmydata/mutable/publish.py 21
814      UncoordinatedWriteError, NotEnoughServersError
815 from allmydata.mutable.servermap import ServerMap
816 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
817-     unpack_checkstring, SIGNED_PREFIX
818+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
819+     SDMFSlotWriteProxy
820+
821+KiB = 1024
822+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
823+PUSHING_BLOCKS_STATE = 0
824+PUSHING_EVERYTHING_ELSE_STATE = 1
825+DONE_STATE = 2
826 
827 class PublishStatus:
828     implements(IPublishStatus)
829hunk ./src/allmydata/mutable/publish.py 118
830         self._status.set_helper(False)
831         self._status.set_progress(0.0)
832         self._status.set_active(True)
833+        self._version = self._node.get_version()
834+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
835+
836 
837     def get_status(self):
838         return self._status
839hunk ./src/allmydata/mutable/publish.py 132
840             kwargs["facility"] = "tahoe.mutable.publish"
841         return log.msg(*args, **kwargs)
842 
843+
844+    def update(self, data, offset, blockhashes, version):
845+        """
846+        I replace the contents of this file with the contents of data,
847+        starting at offset. I return a Deferred that fires with None
848+        when the replacement has been completed, or with an error if
849+        something went wrong during the process.
850+
851+        Note that this process will not upload new shares. If the file
852+        being updated is in need of repair, callers will have to repair
853+        it on their own.
854+        """
855+        # How this works:
856+        # 1. Make peer assignments. We'll assign each share that we know
857+        # about on the grid to the peer that currently holds that
858+        # share, and will not place any new shares.
859+        # 2. Set up encoding parameters. Most of these will stay the same
860+        # -- datalength will change, as will some of the offsets.
861+        # 3. Upload the new segments.
862+        # 4. Be done.
863+        assert IMutableUploadable.providedBy(data)
864+
865+        self.data = data
866+
867+        # XXX: Use the MutableFileVersion instead.
868+        self.datalength = self._node.get_size()
869+        if data.get_size() > self.datalength:
870+            self.datalength = data.get_size()
871+
872+        if self.datalength >= DEFAULT_MAX_SEGMENT_SIZE:
873+            self._version = MDMF_VERSION
874+        else:
875+            self._version = SDMF_VERSION
876+
877+        self.log("starting update")
878+        self.log("adding new data of length %d at offset %d" %
879+                 (data.get_size(), offset))
880+        self.log("new data length is %d" % self.datalength)
881+        self._status.set_size(self.datalength)
882+        self._status.set_status("Started")
883+        self._started = time.time()
884+
885+        self.done_deferred = defer.Deferred()
886+
887+        self._writekey = self._node.get_writekey()
888+        assert self._writekey, "need write capability to publish"
889+
890+        # first, which servers will we publish to? We require that the
891+        # servermap was updated in MODE_WRITE, so we can depend upon the
892+        # peerlist computed by that process instead of computing our own.
893+        assert self._servermap
894+        assert self._servermap.last_update_mode in (MODE_WRITE, MODE_CHECK)
895+        # we will push a version that is one larger than anything present
896+        # in the grid, according to the servermap.
897+        self._new_seqnum = self._servermap.highest_seqnum() + 1
898+        self._status.set_servermap(self._servermap)
899+
900+        self.log(format="new seqnum will be %(seqnum)d",
901+                 seqnum=self._new_seqnum, level=log.NOISY)
902+
903+        # We're updating an existing file, so all of the following
904+        # should be available.
905+        self.readkey = self._node.get_readkey()
906+        self.required_shares = self._node.get_required_shares()
907+        assert self.required_shares is not None
908+        self.total_shares = self._node.get_total_shares()
909+        assert self.total_shares is not None
910+        self._status.set_encoding(self.required_shares, self.total_shares)
911+
912+        self._pubkey = self._node.get_pubkey()
913+        assert self._pubkey
914+        self._privkey = self._node.get_privkey()
915+        assert self._privkey
916+        self._encprivkey = self._node.get_encprivkey()
917+
918+        sb = self._storage_broker
919+        full_peerlist = sb.get_servers_for_index(self._storage_index)
920+        self.full_peerlist = full_peerlist # for use later, immutable
921+        self.bad_peers = set() # peerids who have errbacked/refused requests
922+
923+        # This will set self.segment_size, self.num_segments, and
924+        # self.fec. TODO: Does it know how to do the offset? Probably
925+        # not. So do that part next.
926+        self.setup_encoding_parameters(offset=offset)
927+
928+        # if we experience any surprises (writes which were rejected because
929+        # our test vector did not match, or shares which we didn't expect to
930+        # see), we set this flag and report an UncoordinatedWriteError at the
931+        # end of the publish process.
932+        self.surprised = False
933+
934+        # as a failsafe, refuse to iterate through self.loop more than a
935+        # thousand times.
936+        self.looplimit = 1000
937+
938+        # we keep track of three tables. The first is our goal: which share
939+        # we want to see on which servers. This is initially populated by the
940+        # existing servermap.
941+        self.goal = set() # pairs of (peerid, shnum) tuples
942+
943+        # the second table is our list of outstanding queries: those which
944+        # are in flight and may or may not be delivered, accepted, or
945+        # acknowledged. Items are added to this table when the request is
946+        # sent, and removed when the response returns (or errbacks).
947+        self.outstanding = set() # (peerid, shnum) tuples
948+
949+        # the third is a table of successes: shares which have actually been
950+        # placed. These are populated when responses come back with success.
951+        # When self.placed == self.goal, we're done.
952+        self.placed = set() # (peerid, shnum) tuples
953+
954+        # we also keep a mapping from peerid to RemoteReference. Each time we
955+        # pull a connection out of the full peerlist, we add it to this for
956+        # use later.
957+        self.connections = {}
958+
959+        self.bad_share_checkstrings = {}
960+
961+        # This is set at the last step of the publishing process.
962+        self.versioninfo = ""
963+
964+        # we use the servermap to populate the initial goal: this way we will
965+        # try to update each existing share in place. Since we're
966+        # updating, we ignore damaged and missing shares -- callers must
967+        # run a repair operation to recreate these.
968+        for (peerid, shnum) in self._servermap.servermap:
969+            self.goal.add( (peerid, shnum) )
970+            self.connections[peerid] = self._servermap.connections[peerid]
971+        self.writers = {}
972+        if self._version == MDMF_VERSION:
973+            writer_class = MDMFSlotWriteProxy
974+        else:
975+            writer_class = SDMFSlotWriteProxy
976+
977+        # For each (peerid, shnum) in self.goal, we make a
978+        # write proxy for that peer. We'll use this to write
979+        # shares to the peer.
980+        for key in self.goal:
981+            peerid, shnum = key
982+            write_enabler = self._node.get_write_enabler(peerid)
983+            renew_secret = self._node.get_renewal_secret(peerid)
984+            cancel_secret = self._node.get_cancel_secret(peerid)
985+            secrets = (write_enabler, renew_secret, cancel_secret)
986+
987+            self.writers[shnum] =  writer_class(shnum,
988+                                                self.connections[peerid],
989+                                                self._storage_index,
990+                                                secrets,
991+                                                self._new_seqnum,
992+                                                self.required_shares,
993+                                                self.total_shares,
994+                                                self.segment_size,
995+                                                self.datalength)
996+            self.writers[shnum].peerid = peerid
997+            assert (peerid, shnum) in self._servermap.servermap
998+            old_versionid, old_timestamp = self._servermap.servermap[key]
999+            (old_seqnum, old_root_hash, old_salt, old_segsize,
1000+             old_datalength, old_k, old_N, old_prefix,
1001+             old_offsets_tuple) = old_versionid
1002+            self.writers[shnum].set_checkstring(old_seqnum,
1003+                                                old_root_hash,
1004+                                                old_salt)
1005+
1006+        # Our remote shares will not have a complete checkstring until
1007+        # after we are done writing share data and have started to write
1008+        # blocks. In the meantime, we need to know what to look for when
1009+        # writing, so that we can detect UncoordinatedWriteErrors.
1010+        self._checkstring = self.writers.values()[0].get_checkstring()
1011+
1012+        # Now, we start pushing shares.
1013+        self._status.timings["setup"] = time.time() - self._started
1014+        # First, we encrypt, encode, and publish the shares that we need
1015+        # to encrypt, encode, and publish.
1016+
1017+        # Our update process fetched these for us. We need to update
1018+        # them in place as publishing happens.
1019+        self.blockhashes = {} # shnum -> [blockhashes]
1020+        for (i, bht) in blockhashes.iteritems():
1021+            # We need to extract the leaves from our old hash tree.
1022+            old_segcount = mathutil.div_ceil(version[4],
1023+                                             version[3])
1024+            h = hashtree.IncompleteHashTree(old_segcount)
1025+            bht = dict(enumerate(bht))
1026+            h.set_hashes(bht)
1027+            leaves = h[h.get_leaf_index(0):]
1028+            for j in xrange(self.num_segments - len(leaves)):
1029+                leaves.append(None)
1030+
1031+            assert len(leaves) >= self.num_segments
1032+            self.blockhashes[i] = leaves
1033+            # This list will now be the leaves that were set during the
1034+            # initial upload + enough empty hashes to make it a
1035+            # power-of-two. If we exceed a power of two boundary, we
1036+            # should be encoding the file over again, and should not be
1037+            # here. So, we have
1038+            #assert len(self.blockhashes[i]) == \
1039+            #    hashtree.roundup_pow2(self.num_segments), \
1040+            #        len(self.blockhashes[i])
1041+            # XXX: Except this doesn't work. Figure out why.
1042+
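The loop above recovers the leaves of the old block hash tree and pads them with `None` placeholders for segments whose hashes will be recomputed during the update. That padding step in isolation (`pad_leaves` is a hypothetical helper; the patch does this inline with an `IncompleteHashTree`):

```python
# Hypothetical helper: leaves recovered from the old block hash tree
# are kept, and None entries stand in for segments whose hashes will
# be filled in as the update pushes new blocks.

def pad_leaves(old_leaves, num_segments):
    leaves = list(old_leaves)
    # append placeholders until there's one slot per segment
    leaves.extend([None] * (num_segments - len(leaves)))
    return leaves
```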
1043+        # These are filled in later, after we've modified the block hash
1044+        # tree suitably.
1045+        self.sharehash_leaves = None # eventually [sharehashes]
1046+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
1047+                              # validate the share]
1048+
1049+        d = defer.succeed(None)
1050+        self.log("Starting push")
1051+
1052+        self._state = PUSHING_BLOCKS_STATE
1053+        self._push()
1054+
1055+        return self.done_deferred
1056+
1057+
1058     def publish(self, newdata):
1059         """Publish the filenode's current contents.  Returns a Deferred that
1060         fires (with None) when the publish has done as much work as it's ever
1061hunk ./src/allmydata/mutable/publish.py 354
1062         simultaneous write.
1063         """
1064 
1065-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
1066-        # 2: perform peer selection, get candidate servers
1067-        #  2a: send queries to n+epsilon servers, to determine current shares
1068-        #  2b: based upon responses, create target map
1069-        # 3: send slot_testv_and_readv_and_writev messages
1070-        # 4: as responses return, update share-dispatch table
1071-        # 4a: may need to run recovery algorithm
1072-        # 5: when enough responses are back, we're done
1073+        # 0. Setup encoding parameters, encoder, and other such things.
1074+        # 1. Encrypt, encode, and publish segments.
1075+        assert IMutableUploadable.providedBy(newdata)
1076 
1077hunk ./src/allmydata/mutable/publish.py 358
1078-        self.log("starting publish, datalen is %s" % len(newdata))
1079-        self._status.set_size(len(newdata))
1080+        self.data = newdata
1081+        self.datalength = newdata.get_size()
1082+        if self.datalength >= DEFAULT_MAX_SEGMENT_SIZE:
1083+            self._version = MDMF_VERSION
1084+        else:
1085+            self._version = SDMF_VERSION
1086+
1087+        self.log("starting publish, datalen is %s" % self.datalength)
1088+        self._status.set_size(self.datalength)
1089         self._status.set_status("Started")
1090         self._started = time.time()
1091 
1092hunk ./src/allmydata/mutable/publish.py 414
1093         self.full_peerlist = full_peerlist # for use later, immutable
1094         self.bad_peers = set() # peerids who have errbacked/refused requests
1095 
1096-        self.newdata = newdata
1097-        self.salt = os.urandom(16)
1098-
1099+        # This will set self.segment_size, self.num_segments, and
1100+        # self.fec.
1101         self.setup_encoding_parameters()
1102 
1103         # if we experience any surprises (writes which were rejected because
1104hunk ./src/allmydata/mutable/publish.py 451
1105 
1106         self.bad_share_checkstrings = {}
1107 
1108+        # This is set at the last step of the publishing process.
1109+        self.versioninfo = ""
1110+
1111         # we use the servermap to populate the initial goal: this way we will
1112         # try to update each existing share in place.
1113         for (peerid, shnum) in self._servermap.servermap:
1114hunk ./src/allmydata/mutable/publish.py 467
1115             self.bad_share_checkstrings[key] = old_checkstring
1116             self.connections[peerid] = self._servermap.connections[peerid]
1117 
1118-        # create the shares. We'll discard these as they are delivered. SDMF:
1119-        # we're allowed to hold everything in memory.
1120+        # TODO: Make this part do peer selection.
1121+        self.update_goal()
1122+        self.writers = {}
1123+        if self._version == MDMF_VERSION:
1124+            writer_class = MDMFSlotWriteProxy
1125+        else:
1126+            writer_class = SDMFSlotWriteProxy
1127 
1128hunk ./src/allmydata/mutable/publish.py 475
1129+        # For each (peerid, shnum) in self.goal, we make a
1130+        # write proxy for that peer. We'll use this to write
1131+        # shares to the peer.
1132+        for key in self.goal:
1133+            peerid, shnum = key
1134+            write_enabler = self._node.get_write_enabler(peerid)
1135+            renew_secret = self._node.get_renewal_secret(peerid)
1136+            cancel_secret = self._node.get_cancel_secret(peerid)
1137+            secrets = (write_enabler, renew_secret, cancel_secret)
1138+
1139+            self.writers[shnum] =  writer_class(shnum,
1140+                                                self.connections[peerid],
1141+                                                self._storage_index,
1142+                                                secrets,
1143+                                                self._new_seqnum,
1144+                                                self.required_shares,
1145+                                                self.total_shares,
1146+                                                self.segment_size,
1147+                                                self.datalength)
1148+            self.writers[shnum].peerid = peerid
1149+            if (peerid, shnum) in self._servermap.servermap:
1150+                old_versionid, old_timestamp = self._servermap.servermap[key]
1151+                (old_seqnum, old_root_hash, old_salt, old_segsize,
1152+                 old_datalength, old_k, old_N, old_prefix,
1153+                 old_offsets_tuple) = old_versionid
1154+                self.writers[shnum].set_checkstring(old_seqnum,
1155+                                                    old_root_hash,
1156+                                                    old_salt)
1157+            elif (peerid, shnum) in self.bad_share_checkstrings:
1158+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
1159+                self.writers[shnum].set_checkstring(old_checkstring)
1160+
1161+        # Our remote shares will not have a complete checkstring until
1162+        # after we are done writing share data and have started to write
1163+        # blocks. In the meantime, we need to know what to look for when
1164+        # writing, so that we can detect UncoordinatedWriteErrors.
1165+        self._checkstring = self.writers.values()[0].get_checkstring()
1166+
1167+        # Now, we start pushing shares.
1168         self._status.timings["setup"] = time.time() - self._started
1169hunk ./src/allmydata/mutable/publish.py 515
1170-        d = self._encrypt_and_encode()
1171-        d.addCallback(self._generate_shares)
1172-        def _start_pushing(res):
1173-            self._started_pushing = time.time()
1174-            return res
1175-        d.addCallback(_start_pushing)
1176-        d.addCallback(self.loop) # trigger delivery
1177-        d.addErrback(self._fatal_error)
1178+        # First, we encrypt, encode, and publish the shares that we need
1179+        # to encrypt, encode, and publish.
1180+
1181+        # This will eventually hold the block hash chain for each share
1182+        # that we publish. We define it this way so that empty publishes
1183+        # will still have something to write to the remote slot.
1184+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
1185+        for i in xrange(self.total_shares):
1186+            blocks = self.blockhashes[i]
1187+            for j in xrange(self.num_segments):
1188+                blocks.append(None)
1189+        self.sharehash_leaves = None # eventually [sharehashes]
1190+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
1191+                              # validate the share]
1192+
1193+        d = defer.succeed(None)
1194+        self.log("Starting push")
1195+
1196+        self._state = PUSHING_BLOCKS_STATE
1197+        self._push()
1198 
1199         return self.done_deferred
1200 
1201hunk ./src/allmydata/mutable/publish.py 538
1202-    def setup_encoding_parameters(self):
1203-        segment_size = len(self.newdata)
1204+
1205+    def setup_encoding_parameters(self, offset=0):
1206+        if self._version == MDMF_VERSION:
1207+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
1208+        else:
1209+            segment_size = self.datalength # SDMF is only one segment
1210         # this must be a multiple of self.required_shares
1211         segment_size = mathutil.next_multiple(segment_size,
1212                                               self.required_shares)
1213hunk ./src/allmydata/mutable/publish.py 548
1214         self.segment_size = segment_size
1215+
1216+        # Calculate the starting segment for the upload.
1217         if segment_size:
1218hunk ./src/allmydata/mutable/publish.py 551
1219-            self.num_segments = mathutil.div_ceil(len(self.newdata),
1220+            self.num_segments = mathutil.div_ceil(self.datalength,
1221                                                   segment_size)
1222hunk ./src/allmydata/mutable/publish.py 553
1223+            self.starting_segment = mathutil.div_ceil(offset,
1224+                                                      segment_size)
1225+            self.starting_segment -= 1
1226+            if offset == 0:
1227+                self.starting_segment = 0
1228+
1229         else:
1230             self.num_segments = 0
1231hunk ./src/allmydata/mutable/publish.py 561
1232-        assert self.num_segments in [0, 1,] # SDMF restrictions
1233+            self.starting_segment = 0
1234+
1235+
1236+        self.log("building encoding parameters for file")
1237+        self.log("got segsize %d" % self.segment_size)
1238+        self.log("got %d segments" % self.num_segments)
1239+
1240+        if self._version == SDMF_VERSION:
1241+            assert self.num_segments in (0, 1) # SDMF
1242+        # calculate the tail segment size.
1243+
1244+        if segment_size and self.datalength:
1245+            self.tail_segment_size = self.datalength % segment_size
1246+            self.log("got tail segment size %d" % self.tail_segment_size)
1247+        else:
1248+            self.tail_segment_size = 0
1249+
1250+        if self.tail_segment_size == 0 and segment_size:
1251+            # The tail segment is the same size as the other segments.
1252+            self.tail_segment_size = segment_size
1253+
1254+        # Make FEC encoders
1255+        fec = codec.CRSEncoder()
1256+        fec.set_params(self.segment_size,
1257+                       self.required_shares, self.total_shares)
1258+        self.piece_size = fec.get_block_size()
1259+        self.fec = fec
1260+
1261+        if self.tail_segment_size == self.segment_size:
1262+            self.tail_fec = self.fec
1263+        else:
1264+            tail_fec = codec.CRSEncoder()
1265+            tail_fec.set_params(self.tail_segment_size,
1266+                                self.required_shares,
1267+                                self.total_shares)
1268+            self.tail_fec = tail_fec
1269+
1270+        self._current_segment = self.starting_segment
1271+        self.end_segment = self.num_segments - 1
1272+        # Now figure out where the last segment should be.
1273+        if self.data.get_size() != self.datalength:
1274+            end = self.data.get_size()
1275+            self.end_segment = mathutil.div_ceil(end,
1276+                                                 segment_size)
1277+            self.end_segment -= 1
1278+        self.log("got start segment %d" % self.starting_segment)
1279+        self.log("got end segment %d" % self.end_segment)
1280+
1281+
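The MDMF branch of the segment arithmetic in `setup_encoding_parameters` above can be sketched standalone. `div_ceil` and `next_multiple` are reimplemented here to mirror `allmydata.util.mathutil`; `segment_params` is a hypothetical wrapper, not a function from the patch, and it covers only the MDMF case (SDMF uses a single segment equal to the file size):

```python
# Sketch of the MDMF segment arithmetic: cap segments at 128 KiB,
# round up to a multiple of required_shares so FEC blocks divide
# evenly, then derive segment count and tail-segment size.

def div_ceil(n, d):
    return (n + d - 1) // d

def next_multiple(n, k):
    return div_ceil(n, k) * k

def segment_params(datalength, required_shares, max_segsize=128 * 1024):
    segment_size = next_multiple(max_segsize, required_shares)
    num_segments = div_ceil(datalength, segment_size)
    tail = datalength % segment_size
    if tail == 0:
        # the file ends exactly on a boundary: the tail segment
        # is a full segment
        tail = segment_size
    return segment_size, num_segments, tail
```

For example, a 300,000-byte file with k=3 gets 131,073-byte segments (128 KiB rounded up to a multiple of 3), three segments, and a 37,854-byte tail.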
1282+    def _push(self, ignored=None):
1283+        """
1284+        I manage state transitions. In particular, I check that we still
1285+        have enough writers to complete the upload
1286+        successfully.
1287+        """
1288+        # Can we still successfully publish this file?
1289+        # TODO: Keep track of outstanding queries before aborting the
1290+        #       process.
1291+        if len(self.writers) <= self.required_shares or self.surprised:
1292+            return self._failure()
1293+
1294+        # Figure out what we need to do next. Each of these needs to
1295+        # return a deferred so that we don't block execution when this
1296+        # is first called in the upload method.
1297+        if self._state == PUSHING_BLOCKS_STATE:
1298+            return self.push_segment(self._current_segment)
1299+
1300+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
1301+            return self.push_everything_else()
1302+
1303+        # If we make it to this point, we were successful in placing the
1304+        # file.
1305+        return self._done(None)
1306+
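(Not part of the patch: the `_push` method above dispatches on a small two-state machine -- push blocks until the end segment, then push everything else, then finish. A minimal standalone sketch of that control flow, with the Deferred plumbing omitted and the state constants assumed to mirror the patch:)

```python
# Sketch of _push's dispatch loop. State names mirror the patch;
# real code returns Deferreds instead of looping synchronously.
PUSHING_BLOCKS_STATE, PUSHING_EVERYTHING_ELSE_STATE, DONE_STATE = range(3)

class PushSketch:
    def __init__(self, num_segments):
        self.state = PUSHING_BLOCKS_STATE
        self.current = 0                  # _current_segment
        self.end = num_segments - 1       # end_segment
        self.trace = []

    def push(self):
        while self.state != DONE_STATE:
            if self.state == PUSHING_BLOCKS_STATE:
                if self.current > self.end:
                    # No more segments: move to the next phase.
                    self.state = PUSHING_EVERYTHING_ELSE_STATE
                    continue
                self.trace.append("segment %d" % self.current)
                self.current += 1
            elif self.state == PUSHING_EVERYTHING_ELSE_STATE:
                self.trace.append("everything else")
                self.state = DONE_STATE
        return self.trace
```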
1307+
1308+    def push_segment(self, segnum):
1309+        if self.num_segments == 0 and self._version == SDMF_VERSION:
1310+            self._add_dummy_salts()
1311 
1312hunk ./src/allmydata/mutable/publish.py 640
1313-    def _fatal_error(self, f):
1314-        self.log("error during loop", failure=f, level=log.UNUSUAL)
1315-        self._done(f)
1316+        if segnum > self.end_segment:
1317+            # We don't have any more segments to push.
1318+            self._state = PUSHING_EVERYTHING_ELSE_STATE
1319+            return self._push()
1320+
1321+        d = self._encode_segment(segnum)
1322+        d.addCallback(self._push_segment, segnum)
1323+        def _increment_segnum(ign):
1324+            self._current_segment += 1
1325+        # XXX: I don't think we need to do addBoth here -- any errbacks
1326+        # should be handled within push_segment.
1327+        d.addBoth(_increment_segnum)
1328+        d.addBoth(self._push)
1329+
1330+
1331+    def _add_dummy_salts(self):
1332+        """
1333+        SDMF files need a salt even if they're empty, or the signature
1334+        won't make sense. This method adds a dummy salt to each of our
1335+        SDMF writers so that they can write the signature later.
1336+        """
1337+        salt = os.urandom(16)
1338+        assert self._version == SDMF_VERSION
1339+
1340+        for writer in self.writers.itervalues():
1341+            writer.put_salt(salt)
1342+
1343+
1344+    def _encode_segment(self, segnum):
1345+        """
1346+        I encrypt and encode the segment segnum.
1347+        """
1348+        started = time.time()
1349+
1350+        if segnum + 1 == self.num_segments:
1351+            segsize = self.tail_segment_size
1352+        else:
1353+            segsize = self.segment_size
1354+
1355+
1356+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
1357+        data = self.data.read(segsize)
1358+        # XXX: This is dumb. Why return a list?
1359+        data = "".join(data)
1360+
1361+        assert len(data) == segsize, len(data)
1362+
1363+        salt = os.urandom(16)
1364+
1365+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1366+        enc = AES(key)
1367+        crypttext = enc.process(data)
1368+        assert len(crypttext) == len(data)
1369+
1370+        now = time.time()
1371+        self._status.timings["encrypt"] = now - started
1372+        started = now
1373+
1374+        # now apply FEC
1375+        if segnum + 1 == self.num_segments:
1376+            fec = self.tail_fec
1377+        else:
1378+            fec = self.fec
1379+
1380+        self._status.set_status("Encoding")
1381+        crypttext_pieces = [None] * self.required_shares
1382+        piece_size = fec.get_block_size()
1383+        for i in range(len(crypttext_pieces)):
1384+            offset = i * piece_size
1385+            piece = crypttext[offset:offset+piece_size]
1386+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1387+            crypttext_pieces[i] = piece
1388+            assert len(piece) == piece_size
1389+        d = fec.encode(crypttext_pieces)
1390+        def _done_encoding(res):
1391+            elapsed = time.time() - started
1392+            self._status.timings["encode"] = elapsed
1393+            return (res, salt)
1394+        d.addCallback(_done_encoding)
1395+        return d
1396+
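(Not part of the patch: `_encode_segment` above splits the encrypted segment into `required_shares` equal pieces, zero-padding the last one, before handing them to the CRS encoder. A self-contained sketch of just that splitting step, assuming -- as zfec does -- that the block size is the ceiling of segment size over k:)

```python
def split_into_pieces(crypttext, required_shares):
    # piece_size mirrors fec.get_block_size(): ceil(len / k)
    piece_size = (len(crypttext) + required_shares - 1) // required_shares
    pieces = []
    for i in range(required_shares):
        piece = crypttext[i * piece_size:(i + 1) * piece_size]
        # Zero-pad the tail piece so every piece is piece_size bytes.
        piece = piece + b"\x00" * (piece_size - len(piece))
        pieces.append(piece)
    return pieces
```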
1397+
1398+    def _push_segment(self, encoded_and_salt, segnum):
1399+        """
1400+        I push (data, salt) as segment number segnum.
1401+        """
1402+        results, salt = encoded_and_salt
1403+        shares, shareids = results
1404+        started = time.time()
1405+        for i in xrange(len(shares)):
1406+            sharedata = shares[i]
1407+            shareid = shareids[i]
1408+            if self._version == MDMF_VERSION:
1409+                hashed = salt + sharedata
1410+            else:
1411+                hashed = sharedata
1412+            block_hash = hashutil.block_hash(hashed)
1413+            old_hash = self.blockhashes[shareid][segnum]
1414+            self.blockhashes[shareid][segnum] = block_hash
1415+            # find the writer for this share
1416+            writer = self.writers[shareid]
1417+            writer.put_block(sharedata, segnum, salt)
1418+
1419+
1420+    def push_everything_else(self):
1421+        """
1422+        I put everything else associated with a share.
1423+        """
1424+        encprivkey = self._encprivkey
1425+        self.push_encprivkey()
1426+        self.push_blockhashes()
1427+        self.push_sharehashes()
1428+        self.push_toplevel_hashes_and_signature()
1429+        d = self.finish_publishing()
1430+        def _change_state(ignored):
1431+            self._state = DONE_STATE
1432+        d.addCallback(_change_state)
1433+        d.addCallback(self._push)
1434+        return d
1435+
1436+
1437+    def push_encprivkey(self):
1438+        started = time.time()
1439+        encprivkey = self._encprivkey
1440+        for writer in self.writers.itervalues():
1441+            writer.put_encprivkey(encprivkey)
1442+
1443+
1444+    def push_blockhashes(self):
1445+        started = time.time()
1446+        self.sharehash_leaves = [None] * len(self.blockhashes)
1447+        self.log("%s" % self.blockhashes)
1448+        for shnum, blockhashes in self.blockhashes.iteritems():
1449+            t = hashtree.HashTree(blockhashes)
1450+            self.blockhashes[shnum] = list(t)
1451+            # set the leaf for future use.
1452+            self.sharehash_leaves[shnum] = t[0]
1453+
1454+            writer = self.writers[shnum]
1455+            writer.put_blockhashes(self.blockhashes[shnum])
1456+
1457+
1458+    def push_sharehashes(self):
1459+        started = time.time()
1460+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
1461+        share_hash_chain = {}
1462+        for shnum in xrange(len(self.sharehash_leaves)):
1463+            needed_indices = share_hash_tree.needed_hashes(shnum)
1464+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
1465+                                             for i in needed_indices] )
1466+            writer = self.writers[shnum]
1467+            writer.put_sharehashes(self.sharehashes[shnum])
1468+        self.root_hash = share_hash_tree[0]
1469+
1470+
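(Not part of the patch: `push_blockhashes` and `push_sharehashes` above build Merkle trees over block hashes and share-hash leaves, with the tree root becoming `self.root_hash`. `allmydata.hashtree.HashTree` uses tagged SHA-256d and pads the leaf count to a power of two; a minimal plain-SHA-256 analogue of the root computation, for illustration only:)

```python
import hashlib

def merkle_root(leaves):
    # Hash each leaf, then hash pairs upward until one node remains.
    # (Assumption: duplicate the last node on odd-sized levels; the
    # real Tahoe tree instead pads to a power of two with empty leaves.)
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```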
1471+    def push_toplevel_hashes_and_signature(self):
1472+        # We need to do three things here:
1473+        #   - Push the root hash and salt hash
1474+        #   - Get the checkstring of the resulting layout; sign that.
1475+        #   - Push the signature
1476+        started = time.time()
1477+        for shnum in xrange(self.total_shares):
1478+            writer = self.writers[shnum]
1479+            writer.put_root_hash(self.root_hash)
1480+        self._update_checkstring()
1481+        self._make_and_place_signature()
1482+
1483+
1484+    def _update_checkstring(self):
1485+        """
1486+        After putting the root hash, MDMF files will have the
1487+        checkstring written to the storage server. This means that we
1488+        can update our copy of the checkstring so we can detect
1489+        uncoordinated writes. SDMF files will have the same checkstring,
1490+        so we need not do anything.
1491+        """
1492+        self._checkstring = self.writers.values()[0].get_checkstring()
1493+
1494+
1495+    def _make_and_place_signature(self):
1496+        """
1497+        I create and place the signature.
1498+        """
1499+        started = time.time()
1500+        signable = self.writers[0].get_signable()
1501+        self.signature = self._privkey.sign(signable)
1502+
1503+        for (shnum, writer) in self.writers.iteritems():
1504+            writer.put_signature(self.signature)
1505+
1506+
1507+    def finish_publishing(self):
1508+        # We're almost done -- we just need to put the verification key
1509+        # and the offsets
1510+        started = time.time()
1511+        ds = []
1512+        verification_key = self._pubkey.serialize()
1513+
1514+
1515+        # TODO: Bad, since we remove from this same dict. We need to
1516+        # make a copy, or just use a non-iterated value.
1517+        for (shnum, writer) in self.writers.iteritems():
1518+            writer.put_verification_key(verification_key)
1519+            d = writer.finish_publishing()
1520+            d.addCallback(self._got_write_answer, writer, started)
1521+            d.addErrback(self._connection_problem, writer)
1522+            ds.append(d)
1523+        self._record_verinfo()
1524+        return defer.DeferredList(ds)
1525+
1526+
1527+    def _record_verinfo(self):
1528+        self.versioninfo = self.writers.values()[0].get_verinfo()
1529+
1530+
1531+    def _connection_problem(self, f, writer):
1532+        """
1533+        We ran into a connection problem while working with writer, and
1534+        need to deal with that.
1535+        """
1536+        self.log("found problem: %s" % str(f))
1537+        self._last_failure = f
1538+        del(self.writers[writer.shnum])
1539 
1540hunk ./src/allmydata/mutable/publish.py 864
1541-    def _update_status(self):
1542-        self._status.set_status("Sending Shares: %d placed out of %d, "
1543-                                "%d messages outstanding" %
1544-                                (len(self.placed),
1545-                                 len(self.goal),
1546-                                 len(self.outstanding)))
1547-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
1548 
1549     def loop(self, ignored=None):
1550         self.log("entering loop", level=log.NOISY)
1551hunk ./src/allmydata/mutable/publish.py 988
1552             self.log_goal(self.goal, "after update: ")
1553 
1554 
1555+    def _got_write_answer(self, answer, writer, started):
1556+        if not answer:
1557+            # SDMF writers only pretend to write when readers set their
1558+            # blocks, salts, and so on -- they actually just write once,
1559+            # at the end of the upload process. In fake writes, they
1560+            # return defer.succeed(None). If we see that, we shouldn't
1561+            # bother checking it.
1562+            return
1563 
1564hunk ./src/allmydata/mutable/publish.py 997
1565-    def _encrypt_and_encode(self):
1566-        # this returns a Deferred that fires with a list of (sharedata,
1567-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
1568-        # shares that we care about.
1569-        self.log("_encrypt_and_encode")
1570-
1571-        self._status.set_status("Encrypting")
1572-        started = time.time()
1573-
1574-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
1575-        enc = AES(key)
1576-        crypttext = enc.process(self.newdata)
1577-        assert len(crypttext) == len(self.newdata)
1578+        peerid = writer.peerid
1579+        lp = self.log("_got_write_answer from %s, share %d" %
1580+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
1581 
1582         now = time.time()
1583hunk ./src/allmydata/mutable/publish.py 1002
1584-        self._status.timings["encrypt"] = now - started
1585-        started = now
1586-
1587-        # now apply FEC
1588-
1589-        self._status.set_status("Encoding")
1590-        fec = codec.CRSEncoder()
1591-        fec.set_params(self.segment_size,
1592-                       self.required_shares, self.total_shares)
1593-        piece_size = fec.get_block_size()
1594-        crypttext_pieces = [None] * self.required_shares
1595-        for i in range(len(crypttext_pieces)):
1596-            offset = i * piece_size
1597-            piece = crypttext[offset:offset+piece_size]
1598-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1599-            crypttext_pieces[i] = piece
1600-            assert len(piece) == piece_size
1601-
1602-        d = fec.encode(crypttext_pieces)
1603-        def _done_encoding(res):
1604-            elapsed = time.time() - started
1605-            self._status.timings["encode"] = elapsed
1606-            return res
1607-        d.addCallback(_done_encoding)
1608-        return d
1609-
1610-    def _generate_shares(self, shares_and_shareids):
1611-        # this sets self.shares and self.root_hash
1612-        self.log("_generate_shares")
1613-        self._status.set_status("Generating Shares")
1614-        started = time.time()
1615-
1616-        # we should know these by now
1617-        privkey = self._privkey
1618-        encprivkey = self._encprivkey
1619-        pubkey = self._pubkey
1620-
1621-        (shares, share_ids) = shares_and_shareids
1622-
1623-        assert len(shares) == len(share_ids)
1624-        assert len(shares) == self.total_shares
1625-        all_shares = {}
1626-        block_hash_trees = {}
1627-        share_hash_leaves = [None] * len(shares)
1628-        for i in range(len(shares)):
1629-            share_data = shares[i]
1630-            shnum = share_ids[i]
1631-            all_shares[shnum] = share_data
1632-
1633-            # build the block hash tree. SDMF has only one leaf.
1634-            leaves = [hashutil.block_hash(share_data)]
1635-            t = hashtree.HashTree(leaves)
1636-            block_hash_trees[shnum] = list(t)
1637-            share_hash_leaves[shnum] = t[0]
1638-        for leaf in share_hash_leaves:
1639-            assert leaf is not None
1640-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
1641-        share_hash_chain = {}
1642-        for shnum in range(self.total_shares):
1643-            needed_hashes = share_hash_tree.needed_hashes(shnum)
1644-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
1645-                                              for i in needed_hashes ] )
1646-        root_hash = share_hash_tree[0]
1647-        assert len(root_hash) == 32
1648-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
1649-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
1650-
1651-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
1652-                             self.required_shares, self.total_shares,
1653-                             self.segment_size, len(self.newdata))
1654-
1655-        # now pack the beginning of the share. All shares are the same up
1656-        # to the signature, then they have divergent share hash chains,
1657-        # then completely different block hash trees + salt + share data,
1658-        # then they all share the same encprivkey at the end. The sizes
1659-        # of everything are the same for all shares.
1660-
1661-        sign_started = time.time()
1662-        signature = privkey.sign(prefix)
1663-        self._status.timings["sign"] = time.time() - sign_started
1664-
1665-        verification_key = pubkey.serialize()
1666-
1667-        final_shares = {}
1668-        for shnum in range(self.total_shares):
1669-            final_share = pack_share(prefix,
1670-                                     verification_key,
1671-                                     signature,
1672-                                     share_hash_chain[shnum],
1673-                                     block_hash_trees[shnum],
1674-                                     all_shares[shnum],
1675-                                     encprivkey)
1676-            final_shares[shnum] = final_share
1677-        elapsed = time.time() - started
1678-        self._status.timings["pack"] = elapsed
1679-        self.shares = final_shares
1680-        self.root_hash = root_hash
1681-
1682-        # we also need to build up the version identifier for what we're
1683-        # pushing. Extract the offsets from one of our shares.
1684-        assert final_shares
1685-        offsets = unpack_header(final_shares.values()[0])[-1]
1686-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
1687-        verinfo = (self._new_seqnum, root_hash, self.salt,
1688-                   self.segment_size, len(self.newdata),
1689-                   self.required_shares, self.total_shares,
1690-                   prefix, offsets_tuple)
1691-        self.versioninfo = verinfo
1692-
1693-
1694-
1695-    def _send_shares(self, needed):
1696-        self.log("_send_shares")
1697-
1698-        # we're finally ready to send out our shares. If we encounter any
1699-        # surprises here, it's because somebody else is writing at the same
1700-        # time. (Note: in the future, when we remove the _query_peers() step
1701-        # and instead speculate about [or remember] which shares are where,
1702-        # surprises here are *not* indications of UncoordinatedWriteError,
1703-        # and we'll need to respond to them more gracefully.)
1704-
1705-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
1706-        # organize it by peerid.
1707-
1708-        peermap = DictOfSets()
1709-        for (peerid, shnum) in needed:
1710-            peermap.add(peerid, shnum)
1711-
1712-        # the next thing is to build up a bunch of test vectors. The
1713-        # semantics of Publish are that we perform the operation if the world
1714-        # hasn't changed since the ServerMap was constructed (more or less).
1715-        # For every share we're trying to place, we create a test vector that
1716-        # tests to see if the server*share still corresponds to the
1717-        # map.
1718-
1719-        all_tw_vectors = {} # maps peerid to tw_vectors
1720-        sm = self._servermap.servermap
1721-
1722-        for key in needed:
1723-            (peerid, shnum) = key
1724-
1725-            if key in sm:
1726-                # an old version of that share already exists on the
1727-                # server, according to our servermap. We will create a
1728-                # request that attempts to replace it.
1729-                old_versionid, old_timestamp = sm[key]
1730-                (old_seqnum, old_root_hash, old_salt, old_segsize,
1731-                 old_datalength, old_k, old_N, old_prefix,
1732-                 old_offsets_tuple) = old_versionid
1733-                old_checkstring = pack_checkstring(old_seqnum,
1734-                                                   old_root_hash,
1735-                                                   old_salt)
1736-                testv = (0, len(old_checkstring), "eq", old_checkstring)
1737-
1738-            elif key in self.bad_share_checkstrings:
1739-                old_checkstring = self.bad_share_checkstrings[key]
1740-                testv = (0, len(old_checkstring), "eq", old_checkstring)
1741-
1742-            else:
1743-                # add a testv that requires the share not exist
1744-
1745-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
1746-                # constraints are handled. If the same object is referenced
1747-                # multiple times inside the arguments, foolscap emits a
1748-                # 'reference' token instead of a distinct copy of the
1749-                # argument. The bug is that these 'reference' tokens are not
1750-                # accepted by the inbound constraint code. To work around
1751-                # this, we need to prevent python from interning the
1752-                # (constant) tuple, by creating a new copy of this vector
1753-                # each time.
1754-
1755-                # This bug is fixed in foolscap-0.2.6, and even though this
1756-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
1757-                # supposed to be able to interoperate with older versions of
1758-                # Tahoe which are allowed to use older versions of foolscap,
1759-                # including foolscap-0.2.5 . In addition, I've seen other
1760-                # foolscap problems triggered by 'reference' tokens (see #541
1761-                # for details). So we must keep this workaround in place.
1762-
1763-                #testv = (0, 1, 'eq', "")
1764-                testv = tuple([0, 1, 'eq', ""])
1765-
1766-            testvs = [testv]
1767-            # the write vector is simply the share
1768-            writev = [(0, self.shares[shnum])]
1769-
1770-            if peerid not in all_tw_vectors:
1771-                all_tw_vectors[peerid] = {}
1772-                # maps shnum to (testvs, writevs, new_length)
1773-            assert shnum not in all_tw_vectors[peerid]
1774-
1775-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
1776-
1777-        # we read the checkstring back from each share, however we only use
1778-        # it to detect whether there was a new share that we didn't know
1779-        # about. The success or failure of the write will tell us whether
1780-        # there was a collision or not. If there is a collision, the first
1781-        # thing we'll do is update the servermap, which will find out what
1782-        # happened. We could conceivably reduce a roundtrip by using the
1783-        # readv checkstring to populate the servermap, but really we'd have
1784-        # to read enough data to validate the signatures too, so it wouldn't
1785-        # be an overall win.
1786-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
1787-
1788-        # ok, send the messages!
1789-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
1790-        started = time.time()
1791-        for (peerid, tw_vectors) in all_tw_vectors.items():
1792-
1793-            write_enabler = self._node.get_write_enabler(peerid)
1794-            renew_secret = self._node.get_renewal_secret(peerid)
1795-            cancel_secret = self._node.get_cancel_secret(peerid)
1796-            secrets = (write_enabler, renew_secret, cancel_secret)
1797-            shnums = tw_vectors.keys()
1798-
1799-            for shnum in shnums:
1800-                self.outstanding.add( (peerid, shnum) )
1801+        elapsed = now - started
1802 
1803hunk ./src/allmydata/mutable/publish.py 1004
1804-            d = self._do_testreadwrite(peerid, secrets,
1805-                                       tw_vectors, read_vector)
1806-            d.addCallbacks(self._got_write_answer, self._got_write_error,
1807-                           callbackArgs=(peerid, shnums, started),
1808-                           errbackArgs=(peerid, shnums, started))
1809-            # tolerate immediate errback, like with DeadReferenceError
1810-            d.addBoth(fireEventually)
1811-            d.addCallback(self.loop)
1812-            d.addErrback(self._fatal_error)
1813+        self._status.add_per_server_time(peerid, elapsed)
1814 
1815hunk ./src/allmydata/mutable/publish.py 1006
1816-        self._update_status()
1817-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
1818+        wrote, read_data = answer
1819 
1820hunk ./src/allmydata/mutable/publish.py 1008
1821-    def _do_testreadwrite(self, peerid, secrets,
1822-                          tw_vectors, read_vector):
1823-        storage_index = self._storage_index
1824-        ss = self.connections[peerid]
1825+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
1826 
1827hunk ./src/allmydata/mutable/publish.py 1010
1828-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
1829-        d = ss.callRemote("slot_testv_and_readv_and_writev",
1830-                          storage_index,
1831-                          secrets,
1832-                          tw_vectors,
1833-                          read_vector)
1834-        return d
1835+        # We need to remove from surprise_shares any shares that we are
1836+        # knowingly also writing to that peer from other writers.
1837 
1838hunk ./src/allmydata/mutable/publish.py 1013
1839-    def _got_write_answer(self, answer, peerid, shnums, started):
1840-        lp = self.log("_got_write_answer from %s" %
1841-                      idlib.shortnodeid_b2a(peerid))
1842-        for shnum in shnums:
1843-            self.outstanding.discard( (peerid, shnum) )
1844+        # TODO: Precompute this.
1845+        known_shnums = [x.shnum for x in self.writers.values()
1846+                        if x.peerid == peerid]
1847+        surprise_shares -= set(known_shnums)
1848+        self.log("found the following surprise shares: %s" %
1849+                 str(surprise_shares))
1850 
1851hunk ./src/allmydata/mutable/publish.py 1020
1852-        now = time.time()
1853-        elapsed = now - started
1854-        self._status.add_per_server_time(peerid, elapsed)
1855-
1856-        wrote, read_data = answer
1857-
1858-        surprise_shares = set(read_data.keys()) - set(shnums)
1860+        # Now surprise_shares contains all of the shares that we did not
1860+        # expect to be there.
1861 
1862         surprised = False
1863         for shnum in surprise_shares:
1864hunk ./src/allmydata/mutable/publish.py 1027
1865             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
1866             checkstring = read_data[shnum][0]
1867-            their_version_info = unpack_checkstring(checkstring)
1868-            if their_version_info == self._new_version_info:
1869+            # What we want to do here is to see if their (seqnum,
1870+            # roothash, salt) is the same as our (seqnum, roothash,
1871+            # salt), or the equivalent for MDMF. The best way to do this
1872+            # is to store a packed representation of our checkstring
1873+            # somewhere, then not bother unpacking the other
1874+            # checkstring.
1875+            if checkstring == self._checkstring:
1876                 # they have the right share, somehow
1877 
1878                 if (peerid,shnum) in self.goal:
1879hunk ./src/allmydata/mutable/publish.py 1112
1880             self.log("our testv failed, so the write did not happen",
1881                      parent=lp, level=log.WEIRD, umid="8sc26g")
1882             self.surprised = True
1883-            self.bad_peers.add(peerid) # don't ask them again
1884+            # TODO: This needs to
1885+            self.bad_peers.add(writer) # don't ask them again
1886             # use the checkstring to add information to the log message
1887             for (shnum,readv) in read_data.items():
1888                 checkstring = readv[0]
1889hunk ./src/allmydata/mutable/publish.py 1138
1890             # self.loop() will take care of finding new homes
1891             return
1892 
1893-        for shnum in shnums:
1894-            self.placed.add( (peerid, shnum) )
1895-            # and update the servermap
1896-            self._servermap.add_new_share(peerid, shnum,
1897+        # and update the servermap
1898+        # self.versioninfo is set during the last phase of publishing.
1899+        # If we get there, we know that responses correspond to placed
1900+        # shares, and can safely execute these statements.
1901+        if self.versioninfo:
1902+            self.log("wrote successfully: adding new share to servermap")
1903+            self._servermap.add_new_share(peerid, writer.shnum,
1904                                           self.versioninfo, started)
1905hunk ./src/allmydata/mutable/publish.py 1146
1906+            self.placed.add( (peerid, writer.shnum) )
1907 
1908         # self.loop() will take care of checking to see if we're done
1909         return
1910hunk ./src/allmydata/mutable/publish.py 1151
1911 
1912-    def _got_write_error(self, f, peerid, shnums, started):
1913-        for shnum in shnums:
1914-            self.outstanding.discard( (peerid, shnum) )
1915-        self.bad_peers.add(peerid)
1916-        if self._first_write_error is None:
1917-            self._first_write_error = f
1918-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
1919-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
1920-                 failure=f,
1921-                 level=log.UNUSUAL)
1922-        # self.loop() will take care of checking to see if we're done
1923-        return
1924-
1925 
1926     def _done(self, res):
1927         if not self._running:
1928hunk ./src/allmydata/mutable/publish.py 1159
1929         now = time.time()
1930         self._status.timings["total"] = now - self._started
1931         self._status.set_active(False)
1932-        if isinstance(res, failure.Failure):
1933-            self.log("Publish done, with failure", failure=res,
1934-                     level=log.WEIRD, umid="nRsR9Q")
1935-            self._status.set_status("Failed")
1936-        elif self.surprised:
1937-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
1938-            self._status.set_status("UncoordinatedWriteError")
1939-            # deliver a failure
1940-            res = failure.Failure(UncoordinatedWriteError())
1941-            # TODO: recovery
1942-        else:
1943-            self.log("Publish done, success")
1944-            self._status.set_status("Finished")
1945-            self._status.set_progress(1.0)
1946+        self.log("Publish done, success")
1947+        self._status.set_status("Finished")
1948+        self._status.set_progress(1.0)
1949         eventually(self.done_deferred.callback, res)
1950 
1951hunk ./src/allmydata/mutable/publish.py 1164
1952+    def _failure(self):
1953+
1954+        if not self.surprised:
1955+            # We ran out of servers
1956+            self.log("Publish ran out of good servers, "
1957+                     "last failure was: %s" % str(self._last_failure))
1958+            e = NotEnoughServersError("Ran out of non-bad servers, "
1959+                                      "last failure was %s" %
1960+                                      str(self._last_failure))
1961+        else:
1962+            # We ran into shares that we didn't recognize, which means
1963+            # that we need to return an UncoordinatedWriteError.
1964+            self.log("Publish failed with UncoordinatedWriteError")
1965+            e = UncoordinatedWriteError()
1966+        f = failure.Failure(e)
1967+        eventually(self.done_deferred.callback, f)
1968+
1969+
1970+class MutableFileHandle:
1971+    """
1972+    I am a mutable uploadable built around a filehandle-like object,
1973+    usually either a StringIO instance or a handle to an actual file.
1974+    """
1975+    implements(IMutableUploadable)
1976+
1977+    def __init__(self, filehandle):
1978+        # The filehandle can be any file-like object that
1979+        # has these two methods; we don't care beyond that.
1980+        assert hasattr(filehandle, "read")
1981+        assert hasattr(filehandle, "close")
1982+
1983+        self._filehandle = filehandle
1984+        # We must start reading at the beginning of the file, or we risk
1985+        # encountering errors when the data read does not match the size
1986+        # reported to the uploader.
1987+        self._filehandle.seek(0)
1988+
1989+        # We have not yet read anything, so our position is 0.
1990+        self._marker = 0
1991+
1992+
1993+    def get_size(self):
1994+        """
1995+        I return the amount of data in my filehandle.
1996+        """
1997+        if not hasattr(self, "_size"):
1998+            old_position = self._filehandle.tell()
1999+            # Seek to the end of the file by seeking 0 bytes from the
2000+            # file's end
2001+            self._filehandle.seek(0, 2) # 2 == os.SEEK_END in 2.5+
2002+            self._size = self._filehandle.tell()
2003+            # Restore the previous position, in case this was called
2004+            # after a read.
2005+            self._filehandle.seek(old_position)
2006+            assert self._filehandle.tell() == old_position
2007+
2008+        assert hasattr(self, "_size")
2009+        return self._size
2010+
2011+
2012+    def pos(self):
2013+        """
2014+        I return the position of my read marker -- i.e., how much data I
2015+        have already read and returned to callers.
2016+        """
2017+        return self._marker
2018+
2019+
2020+    def read(self, length):
2021+        """
2022+        I return some data (up to length bytes) from my filehandle.
2023+
2024+        In most cases, I return length bytes, but sometimes I won't --
2025+        for example, if I am asked to read beyond the end of a file, or
2026+        an error occurs.
2027+        """
2028+        results = self._filehandle.read(length)
2029+        self._marker += len(results)
2030+        return [results]
2031+
2032+
2033+    def close(self):
2034+        """
2035+        I close the underlying filehandle. Any further operations on the
2036+        filehandle fail at this point.
2037+        """
2038+        self._filehandle.close()
2039+
2040+
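As a standalone sketch (not part of the patch), the seek/tell size probe and the list-returning read() of MutableFileHandle behave like this; io.BytesIO stands in for the Python 2 StringIO/file objects the patch targets:

```python
import io
import os

class SizedFileHandle(object):
    """Minimal stand-in for MutableFileHandle's size and read bookkeeping."""
    def __init__(self, filehandle):
        assert hasattr(filehandle, "read")
        self._filehandle = filehandle
        self._filehandle.seek(0)   # always read from the start
        self._marker = 0

    def get_size(self):
        # Probe the size once by seeking to the end, then restore the
        # previous position so an in-progress read is not disturbed.
        if not hasattr(self, "_size"):
            old_position = self._filehandle.tell()
            self._filehandle.seek(0, os.SEEK_END)
            self._size = self._filehandle.tell()
            self._filehandle.seek(old_position)
        return self._size

    def pos(self):
        return self._marker

    def read(self, length):
        # As in the patch, return a list of chunks rather than a string.
        results = self._filehandle.read(length)
        self._marker += len(results)
        return [results]

fh = SizedFileHandle(io.BytesIO(b"hello world"))
assert fh.get_size() == 11
assert fh.read(5) == [b"hello"]
assert fh.pos() == 5
assert fh.get_size() == 11   # the size probe did not move the read position
```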
2041+class MutableData(MutableFileHandle):
2042+    """
2043+    I am a mutable uploadable built around a string, which I then cast
2044+    into a StringIO and treat as a filehandle.
2045+    """
2046+
2047+    def __init__(self, s):
2048+        # Take a string and return a file-like uploadable.
2049+        assert isinstance(s, str)
2050+
2051+        MutableFileHandle.__init__(self, StringIO(s))
2052+
2053+
2054+class TransformingUploadable:
2055+    """
2056+    I am an IMutableUploadable that wraps another IMutableUploadable,
2057+    and some segments that are already on the grid. When I am called to
2058+    read, I handle merging of boundary segments.
2059+    """
2060+    implements(IMutableUploadable)
2061+
2062+
2063+    def __init__(self, data, offset, segment_size, start, end):
2064+        assert IMutableUploadable.providedBy(data)
2065+
2066+        self._newdata = data
2067+        self._offset = offset
2068+        self._segment_size = segment_size
2069+        self._start = start
2070+        self._end = end
2071+
2072+        self._read_marker = 0
2073+
2074+        self._first_segment_offset = offset % segment_size
2075+
2076+        num = self.log("TransformingUploadable: starting", parent=None)
2077+        self._log_number = num
2078+        self.log("got fso: %d" % self._first_segment_offset)
2079+        self.log("got offset: %d" % self._offset)
2080+
2081+
2082+    def log(self, *args, **kwargs):
2083+        if 'parent' not in kwargs:
2084+            kwargs['parent'] = self._log_number
2085+        if "facility" not in kwargs:
2086+            kwargs["facility"] = "tahoe.mutable.transforminguploadable"
2087+        return log.msg(*args, **kwargs)
2088+
2089+
2090+    def get_size(self):
2091+        return self._offset + self._newdata.get_size()
2092+
2093+
2094+    def read(self, length):
2095+        # We can get data from 3 sources here.
2096+        #   1. The first of the segments provided to us.
2097+        #   2. The data that we're replacing things with.
2098+        #   3. The last of the segments provided to us.
2099+
2100+        # are we still in source 1 (old start data)?
2101+        self.log("reading %d bytes" % length)
2102+
2103+        old_start_data = ""
2104+        old_data_length = self._first_segment_offset - self._read_marker
2105+        if old_data_length > 0:
2106+            if old_data_length > length:
2107+                old_data_length = length
2108+            self.log("returning %d bytes of old start data" % old_data_length)
2109+
2110+            old_data_end = old_data_length + self._read_marker
2111+            old_start_data = self._start[self._read_marker:old_data_end]
2112+            length -= old_data_length
2113+        else:
2114+            # otherwise the negative length would break the offset math below.
2115+            old_data_length = 0
2116+
2117+        # Is there enough new data to satisfy this read? If not, we need
2118+        # to pad the end of the data with data from our last segment.
2119+        old_end_length = length - \
2120+            (self._newdata.get_size() - self._newdata.pos())
2121+        old_end_data = ""
2122+        if old_end_length > 0:
2123+            self.log("reading %d bytes of old end data" % old_end_length)
2124+
2125+            # TODO: We're not explicitly checking for tail segment size
2126+            # here. Is that a problem?
2127+            old_data_offset = (length - old_end_length + \
2128+                               old_data_length) % self._segment_size
2129+            self.log("reading at offset %d" % old_data_offset)
2130+            old_end = old_data_offset + old_end_length
2131+            old_end_data = self._end[old_data_offset:old_end]
2132+            length -= old_end_length
2133+            assert length == self._newdata.get_size() - self._newdata.pos()
2134+
2135+        self.log("reading %d bytes of new data" % length)
2136+        new_data = self._newdata.read(length)
2137+        new_data = "".join(new_data)
2138+
2139+        self._read_marker += len(old_start_data + new_data + old_end_data)
2140+
2141+        return old_start_data + new_data + old_end_data
2142 
2143hunk ./src/allmydata/mutable/publish.py 1355
2144+    def close(self):
2145+        pass
2146}
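The read() logic in TransformingUploadable above stitches up to three byte sources together. A compressed, non-incremental sketch of the same splice (splice is a hypothetical helper, not in the patch; it assumes the new data ends strictly inside the last old segment):

```python
def splice(old_first_seg, old_last_seg, new_data, offset, segment_size):
    """Splice new_data over existing segment data starting at offset.

    old_first_seg and old_last_seg play the roles of self._start and
    self._end above; the real class reads incrementally and tracks a
    marker instead of building the result in one call.
    """
    first_seg_offset = offset % segment_size       # _first_segment_offset
    head = old_first_seg[:first_seg_offset]        # old start data
    end_offset = (first_seg_offset + len(new_data)) % segment_size
    tail = old_last_seg[end_offset:]               # old end data
    return head + new_data + tail

# Overwrite 3 bytes in the middle of a single 8-byte segment:
assert splice(b"AAAAAAAA", b"AAAAAAAA", b"xyz", 3, 8) == b"AAAxyzAA"
```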
2147[mutable/retrieve.py: Modify the retrieval process to support MDMF
2148Kevan Carstensen <kevan@isnotajoke.com>**20100809232312
2149 Ignore-this: 956ff24c26b494f1be8db81cf3fdd61e
2150 
2151 The logic behind a mutable file download had to be adapted to work with
2152 segmented mutable files; this patch performs those adaptations. It also
2153 exposes some decoding and decrypting functionality to make partial-file
2154 updates a little easier, and supports efficient random-access downloads
2155 of parts of an MDMF file.
2156] {
2157hunk ./src/allmydata/mutable/retrieve.py 7
2158 from zope.interface import implements
2159 from twisted.internet import defer
2160 from twisted.python import failure
2161+from twisted.internet.interfaces import IPushProducer, IConsumer
2162 from foolscap.api import DeadReferenceError, eventually, fireEventually
2163hunk ./src/allmydata/mutable/retrieve.py 9
2164-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
2165-from allmydata.util import hashutil, idlib, log
2166+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
2167+                                 MDMF_VERSION, SDMF_VERSION
2168+from allmydata.util import hashutil, idlib, log, mathutil
2169 from allmydata import hashtree, codec
2170 from allmydata.storage.server import si_b2a
2171 from pycryptopp.cipher.aes import AES
2172hunk ./src/allmydata/mutable/retrieve.py 18
2173 from pycryptopp.publickey import rsa
2174 
2175 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
2176-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
2177+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
2178+                                     MDMFSlotReadProxy
2179 
2180 class RetrieveStatus:
2181     implements(IRetrieveStatus)
2182hunk ./src/allmydata/mutable/retrieve.py 86
2183     # times, and each will have a separate response chain. However the
2184     # Retrieve object will remain tied to a specific version of the file, and
2185     # will use a single ServerMap instance.
2186+    implements(IPushProducer)
2187 
2188hunk ./src/allmydata/mutable/retrieve.py 88
2189-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
2190+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
2191+                 verify=False):
2192         self._node = filenode
2193         assert self._node.get_pubkey()
2194         self._storage_index = filenode.get_storage_index()
2195hunk ./src/allmydata/mutable/retrieve.py 107
2196         self.verinfo = verinfo
2197         # during repair, we may be called upon to grab the private key, since
2198         # it wasn't picked up during a verify=False checker run, and we'll
2199-        # need it for repair to generate the a new version.
2200-        self._need_privkey = fetch_privkey
2201-        if self._node.get_privkey():
2202+        # need it for repair to generate a new version.
2203+        self._need_privkey = fetch_privkey or verify
2204+        if self._node.get_privkey() and not verify:
2205             self._need_privkey = False
2206 
2207hunk ./src/allmydata/mutable/retrieve.py 112
2208+        if self._need_privkey:
2209+            # TODO: Evaluate the need for this. We'll use it if we want
2210+            # to limit how many queries are on the wire for the privkey
2211+            # at once.
2212+            self._privkey_query_markers = [] # one Marker for each time we've
2213+                                             # tried to get the privkey.
2214+
2215+        # verify means that we are using the downloader logic to verify all
2216+        # of our shares. This tells the downloader a few things.
2217+        #
2218+        # 1. We need to download all of the shares.
2219+        # 2. We don't need to decode or decrypt the shares, since our
2220+        #    caller doesn't care about the plaintext, only the
2221+        #    information about which shares are or are not valid.
2222+        # 3. When we are validating readers, we need to validate the
2223+        # 3. When we are validating readers, we need to validate the
2224+        #    signature on the prefix. (Do we? The servermap update does this too.)
2225+        self._verify = False
2226+        if verify:
2227+            self._verify = True
2228+
2229         self._status = RetrieveStatus()
2230         self._status.set_storage_index(self._storage_index)
2231         self._status.set_helper(False)
2232hunk ./src/allmydata/mutable/retrieve.py 142
2233          offsets_tuple) = self.verinfo
2234         self._status.set_size(datalength)
2235         self._status.set_encoding(k, N)
2236+        self.readers = {}
2237+        self._paused = False
2238+        self._pause_deferred = None
2239+        self._offset = None
2240+        self._read_length = None
2241+
2242 
2243     def get_status(self):
2244         return self._status
2245hunk ./src/allmydata/mutable/retrieve.py 159
2246             kwargs["facility"] = "tahoe.mutable.retrieve"
2247         return log.msg(*args, **kwargs)
2248 
2249-    def download(self):
2250+
2251+    ###################
2252+    # IPushProducer
2253+
2254+    def pauseProducing(self):
2255+        """
2256+        I am called by my download target if we have produced too much
2257+        data for it to handle. I make the downloader stop producing new
2258+        data until my resumeProducing method is called.
2259+        """
2260+        if self._paused:
2261+            return
2262+
2263+        # fired when the download is unpaused.
2264+        self._pause_deferred = defer.Deferred()
2265+        self._paused = True
2266+
2267+
2268+    def resumeProducing(self):
2269+        """
2270+        I am called by my download target once it is ready to begin
2271+        receiving data again.
2272+        """
2273+        if not self._paused:
2274+            return
2275+
2276+        self._paused = False
2277+        p = self._pause_deferred
2278+        self._pause_deferred = None
2279+        eventually(p.callback, None)
2280+
2281+
2282+    def _check_for_paused(self, res):
2283+        """
2284+        I am called just before a write to the consumer. I return a
2285+        Deferred that eventually fires with the data that is to be
2286+        written to the consumer. If the download has not been paused,
2287+        the Deferred fires immediately. Otherwise, the Deferred fires
2288+        when the downloader is unpaused.
2289+        """
2290+        if self._paused:
2291+            d = defer.Deferred()
2292+            self._pause_deferred.addCallback(lambda ignored: d.callback(res))
2293+            return d
2294+        return defer.succeed(res)
2295+
2296+
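The pause/resume gate above can be sketched without Twisted. FakeDeferred below is a hypothetical two-method stand-in for twisted.internet.defer.Deferred, and delivery is synchronous here rather than via eventually(); this is an illustration of the flow control, not the real producer:

```python
class FakeDeferred(object):
    """Tiny stand-in for a Twisted Deferred: queue callbacks, fire once."""
    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def addCallback(self, fn):
        if self._fired:
            fn(self._result)
        else:
            self._callbacks.append(fn)

    def callback(self, result):
        self._fired = True
        self._result = result
        for fn in self._callbacks:
            fn(result)

class PausableProducer(object):
    def __init__(self):
        self._paused = False
        self._pause_deferred = None

    def pauseProducing(self):
        if self._paused:
            return
        self._pause_deferred = FakeDeferred()
        self._paused = True

    def resumeProducing(self):
        if not self._paused:
            return
        self._paused = False
        p, self._pause_deferred = self._pause_deferred, None
        p.callback(None)

    def check_for_paused(self, res, deliver):
        # deliver(res) runs now if unpaused, or when resumeProducing fires.
        if self._paused:
            self._pause_deferred.addCallback(lambda _: deliver(res))
        else:
            deliver(res)

log = []
p = PausableProducer()
p.check_for_paused("a", log.append)   # unpaused: delivered immediately
p.pauseProducing()
p.check_for_paused("b", log.append)   # paused: held
p.check_for_paused("c", log.append)   # also held, in order
assert log == ["a"]
p.resumeProducing()
assert log == ["a", "b", "c"]         # held writes flushed on resume
```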
2297+    def download(self, consumer=None, offset=0, size=None):
2298+        assert IConsumer.providedBy(consumer) or self._verify
2299+
2300+        if consumer:
2301+            self._consumer = consumer
2302+            # we provide IPushProducer, so streaming=True, per
2303+            # IConsumer.
2304+            self._consumer.registerProducer(self, streaming=True)
2305+
2306         self._done_deferred = defer.Deferred()
2307         self._started = time.time()
2308         self._status.set_status("Retrieving Shares")
2309hunk ./src/allmydata/mutable/retrieve.py 219
2310 
2311+        self._offset = offset
2312+        self._read_length = size
2313+
2314         # first, which servers can we use?
2315         versionmap = self.servermap.make_versionmap()
2316         shares = versionmap[self.verinfo]
2317hunk ./src/allmydata/mutable/retrieve.py 229
2318         self.remaining_sharemap = DictOfSets()
2319         for (shnum, peerid, timestamp) in shares:
2320             self.remaining_sharemap.add(shnum, peerid)
2321+            # If the servermap update fetched anything, it fetched at least 1
2322+            # KiB, so we ask for that much.
2323+            # TODO: Change the cache methods to allow us to fetch all of the
2324+            # data that they have, then change this method to do that.
2325+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
2326+                                                               shnum,
2327+                                                               0,
2328+                                                               1000)
2329+            ss = self.servermap.connections[peerid]
2330+            reader = MDMFSlotReadProxy(ss,
2331+                                       self._storage_index,
2332+                                       shnum,
2333+                                       any_cache)
2334+            reader.peerid = peerid
2335+            self.readers[shnum] = reader
2336+
2337 
2338         self.shares = {} # maps shnum to validated blocks
2339hunk ./src/allmydata/mutable/retrieve.py 247
2340+        self._active_readers = [] # list of active readers for this dl.
2341+        self._validated_readers = set() # set of readers that we have
2342+                                        # validated the prefix of
2343+        self._block_hash_trees = {} # shnum => hashtree
2344+        # TODO: Make this into a file-backed consumer or something to
2345+        # conserve memory.
2346+        self._plaintext = ""
2347 
2348         # how many shares do we need?
2349hunk ./src/allmydata/mutable/retrieve.py 256
2350-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2351+        (seqnum,
2352+         root_hash,
2353+         IV,
2354+         segsize,
2355+         datalength,
2356+         k,
2357+         N,
2358+         prefix,
2359          offsets_tuple) = self.verinfo
2360hunk ./src/allmydata/mutable/retrieve.py 265
2361-        assert len(self.remaining_sharemap) >= k
2362-        # we start with the lowest shnums we have available, since FEC is
2363-        # faster if we're using "primary shares"
2364-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
2365-        for shnum in self.active_shnums:
2366-            # we use an arbitrary peer who has the share. If shares are
2367-            # doubled up (more than one share per peer), we could make this
2368-            # run faster by spreading the load among multiple peers. But the
2369-            # algorithm to do that is more complicated than I want to write
2370-            # right now, and a well-provisioned grid shouldn't have multiple
2371-            # shares per peer.
2372-            peerid = list(self.remaining_sharemap[shnum])[0]
2373-            self.get_data(shnum, peerid)
2374 
2375hunk ./src/allmydata/mutable/retrieve.py 266
2376-        # control flow beyond this point: state machine. Receiving responses
2377-        # from queries is the input. We might send out more queries, or we
2378-        # might produce a result.
2379 
2380hunk ./src/allmydata/mutable/retrieve.py 267
2381+        # We need one share hash tree for the entire file; its leaves
2382+        # are the roots of the block hash trees for the shares that
2383+        # comprise it, and its root is in the verinfo.
2384+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
2385+        self.share_hash_tree.set_hashes({0: root_hash})
2386+
2387+        # This will set up both the segment decoder and the tail segment
2388+        # decoder, as well as a variety of other instance variables that
2389+        # the download process will use.
2390+        self._setup_encoding_parameters()
2391+        assert len(self.remaining_sharemap) >= k
2392+
2393+        self.log("starting download")
2394+        self._paused = False
2395+        self._add_active_peers()
2396+        # The download process beyond this is a state machine.
2397+        # _add_active_peers will select the peers that we want to use
2398+        # for the download, and then attempt to start downloading. After
2399+        # each segment, it will check for doneness, reacting to broken
2400+        # peers and corrupt shares as necessary. If it runs out of good
2401+        # peers before downloading all of the segments, _done_deferred
2402+        # will errback.  Otherwise, it will eventually callback with the
2403+        # contents of the mutable file.
2404         return self._done_deferred
2405 
2406hunk ./src/allmydata/mutable/retrieve.py 292
2407-    def get_data(self, shnum, peerid):
2408-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
2409-                 shnum=shnum,
2410-                 peerid=idlib.shortnodeid_b2a(peerid),
2411-                 level=log.NOISY)
2412-        ss = self.servermap.connections[peerid]
2413-        started = time.time()
2414-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2415+
2416+    def decode(self, blocks_and_salts, segnum):
2417+        """
2418+        I am a helper method that the mutable file update process uses
2419+        as a shortcut to decode and decrypt the segments that it needs
2420+        to fetch in order to perform a file update. I take in a
2421+        collection of blocks and salts, and pick some of those to make a
2422+        segment with. I return the plaintext associated with that
2423+        segment.
2424+        """
2425+        # shnum => block hash tree. Unused, but _setup_encoding_parameters will
2426+        # want to set this.
2427+        # XXX: Make it so that it won't set this if we're just decoding.
2428+        self._block_hash_trees = {}
2429+        self._setup_encoding_parameters()
2430+        # This is the form expected by decode.
2431+        blocks_and_salts = blocks_and_salts.items()
2432+        blocks_and_salts = [(True, [d]) for d in blocks_and_salts]
2433+
2434+        d = self._decode_blocks(blocks_and_salts, segnum)
2435+        d.addCallback(self._decrypt_segment)
2436+        return d
2437+
2438+
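The two reshaping lines in decode() above convert a plain dict into (success, results) pairs, the shape a DeferredList-style result would have and what _decode_blocks expects. A standalone illustration (the data values are made up):

```python
# {shnum: (block, salt)} as handed to decode() by the update process.
blocks_and_salts = {0: (b"block0", b"salt0"), 1: (b"block1", b"salt1")}

# Each dict item becomes a one-element "query result" flagged as
# successful, mimicking a (success, value) pair from a DeferredList.
reshaped = [(True, [d]) for d in blocks_and_salts.items()]

assert reshaped[0] == (True, [(0, (b"block0", b"salt0"))])
assert reshaped[1] == (True, [(1, (b"block1", b"salt1"))])
```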
2439+    def _setup_encoding_parameters(self):
2440+        """
2441+        I set up the encoding parameters, including k, n, the number
2442+        of segments associated with this file, and the segment decoder.
2443+        """
2444+        (seqnum,
2445+         root_hash,
2446+         IV,
2447+         segsize,
2448+         datalength,
2449+         k,
2450+         n,
2451+         known_prefix,
2452          offsets_tuple) = self.verinfo
2453hunk ./src/allmydata/mutable/retrieve.py 330
2454-        offsets = dict(offsets_tuple)
2455+        self._required_shares = k
2456+        self._total_shares = n
2457+        self._segment_size = segsize
2458+        self._data_length = datalength
2459 
2460hunk ./src/allmydata/mutable/retrieve.py 335
2461-        # we read the checkstring, to make sure that the data we grab is from
2462-        # the right version.
2463-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
2464+        if not IV:
2465+            self._version = MDMF_VERSION
2466+        else:
2467+            self._version = SDMF_VERSION
2468 
2469hunk ./src/allmydata/mutable/retrieve.py 340
2470-        # We also read the data, and the hashes necessary to validate them
2471-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
2472-        # signature or the pubkey, since that was handled during the
2473-        # servermap phase, and we'll be comparing the share hash chain
2474-        # against the roothash that was validated back then.
2475+        if datalength and segsize:
2476+            self._num_segments = mathutil.div_ceil(datalength, segsize)
2477+            self._tail_data_size = datalength % segsize
2478+        else:
2479+            self._num_segments = 0
2480+            self._tail_data_size = 0
2481 
2482hunk ./src/allmydata/mutable/retrieve.py 347
2483-        readv.append( (offsets['share_hash_chain'],
2484-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
2485+        self._segment_decoder = codec.CRSDecoder()
2486+        self._segment_decoder.set_params(segsize, k, n)
2487 
2488hunk ./src/allmydata/mutable/retrieve.py 350
2489-        # if we need the private key (for repair), we also fetch that
2490-        if self._need_privkey:
2491-            readv.append( (offsets['enc_privkey'],
2492-                           offsets['EOF'] - offsets['enc_privkey']) )
2493+        if not self._tail_data_size:
2494+            self._tail_data_size = segsize
2495+
2496+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
2497+                                                         self._required_shares)
2498+        if self._tail_segment_size == self._segment_size:
2499+            self._tail_decoder = self._segment_decoder
2500+        else:
2501+            self._tail_decoder = codec.CRSDecoder()
2502+            self._tail_decoder.set_params(self._tail_segment_size,
2503+                                          self._required_shares,
2504+                                          self._total_shares)
2505 
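A worked example of the segment arithmetic above, with local helpers standing in for allmydata.util.mathutil (the numbers are illustrative only):

```python
def div_ceil(n, d):
    """Smallest integer >= n / d, as in allmydata.util.mathutil."""
    return (n + d - 1) // d

def next_multiple(n, k):
    """Smallest multiple of k that is >= n."""
    return div_ceil(n, k) * k

datalength, segsize, k = 100001, 32768, 3
num_segments = div_ceil(datalength, segsize)
tail_data_size = datalength % segsize or segsize
tail_segment_size = next_multiple(tail_data_size, k)

assert num_segments == 4
assert tail_data_size == 1697
assert tail_segment_size == 1698   # padded so FEC gets a k-divisible input
```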
2506hunk ./src/allmydata/mutable/retrieve.py 363
2507-        m = Marker()
2508-        self._outstanding_queries[m] = (peerid, shnum, started)
2509+        self.log("got encoding parameters: "
2510+                 "k: %d "
2511+                 "n: %d "
2512+                 "%d segments of %d bytes each (%d byte tail segment)" % \
2513+                 (k, n, self._num_segments, self._segment_size,
2514+                  self._tail_segment_size))
2515 
2516hunk ./src/allmydata/mutable/retrieve.py 370
2517-        # ask the cache first
2518-        got_from_cache = False
2519-        datavs = []
2520-        for (offset, length) in readv:
2521-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
2522-                                                            offset, length)
2523-            if data is not None:
2524-                datavs.append(data)
2525-        if len(datavs) == len(readv):
2526-            self.log("got data from cache")
2527-            got_from_cache = True
2528-            d = fireEventually({shnum: datavs})
2529-            # datavs is a dict mapping shnum to a pair of strings
2530+        for i in xrange(self._total_shares):
2531+            # So we don't have to do this later.
2532+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
2533+
2534+        # Our last task is to tell the downloader where to start and
2535+        # where to stop. We use three parameters for that:
2536+        #   - self._start_segment: the segment that we need to start
2537+        #     downloading from.
2538+        #   - self._current_segment: the next segment that we need to
2539+        #     download.
2540+        #   - self._last_segment: The last segment that we were asked to
2541+        #     download.
2542+        #
2543+        #  We say that the download is complete when
2544+        #  self._current_segment > self._last_segment. We use
2545+        #  self._start_segment and self._last_segment to know when to
2546+        #  strip things off of segments, and how much to strip.
2547+        if self._offset:
2548+            self.log("got offset: %d" % self._offset)
2549+            # our start segment is the first segment containing the
2550+            # offset we were given.
2551+            start = mathutil.div_ceil(self._offset,
2552+                                      self._segment_size)
2553+            # this gets us the first segment after self._offset. Then
2554+            # our start segment is the one before it.
2555+            start -= 1
2556+
2557+            assert start < self._num_segments
2558+            self._start_segment = start
2559+            self.log("got start segment: %d" % self._start_segment)
2560         else:
2561hunk ./src/allmydata/mutable/retrieve.py 401
2562-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
2563-        self.remaining_sharemap.discard(shnum, peerid)
2564+            self._start_segment = 0
2565 
2566hunk ./src/allmydata/mutable/retrieve.py 403
2567-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
2568-        d.addErrback(self._query_failed, m, peerid)
2569-        # errors that aren't handled by _query_failed (and errors caused by
2570-        # _query_failed) get logged, but we still want to check for doneness.
2571-        def _oops(f):
2572-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
2573-                     shnum=shnum,
2574-                     peerid=idlib.shortnodeid_b2a(peerid),
2575-                     failure=f,
2576-                     level=log.WEIRD, umid="W0xnQA")
2577-        d.addErrback(_oops)
2578-        d.addBoth(self._check_for_done)
2579-        # any error during _check_for_done means the download fails. If the
2580-        # download is successful, _check_for_done will fire _done by itself.
2581-        d.addErrback(self._done)
2582-        d.addErrback(log.err)
2583-        return d # purely for testing convenience
2584 
2585hunk ./src/allmydata/mutable/retrieve.py 404
2586-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
2587-        # isolate the callRemote to a separate method, so tests can subclass
2588-        # Publish and override it
2589-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
2590-        return d
2591+        if self._read_length:
2592+            # our end segment is the last segment containing part of the
2593+            # segment that we were asked to read.
2594+            self.log("got read length %d" % self._read_length)
2595+            end_data = self._offset + self._read_length
2596+            end = mathutil.div_ceil(end_data,
2597+                                    self._segment_size)
2598+            end -= 1
2599+            assert end < self._num_segments
2600+            self._last_segment = end
2601+            self.log("got end segment: %d" % self._last_segment)
2602+        else:
2603+            self._last_segment = self._num_segments - 1
2604 
2605hunk ./src/allmydata/mutable/retrieve.py 418
2606-    def remove_peer(self, peerid):
2607-        for shnum in list(self.remaining_sharemap.keys()):
2608-            self.remaining_sharemap.discard(shnum, peerid)
2609+        self._current_segment = self._start_segment
2610 
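The offset/length-to-segment mapping above, traced with concrete (made-up) numbers:

```python
def div_ceil(n, d):
    # same behavior as allmydata.util.mathutil.div_ceil
    return (n + d - 1) // d

segsize = 131073            # an illustrative segment size
offset, read_length = 300000, 50000

start = div_ceil(offset, segsize) - 1     # segment containing the offset
end_data = offset + read_length
last = div_ceil(end_data, segsize) - 1    # segment containing the last byte

assert start == 2    # segment 2 spans bytes 262146..393218
assert last == 2     # the whole 50000-byte read fits in that segment
```

Note that for an offset falling exactly on a segment boundary, `div_ceil(offset, segsize) - 1` names the preceding segment, so the download would begin one segment early.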
2611hunk ./src/allmydata/mutable/retrieve.py 420
2612-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
2613-        now = time.time()
2614-        elapsed = now - started
2615-        if not got_from_cache:
2616-            self._status.add_fetch_timing(peerid, elapsed)
2617-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
2618-                 shares=len(datavs),
2619-                 peerid=idlib.shortnodeid_b2a(peerid),
2620-                 level=log.NOISY)
2621-        self._outstanding_queries.pop(marker, None)
2622-        if not self._running:
2623-            return
2624+    def _add_active_peers(self):
2625+        """
2626+        I populate self._active_readers with enough active readers to
2627+        retrieve the contents of this mutable file. I am called before
2628+        downloading starts, and (eventually) after each validation
2629+        error, connection error, or other problem in the download.
2630+        """
2631+        # TODO: It would be cool to investigate other heuristics for
2632+        # reader selection. For instance, the cost (in time the user
2633+        # spends waiting for their file) of selecting a really slow peer
2634+        # that happens to have a primary share is probably more than
2635+        # selecting a really fast peer that doesn't have a primary
2636+        # share. Maybe the servermap could be extended to provide this
2637+        # information; it could keep track of latency information while
2638+        # it gathers more important data, and then this routine could
2639+        # use that to select active readers.
2640+        #
2641+        # (these and other questions would be easier to answer with a
2642+        #  robust, configurable tahoe-lafs simulator, which modeled node
2643+        #  failures, differences in node speed, and other characteristics
2644+        #  that we expect storage servers to have.  You could have
2645+        #  presets for really stable grids (like allmydata.com),
2646+        #  friendnets, make it easy to configure your own settings, and
2647+        #  then simulate the effect of big changes on these use cases
2648+        #  instead of just reasoning about what the effect might be. Out
2649+        #  of scope for MDMF, though.)
2650 
2651hunk ./src/allmydata/mutable/retrieve.py 447
2652-        # note that we only ask for a single share per query, so we only
2653-        # expect a single share back. On the other hand, we use the extra
2654-        # shares if we get them.. seems better than an assert().
2655+        # We need at least self._required_shares readers to download a
2656+        # segment.
2657+        if self._verify:
2658+            needed = self._total_shares
2659+        else:
2660+            needed = self._required_shares - len(self._active_readers)
2661+        # XXX: Why don't format= log messages work here?
2662+        self.log("adding %d peers to the active peers list" % needed)
2663 
2664hunk ./src/allmydata/mutable/retrieve.py 456
2665-        for shnum,datav in datavs.items():
2666-            (prefix, hash_and_data) = datav[:2]
2667-            try:
2668-                self._got_results_one_share(shnum, peerid,
2669-                                            prefix, hash_and_data)
2670-            except CorruptShareError, e:
2671-                # log it and give the other shares a chance to be processed
2672-                f = failure.Failure()
2673-                self.log(format="bad share: %(f_value)s",
2674-                         f_value=str(f.value), failure=f,
2675-                         level=log.WEIRD, umid="7fzWZw")
2676-                self.notify_server_corruption(peerid, shnum, str(e))
2677-                self.remove_peer(peerid)
2678-                self.servermap.mark_bad_share(peerid, shnum, prefix)
2679-                self._bad_shares.add( (peerid, shnum) )
2680-                self._status.problems[peerid] = f
2681-                self._last_failure = f
2682-                pass
2683-            if self._need_privkey and len(datav) > 2:
2684-                lp = None
2685-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
2686-        # all done!
2687+        # We favor lower numbered shares, since FEC is faster with
2688+        # primary shares than with other shares, and lower-numbered
2689+        # shares are more likely to be primary than higher numbered
2690+        # shares.
2691+        active_shnums = set(self.remaining_sharemap.keys())
2692+        # We shouldn't consider adding shares that we already have; this
2693+        # will cause problems later.
2694+        active_shnums -= set([reader.shnum for reader in self._active_readers])
2695+        active_shnums = sorted(active_shnums)[:needed]
2696+        if len(active_shnums) < needed and not self._verify:
2697+            # We don't have enough readers to retrieve the file; fail.
2698+            return self._failed()
2699 
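The selection policy in this hunk (prefer low share numbers, skip shares that already have readers, stop once we have enough) can be sketched in isolation; `select_share_numbers` and its arguments are hypothetical stand-ins for the instance state used above:

```python
def select_share_numbers(remaining, active, needed):
    """Pick up to `needed` share numbers for new readers.

    remaining: share numbers for which we know at least one server
    active:    share numbers that already have an active reader
    needed:    how many additional readers we want

    Low-numbered shares are preferred because they are more likely to
    be primary shares, which make FEC decoding cheaper.
    """
    candidates = [shnum for shnum in sorted(set(remaining))
                  if shnum not in active]
    return candidates[:needed]
```

For instance, if shares {0, 2, 4, 9} are available, share 2 already has a reader, and two more readers are needed, this picks shares 0 and 4.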
2700hunk ./src/allmydata/mutable/retrieve.py 469
2701-    def notify_server_corruption(self, peerid, shnum, reason):
2702-        ss = self.servermap.connections[peerid]
2703-        ss.callRemoteOnly("advise_corrupt_share",
2704-                          "mutable", self._storage_index, shnum, reason)
2705+        for shnum in active_shnums:
2706+            self._active_readers.append(self.readers[shnum])
2707+            self.log("added reader for share %d" % shnum)
2708+        assert len(self._active_readers) >= self._required_shares
2709+        # Conceptually, this is part of the _add_active_peers step. It
2710+        # validates the prefixes of newly added readers to make sure
2711+        # that they match what we are expecting for self.verinfo. If
2712+        # validation is successful, _validate_active_prefixes will call
2713+        # _download_current_segment for us. If validation is
2714+        # unsuccessful, then _validate_active_prefixes will remove the peer and
2715+        # call _add_active_peers again, where we will attempt to rectify
2716+        # the problem by choosing another peer.
2717+        return self._validate_active_prefixes()
2718 
2719hunk ./src/allmydata/mutable/retrieve.py 483
2720-    def _got_results_one_share(self, shnum, peerid,
2721-                               got_prefix, got_hash_and_data):
2722-        self.log("_got_results: got shnum #%d from peerid %s"
2723-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
2724-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2725-         offsets_tuple) = self.verinfo
2726-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
2727-        if got_prefix != prefix:
2728-            msg = "someone wrote to the data since we read the servermap: prefix changed"
2729-            raise UncoordinatedWriteError(msg)
2730-        (share_hash_chain, block_hash_tree,
2731-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
2732 
2733hunk ./src/allmydata/mutable/retrieve.py 484
2734-        assert isinstance(share_data, str)
2735-        # build the block hash tree. SDMF has only one leaf.
2736-        leaves = [hashutil.block_hash(share_data)]
2737-        t = hashtree.HashTree(leaves)
2738-        if list(t) != block_hash_tree:
2739-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
2740-        share_hash_leaf = t[0]
2741-        t2 = hashtree.IncompleteHashTree(N)
2742-        # root_hash was checked by the signature
2743-        t2.set_hashes({0: root_hash})
2744-        try:
2745-            t2.set_hashes(hashes=share_hash_chain,
2746-                          leaves={shnum: share_hash_leaf})
2747-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
2748-                IndexError), e:
2749-            msg = "corrupt hashes: %s" % (e,)
2750-            raise CorruptShareError(peerid, shnum, msg)
2751-        self.log(" data valid! len=%d" % len(share_data))
2752-        # each query comes down to this: placing validated share data into
2753-        # self.shares
2754-        self.shares[shnum] = share_data
2755+    def _validate_active_prefixes(self):
2756+        """
2757+        I check to make sure that the prefixes on the peers that I am
2758+        currently reading from match the prefix that we want to see, as
2759+        specified in self.verinfo.
2760 
2761hunk ./src/allmydata/mutable/retrieve.py 490
2762-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
2763+        If I find that all of the active peers have acceptable prefixes,
2764+        I pass control to _download_current_segment, which will use
2765+        those peers to do cool things. If I find that some of the active
2766+        peers have unacceptable prefixes, I will remove them from active
2767+        peers (and from further consideration) and call
2768+        _add_active_peers to attempt to rectify the situation. I keep
2769+        track of which peers I have already validated so that I don't
2770+        need to do so again.
2771+        """
2772+        assert self._active_readers, "No more active readers"
2773 
2774hunk ./src/allmydata/mutable/retrieve.py 501
2775-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2776-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2777-        if alleged_writekey != self._node.get_writekey():
2778-            self.log("invalid privkey from %s shnum %d" %
2779-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
2780-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
2781-            return
2782+        ds = []
2783+        new_readers = list(set(self._active_readers) - self._validated_readers)
2784+        self.log('validating %d newly-added active readers' % len(new_readers))
2785 
hunk ./src/allmydata/mutable/retrieve.py 505
2787-        # it's good
2788-        self.log("got valid privkey from shnum %d on peerid %s" %
2789-                 (shnum, idlib.shortnodeid_b2a(peerid)),
2790-                 parent=lp)
2791-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2792-        self._node._populate_encprivkey(enc_privkey)
2793-        self._node._populate_privkey(privkey)
2794-        self._need_privkey = False
2795+        for reader in new_readers:
2796+            # We force a remote read here -- otherwise, we are relying
2797+            # on cached data that we already verified as valid, and we
2798+            # won't detect an uncoordinated write that has occurred
2799+            # since the last servermap update.
2800+            d = reader.get_prefix(force_remote=True)
2801+            d.addCallback(self._try_to_validate_prefix, reader)
2802+            ds.append(d)
2803+        dl = defer.DeferredList(ds, consumeErrors=True)
2804+        def _check_results(results):
2805+            # Each result in results will be of the form (success, msg).
2806+            # We don't care about msg, but success will tell us whether
2807+            # or not the checkstring validated. If it didn't, we need to
2808+            # remove the offending (peer,share) from our active readers,
2809+            # and ensure that active readers is again populated.
2810+            bad_readers = []
2811+            for i, result in enumerate(results):
2812+                if not result[0]:
2813+                    reader = new_readers[i]
2814+                    f = result[1]
2815+                    assert isinstance(f, failure.Failure)
2816 
hunk ./src/allmydata/mutable/retrieve.py 527
2818-    def _query_failed(self, f, marker, peerid):
2819-        self.log(format="query to [%(peerid)s] failed",
2820-                 peerid=idlib.shortnodeid_b2a(peerid),
2821-                 level=log.NOISY)
2822-        self._status.problems[peerid] = f
2823-        self._outstanding_queries.pop(marker, None)
2824-        if not self._running:
2825-            return
2826-        self._last_failure = f
2827-        self.remove_peer(peerid)
2828-        level = log.WEIRD
2829-        if f.check(DeadReferenceError):
2830-            level = log.UNUSUAL
2831-        self.log(format="error during query: %(f_value)s",
2832-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
2833+                    self.log("The reader %s failed to "
2834+                             "properly validate: %s" % \
2835+                             (reader, str(f.value)))
2836+                    bad_readers.append((reader, f))
2837+                else:
2838+                    reader = new_readers[i]
2839+                    self.log("the reader %s checks out, so we'll use it" % \
2840+                             reader)
2841+                    self._validated_readers.add(reader)
2842+                    # Each time we validate a reader, we check to see if
2843+                    # we need the private key. If we do, we politely ask
2844+                    # for it and then continue computing. If we find
2845+                    # that we haven't gotten it at the end of
2846+                    # segment decoding, then we'll take more drastic
2847+                    # measures.
2848+                    if self._need_privkey and not self._node.is_readonly():
2849+                        d = reader.get_encprivkey()
2850+                        d.addCallback(self._try_to_validate_privkey, reader)
2851+            if bad_readers:
2852+                # We do them all at once, or else we screw up list indexing.
2853+                for (reader, f) in bad_readers:
2854+                    self._mark_bad_share(reader, f)
2855+                if self._verify:
2856+                    if len(self._active_readers) >= self._required_shares:
2857+                        return self._download_current_segment()
2858+                    else:
2859+                        return self._failed()
2860+                else:
2861+                    return self._add_active_peers()
2862+            else:
2863+                return self._download_current_segment()
2864+            # (_download_current_segment itself asserts that enough
2865+            # active readers remain to fetch shares.)
2866+        dl.addCallback(_check_results)
2867+        return dl
2868 
2869hunk ./src/allmydata/mutable/retrieve.py 563
2870-    def _check_for_done(self, res):
2871-        # exit paths:
2872-        #  return : keep waiting, no new queries
2873-        #  return self._send_more_queries(outstanding) : send some more queries
2874-        #  fire self._done(plaintext) : download successful
2875-        #  raise exception : download fails
2876 
2877hunk ./src/allmydata/mutable/retrieve.py 564
2878-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
2879-                 running=self._running, decoding=self._decoding,
2880-                 level=log.NOISY)
2881-        if not self._running:
2882-            return
2883-        if self._decoding:
2884-            return
2885-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2886+    def _try_to_validate_prefix(self, prefix, reader):
2887+        """
2888+        I check that the prefix returned by a candidate server for
2889+        retrieval matches the prefix that the servermap knows about
2890+        (and, hence, the prefix that was validated earlier). If it does,
2891+        I return without incident, which means that I approve of the use
2892+        of the candidate server for segment retrieval. If it doesn't, I
2893+        raise UncoordinatedWriteError, and another server must be chosen.
2894+        """
2895+        (seqnum,
2896+         root_hash,
2897+         IV,
2898+         segsize,
2899+         datalength,
2900+         k,
2901+         N,
2902+         known_prefix,
2903          offsets_tuple) = self.verinfo
2904hunk ./src/allmydata/mutable/retrieve.py 582
2905+        if known_prefix != prefix:
2906+            self.log("prefix from share %d doesn't match" % reader.shnum)
2907+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
2908+                                          "indicate an uncoordinated write")
2909+        # Otherwise, we're okay -- no issues.
2910 
2911hunk ./src/allmydata/mutable/retrieve.py 588
2912-        if len(self.shares) < k:
2913-            # we don't have enough shares yet
2914-            return self._maybe_send_more_queries(k)
2915-        if self._need_privkey:
2916-            # we got k shares, but none of them had a valid privkey. TODO:
2917-            # look further. Adding code to do this is a bit complicated, and
2918-            # I want to avoid that complication, and this should be pretty
2919-            # rare (k shares with bitflips in the enc_privkey but not in the
2920-            # data blocks). If we actually do get here, the subsequent repair
2921-            # will fail for lack of a privkey.
2922-            self.log("got k shares but still need_privkey, bummer",
2923-                     level=log.WEIRD, umid="MdRHPA")
2924 
2925hunk ./src/allmydata/mutable/retrieve.py 589
2926-        # we have enough to finish. All the shares have had their hashes
2927-        # checked, so if something fails at this point, we don't know how
2928-        # to fix it, so the download will fail.
2929+    def _remove_reader(self, reader):
2930+        """
2931+        At various points, we will wish to remove a peer from
2932+        consideration and/or use. These include, but are not necessarily
2933+        limited to:
2934 
2935hunk ./src/allmydata/mutable/retrieve.py 595
2936-        self._decoding = True # avoid reentrancy
2937-        self._status.set_status("decoding")
2938-        now = time.time()
2939-        elapsed = now - self._started
2940-        self._status.timings["fetch"] = elapsed
2941+            - A connection error.
2942+            - A mismatched prefix (that is, a prefix that does not match
2943+              our conception of the version information string).
2944+            - A failing block hash, salt hash, or share hash, which can
2945+              indicate disk failure/bit flips, or network trouble.
2946 
2947hunk ./src/allmydata/mutable/retrieve.py 601
2948-        d = defer.maybeDeferred(self._decode)
2949-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
2950-        d.addBoth(self._done)
2951-        return d # purely for test convenience
2952+        This method will do that. I will make sure that the
2953+        (shnum,reader) combination represented by my reader argument is
2954+        not used for anything else during this download. I will not
2955+        advise the reader of any corruption, something that my callers
2956+        may wish to do on their own.
2957+        """
2958+        # TODO: When you're done writing this, see if this is ever
2959+        # actually used for something that _mark_bad_share isn't. I have
2960+        # a feeling that they will be used for very similar things, and
2961+        # that having them both here is just going to be an epic amount
2962+        # of code duplication.
2963+        #
2964+        # (well, okay, not epic, but meaningful)
2965+        self.log("removing reader %s" % reader)
2966+        # Remove the reader from _active_readers
2967+        self._active_readers.remove(reader)
2968+        # TODO: self.readers.remove(reader)?
2969+        for shnum in list(self.remaining_sharemap.keys()):
2970+            self.remaining_sharemap.discard(shnum, reader.peerid)
2971 
2972hunk ./src/allmydata/mutable/retrieve.py 621
2973-    def _maybe_send_more_queries(self, k):
2974-        # we don't have enough shares yet. Should we send out more queries?
2975-        # There are some number of queries outstanding, each for a single
2976-        # share. If we can generate 'needed_shares' additional queries, we do
2977-        # so. If we can't, then we know this file is a goner, and we raise
2978-        # NotEnoughSharesError.
2979-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
2980-                         "outstanding=%(outstanding)d"),
2981-                 have=len(self.shares), k=k,
2982-                 outstanding=len(self._outstanding_queries),
2983-                 level=log.NOISY)
2984 
2985hunk ./src/allmydata/mutable/retrieve.py 622
2986-        remaining_shares = k - len(self.shares)
2987-        needed = remaining_shares - len(self._outstanding_queries)
2988-        if not needed:
2989-            # we have enough queries in flight already
2990+    def _mark_bad_share(self, reader, f):
2991+        """
2992+        I mark the (peerid, shnum) encapsulated by my reader argument as
2993+        a bad share, which means that it will not be used anywhere else.
2994 
2995hunk ./src/allmydata/mutable/retrieve.py 627
2996-            # TODO: but if they've been in flight for a long time, and we
2997-            # have reason to believe that new queries might respond faster
2998-            # (i.e. we've seen other queries come back faster, then consider
2999-            # sending out new queries. This could help with peers which have
3000-            # silently gone away since the servermap was updated, for which
3001-            # we're still waiting for the 15-minute TCP disconnect to happen.
3002-            self.log("enough queries are in flight, no more are needed",
3003-                     level=log.NOISY)
3004-            return
3005+        There are several reasons to want to mark something as a bad
3006+        share. These include:
3007+
3008+            - A connection error to the peer.
3009+            - A mismatched prefix (that is, a prefix that does not match
3010+              our local conception of the version information string).
3011+            - A failing block hash, salt hash, share hash, or other
3012+              integrity check.
3013 
3014hunk ./src/allmydata/mutable/retrieve.py 636
3015-        outstanding_shnums = set([shnum
3016-                                  for (peerid, shnum, started)
3017-                                  in self._outstanding_queries.values()])
3018-        # prefer low-numbered shares, they are more likely to be primary
3019-        available_shnums = sorted(self.remaining_sharemap.keys())
3020-        for shnum in available_shnums:
3021-            if shnum in outstanding_shnums:
3022-                # skip ones that are already in transit
3023-                continue
3024-            if shnum not in self.remaining_sharemap:
3025-                # no servers for that shnum. note that DictOfSets removes
3026-                # empty sets from the dict for us.
3027-                continue
3028-            peerid = list(self.remaining_sharemap[shnum])[0]
3029-            # get_data will remove that peerid from the sharemap, and add the
3030-            # query to self._outstanding_queries
3031-            self._status.set_status("Retrieving More Shares")
3032-            self.get_data(shnum, peerid)
3033-            needed -= 1
3034-            if not needed:
3035+        This method will ensure that readers that we wish to mark bad
3036+        (for these reasons or other reasons) are not used for the rest
3037+        of the download. Additionally, it will attempt to tell the
3038+        remote peer (with no guarantee of success) that its share is
3039+        corrupt.
3040+        """
3041+        self.log("marking share %d on server %s as bad" % \
3042+                 (reader.shnum, reader))
3043+        prefix = self.verinfo[-2]
3044+        self.servermap.mark_bad_share(reader.peerid,
3045+                                      reader.shnum,
3046+                                      prefix)
3047+        self._remove_reader(reader)
3048+        self._bad_shares.add((reader.peerid, reader.shnum, f))
3049+        self._status.problems[reader.peerid] = f
3050+        self._last_failure = f
3051+        self.notify_server_corruption(reader.peerid, reader.shnum,
3052+                                      str(f.value))
3053+
3054+
3055+    def _download_current_segment(self):
3056+        """
3057+        I download, validate, decode, decrypt, and assemble the segment
3058+        that this Retrieve is currently responsible for downloading.
3059+        """
3060+        assert len(self._active_readers) >= self._required_shares
3061+        if self._current_segment <= self._last_segment:
3062+            d = self._process_segment(self._current_segment)
3063+        else:
3064+            d = defer.succeed(None)
3065+        d.addCallback(self._check_for_done)
3066+        return d
3067+
3068+
3069+    def _process_segment(self, segnum):
3070+        """
3071+        I download, validate, decode, and decrypt one segment of the
3072+        file that this Retrieve is retrieving. This means coordinating
3073+        the process of getting k blocks of that file, validating them,
3074+        assembling them into one segment with the decoder, and then
3075+        decrypting them.
3076+        """
3077+        self.log("processing segment %d" % segnum)
3078+
3079+        # TODO: The old code uses a marker. Should this code do that
3080+        # too? What did the Marker do?
3081+        assert len(self._active_readers) >= self._required_shares
3082+
3083+        # We need to ask each of our active readers for its block and
3084+        # salt. We will then validate those. If validation is
3085+        # successful, we will assemble the results into plaintext.
3086+        ds = []
3087+        for reader in self._active_readers:
3088+            d = reader.get_block_and_salt(segnum, queue=True)
3089+            d2 = self._get_needed_hashes(reader, segnum)
3090+            dl = defer.DeferredList([d, d2], consumeErrors=True)
3091+            dl.addCallback(self._validate_block, segnum, reader)
3092+            dl.addErrback(self._validation_or_decoding_failed, [reader])
3093+            ds.append(dl)
3094+            reader.flush()
3095+        dl = defer.DeferredList(ds)
3096+        if self._verify:
3097+            dl.addCallback(lambda ignored: "")
3098+            dl.addCallback(self._set_segment)
3099+        else:
3100+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3101+        return dl
3102+
3103+
3104+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
3105+        """
3106+        I take the results of fetching and validating the blocks from a
3107+        callback chain in another method. If the results are such that
3108+        they tell me that validation and fetching succeeded without
3109+        incident, I will proceed with decoding and decryption.
3110+        Otherwise, I will do nothing.
3111+        """
3112+        self.log("trying to decode and decrypt segment %d" % segnum)
3113+        failures = False
3114+        for block_and_salt in blocks_and_salts:
3115+            if not block_and_salt[0] or block_and_salt[1] is None:
3116+                self.log("some validation operations failed; not proceeding")
3117+                failures = True
3118                 break
3119hunk ./src/allmydata/mutable/retrieve.py 720
3120+        if not failures:
3121+            self.log("everything looks ok, building segment %d" % segnum)
3122+            d = self._decode_blocks(blocks_and_salts, segnum)
3123+            d.addCallback(self._decrypt_segment)
3124+            d.addErrback(self._validation_or_decoding_failed,
3125+                         self._active_readers)
3126+            # check to see whether we've been paused before writing
3127+            # anything.
3128+            d.addCallback(self._check_for_paused)
3129+            d.addCallback(self._set_segment)
3130+            return d
3131+        else:
3132+            return defer.succeed(None)
3133+
3134+
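The guard in `_maybe_decode_and_decrypt_segment` walks the DeferredList results, where each entry is a (success, value) pair. A hypothetical stand-alone version of that check:

```python
def all_blocks_fetched(blocks_and_salts):
    """Return True only if every (success, value) pair from the
    DeferredList reports success with a non-None value; any failure
    means we must not proceed to decoding and decryption."""
    for success, value in blocks_and_salts:
        if not success or value is None:
            return False
    return True
```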
3135+    def _set_segment(self, segment):
3136+        """
3137+        Given a plaintext segment, I register that segment with the
3138+        target that is handling the file download.
3139+        """
3140+        self.log("got plaintext for segment %d" % self._current_segment)
3141+        if self._current_segment == self._start_segment:
3142+            # We're on the first segment. It's possible that we want
3143+            # only some part of the end of this segment, and that we
3144+            # just downloaded the whole thing to get that part. If so,
3145+            # we need to account for that and give the reader just the
3146+            # data that they want.
3147+            n = self._offset % self._segment_size
3148+            self.log("stripping %d bytes off of the first segment" % n)
3149+            self.log("original segment length: %d" % len(segment))
3150+            segment = segment[n:]
3151+            self.log("new segment length: %d" % len(segment))
3152+
3153+        if self._current_segment == self._last_segment and self._read_length is not None:
3154+            # We're on the last segment. It's possible that we only want
3155+            # part of the beginning of this segment, and that we
3156+            # downloaded the whole thing anyway. Make sure to give the
3157+            # caller only the portion of the segment that they want to
3158+            # receive.
3159+            extra = self._read_length
3160+            if self._start_segment != self._last_segment:
3161+                extra -= self._segment_size - \
3162+                            (self._offset % self._segment_size)
3163+            extra %= self._segment_size
3164+            self.log("original segment length: %d" % len(segment))
3165+            segment = segment[:extra]
3166+            self.log("new segment length: %d" % len(segment))
3167+            self.log("only taking %d bytes of the last segment" % extra)
3168+
3169+        if not self._verify:
3170+            self._consumer.write(segment)
3171+        else:
3172+            # we don't care about the plaintext if we are doing a verify.
3173+            segment = None
3174+        self._current_segment += 1
3175 
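The first/last-segment trimming arithmetic in `_set_segment` can be modeled in isolation like this (hypothetical helpers; the real code operates on instance state such as `self._offset` and `self._read_length`):

```python
def trim_first_segment(segment, offset, segsize):
    # The requested range may start partway into the first segment;
    # drop the leading bytes we downloaded but don't need.
    return segment[offset % segsize:]

def trim_last_segment(segment, offset, read_length, single_segment, segsize):
    # Work out how many bytes of the final segment the caller wants.
    extra = read_length
    if not single_segment:
        # Subtract the bytes already delivered by the first segment;
        # the modulo below accounts for any whole middle segments.
        extra -= segsize - (offset % segsize)
    extra %= segsize
    return segment[:extra]
```

For example, reading 60 bytes at offset 100 with 128-byte segments spans segments 0 and 1: the first segment is trimmed to bytes 100..127, and the last segment to its first 32 bytes.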
3176hunk ./src/allmydata/mutable/retrieve.py 776
3177-        # at this point, we have as many outstanding queries as we can. If
3178-        # needed!=0 then we might not have enough to recover the file.
3179-        if needed:
3180-            format = ("ran out of peers: "
3181-                      "have %(have)d shares (k=%(k)d), "
3182-                      "%(outstanding)d queries in flight, "
3183-                      "need %(need)d more, "
3184-                      "found %(bad)d bad shares")
3185-            args = {"have": len(self.shares),
3186-                    "k": k,
3187-                    "outstanding": len(self._outstanding_queries),
3188-                    "need": needed,
3189-                    "bad": len(self._bad_shares),
3190-                    }
3191-            self.log(format=format,
3192-                     level=log.WEIRD, umid="ezTfjw", **args)
3193-            err = NotEnoughSharesError("%s, last failure: %s" %
3194-                                      (format % args, self._last_failure))
3195-            if self._bad_shares:
3196-                self.log("We found some bad shares this pass. You should "
3197-                         "update the servermap and try again to check "
3198-                         "more peers",
3199-                         level=log.WEIRD, umid="EFkOlA")
3200-                err.servermap = self.servermap
3201-            raise err
3202 
3203hunk ./src/allmydata/mutable/retrieve.py 777
3204+    def _validation_or_decoding_failed(self, f, readers):
3205+        """
3206+        I am called when a block or a salt fails to correctly validate, or when
3207+        the decryption or decoding operation fails for some reason.  I react to
3208+        this failure by notifying the remote server of corruption, and then
3209+        removing the remote peer from further activity.
3210+        """
3211+        assert isinstance(readers, list)
3212+        bad_shnums = [reader.shnum for reader in readers]
3213+
3214+        self.log("validation or decoding failed on share(s) %s, peer(s) %s "
3215+                 ", segment %d: %s" % \
3216+                 (bad_shnums, readers, self._current_segment, str(f)))
3217+        for reader in readers:
3218+            self._mark_bad_share(reader, f)
3219         return
3220 
3221hunk ./src/allmydata/mutable/retrieve.py 794
3222-    def _decode(self):
3223-        started = time.time()
3224-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3225-         offsets_tuple) = self.verinfo
3226 
3227hunk ./src/allmydata/mutable/retrieve.py 795
3228-        # shares_dict is a dict mapping shnum to share data, but the codec
3229-        # wants two lists.
3230-        shareids = []; shares = []
3231-        for shareid, share in self.shares.items():
3232+    def _validate_block(self, results, segnum, reader):
3233+        """
3234+        I validate a block from one share on a remote server.
3235+        """
3236+        # Grab the part of the block hash tree that is necessary to
3237+        # validate this block, then generate the block hash root.
3238+        self.log("validating share %d for segment %d" % (reader.shnum,
3239+                                                             segnum))
3240+        # Did we fail to fetch either of the things that we were
3241+        # supposed to? Fail if so.
3242+        if not results[0][0] or not results[1][0]:
3243+            # handled by the errback handler.
3244+
3245+            # These all get batched into one query, so the failures
3246+            # should match; just use the first failure that we find.
3247+            if not results[0][0]:
3248+                f = results[0][1]
3249+            else:
3250+                f = results[1][1]
3251+            assert isinstance(f, failure.Failure)
3252+            raise CorruptShareError(reader.peerid, reader.shnum,
3253+                                    "Connection error: %s" % str(f))
3254+
3255+        block_and_salt, block_and_sharehashes = results
3256+        block, salt = block_and_salt[1]
3257+        blockhashes, sharehashes = block_and_sharehashes[1]
3258+
3259+        blockhashes = dict(enumerate(blockhashes[1]))
3260+        self.log("the reader gave me the following blockhashes: %s" % \
3261+                 blockhashes.keys())
3262+        self.log("the reader gave me the following sharehashes: %s" % \
3263+                 sharehashes[1].keys())
3264+        bht = self._block_hash_trees[reader.shnum]
3265+
3266+        if bht.needed_hashes(segnum, include_leaf=True):
3267+            try:
3268+                bht.set_hashes(blockhashes)
3269+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
3270+                    IndexError), e:
3271+                raise CorruptShareError(reader.peerid,
3272+                                        reader.shnum,
3273+                                        "block hash tree failure: %s" % e)
3274+
3275+        if self._version == MDMF_VERSION:
3276+            blockhash = hashutil.block_hash(salt + block)
3277+        else:
3278+            blockhash = hashutil.block_hash(block)
3279+        # If this works without an error, then validation is
3280+        # successful.
3281+        try:
3282+            bht.set_hashes(leaves={segnum: blockhash})
3283+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
3284+                IndexError), e:
3285+            raise CorruptShareError(reader.peerid,
3286+                                    reader.shnum,
3287+                                    "block hash tree failure: %s" % e)
3288+
3289+        # Reaching this point means that we know that this segment
3290+        # is correct. Now we need to check to see whether the share
3291+        # hash chain is also correct.
3292+        # SDMF wrote share hash chains that didn't contain the
3293+        # leaves, which would be produced from the block hash tree.
3294+        # So we need to validate the block hash tree first. If
3295+        # successful, then bht[0] will contain the root for the
3296+        # shnum, which will be a leaf in the share hash tree, which
3297+        # will allow us to validate the rest of the tree.
3298+        if self.share_hash_tree.needed_hashes(reader.shnum,
3299+                                              include_leaf=True) or \
3300+                                              self._verify:
3301+            try:
3302+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
3303+                                            leaves={reader.shnum: bht[0]})
3304+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
3305+                    IndexError), e:
3306+                raise CorruptShareError(reader.peerid,
3307+                                        reader.shnum,
3308+                                        "corrupt hashes: %s" % e)
3309+
3310+        self.log('share %d is valid for segment %d' % (reader.shnum,
3311+                                                       segnum))
3312+        return {reader.shnum: (block, salt)}
3313+
3314+
3315+    def _get_needed_hashes(self, reader, segnum):
3316+        """
3317+        I get the hashes needed to validate segnum from the reader, then return
3318+        to my caller when this is done.
3319+        """
3320+        bht = self._block_hash_trees[reader.shnum]
3321+        needed = bht.needed_hashes(segnum, include_leaf=True)
3322+        # The root of the block hash tree is also a leaf in the share
3323+        # hash tree. So we don't need to fetch it from the remote
3324+        # server. In the case of files with one segment, this means that
3325+        # we won't fetch any block hash tree from the remote server,
3326+        # since the hash of each share of the file is the entire block
3327+        # hash tree, and is a leaf in the share hash tree. This is fine,
3328+        # since any share corruption will be detected in the share hash
3329+        # tree.
3330+        #needed.discard(0)
3331+        self.log("getting blockhashes for segment %d, share %d: %s" % \
3332+                 (segnum, reader.shnum, str(needed)))
3333+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
3334+        if self.share_hash_tree.needed_hashes(reader.shnum):
3335+            need = self.share_hash_tree.needed_hashes(reader.shnum)
3336+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
3337+                                                                 str(need)))
3338+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
3339+        else:
3340+            d2 = defer.succeed({}) # the logic in the next method
3341+                                   # expects a dict
3342+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
3343+        return dl
3344+
3345+
3346+    def _decode_blocks(self, blocks_and_salts, segnum):
3347+        """
3348+        I take a list of k blocks and salts, and decode that into a
3349+        single encrypted segment.
3350+        """
3351+        d = {}
3352+        # We want to merge our dictionaries to the form
3353+        # {shnum: blocks_and_salts}
3354+        #
3355+        # The dictionaries come from the block validator in that form,
3356+        # need to merge them.
3357+        for block_and_salt in blocks_and_salts:
3358+            d.update(block_and_salt[1])
3359+
3360+        # All of these blocks should have the same salt; in SDMF, it is
3361+        # the file-wide IV, while in MDMF it is the per-segment salt. In
3362+        # either case, we just need to get one of them and use it.
3363+        #
3364+        # d.items()[0] is like (shnum, (block, salt))
3365+        # d.items()[0][1] is like (block, salt)
3366+        # d.items()[0][1][1] is the salt.
3367+        salt = d.items()[0][1][1]
3368+        # Next, extract just the blocks from the dict. We'll use the
3369+        # salt in the next step.
3370+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
3371+        d2 = dict(share_and_shareids)
3372+        shareids = []
3373+        shares = []
3374+        for shareid, share in d2.items():
3375             shareids.append(shareid)
3376             shares.append(share)
3377 
3378hunk ./src/allmydata/mutable/retrieve.py 941
3379-        assert len(shareids) >= k, len(shareids)
3380+        assert len(shareids) >= self._required_shares, len(shareids)
3381         # zfec really doesn't want extra shares
3382hunk ./src/allmydata/mutable/retrieve.py 943
3383-        shareids = shareids[:k]
3384-        shares = shares[:k]
3385-
3386-        fec = codec.CRSDecoder()
3387-        fec.set_params(segsize, k, N)
3388-
3389-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
3390-        self.log("about to decode, shareids=%s" % (shareids,))
3391-        d = defer.maybeDeferred(fec.decode, shares, shareids)
3392-        def _done(buffers):
3393-            self._status.timings["decode"] = time.time() - started
3394-            self.log(" decode done, %d buffers" % len(buffers))
3395+        shareids = shareids[:self._required_shares]
3396+        shares = shares[:self._required_shares]
3397+        self.log("decoding segment %d" % segnum)
3398+        if segnum == self._num_segments - 1:
3399+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
3400+        else:
3401+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
3402+        def _process(buffers):
3403             segment = "".join(buffers)
3404hunk ./src/allmydata/mutable/retrieve.py 952
3405+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
3406+                     segnum=segnum,
3407+                     numsegs=self._num_segments,
3408+                     level=log.NOISY)
3409             self.log(" joined length %d, datalength %d" %
3410hunk ./src/allmydata/mutable/retrieve.py 957
3411-                     (len(segment), datalength))
3412-            segment = segment[:datalength]
3413+                     (len(segment), self._data_length))
3414+            if segnum == self._num_segments - 1:
3415+                size_to_use = self._tail_data_size
3416+            else:
3417+                size_to_use = self._segment_size
3418+            segment = segment[:size_to_use]
3419             self.log(" segment len=%d" % len(segment))
3420hunk ./src/allmydata/mutable/retrieve.py 964
3421-            return segment
3422-        def _err(f):
3423-            self.log(" decode failed: %s" % f)
3424-            return f
3425-        d.addCallback(_done)
3426-        d.addErrback(_err)
3427+            return segment, salt
3428+        d.addCallback(_process)
3429         return d
3430 
3431hunk ./src/allmydata/mutable/retrieve.py 968
3432-    def _decrypt(self, crypttext, IV, readkey):
3433+
3434+    def _decrypt_segment(self, segment_and_salt):
3435+        """
3436+        I take a single segment and its salt, and decrypt it. I return
3437+        the plaintext of the segment that is in my argument.
3438+        """
3439+        segment, salt = segment_and_salt
3440         self._status.set_status("decrypting")
3441hunk ./src/allmydata/mutable/retrieve.py 976
3442+        self.log("decrypting segment %d" % self._current_segment)
3443         started = time.time()
3444hunk ./src/allmydata/mutable/retrieve.py 978
3445-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
3446+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
3447         decryptor = AES(key)
3448hunk ./src/allmydata/mutable/retrieve.py 980
3449-        plaintext = decryptor.process(crypttext)
3450+        plaintext = decryptor.process(segment)
3451         self._status.timings["decrypt"] = time.time() - started
3452         return plaintext
3453 
3454hunk ./src/allmydata/mutable/retrieve.py 984
3455-    def _done(self, res):
3456-        if not self._running:
3457+
3458+    def notify_server_corruption(self, peerid, shnum, reason):
3459+        ss = self.servermap.connections[peerid]
3460+        ss.callRemoteOnly("advise_corrupt_share",
3461+                          "mutable", self._storage_index, shnum, reason)
3462+
3463+
3464+    def _try_to_validate_privkey(self, enc_privkey, reader):
3465+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3466+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3467+        if alleged_writekey != self._node.get_writekey():
3468+            self.log("invalid privkey from %s shnum %d" %
3469+                     (reader, reader.shnum),
3470+                     level=log.WEIRD, umid="YIw4tA")
3471+            if self._verify:
3472+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
3473+                                              self.verinfo[-2])
3474+                e = CorruptShareError(reader.peerid,
3475+                                      reader.shnum,
3476+                                      "invalid privkey")
3477+                f = failure.Failure(e)
3478+                self._bad_shares.add((reader.peerid, reader.shnum, f))
3479             return
3480hunk ./src/allmydata/mutable/retrieve.py 1007
3481-        self._running = False
3482-        self._status.set_active(False)
3483-        self._status.timings["total"] = time.time() - self._started
3484-        # res is either the new contents, or a Failure
3485-        if isinstance(res, failure.Failure):
3486-            self.log("Retrieve done, with failure", failure=res,
3487-                     level=log.UNUSUAL)
3488-            self._status.set_status("Failed")
3489+
3490+        # it's good
3491+        self.log("got valid privkey from shnum %d on reader %s" %
3492+                 (reader.shnum, reader))
3493+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
3494+        self._node._populate_encprivkey(enc_privkey)
3495+        self._node._populate_privkey(privkey)
3496+        self._need_privkey = False
3497+
3498+
3499+    def _check_for_done(self, res):
3500+        """
3501+        I check to see if this Retrieve object has successfully finished
3502+        its work.
3503+
3504+        I can exit in the following ways:
3505+            - If there are no more segments to download, then I exit by
3506+              causing self._done_deferred to fire with the plaintext
3507+              content requested by the caller.
3508+            - If there are still segments to be downloaded, and there
3509+              are enough active readers (readers which have not broken
3510+              and have not given us corrupt data) to continue
3511+              downloading, I send control back to
3512+              _download_current_segment.
3513+            - If there are still segments to be downloaded but there are
3514+              not enough active peers to download them, I ask
3515+              _add_active_peers to add more peers. If it is successful,
3516+              it will call _download_current_segment. If there are not
3517+              enough peers to retrieve the file, then that will cause
3518+              _done_deferred to errback.
3519+        """
3520+        self.log("checking for doneness")
3521+        if self._current_segment > self._last_segment:
3522+            # No more segments to download, we're done.
3523+            self.log("got plaintext, done")
3524+            return self._done()
3525+
3526+        if len(self._active_readers) >= self._required_shares:
3527+            # More segments to download, but we have enough good peers
3528+            # in self._active_readers that we can do that without issue,
3529+            # so go nab the next segment.
3530+            self.log("not done yet: on segment %d of %d" % \
3531+                     (self._current_segment + 1, self._num_segments))
3532+            return self._download_current_segment()
3533+
3534+        self.log("not done yet: on segment %d of %d, need to add peers" % \
3535+                 (self._current_segment + 1, self._num_segments))
3536+        return self._add_active_peers()
3537+
3538+
3539+    def _done(self):
3540+        """
3541+        I am called by _check_for_done when the download process has
3542+        finished successfully. After making some useful logging
3543+        statements, I return the decrypted contents to the owner of this
3544+        Retrieve object through self._done_deferred.
3545+        """
3546+        if self._verify:
3547+            ret = list(self._bad_shares)
3548+            self.log("done verifying, found %d bad shares" % len(ret))
3549         else:
3550hunk ./src/allmydata/mutable/retrieve.py 1068
3551-            self.log("Retrieve done, success!")
3552-            self._status.set_status("Finished")
3553-            self._status.set_progress(1.0)
3554-            # remember the encoding parameters, use them again next time
3555-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3556-             offsets_tuple) = self.verinfo
3557-            self._node._populate_required_shares(k)
3558-            self._node._populate_total_shares(N)
3559-        eventually(self._done_deferred.callback, res)
3560+            # TODO: upload status here?
3561+            ret = self._consumer
3562+            self._consumer.unregisterProducer()
3563+        eventually(self._done_deferred.callback, ret)
3564+
3565 
3566hunk ./src/allmydata/mutable/retrieve.py 1074
3567+    def _failed(self):
3568+        """
3569+        I am called by _add_active_peers when there are not enough
3570+        active peers left to complete the download. After making some
3571+        useful logging statements, I return an exception to that effect
3572+        to the caller of this Retrieve object through
3573+        self._done_deferred.
3574+        """
3575+        if self._verify:
3576+            ret = list(self._bad_shares)
3577+        else:
3578+            format = ("ran out of peers: "
3579+                      "have %(have)d of %(total)d segments "
3580+                      "found %(bad)d bad shares "
3581+                      "encoding %(k)d-of-%(n)d")
3582+            args = {"have": self._current_segment,
3583+                    "total": self._num_segments,
3584+                    "need": self._last_segment,
3585+                    "k": self._required_shares,
3586+                    "n": self._total_shares,
3587+                    "bad": len(self._bad_shares)}
3588+            e = NotEnoughSharesError("%s, last failure: %s" % \
3589+                                     (format % args, str(self._last_failure)))
3590+            f = failure.Failure(e)
3591+            ret = f
3592+        eventually(self._done_deferred.callback, ret)
3593}
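For reference, the per-version leaf-hash rule used by the validation code above (MDMF covers the per-segment salt with the leaf hash; SDMF hashes the block alone, since its file-wide IV is checked elsewhere) can be sketched roughly as follows. This is illustrative only: `block_hash` here is a plain SHA-256 stand-in for `allmydata.util.hashutil.block_hash`, which actually uses a tagged hash.

```python
import hashlib

MDMF_VERSION, SDMF_VERSION = 1, 0

def block_hash(data):
    # stand-in for hashutil.block_hash (the real one is a tagged hash)
    return hashlib.sha256(data).hexdigest()

def leaf_hash(version, block, salt):
    # MDMF binds the per-segment salt into the block-hash-tree leaf;
    # SDMF's file-wide IV is validated separately, so only the block
    # itself is hashed.
    if version == MDMF_VERSION:
        return block_hash(salt + block)
    return block_hash(block)
```

The resulting leaf then feeds `bht.set_hashes(leaves={segnum: blockhash})` exactly as in the hunk above, for either format.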
3594[mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
3595Kevan Carstensen <kevan@isnotajoke.com>**20100809232514
3596 Ignore-this: 1bcef2f262c868f61e57cc19a3cac89a
3597 
3598 The checker and repairer required minimal changes to work with the MDMF
3599 modifications made elsewhere. The checker duplicated a lot of the code
3600 that was already in the downloader, so I modified the downloader
3601 slightly to expose this functionality to the checker and removed the
3602 duplicated code. The repairer only required a minor change to deal with
3603 data representation.
3604] {
3605hunk ./src/allmydata/mutable/checker.py 12
3606 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
3607 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
3608 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
3609+from allmydata.mutable.retrieve import Retrieve # for verifying
3610 
3611 class MutableChecker:
3612 
3613hunk ./src/allmydata/mutable/checker.py 29
3614 
3615     def check(self, verify=False, add_lease=False):
3616         servermap = ServerMap()
3617+        # Updating the servermap in MODE_CHECK will stand a good chance
3618+        # of finding all of the shares, and getting a good idea of
3619+        # recoverability, etc, without verifying.
3620         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
3621                              servermap, MODE_CHECK, add_lease=add_lease)
3622         if self._history:
3623hunk ./src/allmydata/mutable/checker.py 55
3624         if num_recoverable:
3625             self.best_version = servermap.best_recoverable_version()
3626 
3627+        # The file is unhealthy and needs to be repaired if:
3628+        # - There are unrecoverable versions.
3629         if servermap.unrecoverable_versions():
3630             self.need_repair = True
3631hunk ./src/allmydata/mutable/checker.py 59
3632+        # - There isn't a recoverable version.
3633         if num_recoverable != 1:
3634             self.need_repair = True
3635hunk ./src/allmydata/mutable/checker.py 62
3636+        # - The best recoverable version is missing some shares.
3637         if self.best_version:
3638             available_shares = servermap.shares_available()
3639             (num_distinct_shares, k, N) = available_shares[self.best_version]
3640hunk ./src/allmydata/mutable/checker.py 73
3641 
3642     def _verify_all_shares(self, servermap):
3643         # read every byte of each share
3644+        #
3645+        # This logic is going to be very nearly the same as the
3646+        # downloader. I bet we could pass the downloader a flag that
3647+        # makes it do this, and piggyback onto that instead of
3648+        # duplicating a bunch of code.
3649+        #
3650+        # Like:
3651+        #  r = Retrieve(blah, blah, blah, verify=True)
3652+        #  d = r.download()
3653+        #  (wait, wait, wait, d.callback)
3654+        # 
3655+        #  Then, when it has finished, we can check the servermap (which
3656+        #  we provided to Retrieve) to figure out which shares are bad,
3657+        #  since the Retrieve process will have updated the servermap as
3658+        #  it went along.
3659+        #
3660+        #  By passing the verify=True flag to the constructor, we are
3661+        #  telling the downloader a few things.
3662+        #
3663+        #  1. It needs to download all N shares, not just K shares.
3664+        #  2. It doesn't need to decrypt or decode the shares, only
3665+        #     verify them.
3666         if not self.best_version:
3667             return
3668hunk ./src/allmydata/mutable/checker.py 97
3669-        versionmap = servermap.make_versionmap()
3670-        shares = versionmap[self.best_version]
3671-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3672-         offsets_tuple) = self.best_version
3673-        offsets = dict(offsets_tuple)
3674-        readv = [ (0, offsets["EOF"]) ]
3675-        dl = []
3676-        for (shnum, peerid, timestamp) in shares:
3677-            ss = servermap.connections[peerid]
3678-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
3679-            d.addCallback(self._got_answer, peerid, servermap)
3680-            dl.append(d)
3681-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
3682 
3683hunk ./src/allmydata/mutable/checker.py 98
3684-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
3685-        # isolate the callRemote to a separate method, so tests can subclass
3686-        # Publish and override it
3687-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
3688+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
3689+        d = r.download()
3690+        d.addCallback(self._process_bad_shares)
3691         return d
3692 
3693hunk ./src/allmydata/mutable/checker.py 103
3694-    def _got_answer(self, datavs, peerid, servermap):
3695-        for shnum,datav in datavs.items():
3696-            data = datav[0]
3697-            try:
3698-                self._got_results_one_share(shnum, peerid, data)
3699-            except CorruptShareError:
3700-                f = failure.Failure()
3701-                self.need_repair = True
3702-                self.bad_shares.append( (peerid, shnum, f) )
3703-                prefix = data[:SIGNED_PREFIX_LENGTH]
3704-                servermap.mark_bad_share(peerid, shnum, prefix)
3705-                ss = servermap.connections[peerid]
3706-                self.notify_server_corruption(ss, shnum, str(f.value))
3707-
3708-    def check_prefix(self, peerid, shnum, data):
3709-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3710-         offsets_tuple) = self.best_version
3711-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
3712-        if got_prefix != prefix:
3713-            raise CorruptShareError(peerid, shnum,
3714-                                    "prefix mismatch: share changed while we were reading it")
3715-
3716-    def _got_results_one_share(self, shnum, peerid, data):
3717-        self.check_prefix(peerid, shnum, data)
3718-
3719-        # the [seqnum:signature] pieces are validated by _compare_prefix,
3720-        # which checks their signature against the pubkey known to be
3721-        # associated with this file.
3722 
3723hunk ./src/allmydata/mutable/checker.py 104
3724-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
3725-         share_hash_chain, block_hash_tree, share_data,
3726-         enc_privkey) = unpack_share(data)
3727-
3728-        # validate [share_hash_chain,block_hash_tree,share_data]
3729-
3730-        leaves = [hashutil.block_hash(share_data)]
3731-        t = hashtree.HashTree(leaves)
3732-        if list(t) != block_hash_tree:
3733-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
3734-        share_hash_leaf = t[0]
3735-        t2 = hashtree.IncompleteHashTree(N)
3736-        # root_hash was checked by the signature
3737-        t2.set_hashes({0: root_hash})
3738-        try:
3739-            t2.set_hashes(hashes=share_hash_chain,
3740-                          leaves={shnum: share_hash_leaf})
3741-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
3742-                IndexError), e:
3743-            msg = "corrupt hashes: %s" % (e,)
3744-            raise CorruptShareError(peerid, shnum, msg)
3745-
3746-        # validate enc_privkey: only possible if we have a write-cap
3747-        if not self._node.is_readonly():
3748-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3749-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3750-            if alleged_writekey != self._node.get_writekey():
3751-                raise CorruptShareError(peerid, shnum, "invalid privkey")
3752+    def _process_bad_shares(self, bad_shares):
3753+        if bad_shares:
3754+            self.need_repair = True
3755+        self.bad_shares = bad_shares
3756 
3757hunk ./src/allmydata/mutable/checker.py 109
3758-    def notify_server_corruption(self, ss, shnum, reason):
3759-        ss.callRemoteOnly("advise_corrupt_share",
3760-                          "mutable", self._storage_index, shnum, reason)
3761 
3762     def _count_shares(self, smap, version):
3763         available_shares = smap.shares_available()
3764hunk ./src/allmydata/mutable/repairer.py 5
3765 from zope.interface import implements
3766 from twisted.internet import defer
3767 from allmydata.interfaces import IRepairResults, ICheckResults
3768+from allmydata.mutable.publish import MutableData
3769 
3770 class RepairResults:
3771     implements(IRepairResults)
3772hunk ./src/allmydata/mutable/repairer.py 108
3773             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
3774 
3775         d = self.node.download_version(smap, best_version, fetch_privkey=True)
3776+        d.addCallback(lambda data:
3777+            MutableData(data))
3778         d.addCallback(self.node.upload, smap)
3779         d.addCallback(self.get_results, smap)
3780         return d
3781}
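The checker change above boils down to one pattern: run the same download machinery over every share, but with `verify=True` collect validation failures instead of producing plaintext. A toy illustration of that control flow (the class, names, and structure here are illustrative only, not the real `Retrieve` API, which is asynchronous and Deferred-based):

```python
class MiniRetrieve:
    """Toy model of a downloader with an optional verify mode."""
    def __init__(self, shares, verify=False):
        # shares maps shnum -> (data, is_valid)
        self._shares = shares
        self._verify = verify
        self._bad_shares = []

    def download(self):
        pieces = []
        for shnum in sorted(self._shares):
            data, valid = self._shares[shnum]
            if not valid:
                # the real Retrieve records a CorruptShareError here
                self._bad_shares.append(shnum)
                continue
            pieces.append(data)
        if self._verify:
            # checker path: report bad shares rather than file contents
            return list(self._bad_shares)
        return "".join(pieces)
```

With this shape, the checker only needs something like `_process_bad_shares` to set `need_repair` when the returned list is non-empty, which is what the checker hunk above does.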
3782[mutable/filenode.py: add versions and partial-file updates to the mutable file node
3783Kevan Carstensen <kevan@isnotajoke.com>**20100809232741
3784 Ignore-this: c886a359bd1ab2122a9ca2b8c04d48ec
3785 
3786 One of the goals of MDMF as a GSoC project is to lay the groundwork for
3787 LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
3788 multiple versions of a single cap on the grid. In line with this, there
3789 is now a distinction between an overriding mutable file (which can be
3790 thought to correspond to the cap/unique identifier for that mutable
3791 file) and versions of the mutable file (which we can download, update,
3792 and so on). All download, upload, and modification operations end up
3793 happening on a particular version of a mutable file, but there are
3794 shortcut methods on the object representing the overriding mutable file
3795 that perform these operations on the best version of the mutable file
3796 (which is what code should be doing until we have LDMF and better
3797 support for other paradigms).
3798 
3799 Another goal of MDMF was to take advantage of segmentation to give
3800 callers more efficient partial file updates or appends. This patch
3801 implements methods that do that, too.
3802 
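 A hedged sketch of the segment arithmetic that partial-file updates rely
 on (the helper name and signature are illustrative, not from this patch,
 which does this bookkeeping inside the filenode/publish code):

```python
def touched_segments(offset, length, segment_size):
    # Segments overlapped by writing `length` bytes at `offset`.
    # A segmented (MDMF) update only needs to refetch and re-encode
    # these segments, rather than the whole file as SDMF must.
    if length <= 0:
        return []
    first = offset // segment_size
    last = (offset + length - 1) // segment_size
    return list(range(first, last + 1))
```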
3803] {
3804hunk ./src/allmydata/mutable/filenode.py 7
3805 from zope.interface import implements
3806 from twisted.internet import defer, reactor
3807 from foolscap.api import eventually
3808-from allmydata.interfaces import IMutableFileNode, \
3809-     ICheckable, ICheckResults, NotEnoughSharesError
3810-from allmydata.util import hashutil, log
3811+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
3812+     NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION, IMutableUploadable, \
3813+     IMutableFileVersion, IWritable
3814+from allmydata import hashtree
3815+from allmydata.util import hashutil, log, consumer, deferredutil, mathutil
3816 from allmydata.util.assertutil import precondition
3817 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
3818 from allmydata.monitor import Monitor
3819hunk ./src/allmydata/mutable/filenode.py 17
3820 from pycryptopp.cipher.aes import AES
3821 
3822-from allmydata.mutable.publish import Publish
3823+from allmydata.mutable.publish import Publish, MutableFileHandle, \
3824+                                      MutableData,\
3825+                                      DEFAULT_MAX_SEGMENT_SIZE, \
3826+                                      TransformingUploadable
3827 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
3828      ResponseCache, UncoordinatedWriteError
3829 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
3830hunk ./src/allmydata/mutable/filenode.py 72
3831         self._sharemap = {} # known shares, shnum-to-[nodeids]
3832         self._cache = ResponseCache()
3833         self._most_recent_size = None
3834+        # filled in after __init__ if we're being created for the first time;
3835+        # filled in by the servermap updater before publishing, otherwise.
3836+        # set to this default value in case neither of those things happen,
3837+        # or in case the servermap can't find any shares to tell us what
3838+        # to publish as.
3839+        # TODO: Set this back to None, and find out why the tests fail
3840+        #       with it set to None.
3841+        self._protocol_version = SDMF_VERSION
3842 
3843         # all users of this MutableFileNode go through the serializer. This
3844         # takes advantage of the fact that Deferreds discard the callbacks
3845hunk ./src/allmydata/mutable/filenode.py 136
3846         return self._upload(initial_contents, None)
3847 
3848     def _get_initial_contents(self, contents):
3849-        if isinstance(contents, str):
3850-            return contents
3851         if contents is None:
3852hunk ./src/allmydata/mutable/filenode.py 137
3853-            return ""
3854+            return MutableData("")
3855+
3856+        if IMutableUploadable.providedBy(contents):
3857+            return contents
3858+
3859         assert callable(contents), "%s should be callable, not %s" % \
3860                (contents, type(contents))
3861         return contents(self)
3862hunk ./src/allmydata/mutable/filenode.py 211
3863 
3864     def get_size(self):
3865         return self._most_recent_size
3866+
3867     def get_current_size(self):
3868         d = self.get_size_of_best_version()
3869         d.addCallback(self._stash_size)
3870hunk ./src/allmydata/mutable/filenode.py 216
3871         return d
3872+
3873     def _stash_size(self, size):
3874         self._most_recent_size = size
3875         return size
3876hunk ./src/allmydata/mutable/filenode.py 275
3877             return cmp(self.__class__, them.__class__)
3878         return cmp(self._uri, them._uri)
3879 
3880-    def _do_serialized(self, cb, *args, **kwargs):
3881-        # note: to avoid deadlock, this callable is *not* allowed to invoke
3882-        # other serialized methods within this (or any other)
3883-        # MutableFileNode. The callable should be a bound method of this same
3884-        # MFN instance.
3885-        d = defer.Deferred()
3886-        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
3887-        # we need to put off d.callback until this Deferred is finished being
3888-        # processed. Otherwise the caller's subsequent activities (like,
3889-        # doing other things with this node) can cause reentrancy problems in
3890-        # the Deferred code itself
3891-        self._serializer.addBoth(lambda res: eventually(d.callback, res))
3892-        # add a log.err just in case something really weird happens, because
3893-        # self._serializer stays around forever, therefore we won't see the
3894-        # usual Unhandled Error in Deferred that would give us a hint.
3895-        self._serializer.addErrback(log.err)
3896-        return d
3897 
3898     #################################
3899     # ICheckable
3900hunk ./src/allmydata/mutable/filenode.py 300
3901 
3902 
3903     #################################
3904-    # IMutableFileNode
3905+    # IFileNode
3906+
3907+    def get_best_readable_version(self):
3908+        """
3909+        I return a Deferred that fires with a MutableFileVersion
3910+        representing the best readable version of the file that I
3911+        represent
3912+        """
3913+        return self.get_readable_version()
3914+
3915+
3916+    def get_readable_version(self, servermap=None, version=None):
3917+        """
3918+        I return a Deferred that fires with an MutableFileVersion for my
3919+        version argument, if there is a recoverable file of that version
3920+        on the grid. If there is no recoverable version, I fire with an
3921+        UnrecoverableFileError.
3922+
3923+        If a servermap is provided, I look in there for the requested
3924+        version. If no servermap is provided, I create and update a new
3925+        one.
3926+
3927+        If no version is provided, then I return a MutableFileVersion
3928+        representing the best recoverable version of the file.
3929+        """
3930+        d = self._get_version_from_servermap(MODE_READ, servermap, version)
3931+        def _build_version((servermap, their_version)):
3932+            assert their_version in servermap.recoverable_versions()
3933+            assert their_version in servermap.make_versionmap()
3934+
3935+            mfv = MutableFileVersion(self,
3936+                                     servermap,
3937+                                     their_version,
3938+                                     self._storage_index,
3939+                                     self._storage_broker,
3940+                                     self._readkey,
3941+                                     history=self._history)
3942+            assert mfv.is_readonly()
3943+            # our caller can use this to download the contents of the
3944+            # mutable file.
3945+            return mfv
3946+        return d.addCallback(_build_version)
3947+
3948+
3949+    def _get_version_from_servermap(self,
3950+                                    mode,
3951+                                    servermap=None,
3952+                                    version=None):
3953+        """
3954+        I return a Deferred that fires with (servermap, version).
3955+
3956+        This function performs validation and a servermap update. If it
3957+        returns (servermap, version), the caller can assume that:
3958+            - servermap was last updated in mode.
3959+            - version is recoverable, and corresponds to the servermap.
3960+
3961+        If version and servermap are provided to me, I will validate
3962+        that version exists in the servermap, and that the servermap was
3963+        updated correctly.
3964+
3965+        If version is not provided, but servermap is, I will validate
3966+        the servermap and return the best recoverable version that I can
3967+        find in the servermap.
3968+
3969+        If the version is provided but the servermap isn't, I will
3970+        obtain a servermap that has been updated in the correct mode and
3971+        validate that version is found and recoverable.
3972+
3973+        If neither servermap nor version are provided, I will obtain a
3974+        servermap updated in the correct mode, and return the best
3975+        recoverable version that I can find in there.
3976+        """
3977+        # XXX: wording ^^^^
3978+        if servermap and servermap.last_update_mode == mode:
3979+            d = defer.succeed(servermap)
3980+        else:
3981+            d = self._get_servermap(mode)
3982+
3983+        def _get_version(servermap, version):
3984+            if version and version not in servermap.recoverable_versions():
3985+                version = None
3986+            elif not version:
3987+                version = servermap.best_recoverable_version()
3988+            if not version:
3989+                raise UnrecoverableFileError("no recoverable versions")
3990+            return (servermap, version)
3991+        return d.addCallback(_get_version, version)
3992+
3993 
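The version-selection rule documented in `_get_version_from_servermap` can be sketched synchronously (hypothetical, simplified names; the real code works with ServerMap objects and Deferreds): honor an explicitly requested version when it is recoverable, otherwise fall back to the best recoverable version, and fail when nothing is recoverable.

```python
# Hedged sketch of the version-selection rule; `choose_version` and its
# parameters are hypothetical stand-ins for the ServerMap-based logic.
class UnrecoverableFileError(Exception):
    pass

def choose_version(recoverable_versions, best_version, requested=None):
    """Return the version to use, or raise UnrecoverableFileError."""
    if requested is not None:
        if requested not in recoverable_versions:
            # the requested version cannot be recovered from the grid
            raise UnrecoverableFileError("no recoverable versions")
        return requested
    if best_version is None:
        raise UnrecoverableFileError("no recoverable versions")
    return best_version
```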
3994     def download_best_version(self):
3995hunk ./src/allmydata/mutable/filenode.py 390
3996+        """
3997+        I return a Deferred that fires with the contents of the best
3998+        version of this mutable file.
3999+        """
4000         return self._do_serialized(self._download_best_version)
4001hunk ./src/allmydata/mutable/filenode.py 395
4002+
4003+
4004     def _download_best_version(self):
4005hunk ./src/allmydata/mutable/filenode.py 398
4006-        servermap = ServerMap()
4007-        d = self._try_once_to_download_best_version(servermap, MODE_READ)
4008-        def _maybe_retry(f):
4009-            f.trap(NotEnoughSharesError)
4010-            # the download is worth retrying once. Make sure to use the
4011-            # old servermap, since it is what remembers the bad shares,
4012-            # but use MODE_WRITE to make it look for even more shares.
4013-            # TODO: consider allowing this to retry multiple times.. this
4014-            # approach will let us tolerate about 8 bad shares, I think.
4015-            return self._try_once_to_download_best_version(servermap,
4016-                                                           MODE_WRITE)
4017+        """
4018+        I am the serialized sibling of download_best_version.
4019+        """
4020+        d = self.get_best_readable_version()
4021+        d.addCallback(self._record_size)
4022+        d.addCallback(lambda version: version.download_to_data())
4023+
4024+        # It is possible that the download will fail because there
4025+        # aren't enough shares to be had. If so, we will try again after
4026+        # updating the servermap in MODE_WRITE, which may find more
4027+        # shares than updating in MODE_READ, as we just did. We can do
4028+        # this by getting the best mutable version and downloading from
4029+        # that -- the best mutable version will be a MutableFileVersion
4030+        # with a servermap that was last updated in MODE_WRITE, as we
4031+        # want. If this fails, then we give up.
4032+        def _maybe_retry(failure):
4033+            failure.trap(NotEnoughSharesError)
4034+
4035+            d = self.get_best_mutable_version()
4036+            d.addCallback(self._record_size)
4037+            d.addCallback(lambda version: version.download_to_data())
4038+            return d
4039+
4040         d.addErrback(_maybe_retry)
4041         return d
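The retry policy in `_download_best_version` reduces to: try a MODE_READ download first, and on NotEnoughSharesError retry once after a MODE_WRITE servermap update, which may locate additional shares. A hedged synchronous sketch (hypothetical names standing in for the Deferred-based code):

```python
# Synchronous analogue of the try-MODE_READ, retry-in-MODE_WRITE policy.
class NotEnoughSharesError(Exception):
    pass

def download_best(download_in_mode):
    try:
        return download_in_mode("MODE_READ")
    except NotEnoughSharesError:
        # a MODE_WRITE servermap update queries more servers, so the
        # download is worth one more attempt before giving up.
        return download_in_mode("MODE_WRITE")
```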
4042hunk ./src/allmydata/mutable/filenode.py 423
4043-    def _try_once_to_download_best_version(self, servermap, mode):
4044-        d = self._update_servermap(servermap, mode)
4045-        d.addCallback(self._once_updated_download_best_version, servermap)
4046-        return d
4047-    def _once_updated_download_best_version(self, ignored, servermap):
4048-        goal = servermap.best_recoverable_version()
4049-        if not goal:
4050-            raise UnrecoverableFileError("no recoverable versions")
4051-        return self._try_once_to_download_version(servermap, goal)
4052+
4053+
4054+    def _record_size(self, mfv):
4055+        """
4056+        I record the size of a mutable file version.
4057+        """
4058+        self._most_recent_size = mfv.get_size()
4059+        return mfv
4060+
4061 
4062     def get_size_of_best_version(self):
4063hunk ./src/allmydata/mutable/filenode.py 434
4064-        d = self.get_servermap(MODE_READ)
4065-        def _got_servermap(smap):
4066-            ver = smap.best_recoverable_version()
4067-            if not ver:
4068-                raise UnrecoverableFileError("no recoverable version")
4069-            return smap.size_of_version(ver)
4070-        d.addCallback(_got_servermap)
4071-        return d
4072+        """
4073+        I return the size of the best version of this mutable file.
4074 
4075hunk ./src/allmydata/mutable/filenode.py 437
4076+        This is equivalent to calling get_size() on the result of
4077+        get_best_readable_version().
4078+        """
4079+        d = self.get_best_readable_version()
4080+        return d.addCallback(lambda mfv: mfv.get_size())
4081+
4082+
4083+    #################################
4084+    # IMutableFileNode
4085+
4086+    def get_best_mutable_version(self, servermap=None):
4087+        """
4088+        I return a Deferred that fires with a MutableFileVersion
4089+        representing the best recoverable version of the file that I
4090+        represent. I am like get_best_readable_version, except that I
4091+        will try to make a writable version if I can.
4092+        """
4093+        return self.get_mutable_version(servermap=servermap)
4094+
4095+
4096+    def get_mutable_version(self, servermap=None, version=None):
4097+        """
4098+        I return a version of this mutable file. I return a Deferred
4099+        that fires with a MutableFileVersion.
4100+
4101+        If version is provided, the Deferred will fire with a
4102+        MutableFileVersion initialized with that version. Otherwise, it
4103+        will fire with the best version that I can recover.
4104+
4105+        If servermap is provided, I will use that to find versions
4106+        instead of performing my own servermap update.
4107+        """
4108+        if self.is_readonly():
4109+            return self.get_readable_version(servermap=servermap,
4110+                                             version=version)
4111+
4112+        # get_mutable_version => write intent, so we require that the
4113+        # servermap is updated in MODE_WRITE
4114+        d = self._get_version_from_servermap(MODE_WRITE, servermap, version)
4115+        def _build_version((servermap, smap_version)):
4116+            # these should have been set by the servermap update.
4117+            assert self._secret_holder
4118+            assert self._writekey
4119+
4120+            mfv = MutableFileVersion(self,
4121+                                     servermap,
4122+                                     smap_version,
4123+                                     self._storage_index,
4124+                                     self._storage_broker,
4125+                                     self._readkey,
4126+                                     self._writekey,
4127+                                     self._secret_holder,
4128+                                     history=self._history)
4129+            assert not mfv.is_readonly()
4130+            return mfv
4131+
4132+        return d.addCallback(_build_version)
4133+
4134+
4135+    # XXX: I'm uncomfortable with the difference between upload and
4136+    #      overwrite, which, FWICT, is basically that you don't have to
4137+    #      do a servermap update before you overwrite. We split them up
4138+    #      that way anyway, so I guess there's no real difficulty in
4139+    #      offering both ways to callers, but it also makes the
4140+    #      public-facing API cluttered, and makes it hard to discern the
4141+    #      right way of doing things.
4142+
4143+    # In general, we leave it to callers to ensure that they aren't
4144+    # going to cause UncoordinatedWriteErrors when working with
4145+    # MutableFileVersions. We know that the next three operations
4146+    # (upload, overwrite, and modify) will all operate on the same
4147+    # version, so we say that only one of them can be going on at once,
4148+    # and serialize them to ensure that that actually happens, since as
4149+    # the caller in this situation it is our job to do that.
4150     def overwrite(self, new_contents):
4151hunk ./src/allmydata/mutable/filenode.py 512
4152+        """
4153+        I overwrite the contents of the best recoverable version of this
4154+        mutable file with new_contents. This is equivalent to calling
4155+        overwrite on the result of get_best_mutable_version with
4156+        new_contents as an argument. I return a Deferred that eventually
4157+        fires with the results of my replacement process.
4158+        """
4159         return self._do_serialized(self._overwrite, new_contents)
4160hunk ./src/allmydata/mutable/filenode.py 520
4161+
4162+
4163     def _overwrite(self, new_contents):
4164hunk ./src/allmydata/mutable/filenode.py 523
4165+        """
4166+        I am the serialized sibling of overwrite.
4167+        """
4168+        d = self.get_best_mutable_version()
4169+        return d.addCallback(lambda mfv: mfv.overwrite(new_contents))
4170+
4171+
4172+
4173+    def upload(self, new_contents, servermap):
4174+        """
4175+        I overwrite the contents of the best recoverable version of this
4176+        mutable file with new_contents, using servermap instead of
4177+        creating/updating our own servermap. I return a Deferred that
4178+        fires with the results of my upload.
4179+        """
4180+        return self._do_serialized(self._upload, new_contents, servermap)
4181+
4182+
4183+    def _upload(self, new_contents, servermap):
4184+        """
4185+        I am the serialized sibling of upload.
4186+        """
4187+        d = self.get_best_mutable_version(servermap)
4188+        return d.addCallback(lambda mfv: mfv.overwrite(new_contents))
4189+
4190+
4191+    def modify(self, modifier, backoffer=None):
4192+        """
4193+        I modify the contents of the best recoverable version of this
4194+        mutable file with the modifier. This is equivalent to calling
4195+        modify on the result of get_best_mutable_version. I return a
4196+        Deferred that eventually fires with an UploadResults instance
4197+        describing this process.
4198+        """
4199+        return self._do_serialized(self._modify, modifier, backoffer)
4200+
4201+
4202+    def _modify(self, modifier, backoffer):
4203+        """
4204+        I am the serialized sibling of modify.
4205+        """
4206+        d = self.get_best_mutable_version()
4207+        return d.addCallback(lambda mfv: mfv.modify(modifier, backoffer))
4208+
4209+
4210+    def download_version(self, servermap, version, fetch_privkey=False):
4211+        """
4212+        Download the specified version of this mutable file. I return a
4213+        Deferred that fires with the contents of the specified version
4214+        as a bytestring, or errbacks if the file is not recoverable.
4215+        """
4216+        d = self.get_readable_version(servermap, version)
4217+        return d.addCallback(lambda mfv: mfv.download_to_data(fetch_privkey))
4218+
4219+
4220+    def get_servermap(self, mode):
4221+        """
4222+        I return a servermap that has been updated in mode.
4223+
4224+        mode should be one of MODE_READ, MODE_WRITE, MODE_CHECK or
4225+        MODE_ANYTHING. See servermap.py for more on what these mean.
4226+        """
4227+        return self._do_serialized(self._get_servermap, mode)
4228+
4229+
4230+    def _get_servermap(self, mode):
4231+        """
4232+        I am the serialized sibling of get_servermap.
4233+        """
4234         servermap = ServerMap()
4235hunk ./src/allmydata/mutable/filenode.py 593
4236-        d = self._update_servermap(servermap, mode=MODE_WRITE)
4237-        d.addCallback(lambda ignored: self._upload(new_contents, servermap))
4238+        return self._update_servermap(servermap, mode)
4239+
4240+
4241+    def _update_servermap(self, servermap, mode):
4242+        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
4243+                             mode)
4244+        if self._history:
4245+            self._history.notify_mapupdate(u.get_status())
4246+        return u.update()
4247+
4248+
4249+    def set_version(self, version):
4250+        # I can be set in two ways:
4251+        #  1. When the node is created.
4252+        #  2. (for an existing share) when the Servermap is updated
4253+        #     before I am read.
4254+        assert version in (MDMF_VERSION, SDMF_VERSION)
4255+        self._protocol_version = version
4256+
4257+
4258+    def get_version(self):
4259+        return self._protocol_version
4260+
4261+
4262+    def _do_serialized(self, cb, *args, **kwargs):
4263+        # note: to avoid deadlock, this callable is *not* allowed to invoke
4264+        # other serialized methods within this (or any other)
4265+        # MutableFileNode. The callable should be a bound method of this same
4266+        # MFN instance.
4267+        d = defer.Deferred()
4268+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
4269+        # we need to put off d.callback until this Deferred is finished being
4270+        # processed. Otherwise the caller's subsequent activities (like,
4271+        # doing other things with this node) can cause reentrancy problems in
4272+        # the Deferred code itself
4273+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
4274+        # add a log.err just in case something really weird happens, because
4275+        # self._serializer stays around forever, therefore we won't see the
4276+        # usual Unhandled Error in Deferred that would give us a hint.
4277+        self._serializer.addErrback(log.err)
4278         return d
4279 
4280 
4281hunk ./src/allmydata/mutable/filenode.py 636
4282+    def _upload(self, new_contents, servermap):
4283+        """
4284+        A MutableFileNode still has to have some way of getting
4285+        published initially, which is what I am here for. After that,
4286+        all publishing, updating, modifying and so on happens through
4287+        MutableFileVersions.
4288+        """
4289+        assert self._pubkey, "update_servermap must be called before publish"
4290+
4291+        p = Publish(self, self._storage_broker, servermap)
4292+        if self._history:
4293+            self._history.notify_publish(p.get_status(),
4294+                                         new_contents.get_size())
4295+        d = p.publish(new_contents)
4296+        d.addCallback(self._did_upload, new_contents.get_size())
4297+        return d
4298+
4299+
4300+    def _did_upload(self, res, size):
4301+        self._most_recent_size = size
4302+        return res
4303+
4304+
4305+class MutableFileVersion:
4306+    """
4307+    I represent a specific version (most likely the best version) of a
4308+    mutable file.
4309+
4310+    Since I implement IReadable, callers which hold a
4311+    reference to an instance of me are guaranteed the ability (absent
4312+    connection difficulties or unrecoverable versions) to read the file
4313+    that I represent. Depending on whether I was initialized with a
4314+    write capability or not, I may also provide callers the ability to
4315+    overwrite or modify the contents of the mutable file that I
4316+    reference.
4317+    """
4318+    implements(IMutableFileVersion, IWritable)
4319+
4320+    def __init__(self,
4321+                 node,
4322+                 servermap,
4323+                 version,
4324+                 storage_index,
4325+                 storage_broker,
4326+                 readcap,
4327+                 writekey=None,
4328+                 write_secrets=None,
4329+                 history=None):
4330+
4331+        self._node = node
4332+        self._servermap = servermap
4333+        self._version = version
4334+        self._storage_index = storage_index
4335+        self._write_secrets = write_secrets
4336+        self._history = history
4337+        self._storage_broker = storage_broker
4338+
4339+        #assert isinstance(readcap, IURI)
4340+        self._readcap = readcap
4341+
4342+        self._writekey = writekey
4343+        self._serializer = defer.succeed(None)
4344+        self._size = None
4345+
4346+
4347+    def get_sequence_number(self):
4348+        """
4349+        Get the sequence number of the mutable version that I represent.
4350+        """
4351+        return self._version[0] # verinfo[0] == the sequence number
4352+
4353+
4354+    # TODO: Terminology?
4355+    def get_writekey(self):
4356+        """
4357+        I return a writekey or None if I don't have a writekey.
4358+        """
4359+        return self._writekey
4360+
4361+
4362+    def overwrite(self, new_contents):
4363+        """
4364+        I overwrite the contents of this mutable file version with the
4365+        data in new_contents.
4366+        """
4367+        assert not self.is_readonly()
4368+
4369+        return self._do_serialized(self._overwrite, new_contents)
4370+
4371+
4372+    def _overwrite(self, new_contents):
4373+        assert IMutableUploadable.providedBy(new_contents)
4374+        assert self._servermap.last_update_mode == MODE_WRITE
4375+
4376+        return self._upload(new_contents)
4377+
4378+
4379     def modify(self, modifier, backoffer=None):
4380         """I use a modifier callback to apply a change to the mutable file.
4381         I implement the following pseudocode::
4382hunk ./src/allmydata/mutable/filenode.py 773
4383         backoffer should not invoke any methods on this MutableFileNode
4384         instance, and it needs to be highly conscious of deadlock issues.
4385         """
4386+        assert not self.is_readonly()
4387+
4388         return self._do_serialized(self._modify, modifier, backoffer)
4389hunk ./src/allmydata/mutable/filenode.py 776
4390+
4391+
4392     def _modify(self, modifier, backoffer):
4393hunk ./src/allmydata/mutable/filenode.py 779
4394-        servermap = ServerMap()
4395         if backoffer is None:
4396             backoffer = BackoffAgent().delay
4397hunk ./src/allmydata/mutable/filenode.py 781
4398-        return self._modify_and_retry(servermap, modifier, backoffer, True)
4399-    def _modify_and_retry(self, servermap, modifier, backoffer, first_time):
4400-        d = self._modify_once(servermap, modifier, first_time)
4401+        return self._modify_and_retry(modifier, backoffer, True)
4402+
4403+
4404+    def _modify_and_retry(self, modifier, backoffer, first_time):
4405+        """
4406+        I try to apply modifier to the contents of this version of the
4407+        mutable file. If I succeed, I return an UploadResults instance
4408+        describing my success. If I fail, I try again after waiting for
4409+        a little bit.
4410+        """
4411+        log.msg("doing modify")
4412+        d = self._modify_once(modifier, first_time)
4413         def _retry(f):
4414             f.trap(UncoordinatedWriteError)
4415             d2 = defer.maybeDeferred(backoffer, self, f)
4416hunk ./src/allmydata/mutable/filenode.py 797
4417             d2.addCallback(lambda ignored:
4418-                           self._modify_and_retry(servermap, modifier,
4419+                           self._modify_and_retry(modifier,
4420                                                   backoffer, False))
4421             return d2
4422         d.addErrback(_retry)
4423hunk ./src/allmydata/mutable/filenode.py 802
4424         return d
4425-    def _modify_once(self, servermap, modifier, first_time):
4426-        d = self._update_servermap(servermap, MODE_WRITE)
4427-        d.addCallback(self._once_updated_download_best_version, servermap)
4428+
4429+
4430+    def _modify_once(self, modifier, first_time):
4431+        """
4432+        I attempt to apply a modifier to the contents of the mutable
4433+        file.
4434+        """
4435+        # XXX: This is wrong -- we could get more servers if we updated
4436+        # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to
4437+        # assert that the last update wasn't MODE_READ
4438+        assert self._servermap.last_update_mode == MODE_WRITE
4439+
4440+        # download_to_data is serialized, so we have to call this to
4441+        # avoid deadlock.
4442+        d = self._try_to_download_data()
4443         def _apply(old_contents):
4444hunk ./src/allmydata/mutable/filenode.py 818
4445-            new_contents = modifier(old_contents, servermap, first_time)
4446+            new_contents = modifier(old_contents, self._servermap, first_time)
4447+            precondition((isinstance(new_contents, str) or
4448+                          new_contents is None),
4449+                         "Modifier function must return a string "
4450+                         "or None")
4451+
4452             if new_contents is None or new_contents == old_contents:
4453hunk ./src/allmydata/mutable/filenode.py 825
4454+                log.msg("no changes")
4455                 # no changes need to be made
4456                 if first_time:
4457                     return
4458hunk ./src/allmydata/mutable/filenode.py 833
4459                 # recovery when it observes UCWE, we need to do a second
4460                 # publish. See #551 for details. We'll basically loop until
4461                 # we managed an uncontested publish.
4462-                new_contents = old_contents
4463-            precondition(isinstance(new_contents, str),
4464-                         "Modifier function must return a string or None")
4465-            return self._upload(new_contents, servermap)
4466+                old_uploadable = MutableData(old_contents)
4467+                new_contents = old_uploadable
4468+            else:
4469+                new_contents = MutableData(new_contents)
4470+
4471+            return self._upload(new_contents)
4472         d.addCallback(_apply)
4473         return d
4474 
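The modifier-application rule inside `_modify_once` can be sketched on its own (hypothetical, simplified names; the real modifier also receives the servermap, and the real code wraps the result in MutableData): if the modifier returns None or unchanged contents on the first attempt, nothing is published; on a retry after an UncoordinatedWriteError the old contents are republished anyway, so that recovery ends with an uncontested publish (see #551).

```python
# Hedged sketch of the modifier-application rule; apply_modifier is a
# hypothetical helper, not part of the patch.
def apply_modifier(old_contents, modifier, first_time):
    """Return the string to publish, or None to skip publishing."""
    new_contents = modifier(old_contents, first_time)
    assert new_contents is None or isinstance(new_contents, str), \
        "Modifier function must return a string or None"
    if new_contents is None or new_contents == old_contents:
        if first_time:
            return None              # no changes need to be made
        # second publish, to resolve a possibly-contested write
        new_contents = old_contents
    return new_contents
```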
4475hunk ./src/allmydata/mutable/filenode.py 842
4476-    def get_servermap(self, mode):
4477-        return self._do_serialized(self._get_servermap, mode)
4478-    def _get_servermap(self, mode):
4479-        servermap = ServerMap()
4480-        return self._update_servermap(servermap, mode)
4481-    def _update_servermap(self, servermap, mode):
4482-        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
4483-                             mode)
4484-        if self._history:
4485-            self._history.notify_mapupdate(u.get_status())
4486-        return u.update()
4487 
4488hunk ./src/allmydata/mutable/filenode.py 843
4489-    def download_version(self, servermap, version, fetch_privkey=False):
4490-        return self._do_serialized(self._try_once_to_download_version,
4491-                                   servermap, version, fetch_privkey)
4492-    def _try_once_to_download_version(self, servermap, version,
4493-                                      fetch_privkey=False):
4494-        r = Retrieve(self, servermap, version, fetch_privkey)
4495+    def is_readonly(self):
4496+        """
4497+        I return True if this MutableFileVersion provides no write
4498+        access to the file that it encapsulates, and False if it
4499+        provides the ability to modify the file.
4500+        """
4501+        return self._writekey is None
4502+
4503+
4504+    def is_mutable(self):
4505+        """
4506+        I return True, since mutable files are always mutable by
4507+        somebody.
4508+        """
4509+        return True
4510+
4511+
4512+    def get_storage_index(self):
4513+        """
4514+        I return the storage index of the reference that I encapsulate.
4515+        """
4516+        return self._storage_index
4517+
4518+
4519+    def get_size(self):
4520+        """
4521+        I return the length, in bytes, of this readable object.
4522+        """
4523+        return self._servermap.size_of_version(self._version)
4524+
4525+
4526+    def download_to_data(self, fetch_privkey=False):
4527+        """
4528+        I return a Deferred that fires with the contents of this
4529+        readable object as a byte string.
4530+        """
4532+        c = consumer.MemoryConsumer()
4533+        d = self.read(c, fetch_privkey=fetch_privkey)
4534+        d.addCallback(lambda mc: "".join(mc.chunks))
4535+        return d
4536+
4537+
4538+    def _try_to_download_data(self):
4539+        """
4540+        I am an unserialized cousin of download_to_data; I am called
4541+        from the children of modify() to download the data associated
4542+        with this mutable version.
4543+        """
4544+        c = consumer.MemoryConsumer()
4545+        # modify will almost certainly write, so we need the privkey.
4546+        d = self._read(c, fetch_privkey=True)
4547+        d.addCallback(lambda mc: "".join(mc.chunks))
4548+        return d
4549+
4550+
4551+    def _update_servermap(self, mode=MODE_READ):
4552+        """
4553+        I update our Servermap according to my mode argument. I return a
4554+        Deferred that fires with None when this has finished. The
4555+        updated Servermap will be at self._servermap in that case.
4556+        """
4557+        d = self._node.get_servermap(mode)
4558+
4559+        def _got_servermap(servermap):
4560+            assert servermap.last_update_mode == mode
4561+
4562+            self._servermap = servermap
4563+        d.addCallback(_got_servermap)
4564+        return d
4565+
4566+
4567+    def read(self, consumer, offset=0, size=None, fetch_privkey=False):
4568+        """
4569+        I read a portion (possibly all) of the mutable file that I
4570+        reference into consumer.
4571+        """
4572+        return self._do_serialized(self._read, consumer, offset, size,
4573+                                   fetch_privkey)
4574+
4575+
4576+    def _read(self, consumer, offset=0, size=None, fetch_privkey=False):
4577+        """
4578+        I am the serialized companion of read.
4579+        """
4580+        r = Retrieve(self._node, self._servermap, self._version, fetch_privkey)
4581         if self._history:
4582             self._history.notify_retrieve(r.get_status())
4583hunk ./src/allmydata/mutable/filenode.py 931
4584-        d = r.download()
4585-        d.addCallback(self._downloaded_version)
4586+        d = r.download(consumer, offset, size)
4587         return d
4588hunk ./src/allmydata/mutable/filenode.py 933
4589-    def _downloaded_version(self, data):
4590-        self._most_recent_size = len(data)
4591-        return data
4592 
4593hunk ./src/allmydata/mutable/filenode.py 934
4594-    def upload(self, new_contents, servermap):
4595-        return self._do_serialized(self._upload, new_contents, servermap)
4596-    def _upload(self, new_contents, servermap):
4597-        assert self._pubkey, "update_servermap must be called before publish"
4598-        p = Publish(self, self._storage_broker, servermap)
4599+
4600+    def _do_serialized(self, cb, *args, **kwargs):
4601+        # note: to avoid deadlock, this callable is *not* allowed to invoke
4602+        # other serialized methods within this (or any other)
4603+        # MutableFileNode. The callable should be a bound method of this same
4604+        # MFN instance.
4605+        d = defer.Deferred()
4606+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
4607+        # we need to put off d.callback until this Deferred is finished being
4608+        # processed. Otherwise the caller's subsequent activities (like,
4609+        # doing other things with this node) can cause reentrancy problems in
4610+        # the Deferred code itself
4611+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
4612+        # add a log.err just in case something really weird happens, because
4613+        # self._serializer stays around forever, therefore we won't see the
4614+        # usual Unhandled Error in Deferred that would give us a hint.
4615+        self._serializer.addErrback(log.err)
4616+        return d
4617+
4618+
4619+    def _upload(self, new_contents):
4620+        #assert self._pubkey, "update_servermap must be called before publish"
4621+        p = Publish(self._node, self._storage_broker, self._servermap)
4622         if self._history:
4623hunk ./src/allmydata/mutable/filenode.py 958
4624-            self._history.notify_publish(p.get_status(), len(new_contents))
4625+            self._history.notify_publish(p.get_status(),
4626+                                         new_contents.get_size())
4627         d = p.publish(new_contents)
4628hunk ./src/allmydata/mutable/filenode.py 961
4629-        d.addCallback(self._did_upload, len(new_contents))
4630+        d.addCallback(self._did_upload, new_contents.get_size())
4631         return d
4632hunk ./src/allmydata/mutable/filenode.py 963
4633+
4634+
4635     def _did_upload(self, res, size):
4636hunk ./src/allmydata/mutable/filenode.py 966
4637-        self._most_recent_size = size
4638+        self._size = size
4639         return res
4640hunk ./src/allmydata/mutable/filenode.py 968
4641+
4642+    def update(self, data, offset):
4643+        """
4644+        Do an update of this mutable file version by inserting data at
4645+        offset within the file. If offset is the EOF, this is an append
4646+        operation. I return a Deferred that fires with the results of
4647+        the update operation when it has completed.
4648+
4649+        In cases where update does not append any data, or where it does
4650+        not append so many segments that the segment count crosses a
4651+        power-of-two boundary, this operation will use roughly
4652+        O(data.get_size()) memory/bandwidth/CPU to perform the update.
4653+        Otherwise, it must download, re-encode, and upload the entire
4654+        file again, which will use O(filesize) resources.
4655+        """
4656+        return self._do_serialized(self._update, data, offset)
4657+
4658+
4659+    def _update(self, data, offset):
4660+        """
4661+        I update the mutable file version represented by this particular
4662+        IMutableVersion by inserting the given data at the given
4663+        offset. I return a Deferred that fires when this has been
4664+        completed.
4665+        """
4666+        # We have two cases here:
4667+        # 1. The new data will add few enough segments so that it does
4668+        #    not cross into the next power-of-two boundary.
4669+        # 2. It doesn't.
4670+        #
4671+        # In the former case, we can modify the file in place. In the
4672+        # latter case, we need to re-encode the file.
4673+        new_size = data.get_size() + offset
4674+        old_size = self.get_size()
4675+        segment_size = self._version[3]
4676+        num_old_segments = mathutil.div_ceil(old_size,
4677+                                             segment_size)
4678+        num_new_segments = mathutil.div_ceil(new_size,
4679+                                             segment_size)
4680+        log.msg("got %d old segments, %d new segments" % \
4681+                        (num_old_segments, num_new_segments))
4682+
4683+        # We also do a whole file re-encode if the file is an SDMF file.
4684+        if self._version[2]: # version[2] == SDMF salt, which MDMF lacks
4685+            log.msg("doing re-encode instead of in-place update")
4686+            return self._do_modify_update(data, offset)
4687+
4688+        log.msg("updating in place")
4689+        d = self._do_update_update(data, offset)
4690+        d.addCallback(self._decode_and_decrypt_segments, data, offset)
4691+        d.addCallback(self._build_uploadable_and_finish, data, offset)
4692+        return d
4693+
4694+
4695+    def _do_modify_update(self, data, offset):
4696+        """
4697+        I perform a file update by modifying the contents of the file
4698+        after downloading it, then reuploading it. I am less efficient
4699+        than _do_update_update, but am necessary for certain updates.
4700+        """
4701+        def m(old, servermap, first_time):
4702+            start = offset
4703+            rest = offset + data.get_size()
4704+            new = old[:start]
4705+            new += "".join(data.read(data.get_size()))
4706+            new += old[rest:]
4707+            return new
4708+        return self._modify(m, None)
4709+
4710+
4711+    def _do_update_update(self, data, offset):
4712+        """
4713+        I start the Servermap update that gets us the data we need to
4714+        continue the update process. I return a Deferred that fires when
4715+        the servermap update is done.
4716+        """
4717+        assert IMutableUploadable.providedBy(data)
4718+        assert self.is_mutable()
4719+        # offset == self.get_size() is valid and means that we are
4720+        # appending data to the file.
4721+        assert offset <= self.get_size()
4722+
4723+        datasize = data.get_size()
4724+        # We'll need the segment that the data starts in, regardless of
4725+        # what we'll do later.
4726+        start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE)
4727+        start_segment -= 1
4728+
4729+        # We only need a distinct end segment if the data we write ends
4730+        # before the current end-of-file.
4731+        end_segment = start_segment
4732+        if offset + data.get_size() < self.get_size():
4733+            end_data = offset + data.get_size()
4734+            end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE)
4735+            end_segment -= 1
4736+        self._start_segment = start_segment
4737+        self._end_segment = end_segment
4738+
4739+        # Now ask for the servermap to be updated in MODE_WRITE with
4740+        # this update range.
4741+        u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
4742+                             self._servermap,
4743+                             mode=MODE_WRITE,
4744+                             update_range=(start_segment, end_segment))
4745+        return u.update()
4746+
4747+
4748+    def _decode_and_decrypt_segments(self, ignored, data, offset):
4749+        """
4750+        After the servermap update, I take the encrypted and encoded
4751+        data that the servermap fetched while doing its update and
4752+        transform it into decoded-and-decrypted plaintext that can be
4753+        used by the new uploadable. I return a Deferred that fires with
4754+        the segments.
4755+        """
4756+        r = Retrieve(self._node, self._servermap, self._version)
4757+        # decode: takes in our blocks and salts from the servermap,
4758+        # returns a Deferred that fires with the corresponding plaintext
4759+        # segments. Does not download -- simply takes advantage of
4760+        # existing infrastructure within the Retrieve class to avoid
4761+        # duplicating code.
4762+        sm = self._servermap
4763+        # XXX: If the methods in the servermap don't work as
4764+        # abstractions, you should rewrite them instead of going around
4765+        # them.
4766+        update_data = sm.update_data
4767+        start_segments = {} # shnum -> start segment
4768+        end_segments = {} # shnum -> end segment
4769+        blockhashes = {} # shnum -> blockhash tree
4770+        for (shnum, shdata) in update_data.iteritems():
4771+            shdata = [d[1] for d in shdata if d[0] == self._version]
4772+
4773+            # Every entry in shdata should now be the update data for
4774+            # share shnum of this particular version of the mutable
4775+            # file, so all of the entries should be identical.
4776+            datum = shdata[0]
4777+            assert filter(lambda x: x != datum, shdata) == []
4778+
4779+            blockhashes[shnum] = datum[0]
4780+            start_segments[shnum] = datum[1]
4781+            end_segments[shnum] = datum[2]
4782+
4783+        d1 = r.decode(start_segments, self._start_segment)
4784+        d2 = r.decode(end_segments, self._end_segment)
4785+        d3 = defer.succeed(blockhashes)
4786+        return deferredutil.gatherResults([d1, d2, d3])
4787+
4788+
4789+    def _build_uploadable_and_finish(self, segments_and_bht, data, offset):
4790+        """
4791+        After the process has the plaintext segments, I build the
4792+        TransformingUploadable that the publisher will eventually
4793+        re-upload to the grid. I then invoke the publisher with that
4794+        uploadable, and return a Deferred that fires when the publish
4795+        operation has completed without issue.
4796+        """
4797+        u = TransformingUploadable(data, offset,
4798+                                   self._version[3],
4799+                                   segments_and_bht[0],
4800+                                   segments_and_bht[1])
4801+        p = Publish(self._node, self._storage_broker, self._servermap)
4802+        return p.update(u, offset, segments_and_bht[2], self._version)
4803}
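The in-place vs. re-encode decision described in update()'s docstring comes down to segment arithmetic: mathutil.div_ceil gives the old and new segment counts, and an update can stay in place only while the count stays under the same power-of-two boundary (so the block hash trees keep their shape). A minimal sketch of that arithmetic, with local stand-ins for div_ceil and the segment-size constant (the helper names here are illustrative, not the patch's own):

```python
def div_ceil(n, d):
    # ceiling division, as provided by allmydata.util.mathutil.div_ceil
    return (n + d - 1) // d

def next_power_of_two(n):
    p = 1
    while p < n:
        p *= 2
    return p

DEFAULT_MAX_SEGMENT_SIZE = 128 * 1024  # assumed value, for illustration only

def can_update_in_place(old_size, data_size, offset,
                        segment_size=DEFAULT_MAX_SEGMENT_SIZE):
    """True if writing data_size bytes at offset keeps the segment count
    under the same power-of-two boundary, so the block hash trees keep
    their shape and the file can be modified in place."""
    new_size = max(old_size, offset + data_size)
    num_old = div_ceil(old_size, segment_size)
    num_new = div_ceil(new_size, segment_size)
    return next_power_of_two(num_old) == next_power_of_two(num_new)

# appending 4 KiB to a 200 KiB file: 2 segments before and after -> in place
ok = can_update_in_place(200 * 1024, 4 * 1024, 200 * 1024)
```

Note that the _update shown above only implements the SDMF fallback explicitly; the power-of-two case is what the docstring reserves for a future whole-file re-encode.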
4804[interfaces.py: Add #993 interfaces
4805Kevan Carstensen <kevan@isnotajoke.com>**20100809233244
4806 Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce
4807] {
4808hunk ./src/allmydata/interfaces.py 495
4809 class MustNotBeUnknownRWError(CapConstraintError):
4810     """Cannot add an unknown child cap specified in a rw_uri field."""
4811 
4812+
4813+class IReadable(Interface):
4814+    """I represent a readable object -- either an immutable file, or a
4815+    specific version of a mutable file.
4816+    """
4817+
4818+    def is_readonly():
4819+        """Return True if this reference is read-only (i.e. you cannot modify
4820+        the given file or directory through it), or False if not. Note
4821+        that even if this reference is read-only, someone else may hold a
4822+        read-write reference to it.
4823+
4824+        For an IReadable returned by get_best_readable_version(), this will
4825+        always return True, but for instances of subinterfaces such as
4826+        IMutableFileVersion, it may return False."""
4827+
4828+    def is_mutable():
4829+        """Return True if this file or directory is mutable (by *somebody*,
4830+    not necessarily you), False if it is immutable. Note that a file
4831+        might be mutable overall, but your reference to it might be
4832+        read-only. On the other hand, all references to an immutable file
4833+        will be read-only; there are no read-write references to an immutable
4834+        file."""
4835+
4836+    def get_storage_index():
4837+        """Return the storage index of the file."""
4838+
4839+    def get_size():
4840+        """Return the length (in bytes) of this readable object."""
4841+
4842+    def download_to_data():
4843+        """Download all of the file contents. I return a Deferred that fires
4844+        with the contents as a byte string."""
4845+
4846+    def read(consumer, offset=0, size=None):
4847+        """Download a portion (possibly all) of the file's contents, making
4848+        them available to the given IConsumer. Return a Deferred that fires
4849+        (with the consumer) when the consumer is unregistered (either because
4850+        the last byte has been given to it, or because the consumer threw an
4851+        exception during write(), possibly because it no longer wants to
4852+        receive data). The portion downloaded will start at 'offset' and
4853+        contain 'size' bytes (or the remainder of the file if size==None).
4854+
4855+        The consumer will be used in non-streaming mode: an IPullProducer
4856+        will be attached to it.
4857+
4858+        The consumer will not receive data right away: several network trips
4859+        must occur first. The order of events will be::
4860+
4861+         consumer.registerProducer(p, streaming)
4862+          (if streaming == False)::
4863+           consumer does p.resumeProducing()
4864+            consumer.write(data)
4865+           consumer does p.resumeProducing()
4866+            consumer.write(data).. (repeat until all data is written)
4867+         consumer.unregisterProducer()
4868+         deferred.callback(consumer)
4869+
4870+        If a download error occurs, or an exception is raised by
4871+        consumer.registerProducer() or consumer.write(), I will call
4872+        consumer.unregisterProducer() and then deliver the exception via
4873+        deferred.errback(). To cancel the download, the consumer should call
4874+        p.stopProducing(), which will result in an exception being delivered
4875+        via deferred.errback().
4876+
4877+        See src/allmydata/util/consumer.py for an example of a simple
4878+        download-to-memory consumer.
4879+        """
4880+
4881+
4882+class IWritable(Interface):
4883+    """
4884+    I define methods that callers can use to update SDMF and MDMF
4885+    mutable files on a Tahoe-LAFS grid.
4886+    """
4887+    # XXX: For the moment, we have only this. It is possible that we
4888+    #      want to move overwrite() and modify() in here too.
4889+    def update(data, offset):
4890+        """
4891+        I write the data from my data argument to the MDMF file,
4892+        starting at offset. I continue writing data until my data
4893+        argument is exhausted, appending data to the file as necessary.
4894+        """
4895+        # assert IMutableUploadable.providedBy(data)
4896+        # to append data: offset=node.get_size_of_best_version()
4897+        # do we want to support compacting MDMF?
4898+        # for an MDMF file, this can be done with O(data.get_size())
4899+        # memory. For an SDMF file, any modification takes
4900+        # O(node.get_size_of_best_version()).
4901+
4902+
4903+class IMutableFileVersion(IReadable):
4904+    """I provide access to a particular version of a mutable file. The
4905+    access is read/write if I was obtained from a filenode derived from
4906+    a write cap, or read-only if the filenode was derived from a read cap.
4907+    """
4908+
4909+    def get_sequence_number():
4910+        """Return the sequence number of this version."""
4911+
4912+    def get_servermap():
4913+        """Return the IMutableFileServerMap instance that was used to create
4914+        this object.
4915+        """
4916+
4917+    def get_writekey():
4918+        """Return this filenode's writekey, or None if the node does not have
4919+        write-capability. This may be used to assist with data structures
4920+        that need to make certain data available only to writers, such as the
4921+        read-write child caps in dirnodes. The recommended process is to have
4922+        reader-visible data be submitted to the filenode in the clear (where
4923+        it will be encrypted by the filenode using the readkey), but encrypt
4924+        writer-visible data using this writekey.
4925+        """
4926+
4927+    # TODO: Can this be overwrite instead of replace?
4928+    def replace(new_contents):
4929+        """Replace the contents of the mutable file, provided that no other
4930+        node has published (or is attempting to publish, concurrently) a
4931+        newer version of the file than this one.
4932+
4933+        I will avoid modifying any share that is different than the version
4934+        given by get_sequence_number(). However, if another node is writing
4935+        to the file at the same time as me, I may manage to update some shares
4936+        while they update others. If I see any evidence of this, I will signal
4937+        UncoordinatedWriteError, and the file will be left in an inconsistent
4938+        state (possibly the version you provided, possibly the old version,
4939+        possibly somebody else's version, and possibly a mix of shares from
4940+        all of these).
4941+
4942+        The recommended response to UncoordinatedWriteError is to either
4943+        return it to the caller (since they failed to coordinate their
4944+        writes), or to attempt some sort of recovery. It may be sufficient to
4945+        wait a random interval (with exponential backoff) and repeat your
4946+        operation. If I do not signal UncoordinatedWriteError, then I was
4947+        able to write the new version without incident.
4948+
4949+        I return a Deferred that fires (with a PublishStatus object) when the
4950+        update has completed.
4951+        """
4952+
4953+    def modify(modifier_cb):
4954+        """Modify the contents of the file, by downloading this version,
4955+        applying the modifier function (or bound method), then uploading
4956+        the new version. This will succeed as long as no other node
4957+        publishes a version between the download and the upload.
4958+        I return a Deferred that fires (with a PublishStatus object) when
4959+        the update is complete.
4960+
4961+        The modifier callable will be given three arguments: a string (with
4962+        the old contents), a 'first_time' boolean, and a servermap. As with
4963+        download_to_data(), the old contents will be from this version,
4964+        but the modifier can use the servermap to make other decisions
4965+        (such as refusing to apply the delta if there are multiple parallel
4966+        versions, or if there is evidence of a newer unrecoverable version).
4967+        'first_time' will be True the first time the modifier is called,
4968+        and False on any subsequent calls.
4969+
4970+        The callable should return a string with the new contents. The
4971+        callable must be prepared to be called multiple times, and must
4972+        examine the input string to see if the change that it wants to make
4973+        is already present in the old version. If it does not need to make
4974+        any changes, it can either return None, or return its input string.
4975+
4976+        If the modifier raises an exception, it will be returned in the
4977+        errback.
4978+        """
4979+
4980+
4981 # The hierarchy looks like this:
4982 #  IFilesystemNode
4983 #   IFileNode
4984hunk ./src/allmydata/interfaces.py 754
4985     def raise_error():
4986         """Raise any error associated with this node."""
4987 
4988+    # XXX: These may not be appropriate outside the context of an IReadable.
4989     def get_size():
4990         """Return the length (in bytes) of the data this node represents. For
4991         directory nodes, I return the size of the backing store. I return
4992hunk ./src/allmydata/interfaces.py 771
4993 class IFileNode(IFilesystemNode):
4994     """I am a node which represents a file: a sequence of bytes. I am not a
4995     container, like IDirectoryNode."""
4996+    def get_best_readable_version():
4997+        """Return a Deferred that fires with an IReadable for the 'best'
4998+        available version of the file. The IReadable provides only read
4999+        access, even if this filenode was derived from a write cap.
5000 
5001hunk ./src/allmydata/interfaces.py 776
5002-class IImmutableFileNode(IFileNode):
5003-    def read(consumer, offset=0, size=None):
5004-        """Download a portion (possibly all) of the file's contents, making
5005-        them available to the given IConsumer. Return a Deferred that fires
5006-        (with the consumer) when the consumer is unregistered (either because
5007-        the last byte has been given to it, or because the consumer threw an
5008-        exception during write(), possibly because it no longer wants to
5009-        receive data). The portion downloaded will start at 'offset' and
5010-        contain 'size' bytes (or the remainder of the file if size==None).
5011-
5012-        The consumer will be used in non-streaming mode: an IPullProducer
5013-        will be attached to it.
5014+        For an immutable file, there is only one version. For a mutable
5015+        file, the 'best' version is the recoverable version with the
5016+        highest sequence number. If no uncoordinated writes have occurred,
5017+        and if enough shares are available, then this will be the most
5018+        recent version that has been uploaded. If no version is recoverable,
5019+        the Deferred will errback with an UnrecoverableFileError.
5020+        """
5021 
5022hunk ./src/allmydata/interfaces.py 784
5023-        The consumer will not receive data right away: several network trips
5024-        must occur first. The order of events will be::
5025+    def download_best_version():
5026+        """Download the contents of the version that would be returned
5027+        by get_best_readable_version(). This is equivalent to calling
5028+        download_to_data() on the IReadable given by that method.
5029 
5030hunk ./src/allmydata/interfaces.py 789
5031-         consumer.registerProducer(p, streaming)
5032-          (if streaming == False)::
5033-           consumer does p.resumeProducing()
5034-            consumer.write(data)
5035-           consumer does p.resumeProducing()
5036-            consumer.write(data).. (repeat until all data is written)
5037-         consumer.unregisterProducer()
5038-         deferred.callback(consumer)
5039+        I return a Deferred that fires with a byte string when the file
5040+        has been fully downloaded. To support streaming download, use
5041+        the 'read' method of IReadable. If no version is recoverable,
5042+        the Deferred will errback with an UnrecoverableFileError.
5043+        """
5044 
5045hunk ./src/allmydata/interfaces.py 795
5046-        If a download error occurs, or an exception is raised by
5047-        consumer.registerProducer() or consumer.write(), I will call
5048-        consumer.unregisterProducer() and then deliver the exception via
5049-        deferred.errback(). To cancel the download, the consumer should call
5050-        p.stopProducing(), which will result in an exception being delivered
5051-        via deferred.errback().
5052+    def get_size_of_best_version():
5053+        """Find the size of the version that would be returned by
5054+        get_best_readable_version().
5055 
5056hunk ./src/allmydata/interfaces.py 799
5057-        See src/allmydata/util/consumer.py for an example of a simple
5058-        download-to-memory consumer.
5059+        I return a Deferred that fires with an integer. If no version
5060+        is recoverable, the Deferred will errback with an
5061+        UnrecoverableFileError.
5062         """
5063 
5064hunk ./src/allmydata/interfaces.py 804
5065+
5066+class IImmutableFileNode(IFileNode, IReadable):
5067+    """I am a node representing an immutable file. Immutable files have
5068+    only one version."""
5069+
5070+
5071 class IMutableFileNode(IFileNode):
5072     """I provide access to a 'mutable file', which retains its identity
5073     regardless of what contents are put in it.
5074hunk ./src/allmydata/interfaces.py 869
5075     only be retrieved and updated all-at-once, as a single big string. Future
5076     versions of our mutable files will remove this restriction.
5077     """
5078-
5079-    def download_best_version():
5080-        """Download the 'best' available version of the file, meaning one of
5081-        the recoverable versions with the highest sequence number. If no
5082+    def get_best_mutable_version():
5083+        """Return a Deferred that fires with an IMutableFileVersion for
5084+        the 'best' available version of the file. The best version is
5085+        the recoverable version with the highest sequence number. If no
5086         uncoordinated writes have occurred, and if enough shares are
5087hunk ./src/allmydata/interfaces.py 874
5088-        available, then this will be the most recent version that has been
5089-        uploaded.
5090+        available, then this will be the most recent version that has
5091+        been uploaded.
5092 
5093hunk ./src/allmydata/interfaces.py 877
5094-        I update an internal servermap with MODE_READ, determine which
5095-        version of the file is indicated by
5096-        servermap.best_recoverable_version(), and return a Deferred that
5097-        fires with its contents. If no version is recoverable, the Deferred
5098-        will errback with UnrecoverableFileError.
5099-        """
5100-
5101-    def get_size_of_best_version():
5102-        """Find the size of the version that would be downloaded with
5103-        download_best_version(), without actually downloading the whole file.
5104-
5105-        I return a Deferred that fires with an integer.
5106+        If no version is recoverable, the Deferred will errback with an
5107+        UnrecoverableFileError.
5108         """
5109 
5110     def overwrite(new_contents):
5111hunk ./src/allmydata/interfaces.py 917
5112         errback.
5113         """
5114 
5115-
5116     def get_servermap(mode):
5117         """Return a Deferred that fires with an IMutableFileServerMap
5118         instance, updated using the given mode.
5119hunk ./src/allmydata/interfaces.py 970
5120         writer-visible data using this writekey.
5121         """
5122 
5123+    def set_version(version):
5124+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
5125+        we upload in SDMF for reasons of compatibility. If you want to
5126+        change this, set_version will let you do that.
5127+
5128+        To say that this file should be uploaded in SDMF, pass in a 0. To
5129+        say that the file should be uploaded as MDMF, pass in a 1.
5130+        """
5131+
5132+    def get_version():
5133+        """Returns the mutable file protocol version."""
5134+
5135 class NotEnoughSharesError(Exception):
5136     """Download was unable to get enough shares"""
5137 
5138hunk ./src/allmydata/interfaces.py 1786
5139         """The upload is finished, and whatever filehandle was in use may be
5140         closed."""
5141 
5142+
5143+class IMutableUploadable(Interface):
5144+    """
5145+    I represent content that is due to be uploaded to a mutable filecap.
5146+    """
5147+    # This is somewhat simpler than the IUploadable interface above
5148+    # because mutable files do not need to be concerned with possibly
5149+    # generating a CHK, nor with per-file keys. It is a subset of the
5150+    # methods in IUploadable, though, so we could just as well implement
5151+    # the mutable uploadables as IUploadables that don't happen to use
5152+    # those methods (with the understanding that the unused methods will
5153+    # never be called on such objects)
5154+    def get_size():
5155+        """
5156+        Returns the size (in bytes) of the content held by the
5157+        uploadable.
5158+        """
5159+
5160+    def read(length):
5161+        """
5162+        Returns a list of strings which, when concatenated, are the next
5163+        length bytes of the file, or fewer if there are fewer bytes
5164+        between the current location and the end of the file.
5165+        """
5166+
5167+    def close():
5168+        """
5169+        The process that used the Uploadable is finished using it, so
5170+        the uploadable may be closed.
5171+        """
5172+
5173 class IUploadResults(Interface):
5174     """I am returned by upload() methods. I contain a number of public
5175     attributes which can be read to determine the results of the upload. Some
5176}
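The non-streaming event order documented for IReadable.read() can be exercised end-to-end with toy objects. MemoryConsumer and ListProducer below are illustrative stand-ins, not the real Twisted or Tahoe classes; the shipped download-to-memory consumer lives in src/allmydata/util/consumer.py:

```python
class MemoryConsumer(object):
    """Toy download-to-memory consumer, a simplified stand-in for the one
    in src/allmydata/util/consumer.py. Follows the non-streaming event
    order documented in IReadable.read()."""
    def __init__(self):
        self.chunks = []
        self.done = False

    def registerProducer(self, producer, streaming):
        # non-streaming mode: an IPullProducer is attached, and the
        # consumer drives it by calling resumeProducing() repeatedly
        assert not streaming
        while not self.done:
            producer.resumeProducing()

    def write(self, data):
        self.chunks.append(data)

    def unregisterProducer(self):
        self.done = True


class ListProducer(object):
    """Toy IPullProducer that hands out pre-chunked data, then
    unregisters itself from the consumer."""
    def __init__(self, chunks, consumer):
        self.chunks = list(chunks)
        self.consumer = consumer

    def resumeProducing(self):
        if self.chunks:
            self.consumer.write(self.chunks.pop(0))
        else:
            self.consumer.unregisterProducer()

    def stopProducing(self):
        # a real consumer would call this to cancel the download
        self.chunks = []


consumer = MemoryConsumer()
producer = ListProducer([b"hello ", b"world"], consumer)
consumer.registerProducer(producer, streaming=False)
downloaded = b"".join(consumer.chunks)
```

This mirrors the documented sequence: registerProducer, alternating resumeProducing/write until the data is exhausted, then unregisterProducer before the Deferred would fire.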
5177[frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
5178Kevan Carstensen <kevan@isnotajoke.com>**20100809233535
5179 Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f
5180] {
5181hunk ./src/allmydata/frontends/sftpd.py 33
5182 from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \
5183      NoSuchChildError, ChildOfWrongTypeError
5184 from allmydata.mutable.common import NotWriteableError
5185+from allmydata.mutable.publish import MutableFileHandle
5186 from allmydata.immutable.upload import FileHandle
5187 from allmydata.dirnode import update_metadata
5188 from allmydata.util.fileutil import EncryptedTemporaryFile
5189hunk ./src/allmydata/frontends/sftpd.py 664
5190         else:
5191             assert IFileNode.providedBy(filenode), filenode
5192 
5193-            if filenode.is_mutable():
5194-                self.async.addCallback(lambda ign: filenode.download_best_version())
5195-                def _downloaded(data):
5196-                    self.consumer = OverwriteableFileConsumer(len(data), tempfile_maker)
5197-                    self.consumer.write(data)
5198-                    self.consumer.finish()
5199-                    return None
5200-                self.async.addCallback(_downloaded)
5201-            else:
5202-                download_size = filenode.get_size()
5203-                assert download_size is not None, "download_size is None"
5204+            self.async.addCallback(lambda ignored: filenode.get_best_readable_version())
5205+
5206+            def _read(version):
5207+                if noisy: self.log("_read", level=NOISY)
5208+                download_size = version.get_size()
5209+                assert download_size is not None
5210+
5211                 self.consumer = OverwriteableFileConsumer(download_size, tempfile_maker)
5212hunk ./src/allmydata/frontends/sftpd.py 672
5213-                def _read(ign):
5214-                    if noisy: self.log("_read immutable", level=NOISY)
5215-                    filenode.read(self.consumer, 0, None)
5216-                self.async.addCallback(_read)
5217+
5218+                version.read(self.consumer, 0, None)
5219+            self.async.addCallback(_read)
5220 
5221         eventually(self.async.callback, None)
5222 
5223hunk ./src/allmydata/frontends/sftpd.py 818
5224                     assert parent and childname, (parent, childname, self.metadata)
5225                     d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata))
5226 
5227-                d2.addCallback(lambda ign: self.consumer.get_current_size())
5228-                d2.addCallback(lambda size: self.consumer.read(0, size))
5229-                d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents))
5230+                d2.addCallback(lambda ign: self.filenode.overwrite(MutableFileHandle(self.consumer.get_file())))
5231             else:
5232                 def _add_file(ign):
5233                     self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL)
5234}
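The sftpd change above stops round-tripping the whole file through a string (get_current_size/read/overwrite) and instead hands the publisher a filehandle wrapper. A sketch of the IMutableUploadable contract that makes this work — get_size(), read(length) returning a list of strings, close() — over an ordinary file object. FileHandleUploadable is a hypothetical name, loosely modeled on the MutableFileHandle imported from mutable.publish, whose internals are defined elsewhere in this bundle:

```python
import os
from io import BytesIO

class FileHandleUploadable(object):
    """Sketch of an IMutableUploadable-style wrapper over a seekable
    filehandle (illustrative, not the real MutableFileHandle)."""
    def __init__(self, filehandle):
        self._f = filehandle
        # measure the size once, then rewind so reads start at byte 0
        self._f.seek(0, os.SEEK_END)
        self._size = self._f.tell()
        self._f.seek(0)

    def get_size(self):
        return self._size

    def read(self, length):
        # IMutableUploadable.read returns a *list* of strings
        return [self._f.read(length)]

    def close(self):
        self._f.close()


u = FileHandleUploadable(BytesIO(b"some mutable contents"))
size = u.get_size()          # size known without reading it all into one string
first = b"".join(u.read(4))  # b"some"
```

The memory win is exactly this: the publisher pulls length-bounded chunks on demand instead of holding the entire new contents as a single string.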
5235[nodemaker.py: Make nodemaker expose a way to create MDMF files
5236Kevan Carstensen <kevan@isnotajoke.com>**20100809233623
5237 Ignore-this: a8a7c4283bb94be9fabb6fe3f2ca54b6
5238] {
5239hunk ./src/allmydata/nodemaker.py 3
5240 import weakref
5241 from zope.interface import implements
5242-from allmydata.interfaces import INodeMaker
5243+from allmydata.util.assertutil import precondition
5244+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
5245+                                 SDMF_VERSION, MDMF_VERSION
5246 from allmydata.immutable.literal import LiteralFileNode
5247 from allmydata.immutable.filenode import ImmutableFileNode, CiphertextFileNode
5248 from allmydata.immutable.upload import Data
5249hunk ./src/allmydata/nodemaker.py 10
5250 from allmydata.mutable.filenode import MutableFileNode
5251+from allmydata.mutable.publish import MutableData
5252 from allmydata.dirnode import DirectoryNode, pack_children
5253 from allmydata.unknown import UnknownNode
5254 from allmydata import uri
5255hunk ./src/allmydata/nodemaker.py 93
5256             return self._create_dirnode(filenode)
5257         return None
5258 
5259-    def create_mutable_file(self, contents=None, keysize=None):
5260+    def create_mutable_file(self, contents=None, keysize=None,
5261+                            version=SDMF_VERSION):
5262         n = MutableFileNode(self.storage_broker, self.secret_holder,
5263                             self.default_encoding_parameters, self.history)
5264hunk ./src/allmydata/nodemaker.py 97
5265+        n.set_version(version)
5266         d = self.key_generator.generate(keysize)
5267         d.addCallback(n.create_with_keys, contents)
5268         d.addCallback(lambda res: n)
5269hunk ./src/allmydata/nodemaker.py 103
5270         return d
5271 
5272-    def create_new_mutable_directory(self, initial_children={}):
5273+    def create_new_mutable_directory(self, initial_children={},
5274+                                     version=SDMF_VERSION):
5275+        # initial_children must have metadata (i.e. {} instead of None)
5276+        for (name, (node, metadata)) in initial_children.iteritems():
5277+            precondition(isinstance(metadata, dict),
5278+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
5279+            node.raise_error()
5280         d = self.create_mutable_file(lambda n:
5281hunk ./src/allmydata/nodemaker.py 111
5282-                                     pack_children(initial_children, n.get_writekey()))
5283+                                     MutableData(pack_children(initial_children,
5284+                                                    n.get_writekey())),
5285+                                     version)
5286         d.addCallback(self._create_dirnode)
5287         return d
5288 
5289}
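The nodemaker change threads a version argument through mutable-file creation. A toy sketch of that dispatch, using the 0/1 values documented in IMutableFileNode.set_version (FakeMutableFileNode is an illustrative stand-in, not the real MutableFileNode):

```python
SDMF_VERSION = 0  # values as documented in IMutableFileNode.set_version
MDMF_VERSION = 1

class FakeMutableFileNode(object):
    """Stand-in for MutableFileNode: just records the protocol version
    the nodemaker asked for."""
    def __init__(self):
        self._version = None

    def set_version(self, version):
        assert version in (SDMF_VERSION, MDMF_VERSION)
        self._version = version

    def get_version(self):
        return self._version


def create_mutable_file(contents=None, version=SDMF_VERSION):
    # mirrors the NodeMaker change: SDMF stays the default for
    # compatibility; MDMF only when explicitly requested
    n = FakeMutableFileNode()
    n.set_version(version)
    return n


default_node = create_mutable_file()
mdmf_node = create_mutable_file(version=MDMF_VERSION)
```

Directory creation follows the same pattern: create_new_mutable_directory packs the children and forwards its version argument to create_mutable_file.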
5290[web: Alter the webapi to get along with and take advantage of the MDMF changes
5291Kevan Carstensen <kevan@isnotajoke.com>**20100809233755
5292 Ignore-this: 724e169319427bb130c1331b30f92686
5293 
5294 The main benefit that the webapi gets from MDMF, at least initially, is
5295 the ability to do a streaming download of an MDMF mutable file. It also
5296 exposes a way (through the PUT verb) to append to or otherwise modify
5297 (in-place) an MDMF mutable file.
5298] {
5299hunk ./src/allmydata/web/common.py 34
5300     else:
5301         return boolean_of_arg(replace)
5302 
5303+
5304+def parse_offset_arg(offset):
5305+    # XXX: This will raise a ValueError when invoked on something that
5306+    # is not an integer. Is that okay? Or do we want a better error
5307+    # message? Since this call is going to be used by programmers and
5308+    # their tools rather than users (through the wui), it is not
5309+    # inconsistent to return that, I guess.
5310+    offset = int(offset)
5311+    return offset
5312+
5313+
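A standalone sketch of the behavior the new `parse_offset_arg` helper gives the webapi (re-implemented here for illustration, in Python 3 syntax): a missing `offset` argument defaults to `-1`, meaning "replace the whole file"; anything else must parse as an integer, and a malformed value raises the `ValueError` the XXX comment above mentions.

```python
def parse_offset_arg(offset):
    # Mirrors the helper added in web/common.py: the caller passes -1
    # when no offset argument was supplied, otherwise the raw string
    # from the query; non-integers raise ValueError.
    return int(offset)

print(parse_offset_arg("512"))   # 512: client asked to write at byte 512
print(parse_offset_arg(-1))      # -1: the default, i.e. full replace
try:
    parse_offset_arg("ten")
except ValueError:
    print("ValueError")          # malformed offsets are rejected
```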
5314 def get_root(ctx_or_req):
5315     req = IRequest(ctx_or_req)
5316     # the addSlash=True gives us one extra (empty) segment
5317hunk ./src/allmydata/web/filenode.py 12
5318 from allmydata.interfaces import ExistingChildError
5319 from allmydata.monitor import Monitor
5320 from allmydata.immutable.upload import FileHandle
5321+from allmydata.mutable.publish import MutableFileHandle
5322 from allmydata.util import log, base32
5323 
5324 from allmydata.web.common import text_plain, WebError, RenderMixin, \
5325hunk ./src/allmydata/web/filenode.py 17
5326      boolean_of_arg, get_arg, should_create_intermediate_directories, \
5327-     MyExceptionHandler, parse_replace_arg
5328+     MyExceptionHandler, parse_replace_arg, parse_offset_arg
5329 from allmydata.web.check_results import CheckResults, \
5330      CheckAndRepairResults, LiteralCheckResults
5331 from allmydata.web.info import MoreInfo
5332hunk ./src/allmydata/web/filenode.py 27
5333         # a new file is being uploaded in our place.
5334         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
5335         if mutable:
5336-            req.content.seek(0)
5337-            data = req.content.read()
5338+            data = MutableFileHandle(req.content)
5339             d = client.create_mutable_file(data)
5340             def _uploaded(newnode):
5341                 d2 = self.parentnode.set_node(self.name, newnode,
5342hunk ./src/allmydata/web/filenode.py 61
5343         d.addCallback(lambda res: childnode.get_uri())
5344         return d
5345 
5346-    def _read_data_from_formpost(self, req):
5347-        # SDMF: files are small, and we can only upload data, so we read
5348-        # the whole file into memory before uploading.
5349-        contents = req.fields["file"]
5350-        contents.file.seek(0)
5351-        data = contents.file.read()
5352-        return data
5353 
5354     def replace_me_with_a_formpost(self, req, client, replace):
5355         # create a new file, maybe mutable, maybe immutable
5356hunk ./src/allmydata/web/filenode.py 66
5357         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
5358 
5359+        # get the file contents from the form
5360+        contents = req.fields["file"]
5361         if mutable:
5362hunk ./src/allmydata/web/filenode.py 69
5363-            data = self._read_data_from_formpost(req)
5364-            d = client.create_mutable_file(data)
5365+            uploadable = MutableFileHandle(contents.file)
5366+            d = client.create_mutable_file(uploadable)
5367             def _uploaded(newnode):
5368                 d2 = self.parentnode.set_node(self.name, newnode,
5369                                               overwrite=replace)
5370hunk ./src/allmydata/web/filenode.py 78
5371                 return d2
5372             d.addCallback(_uploaded)
5373             return d
5374-        # create an immutable file
5375-        contents = req.fields["file"]
5376+
5377         uploadable = FileHandle(contents.file, convergence=client.convergence)
5378         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
5379         d.addCallback(lambda newnode: newnode.get_uri())
5380hunk ./src/allmydata/web/filenode.py 84
5381         return d
5382 
5383+
5384 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
5385     def __init__(self, client, parentnode, name):
5386         rend.Page.__init__(self)
5387hunk ./src/allmydata/web/filenode.py 167
5388             # properly. So we assume that at least the browser will agree
5389             # with itself, and echo back the same bytes that we were given.
5390             filename = get_arg(req, "filename", self.name) or "unknown"
5391-            if self.node.is_mutable():
5392-                # some day: d = self.node.get_best_version()
5393-                d = makeMutableDownloadable(self.node)
5394-            else:
5395-                d = defer.succeed(self.node)
5396+            d = self.node.get_best_readable_version()
5397             d.addCallback(lambda dn: FileDownloader(dn, filename))
5398             return d
5399         if t == "json":
5400hunk ./src/allmydata/web/filenode.py 191
5401         if t:
5402             raise WebError("GET file: bad t=%s" % t)
5403         filename = get_arg(req, "filename", self.name) or "unknown"
5404-        if self.node.is_mutable():
5405-            # some day: d = self.node.get_best_version()
5406-            d = makeMutableDownloadable(self.node)
5407-        else:
5408-            d = defer.succeed(self.node)
5409+        d = self.node.get_best_readable_version()
5410         d.addCallback(lambda dn: FileDownloader(dn, filename))
5411         return d
5412 
5413hunk ./src/allmydata/web/filenode.py 199
5414         req = IRequest(ctx)
5415         t = get_arg(req, "t", "").strip()
5416         replace = parse_replace_arg(get_arg(req, "replace", "true"))
5417+        offset = parse_offset_arg(get_arg(req, "offset", -1))
5418 
5419         if not t:
5420hunk ./src/allmydata/web/filenode.py 202
5421-            if self.node.is_mutable():
5422+            if self.node.is_mutable() and offset >= 0:
5423+                return self.update_my_contents(req, offset)
5424+
5425+            elif self.node.is_mutable():
5426                 return self.replace_my_contents(req)
5427             if not replace:
5428                 # this is the early trap: if someone else modifies the
5429hunk ./src/allmydata/web/filenode.py 212
5430                 # directory while we're uploading, the add_file(overwrite=)
5431                 # call in replace_me_with_a_child will do the late trap.
5432                 raise ExistingChildError()
5433+            if offset >= 0:
5434+                raise WebError("PUT to a file: append operation invoked "
5435+                               "on an immutable cap")
5436+
5437+
5438             assert self.parentnode and self.name
5439             return self.replace_me_with_a_child(req, self.client, replace)
5440         if t == "uri":
5441hunk ./src/allmydata/web/filenode.py 279
5442 
5443     def replace_my_contents(self, req):
5444         req.content.seek(0)
5445-        new_contents = req.content.read()
5446+        new_contents = MutableFileHandle(req.content)
5447         d = self.node.overwrite(new_contents)
5448         d.addCallback(lambda res: self.node.get_uri())
5449         return d
5450hunk ./src/allmydata/web/filenode.py 284
5451 
5452+
5453+    def update_my_contents(self, req, offset):
5454+        req.content.seek(0)
5455+        added_contents = MutableFileHandle(req.content)
5456+
5457+        d = self.node.get_best_mutable_version()
5458+        d.addCallback(lambda mv:
5459+            mv.update(added_contents, offset))
5460+        d.addCallback(lambda ignored:
5461+            self.node.get_uri())
5462+        return d
5463+
5464+
5465     def replace_my_contents_with_a_formpost(self, req):
5466         # we have a mutable file. Get the data from the formpost, and replace
5467         # the mutable file's contents with it.
5468hunk ./src/allmydata/web/filenode.py 300
5469-        new_contents = self._read_data_from_formpost(req)
5470+        new_contents = req.fields['file']
5471+        new_contents = MutableFileHandle(new_contents.file)
5472+
5473         d = self.node.overwrite(new_contents)
5474         d.addCallback(lambda res: self.node.get_uri())
5475         return d
5476hunk ./src/allmydata/web/filenode.py 307
5477 
5478-class MutableDownloadable:
5479-    #implements(IDownloadable)
5480-    def __init__(self, size, node):
5481-        self.size = size
5482-        self.node = node
5483-    def get_size(self):
5484-        return self.size
5485-    def is_mutable(self):
5486-        return True
5487-    def read(self, consumer, offset=0, size=None):
5488-        d = self.node.download_best_version()
5489-        d.addCallback(self._got_data, consumer, offset, size)
5490-        return d
5491-    def _got_data(self, contents, consumer, offset, size):
5492-        start = offset
5493-        if size is not None:
5494-            end = offset+size
5495-        else:
5496-            end = self.size
5497-        # SDMF: we can write the whole file in one big chunk
5498-        consumer.write(contents[start:end])
5499-        return consumer
5500-
5501-def makeMutableDownloadable(n):
5502-    d = defer.maybeDeferred(n.get_size_of_best_version)
5503-    d.addCallback(MutableDownloadable, n)
5504-    return d
5505 
5506 class FileDownloader(rend.Page):
5507     # since we override the rendering process (to let the tahoe Downloader
5508hunk ./src/allmydata/web/unlinked.py 7
5509 from twisted.internet import defer
5510 from nevow import rend, url, tags as T
5511 from allmydata.immutable.upload import FileHandle
5512+from allmydata.mutable.publish import MutableFileHandle
5513 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
5514      convert_children_json, WebError
5515 from allmydata.web import status
5516hunk ./src/allmydata/web/unlinked.py 23
5517 def PUTUnlinkedSSK(req, client):
5518     # SDMF: files are small, and we can only upload data
5519     req.content.seek(0)
5520-    data = req.content.read()
5521+    data = MutableFileHandle(req.content)
5522     d = client.create_mutable_file(data)
5523     d.addCallback(lambda n: n.get_uri())
5524     return d
5525hunk ./src/allmydata/web/unlinked.py 87
5526     # "POST /uri", to create an unlinked file.
5527     # SDMF: files are small, and we can only upload data
5528     contents = req.fields["file"]
5529-    contents.file.seek(0)
5530-    data = contents.file.read()
5531+    data = MutableFileHandle(contents.file)
5532     d = client.create_mutable_file(data)
5533     d.addCallback(lambda n: n.get_uri())
5534     return d
5535}
5536[mutable/layout.py and interfaces.py: add MDMF writer and reader
5537Kevan Carstensen <kevan@isnotajoke.com>**20100809234004
5538 Ignore-this: 90db36ee3318dbbd4397baebc6014f86
5539 
5540 The MDMF writer is responsible for keeping state as plaintext is
5541 gradually processed into share data by the upload process. When the
5542 upload finishes, it will write all of its share data to a remote server,
5543 reporting its status back to the publisher.
5544 
5545 The MDMF reader is responsible for abstracting an MDMF file as it sits
5546 on the grid from the downloader; specifically, by receiving and
5547 responding to requests for arbitrary data within the MDMF file.
5548 
5549 The interfaces.py file has also been modified to contain an interface
5550 for the writer.
5551] {
5552hunk ./src/allmydata/interfaces.py 7
5553      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
5554 
5555 HASH_SIZE=32
5556+SALT_SIZE=16
5557+
5558+SDMF_VERSION=0
5559+MDMF_VERSION=1
5560 
5561 Hash = StringConstraint(maxLength=HASH_SIZE,
5562                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
5563hunk ./src/allmydata/interfaces.py 420
5564         """
5565 
5566 
5567+class IMutableSlotWriter(Interface):
5568+    """
5569+    The interface for a writer around a mutable slot on a remote server.
5570+    """
5571+    def set_checkstring(checkstring, *args):
5572+        """
5573+        Set the checkstring that I will pass to the remote server when
5574+        writing.
5575+
5576+            @param checkstring A packed checkstring to use.
5577+
5578+        Note that implementations can differ in which semantics they
5579+        wish to support for set_checkstring -- they can, for example,
5580+        build the checkstring themselves from its constituents, or
5581+        some other thing.
5582+        """
5583+
5584+    def get_checkstring():
5585+        """
5586+        Get the checkstring that I think currently exists on the remote
5587+        server.
5588+        """
5589+
5590+    def put_block(data, segnum, salt):
5591+        """
5592+        Add a block and salt to the share.
5593+        """
5594+
5595+    def put_encprivkey(encprivkey):
5596+        """
5597+        Add the encrypted private key to the share.
5598+        """
5599+
5600+    def put_blockhashes(blockhashes=list):
5601+        """
5602+        Add the block hash tree to the share.
5603+        """
5604+
5605+    def put_sharehashes(sharehashes=dict):
5606+        """
5607+        Add the share hash chain to the share.
5608+        """
5609+
5610+    def get_signable():
5611+        """
5612+        Return the part of the share that needs to be signed.
5613+        """
5614+
5615+    def put_signature(signature):
5616+        """
5617+        Add the signature to the share.
5618+        """
5619+
5620+    def put_verification_key(verification_key):
5621+        """
5622+        Add the verification key to the share.
5623+        """
5624+
5625+    def finish_publishing():
5626+        """
5627+        Do anything necessary to finish writing the share to a remote
5628+        server. I require that no further publishing needs to take place
5629+        after this method has been called.
5630+        """
5631+
5632+
5633 class IURI(Interface):
5634     def init_from_string(uri):
5635         """Accept a string (as created by my to_string() method) and populate
5636hunk ./src/allmydata/mutable/layout.py 4
5637 
5638 import struct
5639 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
5640+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
5641+                                 MDMF_VERSION, IMutableSlotWriter
5642+from allmydata.util import mathutil, observer
5643+from twisted.python import failure
5644+from twisted.internet import defer
5645+from zope.interface import implements
5646+
5647+
5648+# These strings describe the format of the packed structs they help process
5649+# Here's what they mean:
5650+#
5651+#  PREFIX:
5652+#    >: Big-endian byte order; the most significant byte is first (leftmost).
5653+#    B: The version information; an 8 bit version identifier. Stored as
5654+#       an unsigned char. This is currently 00 00 00 00; our modifications
5655+#       will turn it into 00 00 00 01.
5656+#    Q: The sequence number; this is sort of like a revision history for
5657+#       mutable files; they start at 1 and increase as they are changed after
5658+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
5659+#       length.
5660+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
5661+#       characters = 32 bytes to store the value.
5662+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
5663+#       16 characters.
5664+#
5665+#  SIGNED_PREFIX additions, things that are covered by the signature:
5666+#    B: The "k" encoding parameter. We store this as an 8-bit character,
5667+#       which is convenient because our erasure coding scheme cannot
5668+#       encode if you ask for more than 255 pieces.
5669+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
5670+#       same reasons as above.
5671+#    Q: The segment size of the uploaded file. This will essentially be the
5672+#       length of the file in SDMF. An unsigned long long, so we can store
5673+#       files of quite large size.
5674+#    Q: The data length of the uploaded file. Modulo padding, this will be
5675+#       the same as the segment size field. Like the segment size field, it is
5676+#       an unsigned long long and can be quite large.
5677+#
5678+#   HEADER additions:
5679+#     L: The offset of the signature. An unsigned long.
5680+#     L: The offset of the share hash chain. An unsigned long.
5681+#     L: The offset of the block hash tree. An unsigned long.
5682+#     L: The offset of the share data. An unsigned long.
5683+#     Q: The offset of the encrypted private key. An unsigned long long, to
5684+#        account for the possibility of a lot of share data.
5685+#     Q: The offset of the EOF. An unsigned long long, to account for the
5686+#        possibility of a lot of share data.
5687+#
5688+#  After all of these, we have the following:
5689+#    - The verification key: Occupies the space between the end of the header
5690+#      and the start of the signature (i.e. data[HEADER_LENGTH:o['signature']]).
5691+#    - The signature, which goes from the signature offset to the share hash
5692+#      chain offset.
5693+#    - The share hash chain, which goes from the share hash chain offset to
5694+#      the block hash tree offset.
5695+#    - The share data, which goes from the share data offset to the encrypted
5696+#      private key offset.
5697+#    - The encrypted private key offset, which goes until the end of the file.
5698+#
5699+#  The block hash tree in this encoding has only one share, so the offset of
5700+#  the share data will be 32 bits more than the offset of the block hash tree.
5701+#  Given this, we may need to check to see how many bytes a reasonably sized
5702+#  block hash tree will take up.
5703 
5704 PREFIX = ">BQ32s16s" # each version has a different prefix
5705 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
5706hunk ./src/allmydata/mutable/layout.py 73
5707 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
5708 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
5709 HEADER_LENGTH = struct.calcsize(HEADER)
5710+OFFSETS = ">LLLLQQ"
5711+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
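The field-by-field walkthrough in the comment block above can be sanity-checked with `struct.calcsize`; the following sketch (Python 3, format strings copied from the constants above) shows the sizes these formats resolve to. With `">"` (big-endian, no padding) each size is just the sum of the field widths.

```python
import struct

PREFIX = ">BQ32s16s"             # version, seqnum, root hash, IV/salt
SIGNED_PREFIX = ">BQ32s16s BBQQ" # adds k, N, segment size, data length
HEADER = ">BQ32s16s BBQQ LLLLQQ" # adds the six offset fields
OFFSETS = ">LLLLQQ"              # just the offset fields

print(struct.calcsize(PREFIX))         # 57 = 1 + 8 + 32 + 16
print(struct.calcsize(SIGNED_PREFIX))  # 75 = 57 + 1 + 1 + 8 + 8
print(struct.calcsize(HEADER))         # 107 = 75 + 4*4 + 8 + 8
print(struct.calcsize(OFFSETS))        # 32 = 4*4 + 8 + 8
```

(Whitespace inside a struct format string is ignored, which is why the constants above can include spaces for readability.)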
5712 
5713 def unpack_header(data):
5714     o = {}
5715hunk ./src/allmydata/mutable/layout.py 194
5716     return (share_hash_chain, block_hash_tree, share_data)
5717 
5718 
5719-def pack_checkstring(seqnum, root_hash, IV):
5720+def pack_checkstring(seqnum, root_hash, IV, version=0):
5721     return struct.pack(PREFIX,
5722hunk ./src/allmydata/mutable/layout.py 196
5723-                       0, # version,
5724+                       version,
5725                        seqnum,
5726                        root_hash,
5727                        IV)
5728hunk ./src/allmydata/mutable/layout.py 269
5729                            encprivkey])
5730     return final_share
5731 
5732+def pack_prefix(seqnum, root_hash, IV,
5733+                required_shares, total_shares,
5734+                segment_size, data_length):
5735+    prefix = struct.pack(SIGNED_PREFIX,
5736+                         0, # version,
5737+                         seqnum,
5738+                         root_hash,
5739+                         IV,
5740+                         required_shares,
5741+                         total_shares,
5742+                         segment_size,
5743+                         data_length,
5744+                         )
5745+    return prefix
5746+
5747+
5748+class SDMFSlotWriteProxy:
5749+    implements(IMutableSlotWriter)
5750+    """
5751+    I represent a remote write slot for an SDMF mutable file. I build a
5752+    share in memory, and then write it in one piece to the remote
5753+    server. This mimics how SDMF shares were built before MDMF (and the
5754+    new MDMF uploader), but provides that functionality in a way that
5755+    allows the MDMF uploader to be built without much special-casing for
5756+    file format, which makes the uploader code more readable.
5757+    """
5758+    def __init__(self,
5759+                 shnum,
5760+                 rref, # a remote reference to a storage server
5761+                 storage_index,
5762+                 secrets, # (write_enabler, renew_secret, cancel_secret)
5763+                 seqnum, # the sequence number of the mutable file
5764+                 required_shares,
5765+                 total_shares,
5766+                 segment_size,
5767+                 data_length): # the length of the original file
5768+        self.shnum = shnum
5769+        self._rref = rref
5770+        self._storage_index = storage_index
5771+        self._secrets = secrets
5772+        self._seqnum = seqnum
5773+        self._required_shares = required_shares
5774+        self._total_shares = total_shares
5775+        self._segment_size = segment_size
5776+        self._data_length = data_length
5777+
5778+        # This is an SDMF file, so it should have only one segment, so,
5779+        # modulo padding of the data length, the segment size and the
5780+        # data length should be the same.
5781+        expected_segment_size = mathutil.next_multiple(data_length,
5782+                                                       self._required_shares)
5783+        assert expected_segment_size == segment_size
5784+
5785+        self._block_size = self._segment_size / self._required_shares
5786+
5787+        # This is meant to mimic how SDMF files were built before MDMF
5788+        # entered the picture: we generate each share in its entirety,
5789+        # then push it off to the storage server in one write. When
5790+        # callers call set_*, they are just populating this dict.
5791+        # finish_publishing will stitch these pieces together into a
5792+        # coherent share, and then write the coherent share to the
5793+        # storage server.
5794+        self._share_pieces = {}
5795+
5796+        # This tells the write logic what checkstring to use when
5797+        # writing remote shares.
5798+        self._testvs = []
5799+
5800+        self._readvs = [(0, struct.calcsize(PREFIX))]
5801+
5802+
5803+    def set_checkstring(self, checkstring_or_seqnum,
5804+                              root_hash=None,
5805+                              salt=None):
5806+        """
5807+        Set the checkstring that I will pass to the remote server when
5808+        writing.
5809+
5810+            @param checkstring_or_seqnum: A packed checkstring to use,
5811+                   or a sequence number. I will treat this as a checkstring if I am not also given a root_hash and salt.
5812+
5813+        Note that implementations can differ in which semantics they
5814+        wish to support for set_checkstring -- they can, for example,
5815+        build the checkstring themselves from its constituents, or
5816+        some other thing.
5817+        """
5818+        if root_hash and salt:
5819+            checkstring = struct.pack(PREFIX,
5820+                                      0,
5821+                                      checkstring_or_seqnum,
5822+                                      root_hash,
5823+                                      salt)
5824+        else:
5825+            checkstring = checkstring_or_seqnum
5826+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
5827+
5828+
5829+    def get_checkstring(self):
5830+        """
5831+        Get the checkstring that I think currently exists on the remote
5832+        server.
5833+        """
5834+        if self._testvs:
5835+            return self._testvs[0][3]
5836+        return ""
5837+
5838+
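As a sketch of what `set_checkstring` packs when it is handed the constituent parts (seqnum, root hash, salt) rather than a pre-built checkstring — the format string is the `PREFIX` constant from layout.py; the hash and salt values here are made up, and the code is Python 3 for runnability:

```python
import struct

PREFIX = ">BQ32s16s"  # version byte, seqnum, root hash, salt

def pack_checkstring(seqnum, root_hash, salt, version=0):
    # Same packing set_checkstring performs for the SDMF write proxy
    # when given the parts instead of a ready-made checkstring.
    return struct.pack(PREFIX, version, seqnum, root_hash, salt)

cs = pack_checkstring(3, b"\x11" * 32, b"\x22" * 16)
print(len(cs))  # 57: the checkstring covers exactly the PREFIX fields
# The resulting test vector would then be (0, 57, "eq", cs).
```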
5839+    def put_block(self, data, segnum, salt):
5840+        """
5841+        Add a block and salt to the share.
5842+        """
5843+        # SDMF files have only one segment
5844+        assert segnum == 0
5845+        assert len(data) == self._block_size
5846+        assert len(salt) == SALT_SIZE
5847+
5848+        self._share_pieces['sharedata'] = data
5849+        self._share_pieces['salt'] = salt
5850+
5851+        # TODO: Figure out something intelligent to return.
5852+        return defer.succeed(None)
5853+
5854+
5855+    def put_encprivkey(self, encprivkey):
5856+        """
5857+        Add the encrypted private key to the share.
5858+        """
5859+        self._share_pieces['encprivkey'] = encprivkey
5860+
5861+        return defer.succeed(None)
5862+
5863+
5864+    def put_blockhashes(self, blockhashes):
5865+        """
5866+        Add the block hash tree to the share.
5867+        """
5868+        assert isinstance(blockhashes, list)
5869+        for h in blockhashes:
5870+            assert len(h) == HASH_SIZE
5871+
5872+        # serialize the blockhashes, then set them.
5873+        blockhashes_s = "".join(blockhashes)
5874+        self._share_pieces['block_hash_tree'] = blockhashes_s
5875+
5876+        return defer.succeed(None)
5877+
5878+
5879+    def put_sharehashes(self, sharehashes):
5880+        """
5881+        Add the share hash chain to the share.
5882+        """
5883+        assert isinstance(sharehashes, dict)
5884+        for h in sharehashes.itervalues():
5885+            assert len(h) == HASH_SIZE
5886+
5887+        # serialize the sharehashes, then set them.
5888+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
5889+                                 for i in sorted(sharehashes.keys())])
5890+        self._share_pieces['share_hash_chain'] = sharehashes_s
5891+
5892+        return defer.succeed(None)
5893+
5894+
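A standalone sketch of the share-hash-chain serialization `put_sharehashes` performs above: each (share number, hash) pair is packed as a big-endian unsigned short plus the 32 hash bytes, with entries sorted by share number (Python 3 syntax; the hash values are made up).

```python
import struct

def serialize_sharehashes(sharehashes):
    # Same packing as put_sharehashes: ">H32s" per entry, entries
    # ordered by share number.
    return b"".join([struct.pack(">H32s", i, sharehashes[i])
                     for i in sorted(sharehashes.keys())])

chain = serialize_sharehashes({2: b"\xaa" * 32, 0: b"\xbb" * 32})
print(len(chain))  # 68: two entries of 2 + 32 bytes each
# share number 0 sorts first, so the chain starts with b"\x00\x00"
```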
5895+    def put_root_hash(self, root_hash):
5896+        """
5897+        Add the root hash to the share.
5898+        """
5899+        assert len(root_hash) == HASH_SIZE
5900+
5901+        self._share_pieces['root_hash'] = root_hash
5902+
5903+        return defer.succeed(None)
5904+
5905+
5906+    def put_salt(self, salt):
5907+        """
5908+        Add a salt to an empty SDMF file.
5909+        """
5910+        assert len(salt) == SALT_SIZE
5911+
5912+        self._share_pieces['salt'] = salt
5913+        self._share_pieces['sharedata'] = ""
5914+
5915+
5916+    def get_signable(self):
5917+        """
5918+        Return the part of the share that needs to be signed.
5919+
5920+        SDMF writers need to sign the packed representation of the
5921+        first eight fields of the remote share, that is:
5922+            - version number (0)
5923+            - sequence number
5924+            - root of the share hash tree
5925+            - salt
5926+            - k
5927+            - n
5928+            - segsize
5929+            - datalen
5930+
5931+        This method is responsible for returning that to callers.
5932+        """
5933+        return struct.pack(SIGNED_PREFIX,
5934+                           0,
5935+                           self._seqnum,
5936+                           self._share_pieces['root_hash'],
5937+                           self._share_pieces['salt'],
5938+                           self._required_shares,
5939+                           self._total_shares,
5940+                           self._segment_size,
5941+                           self._data_length)
5942+
5943+
5944+    def put_signature(self, signature):
5945+        """
5946+        Add the signature to the share.
5947+        """
5948+        self._share_pieces['signature'] = signature
5949+
5950+        return defer.succeed(None)
5951+
5952+
5953+    def put_verification_key(self, verification_key):
5954+        """
5955+        Add the verification key to the share.
5956+        """
5957+        self._share_pieces['verification_key'] = verification_key
5958+
5959+        return defer.succeed(None)
5960+
5961+
5962+    def get_verinfo(self):
5963+        """
5964+        I return my verinfo tuple. This is used by the ServermapUpdater
5965+        to keep track of versions of mutable files.
5966+
5967+        The verinfo tuple for MDMF files contains:
5968+            - seqnum
5969+            - root hash
5970+            - a blank (nothing)
5971+            - segsize
5972+            - datalen
5973+            - k
5974+            - n
5975+            - prefix (the thing that you sign)
5976+            - a tuple of offsets
5977+
5978+        We include the nonce in MDMF to simplify processing of version
5979+        information tuples.
5980+
5981+        The verinfo tuple for SDMF files is the same, but contains a
5982+        16-byte IV instead of a hash of salts.
5983+        """
5984+        return (self._seqnum,
5985+                self._share_pieces['root_hash'],
5986+                self._share_pieces['salt'],
5987+                self._segment_size,
5988+                self._data_length,
5989+                self._required_shares,
5990+                self._total_shares,
5991+                self.get_signable(),
5992+                self._get_offsets_tuple())
5993+
5994+    def _get_offsets_dict(self):
5995+        post_offset = HEADER_LENGTH
5996+        offsets = {}
5997+
5998+        verification_key_length = len(self._share_pieces['verification_key'])
5999+        o1 = offsets['signature'] = post_offset + verification_key_length
6000+
6001+        signature_length = len(self._share_pieces['signature'])
6002+        o2 = offsets['share_hash_chain'] = o1 + signature_length
6003+
6004+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
6005+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
6006+
6007+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
6008+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
6009+
6010+        share_data_length = len(self._share_pieces['sharedata'])
6011+        o5 = offsets['enc_privkey'] = o4 + share_data_length
6012+
6013+        encprivkey_length = len(self._share_pieces['encprivkey'])
6014+        offsets['EOF'] = o5 + encprivkey_length
6015+        return offsets
6016+
6017+
6018+    def _get_offsets_tuple(self):
6019+        offsets = self._get_offsets_dict()
6020+        return tuple([(key, value) for key, value in offsets.items()])
6021+
6022+
6023+    def _pack_offsets(self):
6024+        offsets = self._get_offsets_dict()
6025+        return struct.pack(">LLLLQQ",
6026+                           offsets['signature'],
6027+                           offsets['share_hash_chain'],
6028+                           offsets['block_hash_tree'],
6029+                           offsets['share_data'],
6030+                           offsets['enc_privkey'],
6031+                           offsets['EOF'])
6032+
6033+
6034+    def finish_publishing(self):
6035+        """
6036+        Do anything necessary to finish writing the share to a remote
6037+        server. I require that no further publishing needs to take place
6038+        after this method has been called.
6039+        """
6040+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
6041+                  "share_hash_chain", "block_hash_tree"]:
6042+            assert k in self._share_pieces
6043+        # This is the only method that actually writes something to the
6044+        # remote server.
6045+        # First, we need to pack the share into data that we can write
6046+        # to the remote server in one write.
6047+        offsets = self._pack_offsets()
6048+        prefix = self.get_signable()
6049+        final_share = "".join([prefix,
6050+                               offsets,
6051+                               self._share_pieces['verification_key'],
6052+                               self._share_pieces['signature'],
6053+                               self._share_pieces['share_hash_chain'],
6054+                               self._share_pieces['block_hash_tree'],
6055+                               self._share_pieces['sharedata'],
6056+                               self._share_pieces['encprivkey']])
6057+
6058+        # Our only data vector is going to be writing the final share,
6059+        # in its entirety.
6060+        datavs = [(0, final_share)]
6061+
6062+        if not self._testvs:
6063+            # Our caller has not provided us with another checkstring
6064+            # yet, so we assume that we are writing a new share, and set
6065+            # a test vector that will allow a new share to be written.
6066+            self._testvs = []
6067+            self._testvs.append(tuple([0, 1, "eq", ""]))
6068+            new_share = True
6069+
6070+        tw_vectors = {}
6071+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
6072+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
6073+                                     self._storage_index,
6074+                                     self._secrets,
6075+                                     tw_vectors,
6076+                                     # TODO is it useful to read something?
6077+                                     self._readvs)
6078+
6079+
6080+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
6081+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
6082+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
6083+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
6084+MDMFCHECKSTRING = ">BQ32s"
6085+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
6086+MDMFOFFSETS = ">QQQQQQ"
6087+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
6088+
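[Editor's note: the struct formats above imply the fixed sizes that the layout comment below relies on. A quick stdlib check, with values copied from this patch (illustrative only, not part of the patch):]

```python
import struct

# Formats copied from the constants above.
MDMFCHECKSTRING = ">BQ32s"                # version, seqnum, root hash
MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"   # the signable part only
MDMFOFFSETS = ">QQQQQQ"                   # six 8-byte offsets
MDMFHEADER = ">BQ32sBBQQ QQQQQQ"          # signable part + offsets

# The signable part is 59 bytes, so the offset table occupies
# bytes 59..106 and the whole header is 107 bytes.
assert struct.calcsize(MDMFCHECKSTRING) == 41
assert struct.calcsize(MDMFHEADERWITHOUTOFFSETS) == 59
assert struct.calcsize(MDMFOFFSETS) == 48
assert struct.calcsize(MDMFHEADER) == 107
```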
6089+class MDMFSlotWriteProxy:
6090+    implements(IMutableSlotWriter)
6091+
6092+    """
6093+    I represent a remote write slot for an MDMF mutable file.
6094+
6095+    I abstract away from my caller the details of block and salt
6096+    management, and the implementation of the on-disk format for MDMF
6097+    shares.
6098+    """
6099+    # Expected layout, MDMF:
6100+    # offset:     size:       name:
6101+    #-- signed part --
6102+    # 0           1           version number (01)
6103+    # 1           8           sequence number
6104+    # 9           32          share tree root hash
6105+    # 41          1           The "k" encoding parameter
6106+    # 42          1           The "N" encoding parameter
6107+    # 43          8           The segment size of the uploaded file
6108+    # 51          8           The data length of the original plaintext
6109+    #-- end signed part --
6110+    # 59          8           The offset of the encrypted private key
6111+    # 67          8           The offset of the block hash tree
6112+    # 75          8           The offset of the share hash chain
6113+    # 83          8           The offset of the signature
6114+    # 91          8           The offset of the verification key
6115+    # 99          8           The offset of the EOF
6116+    #
6117+    # followed by salts and share data, the encrypted private key, the
6118+    # block hash tree, the share hash chain, a
6119+    # signature over the first seven fields, and a verification key.
6120+    #
6121+    # The checkstring is the first three fields -- the version number,
6122+    # sequence number, and root hash. This is consistent
6123+    # in meaning to what we have with SDMF files, except now instead of
6124+    # using the literal salt, we use a value derived from all of the
6125+    # salts -- the share hash root.
6126+    #
6127+    # The salt is stored before the block for each segment. The block
6128+    # hash tree is computed over the combination of block and salt for
6129+    # each segment. In this way, we get integrity checking for both
6130+    # block and salt with the current block hash tree arrangement.
6131+    #
6132+    # The ordering of the offsets is different to reflect the dependencies
6133+    # that we'll run into with an MDMF file. The expected write flow is
6134+    # something like this:
6135+    #
6136+    #   0: Initialize with the sequence number, encoding parameters and
6137+    #      data length. From this, we can deduce the number of segments,
6138+    #      and where they should go. We can also figure out where the
6139+    #      encrypted private key should go, because we can figure out how
6140+    #      big the share data will be.
6141+    #
6142+    #   1: Encrypt, encode, and upload the file in chunks. Do something
6143+    #      like
6144+    #
6145+    #       put_block(data, segnum, salt)
6146+    #
6147+    #      to write a block and a salt to the disk. We can do both of
6148+    #      these operations now because we have enough of the offsets to
6149+    #      know where to put them.
6150+    #
6151+    #   2: Put the encrypted private key. Use:
6152+    #
6153+    #        put_encprivkey(encprivkey)
6154+    #
6155+    #      Now that we know the length of the private key, we can fill
6156+    #      in the offset for the block hash tree.
6157+    #
6158+    #   3: We're now in a position to upload the block hash tree for
6159+    #      a share. Put that using something like:
6160+    #       
6161+    #        put_blockhashes(block_hash_tree)
6162+    #
6163+    #      Note that block_hash_tree is a list of hashes -- we'll take
6164+    #      care of the details of serializing that appropriately. When
6165+    #      we get the block hash tree, we are also in a position to
6166+    #      calculate the offset for the share hash chain, and fill that
6167+    #      into the offsets table.
6168+    #
6180+    #   4: We're now in a position to upload the share hash chain for
6181+    #      a share. Do that with something like:
6182+    #
6183+    #        put_sharehashes(share_hash_chain)
6184+    #
6185+    #      share_hash_chain should be a dictionary mapping shnums to
6186+    #      32-byte hashes -- the wrapper handles serialization.
6187+    #      We'll know where to put the signature at this point, also.
6188+    #      The root of this tree will be put explicitly in the next
6189+    #      step.
6190+    #
6191+    #      TODO: Why? Why not just include it in the tree here?
6192+    #
6193+    #   5: Before putting the signature, we must first put the
6194+    #      root_hash. Do this with:
6195+    #
6196+    #        put_root_hash(root_hash)
6197+    #
6198+    #      In terms of knowing where to put this value, it was always
6199+    #      possible to place it, but it makes sense semantically to
6200+    #      place it after the share hash tree, so that's why you do it
6201+    #      in this order.
6202+    #
6203+    #   6: With the root hash put, we can now sign the header. Use:
6204+    #
6205+    #        get_signable()
6206+    #
6207+    #      to get the part of the header that you want to sign, and use:
6208+    #
6209+    #        put_signature(signature)
6210+    #
6211+    #      to write your signature to the remote server.
6212+    #
6213+    #   7: Add the verification key, and finish. Do:
6214+    #
6215+    #        put_verification_key(key)
6216+    #
6217+    #      and
6218+    #
6219+    #        finish_publishing()
6220+    #
6221+    # Checkstring management:
6222+    #
6223+    # To write to a mutable slot, we have to provide test vectors to ensure
6224+    # that we are writing to the same data that we think we are. These
6225+    # vectors allow us to detect uncoordinated writes; that is, writes
6226+    # where both we and some other shareholder are writing to the
6227+    # mutable slot, and to report those back to the parts of the program
6228+    # doing the writing.
6229+    #
6230+    # With SDMF, this was easy -- all of the share data was written in
6231+    # one go, so it was easy to detect uncoordinated writes, and we only
6232+    # had to do it once. With MDMF, not all of the file is written at
6233+    # once.
6234+    #
6235+    # If a share is new, we write out as much of the header as we can
6236+    # before writing out anything else. This gives other writers a
6237+    # canary that they can use to detect uncoordinated writes, and, if
6238+    # they do the same thing, gives us the same canary. We then update
6239+    # the share. We won't be able to write out the share tree root
6240+    # hash field of the header until we finish writing out the share.
6241+    # We only require the writer to provide the
6242+    # initial checkstring, and keep track of what it should be after
6243+    # updates ourselves.
6244+    #
6245+    # If we haven't written anything yet, then on the first write (which
6246+    # will probably be a block + salt of a share), we'll also write out
6247+    # the header. On subsequent passes, we'll expect to see the header.
6248+    # This changes in one place:
6249+    #
6250+    #   - When we write out the root of the share hash tree
6251+    #
6252+    # since that value will change the header.
6256+    def __init__(self,
6257+                 shnum,
6258+                 rref, # a remote reference to a storage server
6259+                 storage_index,
6260+                 secrets, # (write_enabler, renew_secret, cancel_secret)
6261+                 seqnum, # the sequence number of the mutable file
6262+                 required_shares,
6263+                 total_shares,
6264+                 segment_size,
6265+                 data_length): # the length of the original file
6266+        self.shnum = shnum
6267+        self._rref = rref
6268+        self._storage_index = storage_index
6269+        self._seqnum = seqnum
6270+        self._required_shares = required_shares
6271+        assert self.shnum >= 0 and self.shnum < total_shares
6272+        self._total_shares = total_shares
6273+        # We build up the offset table as we write things. It is the
6274+        # last thing we write to the remote server.
6275+        self._offsets = {}
6276+        self._testvs = []
6277+        # This is a list of write vectors that will be sent to our
6278+        # remote server once we are directed to write things there.
6279+        self._writevs = []
6280+        self._secrets = secrets
6281+        # The segment size needs to be a multiple of the k parameter --
6282+        # any padding should have been carried out by the publisher
6283+        # already.
6284+        assert segment_size % required_shares == 0
6285+        self._segment_size = segment_size
6286+        self._data_length = data_length
6287+
6288+        # These are set later -- we define them here so that we can
6289+        # check for their existence easily
6290+
6291+        # This is the root of the share hash tree -- the Merkle tree
6292+        # over the roots of the block hash trees computed for shares in
6293+        # this upload.
6294+        self._root_hash = None
6295+
6296+        # We haven't yet written anything to the remote bucket. By
6297+        # setting this, we tell the _write method as much. The write
6298+        # method will then know that it also needs to add a write vector
6299+        # for the checkstring (or what we have of it) to the first write
6300+        # request. We'll then record that value for future use.  If
6301+        # we're expecting something to be there already, we need to call
6302+        # set_checkstring before we write anything to tell the first
6303+        # write about that.
6304+        self._written = False
6305+
6306+        # When writing data to the storage servers, we get a read vector
6307+        # for free. We'll read the checkstring, which will help us
6308+        # figure out what's gone wrong if a write fails.
6309+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
6310+
6311+        # We calculate the number of segments because it tells us
6312+        # where each salt + share block pair begins and ends,
6313+        # and also because it provides a useful amount of bounds checking.
6314+        self._num_segments = mathutil.div_ceil(self._data_length,
6315+                                               self._segment_size)
6316+        self._block_size = self._segment_size / self._required_shares
6317+        # We also calculate the share size, to help us with block
6318+        # constraints later.
6319+        tail_size = self._data_length % self._segment_size
6320+        if not tail_size:
6321+            self._tail_block_size = self._block_size
6322+        else:
6323+            self._tail_block_size = mathutil.next_multiple(tail_size,
6324+                                                           self._required_shares)
6325+            self._tail_block_size /= self._required_shares
6326+
6327+        # We already know where the sharedata starts: right after the end
6328+        # of the header (which is defined as the signable part + the offsets).
6329+        # We can also calculate where the encrypted private key begins
6330+        # from what we now know.
6331+        self._actual_block_size = self._block_size + SALT_SIZE
6332+        data_size = self._actual_block_size * (self._num_segments - 1)
6333+        data_size += self._tail_block_size
6334+        data_size += SALT_SIZE
6335+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
6336+        self._offsets['enc_privkey'] += data_size
6337+        # We'll wait for the rest. Callers can now call my "put_block" and
6338+        # "set_checkstring" methods.
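[Editor's note: worked numbers for the arithmetic above. `div_ceil` and `next_multiple` are reimplemented here to stand in for `allmydata.util.mathutil`; the 107-byte header size follows from the struct constants in this patch, and `SALT_SIZE = 16` plus the 2500-byte example upload are assumptions for illustration:]

```python
def div_ceil(n, d):
    # Stand-in for allmydata.util.mathutil.div_ceil.
    return (n + d - 1) // d

def next_multiple(n, k):
    # Stand-in for allmydata.util.mathutil.next_multiple.
    return div_ceil(n, k) * k

MDMFHEADERSIZE = 107  # struct.calcsize(">BQ32sBBQQ QQQQQQ")
SALT_SIZE = 16        # assumed value, for illustration only

# Hypothetical upload: 2500-byte file, 1000-byte segments, k = 2.
segment_size, data_length, k = 1000, 2500, 2
num_segments = div_ceil(data_length, segment_size)   # 3 segments
block_size = segment_size // k                       # 500-byte blocks
tail_size = data_length % segment_size               # 500-byte tail segment
tail_block_size = next_multiple(tail_size, k) // k   # 250-byte tail blocks
actual_block_size = block_size + SALT_SIZE           # each block + its salt

# The encrypted private key lands right after the tail block + salt.
data_size = actual_block_size * (num_segments - 1) + tail_block_size + SALT_SIZE
assert (num_segments, block_size, tail_block_size) == (3, 500, 250)
assert MDMFHEADERSIZE + data_size == 1405

# put_block writes segment n at MDMFHEADERSIZE + actual_block_size * n.
assert [MDMFHEADERSIZE + actual_block_size * n
        for n in range(num_segments)] == [107, 623, 1139]
```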
6339+
6340+
6341+    def set_checkstring(self,
6342+                        seqnum_or_checkstring,
6343+                        root_hash=None,
6344+                        salt=None):
6345+        """
6346+        Set the checkstring for the given shnum.
6347+
6348+        This can be invoked in one of two ways.
6349+
6350+        With one argument, I assume that you are giving me a literal
6351+        checkstring -- e.g., the output of get_checkstring. I will then
6352+        set that checkstring as it is. This form is used by unit tests.
6353+
6354+        With two arguments, I assume that you are giving me a sequence
6355+        number and root hash to make a checkstring from. In that case, I
6356+        will build a checkstring and set it for you. This form is used
6357+        by the publisher.
6358+
6359+        By default, I assume that I am writing new shares to the grid.
6360+        If you don't explicitly set your own checkstring, I will use
6361+        one that requires that the remote share not exist. You will want
6362+        to use this method if you are updating a share in-place;
6363+        otherwise, writes will fail.
6364+        """
6365+        # You're allowed to overwrite checkstrings with this method;
6366+        # I assume that users know what they are doing when they call
6367+        # it.
6368+        if root_hash:
6369+            checkstring = struct.pack(MDMFCHECKSTRING,
6370+                                      1,
6371+                                      seqnum_or_checkstring,
6372+                                      root_hash)
6373+        else:
6374+            checkstring = seqnum_or_checkstring
6375+
6376+        if checkstring == "":
6377+            # An empty checkstring means "the share must not exist yet".
6378+            # We can't express that as a zero-length "eq" test vector
6379+            # (which would match anything), so we leave _testvs empty
6380+            # and let _write install its length-1 "must be empty" vector.
6381+            self._testvs = []
6382+        else:
6383+            self._testvs = []
6384+            self._testvs.append((0, len(checkstring), "eq", checkstring))
6385+
6386+
6387+    def __repr__(self):
6388+        return "MDMFSlotWriteProxy for share %d" % self.shnum
6389+
6390+
6391+    def get_checkstring(self):
6392+        """
6393+        I return a representation of what the checkstring for this
6394+        share on the server will look like.
6395+
6396+        I am mostly used for tests.
6397+        """
6398+        if self._root_hash:
6399+            roothash = self._root_hash
6400+        else:
6401+            roothash = "\x00" * 32
6402+        return struct.pack(MDMFCHECKSTRING,
6403+                           1,
6404+                           self._seqnum,
6405+                           roothash)
6406+
6407+
6408+    def put_block(self, data, segnum, salt):
6409+        """
6410+        I queue a write vector for the data, salt, and segment number
6411+        provided to me. I return None, as I do not actually cause
6412+        anything to be written yet.
6413+        """
6414+        if segnum >= self._num_segments:
6415+            raise LayoutInvalid("I won't overwrite the private key")
6416+        if len(salt) != SALT_SIZE:
6417+            raise LayoutInvalid("I was given a salt of size %d, but "
6418+                                "I wanted a salt of size %d" % (len(salt), SALT_SIZE))
6419+        if segnum + 1 == self._num_segments:
6420+            if len(data) != self._tail_block_size:
6421+                raise LayoutInvalid("I was given the wrong size block to write")
6422+        elif len(data) != self._block_size:
6423+            raise LayoutInvalid("I was given the wrong size block to write")
6424+
6425+        # We want to write at len(MDMFHEADER) + segnum * block_size.
6426+
6427+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
6428+        data = salt + data
6429+
6430+        self._writevs.append(tuple([offset, data]))
6431+
6432+
6433+    def put_encprivkey(self, encprivkey):
6434+        """
6435+        I queue a write vector for the encrypted private key provided to
6436+        me.
6437+        """
6438+        assert self._offsets
6439+        assert self._offsets['enc_privkey']
6440+        # You shouldn't re-write the encprivkey after the block hash
6441+        # tree is written, since that could cause the private key to run
6442+        # into the block hash tree. Before it writes the block hash
6443+        # tree, the block hash tree writing method writes the offset of
6444+        # the share hash chain. So that's a good indicator of whether or
6445+        # not the block hash tree has been written.
6446+        if "share_hash_chain" in self._offsets:
6447+            raise LayoutInvalid("You must write this before the block hash tree")
6448+
6449+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \
6450+            len(encprivkey)
6451+        self._writevs.append(tuple([self._offsets['enc_privkey'], encprivkey]))
6452+
6453+
6454+    def put_blockhashes(self, blockhashes):
6455+        """
6456+        I queue a write vector to put the block hash tree in blockhashes
6457+        onto the remote server.
6458+
6459+        The encrypted private key must be queued before the block hash
6460+        tree, since we need to know how large it is to know where the
6461+        block hash tree should go. The block hash tree must be put
6462+        before the share hash chain, since its size determines the
6463+        offset of the share hash chain.
6464+        """
6465+        assert self._offsets
6466+        assert isinstance(blockhashes, list)
6467+        if "block_hash_tree" not in self._offsets:
6468+            raise LayoutInvalid("You must put the encrypted private key "
6469+                                "before you put the block hash tree")
6470+        # If written, the share hash chain causes the signature offset
6471+        # to be defined.
6472+        if "signature" in self._offsets:
6473+            raise LayoutInvalid("You must put the block hash tree before "
6474+                                "you put the share hash chain")
6475+        blockhashes_s = "".join(blockhashes)
6476+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
6477+
6478+        self._writevs.append(tuple([self._offsets['block_hash_tree'],
6479+                                  blockhashes_s]))
6480+
6481+
6482+    def put_sharehashes(self, sharehashes):
6483+        """
6484+        I queue a write vector to put the share hash chain in my
6485+        argument onto the remote server.
6486+
6487+        The block hash tree must be queued before the share hash chain,
6488+        since we need to know where the block hash tree ends before we
6489+        can know where the share hash chain starts. The share hash chain
6490+        must be put before the signature, since the length of the packed
6491+        share hash chain determines the offset of the signature. Also,
6492+        semantically, you must know the root of the share hash tree
6493+        before you can generate a valid signature.
6494+        """
6495+        assert isinstance(sharehashes, dict)
6496+        if "share_hash_chain" not in self._offsets:
6497+            raise LayoutInvalid("You need to put the block hash tree before "
6498+                                "you can put the share hash chain")
6499+        # The signature comes after the share hash chain. If the
6500+        # signature has already been written, we must not write another
6501+        # share hash chain. The signature-queueing method records the
6502+        # verification key offset, so we look for that.
6504+        if "verification_key" in self._offsets:
6505+            raise LayoutInvalid("You must write the share hash chain "
6506+                                "before you write the signature")
6507+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
6508+                                  for i in sorted(sharehashes.keys())])
6509+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
6510+        self._writevs.append(tuple([self._offsets['share_hash_chain'],
6511+                            sharehashes_s]))
6512+
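[Editor's note: the `">H32s"` serialization that put_sharehashes performs can be sketched as follows (Python 3 bytes literals; the hash values and share numbers are dummies):]

```python
import struct

# Dummy 32-byte hashes for share numbers 0 and 3.
sharehashes = {3: b"\xaa" * 32, 0: b"\xbb" * 32}
packed = b"".join(struct.pack(">H32s", i, sharehashes[i])
                  for i in sorted(sharehashes))

# Each entry is a 2-byte share number plus a 32-byte hash.
assert len(packed) == 34 * len(sharehashes)
# Entries come out sorted by share number, so shnum 0 is first.
assert packed[:2] == struct.pack(">H", 0)
assert packed[2:34] == b"\xbb" * 32
```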
6513+
6514+    def put_root_hash(self, roothash):
6515+        """
6516+        Put the root hash (the root of the share hash tree) in the
6517+        remote slot.
6518+        """
6519+        # It does not make sense to be able to put the root
6520+        # hash without first putting the share hashes, since you need
6521+        # the share hashes to generate the root hash.
6522+        #
6523+        # Signature is defined by the routine that places the share hash
6524+        # chain, so it's a good thing to look for in finding out whether
6525+        # or not the share hash chain exists on the remote server.
6526+        if "signature" not in self._offsets:
6527+            raise LayoutInvalid("You need to put the share hash chain "
6528+                                "before you can put the root share hash")
6529+        if len(roothash) != HASH_SIZE:
6530+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
6531+                                 % HASH_SIZE)
6532+        self._root_hash = roothash
6533+        # To write this value, we update the checkstring on
6534+        # the remote server, which includes it.
6535+        checkstring = self.get_checkstring()
6536+        self._writevs.append(tuple([0, checkstring]))
6537+        # This write, if successful, changes the checkstring, so we need
6538+        # to update our internal checkstring to be consistent with the
6539+        # one on the server.
6540+
6541+
6542+    def get_signable(self):
6543+        """
6544+        Get the first seven fields of the mutable file; the parts that
6545+        are signed.
6546+        """
6547+        if not self._root_hash:
6548+            raise LayoutInvalid("You need to set the root hash "
6549+                                "before getting something to "
6550+                                "sign")
6551+        return struct.pack(MDMFSIGNABLEHEADER,
6552+                           1,
6553+                           self._seqnum,
6554+                           self._root_hash,
6555+                           self._required_shares,
6556+                           self._total_shares,
6557+                           self._segment_size,
6558+                           self._data_length)
6559+
6560+
6561+    def put_signature(self, signature):
6562+        """
6563+        I queue a write vector for the signature of the MDMF share.
6564+
6565+        I require that the root hash and share hash chain have been put
6566+        to the grid before I will write the signature to the grid.
6567+        """
6568+        # It does not make sense to put a signature without first
6569+        # putting the root hash (since otherwise the signature would be
6570+        # incomplete), so we don't allow that.
6571+        if "signature" not in self._offsets:
6572+            raise LayoutInvalid("You must put the share hash chain "
6573+                                "before putting the signature")
6574+        if not self._root_hash:
6575+            raise LayoutInvalid("You must complete the signed prefix "
6576+                                "before computing a signature")
6577+        # If we put the signature after we put the verification key, we
6578+        # could end up running into the verification key, and will
6579+        # probably screw up the offsets as well. So we don't allow that.
6580+        # The method that writes the verification key defines the EOF
6581+        # offset before writing the verification key, so look for that.
6582+        if "EOF" in self._offsets:
6583+            raise LayoutInvalid("You must write the signature before the verification key")
6584+
6585+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
6586+        self._writevs.append(tuple([self._offsets['signature'], signature]))
6587+
6588+
6589+    def put_verification_key(self, verification_key):
6590+        """
6591+        I queue a write vector for the verification key.
6592+
6593+        I require that the signature have been written to the storage
6594+        server before I allow the verification key to be written to the
6595+        remote server.
6596+        """
6597+        if "verification_key" not in self._offsets:
6598+            raise LayoutInvalid("You must put the signature before you "
6599+                                "can put the verification key")
6600+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
6601+        self._writevs.append(tuple([self._offsets['verification_key'],
6602+                            verification_key]))
6603+
6604+
6605+    def _get_offsets_tuple(self):
6606+        return tuple([(key, value) for key, value in self._offsets.items()])
6607+
6608+
6609+    def get_verinfo(self):
6610+        return (self._seqnum,
6611+                self._root_hash,
6612+                self._required_shares,
6613+                self._total_shares,
6614+                self._segment_size,
6615+                self._data_length,
6616+                self.get_signable(),
6617+                self._get_offsets_tuple())
6618+
6619+
6620+    def finish_publishing(self):
6621+        """
6622+        I add a write vector for the offsets table, and then cause all
6623+        of the write vectors that I've dealt with so far to be published
6624+        to the remote server, ending the write process.
6625+        """
6626+        if "EOF" not in self._offsets:
6627+            raise LayoutInvalid("You must put the verification key before "
6628+                                "you can publish the offsets")
6629+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
6630+        offsets = struct.pack(MDMFOFFSETS,
6631+                              self._offsets['enc_privkey'],
6632+                              self._offsets['block_hash_tree'],
6633+                              self._offsets['share_hash_chain'],
6634+                              self._offsets['signature'],
6635+                              self._offsets['verification_key'],
6636+                              self._offsets['EOF'])
6637+        self._writevs.append(tuple([offsets_offset, offsets]))
6638+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
6639+        params = struct.pack(">BBQQ",
6640+                             self._required_shares,
6641+                             self._total_shares,
6642+                             self._segment_size,
6643+                             self._data_length)
6644+        self._writevs.append(tuple([encoding_parameters_offset, params]))
6645+        return self._write(self._writevs)
6646+
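[Editor's note: a sketch of what finish_publishing writes for the offsets table. The table is 48 bytes of big-endian 64-bit offsets, written at byte 59 so that it ends exactly at the 107-byte header boundary; the offset values below are made up for illustration:]

```python
import struct

MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
MDMFOFFSETS = ">QQQQQQ"

# The offset table starts right after the 59-byte signable part...
offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
assert offsets_offset == 59

# Hypothetical offset values, packed in the order finish_publishing uses.
offsets = struct.pack(MDMFOFFSETS,
                      1405,   # enc_privkey
                      2629,   # block_hash_tree
                      2725,   # share_hash_chain
                      2929,   # signature
                      3185,   # verification_key
                      3621)   # EOF

# ...and ends exactly at the 107-byte header boundary.
assert len(offsets) == struct.calcsize(MDMFOFFSETS) == 48
assert offsets_offset + len(offsets) == 107
```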
6647+
6648+    def _write(self, datavs, on_failure=None, on_success=None):
6649+        """I write the data vectors in datavs to the remote slot."""
6650+        tw_vectors = {}
6651+        new_share = False
6652+        if not self._testvs:
6653+            self._testvs = []
6654+            self._testvs.append(tuple([0, 1, "eq", ""]))
6655+            new_share = True
6656+        if not self._written:
6657+            # Write a new checkstring to the share when we write it, so
6658+            # that we have something to check later.
6659+            new_checkstring = self.get_checkstring()
6660+            datavs.append((0, new_checkstring))
6661+            def _first_write():
6662+                self._written = True
6663+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
6664+            on_success = _first_write
6665+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
6666+        datalength = sum([len(x[1]) for x in datavs])
6667+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
6668+                                  self._storage_index,
6669+                                  self._secrets,
6670+                                  tw_vectors,
6671+                                  self._readv)
6672+        def _result(results):
6673+            if isinstance(results, failure.Failure) or not results[0]:
6674+                # Do nothing; the write was unsuccessful.
6675+                if on_failure: on_failure()
6676+            else:
6677+                if on_success: on_success()
6678+            return results
6679+        d.addCallback(_result)
6680+        return d
6681+
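[Editor's note: the tw_vectors structure that _write hands to slot_testv_and_readv_and_writev has the shape sketched below, one entry per share number. This only illustrates the data shape with made-up payloads, not a live remote call:]

```python
# (test vectors, write vectors, new_length) per share number.
testvs = [(0, 1, "eq", b"")]        # "the slot must be empty"
datavs = [(0, b"header..."),        # write the checkstring/header at offset 0
          (107, b"salt+block...")]  # then the first salt + block
new_length = None                   # None means "do not truncate the share"

tw_vectors = {0: (testvs, datavs, new_length)}

# Total bytes this request would write, as _write computes it.
datalength = sum(len(data) for (_offset, data) in datavs)
assert datalength == 22
```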
6682+
6683+class MDMFSlotReadProxy:
6684+    """
6685+    I read from a mutable slot filled with data written in the MDMF data
6686+    format (which is described above).
6687+
6688+    I can be initialized with some amount of data, which I will use (if
6689+    it is valid) to eliminate some of the need to fetch it from servers.
6690+    """
6691+    def __init__(self,
6692+                 rref,
6693+                 storage_index,
6694+                 shnum,
6695+                 data=""):
6696+        # Start the initialization process.
6697+        self._rref = rref
6698+        self._storage_index = storage_index
6699+        self.shnum = shnum
6700+
6701+        # Before doing anything, the reader is probably going to want to
6702+        # verify that the signature is correct. To do that, they'll need
6703+        # the verification key, and the signature. To get those, we'll
6704+        # need the offset table. So fetch the offset table on the
6705+        # assumption that that will be the first thing that a reader is
6706+        # going to do.
6707+
6708+        # The fact that these encoding parameters are None tells us
6709+        # that we haven't yet fetched them from the remote share, so we
6710+        # should. We could just not set them, but the checks will be
6711+        # easier to read if we don't have to use hasattr.
6712+        self._version_number = None
6713+        self._sequence_number = None
6714+        self._root_hash = None
6715+        # Filled in if we're dealing with an SDMF file. Unused
6716+        # otherwise.
6717+        self._salt = None
6718+        self._required_shares = None
6719+        self._total_shares = None
6720+        self._segment_size = None
6721+        self._data_length = None
6722+        self._offsets = None
6723+
6724+        # If the user has chosen to initialize us with some data, we'll
6725+        # try to satisfy subsequent data requests with that data before
6726+        # asking the storage server for it.
6727+        self._data = data
6728+        # The way callers interact with cache in the filenode returns
6729+        # None if there isn't any cached data, but the way we index the
6730+        # cached data requires a string, so convert None to "".
6731+        if self._data is None:
6732+            self._data = ""
6733+
6734+        self._queue_observers = observer.ObserverList()
6735+        self._queue_errbacks = observer.ObserverList()
6736+        self._readvs = []
6737+
6738+
6739+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
6740+        """
6741+        I fetch the offset table and the header from the remote slot if
6742+        I don't already have them. If I do have them, I do nothing and
6743+        return an empty Deferred.
6744+        """
6745+        if self._offsets:
6746+            return defer.succeed(None)
6747+        # At this point, we may be either SDMF or MDMF. Fetching 107
6748+        # bytes will be enough to get header and offsets for both SDMF and
6749+        # MDMF, though we'll be left with 4 more bytes than we
6750+        # need if this ends up being MDMF. This is probably less
6751+        # expensive than the cost of a second roundtrip.
6752+        readvs = [(0, 107)]
6753+        d = self._read(readvs, force_remote)
6754+        d.addCallback(self._process_encoding_parameters)
6755+        d.addCallback(self._process_offsets)
6756+        return d
6757+
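The 107-byte figure can be sanity-checked on the SDMF side from the struct formats that appear elsewhere in this patch: the signed prefix plus the four-L/two-Q offset table come to exactly 107 bytes, and the MDMF header is a few bytes shorter, which is where the spare bytes mentioned in the comment come from. A standalone sketch (format strings copied from this patch):

```python
import struct

# SDMF signed prefix: version, seqnum, root hash, 16-byte IV, k, n,
# segsize, datalen -- as unpacked in _process_encoding_parameters.
SIGNED_PREFIX = ">BQ32s16s BBQQ"
# SDMF offset table: four 32-bit and two 64-bit offsets, as unpacked
# in _process_offsets.
SDMF_OFFSETS = ">LLLLQQ"

def sdmf_header_size():
    # Bytes needed to cover the full SDMF header plus offset table.
    return struct.calcsize(SIGNED_PREFIX) + struct.calcsize(SDMF_OFFSETS)
```

`sdmf_header_size()` evaluates to 107, matching the single readv issued above.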
6758+
6759+    def _process_encoding_parameters(self, encoding_parameters):
6760+        assert self.shnum in encoding_parameters
6761+        encoding_parameters = encoding_parameters[self.shnum][0]
6762+        # The first byte is the version number. It will tell us what
6763+        # to do next.
6764+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
6765+        if verno == MDMF_VERSION:
6766+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
6767+            (verno,
6768+             seqnum,
6769+             root_hash,
6770+             k,
6771+             n,
6772+             segsize,
6773+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
6774+                                      encoding_parameters[:read_size])
6775+            if segsize == 0 and datalen == 0:
6776+                # Empty file, no segments.
6777+                self._num_segments = 0
6778+            else:
6779+                self._num_segments = mathutil.div_ceil(datalen, segsize)
6780+
6781+        elif verno == SDMF_VERSION:
6782+            read_size = SIGNED_PREFIX_LENGTH
6783+            (verno,
6784+             seqnum,
6785+             root_hash,
6786+             salt,
6787+             k,
6788+             n,
6789+             segsize,
6790+             datalen) = struct.unpack(">BQ32s16s BBQQ",
6791+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
6792+            self._salt = salt
6793+            if segsize == 0 and datalen == 0:
6794+                # empty file
6795+                self._num_segments = 0
6796+            else:
6797+                # non-empty SDMF files have one segment.
6798+                self._num_segments = 1
6799+        else:
6800+            raise UnknownVersionError("You asked me to read mutable file "
6801+                                      "version %d, but I only understand "
6802+                                      "%d and %d" % (verno, SDMF_VERSION,
6803+                                                     MDMF_VERSION))
6804+
6805+        self._version_number = verno
6806+        self._sequence_number = seqnum
6807+        self._root_hash = root_hash
6808+        self._required_shares = k
6809+        self._total_shares = n
6810+        self._segment_size = segsize
6811+        self._data_length = datalen
6812+
6813+        self._block_size = self._segment_size / self._required_shares
6814+        # We can upload empty files, and need to account for this fact
6815+        # so as to avoid zero-division and zero-modulo errors.
6816+        if datalen > 0:
6817+            tail_size = self._data_length % self._segment_size
6818+        else:
6819+            tail_size = 0
6820+        if not tail_size:
6821+            self._tail_block_size = self._block_size
6822+        else:
6823+            self._tail_block_size = mathutil.next_multiple(tail_size,
6824+                                                    self._required_shares)
6825+            self._tail_block_size /= self._required_shares
6826+
6827+        return encoding_parameters
6828+
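The segment and tail-block arithmetic above can be restated in isolation. This is a hypothetical Python 3 sketch with plain integer math standing in for `allmydata.util.mathutil`; the example values in the usage note match the tail-segment test share built later in this patch (k=3, segsize=6, datalen=33).

```python
def div_ceil(n, d):
    # Ceiling division, standing in for mathutil.div_ceil.
    return (n + d - 1) // d

def next_multiple(n, k):
    # Smallest multiple of k that is >= n (mathutil.next_multiple).
    return div_ceil(n, k) * k

def compute_sizes(datalen, segsize, k):
    # Mirrors the arithmetic in _process_encoding_parameters, guarding
    # against zero-division for empty files.
    if segsize == 0 and datalen == 0:
        num_segments = 0            # empty file, no segments
    else:
        num_segments = div_ceil(datalen, segsize)
    block_size = segsize // k
    tail_size = (datalen % segsize) if datalen > 0 else 0
    if not tail_size:
        tail_block_size = block_size
    else:
        tail_block_size = next_multiple(tail_size, k) // k
    return num_segments, block_size, tail_block_size
```

For instance, `compute_sizes(33, 6, 3)` gives 6 segments, 2-byte blocks, and a 1-byte tail block, which is exactly the shape of the tail-segment share used in the tests.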
6829+
6830+    def _process_offsets(self, offsets):
6831+        if self._version_number == 0:
6832+            read_size = OFFSETS_LENGTH
6833+            read_offset = SIGNED_PREFIX_LENGTH
6834+            end = read_size + read_offset
6835+            (signature,
6836+             share_hash_chain,
6837+             block_hash_tree,
6838+             share_data,
6839+             enc_privkey,
6840+             EOF) = struct.unpack(">LLLLQQ",
6841+                                  offsets[read_offset:end])
6842+            self._offsets = {}
6843+            self._offsets['signature'] = signature
6844+            self._offsets['share_data'] = share_data
6845+            self._offsets['block_hash_tree'] = block_hash_tree
6846+            self._offsets['share_hash_chain'] = share_hash_chain
6847+            self._offsets['enc_privkey'] = enc_privkey
6848+            self._offsets['EOF'] = EOF
6849+
6850+        elif self._version_number == 1:
6851+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
6852+            read_length = MDMFOFFSETS_LENGTH
6853+            end = read_offset + read_length
6854+            (encprivkey,
6855+             blockhashes,
6856+             sharehashes,
6857+             signature,
6858+             verification_key,
6859+             eof) = struct.unpack(MDMFOFFSETS,
6860+                                  offsets[read_offset:end])
6861+            self._offsets = {}
6862+            self._offsets['enc_privkey'] = encprivkey
6863+            self._offsets['block_hash_tree'] = blockhashes
6864+            self._offsets['share_hash_chain'] = sharehashes
6865+            self._offsets['signature'] = signature
6866+            self._offsets['verification_key'] = verification_key
6867+            self._offsets['EOF'] = eof
6868+
6869+
6870+    def get_block_and_salt(self, segnum, queue=False):
6871+        """
6872+        I return (block, salt), where block is the block data and
6873+        salt is the salt used to encrypt that segment.
6874+        """
6875+        d = self._maybe_fetch_offsets_and_header()
6876+        def _then(ignored):
6877+            if self._version_number == 1:
6878+                base_share_offset = MDMFHEADERSIZE
6879+            else:
6880+                base_share_offset = self._offsets['share_data']
6881+
6882+            if segnum + 1 > self._num_segments:
6883+                raise LayoutInvalid("Not a valid segment number")
6884+
6885+            if self._version_number == 0:
6886+                share_offset = base_share_offset + self._block_size * segnum
6887+            else:
6888+                share_offset = base_share_offset + (self._block_size + \
6889+                                                    SALT_SIZE) * segnum
6890+            if segnum + 1 == self._num_segments:
6891+                data = self._tail_block_size
6892+            else:
6893+                data = self._block_size
6894+
6895+            if self._version_number == 1:
6896+                data += SALT_SIZE
6897+
6898+            readvs = [(share_offset, data)]
6899+            return readvs
6900+        d.addCallback(_then)
6901+        d.addCallback(lambda readvs:
6902+            self._read(readvs, queue=queue))
6903+        def _process_results(results):
6904+            assert self.shnum in results
6905+            if self._version_number == 0:
6906+                # We only read the share data, but we know the salt from
6907+                # when we fetched the header
6908+                data = results[self.shnum]
6909+                if not data:
6910+                    data = ""
6911+                else:
6912+                    assert len(data) == 1
6913+                    data = data[0]
6914+                salt = self._salt
6915+            else:
6916+                data = results[self.shnum]
6917+                if not data:
6918+                    salt = data = ""
6919+                else:
6920+                    salt_and_data = results[self.shnum][0]
6921+                    salt = salt_and_data[:SALT_SIZE]
6922+                    data = salt_and_data[SALT_SIZE:]
6923+            return data, salt
6924+        d.addCallback(_process_results)
6925+        return d
6926+
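The readv computed inside `get_block_and_salt` boils down to a little offset arithmetic: SDMF stores bare blocks (its single salt lives in the header), while MDMF interleaves a 16-byte salt before every block. A hypothetical standalone sketch of that calculation (the helper name and parameters are illustrative, not part of the patch):

```python
SALT_SIZE = 16

def block_readv(version, base, segnum, num_segments,
                block_size, tail_block_size):
    # Compute the (offset, length) read vector for one block.
    # version 0 is SDMF (blocks only); version 1 is MDMF (salt + block).
    if segnum >= num_segments:
        raise ValueError("Not a valid segment number")
    per_segment = block_size + (SALT_SIZE if version == 1 else 0)
    offset = base + per_segment * segnum
    # The final segment may carry a shorter tail block.
    length = tail_block_size if segnum == num_segments - 1 else block_size
    if version == 1:
        length += SALT_SIZE     # the salt is read along with the block
    return (offset, length)
```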
6927+
6928+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
6929+        """
6930+        I return the block hash tree
6931+
6932+        I take an optional argument, needed, which is a set of indices
6933+        that correspond to hashes that I should fetch. If this argument is
6934+        missing, I will fetch the entire block hash tree; otherwise, I
6935+        may attempt to fetch fewer hashes, based on what needed says
6936+        that I should do. Note that I may fetch as many hashes as I
6937+        want, so long as the set of hashes that I do fetch is a superset
6938+        of the ones that I am asked for, so callers should be prepared
6939+        to tolerate additional hashes.
6940+        """
6941+        # TODO: Return only the parts of the block hash tree necessary
6942+        # to validate the blocknum provided?
6943+        # This is a good idea, but it is hard to implement correctly. It
6944+        # is bad to fetch any one block hash more than once, so we
6945+        # probably just want to fetch the whole thing at once and then
6946+        # serve it.
6947+        if needed == set([]):
6948+            return defer.succeed([])
6949+        d = self._maybe_fetch_offsets_and_header()
6950+        def _then(ignored):
6951+            blockhashes_offset = self._offsets['block_hash_tree']
6952+            if self._version_number == 1:
6953+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
6954+            else:
6955+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
6956+            readvs = [(blockhashes_offset, blockhashes_length)]
6957+            return readvs
6958+        d.addCallback(_then)
6959+        d.addCallback(lambda readvs:
6960+            self._read(readvs, queue=queue, force_remote=force_remote))
6961+        def _build_block_hash_tree(results):
6962+            assert self.shnum in results
6963+
6964+            rawhashes = results[self.shnum][0]
6965+            results = [rawhashes[i:i+HASH_SIZE]
6966+                       for i in range(0, len(rawhashes), HASH_SIZE)]
6967+            return results
6968+        d.addCallback(_build_block_hash_tree)
6969+        return d
6970+
6971+
6972+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
6973+        """
6974+        I return the part of the share hash chain needed to validate
6975+        this share.
6976+
6977+        I take an optional argument, needed. Needed is a set of indices
6978+        that correspond to the hashes that I should fetch. If needed is
6979+        not present, I will fetch and return the entire share hash
6980+        chain. Otherwise, I may fetch and return any part of the share
6981+        hash chain that is a superset of the part that I am asked to
6982+        fetch. Callers should be prepared to deal with more hashes than
6983+        they've asked for.
6984+        """
6985+        if needed == set([]):
6986+            return defer.succeed([])
6987+        d = self._maybe_fetch_offsets_and_header()
6988+
6989+        def _make_readvs(ignored):
6990+            sharehashes_offset = self._offsets['share_hash_chain']
6991+            if self._version_number == 0:
6992+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
6993+            else:
6994+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
6995+            readvs = [(sharehashes_offset, sharehashes_length)]
6996+            return readvs
6997+        d.addCallback(_make_readvs)
6998+        d.addCallback(lambda readvs:
6999+            self._read(readvs, queue=queue, force_remote=force_remote))
7000+        def _build_share_hash_chain(results):
7001+            assert self.shnum in results
7002+
7003+            sharehashes = results[self.shnum][0]
7004+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
7005+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
7006+            results = dict([struct.unpack(">H32s", data)
7007+                            for data in results])
7008+            return results
7009+        d.addCallback(_build_share_hash_chain)
7010+        return d
7011+
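`_build_share_hash_chain` above assumes the chain is stored as consecutive (2-byte share number, 32-byte hash) records, which it splits and unpacks with `">H32s"`. A round-trip sketch of that wire format in Python 3 (bytes in place of the patch's Python 2 strings; the hash values in the usage note are fake):

```python
import struct

HASH_SIZE = 32

def serialize_sharehashes(chain):
    # Pack {share number: hash} as back-to-back (uint16, 32-byte hash)
    # records, sorted by share number.
    return b"".join(struct.pack(">H32s", i, h)
                    for i, h in sorted(chain.items()))

def parse_sharehashes(raw):
    # Split the raw chain into fixed-size records and unpack each one,
    # as _build_share_hash_chain does.
    record = HASH_SIZE + 2
    pieces = [raw[i:i + record] for i in range(0, len(raw), record)]
    return dict(struct.unpack(">H32s", piece) for piece in pieces)
```

For example, a two-entry chain serializes to 68 bytes and parses back to the same dict.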
7012+
7013+    def get_encprivkey(self, queue=False):
7014+        """
7015+        I return the encrypted private key.
7016+        """
7017+        d = self._maybe_fetch_offsets_and_header()
7018+
7019+        def _make_readvs(ignored):
7020+            privkey_offset = self._offsets['enc_privkey']
7021+            if self._version_number == 0:
7022+                privkey_length = self._offsets['EOF'] - privkey_offset
7023+            else:
7024+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
7025+            readvs = [(privkey_offset, privkey_length)]
7026+            return readvs
7027+        d.addCallback(_make_readvs)
7028+        d.addCallback(lambda readvs:
7029+            self._read(readvs, queue=queue))
7030+        def _process_results(results):
7031+            assert self.shnum in results
7032+            privkey = results[self.shnum][0]
7033+            return privkey
7034+        d.addCallback(_process_results)
7035+        return d
7036+
7037+
7038+    def get_signature(self, queue=False):
7039+        """
7040+        I return the signature of my share.
7041+        """
7042+        d = self._maybe_fetch_offsets_and_header()
7043+
7044+        def _make_readvs(ignored):
7045+            signature_offset = self._offsets['signature']
7046+            if self._version_number == 1:
7047+                signature_length = self._offsets['verification_key'] - signature_offset
7048+            else:
7049+                signature_length = self._offsets['share_hash_chain'] - signature_offset
7050+            readvs = [(signature_offset, signature_length)]
7051+            return readvs
7052+        d.addCallback(_make_readvs)
7053+        d.addCallback(lambda readvs:
7054+            self._read(readvs, queue=queue))
7055+        def _process_results(results):
7056+            assert self.shnum in results
7057+            signature = results[self.shnum][0]
7058+            return signature
7059+        d.addCallback(_process_results)
7060+        return d
7061+
7062+
7063+    def get_verification_key(self, queue=False):
7064+        """
7065+        I return the verification key.
7066+        """
7067+        d = self._maybe_fetch_offsets_and_header()
7068+
7069+        def _make_readvs(ignored):
7070+            if self._version_number == 1:
7071+                vk_offset = self._offsets['verification_key']
7072+                vk_length = self._offsets['EOF'] - vk_offset
7073+            else:
7074+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
7075+                vk_length = self._offsets['signature'] - vk_offset
7076+            readvs = [(vk_offset, vk_length)]
7077+            return readvs
7078+        d.addCallback(_make_readvs)
7079+        d.addCallback(lambda readvs:
7080+            self._read(readvs, queue=queue))
7081+        def _process_results(results):
7082+            assert self.shnum in results
7083+            verification_key = results[self.shnum][0]
7084+            return verification_key
7085+        d.addCallback(_process_results)
7086+        return d
7087+
7088+
7089+    def get_encoding_parameters(self):
7090+        """
7091+        I return (k, n, segsize, datalen)
7092+        """
7093+        d = self._maybe_fetch_offsets_and_header()
7094+        d.addCallback(lambda ignored:
7095+            (self._required_shares,
7096+             self._total_shares,
7097+             self._segment_size,
7098+             self._data_length))
7099+        return d
7100+
7101+
7102+    def get_seqnum(self):
7103+        """
7104+        I return the sequence number for this share.
7105+        """
7106+        d = self._maybe_fetch_offsets_and_header()
7107+        d.addCallback(lambda ignored:
7108+            self._sequence_number)
7109+        return d
7110+
7111+
7112+    def get_root_hash(self):
7113+        """
7114+        I return the root of the block hash tree
7115+        """
7116+        d = self._maybe_fetch_offsets_and_header()
7117+        d.addCallback(lambda ignored: self._root_hash)
7118+        return d
7119+
7120+
7121+    def get_checkstring(self):
7122+        """
7123+        I return the packed representation of the following:
7124+
7125+            - version number
7126+            - sequence number
7127+            - root hash
7128+            - salt (SDMF only)
7129+
7130+        which my users use as a checkstring to detect other writers.
7131+        """
7132+        d = self._maybe_fetch_offsets_and_header()
7133+        def _build_checkstring(ignored):
7134+            if self._salt:
7135+                checkstring = struct.pack(PREFIX,
7136+                                         self._version_number,
7137+                                         self._sequence_number,
7138+                                         self._root_hash,
7139+                                         self._salt)
7140+            else:
7141+                checkstring = struct.pack(MDMFCHECKSTRING,
7142+                                          self._version_number,
7143+                                          self._sequence_number,
7144+                                          self._root_hash)
7145+
7146+            return checkstring
7147+        d.addCallback(_build_checkstring)
7148+        return d
7149+
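The two checkstring layouts built above can be sketched directly. The format strings below are stand-ins for the patch's `PREFIX` and `MDMFCHECKSTRING` constants (the MDMF one matches the `">BQ32s"` checkstring packed in the test code later in this patch): MDMF packs (version, seqnum, root hash), and SDMF appends the 16-byte IV.

```python
import struct

MDMF_CHECKSTRING = ">BQ32s"      # version, seqnum, root hash
SDMF_PREFIX = ">BQ32s16s"        # ...plus the 16-byte IV ("salt")

def build_checkstring(verno, seqnum, root_hash, salt=None):
    # SDMF shares carry a salt in the checkstring; MDMF shares do not.
    if salt:
        return struct.pack(SDMF_PREFIX, verno, seqnum, root_hash, salt)
    return struct.pack(MDMF_CHECKSTRING, verno, seqnum, root_hash)
```

The MDMF checkstring is 41 bytes; the SDMF one is 57.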
7150+
7151+    def get_prefix(self, force_remote):
7152+        d = self._maybe_fetch_offsets_and_header(force_remote)
7153+        d.addCallback(lambda ignored:
7154+            self._build_prefix())
7155+        return d
7156+
7157+
7158+    def _build_prefix(self):
7159+        # The prefix is another name for the part of the remote share
7160+        # that gets signed. It consists of everything up to and
7161+        # including the datalength, packed by struct.
7162+        if self._version_number == SDMF_VERSION:
7163+            return struct.pack(SIGNED_PREFIX,
7164+                           self._version_number,
7165+                           self._sequence_number,
7166+                           self._root_hash,
7167+                           self._salt,
7168+                           self._required_shares,
7169+                           self._total_shares,
7170+                           self._segment_size,
7171+                           self._data_length)
7172+
7173+        else:
7174+            return struct.pack(MDMFSIGNABLEHEADER,
7175+                           self._version_number,
7176+                           self._sequence_number,
7177+                           self._root_hash,
7178+                           self._required_shares,
7179+                           self._total_shares,
7180+                           self._segment_size,
7181+                           self._data_length)
7182+
7183+
7184+    def _get_offsets_tuple(self):
7185+        # The offsets tuple is another component of the version
7186+        # information tuple. Here it is simply a copy of our offsets
7187+        # dictionary.
7188+        return self._offsets.copy()
7189+
7190+
7191+    def get_verinfo(self):
7192+        """
7193+        I return my verinfo tuple. This is used by the ServermapUpdater
7194+        to keep track of versions of mutable files.
7195+
7196+        The verinfo tuple for MDMF files contains:
7197+            - seqnum
7198+            - root hash
7199+            - a blank (nothing)
7200+            - segsize
7201+            - datalen
7202+            - k
7203+            - n
7204+            - prefix (the thing that you sign)
7205+            - a tuple of offsets
7206+
7207+        We include the blank entry in MDMF so that MDMF and SDMF
7208+        version information tuples have the same shape.
7209+
7210+        The verinfo tuple for SDMF files is the same, but contains a
7211+        16-byte IV in place of the blank entry.
7212+        """
7213+        d = self._maybe_fetch_offsets_and_header()
7214+        def _build_verinfo(ignored):
7215+            if self._version_number == SDMF_VERSION:
7216+                salt_to_use = self._salt
7217+            else:
7218+                salt_to_use = None
7219+            return (self._sequence_number,
7220+                    self._root_hash,
7221+                    salt_to_use,
7222+                    self._segment_size,
7223+                    self._data_length,
7224+                    self._required_shares,
7225+                    self._total_shares,
7226+                    self._build_prefix(),
7227+                    self._get_offsets_tuple())
7228+        d.addCallback(_build_verinfo)
7229+        return d
7230+
7231+
7232+    def flush(self):
7233+        """
7234+        I flush my queue of read vectors.
7235+        """
7236+        d = self._read(self._readvs)
7237+        def _then(results):
7238+            self._readvs = []
7239+            if isinstance(results, failure.Failure):
7240+                self._queue_errbacks.notify(results)
7241+            else:
7242+                self._queue_observers.notify(results)
7243+            self._queue_observers = observer.ObserverList()
7244+            self._queue_errbacks = observer.ObserverList()
7245+        d.addBoth(_then)
7246+
7247+
7248+    def _read(self, readvs, force_remote=False, queue=False):
7249+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
7250+        # TODO: It's entirely possible to tweak this so that it just
7251+        # fulfills the requests that it can, and not demand that all
7252+        # requests are satisfiable before running it.
7253+        if not unsatisfiable and not force_remote:
7254+            results = [self._data[offset:offset+length]
7255+                       for (offset, length) in readvs]
7256+            results = {self.shnum: results}
7257+            return defer.succeed(results)
7258+        else:
7259+            if queue:
7260+                start = len(self._readvs)
7261+                self._readvs += readvs
7262+                end = len(self._readvs)
7263+                def _get_results(results, start, end):
7264+                    if not self.shnum in results:
7265+                        return {self.shnum: [""]}
7266+                    return {self.shnum: results[self.shnum][start:end]}
7267+                d = defer.Deferred()
7268+                d.addCallback(_get_results, start, end)
7269+                self._queue_observers.subscribe(d.callback)
7270+                self._queue_errbacks.subscribe(d.errback)
7271+                return d
7272+            return self._rref.callRemote("slot_readv",
7273+                                         self._storage_index,
7274+                                         [self.shnum],
7275+                                         readvs)
7276+
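The local-cache branch of `_read` above hinges on a simple test: a read vector can be served from the initialization data only if it falls entirely within that data. A minimal Python 3 sketch (helper names are illustrative; the real method also handles queueing and remote fallback):

```python
def satisfiable_locally(data, readvs):
    # A readv (offset, length) can be served from the cached prefix
    # only if it ends within the data we hold.
    return all(offset + length <= len(data)
               for (offset, length) in readvs)

def read_from_cache(data, readvs):
    # Shape matches what _read returns for one share: a list of
    # byte strings, one per read vector.
    return [data[offset:offset + length]
            for (offset, length) in readvs]
```

So a reader initialized with less than 107 bytes of data cannot even satisfy the initial header fetch locally and falls through to the server.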
7277+
7278+    def is_sdmf(self):
7279+        """I tell my caller whether my remote file is SDMF or MDMF.
7280+        """
7281+        d = self._maybe_fetch_offsets_and_header()
7282+        d.addCallback(lambda ignored:
7283+            self._version_number == 0)
7284+        return d
7285+
7286+
7287+class LayoutInvalid(Exception):
7288+    """
7289+    This isn't a valid MDMF mutable file
7290+    """
7291hunk ./src/allmydata/test/test_storage.py 2
7292 
7293-import time, os.path, stat, re, simplejson, struct
7294+import time, os.path, stat, re, simplejson, struct, shutil
7295 
7296 from twisted.trial import unittest
7297 
7298hunk ./src/allmydata/test/test_storage.py 22
7299 from allmydata.storage.expirer import LeaseCheckingCrawler
7300 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
7301      ReadBucketProxy
7302-from allmydata.interfaces import BadWriteEnablerError
7303-from allmydata.test.common import LoggingServiceParent
7304+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
7305+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
7306+                                     SIGNED_PREFIX, MDMFHEADER, \
7307+                                     MDMFOFFSETS, SDMFSlotWriteProxy
7308+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
7309+                                 SDMF_VERSION
7310+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
7311 from allmydata.test.common_web import WebRenderingMixin
7312 from allmydata.web.storage import StorageStatus, remove_prefix
7313 
7314hunk ./src/allmydata/test/test_storage.py 106
7315 
7316 class RemoteBucket:
7317 
7318+    def __init__(self):
7319+        self.read_count = 0
7320+        self.write_count = 0
7321+
7322     def callRemote(self, methname, *args, **kwargs):
7323         def _call():
7324             meth = getattr(self.target, "remote_" + methname)
7325hunk ./src/allmydata/test/test_storage.py 114
7326             return meth(*args, **kwargs)
7327+
7328+        if methname == "slot_readv":
7329+            self.read_count += 1
7330+        if "writev" in methname:
7331+            self.write_count += 1
7332+
7333         return defer.maybeDeferred(_call)
7334 
7335hunk ./src/allmydata/test/test_storage.py 122
7336+
7337 class BucketProxy(unittest.TestCase):
7338     def make_bucket(self, name, size):
7339         basedir = os.path.join("storage", "BucketProxy", name)
7340hunk ./src/allmydata/test/test_storage.py 1313
7341         self.failUnless(os.path.exists(prefixdir), prefixdir)
7342         self.failIf(os.path.exists(bucketdir), bucketdir)
7343 
7344+
7345+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
7346+    def setUp(self):
7347+        self.sparent = LoggingServiceParent()
7348+        self._lease_secret = itertools.count()
7349+        self.ss = self.create("MDMFProxies storage test server")
7350+        self.rref = RemoteBucket()
7351+        self.rref.target = self.ss
7352+        self.secrets = (self.write_enabler("we_secret"),
7353+                        self.renew_secret("renew_secret"),
7354+                        self.cancel_secret("cancel_secret"))
7355+        self.segment = "aaaaaa"
7356+        self.block = "aa"
7357+        self.salt = "a" * 16
7358+        self.block_hash = "a" * 32
7359+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
7360+        self.share_hash = self.block_hash
7361+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
7362+        self.signature = "foobarbaz"
7363+        self.verification_key = "vvvvvv"
7364+        self.encprivkey = "private"
7365+        self.root_hash = self.block_hash
7366+        self.salt_hash = self.root_hash
7367+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
7368+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
7369+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
7370+        # blockhashes and salt hashes are serialized in the same way,
7371+        # only we lop off the first element and store that in the
7372+        # header.
7373+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
7374+
7375+
7376+    def tearDown(self):
7377+        self.sparent.stopService()
7378+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
7379+
7380+
7381+    def write_enabler(self, we_tag):
7382+        return hashutil.tagged_hash("we_blah", we_tag)
7383+
7384+
7385+    def renew_secret(self, tag):
7386+        return hashutil.tagged_hash("renew_blah", str(tag))
7387+
7388+
7389+    def cancel_secret(self, tag):
7390+        return hashutil.tagged_hash("cancel_blah", str(tag))
7391+
7392+
7393+    def workdir(self, name):
7394+        basedir = os.path.join("storage", "MutableServer", name)
7395+        return basedir
7396+
7397+
7398+    def create(self, name):
7399+        workdir = self.workdir(name)
7400+        ss = StorageServer(workdir, "\x00" * 20)
7401+        ss.setServiceParent(self.sparent)
7402+        return ss
7403+
7404+
7405+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
7406+        # Start with the checkstring
7407+        data = struct.pack(">BQ32s",
7408+                           1,
7409+                           0,
7410+                           self.root_hash)
7411+        self.checkstring = data
7412+        # Next, the encoding parameters
7413+        if tail_segment:
7414+            data += struct.pack(">BBQQ",
7415+                                3,
7416+                                10,
7417+                                6,
7418+                                33)
7419+        elif empty:
7420+            data += struct.pack(">BBQQ",
7421+                                3,
7422+                                10,
7423+                                0,
7424+                                0)
7425+        else:
7426+            data += struct.pack(">BBQQ",
7427+                                3,
7428+                                10,
7429+                                6,
7430+                                36)
7431+        # Now we'll build the offsets.
7432+        sharedata = ""
7433+        if not tail_segment and not empty:
7434+            for i in xrange(6):
7435+                sharedata += self.salt + self.block
7436+        elif tail_segment:
7437+            for i in xrange(5):
7438+                sharedata += self.salt + self.block
7439+            sharedata += self.salt + "a"
7440+
7441+        # The encrypted private key comes after the shares + salts
7442+        offset_size = struct.calcsize(MDMFOFFSETS)
7443+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
7444+        # The blockhashes come after the private key
7445+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
7446+        # The sharehashes come after the salt hashes
7447+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
7448+        # The signature comes after the share hash chain
7449+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
7450+        # The verification key comes after the signature
7451+        verification_offset = signature_offset + len(self.signature)
7452+        # The EOF comes after the verification key
7453+        eof_offset = verification_offset + len(self.verification_key)
7454+        data += struct.pack(MDMFOFFSETS,
7455+                            encrypted_private_key_offset,
7456+                            blockhashes_offset,
7457+                            sharehashes_offset,
7458+                            signature_offset,
7459+                            verification_offset,
7460+                            eof_offset)
7461+        self.offsets = {}
7462+        self.offsets['enc_privkey'] = encrypted_private_key_offset
7463+        self.offsets['block_hash_tree'] = blockhashes_offset
7464+        self.offsets['share_hash_chain'] = sharehashes_offset
7465+        self.offsets['signature'] = signature_offset
7466+        self.offsets['verification_key'] = verification_offset
7467+        self.offsets['EOF'] = eof_offset
7468+        # Next, we'll add in the salts and share data,
7469+        data += sharedata
7470+        # the private key,
7471+        data += self.encprivkey
7472+        # the block hash tree,
7473+        data += self.block_hash_tree_s
7474+        # the share hash chain,
7475+        data += self.share_hash_chain_s
7476+        # the signature,
7477+        data += self.signature
7478+        # and the verification key
7479+        data += self.verification_key
7480+        return data
7481+
7482+
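The offset chaining in `build_test_mdmf_share` above can be sketched standalone. This is a minimal illustration, not the patch's code: the `">QQQQQQ"` offset format mirrors the `MDMFOFFSETS` constant used above, and the 59-byte fixed header and field lengths are assumed for the example.

```python
import struct

# Hypothetical MDMF offset table: six big-endian 8-byte offsets,
# mirroring the MDMFOFFSETS constant used in the tests above.
MDMFOFFSETS = ">QQQQQQ"

def chain_offsets(header_len, sharedata, encprivkey, blockhashes,
                  sharehashes, signature, verification_key):
    """Compute each field's offset as the previous field's offset
    plus the previous field's length, as the test share builder does."""
    offset_size = struct.calcsize(MDMFOFFSETS)
    enc_privkey = header_len + offset_size + len(sharedata)
    block_hash_tree = enc_privkey + len(encprivkey)
    share_hash_chain = block_hash_tree + len(blockhashes)
    signature_off = share_hash_chain + len(sharehashes)
    verification_off = signature_off + len(signature)
    eof = verification_off + len(verification_key)
    return (enc_privkey, block_hash_tree, share_hash_chain,
            signature_off, verification_off, eof)

# Field lengths here are illustrative (6 salted 18-byte blocks, a
# 7-byte private key, 6 32-byte block hashes, 6 34-byte share hash
# pairs, a 9-byte signature, a 6-byte verification key).
offsets = chain_offsets(59, "x" * 108, "k" * 7, "b" * 192,
                        "s" * 204, "g" * 9, "v" * 6)
```

Each offset is strictly increasing, so a reader can recover any field's length by subtracting adjacent offsets.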
7483+    def write_test_share_to_server(self,
7484+                                   storage_index,
7485+                                   tail_segment=False,
7486+                                   empty=False):
7487+        """
7488+        I write some data to self.ss for the read tests to read.
7489+
7490+        If tail_segment=True, then I will write a share that has a
7491+        smaller tail segment than other segments.
7492+        """
7493+        write = self.ss.remote_slot_testv_and_readv_and_writev
7494+        data = self.build_test_mdmf_share(tail_segment, empty)
7495+        # Finally, we write the whole thing to the storage server in one
7496+        # pass.
7497+        testvs = [(0, 1, "eq", "")]
7498+        tws = {}
7499+        tws[0] = (testvs, [(0, data)], None)
7500+        readv = [(0, 1)]
7501+        results = write(storage_index, self.secrets, tws, readv)
7502+        self.failUnless(results[0])
7503+
7504+
7505+    def build_test_sdmf_share(self, empty=False):
7506+        if empty:
7507+            sharedata = ""
7508+        else:
7509+            sharedata = self.segment * 6
7510+        self.sharedata = sharedata
7511+        blocksize = len(sharedata) / 3
7512+        block = sharedata[:blocksize]
7513+        self.blockdata = block
7514+        prefix = struct.pack(">BQ32s16s BBQQ",
7515+                             0, # version,
7516+                             0,
7517+                             self.root_hash,
7518+                             self.salt,
7519+                             3,
7520+                             10,
7521+                             len(sharedata),
7522+                             len(sharedata),
7523+                            )
7524+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
7525+        signature_offset = post_offset + len(self.verification_key)
7526+        sharehashes_offset = signature_offset + len(self.signature)
7527+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
7528+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
7529+        encprivkey_offset = sharedata_offset + len(block)
7530+        eof_offset = encprivkey_offset + len(self.encprivkey)
7531+        offsets = struct.pack(">LLLLQQ",
7532+                              signature_offset,
7533+                              sharehashes_offset,
7534+                              blockhashes_offset,
7535+                              sharedata_offset,
7536+                              encprivkey_offset,
7537+                              eof_offset)
7538+        final_share = "".join([prefix,
7539+                           offsets,
7540+                           self.verification_key,
7541+                           self.signature,
7542+                           self.share_hash_chain_s,
7543+                           self.block_hash_tree_s,
7544+                           block,
7545+                           self.encprivkey])
7546+        self.offsets = {}
7547+        self.offsets['signature'] = signature_offset
7548+        self.offsets['share_hash_chain'] = sharehashes_offset
7549+        self.offsets['block_hash_tree'] = blockhashes_offset
7550+        self.offsets['share_data'] = sharedata_offset
7551+        self.offsets['enc_privkey'] = encprivkey_offset
7552+        self.offsets['EOF'] = eof_offset
7553+        return final_share
7554+
7555+
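The SDMF layout built above stores its offset table in a mixed-width format. A small sketch of the pack/round-trip, with illustrative offset values (the `">LLLLQQ"` format string is taken from the code above; the numbers are made up for the example):

```python
import struct

# SDMF packs four 4-byte and two 8-byte big-endian offsets,
# matching the ">LLLLQQ" format used by build_test_sdmf_share.
SDMF_OFFSETS = ">LLLLQQ"

def pack_sdmf_offsets(signature, share_hash_chain, block_hash_tree,
                      share_data, enc_privkey, eof):
    # The field order here follows the offsets tuple in the test code.
    return struct.pack(SDMF_OFFSETS, signature, share_hash_chain,
                       block_hash_tree, share_data, enc_privkey, eof)

packed = pack_sdmf_offsets(107, 116, 320, 512, 524, 531)
unpacked = struct.unpack(SDMF_OFFSETS, packed)
```

The packed table is always 4*4 + 2*8 = 32 bytes, which is why the SDMF header size is fixed and a reader can locate it without any negotiation.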
7556+    def write_sdmf_share_to_server(self,
7557+                                   storage_index,
7558+                                   empty=False):
7559+        # Some tests need SDMF shares to verify that we can still
7560+        # read them; this method writes a share resembling a real one.
7561+        assert self.rref
7562+        write = self.ss.remote_slot_testv_and_readv_and_writev
7563+        share = self.build_test_sdmf_share(empty)
7564+        testvs = [(0, 1, "eq", "")]
7565+        tws = {}
7566+        tws[0] = (testvs, [(0, share)], None)
7567+        readv = []
7568+        results = write(storage_index, self.secrets, tws, readv)
7569+        self.failUnless(results[0])
7570+
7571+
7572+    def test_read(self):
7573+        self.write_test_share_to_server("si1")
7574+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7575+        # Check that every method equals what we expect it to.
7576+        d = defer.succeed(None)
7577+        def _check_block_and_salt((block, salt)):
7578+            self.failUnlessEqual(block, self.block)
7579+            self.failUnlessEqual(salt, self.salt)
7580+
7581+        for i in xrange(6):
7582+            d.addCallback(lambda ignored, i=i:
7583+                mr.get_block_and_salt(i))
7584+            d.addCallback(_check_block_and_salt)
7585+
7586+        d.addCallback(lambda ignored:
7587+            mr.get_encprivkey())
7588+        d.addCallback(lambda encprivkey:
7589+            self.failUnlessEqual(self.encprivkey, encprivkey))
7590+
7591+        d.addCallback(lambda ignored:
7592+            mr.get_blockhashes())
7593+        d.addCallback(lambda blockhashes:
7594+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
7595+
7596+        d.addCallback(lambda ignored:
7597+            mr.get_sharehashes())
7598+        d.addCallback(lambda sharehashes:
7599+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
7600+
7601+        d.addCallback(lambda ignored:
7602+            mr.get_signature())
7603+        d.addCallback(lambda signature:
7604+            self.failUnlessEqual(signature, self.signature))
7605+
7606+        d.addCallback(lambda ignored:
7607+            mr.get_verification_key())
7608+        d.addCallback(lambda verification_key:
7609+            self.failUnlessEqual(verification_key, self.verification_key))
7610+
7611+        d.addCallback(lambda ignored:
7612+            mr.get_seqnum())
7613+        d.addCallback(lambda seqnum:
7614+            self.failUnlessEqual(seqnum, 0))
7615+
7616+        d.addCallback(lambda ignored:
7617+            mr.get_root_hash())
7618+        d.addCallback(lambda root_hash:
7619+            self.failUnlessEqual(self.root_hash, root_hash))
7620+
7621+        d.addCallback(lambda ignored:
7622+            mr.get_seqnum())
7623+        d.addCallback(lambda seqnum:
7624+            self.failUnlessEqual(0, seqnum))
7625+
7626+        d.addCallback(lambda ignored:
7627+            mr.get_encoding_parameters())
7628+        def _check_encoding_parameters((k, n, segsize, datalen)):
7629+            self.failUnlessEqual(k, 3)
7630+            self.failUnlessEqual(n, 10)
7631+            self.failUnlessEqual(segsize, 6)
7632+            self.failUnlessEqual(datalen, 36)
7633+        d.addCallback(_check_encoding_parameters)
7634+
7635+        d.addCallback(lambda ignored:
7636+            mr.get_checkstring())
7637+        d.addCallback(lambda checkstring:
7638+            self.failUnlessEqual(checkstring, self.checkstring))
7639+        return d
7640+
7641+
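The tests above repeatedly write `lambda ignored, i=i:` inside loops. A minimal standalone sketch of why the `i=i` default argument is needed: a plain closure over a loop variable sees the variable's final value, while a default argument captures the value at definition time.

```python
# Without the default argument, every closure reads the loop
# variable after the loop has finished.
late = [lambda: i for i in range(3)]
# With the i=i default, each lambda captures its own value of i,
# as the Deferred callback chains in these tests rely on.
early = [lambda i=i: i for i in range(3)]

late_results = [f() for f in late]
early_results = [f() for f in early]
```

Without this idiom, every `put_block` callback queued in the loop would receive the same (final) segment number.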
7642+    def test_read_with_different_tail_segment_size(self):
7643+        self.write_test_share_to_server("si1", tail_segment=True)
7644+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7645+        d = mr.get_block_and_salt(5)
7646+        def _check_tail_segment(results):
7647+            block, salt = results
7648+            self.failUnlessEqual(len(block), 1)
7649+            self.failUnlessEqual(block, "a")
7650+        d.addCallback(_check_tail_segment)
7651+        return d
7652+
7653+
7654+    def test_get_block_with_invalid_segnum(self):
7655+        self.write_test_share_to_server("si1")
7656+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7657+        d = defer.succeed(None)
7658+        d.addCallback(lambda ignored:
7659+            self.shouldFail(LayoutInvalid, "test invalid segnum",
7660+                            None,
7661+                            mr.get_block_and_salt, 7))
7662+        return d
7663+
7664+
7665+    def test_get_encoding_parameters_first(self):
7666+        self.write_test_share_to_server("si1")
7667+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7668+        d = mr.get_encoding_parameters()
7669+        def _check_encoding_parameters((k, n, segment_size, datalen)):
7670+            self.failUnlessEqual(k, 3)
7671+            self.failUnlessEqual(n, 10)
7672+            self.failUnlessEqual(segment_size, 6)
7673+            self.failUnlessEqual(datalen, 36)
7674+        d.addCallback(_check_encoding_parameters)
7675+        return d
7676+
7677+
7678+    def test_get_seqnum_first(self):
7679+        self.write_test_share_to_server("si1")
7680+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7681+        d = mr.get_seqnum()
7682+        d.addCallback(lambda seqnum:
7683+            self.failUnlessEqual(seqnum, 0))
7684+        return d
7685+
7686+
7687+    def test_get_root_hash_first(self):
7688+        self.write_test_share_to_server("si1")
7689+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7690+        d = mr.get_root_hash()
7691+        d.addCallback(lambda root_hash:
7692+            self.failUnlessEqual(root_hash, self.root_hash))
7693+        return d
7694+
7695+
7696+    def test_get_checkstring_first(self):
7697+        self.write_test_share_to_server("si1")
7698+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7699+        d = mr.get_checkstring()
7700+        d.addCallback(lambda checkstring:
7701+            self.failUnlessEqual(checkstring, self.checkstring))
7702+        return d
7703+
7704+
7705+    def test_write_read_vectors(self):
7706+        # When we write, the storage server returns a read vector
7707+        # along with the result of the write. If a write fails because
7708+        # the test vectors failed, this read vector can help us to
7709+        # diagnose the problem. This test ensures that the read vector
7710+        # is working appropriately.
7711+        mw = self._make_new_mw("si1", 0)
7712+
7713+        for i in xrange(6):
7714+            mw.put_block(self.block, i, self.salt)
7715+        mw.put_encprivkey(self.encprivkey)
7716+        mw.put_blockhashes(self.block_hash_tree)
7717+        mw.put_sharehashes(self.share_hash_chain)
7718+        mw.put_root_hash(self.root_hash)
7719+        mw.put_signature(self.signature)
7720+        mw.put_verification_key(self.verification_key)
7721+        d = mw.finish_publishing()
7722+        def _then(results):
7723+            self.failUnlessEqual(len(results), 2)
7724+            result, readv = results
7725+            self.failUnless(result)
7726+            self.failIf(readv)
7727+            self.old_checkstring = mw.get_checkstring()
7728+            mw.set_checkstring("")
7729+        d.addCallback(_then)
7730+        d.addCallback(lambda ignored:
7731+            mw.finish_publishing())
7732+        def _then_again(results):
7733+            self.failUnlessEqual(len(results), 2)
7734+            result, readvs = results
7735+            self.failIf(result)
7736+            self.failUnlessIn(0, readvs)
7737+            readv = readvs[0][0]
7738+            self.failUnlessEqual(readv, self.old_checkstring)
7739+        d.addCallback(_then_again)
7740+        # The checkstring remains the same for the rest of the process.
7741+        return d
7742+
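The test-vector semantics exercised here can be sketched in isolation. This is a simplified model, not the server's implementation: each test vector is an `(offset, length, operator, specimen)` tuple, and the write proceeds only if every vector matches the slot's current contents.

```python
# A toy model of the check performed by
# remote_slot_testv_and_readv_and_writev before applying a write.
def check_testv(share_data, testvs):
    for (offset, length, op, specimen) in testvs:
        piece = share_data[offset:offset + length]
        # Only the "eq" operator is modeled here.
        if op == "eq" and piece != specimen:
            return False
    return True

# An empty slot passes the "slot must be empty" vector the tests use...
ok = check_testv("", [(0, 1, "eq", "")])
# ...but a slot that already holds data does not, which is how the
# second uncoordinated writer gets rejected.
clobber = check_testv("existing share data", [(0, 1, "eq", "")])
```

This is also why `test_write_read_vectors` resets the checkstring: the stale checkstring no longer matches the slot, so the write fails and the returned read vector reveals the current contents.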
7743+
7744+    def test_blockhashes_after_share_hash_chain(self):
7745+        mw = self._make_new_mw("si1", 0)
7746+        d = defer.succeed(None)
7747+        # Put everything up to and including the share hash chain
7748+        for i in xrange(6):
7749+            d.addCallback(lambda ignored, i=i:
7750+                mw.put_block(self.block, i, self.salt))
7751+        d.addCallback(lambda ignored:
7752+            mw.put_encprivkey(self.encprivkey))
7753+        d.addCallback(lambda ignored:
7754+            mw.put_blockhashes(self.block_hash_tree))
7755+        d.addCallback(lambda ignored:
7756+            mw.put_sharehashes(self.share_hash_chain))
7757+
7758+        # Now try to put the block hash tree again.
7759+        d.addCallback(lambda ignored:
7760+            self.shouldFail(LayoutInvalid, "test repeat blockhashes",
7761+                            None,
7762+                            mw.put_blockhashes, self.block_hash_tree))
7763+        return d
7764+
7765+
7766+    def test_encprivkey_after_blockhashes(self):
7767+        mw = self._make_new_mw("si1", 0)
7768+        d = defer.succeed(None)
7769+        # Put everything up to and including the block hash tree
7770+        for i in xrange(6):
7771+            d.addCallback(lambda ignored, i=i:
7772+                mw.put_block(self.block, i, self.salt))
7773+        d.addCallback(lambda ignored:
7774+            mw.put_encprivkey(self.encprivkey))
7775+        d.addCallback(lambda ignored:
7776+            mw.put_blockhashes(self.block_hash_tree))
7777+        d.addCallback(lambda ignored:
7778+            self.shouldFail(LayoutInvalid, "out of order private key",
7779+                            None,
7780+                            mw.put_encprivkey, self.encprivkey))
7781+        return d
7782+
7783+
7784+    def test_share_hash_chain_after_signature(self):
7785+        mw = self._make_new_mw("si1", 0)
7786+        d = defer.succeed(None)
7787+        # Put everything up to and including the signature
7788+        for i in xrange(6):
7789+            d.addCallback(lambda ignored, i=i:
7790+                mw.put_block(self.block, i, self.salt))
7791+        d.addCallback(lambda ignored:
7792+            mw.put_encprivkey(self.encprivkey))
7793+        d.addCallback(lambda ignored:
7794+            mw.put_blockhashes(self.block_hash_tree))
7795+        d.addCallback(lambda ignored:
7796+            mw.put_sharehashes(self.share_hash_chain))
7797+        d.addCallback(lambda ignored:
7798+            mw.put_root_hash(self.root_hash))
7799+        d.addCallback(lambda ignored:
7800+            mw.put_signature(self.signature))
7801+        # Now try to put the share hash chain again. This should fail
7802+        d.addCallback(lambda ignored:
7803+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
7804+                            None,
7805+                            mw.put_sharehashes, self.share_hash_chain))
7806+        return d
7807+
7808+
7809+    def test_signature_after_verification_key(self):
7810+        mw = self._make_new_mw("si1", 0)
7811+        d = defer.succeed(None)
7812+        # Put everything up to and including the verification key.
7813+        for i in xrange(6):
7814+            d.addCallback(lambda ignored, i=i:
7815+                mw.put_block(self.block, i, self.salt))
7816+        d.addCallback(lambda ignored:
7817+            mw.put_encprivkey(self.encprivkey))
7818+        d.addCallback(lambda ignored:
7819+            mw.put_blockhashes(self.block_hash_tree))
7820+        d.addCallback(lambda ignored:
7821+            mw.put_sharehashes(self.share_hash_chain))
7822+        d.addCallback(lambda ignored:
7823+            mw.put_root_hash(self.root_hash))
7824+        d.addCallback(lambda ignored:
7825+            mw.put_signature(self.signature))
7826+        d.addCallback(lambda ignored:
7827+            mw.put_verification_key(self.verification_key))
7828+        # Now try to put the signature again. This should fail
7829+        d.addCallback(lambda ignored:
7830+            self.shouldFail(LayoutInvalid, "signature after verification",
7831+                            None,
7832+                            mw.put_signature, self.signature))
7833+        return d
7834+
7835+
7836+    def test_uncoordinated_write(self):
7837+        # Make two mutable writers, both pointing to the same storage
7838+        # server, both at the same storage index, and try writing to the
7839+        # same share.
7840+        mw1 = self._make_new_mw("si1", 0)
7841+        mw2 = self._make_new_mw("si1", 0)
7842+
7843+        def _check_success(results):
7844+            result, readvs = results
7845+            self.failUnless(result)
7846+
7847+        def _check_failure(results):
7848+            result, readvs = results
7849+            self.failIf(result)
7850+
7851+        def _write_share(mw):
7852+            for i in xrange(6):
7853+                mw.put_block(self.block, i, self.salt)
7854+            mw.put_encprivkey(self.encprivkey)
7855+            mw.put_blockhashes(self.block_hash_tree)
7856+            mw.put_sharehashes(self.share_hash_chain)
7857+            mw.put_root_hash(self.root_hash)
7858+            mw.put_signature(self.signature)
7859+            mw.put_verification_key(self.verification_key)
7860+            return mw.finish_publishing()
7861+        d = _write_share(mw1)
7862+        d.addCallback(_check_success)
7863+        d.addCallback(lambda ignored:
7864+            _write_share(mw2))
7865+        d.addCallback(_check_failure)
7866+        return d
7867+
7868+
7869+    def test_invalid_salt_size(self):
7870+        # Salts need to be 16 bytes in size. Writes that attempt to
7871+        # write more or less than this should be rejected.
7872+        mw = self._make_new_mw("si1", 0)
7873+        invalid_salt = "a" * 17 # 17 bytes
7874+        another_invalid_salt = "b" * 15 # 15 bytes
7875+        d = defer.succeed(None)
7876+        d.addCallback(lambda ignored:
7877+            self.shouldFail(LayoutInvalid, "salt too big",
7878+                            None,
7879+                            mw.put_block, self.block, 0, invalid_salt))
7880+        d.addCallback(lambda ignored:
7881+            self.shouldFail(LayoutInvalid, "salt too small",
7882+                            None,
7883+                            mw.put_block, self.block, 0,
7884+                            another_invalid_salt))
7885+        return d
7886+
7887+
7888+    def test_write_test_vectors(self):
7889+        # If we give the write proxy a bogus test vector at
7890+        # any point during the process, it should fail to write when we
7891+        # tell it to write.
7892+        def _check_failure(results):
7893+            self.failUnlessEqual(len(results), 2)
7894+            res, readvs = results
7895+            self.failIf(res)
7896+
7897+        def _check_success(results):
7898+            self.failUnlessEqual(len(results), 2)
7899+            res, readvs = results
7900+            self.failUnless(res)
7901+
7902+        mw = self._make_new_mw("si1", 0)
7903+        mw.set_checkstring("this is a lie")
7904+        for i in xrange(6):
7905+            mw.put_block(self.block, i, self.salt)
7906+        mw.put_encprivkey(self.encprivkey)
7907+        mw.put_blockhashes(self.block_hash_tree)
7908+        mw.put_sharehashes(self.share_hash_chain)
7909+        mw.put_root_hash(self.root_hash)
7910+        mw.put_signature(self.signature)
7911+        mw.put_verification_key(self.verification_key)
7912+        d = mw.finish_publishing()
7913+        d.addCallback(_check_failure)
7914+        d.addCallback(lambda ignored:
7915+            mw.set_checkstring(""))
7916+        d.addCallback(lambda ignored:
7917+            mw.finish_publishing())
7918+        d.addCallback(_check_success)
7919+        return d
7920+
7921+
7922+    def serialize_blockhashes(self, blockhashes):
7923+        return "".join(blockhashes)
7924+
7925+
7926+    def serialize_sharehashes(self, sharehashes):
7927+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
7928+                        for i in sorted(sharehashes.keys())])
7929+        return ret
7930+
7931+
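The `">H32s"` serialization used by `serialize_sharehashes` above can be restated as a self-contained, bytes-based sketch (the helper name here is mine, not the patch's):

```python
import struct

# The share hash chain is serialized as concatenated (index, hash)
# pairs: a 2-byte big-endian index followed by a 32-byte hash,
# mirroring the ">H32s" format used in the test helper above.
def serialize_share_hash_chain(sharehashes):
    return b"".join([struct.pack(">H32s", i, sharehashes[i])
                     for i in sorted(sharehashes)])

# Entries are emitted in sorted index order regardless of dict order.
chain = serialize_share_hash_chain({3: b"\x03" * 32, 1: b"\x01" * 32})
```

Sorting by index makes the serialization deterministic, so byte-for-byte comparisons against a server read (as `test_write` does) are meaningful.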
7932+    def test_write(self):
7933+        # This translates to a file with 6 6-byte segments, and with 2-byte
7934+        # blocks.
7935+        mw = self._make_new_mw("si1", 0)
7936+        # Test writing some blocks.
7937+        read = self.ss.remote_slot_readv
7938+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
7939+        written_block_size = 2 + len(self.salt)
7940+        written_block = self.block + self.salt
7941+        for i in xrange(6):
7942+            mw.put_block(self.block, i, self.salt)
7943+
7944+        mw.put_encprivkey(self.encprivkey)
7945+        mw.put_blockhashes(self.block_hash_tree)
7946+        mw.put_sharehashes(self.share_hash_chain)
7947+        mw.put_root_hash(self.root_hash)
7948+        mw.put_signature(self.signature)
7949+        mw.put_verification_key(self.verification_key)
7950+        d = mw.finish_publishing()
7951+        def _check_publish(results):
7952+            self.failUnlessEqual(len(results), 2)
7953+            result, ign = results
7954+            self.failUnless(result, "publish failed")
7955+            for i in xrange(6):
7956+                self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
7957+                                {0: [written_block]})
7958+
7959+            expected_private_key_offset = expected_sharedata_offset + \
7960+                                      len(written_block) * 6
7961+            self.failUnlessEqual(len(self.encprivkey), 7)
7962+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
7963+                                 {0: [self.encprivkey]})
7964+
7965+            expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
7966+            self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
7967+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
7968+                                 {0: [self.block_hash_tree_s]})
7969+
7970+            expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
7971+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
7972+                                 {0: [self.share_hash_chain_s]})
7973+
7974+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
7975+                                 {0: [self.root_hash]})
7976+            expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
7977+            self.failUnlessEqual(len(self.signature), 9)
7978+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
7979+                                 {0: [self.signature]})
7980+
7981+            expected_verification_key_offset = expected_signature_offset + len(self.signature)
7982+            self.failUnlessEqual(len(self.verification_key), 6)
7983+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
7984+                                 {0: [self.verification_key]})
7985+
7986+            signable = mw.get_signable()
7987+            verno, seq, roothash, k, n, segsize, datalen = \
7988+                                            struct.unpack(">BQ32sBBQQ",
7989+                                                          signable)
7990+            self.failUnlessEqual(verno, 1)
7991+            self.failUnlessEqual(seq, 0)
7992+            self.failUnlessEqual(roothash, self.root_hash)
7993+            self.failUnlessEqual(k, 3)
7994+            self.failUnlessEqual(n, 10)
7995+            self.failUnlessEqual(segsize, 6)
7996+            self.failUnlessEqual(datalen, 36)
7997+            expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
7998+
7999+            # Check the version number to make sure that it is correct.
8000+            expected_version_number = struct.pack(">B", 1)
8001+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
8002+                                 {0: [expected_version_number]})
8003+            # Check the sequence number to make sure that it is correct
8004+            expected_sequence_number = struct.pack(">Q", 0)
8005+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
8006+                                 {0: [expected_sequence_number]})
8007+            # Check that the encoding parameters (k, N, segment size, data
8008+            # length) are what they should be. These are 3, 10, 6, 36.
8009+            expected_k = struct.pack(">B", 3)
8010+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
8011+                                 {0: [expected_k]})
8012+            expected_n = struct.pack(">B", 10)
8013+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
8014+                                 {0: [expected_n]})
8015+            expected_segment_size = struct.pack(">Q", 6)
8016+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
8017+                                 {0: [expected_segment_size]})
8018+            expected_data_length = struct.pack(">Q", 36)
8019+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
8020+                                 {0: [expected_data_length]})
8021+            expected_offset = struct.pack(">Q", expected_private_key_offset)
8022+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
8023+                                 {0: [expected_offset]})
8024+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
8025+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
8026+                                 {0: [expected_offset]})
8027+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
8028+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
8029+                                 {0: [expected_offset]})
8030+            expected_offset = struct.pack(">Q", expected_signature_offset)
8031+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
8032+                                 {0: [expected_offset]})
8033+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
8034+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
8035+                                 {0: [expected_offset]})
8036+            expected_offset = struct.pack(">Q", expected_eof_offset)
8037+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
8038+                                 {0: [expected_offset]})
8039+        d.addCallback(_check_publish)
8040+        return d
8041+
8042+    def _make_new_mw(self, si, share, datalength=36):
8043+        # This is a file of size 36 bytes. Since it has a segment
8044+        # size of 6, we know that it has 6 byte segments, which will
8045+        # be split into blocks of 2 bytes because our FEC k
8046+        # parameter is 3.
8047+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
8048+                                6, datalength)
8049+        return mw
8050+
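The segment arithmetic described in the `_make_new_mw` comment can be sketched directly. This is an illustration of the parameters these tests use (k=3 required shares, 6-byte segments), assuming the segment size divides evenly by k as it does here:

```python
# Encoding shape for a mutable file, under the test parameters.
def encoding_shape(datalength, segment_size, k):
    # Ceiling division: how many segments the file splits into.
    num_segments = (datalength + segment_size - 1) // segment_size
    # Each share holds segment_size / k bytes of every full segment
    # (assumes k divides segment_size, as in these tests).
    block_size = segment_size // k
    # The tail segment holds whatever remains.
    tail = datalength - (num_segments - 1) * segment_size
    return num_segments, block_size, tail

shape = encoding_shape(36, 6, 3)       # six full 6-byte segments
tail_shape = encoding_shape(33, 6, 3)  # 33 bytes implies a short tail
```

With 33 bytes the tail segment is only 3 bytes, so tail blocks shrink to 1 byte each, which is exactly what `test_write_rejected_with_invalid_blocksize` exercises.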
8051+
8052+    def test_write_rejected_with_too_many_blocks(self):
8053+        mw = self._make_new_mw("si0", 0)
8054+
8055+        # Try writing too many blocks. We should not be able to
8056+        # write more than 6 blocks into each share.
8058+        d = defer.succeed(None)
8059+        for i in xrange(6):
8060+            d.addCallback(lambda ignored, i=i:
8061+                mw.put_block(self.block, i, self.salt))
8062+        d.addCallback(lambda ignored:
8063+            self.shouldFail(LayoutInvalid, "too many blocks",
8064+                            None,
8065+                            mw.put_block, self.block, 7, self.salt))
8066+        return d
8067+
8068+
8069+    def test_write_rejected_with_invalid_salt(self):
8070+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
8071+        # less should cause an error.
8072+        mw = self._make_new_mw("si1", 0)
8073+        bad_salt = "a" * 17 # 17 bytes
8074+        d = defer.succeed(None)
8075+        d.addCallback(lambda ignored:
8076+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
8077+                            None, mw.put_block, self.block, 0, bad_salt))
8078+        return d
8079+
8080+
8081+    def test_write_rejected_with_invalid_root_hash(self):
8082+        # Try writing an invalid root hash. This should be SHA256d, and
8083+        # 32 bytes long as a result.
8084+        mw = self._make_new_mw("si2", 0)
8085+        # 17 bytes != 32 bytes
8086+        invalid_root_hash = "a" * 17
8087+        d = defer.succeed(None)
8088+        # Before this test can work, we need to put some blocks + salts,
8089+        # a block hash tree, and a share hash tree. Otherwise, we'll see
8090+        # failures that match what we are looking for, but are caused by
8091+        # the constraints imposed on operation ordering.
8092+        for i in xrange(6):
8093+            d.addCallback(lambda ignored, i=i:
8094+                mw.put_block(self.block, i, self.salt))
8095+        d.addCallback(lambda ignored:
8096+            mw.put_encprivkey(self.encprivkey))
8097+        d.addCallback(lambda ignored:
8098+            mw.put_blockhashes(self.block_hash_tree))
8099+        d.addCallback(lambda ignored:
8100+            mw.put_sharehashes(self.share_hash_chain))
8101+        d.addCallback(lambda ignored:
8102+            self.shouldFail(LayoutInvalid, "invalid root hash",
8103+                            None, mw.put_root_hash, invalid_root_hash))
8104+        return d
8105+
8106+
8107+    def test_write_rejected_with_invalid_blocksize(self):
8108+        # The blocksize implied by the writer that we get from
8109+        # _make_new_mw is 2 bytes -- any more or any less than this
8110+        # should cause a failure, unless it is the tail segment, in
8111+        # which case a shorter block is allowed.
8112+        invalid_block = "a"
8113+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
8114+                                             # one byte blocks
8115+        # 1 byte != 2 bytes
8116+        d = defer.succeed(None)
8117+        d.addCallback(lambda ignored, invalid_block=invalid_block:
8118+            self.shouldFail(LayoutInvalid, "test blocksize too small",
8119+                            None, mw.put_block, invalid_block, 0,
8120+                            self.salt))
8121+        invalid_block = invalid_block * 3
8122+        # 3 bytes != 2 bytes
8123+        d.addCallback(lambda ignored:
8124+            self.shouldFail(LayoutInvalid, "test blocksize too large",
8125+                            None,
8126+                            mw.put_block, invalid_block, 0, self.salt))
8127+        for i in xrange(5):
8128+            d.addCallback(lambda ignored, i=i:
8129+                mw.put_block(self.block, i, self.salt))
8130+        # Try to put an invalid tail segment
8131+        d.addCallback(lambda ignored:
8132+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
8133+                            None,
8134+                            mw.put_block, self.block, 5, self.salt))
8135+        valid_block = "a"
8136+        d.addCallback(lambda ignored:
8137+            mw.put_block(valid_block, 5, self.salt))
8138+        return d
8139+
8140+
8141+    def test_write_enforces_order_constraints(self):
8142+        # We require that the MDMFSlotWriteProxy be interacted with in a
8143+        # specific way.
8144+        # That way is:
8145+        # 0: __init__
8146+        # 1: write blocks and salts
8147+        # 2: Write the encrypted private key
8148+        # 3: Write the block hashes
8149+        # 4: Write the share hashes
8150+        # 5: Write the root hash and salt hash
8151+        # 6: Write the signature and verification key
8152+        # 7: Write the file.
8153+        #
8154+        # Some of these can be performed out-of-order, and some can't.
8155+        # The dependencies that I want to test here are:
8156+        #  - Private key before block hashes
8157+        #  - share hashes and block hashes before root hash
8158+        #  - root hash before signature
8159+        #  - signature before verification key
8160+        mw0 = self._make_new_mw("si0", 0)
8161+        # Write some shares
8162+        d = defer.succeed(None)
8163+        for i in xrange(6):
8164+            d.addCallback(lambda ignored, i=i:
8165+                mw0.put_block(self.block, i, self.salt))
8166+        # Try to write the block hashes before writing the encrypted
8167+        # private key
8168+        d.addCallback(lambda ignored:
8169+            self.shouldFail(LayoutInvalid, "block hashes before key",
8170+                            None, mw0.put_blockhashes,
8171+                            self.block_hash_tree))
8172+
8173+        # Write the private key.
8174+        d.addCallback(lambda ignored:
8175+            mw0.put_encprivkey(self.encprivkey))
8176+
8177+
8178+        # Try to write the share hash chain without writing the block
8179+        # hash tree
8180+        d.addCallback(lambda ignored:
8181+            self.shouldFail(LayoutInvalid, "share hash chain before "
8182+                                           "block hash tree",
8183+                            None,
8184+                            mw0.put_sharehashes, self.share_hash_chain))
8185+
8186+        # Try to write the root hash without writing either the
8187+        # block hashes or the share hashes
8188+        d.addCallback(lambda ignored:
8189+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
8190+                            None,
8191+                            mw0.put_root_hash, self.root_hash))
8192+
8193+        # Now write the block hashes and try again
8194+        d.addCallback(lambda ignored:
8195+            mw0.put_blockhashes(self.block_hash_tree))
8196+
8197+        d.addCallback(lambda ignored:
8198+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
8199+                            None, mw0.put_root_hash, self.root_hash))
8200+
8201+        # We haven't yet put the root hash on the share, so we shouldn't
8202+        # be able to sign it.
8203+        d.addCallback(lambda ignored:
8204+            self.shouldFail(LayoutInvalid, "signature before root hash",
8205+                            None, mw0.put_signature, self.signature))
8206+
8207+        d.addCallback(lambda ignored:
8208+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
8209+
8210+        # ...and, since that fails, we also shouldn't be able to put the
8211+        # verification key.
8212+        d.addCallback(lambda ignored:
8213+            self.shouldFail(LayoutInvalid, "key before signature",
8214+                            None, mw0.put_verification_key,
8215+                            self.verification_key))
8216+
8217+        # Now write the share hashes.
8218+        d.addCallback(lambda ignored:
8219+            mw0.put_sharehashes(self.share_hash_chain))
8220+        # We should be able to write the root hash now too
8221+        d.addCallback(lambda ignored:
8222+            mw0.put_root_hash(self.root_hash))
8223+
8224+        # We should still be unable to put the verification key
8225+        d.addCallback(lambda ignored:
8226+            self.shouldFail(LayoutInvalid, "key before signature",
8227+                            None, mw0.put_verification_key,
8228+                            self.verification_key))
8229+
8230+        d.addCallback(lambda ignored:
8231+            mw0.put_signature(self.signature))
8232+
8233+        # We shouldn't be able to write the offsets to the remote server
8234+        # until the offset table is finished; IOW, until we have written
8235+        # the verification key.
8236+        d.addCallback(lambda ignored:
8237+            self.shouldFail(LayoutInvalid, "offsets before verification key",
8238+                            None,
8239+                            mw0.finish_publishing))
8240+
8241+        d.addCallback(lambda ignored:
8242+            mw0.put_verification_key(self.verification_key))
8243+        return d
8244+
8245+
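test_write_enforces_order_constraints exercises the dependency chain: private key before block hashes, block and share hashes before the root hash, root hash before signature, signature before verification key. That check amounts to a small state machine; a hedged sketch of the idea (class and method names are ours, not the Tahoe-LAFS `MDMFSlotWriteProxy` API):

```python
class LayoutInvalid(Exception):
    """Raised when share components are written out of order."""

class OrderedWriter:
    """Track which components have been written and refuse any
    component whose prerequisites are missing."""
    def __init__(self):
        self.done = set()

    def _require(self, step, *deps):
        missing = [d for d in deps if d not in self.done]
        if missing:
            raise LayoutInvalid("%s requires %s first" % (step, missing))
        self.done.add(step)

    # Dependency table modeled on the constraints the test checks.
    def put_encprivkey(self):       self._require("privkey")
    def put_blockhashes(self):      self._require("blockhashes", "privkey")
    def put_sharehashes(self):      self._require("sharehashes", "blockhashes")
    def put_root_hash(self):        self._require("root", "blockhashes", "sharehashes")
    def put_signature(self):        self._require("sig", "root")
    def put_verification_key(self): self._require("verkey", "sig")
```

Calling `put_blockhashes()` on a fresh writer raises `LayoutInvalid`, mirroring the "block hashes before key" case above.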
8246+    def test_end_to_end(self):
8247+        mw = self._make_new_mw("si1", 0)
8248+        # Write a share using the mutable writer, and make sure that the
8249+        # reader knows how to read everything back to us.
8250+        d = defer.succeed(None)
8251+        for i in xrange(6):
8252+            d.addCallback(lambda ignored, i=i:
8253+                mw.put_block(self.block, i, self.salt))
8254+        d.addCallback(lambda ignored:
8255+            mw.put_encprivkey(self.encprivkey))
8256+        d.addCallback(lambda ignored:
8257+            mw.put_blockhashes(self.block_hash_tree))
8258+        d.addCallback(lambda ignored:
8259+            mw.put_sharehashes(self.share_hash_chain))
8260+        d.addCallback(lambda ignored:
8261+            mw.put_root_hash(self.root_hash))
8262+        d.addCallback(lambda ignored:
8263+            mw.put_signature(self.signature))
8264+        d.addCallback(lambda ignored:
8265+            mw.put_verification_key(self.verification_key))
8266+        d.addCallback(lambda ignored:
8267+            mw.finish_publishing())
8268+
8269+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8270+        def _check_block_and_salt((block, salt)):
8271+            self.failUnlessEqual(block, self.block)
8272+            self.failUnlessEqual(salt, self.salt)
8273+
8274+        for i in xrange(6):
8275+            d.addCallback(lambda ignored, i=i:
8276+                mr.get_block_and_salt(i))
8277+            d.addCallback(_check_block_and_salt)
8278+
8279+        d.addCallback(lambda ignored:
8280+            mr.get_encprivkey())
8281+        d.addCallback(lambda encprivkey:
8282+            self.failUnlessEqual(self.encprivkey, encprivkey))
8283+
8284+        d.addCallback(lambda ignored:
8285+            mr.get_blockhashes())
8286+        d.addCallback(lambda blockhashes:
8287+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
8288+
8289+        d.addCallback(lambda ignored:
8290+            mr.get_sharehashes())
8291+        d.addCallback(lambda sharehashes:
8292+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
8293+
8294+        d.addCallback(lambda ignored:
8295+            mr.get_signature())
8296+        d.addCallback(lambda signature:
8297+            self.failUnlessEqual(signature, self.signature))
8298+
8299+        d.addCallback(lambda ignored:
8300+            mr.get_verification_key())
8301+        d.addCallback(lambda verification_key:
8302+            self.failUnlessEqual(verification_key, self.verification_key))
8303+
8304+        d.addCallback(lambda ignored:
8305+            mr.get_seqnum())
8306+        d.addCallback(lambda seqnum:
8307+            self.failUnlessEqual(seqnum, 0))
8308+
8309+        d.addCallback(lambda ignored:
8310+            mr.get_root_hash())
8311+        d.addCallback(lambda root_hash:
8312+            self.failUnlessEqual(self.root_hash, root_hash))
8313+
8314+        d.addCallback(lambda ignored:
8315+            mr.get_encoding_parameters())
8316+        def _check_encoding_parameters((k, n, segsize, datalen)):
8317+            self.failUnlessEqual(k, 3)
8318+            self.failUnlessEqual(n, 10)
8319+            self.failUnlessEqual(segsize, 6)
8320+            self.failUnlessEqual(datalen, 36)
8321+        d.addCallback(_check_encoding_parameters)
8322+
8323+        d.addCallback(lambda ignored:
8324+            mr.get_checkstring())
8325+        d.addCallback(lambda checkstring:
8326+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
8327+        return d
8328+
8329+
8330+    def test_is_sdmf(self):
8331+        # The MDMFSlotReadProxy should also know how to read SDMF files,
8332+        # since it will encounter them on the grid. Callers use the
8333+        # is_sdmf method to test this.
8334+        self.write_sdmf_share_to_server("si1")
8335+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8336+        d = mr.is_sdmf()
8337+        d.addCallback(lambda issdmf:
8338+            self.failUnless(issdmf))
8339+        return d
8340+
8341+
8342+    def test_reads_sdmf(self):
8343+        # The slot read proxy should, naturally, know how to tell us
8344+        # about data in the SDMF format
8345+        self.write_sdmf_share_to_server("si1")
8346+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8347+        d = defer.succeed(None)
8348+        d.addCallback(lambda ignored:
8349+            mr.is_sdmf())
8350+        d.addCallback(lambda issdmf:
8351+            self.failUnless(issdmf))
8352+
8353+        # What do we need to read?
8354+        #  - The sharedata
8355+        #  - The salt
8356+        d.addCallback(lambda ignored:
8357+            mr.get_block_and_salt(0))
8358+        def _check_block_and_salt(results):
8359+            block, salt = results
8360+            # Our original file is 36 bytes long, so each share is 12
8361+            # bytes in size. The share is composed entirely of the
8362+            # letter a. self.block contains two a's, so 6 * self.block
8363+            # is what we are looking for.
8364+            self.failUnlessEqual(block, self.block * 6)
8365+            self.failUnlessEqual(salt, self.salt)
8366+        d.addCallback(_check_block_and_salt)
8367+
8368+        #  - The blockhashes
8369+        d.addCallback(lambda ignored:
8370+            mr.get_blockhashes())
8371+        d.addCallback(lambda blockhashes:
8372+            self.failUnlessEqual(self.block_hash_tree,
8373+                                 blockhashes,
8374+                                 blockhashes))
8375+        #  - The sharehashes
8376+        d.addCallback(lambda ignored:
8377+            mr.get_sharehashes())
8378+        d.addCallback(lambda sharehashes:
8379+            self.failUnlessEqual(self.share_hash_chain,
8380+                                 sharehashes))
8381+        #  - The keys
8382+        d.addCallback(lambda ignored:
8383+            mr.get_encprivkey())
8384+        d.addCallback(lambda encprivkey:
8385+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
8386+        d.addCallback(lambda ignored:
8387+            mr.get_verification_key())
8388+        d.addCallback(lambda verification_key:
8389+            self.failUnlessEqual(verification_key,
8390+                                 self.verification_key,
8391+                                 verification_key))
8392+        #  - The signature
8393+        d.addCallback(lambda ignored:
8394+            mr.get_signature())
8395+        d.addCallback(lambda signature:
8396+            self.failUnlessEqual(signature, self.signature, signature))
8397+
8398+        #  - The sequence number
8399+        d.addCallback(lambda ignored:
8400+            mr.get_seqnum())
8401+        d.addCallback(lambda seqnum:
8402+            self.failUnlessEqual(seqnum, 0, seqnum))
8403+
8404+        #  - The root hash
8405+        d.addCallback(lambda ignored:
8406+            mr.get_root_hash())
8407+        d.addCallback(lambda root_hash:
8408+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
8409+        return d
8410+
8411+
8412+    def test_only_reads_one_segment_sdmf(self):
8413+        # SDMF shares have only one segment, so it doesn't make sense to
8414+        # read more segments than that. The reader should know this and
8415+        # complain if we try to do that.
8416+        self.write_sdmf_share_to_server("si1")
8417+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8418+        d = defer.succeed(None)
8419+        d.addCallback(lambda ignored:
8420+            mr.is_sdmf())
8421+        d.addCallback(lambda issdmf:
8422+            self.failUnless(issdmf))
8423+        d.addCallback(lambda ignored:
8424+            self.shouldFail(LayoutInvalid, "test bad segment",
8425+                            None,
8426+                            mr.get_block_and_salt, 1))
8427+        return d
8428+
8429+
8430+    def test_read_with_prefetched_mdmf_data(self):
8431+        # The MDMFSlotReadProxy will prefill certain fields if you pass
8432+        # it data that you have already fetched. This is useful for
8433+        # cases like the Servermap, which prefetches ~2KiB of data while
8434+        # finding out which shares are on the remote peer so that it
8435+        # doesn't waste round trips.
8436+        mdmf_data = self.build_test_mdmf_share()
8437+        self.write_test_share_to_server("si1")
8438+        def _make_mr(ignored, length):
8439+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
8440+            return mr
8441+
8442+        d = defer.succeed(None)
8443+        # This should be enough to fill in both the encoding parameters
8444+        # and the table of offsets, which will complete the version
8445+        # information tuple.
8446+        d.addCallback(_make_mr, 107)
8447+        d.addCallback(lambda mr:
8448+            mr.get_verinfo())
8449+        def _check_verinfo(verinfo):
8450+            self.failUnless(verinfo)
8451+            self.failUnlessEqual(len(verinfo), 9)
8452+            (seqnum,
8453+             root_hash,
8454+             salt_hash,
8455+             segsize,
8456+             datalen,
8457+             k,
8458+             n,
8459+             prefix,
8460+             offsets) = verinfo
8461+            self.failUnlessEqual(seqnum, 0)
8462+            self.failUnlessEqual(root_hash, self.root_hash)
8463+            self.failUnlessEqual(segsize, 6)
8464+            self.failUnlessEqual(datalen, 36)
8465+            self.failUnlessEqual(k, 3)
8466+            self.failUnlessEqual(n, 10)
8467+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
8468+                                          1,
8469+                                          seqnum,
8470+                                          root_hash,
8471+                                          k,
8472+                                          n,
8473+                                          segsize,
8474+                                          datalen)
8475+            self.failUnlessEqual(expected_prefix, prefix)
8476+            self.failUnlessEqual(self.rref.read_count, 0)
8477+        d.addCallback(_check_verinfo)
8478+        # This is not enough data to read a block and a share, so the
8479+        # wrapper should attempt to read this from the remote server.
8480+        d.addCallback(_make_mr, 107)
8481+        d.addCallback(lambda mr:
8482+            mr.get_block_and_salt(0))
8483+        def _check_block_and_salt((block, salt)):
8484+            self.failUnlessEqual(block, self.block)
8485+            self.failUnlessEqual(salt, self.salt)
8486+            self.failUnlessEqual(self.rref.read_count, 1)
8487+        # This should be enough data to read one block.
8488+        d.addCallback(_make_mr, 249)
8489+        d.addCallback(lambda mr:
8490+            mr.get_block_and_salt(0))
8491+        d.addCallback(_check_block_and_salt)
8492+        return d
8493+
8494+
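The prefetch tests above verify that the read proxy answers from the bytes it was handed at construction when they suffice, and only falls back to the server (incrementing `read_count`) otherwise. The caching idea in miniature (a sketch with our own names, not the real proxy):

```python
class CachingReader:
    """Serve reads from a prefetched prefix of the share when
    possible; otherwise fall back to the full "remote" copy and
    count the round trip, like the test's read_count checks."""
    def __init__(self, remote_data, prefetched):
        self._remote = remote_data   # the full share, as the server holds it
        self._cache = prefetched     # prefix handed to us up front
        self.read_count = 0          # remote round trips so far

    def read(self, offset, length):
        end = offset + length
        if end <= len(self._cache):
            return self._cache[offset:end]  # satisfied locally
        self.read_count += 1                # cache miss: hit the "server"
        return self._remote[offset:end]

share = b"x" * 500
r = CachingReader(share, share[:107])
r.read(0, 100)   # within the 107-byte prefix: no round trip
r.read(0, 249)   # past the prefix: one remote read
```

This mirrors the test's pattern of handing `mdmf_data[:107]` to the proxy and then checking `self.rref.read_count`.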
8495+    def test_read_with_prefetched_sdmf_data(self):
8496+        sdmf_data = self.build_test_sdmf_share()
8497+        self.write_sdmf_share_to_server("si1")
8498+        def _make_mr(ignored, length):
8499+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
8500+            return mr
8501+
8502+        d = defer.succeed(None)
8503+        # This should be enough to get us the encoding parameters,
8504+        # offset table, and everything else we need to build a verinfo
8505+        # string.
8506+        d.addCallback(_make_mr, 107)
8507+        d.addCallback(lambda mr:
8508+            mr.get_verinfo())
8509+        def _check_verinfo(verinfo):
8510+            self.failUnless(verinfo)
8511+            self.failUnlessEqual(len(verinfo), 9)
8512+            (seqnum,
8513+             root_hash,
8514+             salt,
8515+             segsize,
8516+             datalen,
8517+             k,
8518+             n,
8519+             prefix,
8520+             offsets) = verinfo
8521+            self.failUnlessEqual(seqnum, 0)
8522+            self.failUnlessEqual(root_hash, self.root_hash)
8523+            self.failUnlessEqual(salt, self.salt)
8524+            self.failUnlessEqual(segsize, 36)
8525+            self.failUnlessEqual(datalen, 36)
8526+            self.failUnlessEqual(k, 3)
8527+            self.failUnlessEqual(n, 10)
8528+            expected_prefix = struct.pack(SIGNED_PREFIX,
8529+                                          0,
8530+                                          seqnum,
8531+                                          root_hash,
8532+                                          salt,
8533+                                          k,
8534+                                          n,
8535+                                          segsize,
8536+                                          datalen)
8537+            self.failUnlessEqual(expected_prefix, prefix)
8538+            self.failUnlessEqual(self.rref.read_count, 0)
8539+        d.addCallback(_check_verinfo)
8540+        # This shouldn't be enough to read any share data.
8541+        d.addCallback(_make_mr, 107)
8542+        d.addCallback(lambda mr:
8543+            mr.get_block_and_salt(0))
8544+        def _check_block_and_salt((block, salt)):
8545+            self.failUnlessEqual(block, self.block * 6)
8546+            self.failUnlessEqual(salt, self.salt)
8547+            # TODO: Fix the read routine so that it reads only the data
8548+            #       that it has cached if it can't read all of it.
8549+            self.failUnlessEqual(self.rref.read_count, 2)
8550+
8551+        # This should be enough to read share data.
8552+        d.addCallback(_make_mr, self.offsets['share_data'])
8553+        d.addCallback(lambda mr:
8554+            mr.get_block_and_salt(0))
8555+        d.addCallback(_check_block_and_salt)
8556+        return d
8557+
8558+
8559+    def test_read_with_empty_mdmf_file(self):
8560+        # Some tests upload a file with no contents to test things
8561+        # unrelated to the actual handling of the content of the file.
8562+        # The reader should behave intelligently in these cases.
8563+        self.write_test_share_to_server("si1", empty=True)
8564+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8565+        # We should be able to get the encoding parameters, and they
8566+        # should be correct.
8567+        d = defer.succeed(None)
8568+        d.addCallback(lambda ignored:
8569+            mr.get_encoding_parameters())
8570+        def _check_encoding_parameters(params):
8571+            self.failUnlessEqual(len(params), 4)
8572+            k, n, segsize, datalen = params
8573+            self.failUnlessEqual(k, 3)
8574+            self.failUnlessEqual(n, 10)
8575+            self.failUnlessEqual(segsize, 0)
8576+            self.failUnlessEqual(datalen, 0)
8577+        d.addCallback(_check_encoding_parameters)
8578+
8579+        # We should not be able to fetch a block, since there are no
8580+        # blocks to fetch
8581+        d.addCallback(lambda ignored:
8582+            self.shouldFail(LayoutInvalid, "get block on empty file",
8583+                            None,
8584+                            mr.get_block_and_salt, 0))
8585+        return d
8586+
8587+
8588+    def test_read_with_empty_sdmf_file(self):
8589+        self.write_sdmf_share_to_server("si1", empty=True)
8590+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8591+        # We should be able to get the encoding parameters, and they
8592+        # should be correct
8593+        d = defer.succeed(None)
8594+        d.addCallback(lambda ignored:
8595+            mr.get_encoding_parameters())
8596+        def _check_encoding_parameters(params):
8597+            self.failUnlessEqual(len(params), 4)
8598+            k, n, segsize, datalen = params
8599+            self.failUnlessEqual(k, 3)
8600+            self.failUnlessEqual(n, 10)
8601+            self.failUnlessEqual(segsize, 0)
8602+            self.failUnlessEqual(datalen, 0)
8603+        d.addCallback(_check_encoding_parameters)
8604+
8605+        # It does not make sense to get a block in this format, so we
8606+        # should not be able to.
8607+        d.addCallback(lambda ignored:
8608+            self.shouldFail(LayoutInvalid, "get block on an empty file",
8609+                            None,
8610+                            mr.get_block_and_salt, 0))
8611+        return d
8612+
8613+
8614+    def test_verinfo_with_sdmf_file(self):
8615+        self.write_sdmf_share_to_server("si1")
8616+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8617+        # We should be able to get the version information.
8618+        d = defer.succeed(None)
8619+        d.addCallback(lambda ignored:
8620+            mr.get_verinfo())
8621+        def _check_verinfo(verinfo):
8622+            self.failUnless(verinfo)
8623+            self.failUnlessEqual(len(verinfo), 9)
8624+            (seqnum,
8625+             root_hash,
8626+             salt,
8627+             segsize,
8628+             datalen,
8629+             k,
8630+             n,
8631+             prefix,
8632+             offsets) = verinfo
8633+            self.failUnlessEqual(seqnum, 0)
8634+            self.failUnlessEqual(root_hash, self.root_hash)
8635+            self.failUnlessEqual(salt, self.salt)
8636+            self.failUnlessEqual(segsize, 36)
8637+            self.failUnlessEqual(datalen, 36)
8638+            self.failUnlessEqual(k, 3)
8639+            self.failUnlessEqual(n, 10)
8640+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
8641+                                          0,
8642+                                          seqnum,
8643+                                          root_hash,
8644+                                          salt,
8645+                                          k,
8646+                                          n,
8647+                                          segsize,
8648+                                          datalen)
8649+            self.failUnlessEqual(prefix, expected_prefix)
8650+            self.failUnlessEqual(offsets, self.offsets)
8651+        d.addCallback(_check_verinfo)
8652+        return d
8653+
8654+
8655+    def test_verinfo_with_mdmf_file(self):
8656+        self.write_test_share_to_server("si1")
8657+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8658+        d = defer.succeed(None)
8659+        d.addCallback(lambda ignored:
8660+            mr.get_verinfo())
8661+        def _check_verinfo(verinfo):
8662+            self.failUnless(verinfo)
8663+            self.failUnlessEqual(len(verinfo), 9)
8664+            (seqnum,
8665+             root_hash,
8666+             IV,
8667+             segsize,
8668+             datalen,
8669+             k,
8670+             n,
8671+             prefix,
8672+             offsets) = verinfo
8673+            self.failUnlessEqual(seqnum, 0)
8674+            self.failUnlessEqual(root_hash, self.root_hash)
8675+            self.failIf(IV)
8676+            self.failUnlessEqual(segsize, 6)
8677+            self.failUnlessEqual(datalen, 36)
8678+            self.failUnlessEqual(k, 3)
8679+            self.failUnlessEqual(n, 10)
8680+            expected_prefix = struct.pack(">BQ32s BBQQ",
8681+                                          1,
8682+                                          seqnum,
8683+                                          root_hash,
8684+                                          k,
8685+                                          n,
8686+                                          segsize,
8687+                                          datalen)
8688+            self.failUnlessEqual(prefix, expected_prefix)
8689+            self.failUnlessEqual(offsets, self.offsets)
8690+        d.addCallback(_check_verinfo)
8691+        return d
8692+
8693+
8694+    def test_reader_queue(self):
8695+        self.write_test_share_to_server('si1')
8696+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8697+        d1 = mr.get_block_and_salt(0, queue=True)
8698+        d2 = mr.get_blockhashes(queue=True)
8699+        d3 = mr.get_sharehashes(queue=True)
8700+        d4 = mr.get_signature(queue=True)
8701+        d5 = mr.get_verification_key(queue=True)
8702+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
8703+        mr.flush()
8704+        def _print(results):
8705+            self.failUnlessEqual(len(results), 5)
8706+            # We have one read for version information and offsets, and
8707+            # one for everything else.
8708+            self.failUnlessEqual(self.rref.read_count, 2)
8709+            block, salt = results[0][1] # each result is a (success, value)
8710+                                        # pair; the boolean says whether
8711+                                        # the operation worked.
8712+            self.failUnlessEqual(self.block, block)
8713+            self.failUnlessEqual(self.salt, salt)
8714+
8715+            blockhashes = results[1][1]
8716+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
8717+
8718+            sharehashes = results[2][1]
8719+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
8720+
8721+            signature = results[3][1]
8722+            self.failUnlessEqual(self.signature, signature)
8723+
8724+            verification_key = results[4][1]
8725+            self.failUnlessEqual(self.verification_key, verification_key)
8726+        dl.addCallback(_print)
8727+        return dl
8728+
8729+
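test_reader_queue shows five logical fetches costing only two remote reads, because `queue=True` defers each request until `flush()` coalesces them. The batching idea can be sketched like this (synchronous, hypothetical names; the real proxy returns Deferreds):

```python
class BatchingReader:
    """Queue (offset, length) requests and satisfy all of them with
    a single remote read on flush, as the queue=True tests expect."""
    def __init__(self, data):
        self._data = data
        self._queue = []     # pending (offset, length, sink) triples
        self.read_count = 0  # number of remote read calls

    def get(self, offset, length, queue=False):
        result = []                  # stands in for a Deferred
        self._queue.append((offset, length, result))
        if not queue:
            self.flush()
        return result

    def flush(self):
        if not self._queue:
            return
        self.read_count += 1         # one readv covers every queued span
        for offset, length, sink in self._queue:
            sink.append(self._data[offset:offset + length])
        self._queue = []

br = BatchingReader(b"abcdefgh")
r1 = br.get(0, 2, queue=True)
r2 = br.get(4, 2, queue=True)
br.flush()   # both requests answered by a single "round trip"
```

As in the test, the payoff is that batching adds no extra round trips beyond the one readv.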
8730+    def test_sdmf_writer(self):
8731+        # Go through the motions of writing an SDMF share to the storage
8732+        # server. Then read the storage server to see that the share got
8733+        # written in the way that we think it should have.
8734+
8735+        # We do this first so that the necessary instance variables get
8736+        # set the way we want them for the tests below.
8737+        data = self.build_test_sdmf_share()
8738+        sdmfr = SDMFSlotWriteProxy(0,
8739+                                   self.rref,
8740+                                   "si1",
8741+                                   self.secrets,
8742+                                   0, 3, 10, 36, 36)
8743+        # Put the block and salt.
8744+        sdmfr.put_block(self.blockdata, 0, self.salt)
8745+
8746+        # Put the encprivkey
8747+        sdmfr.put_encprivkey(self.encprivkey)
8748+
8749+        # Put the block and share hash chains
8750+        sdmfr.put_blockhashes(self.block_hash_tree)
8751+        sdmfr.put_sharehashes(self.share_hash_chain)
8752+        sdmfr.put_root_hash(self.root_hash)
8753+
8754+        # Put the signature
8755+        sdmfr.put_signature(self.signature)
8756+
8757+        # Put the verification key
8758+        sdmfr.put_verification_key(self.verification_key)
8759+
8760+        # Now check to make sure that nothing has been written yet.
8761+        self.failUnlessEqual(self.rref.write_count, 0)
8762+
8763+        # Now finish publishing
8764+        d = sdmfr.finish_publishing()
8765+        def _then(ignored):
8766+            self.failUnlessEqual(self.rref.write_count, 1)
8767+            read = self.ss.remote_slot_readv
8768+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
8769+                                 {0: [data]})
8770+        d.addCallback(_then)
8771+        return d
8772+
8773+
8774+    def test_sdmf_writer_preexisting_share(self):
8775+        data = self.build_test_sdmf_share()
8776+        self.write_sdmf_share_to_server("si1")
8777+
8778+        # Now there is a share on the storage server. To successfully
8779+        # write, we need to set the checkstring correctly. When we
8780+        # don't, no write should occur.
8781+        sdmfw = SDMFSlotWriteProxy(0,
8782+                                   self.rref,
8783+                                   "si1",
8784+                                   self.secrets,
8785+                                   1, 3, 10, 36, 36)
8786+        sdmfw.put_block(self.blockdata, 0, self.salt)
8787+
8788+        # Put the encprivkey
8789+        sdmfw.put_encprivkey(self.encprivkey)
8790+
8791+        # Put the block and share hash chains
8792+        sdmfw.put_blockhashes(self.block_hash_tree)
8793+        sdmfw.put_sharehashes(self.share_hash_chain)
8794+
8795+        # Put the root hash
8796+        sdmfw.put_root_hash(self.root_hash)
8797+
8798+        # Put the signature
8799+        sdmfw.put_signature(self.signature)
8800+
8801+        # Put the verification key
8802+        sdmfw.put_verification_key(self.verification_key)
8803+
8804+        # We shouldn't have a checkstring yet
8805+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
8806+
8807+        d = sdmfw.finish_publishing()
8808+        def _then(results):
8809+            self.failIf(results[0])
8810+            # this is the correct checkstring
8811+            self._expected_checkstring = results[1][0][0]
8812+            return self._expected_checkstring
8813+
8814+        d.addCallback(_then)
8815+        d.addCallback(sdmfw.set_checkstring)
8816+        d.addCallback(lambda ignored:
8817+            sdmfw.get_checkstring())
8818+        d.addCallback(lambda checkstring:
8819+            self.failUnlessEqual(checkstring, self._expected_checkstring))
8820+        d.addCallback(lambda ignored:
8821+            sdmfw.finish_publishing())
8822+        def _then_again(results):
8823+            self.failUnless(results[0])
8824+            read = self.ss.remote_slot_readv
8825+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
8826+                                 {0: [struct.pack(">Q", 1)]})
8827+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
8828+                                 {0: [data[9:]]})
8829+        d.addCallback(_then_again)
8830+        return d
8831+
8832+
8833 class Stats(unittest.TestCase):
8834 
8835     def setUp(self):
8836}
8837[immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
8838Kevan Carstensen <kevan@isnotajoke.com>**20100810000619
8839 Ignore-this: 93e536c0f8efb705310f13ff64621527
8840] {
8841hunk ./src/allmydata/immutable/filenode.py 8
8842 now = time.time
8843 from zope.interface import implements, Interface
8844 from twisted.internet import defer
8845-from twisted.internet.interfaces import IConsumer
8846 
8847hunk ./src/allmydata/immutable/filenode.py 9
8848-from allmydata.interfaces import IImmutableFileNode, IUploadResults
8849 from allmydata import uri
8850hunk ./src/allmydata/immutable/filenode.py 10
8851+from twisted.internet.interfaces import IConsumer
8852+from twisted.protocols import basic
8853+from foolscap.api import eventually
8854+from allmydata.interfaces import IImmutableFileNode, ICheckable, \
8855+     IDownloadTarget, IUploadResults
8856+from allmydata.util import dictutil, log, base32, consumer
8857+from allmydata.immutable.checker import Checker
8858 from allmydata.check_results import CheckResults, CheckAndRepairResults
8859 from allmydata.util.dictutil import DictOfSets
8860 from pycryptopp.cipher.aes import AES
8861hunk ./src/allmydata/immutable/filenode.py 296
8862         return self._cnode.check_and_repair(monitor, verify, add_lease)
8863     def check(self, monitor, verify=False, add_lease=False):
8864         return self._cnode.check(monitor, verify, add_lease)
8865+
8866+    def get_best_readable_version(self):
8867+        """
8868+        Return an IReadable of the best version of this file. Since
8869+        immutable files can have only one version, we just return the
8870+        current filenode.
8871+        """
8872+        return defer.succeed(self)
8873+
8874+
8875+    def download_best_version(self):
8876+        """
8877+        Download the best version of this file, returning its contents
8878+        as a bytestring. Since there is only one version of an immutable
8879+        file, we download and return the contents of this file.
8880+        """
8881+        d = consumer.download_to_data(self)
8882+        return d
8883+
8884+    # for an immutable file, download_to_data (specified in IReadable)
8885+    # is the same as download_best_version (specified in IFileNode). For
8886+    # mutable files, the difference is more meaningful, since they can
8887+    # have multiple versions.
8888+    download_to_data = download_best_version
8889+
8890+
8891+    # get_size() (IReadable), get_current_size() (IFilesystemNode), and
8892+    # get_size_of_best_version(IFileNode) are all the same for immutable
8893+    # files.
8894+    get_size_of_best_version = get_current_size
8895}
8896[immutable/literal.py: implement the same interfaces as other filenodes
8897Kevan Carstensen <kevan@isnotajoke.com>**20100810000633
8898 Ignore-this: b50dd5df2d34ecd6477b8499a27aef13
8899] hunk ./src/allmydata/immutable/literal.py 106
8900         d.addCallback(lambda lastSent: consumer)
8901         return d
8902 
8903+    # IReadable, IFileNode, IFilesystemNode
8904+    def get_best_readable_version(self):
8905+        return defer.succeed(self)
8906+
8907+
8908+    def download_best_version(self):
8909+        return defer.succeed(self.u.data)
8910+
8911+
8912+    download_to_data = download_best_version
8913+    get_size_of_best_version = get_current_size
8914+
8915[tests:
8916Kevan Carstensen <kevan@isnotajoke.com>**20100810000709
8917 Ignore-this: 34cd02f5717e192f2e648c66d856fd2e
8918 
8919     - A lot of existing tests relied on aspects of the mutable file
8920       implementation that were changed. This patch updates those tests
8921       to work with the changes.
8922     - This patch also adds tests for new features.
8923] {
8924hunk ./src/allmydata/test/common.py 12
8925 from allmydata import uri, dirnode, client
8926 from allmydata.introducer.server import IntroducerNode
8927 from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
8928-     FileTooLargeError, NotEnoughSharesError, ICheckable
8929+     FileTooLargeError, NotEnoughSharesError, ICheckable, \
8930+     IMutableUploadable
8931 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
8932      DeepCheckResults, DeepCheckAndRepairResults
8933 from allmydata.mutable.common import CorruptShareError
8934hunk ./src/allmydata/test/common.py 18
8935 from allmydata.mutable.layout import unpack_header
8936+from allmydata.mutable.publish import MutableData
8937 from allmydata.storage.server import storage_index_to_dir
8938 from allmydata.storage.mutable import MutableShareFile
8939 from allmydata.util import hashutil, log, fileutil, pollmixin
8940hunk ./src/allmydata/test/common.py 152
8941         consumer.write(data[start:end])
8942         return consumer
8943 
8944+
8945+    def get_best_readable_version(self):
8946+        return defer.succeed(self)
8947+
8948+
8949+    def download_to_data(self):
8950+        return download_to_data(self)
8951+
8952+
8953+    # Bind the alias after the method definition so it refers to the
8954+    # class's download_to_data, not any module-level name.
8955+    download_best_version = download_to_data
8954+
8955+
8956+    def get_size_of_best_version(self):
8957+        return defer.succeed(self.get_size())
8958+
8959+
8960 def make_chk_file_cap(size):
8961     return uri.CHKFileURI(key=os.urandom(16),
8962                           uri_extension_hash=os.urandom(32),
8963hunk ./src/allmydata/test/common.py 198
8964         self.init_from_cap(make_mutable_file_cap())
8965     def create(self, contents, key_generator=None, keysize=None):
8966         initial_contents = self._get_initial_contents(contents)
8967-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
8968-            raise FileTooLargeError("SDMF is limited to one segment, and "
8969-                                    "%d > %d" % (len(initial_contents),
8970-                                                 self.MUTABLE_SIZELIMIT))
8971-        self.all_contents[self.storage_index] = initial_contents
8972+        data = initial_contents.read(initial_contents.get_size())
8973+        data = "".join(data)
8974+        self.all_contents[self.storage_index] = data
8975         return defer.succeed(self)
8976     def _get_initial_contents(self, contents):
8977hunk ./src/allmydata/test/common.py 203
8978-        if isinstance(contents, str):
8979-            return contents
8980         if contents is None:
8981hunk ./src/allmydata/test/common.py 204
8982-            return ""
8983+            return MutableData("")
8984+
8985+        if IMutableUploadable.providedBy(contents):
8986+            return contents
8987+
8988         assert callable(contents), "%s should be callable, not %s" % \
8989                (contents, type(contents))
8990         return contents(self)
8991hunk ./src/allmydata/test/common.py 314
8992         return d
8993 
8994     def download_best_version(self):
8995+        return defer.maybeDeferred(self._download_best_version)
8996+
8997+
8998+    def _download_best_version(self, ignored=None):
8999         if isinstance(self.my_uri, uri.LiteralFileURI):
9000hunk ./src/allmydata/test/common.py 319
9001-            return defer.succeed(self.my_uri.data)
9002+            return self.my_uri.data
9003         if self.storage_index not in self.all_contents:
9004hunk ./src/allmydata/test/common.py 321
9005-            return defer.fail(NotEnoughSharesError(None, 0, 3))
9006-        return defer.succeed(self.all_contents[self.storage_index])
9007+            raise NotEnoughSharesError(None, 0, 3)
9008+        return self.all_contents[self.storage_index]
9009+
9010 
9011     def overwrite(self, new_contents):
9012hunk ./src/allmydata/test/common.py 326
9013-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
9014-            raise FileTooLargeError("SDMF is limited to one segment, and "
9015-                                    "%d > %d" % (len(new_contents),
9016-                                                 self.MUTABLE_SIZELIMIT))
9017         assert not self.is_readonly()
9018hunk ./src/allmydata/test/common.py 327
9019-        self.all_contents[self.storage_index] = new_contents
9020+        new_data = new_contents.read(new_contents.get_size())
9021+        new_data = "".join(new_data)
9022+        self.all_contents[self.storage_index] = new_data
9023         return defer.succeed(None)
9024     def modify(self, modifier):
9025         # this does not implement FileTooLargeError, but the real one does
9026hunk ./src/allmydata/test/common.py 337
9027     def _modify(self, modifier):
9028         assert not self.is_readonly()
9029         old_contents = self.all_contents[self.storage_index]
9030-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
9031+        new_data = modifier(old_contents, None, True)
9032+        self.all_contents[self.storage_index] = new_data
9033         return None
9034 
9035hunk ./src/allmydata/test/common.py 341
9036+    # As actually implemented, MutableFilenode and MutableFileVersion
9037+    # are distinct. However, nothing in the webapi uses (yet) that
9038+    # distinction -- it just uses the unified download interface
9039+    # provided by get_best_readable_version and read. When we start
9040+    # doing cooler things like LDMF, we will want to revise this code to
9041+    # be less simplistic.
9042+    def get_best_readable_version(self):
9043+        return defer.succeed(self)
9044+
9045+
9046+    def get_best_mutable_version(self):
9047+        return defer.succeed(self)
9048+
9049+    # Ditto for this, which is an implementation of IWritable.
9050+    # XXX: Declare that the same is implemented.
9051+    def update(self, data, offset):
9052+        assert not self.is_readonly()
9053+        def modifier(old, servermap, first_time):
9054+            new = old[:offset] + "".join(data.read(data.get_size()))
9055+            new += old[len(new):]
9056+            return new
9057+        return self.modify(modifier)
9058+
9059+
9060+    def read(self, consumer, offset=0, size=None):
9061+        data = self._download_best_version()
9062+        end = (offset + size) if size is not None else None
9063+        data = data[offset:end]
9064+        consumer.write(data)
9065+        return defer.succeed(consumer)
9066+
9067+
9068 def make_mutable_file_cap():
9069     return uri.WriteableSSKFileURI(writekey=os.urandom(16),
9070                                    fingerprint=os.urandom(32))
9071hunk ./src/allmydata/test/test_checker.py 11
9072 from allmydata.test.no_network import GridTestMixin
9073 from allmydata.immutable.upload import Data
9074 from allmydata.test.common_web import WebRenderingMixin
9075+from allmydata.mutable.publish import MutableData
9076 
9077 class FakeClient:
9078     def get_storage_broker(self):
9079hunk ./src/allmydata/test/test_checker.py 291
9080         def _stash_immutable(ur):
9081             self.imm = c0.create_node_from_uri(ur.uri)
9082         d.addCallback(_stash_immutable)
9083-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
9084+        d.addCallback(lambda ign:
9085+            c0.create_mutable_file(MutableData("contents")))
9086         def _stash_mutable(node):
9087             self.mut = node
9088         d.addCallback(_stash_mutable)
9089hunk ./src/allmydata/test/test_cli.py 11
9090 from allmydata.util import fileutil, hashutil, base32
9091 from allmydata import uri
9092 from allmydata.immutable import upload
9093+from allmydata.mutable.publish import MutableData
9094 from allmydata.dirnode import normalize
9095 
9096 # Test that the scripts can be imported -- although the actual tests of their
9097hunk ./src/allmydata/test/test_cli.py 644
9098 
9099         d = self.do_cli("create-alias", etudes_arg)
9100         def _check_create_unicode((rc, out, err)):
9101-            self.failUnlessReallyEqual(rc, 0)
9102+            self.failUnlessReallyEqual(rc, 0)
9103             self.failUnlessReallyEqual(err, "")
9104             self.failUnlessIn("Alias %s created" % quote_output(u"\u00E9tudes"), out)
9105 
9106hunk ./src/allmydata/test/test_cli.py 1975
9107         self.set_up_grid()
9108         c0 = self.g.clients[0]
9109         DATA = "data" * 100
9110-        d = c0.create_mutable_file(DATA)
9111+        DATA_uploadable = MutableData(DATA)
9112+        d = c0.create_mutable_file(DATA_uploadable)
9113         def _stash_uri(n):
9114             self.uri = n.get_uri()
9115         d.addCallback(_stash_uri)
9116hunk ./src/allmydata/test/test_cli.py 2077
9117                                            upload.Data("literal",
9118                                                         convergence="")))
9119         d.addCallback(_stash_uri, "small")
9120-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
9121+        d.addCallback(lambda ign:
9122+            c0.create_mutable_file(MutableData(DATA+"1")))
9123         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9124         d.addCallback(_stash_uri, "mutable")
9125 
9126hunk ./src/allmydata/test/test_cli.py 2096
9127         # root/small
9128         # root/mutable
9129 
9130+        # We haven't broken anything yet, so this should all be healthy.
9131         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
9132                                               self.rooturi))
9133         def _check2((rc, out, err)):
9134hunk ./src/allmydata/test/test_cli.py 2111
9135                             in lines, out)
9136         d.addCallback(_check2)
9137 
9138+        # Similarly, all of these results should be as we expect them to
9139+        # be for a healthy file layout.
9140         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
9141         def _check_stats((rc, out, err)):
9142             self.failUnlessReallyEqual(err, "")
9143hunk ./src/allmydata/test/test_cli.py 2128
9144             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
9145         d.addCallback(_check_stats)
9146 
9147+        # Now we break things.
9148         def _clobber_shares(ignored):
9149             shares = self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"])
9150             self.failUnlessReallyEqual(len(shares), 10)
9151hunk ./src/allmydata/test/test_cli.py 2153
9152 
9153         d.addCallback(lambda ign:
9154                       self.do_cli("deep-check", "--verbose", self.rooturi))
9155+        # This should reveal the missing share, but not the corrupt
9156+        # share, since we didn't tell the deep check operation to also
9157+        # verify.
9158         def _check3((rc, out, err)):
9159             self.failUnlessReallyEqual(err, "")
9160             self.failUnlessReallyEqual(rc, 0)
9161hunk ./src/allmydata/test/test_cli.py 2204
9162                                   "--verbose", "--verify", "--repair",
9163                                   self.rooturi))
9164         def _check6((rc, out, err)):
9165+            # We've just repaired the directory. There is no reason for
9166+            # that repair to be unsuccessful.
9167             self.failUnlessReallyEqual(err, "")
9168             self.failUnlessReallyEqual(rc, 0)
9169             lines = out.splitlines()
9170hunk ./src/allmydata/test/test_deepcheck.py 9
9171 from twisted.internet import threads # CLI tests use deferToThread
9172 from allmydata.immutable import upload
9173 from allmydata.mutable.common import UnrecoverableFileError
9174+from allmydata.mutable.publish import MutableData
9175 from allmydata.util import idlib
9176 from allmydata.util import base32
9177 from allmydata.scripts import runner
9178hunk ./src/allmydata/test/test_deepcheck.py 38
9179         self.basedir = "deepcheck/MutableChecker/good"
9180         self.set_up_grid()
9181         CONTENTS = "a little bit of data"
9182-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9183+        CONTENTS_uploadable = MutableData(CONTENTS)
9184+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9185         def _created(node):
9186             self.node = node
9187             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9188hunk ./src/allmydata/test/test_deepcheck.py 61
9189         self.basedir = "deepcheck/MutableChecker/corrupt"
9190         self.set_up_grid()
9191         CONTENTS = "a little bit of data"
9192-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9193+        CONTENTS_uploadable = MutableData(CONTENTS)
9194+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9195         def _stash_and_corrupt(node):
9196             self.node = node
9197             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9198hunk ./src/allmydata/test/test_deepcheck.py 99
9199         self.basedir = "deepcheck/MutableChecker/delete_share"
9200         self.set_up_grid()
9201         CONTENTS = "a little bit of data"
9202-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9203+        CONTENTS_uploadable = MutableData(CONTENTS)
9204+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9205         def _stash_and_delete(node):
9206             self.node = node
9207             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9208hunk ./src/allmydata/test/test_deepcheck.py 223
9209             self.root = n
9210             self.root_uri = n.get_uri()
9211         d.addCallback(_created_root)
9212-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
9213+        d.addCallback(lambda ign:
9214+            c0.create_mutable_file(MutableData("mutable file contents")))
9215         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
9216         def _created_mutable(n):
9217             self.mutable = n
9218hunk ./src/allmydata/test/test_deepcheck.py 965
9219     def create_mangled(self, ignored, name):
9220         nodetype, mangletype = name.split("-", 1)
9221         if nodetype == "mutable":
9222-            d = self.g.clients[0].create_mutable_file("mutable file contents")
9223+            mutable_uploadable = MutableData("mutable file contents")
9224+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
9225             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
9226         elif nodetype == "large":
9227             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
9228hunk ./src/allmydata/test/test_dirnode.py 1304
9229     implements(IMutableFileNode)
9230     counter = 0
9231     def __init__(self, initial_contents=""):
9232-        self.data = self._get_initial_contents(initial_contents)
9233+        data = self._get_initial_contents(initial_contents)
9234+        self.data = data.read(data.get_size())
9235+        self.data = "".join(self.data)
9236+
9237         counter = FakeMutableFile.counter
9238         FakeMutableFile.counter += 1
9239         writekey = hashutil.ssk_writekey_hash(str(counter))
9240hunk ./src/allmydata/test/test_dirnode.py 1354
9241         pass
9242 
9243     def modify(self, modifier):
9244-        self.data = modifier(self.data, None, True)
9245+        data = modifier(self.data, None, True)
9246+        self.data = data
9247         return defer.succeed(None)
9248 
9249 class FakeNodeMaker(NodeMaker):
9250hunk ./src/allmydata/test/test_filenode.py 98
9251         def _check_segment(res):
9252             self.failUnlessEqual(res, DATA[1:1+5])
9253         d.addCallback(_check_segment)
9254+        d.addCallback(lambda ignored: fn1.get_best_readable_version())
9255+        d.addCallback(lambda fn2: self.failUnlessEqual(fn1, fn2))
9256+        d.addCallback(lambda ignored:
9257+            fn1.get_size_of_best_version())
9258+        d.addCallback(lambda size:
9259+            self.failUnlessEqual(size, len(DATA)))
9260+        d.addCallback(lambda ignored:
9261+            fn1.download_to_data())
9262+        d.addCallback(lambda data:
9263+            self.failUnlessEqual(data, DATA))
9264+        d.addCallback(lambda ignored:
9265+            fn1.download_best_version())
9266+        d.addCallback(lambda data:
9267+            self.failUnlessEqual(data, DATA))
9268 
9269         return d
9270 
9271hunk ./src/allmydata/test/test_hung_server.py 10
9272 from allmydata.util.consumer import download_to_data
9273 from allmydata.immutable import upload
9274 from allmydata.mutable.common import UnrecoverableFileError
9275+from allmydata.mutable.publish import MutableData
9276 from allmydata.storage.common import storage_index_to_dir
9277 from allmydata.test.no_network import GridTestMixin
9278 from allmydata.test.common import ShouldFailMixin
9279hunk ./src/allmydata/test/test_hung_server.py 108
9280         self.servers = [(id, ss) for (id, ss) in nm.storage_broker.get_all_servers()]
9281 
9282         if mutable:
9283-            d = nm.create_mutable_file(mutable_plaintext)
9284+            uploadable = MutableData(mutable_plaintext)
9285+            d = nm.create_mutable_file(uploadable)
9286             def _uploaded_mutable(node):
9287                 self.uri = node.get_uri()
9288                 self.shares = self.find_uri_shares(self.uri)
9289hunk ./src/allmydata/test/test_immutable.py 4
9290 from allmydata.test import common
9291 from allmydata.interfaces import NotEnoughSharesError
9292 from allmydata.util.consumer import download_to_data
9293-from twisted.internet import defer
9294+from twisted.internet import defer, base
9295 from twisted.trial import unittest
9296 import random
9297 
9298hunk ./src/allmydata/test/test_immutable.py 143
9299         d.addCallback(_after_attempt)
9300         return d
9301 
9302+    def test_download_to_data(self):
9303+        d = self.n.download_to_data()
9304+        d.addCallback(lambda data:
9305+            self.failUnlessEqual(data, common.TEST_DATA))
9306+        return d
9307 
9308hunk ./src/allmydata/test/test_immutable.py 149
9309+
9310+    def test_download_best_version(self):
9311+        d = self.n.download_best_version()
9312+        d.addCallback(lambda data:
9313+            self.failUnlessEqual(data, common.TEST_DATA))
9314+        return d
9315+
9316+
9317+    def test_get_best_readable_version(self):
9318+        d = self.n.get_best_readable_version()
9319+        d.addCallback(lambda n2:
9320+            self.failUnlessEqual(n2, self.n))
9321+        return d
9322+
9323+    def test_get_size_of_best_version(self):
9324+        d = self.n.get_size_of_best_version()
9325+        d.addCallback(lambda size:
9326+            self.failUnlessEqual(size, len(common.TEST_DATA)))
9327+        return d
9328+
9329+
9330 # XXX extend these tests to show bad behavior of various kinds from servers:
9331 # raising exception from each remove_foo() method, for example
9332 
9333hunk ./src/allmydata/test/test_mutable.py 2
9334 
9335-import struct
9336+import struct, os
9337 from cStringIO import StringIO
9338 from twisted.trial import unittest
9339 from twisted.internet import defer, reactor
9340hunk ./src/allmydata/test/test_mutable.py 8
9341 from allmydata import uri, client
9342 from allmydata.nodemaker import NodeMaker
9343-from allmydata.util import base32
9344+from allmydata.util import base32, consumer, mathutil
9345 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
9346      ssk_pubkey_fingerprint_hash
9347hunk ./src/allmydata/test/test_mutable.py 11
9348+from allmydata.util.deferredutil import gatherResults
9349 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
9350hunk ./src/allmydata/test/test_mutable.py 13
9351-     NotEnoughSharesError
9352+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
9353 from allmydata.monitor import Monitor
9354 from allmydata.test.common import ShouldFailMixin
9355 from allmydata.test.no_network import GridTestMixin
9356hunk ./src/allmydata/test/test_mutable.py 27
9357      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
9358      NotEnoughServersError, CorruptShareError
9359 from allmydata.mutable.retrieve import Retrieve
9360-from allmydata.mutable.publish import Publish
9361+from allmydata.mutable.publish import Publish, MutableFileHandle, \
9362+                                      MutableData, \
9363+                                      DEFAULT_MAX_SEGMENT_SIZE
9364 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
9365hunk ./src/allmydata/test/test_mutable.py 31
9366-from allmydata.mutable.layout import unpack_header, unpack_share
9367+from allmydata.mutable.layout import unpack_header, unpack_share, \
9368+                                     MDMFSlotReadProxy
9369 from allmydata.mutable.repairer import MustForceRepairError
9370 
9371 import allmydata.test.common_util as testutil
9372hunk ./src/allmydata/test/test_mutable.py 101
9373         self.storage = storage
9374         self.queries = 0
9375     def callRemote(self, methname, *args, **kwargs):
9376+        self.queries += 1
9377         def _call():
9378             meth = getattr(self, methname)
9379             return meth(*args, **kwargs)
9380hunk ./src/allmydata/test/test_mutable.py 108
9381         d = fireEventually()
9382         d.addCallback(lambda res: _call())
9383         return d
9384+
9385     def callRemoteOnly(self, methname, *args, **kwargs):
9386hunk ./src/allmydata/test/test_mutable.py 110
9387+        self.queries += 1
9388         d = self.callRemote(methname, *args, **kwargs)
9389         d.addBoth(lambda ignore: None)
9390         pass
9391hunk ./src/allmydata/test/test_mutable.py 158
9392             chr(ord(original[byte_offset]) ^ 0x01) +
9393             original[byte_offset+1:])
9394 
9395+def add_two(original, byte_offset):
9396+    # It isn't enough to flip a low bit, since 1 is a valid version
9397+    # number; XORing with 0x02 maps 0->2 and 1->3, both invalid.
9398+    return (original[:byte_offset] +
9399+            chr(ord(original[byte_offset]) ^ 0x02) +
9400+            original[byte_offset+1:])
9401+
9402 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
9403     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
9404     # list of shnums to corrupt.
9405hunk ./src/allmydata/test/test_mutable.py 168
9406+    ds = []
9407     for peerid in s._peers:
9408         shares = s._peers[peerid]
9409         for shnum in shares:
9410hunk ./src/allmydata/test/test_mutable.py 176
9411                 and shnum not in shnums_to_corrupt):
9412                 continue
9413             data = shares[shnum]
9414-            (version,
9415-             seqnum,
9416-             root_hash,
9417-             IV,
9418-             k, N, segsize, datalen,
9419-             o) = unpack_header(data)
9420-            if isinstance(offset, tuple):
9421-                offset1, offset2 = offset
9422-            else:
9423-                offset1 = offset
9424-                offset2 = 0
9425-            if offset1 == "pubkey":
9426-                real_offset = 107
9427-            elif offset1 in o:
9428-                real_offset = o[offset1]
9429-            else:
9430-                real_offset = offset1
9431-            real_offset = int(real_offset) + offset2 + offset_offset
9432-            assert isinstance(real_offset, int), offset
9433-            shares[shnum] = flip_bit(data, real_offset)
9434-    return res
9435+            # We're feeding the reader all of the share data, so it
9436+            # won't need to use the rref that we didn't provide, nor the
9437+            # storage index that we didn't provide. We do this because
9438+            # the reader will work for both MDMF and SDMF.
9439+            reader = MDMFSlotReadProxy(None, None, shnum, data)
9440+            # We need to get the offsets for the next part.
9441+            d = reader.get_verinfo()
9442+            def _do_corruption(verinfo, data, shnum):
9443+                (seqnum,
9444+                 root_hash,
9445+                 IV,
9446+                 segsize,
9447+                 datalen,
9448+                 k, n, prefix, o) = verinfo
9449+                if isinstance(offset, tuple):
9450+                    offset1, offset2 = offset
9451+                else:
9452+                    offset1 = offset
9453+                    offset2 = 0
9454+                if offset1 == "pubkey" and IV:
9455+                    real_offset = 107
9456+                elif offset1 == "share_data" and not IV:
9457+                    real_offset = 107
9458+                elif offset1 in o:
9459+                    real_offset = o[offset1]
9460+                else:
9461+                    real_offset = offset1
9462+                real_offset = int(real_offset) + offset2 + offset_offset
9463+                assert isinstance(real_offset, int), offset
9464+                if offset1 == 0: # verbyte
9465+                    f = add_two
9466+                else:
9467+                    f = flip_bit
9468+                shares[shnum] = f(data, real_offset)
9469+            d.addCallback(_do_corruption, data, shnum)
9470+            ds.append(d)
9471+    dl = defer.DeferredList(ds)
9472+    dl.addCallback(lambda ignored: res)
9473+    return dl
9474 
9475 def make_storagebroker(s=None, num_peers=10):
9476     if not s:
9477hunk ./src/allmydata/test/test_mutable.py 257
9478             self.failUnlessEqual(len(shnums), 1)
9479         d.addCallback(_created)
9480         return d
9481+    test_create.timeout = 15
9482+
9483+
9484+    def test_create_mdmf(self):
9485+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
9486+        def _created(n):
9487+            self.failUnless(isinstance(n, MutableFileNode))
9488+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
9489+            sb = self.nodemaker.storage_broker
9490+            peer0 = sorted(sb.get_all_serverids())[0]
9491+            shnums = self._storage._peers[peer0].keys()
9492+            self.failUnlessEqual(len(shnums), 1)
9493+        d.addCallback(_created)
9494+        return d
9495+
9496 
9497     def test_serialize(self):
9498         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
9499hunk ./src/allmydata/test/test_mutable.py 302
9500             d.addCallback(lambda smap: smap.dump(StringIO()))
9501             d.addCallback(lambda sio:
9502                           self.failUnless("3-of-10" in sio.getvalue()))
9503-            d.addCallback(lambda res: n.overwrite("contents 1"))
9504+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
9505             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9506             d.addCallback(lambda res: n.download_best_version())
9507             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9508hunk ./src/allmydata/test/test_mutable.py 309
9509             d.addCallback(lambda res: n.get_size_of_best_version())
9510             d.addCallback(lambda size:
9511                           self.failUnlessEqual(size, len("contents 1")))
9512-            d.addCallback(lambda res: n.overwrite("contents 2"))
9513+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
9514             d.addCallback(lambda res: n.download_best_version())
9515             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9516             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9517hunk ./src/allmydata/test/test_mutable.py 313
9518-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9519+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
9520             d.addCallback(lambda res: n.download_best_version())
9521             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9522             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9523hunk ./src/allmydata/test/test_mutable.py 325
9524             # mapupdate-to-retrieve data caching (i.e. make the shares larger
9525             # than the default readsize, which is 2000 bytes). A 15kB file
9526             # will have 5kB shares.
9527-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
9528+            d.addCallback(lambda res: n.overwrite(MutableData("large size file" * 1000)))
9529             d.addCallback(lambda res: n.download_best_version())
9530             d.addCallback(lambda res:
9531                           self.failUnlessEqual(res, "large size file" * 1000))
9532hunk ./src/allmydata/test/test_mutable.py 333
9533         d.addCallback(_created)
9534         return d
9535 
9536+
9537+    def test_upload_and_download_mdmf(self):
9538+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
9539+        def _created(n):
9540+            d = defer.succeed(None)
9541+            d.addCallback(lambda ignored:
9542+                n.get_servermap(MODE_READ))
9543+            def _then(servermap):
9544+                dumped = servermap.dump(StringIO())
9545+                self.failUnlessIn("3-of-10", dumped.getvalue())
9546+            d.addCallback(_then)
9547+            # Now overwrite the contents with some new contents. We want
9548+            # to make them big enough to force the file to be uploaded
9549+            # in more than one segment.
9550+            big_contents = "contents1" * 100000 # about 900 KiB
9551+            big_contents_uploadable = MutableData(big_contents)
9552+            d.addCallback(lambda ignored:
9553+                n.overwrite(big_contents_uploadable))
9554+            d.addCallback(lambda ignored:
9555+                n.download_best_version())
9556+            d.addCallback(lambda data:
9557+                self.failUnlessEqual(data, big_contents))
9558+            # Overwrite the contents again with some new contents. As
9559+            # before, they need to be big enough to force multiple
9560+            # segments, so that we make the downloader deal with
9561+            # multiple segments.
9562+            bigger_contents = "contents2" * 1000000 # about 9 MiB
9563+            bigger_contents_uploadable = MutableData(bigger_contents)
9564+            d.addCallback(lambda ignored:
9565+                n.overwrite(bigger_contents_uploadable))
9566+            d.addCallback(lambda ignored:
9567+                n.download_best_version())
9568+            d.addCallback(lambda data:
9569+                self.failUnlessEqual(data, bigger_contents))
9570+            return d
9571+        d.addCallback(_created)
9572+        return d
9573+
9574+
9575+    def test_mdmf_write_count(self):
9576+        # Publishing an MDMF file should only cause one write for each
9577+        # share that is to be published. Otherwise, we introduce
9578+        # undesirable semantics that are a regression from SDMF.
9579+        upload = MutableData("MDMF" * 100000) # about 400 KiB
9580+        d = self.nodemaker.create_mutable_file(upload,
9581+                                               version=MDMF_VERSION)
9582+        def _check_server_write_counts(ignored):
9583+            sb = self.nodemaker.storage_broker
9584+            peers = sb.test_servers.values()
9585+            for peer in peers:
9586+                self.failUnlessEqual(peer.queries, 1)
9587+        d.addCallback(_check_server_write_counts)
9588+        return d
9589+
9590+
9591     def test_create_with_initial_contents(self):
9592hunk ./src/allmydata/test/test_mutable.py 389
9593-        d = self.nodemaker.create_mutable_file("contents 1")
9594+        upload1 = MutableData("contents 1")
9595+        d = self.nodemaker.create_mutable_file(upload1)
9596         def _created(n):
9597             d = n.download_best_version()
9598             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9599hunk ./src/allmydata/test/test_mutable.py 394
9600-            d.addCallback(lambda res: n.overwrite("contents 2"))
9601+            upload2 = MutableData("contents 2")
9602+            d.addCallback(lambda res: n.overwrite(upload2))
9603             d.addCallback(lambda res: n.download_best_version())
9604             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9605             return d
9606hunk ./src/allmydata/test/test_mutable.py 401
9607         d.addCallback(_created)
9608         return d
9609+    test_create_with_initial_contents.timeout = 15
9610+
9611+
9612+    def test_create_mdmf_with_initial_contents(self):
9613+        initial_contents = "foobarbaz" * 131072 # about 1.1 MiB
9614+        initial_contents_uploadable = MutableData(initial_contents)
9615+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
9616+                                               version=MDMF_VERSION)
9617+        def _created(n):
9618+            d = n.download_best_version()
9619+            d.addCallback(lambda data:
9620+                self.failUnlessEqual(data, initial_contents))
9621+            uploadable2 = MutableData(initial_contents + "foobarbaz")
9622+            d.addCallback(lambda ignored:
9623+                n.overwrite(uploadable2))
9624+            d.addCallback(lambda ignored:
9625+                n.download_best_version())
9626+            d.addCallback(lambda data:
9627+                self.failUnlessEqual(data, initial_contents +
9628+                                           "foobarbaz"))
9629+            return d
9630+        d.addCallback(_created)
9631+        return d
9632+    test_create_mdmf_with_initial_contents.timeout = 20
9633+
9634 
9635     def test_create_with_initial_contents_function(self):
9636         data = "initial contents"
9637hunk ./src/allmydata/test/test_mutable.py 434
9638             key = n.get_writekey()
9639             self.failUnless(isinstance(key, str), key)
9640             self.failUnlessEqual(len(key), 16) # AES key size
9641-            return data
9642+            return MutableData(data)
9643         d = self.nodemaker.create_mutable_file(_make_contents)
9644         def _created(n):
9645             return n.download_best_version()
9646hunk ./src/allmydata/test/test_mutable.py 442
9647         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
9648         return d
9649 
9650+
9651+    def test_create_mdmf_with_initial_contents_function(self):
9652+        data = "initial contents" * 100000
9653+        def _make_contents(n):
9654+            self.failUnless(isinstance(n, MutableFileNode))
9655+            key = n.get_writekey()
9656+            self.failUnless(isinstance(key, str), key)
9657+            self.failUnlessEqual(len(key), 16)
9658+            return MutableData(data)
9659+        d = self.nodemaker.create_mutable_file(_make_contents,
9660+                                               version=MDMF_VERSION)
9661+        d.addCallback(lambda n:
9662+            n.download_best_version())
9663+        d.addCallback(lambda data2:
9664+            self.failUnlessEqual(data2, data))
9665+        return d
9666+
9667+
9668     def test_create_with_too_large_contents(self):
9669         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9670hunk ./src/allmydata/test/test_mutable.py 462
9671-        d = self.nodemaker.create_mutable_file(BIG)
9672+        BIG_uploadable = MutableData(BIG)
9673+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
9674         def _created(n):
9675hunk ./src/allmydata/test/test_mutable.py 465
9676-            d = n.overwrite(BIG)
9677+            other_BIG_uploadable = MutableData(BIG)
9678+            d = n.overwrite(other_BIG_uploadable)
9679             return d
9680         d.addCallback(_created)
9681         return d
9682hunk ./src/allmydata/test/test_mutable.py 480
9683 
9684     def test_modify(self):
9685         def _modifier(old_contents, servermap, first_time):
9686-            return old_contents + "line2"
9687+            new_contents = old_contents + "line2"
9688+            return new_contents
9689         def _non_modifier(old_contents, servermap, first_time):
9690             return old_contents
9691         def _none_modifier(old_contents, servermap, first_time):
9692hunk ./src/allmydata/test/test_mutable.py 489
9693         def _error_modifier(old_contents, servermap, first_time):
9694             raise ValueError("oops")
9695         def _toobig_modifier(old_contents, servermap, first_time):
9696-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
9697+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9698+            return new_content
9699         calls = []
9700         def _ucw_error_modifier(old_contents, servermap, first_time):
9701             # simulate an UncoordinatedWriteError once
9702hunk ./src/allmydata/test/test_mutable.py 497
9703             calls.append(1)
9704             if len(calls) <= 1:
9705                 raise UncoordinatedWriteError("simulated")
9706-            return old_contents + "line3"
9707+            new_contents = old_contents + "line3"
9708+            return new_contents
9709         def _ucw_error_non_modifier(old_contents, servermap, first_time):
9710             # simulate an UncoordinatedWriteError once, and don't actually
9711             # modify the contents on subsequent invocations
9712hunk ./src/allmydata/test/test_mutable.py 507
9713                 raise UncoordinatedWriteError("simulated")
9714             return old_contents
9715 
9716-        d = self.nodemaker.create_mutable_file("line1")
9717+        initial_contents = "line1"
9718+        d = self.nodemaker.create_mutable_file(MutableData(initial_contents))
9719         def _created(n):
9720             d = n.modify(_modifier)
9721             d.addCallback(lambda res: n.download_best_version())
9722hunk ./src/allmydata/test/test_mutable.py 565
9723             return d
9724         d.addCallback(_created)
9725         return d
9726+    test_modify.timeout = 15
9727+
9728 
9729     def test_modify_backoffer(self):
9730         def _modifier(old_contents, servermap, first_time):
9731hunk ./src/allmydata/test/test_mutable.py 592
9732         giveuper._delay = 0.1
9733         giveuper.factor = 1
9734 
9735-        d = self.nodemaker.create_mutable_file("line1")
9736+        d = self.nodemaker.create_mutable_file(MutableData("line1"))
9737         def _created(n):
9738             d = n.modify(_modifier)
9739             d.addCallback(lambda res: n.download_best_version())
9740hunk ./src/allmydata/test/test_mutable.py 642
9741             d.addCallback(lambda smap: smap.dump(StringIO()))
9742             d.addCallback(lambda sio:
9743                           self.failUnless("3-of-10" in sio.getvalue()))
9744-            d.addCallback(lambda res: n.overwrite("contents 1"))
9745+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
9746             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9747             d.addCallback(lambda res: n.download_best_version())
9748             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9749hunk ./src/allmydata/test/test_mutable.py 646
9750-            d.addCallback(lambda res: n.overwrite("contents 2"))
9751+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
9752             d.addCallback(lambda res: n.download_best_version())
9753             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9754             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9755hunk ./src/allmydata/test/test_mutable.py 650
9756-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9757+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
9758             d.addCallback(lambda res: n.download_best_version())
9759             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9760             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9761hunk ./src/allmydata/test/test_mutable.py 663
9762         return d
9763 
9764 
9765-class MakeShares(unittest.TestCase):
9766-    def test_encrypt(self):
9767-        nm = make_nodemaker()
9768-        CONTENTS = "some initial contents"
9769-        d = nm.create_mutable_file(CONTENTS)
9770-        def _created(fn):
9771-            p = Publish(fn, nm.storage_broker, None)
9772-            p.salt = "SALT" * 4
9773-            p.readkey = "\x00" * 16
9774-            p.newdata = CONTENTS
9775-            p.required_shares = 3
9776-            p.total_shares = 10
9777-            p.setup_encoding_parameters()
9778-            return p._encrypt_and_encode()
9779+class PublishMixin:
9780+    def publish_one(self):
9781+        # publish a file and create shares, which can then be manipulated
9782+        # later.
9783+        self.CONTENTS = "New contents go here" * 1000
9784+        self.uploadable = MutableData(self.CONTENTS)
9785+        self._storage = FakeStorage()
9786+        self._nodemaker = make_nodemaker(self._storage)
9787+        self._storage_broker = self._nodemaker.storage_broker
9788+        d = self._nodemaker.create_mutable_file(self.uploadable)
9789+        def _created(node):
9790+            self._fn = node
9791+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9792         d.addCallback(_created)
9793hunk ./src/allmydata/test/test_mutable.py 677
9794-        def _done(shares_and_shareids):
9795-            (shares, share_ids) = shares_and_shareids
9796-            self.failUnlessEqual(len(shares), 10)
9797-            for sh in shares:
9798-                self.failUnless(isinstance(sh, str))
9799-                self.failUnlessEqual(len(sh), 7)
9800-            self.failUnlessEqual(len(share_ids), 10)
9801-        d.addCallback(_done)
9802         return d
9803 
9804hunk ./src/allmydata/test/test_mutable.py 679
9805-    def test_generate(self):
9806-        nm = make_nodemaker()
9807-        CONTENTS = "some initial contents"
9808-        d = nm.create_mutable_file(CONTENTS)
9809-        def _created(fn):
9810-            self._fn = fn
9811-            p = Publish(fn, nm.storage_broker, None)
9812-            self._p = p
9813-            p.newdata = CONTENTS
9814-            p.required_shares = 3
9815-            p.total_shares = 10
9816-            p.setup_encoding_parameters()
9817-            p._new_seqnum = 3
9818-            p.salt = "SALT" * 4
9819-            # make some fake shares
9820-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
9821-            p._privkey = fn.get_privkey()
9822-            p._encprivkey = fn.get_encprivkey()
9823-            p._pubkey = fn.get_pubkey()
9824-            return p._generate_shares(shares_and_ids)
9825+    def publish_mdmf(self):
9826+        # like publish_one, except that the result is guaranteed to be
9827+        # an MDMF file.
9828+        # self.CONTENTS should have more than one segment.
9829+        self.CONTENTS = "This is an MDMF file" * 100000
9830+        self.uploadable = MutableData(self.CONTENTS)
9831+        self._storage = FakeStorage()
9832+        self._nodemaker = make_nodemaker(self._storage)
9833+        self._storage_broker = self._nodemaker.storage_broker
9834+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
9835+        def _created(node):
9836+            self._fn = node
9837+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9838         d.addCallback(_created)
9839hunk ./src/allmydata/test/test_mutable.py 693
9840-        def _generated(res):
9841-            p = self._p
9842-            final_shares = p.shares
9843-            root_hash = p.root_hash
9844-            self.failUnlessEqual(len(root_hash), 32)
9845-            self.failUnless(isinstance(final_shares, dict))
9846-            self.failUnlessEqual(len(final_shares), 10)
9847-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
9848-            for i,sh in final_shares.items():
9849-                self.failUnless(isinstance(sh, str))
9850-                # feed the share through the unpacker as a sanity-check
9851-                pieces = unpack_share(sh)
9852-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
9853-                 pubkey, signature, share_hash_chain, block_hash_tree,
9854-                 share_data, enc_privkey) = pieces
9855-                self.failUnlessEqual(u_seqnum, 3)
9856-                self.failUnlessEqual(u_root_hash, root_hash)
9857-                self.failUnlessEqual(k, 3)
9858-                self.failUnlessEqual(N, 10)
9859-                self.failUnlessEqual(segsize, 21)
9860-                self.failUnlessEqual(datalen, len(CONTENTS))
9861-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
9862-                sig_material = struct.pack(">BQ32s16s BBQQ",
9863-                                           0, p._new_seqnum, root_hash, IV,
9864-                                           k, N, segsize, datalen)
9865-                self.failUnless(p._pubkey.verify(sig_material, signature))
9866-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
9867-                self.failUnless(isinstance(share_hash_chain, dict))
9868-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
9869-                for shnum,share_hash in share_hash_chain.items():
9870-                    self.failUnless(isinstance(shnum, int))
9871-                    self.failUnless(isinstance(share_hash, str))
9872-                    self.failUnlessEqual(len(share_hash), 32)
9873-                self.failUnless(isinstance(block_hash_tree, list))
9874-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
9875-                self.failUnlessEqual(IV, "SALT"*4)
9876-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
9877-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
9878-        d.addCallback(_generated)
9879         return d
9880 
9881hunk ./src/allmydata/test/test_mutable.py 695
9882-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
9883-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
9884-    # when we publish to zero peers, we should get a NotEnoughSharesError
9885 
9886hunk ./src/allmydata/test/test_mutable.py 696
9887-class PublishMixin:
9888-    def publish_one(self):
9889-        # publish a file and create shares, which can then be manipulated
9890-        # later.
9891-        self.CONTENTS = "New contents go here" * 1000
9892+    def publish_sdmf(self):
9893+        # like publish_one, except that the result is guaranteed to be
9894+        # an SDMF file
9895+        self.CONTENTS = "This is an SDMF file" * 1000
9896+        self.uploadable = MutableData(self.CONTENTS)
9897         self._storage = FakeStorage()
9898         self._nodemaker = make_nodemaker(self._storage)
9899         self._storage_broker = self._nodemaker.storage_broker
9900hunk ./src/allmydata/test/test_mutable.py 704
9901-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
9902+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
9903         def _created(node):
9904             self._fn = node
9905             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9906hunk ./src/allmydata/test/test_mutable.py 711
9907         d.addCallback(_created)
9908         return d
9909 
9910-    def publish_multiple(self):
9911+
9912+    def publish_multiple(self, version=0):
9913         self.CONTENTS = ["Contents 0",
9914                          "Contents 1",
9915                          "Contents 2",
9916hunk ./src/allmydata/test/test_mutable.py 718
9917                          "Contents 3a",
9918                          "Contents 3b"]
9919+        self.uploadables = [MutableData(d) for d in self.CONTENTS]
9920         self._copied_shares = {}
9921         self._storage = FakeStorage()
9922         self._nodemaker = make_nodemaker(self._storage)
9923hunk ./src/allmydata/test/test_mutable.py 722
9924-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
9925+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
9926         def _created(node):
9927             self._fn = node
9928             # now create multiple versions of the same file, and accumulate
9929hunk ./src/allmydata/test/test_mutable.py 729
9930             # their shares, so we can mix and match them later.
9931             d = defer.succeed(None)
9932             d.addCallback(self._copy_shares, 0)
9933-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
9934+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
9935             d.addCallback(self._copy_shares, 1)
9936hunk ./src/allmydata/test/test_mutable.py 731
9937-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
9938+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
9939             d.addCallback(self._copy_shares, 2)
9940hunk ./src/allmydata/test/test_mutable.py 733
9941-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
9942+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
9943             d.addCallback(self._copy_shares, 3)
9944             # now we replace all the shares with version s3, and upload a new
9945             # version to get s4b.
9946hunk ./src/allmydata/test/test_mutable.py 739
9947             rollback = dict([(i,2) for i in range(10)])
9948             d.addCallback(lambda res: self._set_versions(rollback))
9949-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
9950+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
9951             d.addCallback(self._copy_shares, 4)
9952             # we leave the storage in state 4
9953             return d
9954hunk ./src/allmydata/test/test_mutable.py 746
9955         d.addCallback(_created)
9956         return d
9957 
9958+
9959     def _copy_shares(self, ignored, index):
9960         shares = self._storage._peers
9961         # we need a deep copy
9962hunk ./src/allmydata/test/test_mutable.py 770
9963                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
9964 
9965 
9966+
9967+
9968 class Servermap(unittest.TestCase, PublishMixin):
9969     def setUp(self):
9970         return self.publish_one()
9971hunk ./src/allmydata/test/test_mutable.py 776
9972 
9973-    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None):
9974+    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None,
9975+                       update_range=None):
9976         if fn is None:
9977             fn = self._fn
9978         if sb is None:
9979hunk ./src/allmydata/test/test_mutable.py 783
9980             sb = self._storage_broker
9981         smu = ServermapUpdater(fn, sb, Monitor(),
9982-                               ServerMap(), mode)
9983+                               ServerMap(), mode, update_range=update_range)
9984         d = smu.update()
9985         return d
9986 
9987hunk ./src/allmydata/test/test_mutable.py 849
9988         # create a new file, which is large enough to knock the privkey out
9989         # of the early part of the file
9990         LARGE = "These are Larger contents" * 200 # about 5KB
9991-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
9992+        LARGE_uploadable = MutableData(LARGE)
9993+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
9994         def _created(large_fn):
9995             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
9996             return self.make_servermap(MODE_WRITE, large_fn2)
9997hunk ./src/allmydata/test/test_mutable.py 858
9998         d.addCallback(lambda sm: self.failUnlessOneRecoverable(sm, 10))
9999         return d
10000 
10001+
10002     def test_mark_bad(self):
10003         d = defer.succeed(None)
10004         ms = self.make_servermap
10005hunk ./src/allmydata/test/test_mutable.py 904
10006         self._storage._peers = {} # delete all shares
10007         ms = self.make_servermap
10008         d = defer.succeed(None)
10009 
10011         d.addCallback(lambda res: ms(mode=MODE_CHECK))
10012         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
10013 
10014hunk ./src/allmydata/test/test_mutable.py 956
10015         return d
10016 
10017 
10018+    def test_servermapupdater_finds_mdmf_files(self):
10019+        # setUp already published an MDMF file for us. We just need to
10020+        # make sure that when we run the ServermapUpdater, the file is
10021+        # reported to have one recoverable version.
10022+        d = defer.succeed(None)
10023+        d.addCallback(lambda ignored:
10024+            self.publish_mdmf())
10025+        d.addCallback(lambda ignored:
10026+            self.make_servermap(mode=MODE_CHECK))
10027+        # Calling make_servermap also updates the servermap in the mode
10028+        # that we specify, so we just need to see what it says.
10029+        def _check_servermap(sm):
10030+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
10031+        d.addCallback(_check_servermap)
10032+        return d
10033+
10034+
10035+    def test_fetch_update(self):
10036+        d = defer.succeed(None)
10037+        d.addCallback(lambda ignored:
10038+            self.publish_mdmf())
10039+        d.addCallback(lambda ignored:
10040+            self.make_servermap(mode=MODE_WRITE, update_range=(1, 2)))
10041+        def _check_servermap(sm):
10042+            # 10 shares
10043+            self.failUnlessEqual(len(sm.update_data), 10)
10044+            # one version
10045+            for data in sm.update_data.itervalues():
10046+                self.failUnlessEqual(len(data), 1)
10047+        d.addCallback(_check_servermap)
10048+        return d
10049+
10050+
10051+    def test_servermapupdater_finds_sdmf_files(self):
10052+        d = defer.succeed(None)
10053+        d.addCallback(lambda ignored:
10054+            self.publish_sdmf())
10055+        d.addCallback(lambda ignored:
10056+            self.make_servermap(mode=MODE_CHECK))
10057+        d.addCallback(lambda servermap:
10058+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
10059+        return d
10060+
10061 
10062 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
10063     def setUp(self):
10064hunk ./src/allmydata/test/test_mutable.py 1039
10065         if version is None:
10066             version = servermap.best_recoverable_version()
10067         r = Retrieve(self._fn, servermap, version)
10068-        return r.download()
10069+        c = consumer.MemoryConsumer()
10070+        d = r.download(consumer=c)
10071+        d.addCallback(lambda mc: "".join(mc.chunks))
10072+        return d
10073+
10074 
10075     def test_basic(self):
10076         d = self.make_servermap()
10077hunk ./src/allmydata/test/test_mutable.py 1120
10078         return d
10079     test_no_servers_download.timeout = 15
10080 
10081+
10082     def _test_corrupt_all(self, offset, substring,
10083hunk ./src/allmydata/test/test_mutable.py 1122
10084-                          should_succeed=False, corrupt_early=True,
10085-                          failure_checker=None):
10086+                          should_succeed=False,
10087+                          corrupt_early=True,
10088+                          failure_checker=None,
10089+                          fetch_privkey=False):
10090         d = defer.succeed(None)
10091         if corrupt_early:
10092             d.addCallback(corrupt, self._storage, offset)
10093hunk ./src/allmydata/test/test_mutable.py 1142
10094                     self.failUnlessIn(substring, "".join(allproblems))
10095                 return servermap
10096             if should_succeed:
10097-                d1 = self._fn.download_version(servermap, ver)
10098+                d1 = self._fn.download_version(servermap, ver,
10099+                                               fetch_privkey)
10100                 d1.addCallback(lambda new_contents:
10101                                self.failUnlessEqual(new_contents, self.CONTENTS))
10102             else:
10103hunk ./src/allmydata/test/test_mutable.py 1150
10104                 d1 = self.shouldFail(NotEnoughSharesError,
10105                                      "_corrupt_all(offset=%s)" % (offset,),
10106                                      substring,
10107-                                     self._fn.download_version, servermap, ver)
10108+                                     self._fn.download_version, servermap,
10109+                                                                ver,
10110+                                                                fetch_privkey)
10111             if failure_checker:
10112                 d1.addCallback(failure_checker)
10113             d1.addCallback(lambda res: servermap)
10114hunk ./src/allmydata/test/test_mutable.py 1161
10115         return d
10116 
10117     def test_corrupt_all_verbyte(self):
10118-        # when the version byte is not 0, we hit an UnknownVersionError error
10119-        # in unpack_share().
10120+        # when the version byte is not 0 or 1, we hit an
10121+        # UnknownVersionError in unpack_share().
10122         d = self._test_corrupt_all(0, "UnknownVersionError")
10123         def _check_servermap(servermap):
10124             # and the dump should mention the problems
10125hunk ./src/allmydata/test/test_mutable.py 1168
10126             s = StringIO()
10127             dump = servermap.dump(s).getvalue()
10128-            self.failUnless("10 PROBLEMS" in dump, dump)
10129+            self.failUnless("30 PROBLEMS" in dump, dump)
10130         d.addCallback(_check_servermap)
10131         return d
10132 
10133hunk ./src/allmydata/test/test_mutable.py 1238
10134         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
10135 
10136 
10137+    def test_corrupt_all_encprivkey_late(self):
10138+        # this should work for the same reason as above, but we corrupt
10139+        # after the servermap update to exercise the error handling
10140+        # code.
10141+        # We need to remove the privkey from the node, or the retrieve
10142+        # process won't know to update it.
10143+        self._fn._privkey = None
10144+        return self._test_corrupt_all("enc_privkey",
10145+                                      None, # this shouldn't fail
10146+                                      should_succeed=True,
10147+                                      corrupt_early=False,
10148+                                      fetch_privkey=True)
10149+
10150+
10151     def test_corrupt_all_seqnum_late(self):
10152         # corrupting the seqnum between mapupdate and retrieve should result
10153         # in NotEnoughSharesError, since each share will look invalid
10154hunk ./src/allmydata/test/test_mutable.py 1258
10155         def _check(res):
10156             f = res[0]
10157             self.failUnless(f.check(NotEnoughSharesError))
10158-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
10159+            self.failUnless("uncoordinated write" in str(f))
10160         return self._test_corrupt_all(1, "ran out of peers",
10161                                       corrupt_early=False,
10162                                       failure_checker=_check)
10163hunk ./src/allmydata/test/test_mutable.py 1302
10164                             in str(servermap.problems[0]))
10165             ver = servermap.best_recoverable_version()
10166             r = Retrieve(self._fn, servermap, ver)
10167-            return r.download()
10168+            c = consumer.MemoryConsumer()
10169+            return r.download(c)
10170         d.addCallback(_do_retrieve)
10171hunk ./src/allmydata/test/test_mutable.py 1305
10172+        d.addCallback(lambda mc: "".join(mc.chunks))
10173         d.addCallback(lambda new_contents:
10174                       self.failUnlessEqual(new_contents, self.CONTENTS))
10175         return d
10176hunk ./src/allmydata/test/test_mutable.py 1310
10177 
10178-    def test_corrupt_some(self):
10179-        # corrupt the data of first five shares (so the servermap thinks
10180-        # they're good but retrieve marks them as bad), so that the
10181-        # MODE_READ set of 6 will be insufficient, forcing node.download to
10182-        # retry with more servers.
10183-        corrupt(None, self._storage, "share_data", range(5))
10184-        d = self.make_servermap()
10185+
10186+    def _test_corrupt_some(self, offset, mdmf=False):
10187+        if mdmf:
10188+            d = self.publish_mdmf()
10189+        else:
10190+            d = defer.succeed(None)
10191+        d.addCallback(lambda ignored:
10192+            corrupt(None, self._storage, offset, range(5)))
10193+        d.addCallback(lambda ignored:
10194+            self.make_servermap())
10195         def _do_retrieve(servermap):
10196             ver = servermap.best_recoverable_version()
10197             self.failUnless(ver)
10198hunk ./src/allmydata/test/test_mutable.py 1326
10199             return self._fn.download_best_version()
10200         d.addCallback(_do_retrieve)
10201         d.addCallback(lambda new_contents:
10202-                      self.failUnlessEqual(new_contents, self.CONTENTS))
10203+            self.failUnlessEqual(new_contents, self.CONTENTS))
10204         return d
10205 
10206hunk ./src/allmydata/test/test_mutable.py 1329
10207+
10208+    def test_corrupt_some(self):
10209+        # corrupt the data of first five shares (so the servermap thinks
10210+        # they're good but retrieve marks them as bad), so that the
10211+        # MODE_READ set of 6 will be insufficient, forcing node.download to
10212+        # retry with more servers.
10213+        return self._test_corrupt_some("share_data")
10214+
10215+
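The refactored `_test_corrupt_some` accepts an `offset` that is either a field name or a `(field, byte_offset)` tuple such as `("share_data", 12 * 40)`. The corruption the `corrupt()` helper (defined earlier in test_mutable.py) applies boils down to flipping bits at that offset; a standalone sketch of the core operation:

```python
def flip_bit(share, offset):
    """Flip the low bit of the byte at `offset`; negative offsets
    count from the end of the data, as in the test helpers."""
    if offset < 0:
        offset += len(share)
    mutated = bytearray(share)
    mutated[offset] ^= 0x01
    return bytes(mutated)

original = b"\x00" * 480  # e.g. 12 * 40 bytes of share data
corrupted = flip_bit(original, 12 * 40 - 1)
assert corrupted != original
assert flip_bit(corrupted, 12 * 40 - 1) == original  # XOR twice restores
assert sum(a != b for a, b in zip(original, corrupted)) == 1
```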
10216     def test_download_fails(self):
10217hunk ./src/allmydata/test/test_mutable.py 1339
10218-        corrupt(None, self._storage, "signature")
10219-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
10220+        d = corrupt(None, self._storage, "signature")
10221+        d.addCallback(lambda ignored:
10222+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
10223                             "no recoverable versions",
10224hunk ./src/allmydata/test/test_mutable.py 1343
10225-                            self._fn.download_best_version)
10226+                            self._fn.download_best_version))
10227         return d
10228 
10229 
10230hunk ./src/allmydata/test/test_mutable.py 1347
10231+
10232+    def test_corrupt_mdmf_block_hash_tree(self):
10233+        d = self.publish_mdmf()
10234+        d.addCallback(lambda ignored:
10235+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
10236+                                   "block hash tree failure",
10237+                                   corrupt_early=True,
10238+                                   should_succeed=False))
10239+        return d
10240+
10241+
10242+    def test_corrupt_mdmf_block_hash_tree_late(self):
10243+        d = self.publish_mdmf()
10244+        d.addCallback(lambda ignored:
10245+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
10246+                                   "block hash tree failure",
10247+                                   corrupt_early=False,
10248+                                   should_succeed=False))
10249+        return d
10250+
10251+
10252+    def test_corrupt_mdmf_share_data(self):
10253+        d = self.publish_mdmf()
10254+        d.addCallback(lambda ignored:
10255+            # TODO: Find out what the block size is and corrupt a
10256+            # specific block, rather than just guessing.
10257+            self._test_corrupt_all(("share_data", 12 * 40),
10258+                                    "block hash tree failure",
10259+                                    corrupt_early=True,
10260+                                    should_succeed=False))
10261+        return d
10262+
10263+
10264+    def test_corrupt_some_mdmf(self):
10265+        return self._test_corrupt_some(("share_data", 12 * 40),
10266+                                       mdmf=True)
10267+
10268+
10269 class CheckerMixin:
10270     def check_good(self, r, where):
10271         self.failUnless(r.is_healthy(), where)
10272hunk ./src/allmydata/test/test_mutable.py 1415
10273         d.addCallback(self.check_good, "test_check_good")
10274         return d
10275 
10276+    def test_check_mdmf_good(self):
10277+        d = self.publish_mdmf()
10278+        d.addCallback(lambda ignored:
10279+            self._fn.check(Monitor()))
10280+        d.addCallback(self.check_good, "test_check_mdmf_good")
10281+        return d
10282+
10283     def test_check_no_shares(self):
10284         for shares in self._storage._peers.values():
10285             shares.clear()
10286hunk ./src/allmydata/test/test_mutable.py 1429
10287         d.addCallback(self.check_bad, "test_check_no_shares")
10288         return d
10289 
10290+    def test_check_mdmf_no_shares(self):
10291+        d = self.publish_mdmf()
10292+        def _then(ignored):
10293+            for shares in self._storage._peers.values():
10294+                shares.clear()
10295+        d.addCallback(_then)
10296+        d.addCallback(lambda ignored:
10297+            self._fn.check(Monitor()))
10298+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
10299+        return d
10300+
10301     def test_check_not_enough_shares(self):
10302         for shares in self._storage._peers.values():
10303             for shnum in shares.keys():
10304hunk ./src/allmydata/test/test_mutable.py 1449
10305         d.addCallback(self.check_bad, "test_check_not_enough_shares")
10306         return d
10307 
10308+    def test_check_mdmf_not_enough_shares(self):
10309+        d = self.publish_mdmf()
10310+        def _then(ignored):
10311+            for shares in self._storage._peers.values():
10312+                for shnum in shares.keys():
10313+                    if shnum > 0:
10314+                        del shares[shnum]
10315+        d.addCallback(_then)
10316+        d.addCallback(lambda ignored:
10317+            self._fn.check(Monitor()))
10318+        d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares")
10319+        return d
10320+
10321+
10322     def test_check_all_bad_sig(self):
10323hunk ./src/allmydata/test/test_mutable.py 1464
10324-        corrupt(None, self._storage, 1) # bad sig
10325-        d = self._fn.check(Monitor())
10326+        d = corrupt(None, self._storage, 1) # bad sig
10327+        d.addCallback(lambda ignored:
10328+            self._fn.check(Monitor()))
10329         d.addCallback(self.check_bad, "test_check_all_bad_sig")
10330         return d
10331 
10332hunk ./src/allmydata/test/test_mutable.py 1470
10333+    def test_check_mdmf_all_bad_sig(self):
10334+        d = self.publish_mdmf()
10335+        d.addCallback(lambda ignored:
10336+            corrupt(None, self._storage, 1))
10337+        d.addCallback(lambda ignored:
10338+            self._fn.check(Monitor()))
10339+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
10340+        return d
10341+
10342     def test_check_all_bad_blocks(self):
10343hunk ./src/allmydata/test/test_mutable.py 1480
10344-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
10345+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
10346         # the Checker won't notice this.. it doesn't look at actual data
10347hunk ./src/allmydata/test/test_mutable.py 1482
10348-        d = self._fn.check(Monitor())
10349+        d.addCallback(lambda ignored:
10350+            self._fn.check(Monitor()))
10351         d.addCallback(self.check_good, "test_check_all_bad_blocks")
10352         return d
10353 
10354hunk ./src/allmydata/test/test_mutable.py 1487
10355+
10356+    def test_check_mdmf_all_bad_blocks(self):
10357+        d = self.publish_mdmf()
10358+        d.addCallback(lambda ignored:
10359+            corrupt(None, self._storage, "share_data"))
10360+        d.addCallback(lambda ignored:
10361+            self._fn.check(Monitor()))
10362+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
10363+        return d
10364+
10365     def test_verify_good(self):
10366         d = self._fn.check(Monitor(), verify=True)
10367         d.addCallback(self.check_good, "test_verify_good")
10368hunk ./src/allmydata/test/test_mutable.py 1501
10369         return d
10370+    test_verify_good.timeout = 15
10371 
10372     def test_verify_all_bad_sig(self):
10373hunk ./src/allmydata/test/test_mutable.py 1504
10374-        corrupt(None, self._storage, 1) # bad sig
10375-        d = self._fn.check(Monitor(), verify=True)
10376+        d = corrupt(None, self._storage, 1) # bad sig
10377+        d.addCallback(lambda ignored:
10378+            self._fn.check(Monitor(), verify=True))
10379         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
10380         return d
10381 
10382hunk ./src/allmydata/test/test_mutable.py 1511
10383     def test_verify_one_bad_sig(self):
10384-        corrupt(None, self._storage, 1, [9]) # bad sig
10385-        d = self._fn.check(Monitor(), verify=True)
10386+        d = corrupt(None, self._storage, 1, [9]) # bad sig
10387+        d.addCallback(lambda ignored:
10388+            self._fn.check(Monitor(), verify=True))
10389         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
10390         return d
10391 
10392hunk ./src/allmydata/test/test_mutable.py 1518
10393     def test_verify_one_bad_block(self):
10394-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
10395+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
10396         # the Verifier *will* notice this, since it examines every byte
10397hunk ./src/allmydata/test/test_mutable.py 1520
10398-        d = self._fn.check(Monitor(), verify=True)
10399+        d.addCallback(lambda ignored:
10400+            self._fn.check(Monitor(), verify=True))
10401         d.addCallback(self.check_bad, "test_verify_one_bad_block")
10402         d.addCallback(self.check_expected_failure,
10403                       CorruptShareError, "block hash tree failure",
10404hunk ./src/allmydata/test/test_mutable.py 1529
10405         return d
10406 
10407     def test_verify_one_bad_sharehash(self):
10408-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
10409-        d = self._fn.check(Monitor(), verify=True)
10410+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
10411+        d.addCallback(lambda ignored:
10412+            self._fn.check(Monitor(), verify=True))
10413         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
10414         d.addCallback(self.check_expected_failure,
10415                       CorruptShareError, "corrupt hashes",
10416hunk ./src/allmydata/test/test_mutable.py 1539
10417         return d
10418 
10419     def test_verify_one_bad_encprivkey(self):
10420-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
10421-        d = self._fn.check(Monitor(), verify=True)
10422+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
10423+        d.addCallback(lambda ignored:
10424+            self._fn.check(Monitor(), verify=True))
10425         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
10426         d.addCallback(self.check_expected_failure,
10427                       CorruptShareError, "invalid privkey",
10428hunk ./src/allmydata/test/test_mutable.py 1549
10429         return d
10430 
10431     def test_verify_one_bad_encprivkey_uncheckable(self):
10432-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
10433+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
10434         readonly_fn = self._fn.get_readonly()
10435         # a read-only node has no way to validate the privkey
10436hunk ./src/allmydata/test/test_mutable.py 1552
10437-        d = readonly_fn.check(Monitor(), verify=True)
10438+        d.addCallback(lambda ignored:
10439+            readonly_fn.check(Monitor(), verify=True))
10440         d.addCallback(self.check_good,
10441                       "test_verify_one_bad_encprivkey_uncheckable")
10442         return d
10443hunk ./src/allmydata/test/test_mutable.py 1558
10444 
10445+
10446+    def test_verify_mdmf_good(self):
10447+        d = self.publish_mdmf()
10448+        d.addCallback(lambda ignored:
10449+            self._fn.check(Monitor(), verify=True))
10450+        d.addCallback(self.check_good, "test_verify_mdmf_good")
10451+        return d
10452+
10453+
10454+    def test_verify_mdmf_one_bad_block(self):
10455+        d = self.publish_mdmf()
10456+        d.addCallback(lambda ignored:
10457+            corrupt(None, self._storage, "share_data", [1]))
10458+        d.addCallback(lambda ignored:
10459+            self._fn.check(Monitor(), verify=True))
10460+        # We should find one bad block here
10461+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
10462+        d.addCallback(self.check_expected_failure,
10463+                      CorruptShareError, "block hash tree failure",
10464+                      "test_verify_mdmf_one_bad_block")
10465+        return d
10466+
10467+
10468+    def test_verify_mdmf_bad_encprivkey(self):
10469+        d = self.publish_mdmf()
10470+        d.addCallback(lambda ignored:
10471+            corrupt(None, self._storage, "enc_privkey", [1]))
10472+        d.addCallback(lambda ignored:
10473+            self._fn.check(Monitor(), verify=True))
10474+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
10475+        d.addCallback(self.check_expected_failure,
10476+                      CorruptShareError, "privkey",
10477+                      "test_verify_mdmf_bad_encprivkey")
10478+        return d
10479+
10480+
10481+    def test_verify_mdmf_bad_sig(self):
10482+        d = self.publish_mdmf()
10483+        d.addCallback(lambda ignored:
10484+            corrupt(None, self._storage, 1, [1]))
10485+        d.addCallback(lambda ignored:
10486+            self._fn.check(Monitor(), verify=True))
10487+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
10488+        return d
10489+
10490+
10491+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
10492+        d = self.publish_mdmf()
10493+        d.addCallback(lambda ignored:
10494+            corrupt(None, self._storage, "enc_privkey", [1]))
10495+        d.addCallback(lambda ignored:
10496+            self._fn.get_readonly())
10497+        d.addCallback(lambda fn:
10498+            fn.check(Monitor(), verify=True))
10499+        d.addCallback(self.check_good,
10500+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
10501+        return d
10502+
10503+
10504 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
10505 
10506     def get_shares(self, s):
10507hunk ./src/allmydata/test/test_mutable.py 1682
10508         current_shares = self.old_shares[-1]
10509         self.failUnlessEqual(old_shares, current_shares)
10510 
10511+
10512     def test_unrepairable_0shares(self):
10513         d = self.publish_one()
10514         def _delete_all_shares(ign):
10515hunk ./src/allmydata/test/test_mutable.py 1697
10516         d.addCallback(_check)
10517         return d
10518 
10519+    def test_mdmf_unrepairable_0shares(self):
10520+        d = self.publish_mdmf()
10521+        def _delete_all_shares(ign):
10522+            shares = self._storage._peers
10523+            for peerid in shares:
10524+                shares[peerid] = {}
10525+        d.addCallback(_delete_all_shares)
10526+        d.addCallback(lambda ign: self._fn.check(Monitor()))
10527+        d.addCallback(lambda check_results: self._fn.repair(check_results))
10528+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
10529+        return d
10530+
10531+
10532     def test_unrepairable_1share(self):
10533         d = self.publish_one()
10534         def _delete_all_shares(ign):
10535hunk ./src/allmydata/test/test_mutable.py 1726
10536         d.addCallback(_check)
10537         return d
10538 
10539+    def test_mdmf_unrepairable_1share(self):
10540+        d = self.publish_mdmf()
10541+        def _delete_all_shares(ign):
10542+            shares = self._storage._peers
10543+            for peerid in shares:
10544+                for shnum in list(shares[peerid]):
10545+                    if shnum > 0:
10546+                        del shares[peerid][shnum]
10547+        d.addCallback(_delete_all_shares)
10548+        d.addCallback(lambda ign: self._fn.check(Monitor()))
10549+        d.addCallback(lambda check_results: self._fn.repair(check_results))
10550+        def _check(crr):
10551+            self.failUnlessEqual(crr.get_successful(), False)
10552+        d.addCallback(_check)
10553+        return d
10554+
10555+    def test_repairable_5shares(self):
10556+        d = self.publish_one()
10557+        def _delete_some_shares(ign):
10558+            shares = self._storage._peers
10559+            for peerid in shares:
10560+                for shnum in list(shares[peerid]):
10561+                    if shnum > 4:
10562+                        del shares[peerid][shnum]
10563+        d.addCallback(_delete_some_shares)
10564+        d.addCallback(lambda ign: self._fn.check(Monitor()))
10565+        d.addCallback(lambda check_results: self._fn.repair(check_results))
10566+        def _check(crr):
10567+            self.failUnlessEqual(crr.get_successful(), True)
10568+        d.addCallback(_check)
10569+        return d
10570+
10571+    def test_mdmf_repairable_5shares(self):
10572+        d = self.publish_mdmf()
10573+        def _delete_some_shares(ign):
10574+            shares = self._storage._peers
10575+            for peerid in shares:
10576+                for shnum in list(shares[peerid]):
10577+                    if shnum > 5:
10578+                        del shares[peerid][shnum]
10579+        d.addCallback(_delete_some_shares)
10580+        d.addCallback(lambda ign: self._fn.check(Monitor()))
10581+        def _check(cr):
10582+            self.failIf(cr.is_healthy())
10583+            self.failUnless(cr.is_recoverable())
10584+            return cr
10585+        d.addCallback(_check)
10586+        d.addCallback(lambda check_results: self._fn.repair(check_results))
10587+        def _check1(crr):
10588+            self.failUnlessEqual(crr.get_successful(), True)
10589+        d.addCallback(_check1)
10590+        return d
10591+
10592+
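The repair tests above delete shares until either fewer than k distinct share numbers survive (unrepairable) or at least k remain (repairable; k=3-of-n=10 is the default encoding these tests publish with). The threshold logic they exercise, sketched standalone:

```python
def recoverable(shares_per_server, k=3):
    """A version is recoverable iff at least k distinct share numbers
    survive somewhere on the grid (k=3 in these tests' default encoding)."""
    surviving = set()
    for shares in shares_per_server.values():
        surviving.update(shares)
    return len(surviving) >= k

# ten servers, one share each, shnums 0..9
grid = dict((peer, {peer: "share data"}) for peer in range(10))
for peer in grid:  # mimic _delete_some_shares: drop shnums above 4
    for shnum in list(grid[peer]):
        if shnum > 4:
            del grid[peer][shnum]
assert recoverable(grid)      # 5 distinct shares >= k: repair succeeds
for peer in grid:
    grid[peer].clear()
assert not recoverable(grid)  # no shares left: unrepairable
```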
10593     def test_merge(self):
10594         self.old_shares = []
10595         d = self.publish_multiple()
10596hunk ./src/allmydata/test/test_mutable.py 1894
10597 class MultipleEncodings(unittest.TestCase):
10598     def setUp(self):
10599         self.CONTENTS = "New contents go here"
10600+        self.uploadable = MutableData(self.CONTENTS)
10601         self._storage = FakeStorage()
10602         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
10603         self._storage_broker = self._nodemaker.storage_broker
10604hunk ./src/allmydata/test/test_mutable.py 1898
10605-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
10606+        d = self._nodemaker.create_mutable_file(self.uploadable)
10607         def _created(node):
10608             self._fn = node
10609         d.addCallback(_created)
10610hunk ./src/allmydata/test/test_mutable.py 1924
10611         s = self._storage
10612         s._peers = {} # clear existing storage
10613         p2 = Publish(fn2, self._storage_broker, None)
10614-        d = p2.publish(data)
10615+        uploadable = MutableData(data)
10616+        d = p2.publish(uploadable)
10617         def _published(res):
10618             shares = s._peers
10619             s._peers = {}
10620hunk ./src/allmydata/test/test_mutable.py 2227
10621         self.basedir = "mutable/Problems/test_publish_surprise"
10622         self.set_up_grid()
10623         nm = self.g.clients[0].nodemaker
10624-        d = nm.create_mutable_file("contents 1")
10625+        d = nm.create_mutable_file(MutableData("contents 1"))
10626         def _created(n):
10627             d = defer.succeed(None)
10628             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10629hunk ./src/allmydata/test/test_mutable.py 2237
10630             d.addCallback(_got_smap1)
10631             # then modify the file, leaving the old map untouched
10632             d.addCallback(lambda res: log.msg("starting winning write"))
10633-            d.addCallback(lambda res: n.overwrite("contents 2"))
10634+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10635             # now attempt to modify the file with the old servermap. This
10636             # will look just like an uncoordinated write, in which every
10637             # single share got updated between our mapupdate and our publish
10638hunk ./src/allmydata/test/test_mutable.py 2246
10639                           self.shouldFail(UncoordinatedWriteError,
10640                                           "test_publish_surprise", None,
10641                                           n.upload,
10642-                                          "contents 2a", self.old_map))
10643+                                          MutableData("contents 2a"), self.old_map))
10644             return d
10645         d.addCallback(_created)
10646         return d
10647hunk ./src/allmydata/test/test_mutable.py 2255
10648         self.basedir = "mutable/Problems/test_retrieve_surprise"
10649         self.set_up_grid()
10650         nm = self.g.clients[0].nodemaker
10651-        d = nm.create_mutable_file("contents 1")
10652+        d = nm.create_mutable_file(MutableData("contents 1"))
10653         def _created(n):
10654             d = defer.succeed(None)
10655             d.addCallback(lambda res: n.get_servermap(MODE_READ))
10656hunk ./src/allmydata/test/test_mutable.py 2265
10657             d.addCallback(_got_smap1)
10658             # then modify the file, leaving the old map untouched
10659             d.addCallback(lambda res: log.msg("starting winning write"))
10660-            d.addCallback(lambda res: n.overwrite("contents 2"))
10661+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10662             # now attempt to retrieve the old version with the old servermap.
10663             # This will look like someone has changed the file since we
10664             # updated the servermap.
10665hunk ./src/allmydata/test/test_mutable.py 2274
10666             d.addCallback(lambda res:
10667                           self.shouldFail(NotEnoughSharesError,
10668                                           "test_retrieve_surprise",
10669-                                          "ran out of peers: have 0 shares (k=3)",
10670+                                          "ran out of peers: have 0 of 1",
10671                                           n.download_version,
10672                                           self.old_map,
10673                                           self.old_map.best_recoverable_version(),
10674hunk ./src/allmydata/test/test_mutable.py 2283
10675         d.addCallback(_created)
10676         return d
10677 
10678+
10679     def test_unexpected_shares(self):
10680         # upload the file, take a servermap, shut down one of the servers,
10681         # upload it again (causing shares to appear on a new server), then
10682hunk ./src/allmydata/test/test_mutable.py 2293
10683         self.basedir = "mutable/Problems/test_unexpected_shares"
10684         self.set_up_grid()
10685         nm = self.g.clients[0].nodemaker
10686-        d = nm.create_mutable_file("contents 1")
10687+        d = nm.create_mutable_file(MutableData("contents 1"))
10688         def _created(n):
10689             d = defer.succeed(None)
10690             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10691hunk ./src/allmydata/test/test_mutable.py 2305
10692                 self.g.remove_server(peer0)
10693                 # then modify the file, leaving the old map untouched
10694                 log.msg("starting winning write")
10695-                return n.overwrite("contents 2")
10696+                return n.overwrite(MutableData("contents 2"))
10697             d.addCallback(_got_smap1)
10698             # now attempt to modify the file with the old servermap. This
10699             # will look just like an uncoordinated write, in which every
10700hunk ./src/allmydata/test/test_mutable.py 2315
10701                           self.shouldFail(UncoordinatedWriteError,
10702                                           "test_surprise", None,
10703                                           n.upload,
10704-                                          "contents 2a", self.old_map))
10705+                                          MutableData("contents 2a"), self.old_map))
10706             return d
10707         d.addCallback(_created)
10708         return d
10709hunk ./src/allmydata/test/test_mutable.py 2319
10710+    test_unexpected_shares.timeout = 15
10711 
10712     def test_bad_server(self):
10713         # Break one server, then create the file: the initial publish should
10714hunk ./src/allmydata/test/test_mutable.py 2355
10715         d.addCallback(_break_peer0)
10716         # now "create" the file, using the pre-established key, and let the
10717         # initial publish finally happen
10718-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
10719+        d.addCallback(lambda res: nm.create_mutable_file(MutableData("contents 1")))
10720         # that ought to work
10721         def _got_node(n):
10722             d = n.download_best_version()
10723hunk ./src/allmydata/test/test_mutable.py 2364
10724             def _break_peer1(res):
10725                 self.connection1.broken = True
10726             d.addCallback(_break_peer1)
10727-            d.addCallback(lambda res: n.overwrite("contents 2"))
10728+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10729             # that ought to work too
10730             d.addCallback(lambda res: n.download_best_version())
10731             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10732hunk ./src/allmydata/test/test_mutable.py 2396
10733         peerids = [serverid for (serverid,ss) in sb.get_all_servers()]
10734         self.g.break_server(peerids[0])
10735 
10736-        d = nm.create_mutable_file("contents 1")
10737+        d = nm.create_mutable_file(MutableData("contents 1"))
10738         def _created(n):
10739             d = n.download_best_version()
10740             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10741hunk ./src/allmydata/test/test_mutable.py 2404
10742             def _break_second_server(res):
10743                 self.g.break_server(peerids[1])
10744             d.addCallback(_break_second_server)
10745-            d.addCallback(lambda res: n.overwrite("contents 2"))
10746+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10747             # that ought to work too
10748             d.addCallback(lambda res: n.download_best_version())
10749             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10750hunk ./src/allmydata/test/test_mutable.py 2423
10751         d = self.shouldFail(NotEnoughServersError,
10752                             "test_publish_all_servers_bad",
10753                             "Ran out of non-bad servers",
10754-                            nm.create_mutable_file, "contents")
10755+                            nm.create_mutable_file, MutableData("contents"))
10756         return d
10757 
10758     def test_publish_no_servers(self):
10759hunk ./src/allmydata/test/test_mutable.py 2435
10760         d = self.shouldFail(NotEnoughServersError,
10761                             "test_publish_no_servers",
10762                             "Ran out of non-bad servers",
10763-                            nm.create_mutable_file, "contents")
10764+                            nm.create_mutable_file, MutableData("contents"))
10765         return d
10766     test_publish_no_servers.timeout = 30
10767 
10768hunk ./src/allmydata/test/test_mutable.py 2453
10769         # we need some contents that are large enough to push the privkey out
10770         # of the early part of the file
10771         LARGE = "These are Larger contents" * 2000 # about 50KB
10772-        d = nm.create_mutable_file(LARGE)
10773+        LARGE_uploadable = MutableData(LARGE)
10774+        d = nm.create_mutable_file(LARGE_uploadable)
10775         def _created(n):
10776             self.uri = n.get_uri()
10777             self.n2 = nm.create_from_cap(self.uri)
10778hunk ./src/allmydata/test/test_mutable.py 2489
10779         self.basedir = "mutable/Problems/test_privkey_query_missing"
10780         self.set_up_grid(num_servers=20)
10781         nm = self.g.clients[0].nodemaker
10782-        LARGE = "These are Larger contents" * 2000 # about 50KB
10783+        LARGE = "These are Larger contents" * 2000 # about 50KiB
10784+        LARGE_uploadable = MutableData(LARGE)
10785         nm._node_cache = DevNullDictionary() # disable the nodecache
10786 
10787hunk ./src/allmydata/test/test_mutable.py 2493
10788-        d = nm.create_mutable_file(LARGE)
10789+        d = nm.create_mutable_file(LARGE_uploadable)
10790         def _created(n):
10791             self.uri = n.get_uri()
10792             self.n2 = nm.create_from_cap(self.uri)
10793hunk ./src/allmydata/test/test_mutable.py 2503
10794         d.addCallback(_created)
10795         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
10796         return d
10797+
10798+
10799+    def test_block_and_hash_query_error(self):
10800+        # This tests for what happens when a query to a remote server
10801+        # fails in either the hash validation step or the block getting
10802+        # step (because of batching, this is the same actual query).
10803+        # We need to have the storage server persist up until the point
10804+        # that its prefix is validated, then suddenly die. This
10805+        # exercises some exception handling code in Retrieve.
10806+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
10807+        self.set_up_grid(num_servers=20)
10808+        nm = self.g.clients[0].nodemaker
10809+        CONTENTS = "contents" * 2000
10810+        CONTENTS_uploadable = MutableData(CONTENTS)
10811+        d = nm.create_mutable_file(CONTENTS_uploadable)
10812+        def _created(node):
10813+            self._node = node
10814+        d.addCallback(_created)
10815+        d.addCallback(lambda ignored:
10816+            self._node.get_servermap(MODE_READ))
10817+        def _then(servermap):
10818+            # we have our servermap. Now we set up the servers like the
10819+            # tests above -- the first one that gets a read call should
10820+            # start throwing errors, but only after returning its prefix
10821+            # for validation. Since we'll download without fetching the
10822+            # private key, the next query to the remote server will be
10823+            # for either a block and salt or for hashes, either of which
10824+            # will exercise the error handling code.
10825+            killer = FirstServerGetsKilled()
10826+            for (serverid, ss) in nm.storage_broker.get_all_servers():
10827+                ss.post_call_notifier = killer.notify
10828+            ver = servermap.best_recoverable_version()
10829+            assert ver
10830+            return self._node.download_version(servermap, ver)
10831+        d.addCallback(_then)
10832+        d.addCallback(lambda data:
10833+            self.failUnlessEqual(data, CONTENTS))
10834+        return d
10835+
10836+
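`test_block_and_hash_query_error` relies on `FirstServerGetsKilled` (defined earlier in test_mutable.py) to let the first queried server answer once, then fail every later call, which is what drives Retrieve's error handling. The fault-injection pattern, sketched with hypothetical stand-in classes:

```python
class FirstServerGetsKilledSketch:
    """After the first remote call completes, mark that server broken
    so every later call to it fails; other servers are untouched."""
    def __init__(self):
        self.killed = False

    def notify(self, retval, wrapper, methname):
        if not self.killed:
            wrapper.broken = True  # answered once; now start failing
            self.killed = True
        return retval

class FakeServerWrapper:
    def __init__(self):
        self.broken = False

killer = FirstServerGetsKilledSketch()
first, second = FakeServerWrapper(), FakeServerWrapper()
assert killer.notify("prefix bytes", first, "slot_readv") == "prefix bytes"
assert first.broken       # first responder dies after its answer
killer.notify("more bytes", second, "slot_readv")
assert not second.broken  # only the first server is killed
```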
10837+class FileHandle(unittest.TestCase):
10838+    def setUp(self):
10839+        self.test_data = "Test Data" * 50000
10840+        self.sio = StringIO(self.test_data)
10841+        self.uploadable = MutableFileHandle(self.sio)
10842+
10843+
10844+    def test_filehandle_read(self):
10845+        self.basedir = "mutable/FileHandle/test_filehandle_read"
10846+        chunk_size = 10
10847+        for i in xrange(0, len(self.test_data), chunk_size):
10848+            data = self.uploadable.read(chunk_size)
10849+            data = "".join(data)
10850+            start = i
10851+            end = i + chunk_size
10852+            self.failUnlessEqual(data, self.test_data[start:end])
10853+
10854+
10855+    def test_filehandle_get_size(self):
10856+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
10857+        actual_size = len(self.test_data)
10858+        size = self.uploadable.get_size()
10859+        self.failUnlessEqual(size, actual_size)
10860+
10861+
10862+    def test_filehandle_get_size_out_of_order(self):
10863+        # We should be able to call get_size whenever we want without
10864+        # disturbing the location of the seek pointer.
10865+        chunk_size = 100
10866+        data = self.uploadable.read(chunk_size)
10867+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
10868+
10869+        # Now get the size.
10870+        size = self.uploadable.get_size()
10871+        self.failUnlessEqual(size, len(self.test_data))
10872+
10873+        # Now get more data. We should be right where we left off.
10874+        more_data = self.uploadable.read(chunk_size)
10875+        start = chunk_size
10876+        end = chunk_size * 2
10877+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
10878+
10879+
10880+    def test_filehandle_file(self):
10881+        # Make sure that the MutableFileHandle works on a file as well
10882+        # as a StringIO object, since in some cases it will be asked to
10883+        # deal with files.
10884+        self.basedir = self.mktemp()
10885+        # mktemp() only returns a path; we must create the directory.
10886+        os.mkdir(self.basedir)
10887+        f_path = os.path.join(self.basedir, "test_file")
10888+        f = open(f_path, "w")
10889+        f.write(self.test_data)
10890+        f.close()
10891+        f = open(f_path, "r")
10892+
10893+        uploadable = MutableFileHandle(f)
10894+
10895+        data = uploadable.read(len(self.test_data))
10896+        self.failUnlessEqual("".join(data), self.test_data)
10897+        size = uploadable.get_size()
10898+        self.failUnlessEqual(size, len(self.test_data))
10899+
10900+
10901+    def test_close(self):
10902+        # Make sure that the MutableFileHandle closes its handle when
10903+        # told to do so.
10904+        self.uploadable.close()
10905+        self.failUnless(self.sio.closed)
10906+
10907+
10908+class DataHandle(unittest.TestCase):
10909+    def setUp(self):
10910+        self.test_data = "Test Data" * 50000
10911+        self.uploadable = MutableData(self.test_data)
10912+
10913+
10914+    def test_datahandle_read(self):
10915+        chunk_size = 10
10916+        for i in xrange(0, len(self.test_data), chunk_size):
10917+            data = self.uploadable.read(chunk_size)
10918+            data = "".join(data)
10919+            start = i
10920+            end = i + chunk_size
10921+            self.failUnlessEqual(data, self.test_data[start:end])
10922+
10923+
10924+    def test_datahandle_get_size(self):
10925+        actual_size = len(self.test_data)
10926+        size = self.uploadable.get_size()
10927+        self.failUnlessEqual(size, actual_size)
10928+
10929+
10930+    def test_datahandle_get_size_out_of_order(self):
10931+        # We should be able to call get_size whenever we want without
10932+        # disturbing the location of the seek pointer.
10933+        chunk_size = 100
10934+        data = self.uploadable.read(chunk_size)
10935+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
10936+
10937+        # Now get the size.
10938+        size = self.uploadable.get_size()
10939+        self.failUnlessEqual(size, len(self.test_data))
10940+
10941+        # Now get more data. We should be right where we left off.
10942+        more_data = self.uploadable.read(chunk_size)
10943+        start = chunk_size
10944+        end = chunk_size * 2
10945+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
10946+
10947+
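The FileHandle and DataHandle tests above exercise the same uploadable contract: read(n) returns a list of chunks, get_size() must not disturb the read position, and close() closes the underlying handle. A minimal standalone sketch of that contract (SketchDataHandle is a hypothetical stand-in, not the real allmydata.mutable.publish classes):

```python
from io import BytesIO

class SketchDataHandle(object):
    """Hypothetical sketch of the uploadable interface the tests assume."""
    def __init__(self, filehandle):
        self._fh = filehandle

    def read(self, length):
        # Return a list of chunks; callers join with "".join(...),
        # as the tests above do.
        return [self._fh.read(length)]

    def get_size(self):
        # Save the current offset, seek to the end to measure the
        # size, then restore the offset so the next read picks up
        # exactly where the last one left off.
        here = self._fh.tell()
        self._fh.seek(0, 2)  # 2 == SEEK_END
        size = self._fh.tell()
        self._fh.seek(here)
        return size

    def close(self):
        self._fh.close()
```

Returning a list of chunks (rather than a single string) is what lets the real wrappers hand back data incrementally without concatenating large buffers.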
10948+class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
10949+              PublishMixin):
10950+    def setUp(self):
10951+        GridTestMixin.setUp(self)
10952+        self.basedir = self.mktemp()
10953+        self.set_up_grid()
10954+        self.c = self.g.clients[0]
10955+        self.nm = self.c.nodemaker
10956+        self.data = "test data" * 100000 # about 900 KiB; MDMF
10957+        self.small_data = "test data" * 10 # about 90 B; SDMF
10958+        return self.do_upload()
10959+
10960+
10961+    def do_upload(self):
10962+        d1 = self.nm.create_mutable_file(MutableData(self.data),
10963+                                         version=MDMF_VERSION)
10964+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
10965+        dl = gatherResults([d1, d2])
10966+        def _then((n1, n2)):
10967+            assert isinstance(n1, MutableFileNode)
10968+            assert isinstance(n2, MutableFileNode)
10969+
10970+            self.mdmf_node = n1
10971+            self.sdmf_node = n2
10972+        dl.addCallback(_then)
10973+        return dl
10974+
10975+
10976+    def test_get_readonly_mutable_version(self):
10977+        # Attempting to get a mutable version of a mutable file from a
10978+        # filenode initialized with a readcap should return a readonly
10979+        # version of that same node.
10980+        ro = self.mdmf_node.get_readonly()
10981+        d = ro.get_best_mutable_version()
10982+        d.addCallback(lambda version:
10983+            self.failUnless(version.is_readonly()))
10984+        d.addCallback(lambda ignored:
10985+            self.sdmf_node.get_readonly())
10986+        d.addCallback(lambda version:
10987+            self.failUnless(version.is_readonly()))
10988+        return d
10989+
10990+
10991+    def test_get_sequence_number(self):
10992+        d = self.mdmf_node.get_best_readable_version()
10993+        d.addCallback(lambda bv:
10994+            self.failUnlessEqual(bv.get_sequence_number(), 1))
10995+        d.addCallback(lambda ignored:
10996+            self.sdmf_node.get_best_readable_version())
10997+        d.addCallback(lambda bv:
10998+            self.failUnlessEqual(bv.get_sequence_number(), 1))
10999+        # Now update. After the update, the sequence number in both
11000+        # cases should be 2.
11001+        def _do_update(ignored):
11002+            new_data = MutableData("foo bar baz" * 100000)
11003+            new_small_data = MutableData("foo bar baz" * 10)
11004+            d1 = self.mdmf_node.overwrite(new_data)
11005+            d2 = self.sdmf_node.overwrite(new_small_data)
11006+            dl = gatherResults([d1, d2])
11007+            return dl
11008+        d.addCallback(_do_update)
11009+        d.addCallback(lambda ignored:
11010+            self.mdmf_node.get_best_readable_version())
11011+        d.addCallback(lambda bv:
11012+            self.failUnlessEqual(bv.get_sequence_number(), 2))
11013+        d.addCallback(lambda ignored:
11014+            self.sdmf_node.get_best_readable_version())
11015+        d.addCallback(lambda bv:
11016+            self.failUnlessEqual(bv.get_sequence_number(), 2))
11017+        return d
11018+
11019+
11020+    def test_get_writekey(self):
11021+        d = self.mdmf_node.get_best_mutable_version()
11022+        d.addCallback(lambda bv:
11023+            self.failUnlessEqual(bv.get_writekey(),
11024+                                 self.mdmf_node.get_writekey()))
11025+        d.addCallback(lambda ignored:
11026+            self.sdmf_node.get_best_mutable_version())
11027+        d.addCallback(lambda bv:
11028+            self.failUnlessEqual(bv.get_writekey(),
11029+                                 self.sdmf_node.get_writekey()))
11030+        return d
11031+
11032+
11033+    def test_get_storage_index(self):
11034+        d = self.mdmf_node.get_best_mutable_version()
11035+        d.addCallback(lambda bv:
11036+            self.failUnlessEqual(bv.get_storage_index(),
11037+                                 self.mdmf_node.get_storage_index()))
11038+        d.addCallback(lambda ignored:
11039+            self.sdmf_node.get_best_mutable_version())
11040+        d.addCallback(lambda bv:
11041+            self.failUnlessEqual(bv.get_storage_index(),
11042+                                 self.sdmf_node.get_storage_index()))
11043+        return d
11044+
11045+
11046+    def test_get_readonly_version(self):
11047+        d = self.mdmf_node.get_best_readable_version()
11048+        d.addCallback(lambda bv:
11049+            self.failUnless(bv.is_readonly()))
11050+        d.addCallback(lambda ignored:
11051+            self.sdmf_node.get_best_readable_version())
11052+        d.addCallback(lambda bv:
11053+            self.failUnless(bv.is_readonly()))
11054+        return d
11055+
11056+
11057+    def test_get_mutable_version(self):
11058+        d = self.mdmf_node.get_best_mutable_version()
11059+        d.addCallback(lambda bv:
11060+            self.failIf(bv.is_readonly()))
11061+        d.addCallback(lambda ignored:
11062+            self.sdmf_node.get_best_mutable_version())
11063+        d.addCallback(lambda bv:
11064+            self.failIf(bv.is_readonly()))
11065+        return d
11066+
11067+
11068+    def test_toplevel_overwrite(self):
11069+        new_data = MutableData("foo bar baz" * 100000)
11070+        new_small_data = MutableData("foo bar baz" * 10)
11071+        d = self.mdmf_node.overwrite(new_data)
11072+        d.addCallback(lambda ignored:
11073+            self.mdmf_node.download_best_version())
11074+        d.addCallback(lambda data:
11075+            self.failUnlessEqual(data, "foo bar baz" * 100000))
11076+        d.addCallback(lambda ignored:
11077+            self.sdmf_node.overwrite(new_small_data))
11078+        d.addCallback(lambda ignored:
11079+            self.sdmf_node.download_best_version())
11080+        d.addCallback(lambda data:
11081+            self.failUnlessEqual(data, "foo bar baz" * 10))
11082+        return d
11083+
11084+
11085+    def test_toplevel_modify(self):
11086+        def modifier(old_contents, servermap, first_time):
11087+            return old_contents + "modified"
11088+        d = self.mdmf_node.modify(modifier)
11089+        d.addCallback(lambda ignored:
11090+            self.mdmf_node.download_best_version())
11091+        d.addCallback(lambda data:
11092+            self.failUnlessIn("modified", data))
11093+        d.addCallback(lambda ignored:
11094+            self.sdmf_node.modify(modifier))
11095+        d.addCallback(lambda ignored:
11096+            self.sdmf_node.download_best_version())
11097+        d.addCallback(lambda data:
11098+            self.failUnlessIn("modified", data))
11099+        return d
11100+
11101+
11102+    def test_version_modify(self):
11103+        # TODO: When we can publish multiple versions, alter this test
11104+        # to modify a version other than the best usable version, then
11105+        # verify that the best recoverable version is the one we expect.
11106+        def modifier(old_contents, servermap, first_time):
11107+            return old_contents + "modified"
11108+        d = self.mdmf_node.modify(modifier)
11109+        d.addCallback(lambda ignored:
11110+            self.mdmf_node.download_best_version())
11111+        d.addCallback(lambda data:
11112+            self.failUnlessIn("modified", data))
11113+        d.addCallback(lambda ignored:
11114+            self.sdmf_node.modify(modifier))
11115+        d.addCallback(lambda ignored:
11116+            self.sdmf_node.download_best_version())
11117+        d.addCallback(lambda data:
11118+            self.failUnlessIn("modified", data))
11119+        return d
11120+
11121+
11122+    def test_download_version(self):
11123+        d = self.publish_multiple()
11124+        # We want to have two recoverable versions on the grid.
11125+        d.addCallback(lambda res:
11126+                      self._set_versions({0:3,2:3,4:3,6:3,8:3,
11127+                                          1:4,3:4,5:4,7:4,9:4}))
11128+        # Now try to download each version. We should get the plaintext
11129+        # associated with that version.
11130+        d.addCallback(lambda ignored:
11131+            self._fn.get_servermap(mode=MODE_READ))
11132+        def _got_servermap(smap):
11133+            versions = smap.recoverable_versions()
11134+            assert len(versions) == 2
11135+
11136+            self.servermap = smap
11137+            self.version1, self.version2 = versions
11138+            self.version1_seqnum = self.version1[0]
11139+            self.version2_seqnum = self.version2[0]
11140+
11141+        d.addCallback(_got_servermap)
11142+        d.addCallback(lambda ignored:
11143+            self._fn.download_version(self.servermap, self.version1))
11144+        d.addCallback(lambda results:
11145+            self.failUnlessEqual(self.CONTENTS[self.version1_seqnum],
11146+                                 results))
11147+        d.addCallback(lambda ignored:
11148+            self._fn.download_version(self.servermap, self.version2))
11149+        d.addCallback(lambda results:
11150+            self.failUnlessEqual(self.CONTENTS[self.version2_seqnum],
11151+                                 results))
11152+        return d
11153+
11154+
11155+    def test_partial_read(self):
11156+        # read only a few bytes at a time, and see that the results are
11157+        # what we expect.
11158+        d = self.mdmf_node.get_best_readable_version()
11159+        def _read_data(version):
11160+            c = consumer.MemoryConsumer()
11161+            d2 = defer.succeed(None)
11162+            for i in xrange(0, len(self.data), 10000):
11163+                d2.addCallback(lambda ignored, i=i: version.read(c, i, 10000))
11164+            d2.addCallback(lambda ignored:
11165+                self.failUnlessEqual(self.data, "".join(c.chunks)))
11166+            return d2
11167+        d.addCallback(_read_data)
11168+        return d
11169+
11170+
11171+    def test_read(self):
11172+        d = self.mdmf_node.get_best_readable_version()
11173+        def _read_data(version):
11174+            c = consumer.MemoryConsumer()
11175+            d2 = defer.succeed(None)
11176+            d2.addCallback(lambda ignored: version.read(c))
11177+            d2.addCallback(lambda ignored:
11178+                self.failUnlessEqual("".join(c.chunks), self.data))
11179+            return d2
11180+        d.addCallback(_read_data)
11181+        return d
11182+
11183+
11184+    def test_download_best_version(self):
11185+        d = self.mdmf_node.download_best_version()
11186+        d.addCallback(lambda data:
11187+            self.failUnlessEqual(data, self.data))
11188+        d.addCallback(lambda ignored:
11189+            self.sdmf_node.download_best_version())
11190+        d.addCallback(lambda data:
11191+            self.failUnlessEqual(data, self.small_data))
11192+        return d
11193+
11194+
11195+class Update(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin):
11196+    def setUp(self):
11197+        GridTestMixin.setUp(self)
11198+        self.basedir = self.mktemp()
11199+        self.set_up_grid()
11200+        self.c = self.g.clients[0]
11201+        self.nm = self.c.nodemaker
11202+        self.data = "test data" * 100000 # about 900 KiB; MDMF
11203+        self.small_data = "test data" * 10 # about 90 B; SDMF
11204+        return self.do_upload()
11205+
11206+
11207+    def do_upload(self):
11208+        d1 = self.nm.create_mutable_file(MutableData(self.data),
11209+                                         version=MDMF_VERSION)
11210+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
11211+        dl = gatherResults([d1, d2])
11212+        def _then((n1, n2)):
11213+            assert isinstance(n1, MutableFileNode)
11214+            assert isinstance(n2, MutableFileNode)
11215+
11216+            self.mdmf_node = n1
11217+            self.sdmf_node = n2
11218+        dl.addCallback(_then)
11219+        return dl
11220+
11221+
11222+    def test_append(self):
11223+        # We should be able to append data to the end of a mutable
11224+        # file and get what we expect.
11225+        new_data = self.data + "appended"
11226+        d = self.mdmf_node.get_best_mutable_version()
11227+        d.addCallback(lambda mv:
11228+            mv.update(MutableData("appended"), len(self.data)))
11229+        d.addCallback(lambda ignored:
11230+            self.mdmf_node.download_best_version())
11231+        d.addCallback(lambda results:
11232+            self.failUnlessEqual(results, new_data))
11233+        return d
11234+    test_append.timeout = 15
11235+
11236+
11237+    def test_replace(self):
11238+        # We should be able to replace data in the middle of a mutable
11239+        # file and get what we expect back.
11240+        new_data = self.data[:100]
11241+        new_data += "appended"
11242+        new_data += self.data[108:]
11243+        d = self.mdmf_node.get_best_mutable_version()
11244+        d.addCallback(lambda mv:
11245+            mv.update(MutableData("appended"), 100))
11246+        d.addCallback(lambda ignored:
11247+            self.mdmf_node.download_best_version())
11248+        d.addCallback(lambda results:
11249+            self.failUnlessEqual(results, new_data))
11250+        return d
11251+
11252+
11253+    def test_replace_and_extend(self):
11254+        # We should be able to replace data in the middle of a mutable
11255+        # file and extend that mutable file and get what we expect.
11256+        new_data = self.data[:100]
11257+        new_data += "modified " * 100000
11258+        d = self.mdmf_node.get_best_mutable_version()
11259+        d.addCallback(lambda mv:
11260+            mv.update(MutableData("modified " * 100000), 100))
11261+        d.addCallback(lambda ignored:
11262+            self.mdmf_node.download_best_version())
11263+        d.addCallback(lambda results:
11264+            self.failUnlessEqual(results, new_data))
11265+        return d
11266+
11267+
11268+    def test_append_power_of_two(self):
11269+        # If we attempt to extend a mutable file so that its segment
11270+        # count crosses a power-of-two boundary, the update operation
11271+        # should know how to reencode the file.
11272+
11273+        # Note that the data populating self.mdmf_node is about 900 KiB
11274+        # long -- this is 7 segments at the default segment size. So we
11275+        # need to add 2 segments worth of data to push it over a
11276+        # power-of-two boundary.
11277+        segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
11278+        new_data = self.data + (segment * 2)
11279+        d = self.mdmf_node.get_best_mutable_version()
11280+        d.addCallback(lambda mv:
11281+            mv.update(MutableData(segment * 2), len(self.data)))
11282+        d.addCallback(lambda ignored:
11283+            self.mdmf_node.download_best_version())
11284+        d.addCallback(lambda results:
11285+            self.failUnlessEqual(results, new_data))
11286+        return d
11287+    test_append_power_of_two.timeout = 15
11288+
11289+
11290+    def test_update_sdmf(self):
11291+        # Running update on a single-segment file should still work.
11292+        new_data = self.small_data + "appended"
11293+        d = self.sdmf_node.get_best_mutable_version()
11294+        d.addCallback(lambda mv:
11295+            mv.update(MutableData("appended"), len(self.small_data)))
11296+        d.addCallback(lambda ignored:
11297+            self.sdmf_node.download_best_version())
11298+        d.addCallback(lambda results:
11299+            self.failUnlessEqual(results, new_data))
11300+        return d
11301+
11302+    def test_replace_in_last_segment(self):
11303+        # The wrapper should know how to handle the tail segment
11304+        # appropriately.
11305+        replace_offset = len(self.data) - 100
11306+        new_data = self.data[:replace_offset] + "replaced"
11307+        rest_offset = replace_offset + len("replaced")
11308+        new_data += self.data[rest_offset:]
11309+        d = self.mdmf_node.get_best_mutable_version()
11310+        d.addCallback(lambda mv:
11311+            mv.update(MutableData("replaced"), replace_offset))
11312+        d.addCallback(lambda ignored:
11313+            self.mdmf_node.download_best_version())
11314+        d.addCallback(lambda results:
11315+            self.failUnlessEqual(results, new_data))
11316+        return d
11317+
11318+
11319+    def test_multiple_segment_replace(self):
11320+        replace_offset = 2 * DEFAULT_MAX_SEGMENT_SIZE
11321+        new_data = self.data[:replace_offset]
11322+        new_segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
11323+        new_data += 2 * new_segment
11324+        new_data += "replaced"
11325+        rest_offset = len(new_data)
11326+        new_data += self.data[rest_offset:]
11327+        d = self.mdmf_node.get_best_mutable_version()
11328+        d.addCallback(lambda mv:
11329+            mv.update(MutableData((2 * new_segment) + "replaced"),
11330+                      replace_offset))
11331+        d.addCallback(lambda ignored:
11332+            self.mdmf_node.download_best_version())
11333+        d.addCallback(lambda results:
11334+            self.failUnlessEqual(results, new_data))
11335+        return d
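Every expected-contents computation in the Update tests above follows the same splice rule: writing `new` at `offset` overwrites len(new) bytes in place, and extends the file if the write runs past the end. A small helper capturing that rule (hypothetical; the real logic lives in the MDMF publish path):

```python
def splice(old, new, offset):
    """Expected result of update(MutableData(new), offset) on contents old.

    Bytes before the offset are untouched; the bytes covered by the
    write are replaced; any remaining tail is preserved. When
    offset + len(new) runs past the end of old, the slice after the
    write is empty and the file simply grows.
    """
    return old[:offset] + new + old[offset + len(new):]
```

For example, test_append is splice(data, "appended", len(data)), and test_replace is splice(data, "appended", 100).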
11336hunk ./src/allmydata/test/test_sftp.py 32
11337 
11338 from allmydata.util.consumer import download_to_data
11339 from allmydata.immutable import upload
11340+from allmydata.mutable import publish
11341 from allmydata.test.no_network import GridTestMixin
11342 from allmydata.test.common import ShouldFailMixin
11343 from allmydata.test.common_util import ReallyEqualMixin
11344hunk ./src/allmydata/test/test_sftp.py 84
11345         return d
11346 
11347     def _set_up_tree(self):
11348-        d = self.client.create_mutable_file("mutable file contents")
11349+        u = publish.MutableData("mutable file contents")
11350+        d = self.client.create_mutable_file(u)
11351         d.addCallback(lambda node: self.root.set_node(u"mutable", node))
11352         def _created_mutable(n):
11353             self.mutable = n
11354hunk ./src/allmydata/test/test_sftp.py 1334
11355         d.addCallback(lambda ign: self.failUnlessEqual(sftpd.all_heisenfiles, {}))
11356         d.addCallback(lambda ign: self.failUnlessEqual(self.handler._heisenfiles, {}))
11357         return d
11358+    test_makeDirectory.timeout = 15
11359 
11360     def test_execCommand_and_openShell(self):
11361         class FakeProtocol:
11362hunk ./src/allmydata/test/test_system.py 25
11363 from allmydata.monitor import Monitor
11364 from allmydata.mutable.common import NotWriteableError
11365 from allmydata.mutable import layout as mutable_layout
11366+from allmydata.mutable.publish import MutableData
11367 from foolscap.api import DeadReferenceError
11368 from twisted.python.failure import Failure
11369 from twisted.web.client import getPage
11370hunk ./src/allmydata/test/test_system.py 463
11371     def test_mutable(self):
11372         self.basedir = "system/SystemTest/test_mutable"
11373         DATA = "initial contents go here."  # 25 bytes % 3 != 0
11374+        DATA_uploadable = MutableData(DATA)
11375         NEWDATA = "new contents yay"
11376hunk ./src/allmydata/test/test_system.py 465
11377+        NEWDATA_uploadable = MutableData(NEWDATA)
11378         NEWERDATA = "this is getting old"
11379hunk ./src/allmydata/test/test_system.py 467
11380+        NEWERDATA_uploadable = MutableData(NEWERDATA)
11381 
11382         d = self.set_up_nodes(use_key_generator=True)
11383 
11384hunk ./src/allmydata/test/test_system.py 474
11385         def _create_mutable(res):
11386             c = self.clients[0]
11387             log.msg("starting create_mutable_file")
11388-            d1 = c.create_mutable_file(DATA)
11389+            d1 = c.create_mutable_file(DATA_uploadable)
11390             def _done(res):
11391                 log.msg("DONE: %s" % (res,))
11392                 self._mutable_node_1 = res
11393hunk ./src/allmydata/test/test_system.py 561
11394             self.failUnlessEqual(res, DATA)
11395             # replace the data
11396             log.msg("starting replace1")
11397-            d1 = newnode.overwrite(NEWDATA)
11398+            d1 = newnode.overwrite(NEWDATA_uploadable)
11399             d1.addCallback(lambda res: newnode.download_best_version())
11400             return d1
11401         d.addCallback(_check_download_3)
11402hunk ./src/allmydata/test/test_system.py 575
11403             newnode2 = self.clients[3].create_node_from_uri(uri)
11404             self._newnode3 = self.clients[3].create_node_from_uri(uri)
11405             log.msg("starting replace2")
11406-            d1 = newnode1.overwrite(NEWERDATA)
11407+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
11408             d1.addCallback(lambda res: newnode2.download_best_version())
11409             return d1
11410         d.addCallback(_check_download_4)
11411hunk ./src/allmydata/test/test_system.py 645
11412         def _check_empty_file(res):
11413             # make sure we can create empty files, this usually screws up the
11414             # segsize math
11415-            d1 = self.clients[2].create_mutable_file("")
11416+            d1 = self.clients[2].create_mutable_file(MutableData(""))
11417             d1.addCallback(lambda newnode: newnode.download_best_version())
11418             d1.addCallback(lambda res: self.failUnlessEqual("", res))
11419             return d1
11420hunk ./src/allmydata/test/test_system.py 676
11421                                  self.key_generator_svc.key_generator.pool_size + size_delta)
11422 
11423         d.addCallback(check_kg_poolsize, 0)
11424-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
11425+        d.addCallback(lambda junk:
11426+            self.clients[3].create_mutable_file(MutableData('hello, world')))
11427         d.addCallback(check_kg_poolsize, -1)
11428         d.addCallback(lambda junk: self.clients[3].create_dirnode())
11429         d.addCallback(check_kg_poolsize, -2)
11430hunk ./src/allmydata/test/test_web.py 750
11431                              self.PUT, base + "/@@name=/blah.txt", "")
11432         return d
11433 
11434+
11435     def test_GET_DIRURL_named_bad(self):
11436         base = "/file/%s" % urllib.quote(self._foo_uri)
11437         d = self.shouldFail2(error.Error, "test_PUT_DIRURL_named_bad",
11438hunk ./src/allmydata/test/test_web.py 898
11439         return d
11440 
11441     def test_PUT_NEWFILEURL_mutable_toobig(self):
11442-        d = self.shouldFail2(error.Error, "test_PUT_NEWFILEURL_mutable_toobig",
11443-                             "413 Request Entity Too Large",
11444-                             "SDMF is limited to one segment, and 10001 > 10000",
11445-                             self.PUT,
11446-                             self.public_url + "/foo/new.txt?mutable=true",
11447-                             "b" * (self.s.MUTABLE_SIZELIMIT+1))
11448+        # It is okay to upload large mutable files, so we should be able
11449+        # to do that.
11450+        d = self.PUT(self.public_url + "/foo/new.txt?mutable=true",
11451+                     "b" * (self.s.MUTABLE_SIZELIMIT + 1))
11452         return d
11453 
11454     def test_PUT_NEWFILEURL_replace(self):
11455hunk ./src/allmydata/test/test_web.py 1684
11456         return d
11457 
11458     def test_POST_upload_no_link_mutable_toobig(self):
11459-        d = self.shouldFail2(error.Error,
11460-                             "test_POST_upload_no_link_mutable_toobig",
11461-                             "413 Request Entity Too Large",
11462-                             "SDMF is limited to one segment, and 10001 > 10000",
11463-                             self.POST,
11464-                             "/uri", t="upload", mutable="true",
11465-                             file=("new.txt",
11466-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
11467+        # The SDMF size limit is no longer in place, so we should be
11468+        # able to upload mutable files that are as large as we want them
11469+        # to be.
11470+        d = self.POST("/uri", t="upload", mutable="true",
11471+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
11472         return d
11473 
11474     def test_POST_upload_mutable(self):
11475hunk ./src/allmydata/test/test_web.py 1815
11476             self.failUnlessReallyEqual(headers["content-type"], ["text/plain"])
11477         d.addCallback(_got_headers)
11478 
11479-        # make sure that size errors are displayed correctly for overwrite
11480-        d.addCallback(lambda res:
11481-                      self.shouldFail2(error.Error,
11482-                                       "test_POST_upload_mutable-toobig",
11483-                                       "413 Request Entity Too Large",
11484-                                       "SDMF is limited to one segment, and 10001 > 10000",
11485-                                       self.POST,
11486-                                       self.public_url + "/foo", t="upload",
11487-                                       mutable="true",
11488-                                       file=("new.txt",
11489-                                             "b" * (self.s.MUTABLE_SIZELIMIT+1)),
11490-                                       ))
11491-
11492+        # make sure that outdated size limits aren't enforced anymore.
11493+        d.addCallback(lambda ignored:
11494+            self.POST(self.public_url + "/foo", t="upload",
11495+                      mutable="true",
11496+                      file=("new.txt",
11497+                            "b" * (self.s.MUTABLE_SIZELIMIT+1))))
11498         d.addErrback(self.dump_error)
11499         return d
11500 
11501hunk ./src/allmydata/test/test_web.py 1825
11502     def test_POST_upload_mutable_toobig(self):
11503-        d = self.shouldFail2(error.Error,
11504-                             "test_POST_upload_mutable_toobig",
11505-                             "413 Request Entity Too Large",
11506-                             "SDMF is limited to one segment, and 10001 > 10000",
11507-                             self.POST,
11508-                             self.public_url + "/foo",
11509-                             t="upload", mutable="true",
11510-                             file=("new.txt",
11511-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
11512+        # SDMF had a size limit that was removed a while ago. MDMF has
11513+        # never had a size limit. Test to make sure that we do not
11514+        # encounter errors when trying to upload large mutable files,
11515+        # since the code should no longer prohibit large mutable
11516+        # files.
11517+        d = self.POST(self.public_url + "/foo",
11518+                      t="upload", mutable="true",
11519+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
11520         return d
11521 
11522     def dump_error(self, f):
11523hunk ./src/allmydata/test/test_web.py 2956
11524         d.addCallback(_done)
11525         return d
11526 
11527+
11528+    def test_PUT_update_at_offset(self):
11529+        file_contents = "test file" * 100000 # about 900 KiB
11530+        d = self.PUT("/uri?mutable=true", file_contents)
11531+        def _then(filecap):
11532+            self.filecap = filecap
11533+            new_data = file_contents[:100]
11534+            new = "replaced and so on"
11535+            new_data += new
11536+            new_data += file_contents[len(new_data):]
11537+            assert len(new_data) == len(file_contents)
11538+            self.new_data = new_data
11539+        d.addCallback(_then)
11540+        d.addCallback(lambda ignored:
11541+            self.PUT("/uri/%s?replace=True&offset=100" % self.filecap,
11542+                     "replaced and so on"))
11543+        def _get_data(filecap):
11544+            n = self.s.create_node_from_uri(filecap)
11545+            return n.download_best_version()
11546+        d.addCallback(_get_data)
11547+        d.addCallback(lambda results:
11548+            self.failUnlessEqual(results, self.new_data))
11549+        # Now try appending things to the file
11550+        d.addCallback(lambda ignored:
11551+            self.PUT("/uri/%s?offset=%d" % (self.filecap, len(self.new_data)),
11552+                     "puppies" * 100))
11553+        d.addCallback(_get_data)
11554+        d.addCallback(lambda results:
11555+            self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
11556+        return d
11557+
11558+
11559+    def test_PUT_update_at_offset_immutable(self):
11560+        file_contents = "Test file" * 100000
11561+        d = self.PUT("/uri", file_contents)
11562+        def _then(filecap):
11563+            self.filecap = filecap
11564+        d.addCallback(_then)
11565+        d.addCallback(lambda ignored:
11566+            self.shouldHTTPError("test immutable update",
11567+                                 400, "Bad Request",
11568+                                 "immutable",
11569+                                 self.PUT,
11570+                                 "/uri/%s?offset=50" % self.filecap,
11571+                                 "foo"))
11572+        return d
11573+
11574+
11575     def test_bad_method(self):
11576         url = self.webish_url + self.public_url + "/foo/bar.txt"
11577         d = self.shouldHTTPError("test_bad_method",
11578hunk ./src/allmydata/test/test_web.py 3257
11579         def _stash_mutable_uri(n, which):
11580             self.uris[which] = n.get_uri()
11581             assert isinstance(self.uris[which], str)
11582-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
11583+        d.addCallback(lambda ign:
11584+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
11585         d.addCallback(_stash_mutable_uri, "corrupt")
11586         d.addCallback(lambda ign:
11587                       c0.upload(upload.Data("literal", convergence="")))
11588hunk ./src/allmydata/test/test_web.py 3404
11589         def _stash_mutable_uri(n, which):
11590             self.uris[which] = n.get_uri()
11591             assert isinstance(self.uris[which], str)
11592-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
11593+        d.addCallback(lambda ign:
11594+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
11595         d.addCallback(_stash_mutable_uri, "corrupt")
11596 
11597         def _compute_fileurls(ignored):
11598hunk ./src/allmydata/test/test_web.py 4067
11599         def _stash_mutable_uri(n, which):
11600             self.uris[which] = n.get_uri()
11601             assert isinstance(self.uris[which], str)
11602-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
11603+        d.addCallback(lambda ign:
11604+            c0.create_mutable_file(publish.MutableData(DATA+"2")))
11605         d.addCallback(_stash_mutable_uri, "mutable")
11606 
11607         def _compute_fileurls(ignored):
11608hunk ./src/allmydata/test/test_web.py 4167
11609                                                         convergence="")))
11610         d.addCallback(_stash_uri, "small")
11611 
11612-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
11613+        d.addCallback(lambda ign:
11614+            c0.create_mutable_file(publish.MutableData("mutable")))
11615         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
11616         d.addCallback(_stash_uri, "mutable")
11617 
11618}
11619
11620Context:
11621
11622[web download-status: tolerate DYHBs that haven't retired yet. Fixes #1160.
11623Brian Warner <warner@lothar.com>**20100809225100
11624 Ignore-this: cb0add71adde0a2e24f4bcc00abf9938
11625 
11626 Also add a better unit test for it.
11627] 
11628[immutable/filenode.py: put off DownloadStatus creation until first read() call
11629Brian Warner <warner@lothar.com>**20100809225055
11630 Ignore-this: 48564598f236eb73e96cd2d2a21a2445
11631 
11632 This avoids spamming the "recent uploads and downloads" /status page from
11633 FileNode instances that were created for a directory read but which nobody is
11634 ever going to read from. I also cleaned up the way DownloadStatus instances
11635 are made to only ever do it in the CiphertextFileNode, not in the
11636 higher-level plaintext FileNode. Also fixed DownloadStatus handling of read
11637 size, thanks to David-Sarah for the catch.
11638] 
11639[Share: hush log entries in the main loop() after the fetch has been completed.
11640Brian Warner <warner@lothar.com>**20100809204359
11641 Ignore-this: 72b9e262980edf5a967873ebbe1e9479
11642] 
11643[test_runner.py: correct and simplify normalization of package directory for case-insensitive filesystems.
11644david-sarah@jacaranda.org**20100808185005
11645 Ignore-this: fba96e967d4e7f33f301c7d56b577de
11646] 
11647[test_runner.py: make test_path work for test-from-installdir.
11648david-sarah@jacaranda.org**20100808171340
11649 Ignore-this: 46328d769ae6ec8d191c3cddacc91dc9
11650] 
11651[src/allmydata/__init__.py: make the package paths more accurate when we fail to get them from setuptools.
11652david-sarah@jacaranda.org**20100808171235
11653 Ignore-this: 8d534d2764d64f7434880bd70696cd75
11654] 
11655[test_runner.py: another try at calculating the rootdir correctly for test-from-egg and test-from-prefixdir.
11656david-sarah@jacaranda.org**20100808154307
11657 Ignore-this: 66737313935f2a0313d1de9b2ed68d0
11658] 
11659[test_runner.py: calculate the location of bin/tahoe correctly for test-from-prefixdir (by copying code from misc/build_helpers/run_trial.py). Also fix the false-positive check for Unicode paths in test_the_right_code, which was causing skips that should have been failures.
11660david-sarah@jacaranda.org**20100808042817
11661 Ignore-this: 1b7dfff07cbfb1a74f94141b18da2c3f
11662] 
11663[TAG allmydata-tahoe-1.8.0c1
11664david-sarah@jacaranda.org**20100807004546
11665 Ignore-this: 484ff2513774f3b48ca49c992e878b89
11666] 
11667[how_to_make_a_tahoe-lafs_release.txt: add step to check that release will report itself as the intended version.
11668david-sarah@jacaranda.org**20100807004254
11669 Ignore-this: 7709322e883f4118f38c7f042f5a9a2
11670] 
11671[relnotes.txt: 1.8.0c1 release
11672david-sarah@jacaranda.org**20100807003646
11673 Ignore-this: 1994ffcaf55089eb05e96c23c037dfee
11674] 
11675[NEWS, quickstart.html and known_issues.txt for 1.8.0c1 release.
11676david-sarah@jacaranda.org**20100806235111
11677 Ignore-this: 777cea943685cf2d48b6147a7648fca0
11678] 
11679[TAG allmydata-tahoe-1.8.0rc1
11680warner@lothar.com**20100806080450] 
11681[update NEWS and other docs in preparation for 1.8.0rc1
11682Brian Warner <warner@lothar.com>**20100806080228
11683 Ignore-this: 6ebdf11806f6dfbfde0b61115421a459
11684 
11685 in particular, merge the various 1.8.0b1/b2 sections, and remove the
11686 datestamp. NEWS gets updated just before a release, doesn't need to precisely
11687 describe pre-release candidates, and the datestamp gets updated just before
11688 the final release is tagged
11689 
11690 Also, I removed the BOM from some files. My toolchain made it hard to retain,
11691 and BOMs in UTF-8 don't make a whole lot of sense anyway. Sorry if that
11692 messes anything up.
11693] 
11694[downloader.Segmentation: unregisterProducer when asked to stopProducing, this
11695Brian Warner <warner@lothar.com>**20100806070705
11696 Ignore-this: a0a71dcf83df8a6f727deb9a61fa4fdf
11697 seems to avoid the #1155 log message which reveals the URI (and filecap).
11698 
11699 Also add an [ERROR] marker to the flog entry, since unregisterProducer also
11700 makes interrupted downloads appear "200 OK"; this makes it more obvious that
11701 the download did not complete.
11702] 
11703[TAG allmydata-tahoe-1.8.0b2
11704david-sarah@jacaranda.org**20100806052415
11705 Ignore-this: 2c1af8df5e25a6ebd90a32b49b8486dc
11706] 
11707[relnotes.txt and docs/known_issues.txt for 1.8.0beta2.
11708david-sarah@jacaranda.org**20100806040823
11709 Ignore-this: 862ad55d93ee37259ded9e2c9da78eb9
11710] 
11711[test_util.py: use SHA-256 from pycryptopp instead of MD5 from hashlib (for uses in which any hash will do), since hashlib was only added to the stdlib in Python 2.5.
11712david-sarah@jacaranda.org**20100806050051
11713 Ignore-this: 552049b5d190a5ca775a8240030dbe3f
11714] 
11715[test_runner.py: increase timeout to cater for Francois' ARM buildslave.
11716david-sarah@jacaranda.org**20100806042601
11717 Ignore-this: 6ee618cf00ac1c99cb7ddb60fd7ef078
11718] 
11719[test_util.py: remove use of 'a if p else b' syntax that requires Python 2.5.
11720david-sarah@jacaranda.org**20100806041616
11721 Ignore-this: 5fecba9aa530ef352797fcfa70d5c592
11722] 
11723[NEWS and docs/quickstart.html for 1.8.0beta2.
11724david-sarah@jacaranda.org**20100806035112
11725 Ignore-this: 3a593cfdc2ae265da8f64c6c8aebae4
11726] 
11727[docs/quickstart.html: remove link to tahoe-lafs-ticket798-1.8.0b.zip, due to appname regression. refs #1159
11728david-sarah@jacaranda.org**20100806002435
11729 Ignore-this: bad61b30cdcc3d93b4165d5800047b85
11730] 
11731[test_download.DownloadTest.test_simultaneous_goodguess: enable some disabled
11732Brian Warner <warner@lothar.com>**20100805185507
11733 Ignore-this: ac53d44643805412238ccbfae920d20c
11734 checks that used to fail but work now.
11735] 
11736[DownloadNode: fix lost-progress in fetch_failed, tolerate cancel when no segment-fetch is active. Fixes #1154.
11737Brian Warner <warner@lothar.com>**20100805185507
11738 Ignore-this: 35fd36b273b21b6dca12ab3d11ee7d2d
11739 
11740 The lost-progress bug occurred when two simultaneous read() calls fetched
11741 different segments, and the first one failed (due to corruption, or the other
11742 bugs in #1154): the second read() would never complete. While in this state,
11743 cancelling the second read by having its consumer call stopProducing() would
11744 trigger the cancel-intolerance bug. Finally, in downloader.node.Cancel,
11745 prevent late cancels by adding an 'active' flag.
11746] 
11747[util/spans.py: __nonzero__ cannot return a long either. for #1154
11748Brian Warner <warner@lothar.com>**20100805185507
11749 Ignore-this: 6f87fead8252e7a820bffee74a1c51a2
11750] 
11751[test_storage.py: change skip note for test_large_share to say that Windows doesn't support sparse files. refs #569
11752david-sarah@jacaranda.org**20100805022612
11753 Ignore-this: 85c807a536dc4eeb8bf14980028bb05b
11754] 
11755[One fix for bug #1154: webapi GETs with a 'Range' header broke new-downloader.
11756Brian Warner <warner@lothar.com>**20100804184549
11757 Ignore-this: ffa3e703093a905b416af125a7923b7b
11758 
11759 The Range header causes n.read() to be called with an offset= of type 'long',
11760 which eventually got used in a Spans/DataSpans object's __len__ method.
11761 Apparently python doesn't permit __len__() to return longs, only ints.
11762 Rewrote Spans/DataSpans to use s.len() instead of len(s) aka s.__len__() .
11763 Added a test in test_download. Note that test_web didn't catch this because
11764 it uses mock FileNodes for speed: it's probably time to rewrite that.
11765 
11766 There is still an unresolved error-recovery problem in #1154, so I'm not
11767 closing the ticket quite yet.
11768] 
11769[test_download: minor cleanup
11770Brian Warner <warner@lothar.com>**20100804175555
11771 Ignore-this: f4aec3c77f6a0d7f7b2c07f302755cc1
11772] 
11773[fetcher.py: improve comments
11774Brian Warner <warner@lothar.com>**20100804072814
11775 Ignore-this: 8bf74c21aef55cf0b0642e55ee4e7c5f
11776] 
11777[lazily create DownloadNode upon first read()/get_segment()
11778Brian Warner <warner@lothar.com>**20100804072808
11779 Ignore-this: 4bb1c49290cefac1dadd9d42fac46ba2
11780] 
11781[test_hung_server: update comments, remove dead "stage_4_d" code
11782Brian Warner <warner@lothar.com>**20100804072800
11783 Ignore-this: 4d18b374b568237603466f93346d00db
11784] 
11785[copy the rest of David-Sarah's changes to make my tree match 1.8.0beta
11786Brian Warner <warner@lothar.com>**20100804072752
11787 Ignore-this: 9ac7f21c9b27e53452371096146be5bb
11788] 
11789[ShareFinder: add 10s OVERDUE timer, send new requests to replace overdue ones
11790Brian Warner <warner@lothar.com>**20100804072741
11791 Ignore-this: 7fa674edbf239101b79b341bb2944349
11792 
11793 The fixed 10-second timer will eventually be replaced with a per-server
11794 value, calculated based on observed response times.
11795 
11796 test_hung_server.py: enhance to exercise DYHB=OVERDUE state. Split existing
11797 mutable+immutable tests into two pieces for clarity. Reenabled several tests.
11798 Deleted the now-obsolete "test_failover_during_stage_4".
11799] 
11800[Rewrite immutable downloader (#798). This patch adds and updates unit tests.
11801Brian Warner <warner@lothar.com>**20100804072710
11802 Ignore-this: c3c838e124d67b39edaa39e002c653e1
11803] 
11804[Rewrite immutable downloader (#798). This patch includes higher-level
11805Brian Warner <warner@lothar.com>**20100804072702
11806 Ignore-this: 40901ddb07d73505cb58d06d9bff73d9
11807 integration into the NodeMaker, and updates the web-status display to handle
11808 the new download events.
11809] 
11810[Rewrite immutable downloader (#798). This patch rearranges the rest of src/allmydata/immutable/ .
11811Brian Warner <warner@lothar.com>**20100804072639
11812 Ignore-this: 302b1427a39985bfd11ccc14a1199ea4
11813] 
11814[Rewrite immutable downloader (#798). This patch adds the new downloader itself.
11815Brian Warner <warner@lothar.com>**20100804072629
11816 Ignore-this: e9102460798123dd55ddca7653f4fc16
11817] 
11818[util/observer.py: add EventStreamObserver
11819Brian Warner <warner@lothar.com>**20100804072612
11820 Ignore-this: fb9d205f34a6db7580b9be33414dfe21
11821] 
11822[Add a byte-spans utility class, like perl's Set::IntSpan for .newsrc files.
11823Brian Warner <warner@lothar.com>**20100804072600
11824 Ignore-this: bbad42104aeb2f26b8dd0779de546128
11825 Also a data-spans class, which records a byte (instead of a bit) for each
11826 index.
11827] 
11828[check-umids: oops, forgot to add the tool
11829Brian Warner <warner@lothar.com>**20100804071713
11830 Ignore-this: bbeb74d075414f3713fabbdf66189faf
11831] 
11832[coverage tools: ignore errors, display lines-uncovered in elisp mode. Fix Makefile paths.
11833"Brian Warner <warner@lothar.com>"**20100804071131] 
11834[check-umids: new tool to check uniqueness of umids
11835"Brian Warner <warner@lothar.com>"**20100804071042] 
11836[misc/simulators/sizes.py: update, we now use SHA256 (not SHA1), so large-file overhead grows to 0.5%
11837"Brian Warner <warner@lothar.com>"**20100804070942] 
11838[storage-overhead: try to fix, probably still broken
11839"Brian Warner <warner@lothar.com>"**20100804070815] 
11840[docs/quickstart.html: link to 1.8.0beta zip, and note 'bin\tahoe' on Windows.
11841david-sarah@jacaranda.org**20100803233254
11842 Ignore-this: 3c11f249efc42a588e3a7056349739ed
11843] 
11844[docs: relnotes.txt for 1.8.0β
11845zooko@zooko.com**20100803154913
11846 Ignore-this: d9101f72572b18da3cfac3c0e272c907
11847] 
11848[test_storage.py: avoid spurious test failure by accepting either 'Next crawl in 59 minutes' or 'Next crawl in 60 minutes'. fixes #1140
11849david-sarah@jacaranda.org**20100803102058
11850 Ignore-this: aa2419fc295727e4fbccec3c7b780e76
11851] 
11852[misc/build_helpers/show-tool-versions.py: get sys.std{out,err}.encoding and 'as' version correctly, and improve formatting.
11853david-sarah@jacaranda.org**20100803101128
11854 Ignore-this: 4fd2907d86da58eb220e104010e9c6a
11855] 
11856[misc/build_helpers/show-tool-versions.py: avoid error message when 'as -version' does not create a.out.
11857david-sarah@jacaranda.org**20100803094812
11858 Ignore-this: 38fc2d639f30b4e123b9551e6931998d
11859] 
11860[CLI: further improve consistency of basedir options and add tests. addresses #118
11861david-sarah@jacaranda.org**20100803085416
11862 Ignore-this: d8f8f55738abb5ea44ed4cf24d750efe
11863] 
11864[CLI: make the synopsis for 'tahoe unlink' say unlink instead of rm.
11865david-sarah@jacaranda.org**20100803085359
11866 Ignore-this: c35d3f99f906dfab61df8f5e81a42c92
11867] 
11868[CLI: make all of the option descriptions imperative sentences.
11869david-sarah@jacaranda.org**20100803084801
11870 Ignore-this: ec80c7d2a10c6452d190fee4e1a60739
11871] 
11872[test_cli.py: make 'tahoe mkdir' tests slightly less dumb (check for 'URI:' in the output).
11873david-sarah@jacaranda.org**20100803084720
11874 Ignore-this: 31a4ae4fb5f7c123bc6b6e36a9e3911e
11875] 
11876[test_cli.py: use u-escapes instead of UTF-8.
11877david-sarah@jacaranda.org**20100803083538
11878 Ignore-this: a48af66942defe8491c6e1811c7809b5
11879] 
11880[NEWS: remove XXX comment and separate description of #890.
11881david-sarah@jacaranda.org**20100803050827
11882 Ignore-this: 6d308f34dc9d929d3d0811f7a1f5c786
11883] 
11884[docs: more updates to NEWS for 1.8.0β
11885zooko@zooko.com**20100803044618
11886 Ignore-this: 8193a1be38effe2bdcc632fdb570e9fc
11887] 
11888[docs: incomplete beginnings of a NEWS update for v1.8β
11889zooko@zooko.com**20100802072840
11890 Ignore-this: cb00fcd4f1e0eaed8c8341014a2ba4d4
11891] 
11892[docs/quickstart.html: extra step to open a new Command Prompt or log out/in on Windows.
11893david-sarah@jacaranda.org**20100803004938
11894 Ignore-this: 1334a2cd01f77e0c9eddaeccfeff2370
11895] 
11896[update bundled zetuptools with doc changes, change to script setup for Windows XP, and to have the 'develop' command run script setup.
11897david-sarah@jacaranda.org**20100803003815
11898 Ignore-this: 73c86e154f4d3f7cc9855eb31a20b1ed
11899] 
11900[bundled setuptools/command/scriptsetup.py: use SendMessageTimeoutW, to test whether that broadcasts environment changes any better.
11901david-sarah@jacaranda.org**20100802224505
11902 Ignore-this: 7788f7c2f9355e7852a376ec94182056
11903] 
11904[bundled zetuptoolz: add missing setuptools/command/scriptsetup.py
11905david-sarah@jacaranda.org**20100802072129
11906 Ignore-this: 794b1c411f6cdec76eeb716223a55d0
11907] 
11908[test_runner.py: add test_run_with_python_options, which checks that the Windows script changes haven't broken 'python <options> bin/tahoe'.
11909david-sarah@jacaranda.org**20100802062558
11910 Ignore-this: 812a2ccb7d9c7a8e01d5ca04d875aba5
11911] 
11912[test_runner.py: fix missing import of get_filesystem_encoding
11913david-sarah@jacaranda.org**20100802060902
11914 Ignore-this: 2e9e439b7feb01e0c3c94b54e802503b
11915] 
11916[Bundle setuptools-0.6c16dev (with Windows script changes, and the change to only warn if site.py wasn't generated by setuptools) instead of 0.6c15dev. addresses #565, #1073, #1074
11917david-sarah@jacaranda.org**20100802060602
11918 Ignore-this: 34ee2735e49e2c05b57e353d48f83050
11919] 
11920[.darcs-boringfile: changes needed to take account of egg directories being bundled. Also, make _trial_temp a prefix rather than exact match.
11921david-sarah@jacaranda.org**20100802050313
11922 Ignore-this: 8de6a8dbaba014ba88dec6c792fc5a9d
11923] 
11924[.darcs-boringfile: changes needed to take account of pyscript wrappers on Windows.
11925david-sarah@jacaranda.org**20100802050128
11926 Ignore-this: 7366b631e2095166696e6da5765d9180
11927] 
11928[misc/build_helpers/run_trial.py: check that the root from which the module we are testing was loaded is the current directory. This version of the patch folds in later fixes to the logic for calculating the directories to compare, and improvements to error messages. addresses #1137
11929david-sarah@jacaranda.org**20100802045535
11930 Ignore-this: 9d3c1447f0539c6308127413098eb646
11931] 
11932[Skip option arguments to the python interpreter when reconstructing Unicode argv on Windows.
11933david-sarah@jacaranda.org**20100728062731
11934 Ignore-this: 2b17fc43860bcc02a66bb6e5e050ea7c
11935] 
11936[windows/fixups.py: improve comments and reference some relevant Python bugs.
11937david-sarah@jacaranda.org**20100727181921
11938 Ignore-this: 32e61cf98dfc2e3dac60b750dda6429b
11939] 
11940[windows/fixups.py: make errors reported to original_stderr have enough information to debug even if we can't see the traceback.
11941david-sarah@jacaranda.org**20100726221904
11942 Ignore-this: e30b4629a7aa5d71554237c7e809c080
11943] 
11944[windows/fixups.py: fix paste-o in name of Unicode stderr wrapper.
11945david-sarah@jacaranda.org**20100726214736
11946 Ignore-this: cb220931f1683eb53b0c7269e18a38be
11947] 
11948[windows/fixups.py: Don't rely on buggy MSVCRT library for Unicode output, use the Win32 API instead. This should make it work on XP. Also, change how we handle the case where sys.stdout and sys.stderr are redirected, since the .encoding attribute isn't necessarily writeable.
11949david-sarah@jacaranda.org**20100726045019
11950 Ignore-this: 69267abc5065cbd5b86ca71fe4921fb6
11951] 
11952[test_runner.py: change to code for locating the bin/tahoe script that was missed when rebasing the patch for #1074.
11953david-sarah@jacaranda.org**20100725182008
11954 Ignore-this: d891a93989ecc3f4301a17110c3d196c
11955] 
11956[Add missing windows/fixups.py (for setting up Unicode args and output on Windows).
11957david-sarah@jacaranda.org**20100725092849
11958 Ignore-this: 35a1e8aeb4e1dea6e81433bf0825a6f6
11959] 
11960[Changes to Tahoe needed to work with new zetuptoolz (that does not use .exe wrappers on Windows), and to support Unicode arguments and stdout/stderr -- v5
11961david-sarah@jacaranda.org**20100725083216
11962 Ignore-this: 5041a634b1328f041130658233f6a7ce
11963] 
11964[scripts/common.py: fix an error introduced when rebasing to the ticket798 branch, which caused base directories to be duplicated in self.basedirs.
11965david-sarah@jacaranda.org**20100802064929
11966 Ignore-this: 116fd437d1f91a647879fe8d9510f513
11967] 
11968[Basedir/node directory option improvements for ticket798 branch. addresses #188, #706, #715, #772, #890
11969david-sarah@jacaranda.org**20100802043004
11970 Ignore-this: d19fc24349afa19833406518595bfdf7
11971] 
11972[scripts/create_node.py: allow nickname to be Unicode. Also ensure webport is validly encoded in config file.
11973david-sarah@jacaranda.org**20100802000212
11974 Ignore-this: fb236169280507dd1b3b70d459155f6e
11975] 
11976[test_runner.py: Fix error in message arguments to 'fail' calls.
11977david-sarah@jacaranda.org**20100802013526
11978 Ignore-this: 3bfdef19ae3cf993194811367da5d020
11979] 
11980[Additional Unicode basedir changes for ticket798 branch.
11981david-sarah@jacaranda.org**20100802010552
11982 Ignore-this: 7090d8c6b04eb6275345a55e75142028
11983] 
11984[Unicode basedir changes for ticket798 branch.
11985david-sarah@jacaranda.org**20100801235310
11986 Ignore-this: a00717eaeae8650847b5395801e04c45
11987] 
11988[fileutil: change WindowsError to OSError in abspath_expanduser_unicode, because WindowsError might not exist.
11989david-sarah@jacaranda.org**20100725222603
11990 Ignore-this: e125d503670ed049a9ade0322faa0c51
11991] 
11992[test_system: correct a failure in _test_runner caused by Unicode basedir patch on non-Unicode platforms.
11993david-sarah@jacaranda.org**20100724032123
11994 Ignore-this: 399b3953104fdd1bbed3f7564d163553
11995] 
11996[Fix test failures due to Unicode basedir patches.
11997david-sarah@jacaranda.org**20100725010318
11998 Ignore-this: fe92cd439eb3e60a56c007ae452784ed
11999] 
12000[util.encodingutil: change quote_output to do less unnecessary escaping, and to use double-quotes more consistently when needed. This version avoids u-escaping for characters that are representable in the output encoding, when double quotes are used, and includes tests. fixes #1135
12001david-sarah@jacaranda.org**20100723075314
12002 Ignore-this: b82205834d17db61612dd16436b7c5a2
12003] 
12004[Replace uses of os.path.abspath with abspath_expanduser_unicode where necessary. This makes basedir paths consistently represented as Unicode.
12005david-sarah@jacaranda.org**20100722001418
12006 Ignore-this: 9f8cb706540e695550e0dbe303c01f52
12007] 
12008[util.fileutil, test.test_util: add abspath_expanduser_unicode function, to work around <http://bugs.python.org/issue3426>. util.encodingutil: add a convenience function argv_to_abspath.
12009david-sarah@jacaranda.org**20100721231507
12010 Ignore-this: eee6904d1f65a733ff35190879844d08
12011] 
12012[setup: increase requirement on foolscap from >= 0.4.1 to >= 0.5.1 to avoid the foolscap performance bug with transferring large mutable files
12013zooko@zooko.com**20100802071748
12014 Ignore-this: 53b5b8571ebfee48e6b11e3f3a5efdb7
12015] 
12016[upload: tidy up logging messages
12017zooko@zooko.com**20100802070212
12018 Ignore-this: b3532518326f6d808d085da52c14b661
12019 reformat code to be less than 100 chars wide, refactor formatting of logging messages, add log levels to some logging messages, M-x whitespace-cleanup
12020] 
12021[tests: remove debug print
12022zooko@zooko.com**20100802063339
12023 Ignore-this: b13b8c15e946556bffca9d7ad7c890f5
12024] 
12025[docs: update the list of forums to announce Tahoe-LAFS too, add empty checkboxes
12026zooko@zooko.com**20100802063314
12027 Ignore-this: 89d0e8bd43f1749a9e85fcee2205bb04
12028] 
12029[immutable: tidy-up some code by using a set instead of list to hold homeless_shares
12030zooko@zooko.com**20100802062004
12031 Ignore-this: a70bda3cf6c48ab0f0688756b015cf8d
12032] 
12033[setup: fix a couple instances of hard-coded 'allmydata-tahoe' in the scripts, tighten the tests (as suggested by David-Sarah)
12034zooko@zooko.com**20100801164207
12035 Ignore-this: 50265b562193a9a3797293123ed8ba5c
12036] 
12037[setup: replace hardcoded 'allmydata-tahoe' with allmydata.__appname__
12038zooko@zooko.com**20100801160517
12039 Ignore-this: 55e1a98515300d228f02df10975f7ba
12040] 
12041[NEWS: describe #1055
12042zooko@zooko.com**20100801034338
12043 Ignore-this: 3a16cfa387c2b245c610ea1d1ad8d7f1
12044] 
12045[immutable: use PrefixingLogMixin to organize logging in Tahoe2PeerSelector and add more detailed messages about peer
12046zooko@zooko.com**20100719082000
12047 Ignore-this: e034c4988b327f7e138a106d913a3082
12048] 
12049[benchmarking: update bench_dirnode to be correct and use the shiniest new pyutil.benchutil features concerning what units you measure in
12050zooko@zooko.com**20100719044948
12051 Ignore-this: b72059e4ff921741b490e6b47ec687c6
12052] 
12053[trivial: rename and add in-line doc to clarify "used_peers" => "upload_servers"
12054zooko@zooko.com**20100719044744
12055 Ignore-this: 93c42081676e0dea181e55187cfc506d
12056] 
12057[abbreviate time edge case python2.5 unit test
12058jacob.lyles@gmail.com**20100729210638
12059 Ignore-this: 80f9b1dc98ee768372a50be7d0ef66af
12060] 
12061[docs: add Jacob Lyles to CREDITS
12062zooko@zooko.com**20100730230500
12063 Ignore-this: 9dbbd6a591b4b1a5a8dcb69b7b757792
12064] 
12065[web: don't use %d formatting on a potentially large negative float -- there is a bug in Python 2.5 in that case
12066jacob.lyles@gmail.com**20100730220550
12067 Ignore-this: 7080eb4bddbcce29cba5447f8f4872ee
12068 fixes #1055
12069] 
12070[test_upload.py: rename test_problem_layout_ticket1124 to test_problem_layout_ticket_1124 -- fix .todo reference.
12071david-sarah@jacaranda.org**20100729152927
12072 Ignore-this: c8fe1047edcc83c87b9feb47f4aa587b
12073] 
12074[test_upload.py: rename test_problem_layout_ticket1124 to test_problem_layout_ticket_1124 for consistency.
12075david-sarah@jacaranda.org**20100729142250
12076 Ignore-this: bc3aad5919ae9079ceb9968ad0f5ea5a
12077] 
12078[docs: fix licensing typo that was earlier fixed in [20090921164651-92b7f-7f97b58101d93dc588445c52a9aaa56a2c7ae336]
12079zooko@zooko.com**20100729052923
12080 Ignore-this: a975d79115911688e5469d4d869e1664
12081 I wish we didn't have copies of this licensing text in several different files, so that changes couldn't be accidentally omitted from some of them.
12082] 
12083[misc/build_helpers/run-with-pythonpath.py: fix stale comment, and remove 'trial' example that is not the right way to run trial.
12084david-sarah@jacaranda.org**20100726225729
12085 Ignore-this: a61f55557ad69a1633bfb2b8172cce97
12086] 
12087[docs/specifications/dirnodes.txt: 'mesh'->'grid'.
12088david-sarah@jacaranda.org**20100723061616
12089 Ignore-this: 887bcf921ef00afba8e05e9239035bca
12090] 
12091[docs/specifications/dirnodes.txt: bring layer terminology up-to-date with architecture.txt, and a few other updates (e.g. note that the MAC is no longer verified, and that URIs can be unknown). Also 'Tahoe'->'Tahoe-LAFS'.
12092david-sarah@jacaranda.org**20100723054703
12093 Ignore-this: f3b98183e7d0a0f391225b8b93ac6c37
12094] 
12095[docs: use current cap to Zooko's wiki page in example text
12096zooko@zooko.com**20100721010543
12097 Ignore-this: 4f36f36758f9fdbaf9eb73eac23b6652
12098 fixes #1134
12099] 
12100[__init__.py: silence DeprecationWarning about BaseException.message globally. fixes #1129
12101david-sarah@jacaranda.org**20100720011939
12102 Ignore-this: 38808986ba79cb2786b010504a22f89
12103] 
12104[test_runner: test that 'tahoe --version' outputs no noise (e.g. DeprecationWarnings).
12105david-sarah@jacaranda.org**20100720011345
12106 Ignore-this: dd358b7b2e5d57282cbe133e8069702e
12107] 
12108[TAG allmydata-tahoe-1.7.1
12109zooko@zooko.com**20100719131352
12110 Ignore-this: 6942056548433dc653a746703819ad8c
12111] 
12112Patch bundle hash:
12113afb619189f48ad8601373cd041b8a82925d02dd9