Ticket #393: 393status34.dpatch

File 393status34.dpatch, 555.4 KB (added by kevan at 2010-08-14T23:31:53Z)
1Mon Aug  9 16:25:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
2  * mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
3 
4  The checker and repairer required minimal changes to work with the MDMF
5  modifications made elsewhere. The checker duplicated a lot of the code
6  that was already in the downloader, so I modified the downloader
7  slightly to expose this functionality to the checker and removed the
8  duplicated code. The repairer only required a minor change to deal with
9  data representation.
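
  In outline, the checker's verification path now hands the heavy lifting
  to the retriever. A condensed sketch of the new flow (names match the
  checker.py hunk later in this patch):

      # verify=True makes the retriever fetch all N shares and verify
      # them without decrypting or decoding; bad shares are reported
      # back so the checker can flag the file for repair.
      def _verify_all_shares(self, servermap):
          if not self.best_version:
              return
          r = Retrieve(self._node, servermap, self.best_version, verify=True)
          d = r.download()
          d.addCallback(self._process_bad_shares)  # sets need_repair
          return d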
10
11Mon Aug  9 16:32:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
12  * interfaces.py: Add #993 interfaces
13
14Mon Aug  9 16:35:35 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
15  * frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
16
17Mon Aug  9 16:40:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
18  * mutable/layout.py and interfaces.py: add MDMF writer and reader
19 
20  The MDMF writer is responsible for keeping state as plaintext is
21  gradually processed into share data by the upload process. When the
22  upload finishes, it will write all of its share data to a remote server,
23  reporting its status back to the publisher.
24 
25  The MDMF reader is responsible for abstracting an MDMF file as it sits
26  on the grid from the downloader; specifically, by receiving and
27  responding to requests for arbitrary data within the MDMF file.
28 
29  The interfaces.py file has also been modified to contain an interface
30  for the writer.
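
  Concretely, the publish process drives a writer through roughly the
  following sequence (a sketch against the IMutableSlotWriter interface
  added in this patch; the arguments are stand-ins, and the real
  publisher runs many writers in parallel and handles errors):

      def publish_share(writer, encoded_blocks, encrypted_privkey,
                        block_hashes, share_hashes, root_hash,
                        signer, verification_key):
          # encoded_blocks: a list of (block, salt) pairs in segment order
          for segnum, (block, salt) in enumerate(encoded_blocks):
              writer.put_block(block, segnum, salt)
          writer.put_encprivkey(encrypted_privkey)
          writer.put_blockhashes(block_hashes)   # a list of hashes
          writer.put_sharehashes(share_hashes)   # a dict of index -> hash
          writer.put_root_hash(root_hash)        # on the proxy classes
          writer.put_signature(signer.sign(writer.get_signable()))
          writer.put_verification_key(verification_key)
          return writer.finish_publishing()      # Deferred; fires on reply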
31
32Mon Aug  9 17:06:19 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
33  * immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
34
35Mon Aug  9 17:06:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
36  * immutable/literal.py: implement the same interfaces as other filenodes
37
38Wed Aug 11 16:31:01 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
39  * mutable/publish.py: Modify the publish process to support MDMF
40 
41  The inner workings of the publishing process needed to be reworked to a
42  large extent to cope with segmented mutable files, and to cope with
43  partial-file updates of mutable files. This patch does that. It also
44  introduces wrappers for uploadable data, allowing the use of
45  filehandle-like objects as data sources, in addition to strings. This
46  improves memory efficiency when dealing with large files through the
47  webapi, and clarifies the update code there.
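
  The wrappers follow the IMutableUploadable interface added to
  interfaces.py in this bundle (get_size/read/close). A minimal
  file-backed uploadable might look like this sketch (illustrative only;
  the real MutableFileHandle in mutable/publish.py is more careful):

      import os
      from twisted.internet import defer

      class SketchFileUploadable:
          """An IMutableUploadable around a seekable filehandle (sketch)."""
          def __init__(self, filehandle):
              self._f = filehandle

          def get_size(self):
              # fires with the total number of bytes the uploadable holds
              here = self._f.tell()
              self._f.seek(0, os.SEEK_END)
              size = self._f.tell()
              self._f.seek(here)
              return defer.succeed(size)

          def read(self, length):
              # a list of strings that concatenate to the next 'length'
              # bytes, or fewer at end-of-file
              return [self._f.read(length)]

          def close(self):
              self._f.close()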
48
49Wed Aug 11 16:31:25 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
50  * mutable/retrieve.py: Modify the retrieval process to support MDMF
51 
52  The logic behind a mutable file download had to be adapted to work with
53  segmented mutable files; this patch performs those adaptations. It also
54  exposes some decoding and decrypting functionality to make partial-file
55  updates a little easier, and supports efficient random-access downloads
56  of parts of an MDMF file.
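
  For example, fetching an arbitrary byte range of the best version now
  downloads only the segments covering that range (a sketch against the
  IReadable interface in this bundle; MemoryConsumer is the
  download-to-memory consumer from src/allmydata/util/consumer.py that
  the interface's read() docstring points to):

      from allmydata.util.consumer import MemoryConsumer

      def read_range(filenode, offset, size):
          # Fires with bytes [offset, offset+size) of the best version.
          d = filenode.get_best_readable_version()
          def _read(version):
              d2 = version.read(MemoryConsumer(), offset, size)
              # read() fires with the consumer once it is unregistered
              d2.addCallback(lambda mc: "".join(mc.chunks))
              return d2
          d.addCallback(_read)
          return d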
57
58Wed Aug 11 16:33:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
59  * mutable/servermap.py: Alter the servermap updater to work with MDMF files
60 
61  These modifications were almost all in service of having the
62  servermap updater use the unified MDMF + SDMF read interface whenever
63  possible -- this reduces the complexity of the code, making it easier to
64  read and maintain. To do this, I needed to modify the process of
65  updating the servermap a little bit.
66 
67  To support partial-file updates, I also modified the servermap updater
68  to fetch the block hash trees and certain segments of files while it
69  performed a servermap update (this can be done without adding any new
70  roundtrips because of batch-read functionality that the read proxy has).
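
  The batching works because the slot_readv remote method already takes a
  vector of (offset, length) ranges, so the extra fetches ride along on a
  roundtrip the updater was making anyway. Roughly (a sketch; the real
  read proxy assembles and parses these ranges itself):

      def batched_read(server, storage_index, shnum, ranges):
          # Fetch several byte ranges of one share in a single roundtrip.
          # slot_readv returns {shnum: [data, ...]}, one string per range.
          d = server.callRemote("slot_readv", storage_index, [shnum], ranges)
          d.addCallback(lambda datavs: datavs[shnum])
          return d

      # e.g. the signed prefix, the block hash tree, and segment 0 in one
      # trip (the offsets and lengths here are stand-ins):
      #   batched_read(ss, si, 0, [(0, 123), (bht_off, bht_len), (s0_off, s0_len)])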
71 
72
73Fri Aug 13 16:49:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
74  * scripts: tell 'tahoe put' about MDMF
75
76Sat Aug 14 01:10:12 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
77  * web: Alter the webapi to get along with and take advantage of the MDMF changes
78 
79  The main benefit that the webapi gets from MDMF, at least initially, is
80  the ability to do a streaming download of an MDMF mutable file. It also
81  exposes a way (through the PUT verb) to append to or otherwise modify
82  (in-place) an MDMF mutable file.
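
  From a client's point of view, an in-place write is a PUT against the
  file's cap at a byte offset. A sketch (the 'offset' query parameter is
  an assumption for illustration; the web/ hunks in the full patch define
  the authoritative interface):

      import httplib

      def put_at_offset(host, port, filecap, data, offset):
          # Append to, or modify in place, an MDMF file via the webapi.
          conn = httplib.HTTPConnection(host, port)
          conn.request("PUT", "/uri/%s?offset=%d" % (filecap, offset), data)
          resp = conn.getresponse()
          return resp.status, resp.read()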
83
84Sat Aug 14 15:56:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
85  * docs: update docs to mention MDMF
86
87Sat Aug 14 15:57:11 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
88  * client.py: learn how to create different kinds of mutable files
89
90Sat Aug 14 15:57:38 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
91  * mutable/filenode.py: add versions and partial-file updates to the mutable file node
92 
93  One of the goals of MDMF as a GSoC project is to lay the groundwork for
94  LDMF, a format that will allow Tahoe-LAFS to support and encourage
95  multiple versions of a single cap on the grid. In line with this, there
96  is now a distinction between an overriding mutable file (which can be
97  thought of as corresponding to the cap/unique identifier for that mutable
98  file) and versions of the mutable file (which we can download, update,
99  and so on). All download, upload, and modification operations end up
100  happening on a particular version of a mutable file, but there are
101  shortcut methods on the object representing the overriding mutable file
102  that perform these operations on the best version of the mutable file
103  (which is what code should be doing until we have LDMF and better
104  support for other paradigms).
105 
106  Another goal of MDMF was to take advantage of segmentation to give
107  callers more efficient partial file updates or appends. This patch
108  implements methods that do that, too.
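
  In terms of the new API, an efficient append looks like this (a sketch
  using the IMutableFileVersion/IWritable interfaces from this bundle;
  MutableData is the in-memory uploadable from mutable/publish.py):

      from allmydata.mutable.publish import MutableData

      def append_to_file(node, data):
          # Fetch the best version, then write at the current end of the
          # file -- per IWritable.update, offset == current size appends.
          d = node.get_best_mutable_version()
          def _update(version):
              return version.update(MutableData(data), version.get_size())
          d.addCallback(_update)
          return d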
109 
110
111Sat Aug 14 15:58:29 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
112  * nodemaker.py: Make nodemaker expose a way to create MDMF files
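
  Together with the client.py change above, callers can ask for a
  particular mutable format at creation time. Roughly (a sketch; the
  'version=' keyword is an assumption about the signature these patches
  give create_mutable_file, built on the MDMF_VERSION constant added to
  interfaces.py):

      from allmydata.interfaces import MDMF_VERSION
      from allmydata.mutable.publish import MutableData

      # 'client' is a running allmydata.client.Client instance.
      d = client.create_mutable_file(MutableData("initial contents"),
                                     version=MDMF_VERSION)  # assumed kwarg
      d.addCallback(lambda node: node.get_uri())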
113
114Sat Aug 14 15:58:48 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
115  * tests:
116 
117      - A lot of existing tests relied on aspects of the mutable file
118        implementation that were changed. This patch updates those tests
119        to work with the changes.
120      - This patch also adds tests for new features.
121
122New patches:
123
124[mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
125Kevan Carstensen <kevan@isnotajoke.com>**20100809232514
126 Ignore-this: 1bcef2f262c868f61e57cc19a3cac89a
127 
128 The checker and repairer required minimal changes to work with the MDMF
129 modifications made elsewhere. The checker duplicated a lot of the code
130 that was already in the downloader, so I modified the downloader
131 slightly to expose this functionality to the checker and removed the
132 duplicated code. The repairer only required a minor change to deal with
133 data representation.
134] {
135hunk ./src/allmydata/mutable/checker.py 12
136 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
137 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
138 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
139+from allmydata.mutable.retrieve import Retrieve # for verifying
140 
141 class MutableChecker:
142 
143hunk ./src/allmydata/mutable/checker.py 29
144 
145     def check(self, verify=False, add_lease=False):
146         servermap = ServerMap()
147+        # Updating the servermap in MODE_CHECK will stand a good chance
148+        # of finding all of the shares, and getting a good idea of
149+        # recoverability, etc, without verifying.
150         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
151                              servermap, MODE_CHECK, add_lease=add_lease)
152         if self._history:
153hunk ./src/allmydata/mutable/checker.py 55
154         if num_recoverable:
155             self.best_version = servermap.best_recoverable_version()
156 
157+        # The file is unhealthy and needs to be repaired if:
158+        # - There are unrecoverable versions.
159         if servermap.unrecoverable_versions():
160             self.need_repair = True
161hunk ./src/allmydata/mutable/checker.py 59
162+        # - There isn't a recoverable version.
163         if num_recoverable != 1:
164             self.need_repair = True
165hunk ./src/allmydata/mutable/checker.py 62
166+        # - The best recoverable version is missing some shares.
167         if self.best_version:
168             available_shares = servermap.shares_available()
169             (num_distinct_shares, k, N) = available_shares[self.best_version]
170hunk ./src/allmydata/mutable/checker.py 73
171 
172     def _verify_all_shares(self, servermap):
173         # read every byte of each share
174+        #
175+        # This logic is going to be very nearly the same as the
176+        # downloader. I bet we could pass the downloader a flag that
177+        # makes it do this, and piggyback onto that instead of
178+        # duplicating a bunch of code.
179+        #
180+        # Like:
181+        #  r = Retrieve(blah, blah, blah, verify=True)
182+        #  d = r.download()
183+        #  (wait, wait, wait, d.callback)
184+        # 
185+        #  Then, when it has finished, we can check the servermap (which
186+        #  we provided to Retrieve) to figure out which shares are bad,
187+        #  since the Retrieve process will have updated the servermap as
188+        #  it went along.
189+        #
190+        #  By passing the verify=True flag to the constructor, we are
191+        #  telling the downloader a few things.
192+        #
193+        #  1. It needs to download all N shares, not just K shares.
194+        #  2. It doesn't need to decrypt or decode the shares, only
195+        #     verify them.
196         if not self.best_version:
197             return
198hunk ./src/allmydata/mutable/checker.py 97
199-        versionmap = servermap.make_versionmap()
200-        shares = versionmap[self.best_version]
201-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
202-         offsets_tuple) = self.best_version
203-        offsets = dict(offsets_tuple)
204-        readv = [ (0, offsets["EOF"]) ]
205-        dl = []
206-        for (shnum, peerid, timestamp) in shares:
207-            ss = servermap.connections[peerid]
208-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
209-            d.addCallback(self._got_answer, peerid, servermap)
210-            dl.append(d)
211-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
212 
213hunk ./src/allmydata/mutable/checker.py 98
214-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
215-        # isolate the callRemote to a separate method, so tests can subclass
216-        # Publish and override it
217-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
218+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
219+        d = r.download()
220+        d.addCallback(self._process_bad_shares)
221         return d
222 
223hunk ./src/allmydata/mutable/checker.py 103
224-    def _got_answer(self, datavs, peerid, servermap):
225-        for shnum,datav in datavs.items():
226-            data = datav[0]
227-            try:
228-                self._got_results_one_share(shnum, peerid, data)
229-            except CorruptShareError:
230-                f = failure.Failure()
231-                self.need_repair = True
232-                self.bad_shares.append( (peerid, shnum, f) )
233-                prefix = data[:SIGNED_PREFIX_LENGTH]
234-                servermap.mark_bad_share(peerid, shnum, prefix)
235-                ss = servermap.connections[peerid]
236-                self.notify_server_corruption(ss, shnum, str(f.value))
237-
238-    def check_prefix(self, peerid, shnum, data):
239-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
240-         offsets_tuple) = self.best_version
241-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
242-        if got_prefix != prefix:
243-            raise CorruptShareError(peerid, shnum,
244-                                    "prefix mismatch: share changed while we were reading it")
245-
246-    def _got_results_one_share(self, shnum, peerid, data):
247-        self.check_prefix(peerid, shnum, data)
248-
249-        # the [seqnum:signature] pieces are validated by _compare_prefix,
250-        # which checks their signature against the pubkey known to be
251-        # associated with this file.
252 
253hunk ./src/allmydata/mutable/checker.py 104
254-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
255-         share_hash_chain, block_hash_tree, share_data,
256-         enc_privkey) = unpack_share(data)
257-
258-        # validate [share_hash_chain,block_hash_tree,share_data]
259-
260-        leaves = [hashutil.block_hash(share_data)]
261-        t = hashtree.HashTree(leaves)
262-        if list(t) != block_hash_tree:
263-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
264-        share_hash_leaf = t[0]
265-        t2 = hashtree.IncompleteHashTree(N)
266-        # root_hash was checked by the signature
267-        t2.set_hashes({0: root_hash})
268-        try:
269-            t2.set_hashes(hashes=share_hash_chain,
270-                          leaves={shnum: share_hash_leaf})
271-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
272-                IndexError), e:
273-            msg = "corrupt hashes: %s" % (e,)
274-            raise CorruptShareError(peerid, shnum, msg)
275-
276-        # validate enc_privkey: only possible if we have a write-cap
277-        if not self._node.is_readonly():
278-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
279-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
280-            if alleged_writekey != self._node.get_writekey():
281-                raise CorruptShareError(peerid, shnum, "invalid privkey")
282+    def _process_bad_shares(self, bad_shares):
283+        if bad_shares:
284+            self.need_repair = True
285+        self.bad_shares = bad_shares
286 
287hunk ./src/allmydata/mutable/checker.py 109
288-    def notify_server_corruption(self, ss, shnum, reason):
289-        ss.callRemoteOnly("advise_corrupt_share",
290-                          "mutable", self._storage_index, shnum, reason)
291 
292     def _count_shares(self, smap, version):
293         available_shares = smap.shares_available()
294hunk ./src/allmydata/mutable/repairer.py 5
295 from zope.interface import implements
296 from twisted.internet import defer
297 from allmydata.interfaces import IRepairResults, ICheckResults
298+from allmydata.mutable.publish import MutableData
299 
300 class RepairResults:
301     implements(IRepairResults)
302hunk ./src/allmydata/mutable/repairer.py 108
303             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
304 
305         d = self.node.download_version(smap, best_version, fetch_privkey=True)
306+        d.addCallback(lambda data:
307+            MutableData(data))
308         d.addCallback(self.node.upload, smap)
309         d.addCallback(self.get_results, smap)
310         return d
311}
312[interfaces.py: Add #993 interfaces
313Kevan Carstensen <kevan@isnotajoke.com>**20100809233244
314 Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce
315] {
316hunk ./src/allmydata/interfaces.py 495
317 class MustNotBeUnknownRWError(CapConstraintError):
318     """Cannot add an unknown child cap specified in a rw_uri field."""
319 
320+
321+class IReadable(Interface):
322+    """I represent a readable object -- either an immutable file, or a
323+    specific version of a mutable file.
324+    """
325+
326+    def is_readonly():
327+        """Return True if this reference provides mutable access to the given
328+        file or directory (i.e. if you can modify it), or False if not. Note
329+        that even if this reference is read-only, someone else may hold a
330+        read-write reference to it.
331+
332+        For an IReadable returned by get_best_readable_version(), this will
333+        always return True, but for instances of subinterfaces such as
334+        IMutableFileVersion, it may return False."""
335+
336+    def is_mutable():
337+        """Return True if this file or directory is mutable (by *somebody*,
338+    not necessarily you), False if it is immutable. Note that a file
339+        might be mutable overall, but your reference to it might be
340+        read-only. On the other hand, all references to an immutable file
341+        will be read-only; there are no read-write references to an immutable
342+        file."""
343+
344+    def get_storage_index():
345+        """Return the storage index of the file."""
346+
347+    def get_size():
348+        """Return the length (in bytes) of this readable object."""
349+
350+    def download_to_data():
351+        """Download all of the file contents. I return a Deferred that fires
352+        with the contents as a byte string."""
353+
354+    def read(consumer, offset=0, size=None):
355+        """Download a portion (possibly all) of the file's contents, making
356+        them available to the given IConsumer. Return a Deferred that fires
357+        (with the consumer) when the consumer is unregistered (either because
358+        the last byte has been given to it, or because the consumer threw an
359+        exception during write(), possibly because it no longer wants to
360+        receive data). The portion downloaded will start at 'offset' and
361+        contain 'size' bytes (or the remainder of the file if size==None).
362+
363+        The consumer will be used in non-streaming mode: an IPullProducer
364+        will be attached to it.
365+
366+        The consumer will not receive data right away: several network trips
367+        must occur first. The order of events will be::
368+
369+         consumer.registerProducer(p, streaming)
370+          (if streaming == False)::
371+           consumer does p.resumeProducing()
372+            consumer.write(data)
373+           consumer does p.resumeProducing()
374+            consumer.write(data).. (repeat until all data is written)
375+         consumer.unregisterProducer()
376+         deferred.callback(consumer)
377+
378+        If a download error occurs, or an exception is raised by
379+        consumer.registerProducer() or consumer.write(), I will call
380+        consumer.unregisterProducer() and then deliver the exception via
381+        deferred.errback(). To cancel the download, the consumer should call
382+        p.stopProducing(), which will result in an exception being delivered
383+        via deferred.errback().
384+
385+        See src/allmydata/util/consumer.py for an example of a simple
386+        download-to-memory consumer.
387+        """
388+
389+
390+class IWritable(Interface):
391+    """
392+    I define methods that callers can use to update SDMF and MDMF
393+    mutable files on a Tahoe-LAFS grid.
394+    """
395+    # XXX: For the moment, we have only this. It is possible that we
396+    #      want to move overwrite() and modify() in here too.
397+    def update(data, offset):
398+        """
399+        I write the data from my data argument to the MDMF file,
400+        starting at offset. I continue writing data until my data
401+        argument is exhausted, appending data to the file as necessary.
402+        """
403+        # assert IMutableUploadable.providedBy(data)
404+        # to append data: offset=node.get_size_of_best_version()
405+        # do we want to support compacting MDMF?
406+        # for an MDMF file, this can be done with O(data.get_size())
407+        # memory. For an SDMF file, any modification takes
408+        # O(node.get_size_of_best_version()).
409+
410+
411+class IMutableFileVersion(IReadable):
412+    """I provide access to a particular version of a mutable file. The
413+    access is read/write if I was obtained from a filenode derived from
414+    a write cap, or read-only if the filenode was derived from a read cap.
415+    """
416+
417+    def get_sequence_number():
418+        """Return the sequence number of this version."""
419+
420+    def get_servermap():
421+        """Return the IMutableFileServerMap instance that was used to create
422+        this object.
423+        """
424+
425+    def get_writekey():
426+        """Return this filenode's writekey, or None if the node does not have
427+        write-capability. This may be used to assist with data structures
428+        that need to make certain data available only to writers, such as the
429+        read-write child caps in dirnodes. The recommended process is to have
430+        reader-visible data be submitted to the filenode in the clear (where
431+        it will be encrypted by the filenode using the readkey), but encrypt
432+        writer-visible data using this writekey.
433+        """
434+
435+    # TODO: Can this be overwrite instead of replace?
436+    def replace(new_contents):
437+        """Replace the contents of the mutable file, provided that no other
438+        node has published (or is attempting to publish, concurrently) a
439+        newer version of the file than this one.
440+
441+        I will avoid modifying any share that is different than the version
442+        given by get_sequence_number(). However, if another node is writing
443+        to the file at the same time as me, I may manage to update some shares
444+        while they update others. If I see any evidence of this, I will signal
445+        UncoordinatedWriteError, and the file will be left in an inconsistent
446+        state (possibly the version you provided, possibly the old version,
447+        possibly somebody else's version, and possibly a mix of shares from
448+        all of these).
449+
450+        The recommended response to UncoordinatedWriteError is to either
451+        return it to the caller (since they failed to coordinate their
452+        writes), or to attempt some sort of recovery. It may be sufficient to
453+        wait a random interval (with exponential backoff) and repeat your
454+        operation. If I do not signal UncoordinatedWriteError, then I was
455+        able to write the new version without incident.
456+
457+        I return a Deferred that fires (with a PublishStatus object) when the
458+        update has completed.
459+        """
460+
461+    def modify(modifier_cb):
462+        """Modify the contents of the file, by downloading this version,
463+        applying the modifier function (or bound method), then uploading
464+        the new version. This will succeed as long as no other node
465+        publishes a version between the download and the upload.
466+        I return a Deferred that fires (with a PublishStatus object) when
467+        the update is complete.
468+
469+        The modifier callable will be given three arguments: a string (with
470+        the old contents), a 'first_time' boolean, and a servermap. As with
471+        download_to_data(), the old contents will be from this version,
472+        but the modifier can use the servermap to make other decisions
473+        (such as refusing to apply the delta if there are multiple parallel
474+        versions, or if there is evidence of a newer unrecoverable version).
475+        'first_time' will be True the first time the modifier is called,
476+        and False on any subsequent calls.
477+
478+        The callable should return a string with the new contents. The
479+        callable must be prepared to be called multiple times, and must
480+        examine the input string to see if the change that it wants to make
481+        is already present in the old version. If it does not need to make
482+        any changes, it can either return None, or return its input string.
483+
484+        If the modifier raises an exception, it will be returned in the
485+        errback.
486+        """
487+
488+
489 # The hierarchy looks like this:
490 #  IFilesystemNode
491 #   IFileNode
492hunk ./src/allmydata/interfaces.py 754
493     def raise_error():
494         """Raise any error associated with this node."""
495 
496+    # XXX: These may not be appropriate outside the context of an IReadable.
497     def get_size():
498         """Return the length (in bytes) of the data this node represents. For
499         directory nodes, I return the size of the backing store. I return
500hunk ./src/allmydata/interfaces.py 771
501 class IFileNode(IFilesystemNode):
502     """I am a node which represents a file: a sequence of bytes. I am not a
503     container, like IDirectoryNode."""
504+    def get_best_readable_version():
505+        """Return a Deferred that fires with an IReadable for the 'best'
506+        available version of the file. The IReadable provides only read
507+        access, even if this filenode was derived from a write cap.
508 
509hunk ./src/allmydata/interfaces.py 776
510-class IImmutableFileNode(IFileNode):
511-    def read(consumer, offset=0, size=None):
512-        """Download a portion (possibly all) of the file's contents, making
513-        them available to the given IConsumer. Return a Deferred that fires
514-        (with the consumer) when the consumer is unregistered (either because
515-        the last byte has been given to it, or because the consumer threw an
516-        exception during write(), possibly because it no longer wants to
517-        receive data). The portion downloaded will start at 'offset' and
518-        contain 'size' bytes (or the remainder of the file if size==None).
519-
520-        The consumer will be used in non-streaming mode: an IPullProducer
521-        will be attached to it.
522+        For an immutable file, there is only one version. For a mutable
523+        file, the 'best' version is the recoverable version with the
524+        highest sequence number. If no uncoordinated writes have occurred,
525+        and if enough shares are available, then this will be the most
526+        recent version that has been uploaded. If no version is recoverable,
527+        the Deferred will errback with an UnrecoverableFileError.
528+        """
529 
530hunk ./src/allmydata/interfaces.py 784
531-        The consumer will not receive data right away: several network trips
532-        must occur first. The order of events will be::
533+    def download_best_version():
534+        """Download the contents of the version that would be returned
535+        by get_best_readable_version(). This is equivalent to calling
536+        download_to_data() on the IReadable given by that method.
537 
538hunk ./src/allmydata/interfaces.py 789
539-         consumer.registerProducer(p, streaming)
540-          (if streaming == False)::
541-           consumer does p.resumeProducing()
542-            consumer.write(data)
543-           consumer does p.resumeProducing()
544-            consumer.write(data).. (repeat until all data is written)
545-         consumer.unregisterProducer()
546-         deferred.callback(consumer)
547+        I return a Deferred that fires with a byte string when the file
548+        has been fully downloaded. To support streaming download, use
549+        the 'read' method of IReadable. If no version is recoverable,
550+        the Deferred will errback with an UnrecoverableFileError.
551+        """
552 
553hunk ./src/allmydata/interfaces.py 795
554-        If a download error occurs, or an exception is raised by
555-        consumer.registerProducer() or consumer.write(), I will call
556-        consumer.unregisterProducer() and then deliver the exception via
557-        deferred.errback(). To cancel the download, the consumer should call
558-        p.stopProducing(), which will result in an exception being delivered
559-        via deferred.errback().
560+    def get_size_of_best_version():
561+        """Find the size of the version that would be returned by
562+        get_best_readable_version().
563 
564hunk ./src/allmydata/interfaces.py 799
565-        See src/allmydata/util/consumer.py for an example of a simple
566-        download-to-memory consumer.
567+        I return a Deferred that fires with an integer. If no version
568+        is recoverable, the Deferred will errback with an
569+        UnrecoverableFileError.
570         """
571 
572hunk ./src/allmydata/interfaces.py 804
573+
574+class IImmutableFileNode(IFileNode, IReadable):
575+    """I am a node representing an immutable file. Immutable files have
576+    only one version."""
577+
578+
579 class IMutableFileNode(IFileNode):
580     """I provide access to a 'mutable file', which retains its identity
581     regardless of what contents are put in it.
582hunk ./src/allmydata/interfaces.py 869
583     only be retrieved and updated all-at-once, as a single big string. Future
584     versions of our mutable files will remove this restriction.
585     """
586-
587-    def download_best_version():
588-        """Download the 'best' available version of the file, meaning one of
589-        the recoverable versions with the highest sequence number. If no
590+    def get_best_mutable_version():
591+        """Return a Deferred that fires with an IMutableFileVersion for
592+        the 'best' available version of the file. The best version is
593+        the recoverable version with the highest sequence number. If no
594         uncoordinated writes have occurred, and if enough shares are
595hunk ./src/allmydata/interfaces.py 874
596-        available, then this will be the most recent version that has been
597-        uploaded.
598+        available, then this will be the most recent version that has
599+        been uploaded.
600 
601hunk ./src/allmydata/interfaces.py 877
602-        I update an internal servermap with MODE_READ, determine which
603-        version of the file is indicated by
604-        servermap.best_recoverable_version(), and return a Deferred that
605-        fires with its contents. If no version is recoverable, the Deferred
606-        will errback with UnrecoverableFileError.
607-        """
608-
609-    def get_size_of_best_version():
610-        """Find the size of the version that would be downloaded with
611-        download_best_version(), without actually downloading the whole file.
612-
613-        I return a Deferred that fires with an integer.
614+        If no version is recoverable, the Deferred will errback with an
615+        UnrecoverableFileError.
616         """
617 
618     def overwrite(new_contents):
619hunk ./src/allmydata/interfaces.py 917
620         errback.
621         """
622 
623-
624     def get_servermap(mode):
625         """Return a Deferred that fires with an IMutableFileServerMap
626         instance, updated using the given mode.
627hunk ./src/allmydata/interfaces.py 970
628         writer-visible data using this writekey.
629         """
630 
631+    def set_version(version):
632+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
633+        we upload in SDMF for reasons of compatibility. If you want to
634+        change this, set_version will let you do that.
635+
636+        To say that this file should be uploaded in SDMF, pass in a 0. To
637+        say that the file should be uploaded as MDMF, pass in a 1.
638+        """
639+
640+    def get_version():
641+        """Returns the mutable file protocol version."""
642+
643 class NotEnoughSharesError(Exception):
644     """Download was unable to get enough shares"""
645 
646hunk ./src/allmydata/interfaces.py 1786
647         """The upload is finished, and whatever filehandle was in use may be
648         closed."""
649 
650+
651+class IMutableUploadable(Interface):
652+    """
653+    I represent content that is due to be uploaded to a mutable filecap.
654+    """
655+    # This is somewhat simpler than the IUploadable interface above
656+    # because mutable files do not need to be concerned with possibly
657+    # generating a CHK, nor with per-file keys. It is a subset of the
658+    # methods in IUploadable, though, so we could just as well implement
659+    # the mutable uploadables as IUploadables that don't happen to use
660+    # those methods (with the understanding that the unused methods will
661+    # never be called on such objects)
662+    def get_size():
663+        """
664+        Returns a Deferred that fires with the size of the content held
665+        by the uploadable.
666+        """
667+
668+    def read(length):
669+        """
670+        Returns a list of strings which, when concatenated, are the next
671+        length bytes of the file, or fewer if there are fewer bytes
672+        between the current location and the end of the file.
673+        """
674+
675+    def close():
676+        """
677+        The process that used the Uploadable is finished using it, so
678+        the uploadable may be closed.
679+        """
680+
681 class IUploadResults(Interface):
682     """I am returned by upload() methods. I contain a number of public
683     attributes which can be read to determine the results of the upload. Some
684}
685[frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
686Kevan Carstensen <kevan@isnotajoke.com>**20100809233535
687 Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f
688] {
689hunk ./src/allmydata/frontends/sftpd.py 33
690 from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \
691      NoSuchChildError, ChildOfWrongTypeError
692 from allmydata.mutable.common import NotWriteableError
693+from allmydata.mutable.publish import MutableFileHandle
694 from allmydata.immutable.upload import FileHandle
695 from allmydata.dirnode import update_metadata
696 from allmydata.util.fileutil import EncryptedTemporaryFile
697hunk ./src/allmydata/frontends/sftpd.py 664
698         else:
699             assert IFileNode.providedBy(filenode), filenode
700 
701-            if filenode.is_mutable():
702-                self.async.addCallback(lambda ign: filenode.download_best_version())
703-                def _downloaded(data):
704-                    self.consumer = OverwriteableFileConsumer(len(data), tempfile_maker)
705-                    self.consumer.write(data)
706-                    self.consumer.finish()
707-                    return None
708-                self.async.addCallback(_downloaded)
709-            else:
710-                download_size = filenode.get_size()
711-                assert download_size is not None, "download_size is None"
712+            self.async.addCallback(lambda ignored: filenode.get_best_readable_version())
713+
714+            def _read(version):
715+                if noisy: self.log("_read", level=NOISY)
716+                download_size = version.get_size()
717+                assert download_size is not None
718+
719                 self.consumer = OverwriteableFileConsumer(download_size, tempfile_maker)
720hunk ./src/allmydata/frontends/sftpd.py 672
721-                def _read(ign):
722-                    if noisy: self.log("_read immutable", level=NOISY)
723-                    filenode.read(self.consumer, 0, None)
724-                self.async.addCallback(_read)
725+
726+                version.read(self.consumer, 0, None)
727+            self.async.addCallback(_read)
728 
729         eventually(self.async.callback, None)
730 
731hunk ./src/allmydata/frontends/sftpd.py 818
732                     assert parent and childname, (parent, childname, self.metadata)
733                     d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata))
734 
735-                d2.addCallback(lambda ign: self.consumer.get_current_size())
736-                d2.addCallback(lambda size: self.consumer.read(0, size))
737-                d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents))
738+                d2.addCallback(lambda ign: self.filenode.overwrite(MutableFileHandle(self.consumer.get_file())))
739             else:
740                 def _add_file(ign):
741                     self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL)
742}
743[mutable/layout.py and interfaces.py: add MDMF writer and reader
744Kevan Carstensen <kevan@isnotajoke.com>**20100809234004
745 Ignore-this: 90db36ee3318dbbd4397baebc6014f86
746 
747 The MDMF writer is responsible for keeping state as plaintext is
748 gradually processed into share data by the upload process. When the
749 upload finishes, it will write all of its share data to a remote server,
750 reporting its status back to the publisher.
751 
752 The MDMF reader is responsible for abstracting an MDMF file as it sits
753 on the grid from the downloader; specifically, by receiving and
754 responding to requests for arbitrary data within the MDMF file.
755 
756 The interfaces.py file has also been modified to contain an interface
757 for the writer.
758] {
759hunk ./src/allmydata/interfaces.py 7
760      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
761 
762 HASH_SIZE=32
763+SALT_SIZE=16
764+
765+SDMF_VERSION=0
766+MDMF_VERSION=1
767 
768 Hash = StringConstraint(maxLength=HASH_SIZE,
769                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
770hunk ./src/allmydata/interfaces.py 420
771         """
772 
773 
774+class IMutableSlotWriter(Interface):
775+    """
776+    The interface for a writer around a mutable slot on a remote server.
777+    """
778+    def set_checkstring(checkstring, *args):
779+        """
780+        Set the checkstring that I will pass to the remote server when
781+        writing.
782+
783+            @param checkstring A packed checkstring to use.
784+
785+        Note that implementations can differ in which semantics they
786+        wish to support for set_checkstring -- they can, for example,
787+        build the checkstring themselves from its constituents, or
788+        some other thing.
789+        """
790+
791+    def get_checkstring():
792+        """
793+        Get the checkstring that I think currently exists on the remote
794+        server.
795+        """
796+
797+    def put_block(data, segnum, salt):
798+        """
799+        Add a block and salt to the share.
800+        """
801+
802+    def put_encprivkey(encprivkey):
803+        """
804+        Add the encrypted private key to the share.
805+        """
806+
807+    def put_blockhashes(blockhashes=list):
808+        """
809+        Add the block hash tree to the share.
810+        """
811+
812+    def put_sharehashes(sharehashes=dict):
813+        """
814+        Add the share hash chain to the share.
815+        """
816+
817+    def get_signable():
818+        """
819+        Return the part of the share that needs to be signed.
820+        """
821+
822+    def put_signature(signature):
823+        """
824+        Add the signature to the share.
825+        """
826+
827+    def put_verification_key(verification_key):
828+        """
829+        Add the verification key to the share.
830+        """
831+
832+    def finish_publishing():
833+        """
834+        Do anything necessary to finish writing the share to a remote
835+        server. I require that no further publishing needs to take place
836+        after this method has been called.
837+        """
838+
839+
840 class IURI(Interface):
841     def init_from_string(uri):
842         """Accept a string (as created by my to_string() method) and populate
843hunk ./src/allmydata/mutable/layout.py 4
844 
845 import struct
846 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
847+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
848+                                 MDMF_VERSION, IMutableSlotWriter
849+from allmydata.util import mathutil, observer
850+from twisted.python import failure
851+from twisted.internet import defer
852+from zope.interface import implements
853+
854+
855+# These strings describe the format of the packed structs they help process
856+# Here's what they mean:
857+#
858+#  PREFIX:
859+#    >: Big-endian byte order; the most significant byte is first (leftmost).
860+#    B: The version information; an 8 bit version identifier. Stored as
861+#       an unsigned char. This is currently 00 00 00 00; our modifications
862+#       will turn it into 00 00 00 01.
863+#    Q: The sequence number; this is sort of like a revision history for
864+#       mutable files; they start at 1 and increase as they are changed after
865+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
866+#       length.
867+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
868+#       characters = 32 bytes to store the value.
869+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
870+#       16 characters.
871+#
872+#  SIGNED_PREFIX additions, things that are covered by the signature:
873+#    B: The "k" encoding parameter. We store this as an 8-bit character,
874+#       which is convenient because our erasure coding scheme cannot
875+#       encode if you ask for more than 255 pieces.
876+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
877+#       same reasons as above.
878+#    Q: The segment size of the uploaded file. This will essentially be the
879+#       length of the file in SDMF. An unsigned long long, so we can store
880+#       files of quite large size.
881+#    Q: The data length of the uploaded file. Modulo padding, this will be
882+#       the same as the segment size field. Like the segment size field, it
883+#       is an unsigned long long and can be quite large.
884+#
885+#   HEADER additions:
886+#     L: The offset of the signature. An unsigned long.
887+#     L: The offset of the share hash chain. An unsigned long.
888+#     L: The offset of the block hash tree. An unsigned long.
889+#     L: The offset of the share data. An unsigned long.
890+#     Q: The offset of the encrypted private key. An unsigned long long, to
891+#        account for the possibility of a lot of share data.
892+#     Q: The offset of the EOF. An unsigned long long, to account for the
893+#        possibility of a lot of share data.
894+#
895+#  After all of these, we have the following:
896+#    - The verification key: Occupies the space between the end of the header
897+#      and the start of the signature (i.e. data[HEADER_LENGTH:o['signature']]).
898+#    - The signature, which goes from the signature offset to the share hash
899+#      chain offset.
900+#    - The share hash chain, which goes from the share hash chain offset to
901+#      the block hash tree offset.
902+#    - The share data, which goes from the share data offset to the encrypted
903+#      private key offset.
904+#    - The encrypted private key, which goes from its offset until the end of the file.
905+#
906+#  The block hash tree in this encoding has only one leaf, so the offset of
907+#  the share data will be 32 bytes more than the offset of the block hash tree.
908+#  Given this, we may need to check to see how many bytes a reasonably sized
909+#  block hash tree will take up.
910 
911 PREFIX = ">BQ32s16s" # each version has a different prefix
912 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
913hunk ./src/allmydata/mutable/layout.py 73
914 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
915 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
916 HEADER_LENGTH = struct.calcsize(HEADER)
917+OFFSETS = ">LLLLQQ"
918+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
919 
920 def unpack_header(data):
921     o = {}
922hunk ./src/allmydata/mutable/layout.py 194
923     return (share_hash_chain, block_hash_tree, share_data)
924 
925 
926-def pack_checkstring(seqnum, root_hash, IV):
927+def pack_checkstring(seqnum, root_hash, IV, version=0):
928     return struct.pack(PREFIX,
929hunk ./src/allmydata/mutable/layout.py 196
930-                       0, # version,
931+                       version,
932                        seqnum,
933                        root_hash,
934                        IV)
935hunk ./src/allmydata/mutable/layout.py 269
936                            encprivkey])
937     return final_share
938 
939+def pack_prefix(seqnum, root_hash, IV,
940+                required_shares, total_shares,
941+                segment_size, data_length):
942+    prefix = struct.pack(SIGNED_PREFIX,
943+                         0, # version,
944+                         seqnum,
945+                         root_hash,
946+                         IV,
947+                         required_shares,
948+                         total_shares,
949+                         segment_size,
950+                         data_length,
951+                         )
952+    return prefix
953+
954+
955+class SDMFSlotWriteProxy:
956+    implements(IMutableSlotWriter)
957+    """
958+    I represent a remote write slot for an SDMF mutable file. I build a
959+    share in memory, and then write it in one piece to the remote
960+    server. This mimics how SDMF shares were built before MDMF (and the
961+    new MDMF uploader), but provides that functionality in a way that
962+    allows the MDMF uploader to be built without much special-casing for
963+    file format, which makes the uploader code more readable.
964+    """
965+    def __init__(self,
966+                 shnum,
967+                 rref, # a remote reference to a storage server
968+                 storage_index,
969+                 secrets, # (write_enabler, renew_secret, cancel_secret)
970+                 seqnum, # the sequence number of the mutable file
971+                 required_shares,
972+                 total_shares,
973+                 segment_size,
974+                 data_length): # the length of the original file
975+        self.shnum = shnum
976+        self._rref = rref
977+        self._storage_index = storage_index
978+        self._secrets = secrets
979+        self._seqnum = seqnum
980+        self._required_shares = required_shares
981+        self._total_shares = total_shares
982+        self._segment_size = segment_size
983+        self._data_length = data_length
984+
985+        # This is an SDMF file, so it should have only one segment, so,
986+        # modulo padding of the data length, the segment size and the
987+        # data length should be the same.
988+        expected_segment_size = mathutil.next_multiple(data_length,
989+                                                       self._required_shares)
990+        assert expected_segment_size == segment_size
991+
992+        self._block_size = self._segment_size / self._required_shares
993+
994+        # This is meant to mimic how SDMF files were built before MDMF
995+        # entered the picture: we generate each share in its entirety,
996+        # then push it off to the storage server in one write. When
997+        # callers call set_*, they are just populating this dict.
998+        # finish_publishing will stitch these pieces together into a
999+        # coherent share, and then write the coherent share to the
1000+        # storage server.
1001+        self._share_pieces = {}
1002+
1003+        # This tells the write logic what checkstring to use when
1004+        # writing remote shares.
1005+        self._testvs = []
1006+
1007+        self._readvs = [(0, struct.calcsize(PREFIX))]
1008+
1009+
1010+    def set_checkstring(self, checkstring_or_seqnum,
1011+                              root_hash=None,
1012+                              salt=None):
1013+        """
1014+        Set the checkstring that I will pass to the remote server when
1015+        writing.
1016+
1017+            @param checkstring_or_seqnum: A packed checkstring to use,
1018+                   or a sequence number; if root_hash and salt are given, I will build the checkstring from those pieces.
1019+
1020+        Note that implementations can differ in which semantics they
1021+        wish to support for set_checkstring -- they can, for example,
1022+        build the checkstring themselves from its constituents, or
1023+        some other thing.
1024+        """
1025+        if root_hash and salt:
1026+            checkstring = struct.pack(PREFIX,
1027+                                      0,
1028+                                      checkstring_or_seqnum,
1029+                                      root_hash,
1030+                                      salt)
1031+        else:
1032+            checkstring = checkstring_or_seqnum
1033+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
1034+
1035+
1036+    def get_checkstring(self):
1037+        """
1038+        Get the checkstring that I think currently exists on the remote
1039+        server.
1040+        """
1041+        if self._testvs:
1042+            return self._testvs[0][3]
1043+        return ""
1044+
1045+
1046+    def put_block(self, data, segnum, salt):
1047+        """
1048+        Add a block and salt to the share.
1049+        """
1050+        # SDMF files have only one segment
1051+        assert segnum == 0
1052+        assert len(data) == self._block_size
1053+        assert len(salt) == SALT_SIZE
1054+
1055+        self._share_pieces['sharedata'] = data
1056+        self._share_pieces['salt'] = salt
1057+
1058+        # TODO: Figure out something intelligent to return.
1059+        return defer.succeed(None)
1060+
1061+
1062+    def put_encprivkey(self, encprivkey):
1063+        """
1064+        Add the encrypted private key to the share.
1065+        """
1066+        self._share_pieces['encprivkey'] = encprivkey
1067+
1068+        return defer.succeed(None)
1069+
1070+
1071+    def put_blockhashes(self, blockhashes):
1072+        """
1073+        Add the block hash tree to the share.
1074+        """
1075+        assert isinstance(blockhashes, list)
1076+        for h in blockhashes:
1077+            assert len(h) == HASH_SIZE
1078+
1079+        # serialize the blockhashes, then set them.
1080+        blockhashes_s = "".join(blockhashes)
1081+        self._share_pieces['block_hash_tree'] = blockhashes_s
1082+
1083+        return defer.succeed(None)
1084+
1085+
1086+    def put_sharehashes(self, sharehashes):
1087+        """
1088+        Add the share hash chain to the share.
1089+        """
1090+        assert isinstance(sharehashes, dict)
1091+        for h in sharehashes.itervalues():
1092+            assert len(h) == HASH_SIZE
1093+
1094+        # serialize the sharehashes, then set them.
1095+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
1096+                                 for i in sorted(sharehashes.keys())])
1097+        self._share_pieces['share_hash_chain'] = sharehashes_s
1098+
1099+        return defer.succeed(None)
1100+
1101+
1102+    def put_root_hash(self, root_hash):
1103+        """
1104+        Add the root hash to the share.
1105+        """
1106+        assert len(root_hash) == HASH_SIZE
1107+
1108+        self._share_pieces['root_hash'] = root_hash
1109+
1110+        return defer.succeed(None)
1111+
1112+
1113+    def put_salt(self, salt):
1114+        """
1115+        Add a salt to an empty SDMF file.
1116+        """
1117+        assert len(salt) == SALT_SIZE
1118+
1119+        self._share_pieces['salt'] = salt
1120+        self._share_pieces['sharedata'] = ""
1121+
1122+
1123+    def get_signable(self):
1124+        """
1125+        Return the part of the share that needs to be signed.
1126+
1127+        SDMF writers need to sign the packed representation of the
1128+        first eight fields of the remote share, that is:
1129+            - version number (0)
1130+            - sequence number
1131+            - root of the share hash tree
1132+            - salt
1133+            - k
1134+            - n
1135+            - segsize
1136+            - datalen
1137+
1138+        This method is responsible for returning that to callers.
1139+        """
1140+        return struct.pack(SIGNED_PREFIX,
1141+                           0,
1142+                           self._seqnum,
1143+                           self._share_pieces['root_hash'],
1144+                           self._share_pieces['salt'],
1145+                           self._required_shares,
1146+                           self._total_shares,
1147+                           self._segment_size,
1148+                           self._data_length)
1149+
1150+
1151+    def put_signature(self, signature):
1152+        """
1153+        Add the signature to the share.
1154+        """
1155+        self._share_pieces['signature'] = signature
1156+
1157+        return defer.succeed(None)
1158+
1159+
1160+    def put_verification_key(self, verification_key):
1161+        """
1162+        Add the verification key to the share.
1163+        """
1164+        self._share_pieces['verification_key'] = verification_key
1165+
1166+        return defer.succeed(None)
1167+
1168+
1169+    def get_verinfo(self):
1170+        """
1171+        I return my verinfo tuple. This is used by the ServermapUpdater
1172+        to keep track of versions of mutable files.
1173+
1174+        The verinfo tuple for MDMF files contains:
1175+            - seqnum
1176+            - root hash
1177+            - a blank (nothing)
1178+            - segsize
1179+            - datalen
1180+            - k
1181+            - n
1182+            - prefix (the thing that you sign)
1183+            - a tuple of offsets
1184+
1185+        We include the nonce in MDMF to simplify processing of version
1186+        information tuples.
1187+
1188+        The verinfo tuple for SDMF files is the same, but contains a
1189+        16-byte IV instead of a hash of salts.
1190+        """
1191+        return (self._seqnum,
1192+                self._share_pieces['root_hash'],
1193+                self._share_pieces['salt'],
1194+                self._segment_size,
1195+                self._data_length,
1196+                self._required_shares,
1197+                self._total_shares,
1198+                self.get_signable(),
1199+                self._get_offsets_tuple())
1200+
1201+    def _get_offsets_dict(self):
1202+        post_offset = HEADER_LENGTH
1203+        offsets = {}
1204+
1205+        verification_key_length = len(self._share_pieces['verification_key'])
1206+        o1 = offsets['signature'] = post_offset + verification_key_length
1207+
1208+        signature_length = len(self._share_pieces['signature'])
1209+        o2 = offsets['share_hash_chain'] = o1 + signature_length
1210+
1211+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
1212+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
1213+
1214+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
1215+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
1216+
1217+        share_data_length = len(self._share_pieces['sharedata'])
1218+        o5 = offsets['enc_privkey'] = o4 + share_data_length
1219+
1220+        encprivkey_length = len(self._share_pieces['encprivkey'])
1221+        offsets['EOF'] = o5 + encprivkey_length
1222+        return offsets
1223+
1224+
1225+    def _get_offsets_tuple(self):
1226+        offsets = self._get_offsets_dict()
1227+        return tuple([(key, value) for key, value in offsets.items()])
1228+
1229+
1230+    def _pack_offsets(self):
1231+        offsets = self._get_offsets_dict()
1232+        return struct.pack(">LLLLQQ",
1233+                           offsets['signature'],
1234+                           offsets['share_hash_chain'],
1235+                           offsets['block_hash_tree'],
1236+                           offsets['share_data'],
1237+                           offsets['enc_privkey'],
1238+                           offsets['EOF'])
1239+
1240+
1241+    def finish_publishing(self):
1242+        """
1243+        Do anything necessary to finish writing the share to a remote
1244+        server. I require that no further publishing needs to take place
1245+        after this method has been called.
1246+        """
1247+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
1248+                  "share_hash_chain", "block_hash_tree"]:
1249+            assert k in self._share_pieces
1250+        # This is the only method that actually writes something to the
1251+        # remote server.
1252+        # First, we need to pack the share into data that we can write
1253+        # to the remote server in one write.
1254+        offsets = self._pack_offsets()
1255+        prefix = self.get_signable()
1256+        final_share = "".join([prefix,
1257+                               offsets,
1258+                               self._share_pieces['verification_key'],
1259+                               self._share_pieces['signature'],
1260+                               self._share_pieces['share_hash_chain'],
1261+                               self._share_pieces['block_hash_tree'],
1262+                               self._share_pieces['sharedata'],
1263+                               self._share_pieces['encprivkey']])
1264+
1265+        # Our only data vector is going to be writing the final share,
1266+        # in its entirety.
1267+        datavs = [(0, final_share)]
1268+
1269+        if not self._testvs:
1270+            # Our caller has not provided us with another checkstring
1271+            # yet, so we assume that we are writing a new share, and set
1272+            # a test vector that will allow a new share to be written.
1273+            self._testvs = []
1274+            self._testvs.append(tuple([0, 1, "eq", ""]))
1276+
1277+        tw_vectors = {}
1278+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
1279+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
1280+                                     self._storage_index,
1281+                                     self._secrets,
1282+                                     tw_vectors,
1283+                                     # TODO is it useful to read something?
1284+                                     self._readvs)
1285+
1286+
1287+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
1288+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
1289+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
1290+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
1291+MDMFCHECKSTRING = ">BQ32s"
1292+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
1293+MDMFOFFSETS = ">QQQQQQ"
1294+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
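
# For reference: struct.calcsize(MDMFHEADER) == 107 -- a 59-byte signable
# part (">BQ32sBBQQ") followed by a 48-byte offsets table (">QQQQQQ"). The
# checkstring (">BQ32s") is the first 41 bytes of the signable part.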
1295+
1296+class MDMFSlotWriteProxy:
1297+    """
1298+    I represent a remote write slot for an MDMF mutable file.
1299+
1300+    I abstract away from my caller the details of block and salt
1301+    management, and the implementation of the on-disk format for MDMF
1302+    shares.
1303+    """
1304+    implements(IMutableSlotWriter)
1305+
1306+    # Expected layout, MDMF:
1307+    # offset:     size:       name:
1308+    #-- signed part --
1309+    # 0           1           version number (01)
1310+    # 1           8           sequence number
1311+    # 9           32          share tree root hash
1312+    # 41          1           The "k" encoding parameter
1313+    # 42          1           The "N" encoding parameter
1314+    # 43          8           The segment size of the uploaded file
1315+    # 51          8           The data length of the original plaintext
1316+    #-- end signed part --
1317+    # 59          8           The offset of the encrypted private key
1318+    # 67          8           The offset of the block hash tree
1319+    # 75          8           The offset of the share hash chain
1320+    # 83          8           The offset of the signature
1321+    # 91          8           The offset of the verification key
1322+    # 99          8           The offset of the EOF
1323+    #
1324+    # followed by salts and share data, the encrypted private key, the
1325+    # block hash tree, the share hash chain, a signature over the first
1326+    # seven fields, and a verification key.
1327+    #
1328+    # The checkstring is the first three fields -- the version number,
1329+    # sequence number, and root hash. This is consistent in meaning
1330+    # with what we have for SDMF files, except that instead of using
1331+    # the literal salt, we use a value derived from all of the salts --
1332+    # the share hash tree root.
1333+    #
1334+    # The salt is stored before the block for each segment. The block
1335+    # hash tree is computed over the combination of block and salt for
1336+    # each segment. In this way, we get integrity checking for both
1337+    # block and salt with the current block hash tree arrangement.
1338+    #
1339+    # The ordering of the offsets is different to reflect the dependencies
1340+    # that we'll run into with an MDMF file. The expected write flow is
1341+    # something like this:
1342+    #
1343+    #   0: Initialize with the sequence number, encoding parameters and
1344+    #      data length. From this, we can deduce the number of segments,
1345+    #      and where they should go. We can also figure out where the
1346+    #      encrypted private key should go, because we can figure out how
1347+    #      big the share data will be.
1348+    #
1349+    #   1: Encrypt, encode, and upload the file in chunks. Do something
1350+    #      like
1351+    #
1352+    #       put_block(data, segnum, salt)
1353+    #
1354+    #      to write a block and a salt to the disk. We can do both of
1355+    #      these operations now because we have enough of the offsets to
1356+    #      know where to put them.
1357+    #
1358+    #   2: Put the encrypted private key. Use:
1359+    #
1360+    #        put_encprivkey(encprivkey)
1361+    #
1362+    #      Now that we know the length of the private key, we can fill
1363+    #      in the offset for the block hash tree.
1364+    #
1365+    #   3: We're now in a position to upload the block hash tree for
1366+    #      a share. Put that using something like:
1367+    #       
1368+    #        put_blockhashes(block_hash_tree)
1369+    #
1370+    #      Note that block_hash_tree is a list of hashes -- we'll take
1371+    #      care of the details of serializing that appropriately. When
1372+    #      we get the block hash tree, we are also in a position to
1373+    #      calculate the offset for the share hash chain, and fill that
1374+    #      into the offsets table.
1375+    #
1387+    #   4: We're now in a position to upload the share hash chain for
1388+    #      a share. Do that with something like:
1389+    #     
1390+    #        put_sharehashes(share_hash_chain)
1391+    #
1392+    #      share_hash_chain should be a dictionary mapping shnums to
1393+    #      32-byte hashes -- the wrapper handles serialization.
1394+    #      We'll know where to put the signature at this point, also.
1395+    #      The root of this tree will be put explicitly in the next
1396+    #      step.
1397+    #
1398+    #      TODO: Why? Why not just include it in the tree here?
1399+    #
1400+    #   5: Before putting the signature, we must first put the
1401+    #      root_hash. Do this with:
1402+    #
1403+    #        put_root_hash(root_hash).
1404+    #     
1405+    #      We could have placed this value at any point (its location
1406+    #      in the header is fixed), but semantically it belongs after
1407+    #      the share hashes, which determine it -- hence this ordering.
1409+    #
1410+    #   6: With the root hash put, we can now sign the header. Use:
1411+    #
1412+    #        get_signable()
1413+    #
1414+    #      to get the part of the header that you want to sign, and use:
1415+    #       
1416+    #        put_signature(signature)
1417+    #
1418+    #      to write your signature to the remote server.
1419+    #
1420+    #   7: Add the verification key, and finish. Do:
1421+    #
1422+    #        put_verification_key(key)
1423+    #
1424+    #      and
1425+    #
1426+    #        finish_publishing()
1427+    #
1428+    # Checkstring management:
1429+    #
1430+    # To write to a mutable slot, we have to provide test vectors to ensure
1431+    # that we are writing to the same data that we think we are. These
1432+    # vectors allow us to detect uncoordinated writes; that is, writes
1433+    # where both we and some other shareholder are writing to the
1434+    # mutable slot, and to report those back to the parts of the program
1435+    # doing the writing.
1436+    #
1437+    # With SDMF, this was easy -- all of the share data was written in
1438+    # one go, so it was easy to detect uncoordinated writes, and we only
1439+    # had to do it once. With MDMF, not all of the file is written at
1440+    # once.
1441+    #
1442+    # If a share is new, we write out as much of the header as we can
1443+    # before writing out anything else. This gives other writers a
1444+    # canary that they can use to detect uncoordinated writes, and, if
1445+    # they do the same thing, gives us the same canary. We then update
1446+    # the share. We won't be able to write out one field of the header
1447+    # -- the share tree root hash -- until we finish writing out the
1448+    # share. We only require the writer to provide the initial
1449+    # checkstring, and keep track of what it should be after updates
1450+    # ourselves.
1451+    #
1452+    # If we haven't written anything yet, then on the first write (which
1453+    # will probably be a block + salt of a share), we'll also write out
1454+    # the header. On subsequent passes, we'll expect to see the header.
1455+    # The header changes once more after that: when we write out the
1456+    # root of the share hash tree, since that value is part of it.
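    #
    # As a hedged illustration (names assumed, not part of the patch
    # itself), the write flow above amounts to a driver loop like:
    #
    #   w = MDMFSlotWriteProxy(shnum, rref, storage_index, secrets,
    #                          seqnum, k, n, segsize, datalen)
    #   for segnum, (block, salt) in enumerate(encoded_segments):
    #       w.put_block(block, segnum, salt)
    #   w.put_encprivkey(encprivkey)
    #   w.put_blockhashes(block_hash_tree)
    #   w.put_sharehashes(share_hash_chain)
    #   w.put_root_hash(root_hash)
    #   w.put_signature(sign(w.get_signable()))
    #   w.put_verification_key(verification_key)
    #   d = w.finish_publishing()
    #
    # where encoded_segments, sign, and the hash-tree values are assumed
    # to come from the publisher's encoding step.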
1463+    def __init__(self,
1464+                 shnum,
1465+                 rref, # a remote reference to a storage server
1466+                 storage_index,
1467+                 secrets, # (write_enabler, renew_secret, cancel_secret)
1468+                 seqnum, # the sequence number of the mutable file
1469+                 required_shares,
1470+                 total_shares,
1471+                 segment_size,
1472+                 data_length): # the length of the original file
1473+        self.shnum = shnum
1474+        self._rref = rref
1475+        self._storage_index = storage_index
1476+        self._seqnum = seqnum
1477+        self._required_shares = required_shares
1478+        assert self.shnum >= 0 and self.shnum < total_shares
1479+        self._total_shares = total_shares
1480+        # We build up the offset table as we write things. It is the
1481+        # last thing we write to the remote server.
1482+        self._offsets = {}
1483+        self._testvs = []
1484+        # This is a list of write vectors that will be sent to our
1485+        # remote server once we are directed to write things there.
1486+        self._writevs = []
1487+        self._secrets = secrets
1488+        # The segment size needs to be a multiple of the k parameter --
1489+        # any padding should have been carried out by the publisher
1490+        # already.
1491+        assert segment_size % required_shares == 0
1492+        self._segment_size = segment_size
1493+        self._data_length = data_length
1494+
1495+        # These are set later -- we define them here so that we can
1496+        # check for their existence easily
1497+
1498+        # This is the root of the share hash tree -- the Merkle tree
1499+        # over the roots of the block hash trees computed for shares in
1500+        # this upload.
1501+        self._root_hash = None
1502+
1503+        # We haven't yet written anything to the remote bucket. By
1504+        # setting this, we tell the _write method as much. The write
1505+        # method will then know that it also needs to add a write vector
1506+        # for the checkstring (or what we have of it) to the first write
1507+        # request. We'll then record that value for future use.  If
1508+        # we're expecting something to be there already, we need to call
1509+        # set_checkstring before we write anything to tell the first
1510+        # write about that.
1511+        self._written = False
1512+
1513+        # When writing data to the storage servers, we get a read vector
1514+        # for free. We'll read the checkstring, which will help us
1515+        # figure out what's gone wrong if a write fails.
1516+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
1517+
1518+        # We calculate the number of segments because it tells us
1519+        # where the share data ends and the encrypted private key
1520+        # begins, and because it gives us bounds checking for put_block.
1521+        self._num_segments = mathutil.div_ceil(self._data_length,
1522+                                               self._segment_size)
1523+        self._block_size = self._segment_size / self._required_shares
1524+        # We also calculate the tail block size, to help us with
1525+        # block constraints later.
1526+        tail_size = self._data_length % self._segment_size
1527+        if not tail_size:
1528+            self._tail_block_size = self._block_size
1529+        else:
1530+            self._tail_block_size = mathutil.next_multiple(tail_size,
1531+                                                           self._required_shares)
1532+            self._tail_block_size /= self._required_shares
1533+
1534+        # We already know where the share data starts: right after the
1535+        # end of the header (which is defined as the signable part plus
1536+        # the offsets). We can also calculate where the encrypted
1537+        # private key begins from what we now know.
1538+        self._actual_block_size = self._block_size + SALT_SIZE
1539+        data_size = self._actual_block_size * (self._num_segments - 1)
1540+        data_size += self._tail_block_size
1541+        data_size += SALT_SIZE
1542+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
1543+        self._offsets['enc_privkey'] += data_size
1544+        # We'll wait for the rest. Callers can now call my "put_block" and
1545+        # "set_checkstring" methods.
1546+
1547+
1548+    def set_checkstring(self,
1549+                        seqnum_or_checkstring,
1550+                        root_hash=None,
1551+                        salt=None):
1552+        """
1553+        Set the checkstring for the given shnum.
1554+
1555+        This can be invoked in one of two ways.
1556+
1557+        With one argument, I assume that you are giving me a literal
1558+        checkstring -- e.g., the output of get_checkstring. I will then
1559+        set that checkstring as it is. This form is used by unit tests.
1560+
1561+        With two arguments, I assume that you are giving me a sequence
1562+        number and root hash to make a checkstring from. In that case, I
1563+        will build a checkstring and set it for you. This form is used
1564+        by the publisher.
1565+
1566+        By default, I assume that I am writing new shares to the grid.
1567+        If you don't explicitly set your own checkstring, I will use
1568+        one that requires that the remote share not exist. If you are
1569+        updating a share in-place, you must call this method first;
1570+        otherwise, your writes will fail.
1571+        """
1572+        # You're allowed to overwrite checkstrings with this method;
1573+        # I assume that users know what they are doing when they call
1574+        # it.
1575+        if root_hash:
1576+            checkstring = struct.pack(MDMFCHECKSTRING,
1577+                                      1,
1578+                                      seqnum_or_checkstring,
1579+                                      root_hash)
1580+        else:
1581+            checkstring = seqnum_or_checkstring
1582+
1583+        if checkstring == "":
1584+            # An empty checkstring means that the slot is expected to
1585+            # be empty on the storage server. A zero-length test vector
1586+            # can't express that, so we leave self._testvs empty; _write
1587+            # will then install a test vector that matches an empty slot.
1588+            self._testvs = []
1589+        else:
1590+            self._testvs = []
1591+            self._testvs.append((0, len(checkstring), "eq", checkstring))
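
        # A hedged illustration of the two calling conventions (names
        # assumed):
        #
        #   w.set_checkstring(seqnum, root_hash)      # publisher form
        #   w.set_checkstring(w2.get_checkstring())   # literal form (tests)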
1592+
1593+
1594+    def __repr__(self):
1595+        return "MDMFSlotWriteProxy for share %d" % self.shnum
1596+
1597+
1598+    def get_checkstring(self):
1599+        """
1600+        I return a representation of what the checkstring for this
1601+        share on the server will look like.
1602+
1603+        I am mostly used for tests.
1604+        """
1605+        if self._root_hash:
1606+            roothash = self._root_hash
1607+        else:
1608+            roothash = "\x00" * 32
1609+        return struct.pack(MDMFCHECKSTRING,
1610+                           1,
1611+                           self._seqnum,
1612+                           roothash)
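
        # For reference, the packed checkstring is
        # struct.calcsize(MDMFCHECKSTRING) == 41 bytes: a version byte,
        # an 8-byte sequence number, and the 32-byte root hash (all
        # zeros for a share whose root hash isn't yet known).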
1613+
1614+
1615+    def put_block(self, data, segnum, salt):
1616+        """
1617+        I queue a write vector for the data, salt, and segment number
1618+        provided to me. I return None, as I do not actually cause
1619+        anything to be written yet.
1620+        """
1621+        if segnum >= self._num_segments:
1622+            raise LayoutInvalid("I won't overwrite the private key")
1623+        if len(salt) != SALT_SIZE:
1624+            raise LayoutInvalid("I was given a salt of size %d, but "
1625+                                "I wanted a salt of size %d")
1626+        if segnum + 1 == self._num_segments:
1627+            if len(data) != self._tail_block_size:
1628+                raise LayoutInvalid("I was given the wrong size block to write")
1629+        elif len(data) != self._block_size:
1630+            raise LayoutInvalid("I was given the wrong size block to write")
1631+
1632+        # We want to write at MDMFHEADERSIZE + segnum * (block size + salt size).
1633+
1634+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
1635+        data = salt + data
1636+
1637+        self._writevs.append(tuple([offset, data]))
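
        # A hedged worked example of the size checks above: with k = 3,
        # segsize = 6 and datalen = 8 there are two segments; non-tail
        # blocks must be segsize / k = 2 bytes, while the tail block
        # must be next_multiple(8 % 6, 3) / 3 = 1 byte.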
1638+
1639+
1640+    def put_encprivkey(self, encprivkey):
1641+        """
1642+        I queue a write vector for the encrypted private key provided to
1643+        me.
1644+        """
1645+        assert self._offsets
1646+        assert self._offsets['enc_privkey']
1647+        # You shouldn't re-write the encprivkey after the block hash
1648+        # tree is written, since that could cause the private key to run
1649+        # into the block hash tree. Before it writes the block hash
1650+        # tree, the block hash tree writing method writes the offset of
1651+        # the share hash chain. So that's a good indicator of whether or
1652+        # not the block hash tree has been written.
1653+        if "share_hash_chain" in self._offsets:
1654+            raise LayoutInvalid("You must write this before the block hash tree")
1655+
1656+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \
1657+            len(encprivkey)
1658+        self._writevs.append(tuple([self._offsets['enc_privkey'], encprivkey]))
1659+
1660+
1661+    def put_blockhashes(self, blockhashes):
1662+        """
1663+        I queue a write vector to put the block hash tree in blockhashes
1664+        onto the remote server.
1665+
1666+        The encrypted private key must be queued before the block hash
1667+        tree, since we need to know how large it is to know where the
1668+        block hash tree should go. The block hash tree must be put
1669+        before the share hash chain, since its size determines where
1670+        the share hash chain begins.
1671+        """
1672+        assert self._offsets
1673+        assert isinstance(blockhashes, list)
1674+        if "block_hash_tree" not in self._offsets:
1675+            raise LayoutInvalid("You must put the encrypted private key "
1676+                                "before you put the block hash tree")
1677+        # If written, the share hash chain causes the signature offset
1678+        # to be defined.
1679+        if "signature" in self._offsets:
1680+            raise LayoutInvalid("You must put the block hash tree before "
1681+                                "you put the share hash chain")
1682+        blockhashes_s = "".join(blockhashes)
1683+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
1684+
1685+        self._writevs.append(tuple([self._offsets['block_hash_tree'],
1686+                                  blockhashes_s]))
1687+
1688+
1689+    def put_sharehashes(self, sharehashes):
1690+        """
1691+        I queue a write vector to put the share hash chain in my
1692+        argument onto the remote server.
1693+
1694+        The block hash tree must be queued before the share hash chain,
1695+        since we need to know where the block hash tree ends before we
1696+        can know where the share hash chain starts. The share hash chain
1697+        must be put before the signature, since the length of the packed
1698+        share hash chain determines the offset of the signature. Also,
1699+        semantically, you must have the share hashes before you can
1700+        compute the root hash that goes into the signed prefix.
1701+        """
1702+        assert isinstance(sharehashes, dict)
1703+        if "share_hash_chain" not in self._offsets:
1704+            raise LayoutInvalid("You need to put the salt hash tree before "
1705+                                "you can put the share hash chain")
1706+        # The signature comes after the share hash chain. If the
1707+        # signature has already been written, we must not write another
1708+        # share hash chain. The signature writes the verification key
1709+        # offset when it gets sent to the remote server, so we look for
1710+        # that.
1711+        if "verification_key" in self._offsets:
1712+            raise LayoutInvalid("You must write the share hash chain "
1713+                                "before you write the signature")
1714+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
1715+                                  for i in sorted(sharehashes.keys())])
1716+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
1717+        self._writevs.append(tuple([self._offsets['share_hash_chain'],
1718+                            sharehashes_s]))
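
        # For illustration, {1: h1, 5: h5} (with 32-byte hashes h1 and
        # h5) serializes to two 34-byte ">H32s" records, packed in
        # ascending index order.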
1719+
1720+
1721+    def put_root_hash(self, roothash):
1722+        """
1723+        Put the root hash (the root of the share hash tree) in the
1724+        remote slot.
1725+        """
1726+        # It does not make sense to be able to put the root
1727+        # hash without first putting the share hashes, since you need
1728+        # the share hashes to generate the root hash.
1729+        #
1730+        # Signature is defined by the routine that places the share hash
1731+        # chain, so it's a good thing to look for in finding out whether
1732+        # or not the share hash chain exists on the remote server.
1733+        if "signature" not in self._offsets:
1734+            raise LayoutInvalid("You need to put the share hash chain "
1735+                                "before you can put the root share hash")
1736+        if len(roothash) != HASH_SIZE:
1737+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
1738+                                 % HASH_SIZE)
1739+        self._root_hash = roothash
1740+        # To write this value, we update the checkstring on the
1741+        # remote server, which includes it.
1742+        checkstring = self.get_checkstring()
1743+        self._writevs.append(tuple([0, checkstring]))
1744+        # This write, if successful, changes the checkstring, so we need
1745+        # to update our internal checkstring to be consistent with the
1746+        # one on the server.
1747+
1748+
1749+    def get_signable(self):
1750+        """
1751+        Get the first seven fields of the mutable file; the parts that
1752+        are signed.
1753+        """
1754+        if not self._root_hash:
1755+            raise LayoutInvalid("You need to set the root hash "
1756+                                "before getting something to "
1757+                                "sign")
1758+        return struct.pack(MDMFSIGNABLEHEADER,
1759+                           1,
1760+                           self._seqnum,
1761+                           self._root_hash,
1762+                           self._required_shares,
1763+                           self._total_shares,
1764+                           self._segment_size,
1765+                           self._data_length)
1766+
1767+
1768+    def put_signature(self, signature):
1769+        """
1770+        I queue a write vector for the signature of the MDMF share.
1771+
1772+        I require that the root hash and share hash chain have been put
1773+        to the grid before I will write the signature to the grid.
1774+        """
1775+        if "signature" not in self._offsets:
1776+            raise LayoutInvalid("You must put the share hash chain "
1777+        # It does not make sense to put a signature without first
1778+        # putting the root hash and the salt hash (since otherwise
1779+        # the signature would be incomplete), so we don't allow that.
1780+                       "before putting the signature")
1781+        if not self._root_hash:
1782+            raise LayoutInvalid("You must complete the signed prefix "
1783+                                "before computing a signature")
1784+        # If we put the signature after we put the verification key, we
1785+        # could end up running into the verification key, and will
1786+        # probably screw up the offsets as well. So we don't allow that.
1787+        # The method that writes the verification key defines the EOF
1788+        # offset before writing the verification key, so look for that.
1789+        if "EOF" in self._offsets:
1790+            raise LayoutInvalid("You must write the signature before the verification key")
1791+
1792+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
1793+        self._writevs.append(tuple([self._offsets['signature'], signature]))
1794+
1795+
1796+    def put_verification_key(self, verification_key):
1797+        """
1798+        I queue a write vector for the verification key.
1799+
1800+        I require that the signature have been written to the storage
1801+        server before I allow the verification key to be written to the
1802+        remote server.
1803+        """
1804+        if "verification_key" not in self._offsets:
1805+            raise LayoutInvalid("You must put the signature before you "
1806+                                "can put the verification key")
1807+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
1808+        self._writevs.append(tuple([self._offsets['verification_key'],
1809+                            verification_key]))
1810+
1811+
1812+    def _get_offsets_tuple(self):
1813+        return tuple([(key, value) for key, value in self._offsets.items()])
1814+
1815+
1816+    def get_verinfo(self):
1817+        # Like the reader's get_verinfo, but synchronous. The None in
1818+        # the salt position keeps the tuple the same shape as the SDMF
1819+        # verinfo tuple, which carries a 16-byte IV there.
1820+        return (self._seqnum,
1821+                self._root_hash,
1822+                None,
1823+                self._segment_size,
1824+                self._data_length,
1825+                self._required_shares,
1826+                self._total_shares,
1827+                self.get_signable(),
1828+                self._get_offsets_tuple())
1825+
1826+
1827+    def finish_publishing(self):
1828+        """
1829+        I add a write vector for the offsets table, and then cause all
1830+        of the write vectors that I've dealt with so far to be published
1831+        to the remote server, ending the write process.
1832+        """
1833+        if "EOF" not in self._offsets:
1834+            raise LayoutInvalid("You must put the verification key before "
1835+                                "you can publish the offsets")
1836+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
1837+        offsets = struct.pack(MDMFOFFSETS,
1838+                              self._offsets['enc_privkey'],
1839+                              self._offsets['block_hash_tree'],
1840+                              self._offsets['share_hash_chain'],
1841+                              self._offsets['signature'],
1842+                              self._offsets['verification_key'],
1843+                              self._offsets['EOF'])
1844+        self._writevs.append(tuple([offsets_offset, offsets]))
1845+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
1846+        params = struct.pack(">BBQQ",
1847+                             self._required_shares,
1848+                             self._total_shares,
1849+                             self._segment_size,
1850+                             self._data_length)
1851+        self._writevs.append(tuple([encoding_parameters_offset, params]))
1852+        return self._write(self._writevs)
1853+
1854+
1855+    def _write(self, datavs, on_failure=None, on_success=None):
1856+        """I write the data vectors in datavs to the remote slot."""
1857+        tw_vectors = {}
1859+        if not self._testvs:
1860+            self._testvs = []
1861+            self._testvs.append(tuple([0, 1, "eq", ""]))
1863+        if not self._written:
1864+            # Write a new checkstring to the share when we write it, so
1865+            # that we have something to check later.
1866+            new_checkstring = self.get_checkstring()
1867+            datavs.append((0, new_checkstring))
1868+            def _first_write():
1869+                self._written = True
1870+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
1871+            on_success = _first_write
1872+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
1874+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
1875+                                  self._storage_index,
1876+                                  self._secrets,
1877+                                  tw_vectors,
1878+                                  self._readv)
1879+        def _result(results):
1880+            if isinstance(results, failure.Failure) or not results[0]:
1881+                # Do nothing; the write was unsuccessful.
1882+                if on_failure: on_failure()
1883+            else:
1884+                if on_success: on_success()
1885+            return results
1886+        d.addCallback(_result)
1887+        return d
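
    # For illustration (values assumed), a first write to a fresh slot
    # sends roughly:
    #
    #   tw_vectors = {shnum: ([(0, 1, "eq", "")],   # test: slot is empty
    #                         [(offset1, data1),    # queued write vectors
    #                          (0, checkstring)],   # header canary
    #                         None)}                # new_length: leave alone
    #
    # A failed test vector comes back with results[0] == False, and the
    # readv results in results[1] show what is actually on the server.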
1888+
1889+
1890+class MDMFSlotReadProxy:
1891+    """
1892+    I read from a mutable slot filled with data written in the MDMF data
1893+    format (which is described above).
1894+
1895+    I can be initialized with some amount of data, which I will use (if
1896+    it is valid) to eliminate some of the need to fetch it from servers.
1897+    """
1898+    def __init__(self,
1899+                 rref,
1900+                 storage_index,
1901+                 shnum,
1902+                 data=""):
1903+        # Start the initialization process.
1904+        self._rref = rref
1905+        self._storage_index = storage_index
1906+        self.shnum = shnum
1907+
1908+        # Before doing anything, the reader is probably going to want to
1909+        # verify that the signature is correct. To do that, they'll need
1910+        # the verification key, and the signature. To get those, we'll
1911+        # need the offset table. So fetch the offset table on the
1912+        # assumption that that will be the first thing that a reader is
1913+        # going to do.
1914+
1915+        # The fact that these encoding parameters are None tells us
1916+        # that we haven't yet fetched them from the remote share, so we
1917+        # should. We could just not set them, but the checks will be
1918+        # easier to read if we don't have to use hasattr.
1919+        self._version_number = None
1920+        self._sequence_number = None
1921+        self._root_hash = None
1922+        # Filled in if we're dealing with an SDMF file. Unused
1923+        # otherwise.
1924+        self._salt = None
1925+        self._required_shares = None
1926+        self._total_shares = None
1927+        self._segment_size = None
1928+        self._data_length = None
1929+        self._offsets = None
1930+
1931+        # If the user has chosen to initialize us with some data, we'll
1932+        # try to satisfy subsequent data requests with that data before
1933+        # asking the storage server for it.
1934+        self._data = data
1935+        # The filenode cache that callers hand us yields None when
1936+        # there isn't any cached data, but the way we index the cached
1937+        # data requires a string, so convert None to "".
1938+        if self._data is None:
1939+            self._data = ""
1940+
1941+        self._queue_observers = observer.ObserverList()
1942+        self._queue_errbacks = observer.ObserverList()
1943+        self._readvs = []
1944+
1945+
1946+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
1947+        """
1948+        I fetch the offset table and the header from the remote slot if
1949+        I don't already have them. If I do have them, I do nothing and
1950+        return an empty Deferred.
1951+        """
1952+        if self._offsets:
1953+            return defer.succeed(None)
1954+        # At this point, we may be either SDMF or MDMF. Fetching 107
1955+        # bytes is enough to get the header and offsets for both: the
1956+        # SDMF signed prefix plus offsets table is 75 + 32 = 107 bytes,
1957+        # and the MDMF header plus offsets table is 59 + 48 = 107 bytes.
1958+        # Either way, one fetch beats a second roundtrip.
1959+        readvs = [(0, 107)]
1960+        d = self._read(readvs, force_remote)
1961+        d.addCallback(self._process_encoding_parameters)
1962+        d.addCallback(self._process_offsets)
1963+        return d
1964+
1965+
1966+    def _process_encoding_parameters(self, encoding_parameters):
1967+        assert self.shnum in encoding_parameters
1968+        encoding_parameters = encoding_parameters[self.shnum][0]
1969+        # The first byte is the version number. It will tell us what
1970+        # to do next.
1971+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
1972+        if verno == MDMF_VERSION:
1973+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
1974+            (verno,
1975+             seqnum,
1976+             root_hash,
1977+             k,
1978+             n,
1979+             segsize,
1980+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
1981+                                      encoding_parameters[:read_size])
1982+            if segsize == 0 and datalen == 0:
1983+                # Empty file, no segments.
1984+                self._num_segments = 0
1985+            else:
1986+                self._num_segments = mathutil.div_ceil(datalen, segsize)
1987+
1988+        elif verno == SDMF_VERSION:
1989+            read_size = SIGNED_PREFIX_LENGTH
1990+            (verno,
1991+             seqnum,
1992+             root_hash,
1993+             salt,
1994+             k,
1995+             n,
1996+             segsize,
1997+             datalen) = struct.unpack(">BQ32s16s BBQQ",
1998+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
1999+            self._salt = salt
2000+            if segsize == 0 and datalen == 0:
2001+                # empty file
2002+                self._num_segments = 0
2003+            else:
2004+                # non-empty SDMF files have one segment.
2005+                self._num_segments = 1
2006+        else:
2007+            raise UnknownVersionError("You asked me to read mutable file "
2008+                                      "version %d, but I only understand "
2009+                                      "%d and %d" % (verno, SDMF_VERSION,
2010+                                                     MDMF_VERSION))
2011+
2012+        self._version_number = verno
2013+        self._sequence_number = seqnum
2014+        self._root_hash = root_hash
2015+        self._required_shares = k
2016+        self._total_shares = n
2017+        self._segment_size = segsize
2018+        self._data_length = datalen
2019+
2020+        self._block_size = self._segment_size / self._required_shares
2021+        # We can upload empty files, and need to account for this fact
2022+        # so as to avoid zero-division and zero-modulo errors.
2023+        if datalen > 0:
2024+            tail_size = self._data_length % self._segment_size
2025+        else:
2026+            tail_size = 0
2027+        if not tail_size:
2028+            self._tail_block_size = self._block_size
2029+        else:
2030+            self._tail_block_size = mathutil.next_multiple(tail_size,
2031+                                                    self._required_shares)
2032+            self._tail_block_size /= self._required_shares
2033+
2034+        return encoding_parameters
2035+
2036+
2037+    def _process_offsets(self, offsets):
2038+        if self._version_number == 0:
2039+            read_size = OFFSETS_LENGTH
2040+            read_offset = SIGNED_PREFIX_LENGTH
2041+            end = read_size + read_offset
2042+            (signature,
2043+             share_hash_chain,
2044+             block_hash_tree,
2045+             share_data,
2046+             enc_privkey,
2047+             EOF) = struct.unpack(">LLLLQQ",
2048+                                  offsets[read_offset:end])
2049+            self._offsets = {}
2050+            self._offsets['signature'] = signature
2051+            self._offsets['share_data'] = share_data
2052+            self._offsets['block_hash_tree'] = block_hash_tree
2053+            self._offsets['share_hash_chain'] = share_hash_chain
2054+            self._offsets['enc_privkey'] = enc_privkey
2055+            self._offsets['EOF'] = EOF
2056+
2057+        elif self._version_number == 1:
2058+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
2059+            read_length = MDMFOFFSETS_LENGTH
2060+            end = read_offset + read_length
2061+            (encprivkey,
2062+             blockhashes,
2063+             sharehashes,
2064+             signature,
2065+             verification_key,
2066+             eof) = struct.unpack(MDMFOFFSETS,
2067+                                  offsets[read_offset:end])
2068+            self._offsets = {}
2069+            self._offsets['enc_privkey'] = encprivkey
2070+            self._offsets['block_hash_tree'] = blockhashes
2071+            self._offsets['share_hash_chain'] = sharehashes
2072+            self._offsets['signature'] = signature
2073+            self._offsets['verification_key'] = verification_key
2074+            self._offsets['EOF'] = eof
2075+
2076+
2077+    def get_block_and_salt(self, segnum, queue=False):
2078+        """
2079+        I return (block, salt), where block is the block data and
2080+        salt is the salt used to encrypt that segment.
2081+        """
2082+        d = self._maybe_fetch_offsets_and_header()
2083+        def _then(ignored):
2084+            if self._version_number == 1:
2085+                base_share_offset = MDMFHEADERSIZE
2086+            else:
2087+                base_share_offset = self._offsets['share_data']
2088+
2089+            if segnum + 1 > self._num_segments:
2090+                raise LayoutInvalid("Not a valid segment number")
2091+
2092+            if self._version_number == 0:
2093+                share_offset = base_share_offset + self._block_size * segnum
2094+            else:
2095+                share_offset = base_share_offset + (self._block_size + \
2096+                                                    SALT_SIZE) * segnum
2097+            if segnum + 1 == self._num_segments:
2098+                readlen = self._tail_block_size
2099+            else:
2100+                readlen = self._block_size
2101+
2102+            if self._version_number == 1:
2103+                readlen += SALT_SIZE
2104+
2105+            readvs = [(share_offset, readlen)]
2106+            return readvs
2107+        d.addCallback(_then)
2108+        d.addCallback(lambda readvs:
2109+            self._read(readvs, queue=queue))
2110+        def _process_results(results):
2111+            assert self.shnum in results
2112+            if self._version_number == 0:
2113+                # We only read the share data, but we know the salt from
2114+                # when we fetched the header
2115+                data = results[self.shnum]
2116+                if not data:
2117+                    data = ""
2118+                else:
2119+                    assert len(data) == 1
2120+                    data = data[0]
2121+                salt = self._salt
2122+            else:
2123+                data = results[self.shnum]
2124+                if not data:
2125+                    salt = data = ""
2126+                else:
2127+                    salt_and_data = results[self.shnum][0]
2128+                    salt = salt_and_data[:SALT_SIZE]
2129+                    data = salt_and_data[SALT_SIZE:]
2130+            return data, salt
2131+        d.addCallback(_process_results)
2132+        return d
2133+
2134+
2135+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
2136+        """
2137+        I return the block hash tree
2138+
2139+        I take an optional argument, needed, which is a set of indices
2140+        corresponding to hashes that I should fetch. If this argument is
2141+        missing, I will fetch the entire block hash tree; otherwise, I
2142+        may attempt to fetch fewer hashes, based on what needed says
2143+        that I should do. Note that I may fetch as many hashes as I
2144+        want, so long as the set of hashes that I do fetch is a superset
2145+        of the ones that I am asked for, so callers should be prepared
2146+        to tolerate additional hashes.
2147+        """
2148+        # TODO: Return only the parts of the block hash tree necessary
2149+        # to validate the blocknum provided?
2150+        # This is a good idea, but it is hard to implement correctly. It
2151+        # is bad to fetch any one block hash more than once, so we
2152+        # probably just want to fetch the whole thing at once and then
2153+        # serve it.
2154+        if needed == set([]):
2155+            return defer.succeed([])
2156+        d = self._maybe_fetch_offsets_and_header()
2157+        def _then(ignored):
2158+            blockhashes_offset = self._offsets['block_hash_tree']
2159+            if self._version_number == 1:
2160+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
2161+            else:
2162+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
2163+            readvs = [(blockhashes_offset, blockhashes_length)]
2164+            return readvs
2165+        d.addCallback(_then)
2166+        d.addCallback(lambda readvs:
2167+            self._read(readvs, queue=queue, force_remote=force_remote))
2168+        def _build_block_hash_tree(results):
2169+            assert self.shnum in results
2170+
2171+            rawhashes = results[self.shnum][0]
2172+            results = [rawhashes[i:i+HASH_SIZE]
2173+                       for i in range(0, len(rawhashes), HASH_SIZE)]
2174+            return results
2175+        d.addCallback(_build_block_hash_tree)
2176+        return d
2177+
2178+
2179+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
2180+        """
2181+        I return the part of the share hash chain needed to validate
2182+        this share.
2183+
2184+        I take an optional argument, needed. Needed is a set of indices
2185+        that correspond to the hashes that I should fetch. If needed is
2186+        not present, I will fetch and return the entire share hash
2187+        chain. Otherwise, I may fetch and return any part of the share
2188+        hash chain that is a superset of the part that I am asked to
2189+        fetch. Callers should be prepared to deal with more hashes than
2190+        they've asked for.
2191+        """
2192+        if needed == set([]):
2193+            return defer.succeed([])
2194+        d = self._maybe_fetch_offsets_and_header()
2195+
2196+        def _make_readvs(ignored):
2197+            sharehashes_offset = self._offsets['share_hash_chain']
2198+            if self._version_number == 0:
2199+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
2200+            else:
2201+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
2202+            readvs = [(sharehashes_offset, sharehashes_length)]
2203+            return readvs
2204+        d.addCallback(_make_readvs)
2205+        d.addCallback(lambda readvs:
2206+            self._read(readvs, queue=queue, force_remote=force_remote))
2207+        def _build_share_hash_chain(results):
2208+            assert self.shnum in results
2209+
2210+            sharehashes = results[self.shnum][0]
2211+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
2212+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
2213+            results = dict([struct.unpack(">H32s", data)
2214+                            for data in results])
2215+            return results
2216+        d.addCallback(_build_share_hash_chain)
2217+        return d
2218+
2219+
2220+    def get_encprivkey(self, queue=False):
2221+        """
2222+        I return the encrypted private key.
2223+        """
2224+        d = self._maybe_fetch_offsets_and_header()
2225+
2226+        def _make_readvs(ignored):
2227+            privkey_offset = self._offsets['enc_privkey']
2228+            if self._version_number == 0:
2229+                privkey_length = self._offsets['EOF'] - privkey_offset
2230+            else:
2231+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
2232+            readvs = [(privkey_offset, privkey_length)]
2233+            return readvs
2234+        d.addCallback(_make_readvs)
2235+        d.addCallback(lambda readvs:
2236+            self._read(readvs, queue=queue))
2237+        def _process_results(results):
2238+            assert self.shnum in results
2239+            privkey = results[self.shnum][0]
2240+            return privkey
2241+        d.addCallback(_process_results)
2242+        return d
2243+
2244+
2245+    def get_signature(self, queue=False):
2246+        """
2247+        I return the signature of my share.
2248+        """
2249+        d = self._maybe_fetch_offsets_and_header()
2250+
2251+        def _make_readvs(ignored):
2252+            signature_offset = self._offsets['signature']
2253+            if self._version_number == 1:
2254+                signature_length = self._offsets['verification_key'] - signature_offset
2255+            else:
2256+                signature_length = self._offsets['share_hash_chain'] - signature_offset
2257+            readvs = [(signature_offset, signature_length)]
2258+            return readvs
2259+        d.addCallback(_make_readvs)
2260+        d.addCallback(lambda readvs:
2261+            self._read(readvs, queue=queue))
2262+        def _process_results(results):
2263+            assert self.shnum in results
2264+            signature = results[self.shnum][0]
2265+            return signature
2266+        d.addCallback(_process_results)
2267+        return d
2268+
2269+
2270+    def get_verification_key(self, queue=False):
2271+        """
2272+        I return the verification key.
2273+        """
2274+        d = self._maybe_fetch_offsets_and_header()
2275+
2276+        def _make_readvs(ignored):
2277+            if self._version_number == 1:
2278+                vk_offset = self._offsets['verification_key']
2279+                vk_length = self._offsets['EOF'] - vk_offset
2280+            else:
2281+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
2282+                vk_length = self._offsets['signature'] - vk_offset
2283+            readvs = [(vk_offset, vk_length)]
2284+            return readvs
2285+        d.addCallback(_make_readvs)
2286+        d.addCallback(lambda readvs:
2287+            self._read(readvs, queue=queue))
2288+        def _process_results(results):
2289+            assert self.shnum in results
2290+            verification_key = results[self.shnum][0]
2291+            return verification_key
2292+        d.addCallback(_process_results)
2293+        return d
2294+
2295+
2296+    def get_encoding_parameters(self):
2297+        """
2298+        I return (k, n, segsize, datalen)
2299+        """
2300+        d = self._maybe_fetch_offsets_and_header()
2301+        d.addCallback(lambda ignored:
2302+            (self._required_shares,
2303+             self._total_shares,
2304+             self._segment_size,
2305+             self._data_length))
2306+        return d
2307+
2308+
2309+    def get_seqnum(self):
2310+        """
2311+        I return the sequence number for this share.
2312+        """
2313+        d = self._maybe_fetch_offsets_and_header()
2314+        d.addCallback(lambda ignored:
2315+            self._sequence_number)
2316+        return d
2317+
2318+
2319+    def get_root_hash(self):
2320+        """
2321+        I return the root of the block hash tree
2322+        """
2323+        d = self._maybe_fetch_offsets_and_header()
2324+        d.addCallback(lambda ignored: self._root_hash)
2325+        return d
2326+
2327+
2328+    def get_checkstring(self):
2329+        """
2330+        I return the packed representation of the following:
2331+
2332+            - version number
2333+            - sequence number
2334+            - root hash
2335+            - the salt (present only for SDMF shares)
2336+
2337+        which my users use as a checkstring to detect other writers.
2338+        """
2339+        d = self._maybe_fetch_offsets_and_header()
2340+        def _build_checkstring(ignored):
2341+            if self._salt:
2342+                checkstring = struct.pack(PREFIX,
2343+                                         self._version_number,
2344+                                         self._sequence_number,
2345+                                         self._root_hash,
2346+                                         self._salt)
2347+            else:
2348+                checkstring = struct.pack(MDMFCHECKSTRING,
2349+                                          self._version_number,
2350+                                          self._sequence_number,
2351+                                          self._root_hash)
2352+
2353+            return checkstring
2354+        d.addCallback(_build_checkstring)
2355+        return d
2356+
2357+
2358+    def get_prefix(self, force_remote):
2359+        d = self._maybe_fetch_offsets_and_header(force_remote)
2360+        d.addCallback(lambda ignored:
2361+            self._build_prefix())
2362+        return d
2363+
2364+
2365+    def _build_prefix(self):
2366+        # The prefix is another name for the part of the remote share
2367+        # that gets signed. It consists of everything up to and
2368+        # including the datalength, packed by struct.
2369+        if self._version_number == SDMF_VERSION:
2370+            return struct.pack(SIGNED_PREFIX,
2371+                           self._version_number,
2372+                           self._sequence_number,
2373+                           self._root_hash,
2374+                           self._salt,
2375+                           self._required_shares,
2376+                           self._total_shares,
2377+                           self._segment_size,
2378+                           self._data_length)
2379+
2380+        else:
2381+            return struct.pack(MDMFSIGNABLEHEADER,
2382+                           self._version_number,
2383+                           self._sequence_number,
2384+                           self._root_hash,
2385+                           self._required_shares,
2386+                           self._total_shares,
2387+                           self._segment_size,
2388+                           self._data_length)
2389+
2390+
2391+    def _get_offsets_tuple(self):
2392+        # The offsets tuple is another component of the version
2393+        # information tuple. It is basically our offsets dictionary,
2394+        # itemized and in a tuple.
2395+        return tuple([(key, value) for key, value in self._offsets.items()])
2396+
2397+
2398+    def get_verinfo(self):
2399+        """
2400+        I return my verinfo tuple. This is used by the ServermapUpdater
2401+        to keep track of versions of mutable files.
2402+
2403+        The verinfo tuple for MDMF files contains:
2404+            - seqnum
2405+            - root hash
2406+            - a blank (None), in the position where SDMF carries its salt
2407+            - segsize
2408+            - datalen
2409+            - k
2410+            - n
2411+            - prefix (the thing that you sign)
2412+            - a tuple of offsets
2413+
2414+        We keep a blank in that position for MDMF so that the tuple
2415+        has the same shape for both formats.
2416+
2417+        The verinfo tuple for SDMF files is the same, but contains the
2418+        16-byte IV (the salt) instead of the blank.
2419+        """
2420+        d = self._maybe_fetch_offsets_and_header()
2421+        def _build_verinfo(ignored):
2422+            if self._version_number == SDMF_VERSION:
2423+                salt_to_use = self._salt
2424+            else:
2425+                salt_to_use = None
2426+            return (self._sequence_number,
2427+                    self._root_hash,
2428+                    salt_to_use,
2429+                    self._segment_size,
2430+                    self._data_length,
2431+                    self._required_shares,
2432+                    self._total_shares,
2433+                    self._build_prefix(),
2434+                    self._get_offsets_tuple())
2435+        d.addCallback(_build_verinfo)
2436+        return d
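
    # A sketch of typical use -- like my other accessors, I return a
    # Deferred:
    #
    #   d = reader.get_verinfo()
    #   def _got(verinfo):
    #       (seqnum, root_hash, salt_or_none, segsize, datalen,
    #        k, n, prefix, offsets_tuple) = verinfo
    #       ...
    #   d.addCallback(_got)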
2437+
2438+
2439+    def flush(self):
2440+        """
2441+        I flush my queue of read vectors.
2442+        """
2443+        d = self._read(self._readvs)
2444+        def _then(results):
2445+            self._readvs = []
2446+            if isinstance(results, failure.Failure):
2447+                self._queue_errbacks.notify(results)
2448+            else:
2449+                self._queue_observers.notify(results)
2450+            self._queue_observers = observer.ObserverList()
2451+            self._queue_errbacks = observer.ObserverList()
2452+        d.addBoth(_then)
2453+
2454+
2455+    def _read(self, readvs, force_remote=False, queue=False):
2456+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
2457+        # TODO: It's entirely possible to tweak this so that it just
2458+        # fulfills the requests that it can, and not demand that all
2459+        # requests are satisfiable before running it.
2460+        if not unsatisfiable and not force_remote:
2461+            results = [self._data[offset:offset+length]
2462+                       for (offset, length) in readvs]
2463+            results = {self.shnum: results}
2464+            return defer.succeed(results)
2465+        else:
2466+            if queue:
2467+                start = len(self._readvs)
2468+                self._readvs += readvs
2469+                end = len(self._readvs)
2470+                def _get_results(results, start, end):
2471+                    if self.shnum not in results:
2472+                        return {self.shnum: [""]}
2473+                    return {self.shnum: results[self.shnum][start:end]}
2474+                d = defer.Deferred()
2475+                d.addCallback(_get_results, start, end)
2476+                self._queue_observers.subscribe(d.callback)
2477+                self._queue_errbacks.subscribe(d.errback)
2478+                return d
2479+            return self._rref.callRemote("slot_readv",
2480+                                         self._storage_index,
2481+                                         [self.shnum],
2482+                                         readvs)
2483+
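+    # Readv semantics, for reference: each readv is an (offset, length)
+    # pair, and results come back keyed by share number. A hypothetical
+    # call like self._read([(0, 1), (1, 8)]) would return something of
+    # the form {self.shnum: [version_byte, seqnum_bytes]}.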
2484+
2485+    def is_sdmf(self):
2486+        """I tell my caller whether my remote file is SDMF or MDMF.
2487+        """
2488+        d = self._maybe_fetch_offsets_and_header()
2489+        d.addCallback(lambda ignored:
2490+            self._version_number == 0)
2491+        return d
2492+
2493+
2494+class LayoutInvalid(Exception):
2495+    """
2496+    This isn't a valid MDMF mutable file
2497+    """
2498hunk ./src/allmydata/test/test_storage.py 2
2499 
2500-import time, os.path, stat, re, simplejson, struct
2501+import time, os.path, stat, re, simplejson, struct, shutil
2502 
2503 from twisted.trial import unittest
2504 
2505hunk ./src/allmydata/test/test_storage.py 22
2506 from allmydata.storage.expirer import LeaseCheckingCrawler
2507 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
2508      ReadBucketProxy
2509-from allmydata.interfaces import BadWriteEnablerError
2510-from allmydata.test.common import LoggingServiceParent
2511+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
2512+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
2513+                                     SIGNED_PREFIX, MDMFHEADER, \
2514+                                     MDMFOFFSETS, SDMFSlotWriteProxy
2515+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
2516+                                 SDMF_VERSION
2517+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
2518 from allmydata.test.common_web import WebRenderingMixin
2519 from allmydata.web.storage import StorageStatus, remove_prefix
2520 
2521hunk ./src/allmydata/test/test_storage.py 106
2522 
2523 class RemoteBucket:
2524 
2525+    def __init__(self):
2526+        self.read_count = 0
2527+        self.write_count = 0
2528+
2529     def callRemote(self, methname, *args, **kwargs):
2530         def _call():
2531             meth = getattr(self.target, "remote_" + methname)
2532hunk ./src/allmydata/test/test_storage.py 114
2533             return meth(*args, **kwargs)
2534+
2535+        if methname == "slot_readv":
2536+            self.read_count += 1
2537+        if "writev" in methname:
2538+            self.write_count += 1
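+        # These counters let later tests (e.g. the prefetched-data
+        # tests below) assert how many remote round trips a proxy
+        # actually performed.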
2539+
2540         return defer.maybeDeferred(_call)
2541 
2542hunk ./src/allmydata/test/test_storage.py 122
2543+
2544 class BucketProxy(unittest.TestCase):
2545     def make_bucket(self, name, size):
2546         basedir = os.path.join("storage", "BucketProxy", name)
2547hunk ./src/allmydata/test/test_storage.py 1313
2548         self.failUnless(os.path.exists(prefixdir), prefixdir)
2549         self.failIf(os.path.exists(bucketdir), bucketdir)
2550 
2551+
2552+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
2553+    def setUp(self):
2554+        self.sparent = LoggingServiceParent()
2555+        self._lease_secret = itertools.count()
2556+        self.ss = self.create("MDMFProxies storage test server")
2557+        self.rref = RemoteBucket()
2558+        self.rref.target = self.ss
2559+        self.secrets = (self.write_enabler("we_secret"),
2560+                        self.renew_secret("renew_secret"),
2561+                        self.cancel_secret("cancel_secret"))
2562+        self.segment = "aaaaaa"
2563+        self.block = "aa"
2564+        self.salt = "a" * 16
2565+        self.block_hash = "a" * 32
2566+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
2567+        self.share_hash = self.block_hash
2568+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
2569+        self.signature = "foobarbaz"
2570+        self.verification_key = "vvvvvv"
2571+        self.encprivkey = "private"
2572+        self.root_hash = self.block_hash
2573+        self.salt_hash = self.root_hash
2574+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
2575+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
2576+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
2577+        # blockhashes and salt hashes are serialized in the same way,
2578+        # only we lop off the first element and store that in the
2579+        # header.
2580+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
2581+
2582+
2583+    def tearDown(self):
2584+        self.sparent.stopService()
2585+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
2586+
2587+
2588+    def write_enabler(self, we_tag):
2589+        return hashutil.tagged_hash("we_blah", we_tag)
2590+
2591+
2592+    def renew_secret(self, tag):
2593+        return hashutil.tagged_hash("renew_blah", str(tag))
2594+
2595+
2596+    def cancel_secret(self, tag):
2597+        return hashutil.tagged_hash("cancel_blah", str(tag))
2598+
2599+
2600+    def workdir(self, name):
2601+        basedir = os.path.join("storage", "MutableServer", name)
2602+        return basedir
2603+
2604+
2605+    def create(self, name):
2606+        workdir = self.workdir(name)
2607+        ss = StorageServer(workdir, "\x00" * 20)
2608+        ss.setServiceParent(self.sparent)
2609+        return ss
2610+
2611+
2612+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
2613+        # Start with the checkstring
2614+        data = struct.pack(">BQ32s",
2615+                           1,
2616+                           0,
2617+                           self.root_hash)
2618+        self.checkstring = data
2619+        # Next, the encoding parameters
2620+        if tail_segment:
2621+            data += struct.pack(">BBQQ",
2622+                                3,
2623+                                10,
2624+                                6,
2625+                                33)
2626+        elif empty:
2627+            data += struct.pack(">BBQQ",
2628+                                3,
2629+                                10,
2630+                                0,
2631+                                0)
2632+        else:
2633+            data += struct.pack(">BBQQ",
2634+                                3,
2635+                                10,
2636+                                6,
2637+                                36)
2638+        # Now we'll build the offsets.
2639+        sharedata = ""
2640+        if not tail_segment and not empty:
2641+            for i in xrange(6):
2642+                sharedata += self.salt + self.block
2643+        elif tail_segment:
2644+            for i in xrange(5):
2645+                sharedata += self.salt + self.block
2646+            sharedata += self.salt + "a"
2647+
2648+        # The encrypted private key comes after the shares + salts
2649+        offset_size = struct.calcsize(MDMFOFFSETS)
2650+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
2651+        # The blockhashes come after the private key
2652+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
2653+        # The sharehashes come after the block hashes
2654+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
2655+        # The signature comes after the share hash chain
2656+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
2657+        # The verification key comes after the signature
2658+        verification_offset = signature_offset + len(self.signature)
2659+        # The EOF comes after the verification key
2660+        eof_offset = verification_offset + len(self.verification_key)
2661+        data += struct.pack(MDMFOFFSETS,
2662+                            encrypted_private_key_offset,
2663+                            blockhashes_offset,
2664+                            sharehashes_offset,
2665+                            signature_offset,
2666+                            verification_offset,
2667+                            eof_offset)
2668+        self.offsets = {}
2669+        self.offsets['enc_privkey'] = encrypted_private_key_offset
2670+        self.offsets['block_hash_tree'] = blockhashes_offset
2671+        self.offsets['share_hash_chain'] = sharehashes_offset
2672+        self.offsets['signature'] = signature_offset
2673+        self.offsets['verification_key'] = verification_offset
2674+        self.offsets['EOF'] = eof_offset
2675+        # Next, we'll add in the salts and share data,
2676+        data += sharedata
2677+        # the private key,
2678+        data += self.encprivkey
2679+        # the block hash tree,
2680+        data += self.block_hash_tree_s
2681+        # the share hash chain,
2682+        data += self.share_hash_chain_s
2683+        # the signature,
2684+        data += self.signature
2685+        # and the verification key
2686+        data += self.verification_key
2687+        return data
2688+
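+    # A worked example of the layout built above (the 48-byte offset
+    # table assumes MDMFOFFSETS packs six 8-byte entries, matching the
+    # reads at offsets 59-99 in test_write): the fixed header is
+    # 1 + 8 + 32 + 1 + 1 + 8 + 8 = 59 bytes, then the 48-byte offset
+    # table (107 bytes in all), then 6 * (16 + 2) bytes of salts and
+    # blocks, then the encrypted private key, block hash tree, share
+    # hash chain, signature, and verification key, in that order.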
2689+
2690+    def write_test_share_to_server(self,
2691+                                   storage_index,
2692+                                   tail_segment=False,
2693+                                   empty=False):
2694+        """
2695+        I write some data for the read tests to read to self.ss
2696+
2697+        If tail_segment=True, then I will write a share that has a
2698+        smaller tail segment than other segments.
2699+        """
2700+        write = self.ss.remote_slot_testv_and_readv_and_writev
2701+        data = self.build_test_mdmf_share(tail_segment, empty)
2702+        # Finally, we write the whole thing to the storage server in one
2703+        # pass.
2704+        testvs = [(0, 1, "eq", "")]
2705+        tws = {}
2706+        tws[0] = (testvs, [(0, data)], None)
2707+        readv = [(0, 1)]
2708+        results = write(storage_index, self.secrets, tws, readv)
2709+        self.failUnless(results[0])
2710+
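+    # For reference, the test-and-write vector format used above: tws
+    # maps sharenum -> (testvs, datavs, new_length), each testv is an
+    # (offset, length, operator, specimen) tuple, and each datav is an
+    # (offset, data) pair. The empty-specimen "eq" test at offset 0
+    # passes only if the share holds no data there yet, so it guards
+    # against clobbering an existing share.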
2711+
2712+    def build_test_sdmf_share(self, empty=False):
2713+        if empty:
2714+            sharedata = ""
2715+        else:
2716+            sharedata = self.segment * 6
2717+        self.sharedata = sharedata
2718+        blocksize = len(sharedata) / 3
2719+        block = sharedata[:blocksize]
2720+        self.blockdata = block
2721+        prefix = struct.pack(">BQ32s16s BBQQ",
2722+                             0, # version,
2723+                             0,
2724+                             self.root_hash,
2725+                             self.salt,
2726+                             3,
2727+                             10,
2728+                             len(sharedata),
2729+                             len(sharedata),
2730+                            )
2731+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
2732+        signature_offset = post_offset + len(self.verification_key)
2733+        sharehashes_offset = signature_offset + len(self.signature)
2734+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
2735+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
2736+        encprivkey_offset = sharedata_offset + len(block)
2737+        eof_offset = encprivkey_offset + len(self.encprivkey)
2738+        offsets = struct.pack(">LLLLQQ",
2739+                              signature_offset,
2740+                              sharehashes_offset,
2741+                              blockhashes_offset,
2742+                              sharedata_offset,
2743+                              encprivkey_offset,
2744+                              eof_offset)
2745+        final_share = "".join([prefix,
2746+                           offsets,
2747+                           self.verification_key,
2748+                           self.signature,
2749+                           self.share_hash_chain_s,
2750+                           self.block_hash_tree_s,
2751+                           block,
2752+                           self.encprivkey])
2753+        self.offsets = {}
2754+        self.offsets['signature'] = signature_offset
2755+        self.offsets['share_hash_chain'] = sharehashes_offset
2756+        self.offsets['block_hash_tree'] = blockhashes_offset
2757+        self.offsets['share_data'] = sharedata_offset
2758+        self.offsets['enc_privkey'] = encprivkey_offset
2759+        self.offsets['EOF'] = eof_offset
2760+        return final_share
2761+
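+    # Note the SDMF layout differences illustrated above: the offset
+    # table packs four 4-byte and two 8-byte entries (">LLLLQQ")
+    # rather than MDMF's six 8-byte entries, and the verification key
+    # precedes the signature and hash chains instead of following
+    # them.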
2762+
2763+    def write_sdmf_share_to_server(self,
2764+                                   storage_index,
2765+                                   empty=False):
2766+        # Some tests need SDMF shares to verify that we can still
2767+        # read them. This method writes one that is structurally
2768+        # valid SDMF, though its hashes and signature are placeholders.
2768+        assert self.rref
2769+        write = self.ss.remote_slot_testv_and_readv_and_writev
2770+        share = self.build_test_sdmf_share(empty)
2771+        testvs = [(0, 1, "eq", "")]
2772+        tws = {}
2773+        tws[0] = (testvs, [(0, share)], None)
2774+        readv = []
2775+        results = write(storage_index, self.secrets, tws, readv)
2776+        self.failUnless(results[0])
2777+
2778+
2779+    def test_read(self):
2780+        self.write_test_share_to_server("si1")
2781+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2782+        # Check that every method returns what we expect it to.
2783+        d = defer.succeed(None)
2784+        def _check_block_and_salt((block, salt)):
2785+            self.failUnlessEqual(block, self.block)
2786+            self.failUnlessEqual(salt, self.salt)
2787+
2788+        for i in xrange(6):
2789+            d.addCallback(lambda ignored, i=i:
2790+                mr.get_block_and_salt(i))
2791+            d.addCallback(_check_block_and_salt)
2792+
2793+        d.addCallback(lambda ignored:
2794+            mr.get_encprivkey())
2795+        d.addCallback(lambda encprivkey:
2796+            self.failUnlessEqual(self.encprivkey, encprivkey))
2797+
2798+        d.addCallback(lambda ignored:
2799+            mr.get_blockhashes())
2800+        d.addCallback(lambda blockhashes:
2801+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
2802+
2803+        d.addCallback(lambda ignored:
2804+            mr.get_sharehashes())
2805+        d.addCallback(lambda sharehashes:
2806+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
2807+
2808+        d.addCallback(lambda ignored:
2809+            mr.get_signature())
2810+        d.addCallback(lambda signature:
2811+            self.failUnlessEqual(signature, self.signature))
2812+
2813+        d.addCallback(lambda ignored:
2814+            mr.get_verification_key())
2815+        d.addCallback(lambda verification_key:
2816+            self.failUnlessEqual(verification_key, self.verification_key))
2817+
2818+        d.addCallback(lambda ignored:
2819+            mr.get_seqnum())
2820+        d.addCallback(lambda seqnum:
2821+            self.failUnlessEqual(seqnum, 0))
2822+
2823+        d.addCallback(lambda ignored:
2824+            mr.get_root_hash())
2825+        d.addCallback(lambda root_hash:
2826+            self.failUnlessEqual(self.root_hash, root_hash))
2827+
2828+        d.addCallback(lambda ignored:
2829+            mr.get_seqnum())
2830+        d.addCallback(lambda seqnum:
2831+            self.failUnlessEqual(0, seqnum))
2832+
2833+        d.addCallback(lambda ignored:
2834+            mr.get_encoding_parameters())
2835+        def _check_encoding_parameters((k, n, segsize, datalen)):
2836+            self.failUnlessEqual(k, 3)
2837+            self.failUnlessEqual(n, 10)
2838+            self.failUnlessEqual(segsize, 6)
2839+            self.failUnlessEqual(datalen, 36)
2840+        d.addCallback(_check_encoding_parameters)
2841+
2842+        d.addCallback(lambda ignored:
2843+            mr.get_checkstring())
2844+        d.addCallback(lambda checkstring:
2845+            self.failUnlessEqual(checkstring, self.checkstring))
2846+        return d
2847+
2848+
2849+    def test_read_with_different_tail_segment_size(self):
2850+        self.write_test_share_to_server("si1", tail_segment=True)
2851+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2852+        d = mr.get_block_and_salt(5)
2853+        def _check_tail_segment(results):
2854+            block, salt = results
2855+            self.failUnlessEqual(len(block), 1)
2856+            self.failUnlessEqual(block, "a")
2857+        d.addCallback(_check_tail_segment)
2858+        return d
2859+
2860+
2861+    def test_get_block_with_invalid_segnum(self):
2862+        self.write_test_share_to_server("si1")
2863+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2864+        d = defer.succeed(None)
2865+        d.addCallback(lambda ignored:
2866+            self.shouldFail(LayoutInvalid, "test invalid segnum",
2867+                            None,
2868+                            mr.get_block_and_salt, 7))
2869+        return d
2870+
2871+
2872+    def test_get_encoding_parameters_first(self):
2873+        self.write_test_share_to_server("si1")
2874+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2875+        d = mr.get_encoding_parameters()
2876+        def _check_encoding_parameters((k, n, segment_size, datalen)):
2877+            self.failUnlessEqual(k, 3)
2878+            self.failUnlessEqual(n, 10)
2879+            self.failUnlessEqual(segment_size, 6)
2880+            self.failUnlessEqual(datalen, 36)
2881+        d.addCallback(_check_encoding_parameters)
2882+        return d
2883+
2884+
2885+    def test_get_seqnum_first(self):
2886+        self.write_test_share_to_server("si1")
2887+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2888+        d = mr.get_seqnum()
2889+        d.addCallback(lambda seqnum:
2890+            self.failUnlessEqual(seqnum, 0))
2891+        return d
2892+
2893+
2894+    def test_get_root_hash_first(self):
2895+        self.write_test_share_to_server("si1")
2896+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2897+        d = mr.get_root_hash()
2898+        d.addCallback(lambda root_hash:
2899+            self.failUnlessEqual(root_hash, self.root_hash))
2900+        return d
2901+
2902+
2903+    def test_get_checkstring_first(self):
2904+        self.write_test_share_to_server("si1")
2905+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2906+        d = mr.get_checkstring()
2907+        d.addCallback(lambda checkstring:
2908+            self.failUnlessEqual(checkstring, self.checkstring))
2909+        return d
2910+
2911+
2912+    def test_write_read_vectors(self):
2913+        # When it writes for us, the storage server returns the result
2914+        # of a read vector along with the write result. If a write
2915+        # fails because the test vectors failed, this read vector can
2916+        # help us diagnose the problem. This test ensures that the
2917+        # read vector is working appropriately.
2918+        mw = self._make_new_mw("si1", 0)
2919+
2920+        for i in xrange(6):
2921+            mw.put_block(self.block, i, self.salt)
2922+        mw.put_encprivkey(self.encprivkey)
2923+        mw.put_blockhashes(self.block_hash_tree)
2924+        mw.put_sharehashes(self.share_hash_chain)
2925+        mw.put_root_hash(self.root_hash)
2926+        mw.put_signature(self.signature)
2927+        mw.put_verification_key(self.verification_key)
2928+        d = mw.finish_publishing()
2929+        def _then(results):
2930+            self.failUnlessEqual(len(results), 2)
2931+            result, readv = results
2932+            self.failUnless(result)
2933+            self.failIf(readv)
2934+            self.old_checkstring = mw.get_checkstring()
2935+            mw.set_checkstring("")
2936+        d.addCallback(_then)
2937+        d.addCallback(lambda ignored:
2938+            mw.finish_publishing())
2939+        def _then_again(results):
2940+            self.failUnlessEqual(len(results), 2)
2941+            result, readvs = results
2942+            self.failIf(result)
2943+            self.failUnlessIn(0, readvs)
2944+            readv = readvs[0][0]
2945+            self.failUnlessEqual(readv, self.old_checkstring)
2946+        d.addCallback(_then_again)
2947+        # The checkstring remains the same for the rest of the process.
2948+        return d
2949+
2950+
2951+    def test_blockhashes_after_share_hash_chain(self):
2952+        mw = self._make_new_mw("si1", 0)
2953+        d = defer.succeed(None)
2954+        # Put everything up to and including the share hash chain
2955+        for i in xrange(6):
2956+            d.addCallback(lambda ignored, i=i:
2957+                mw.put_block(self.block, i, self.salt))
2958+        d.addCallback(lambda ignored:
2959+            mw.put_encprivkey(self.encprivkey))
2960+        d.addCallback(lambda ignored:
2961+            mw.put_blockhashes(self.block_hash_tree))
2962+        d.addCallback(lambda ignored:
2963+            mw.put_sharehashes(self.share_hash_chain))
2964+
2965+        # Now try to put the block hash tree again.
2966+        d.addCallback(lambda ignored:
2967+            self.shouldFail(LayoutInvalid, "test repeat blockhashes",
2968+                            None,
2969+                            mw.put_blockhashes, self.block_hash_tree))
2970+        return d
2971+
2972+
2973+    def test_encprivkey_after_blockhashes(self):
2974+        mw = self._make_new_mw("si1", 0)
2975+        d = defer.succeed(None)
2976+        # Put everything up to and including the block hash tree
2977+        for i in xrange(6):
2978+            d.addCallback(lambda ignored, i=i:
2979+                mw.put_block(self.block, i, self.salt))
2980+        d.addCallback(lambda ignored:
2981+            mw.put_encprivkey(self.encprivkey))
2982+        d.addCallback(lambda ignored:
2983+            mw.put_blockhashes(self.block_hash_tree))
2984+        d.addCallback(lambda ignored:
2985+            self.shouldFail(LayoutInvalid, "out of order private key",
2986+                            None,
2987+                            mw.put_encprivkey, self.encprivkey))
2988+        return d
2989+
2990+
2991+    def test_share_hash_chain_after_signature(self):
2992+        mw = self._make_new_mw("si1", 0)
2993+        d = defer.succeed(None)
2994+        # Put everything up to and including the signature
2995+        for i in xrange(6):
2996+            d.addCallback(lambda ignored, i=i:
2997+                mw.put_block(self.block, i, self.salt))
2998+        d.addCallback(lambda ignored:
2999+            mw.put_encprivkey(self.encprivkey))
3000+        d.addCallback(lambda ignored:
3001+            mw.put_blockhashes(self.block_hash_tree))
3002+        d.addCallback(lambda ignored:
3003+            mw.put_sharehashes(self.share_hash_chain))
3004+        d.addCallback(lambda ignored:
3005+            mw.put_root_hash(self.root_hash))
3006+        d.addCallback(lambda ignored:
3007+            mw.put_signature(self.signature))
3008+        # Now try to put the share hash chain again. This should fail
3009+        d.addCallback(lambda ignored:
3010+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
3011+                            None,
3012+                            mw.put_sharehashes, self.share_hash_chain))
3013+        return d
3014+
3015+
3016+    def test_signature_after_verification_key(self):
3017+        mw = self._make_new_mw("si1", 0)
3018+        d = defer.succeed(None)
3019+        # Put everything up to and including the verification key.
3020+        for i in xrange(6):
3021+            d.addCallback(lambda ignored, i=i:
3022+                mw.put_block(self.block, i, self.salt))
3023+        d.addCallback(lambda ignored:
3024+            mw.put_encprivkey(self.encprivkey))
3025+        d.addCallback(lambda ignored:
3026+            mw.put_blockhashes(self.block_hash_tree))
3027+        d.addCallback(lambda ignored:
3028+            mw.put_sharehashes(self.share_hash_chain))
3029+        d.addCallback(lambda ignored:
3030+            mw.put_root_hash(self.root_hash))
3031+        d.addCallback(lambda ignored:
3032+            mw.put_signature(self.signature))
3033+        d.addCallback(lambda ignored:
3034+            mw.put_verification_key(self.verification_key))
3035+        # Now try to put the signature again. This should fail
3036+        d.addCallback(lambda ignored:
3037+            self.shouldFail(LayoutInvalid, "signature after verification",
3038+                            None,
3039+                            mw.put_signature, self.signature))
3040+        return d
3041+
3042+
3043+    def test_uncoordinated_write(self):
3044+        # Make two mutable writers, both pointing to the same storage
3045+        # server, both at the same storage index, and try writing to the
3046+        # same share.
3047+        mw1 = self._make_new_mw("si1", 0)
3048+        mw2 = self._make_new_mw("si1", 0)
3049+
3050+        def _check_success(results):
3051+            result, readvs = results
3052+            self.failUnless(result)
3053+
3054+        def _check_failure(results):
3055+            result, readvs = results
3056+            self.failIf(result)
3057+
3058+        def _write_share(mw):
3059+            for i in xrange(6):
3060+                mw.put_block(self.block, i, self.salt)
3061+            mw.put_encprivkey(self.encprivkey)
3062+            mw.put_blockhashes(self.block_hash_tree)
3063+            mw.put_sharehashes(self.share_hash_chain)
3064+            mw.put_root_hash(self.root_hash)
3065+            mw.put_signature(self.signature)
3066+            mw.put_verification_key(self.verification_key)
3067+            return mw.finish_publishing()
3068+        d = _write_share(mw1)
3069+        d.addCallback(_check_success)
3070+        d.addCallback(lambda ignored:
3071+            _write_share(mw2))
3072+        d.addCallback(_check_failure)
3073+        return d
3074+
3075+
3076+    def test_invalid_salt_size(self):
3077+        # Salts need to be 16 bytes in size. Writes that attempt to
3078+        # write more or less than this should be rejected.
3079+        mw = self._make_new_mw("si1", 0)
3080+        invalid_salt = "a" * 17 # 17 bytes
3081+        another_invalid_salt = "b" * 15 # 15 bytes
3082+        d = defer.succeed(None)
3083+        d.addCallback(lambda ignored:
3084+            self.shouldFail(LayoutInvalid, "salt too big",
3085+                            None,
3086+                            mw.put_block, self.block, 0, invalid_salt))
3087+        d.addCallback(lambda ignored:
3088+            self.shouldFail(LayoutInvalid, "salt too small",
3089+                            None,
3090+                            mw.put_block, self.block, 0,
3091+                            another_invalid_salt))
3092+        return d
3093+
3094+
3095+    def test_write_test_vectors(self):
3096+        # If we give the write proxy a bogus test vector at
3097+        # any point during the process, it should fail to write when we
3098+        # tell it to write.
3099+        def _check_failure(results):
3100+            self.failUnlessEqual(len(results), 2)
3101+            res, readv = results
3102+            self.failIf(res)
3103+
3104+        def _check_success(results):
3105+            self.failUnlessEqual(len(results), 2)
3106+            res, readv = results
3107+            self.failUnless(res)
3108+
3109+        mw = self._make_new_mw("si1", 0)
3110+        mw.set_checkstring("this is a lie")
3111+        for i in xrange(6):
3112+            mw.put_block(self.block, i, self.salt)
3113+        mw.put_encprivkey(self.encprivkey)
3114+        mw.put_blockhashes(self.block_hash_tree)
3115+        mw.put_sharehashes(self.share_hash_chain)
3116+        mw.put_root_hash(self.root_hash)
3117+        mw.put_signature(self.signature)
3118+        mw.put_verification_key(self.verification_key)
3119+        d = mw.finish_publishing()
3120+        d.addCallback(_check_failure)
3121+        d.addCallback(lambda ignored:
3122+            mw.set_checkstring(""))
3123+        d.addCallback(lambda ignored:
3124+            mw.finish_publishing())
3125+        d.addCallback(_check_success)
3126+        return d
3127+
3128+
3129+    def serialize_blockhashes(self, blockhashes):
3130+        return "".join(blockhashes)
3131+
3132+
3133+    def serialize_sharehashes(self, sharehashes):
3134+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
3135+                        for i in sorted(sharehashes.keys())])
3136+        return ret
3137+
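+    # E.g. each entry packs to ">H32s" -- a 2-byte share number plus a
+    # 32-byte hash -- so the six-entry chain used in these tests
+    # occupies (2 + 32) * 6 = 204 bytes, the same (32 + 2) * 6 length
+    # read back in test_write below.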
3138+
3139+    def test_write(self):
3140+        # This translates to a file with 6 6-byte segments, and with 2-byte
3141+        # blocks.
3142+        mw = self._make_new_mw("si1", 0)
3143+        # Test writing some blocks.
3144+        read = self.ss.remote_slot_readv
3145+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
3146+        written_block_size = 2 + len(self.salt)
3147+        written_block = self.block + self.salt
3148+        for i in xrange(6):
3149+            mw.put_block(self.block, i, self.salt)
3150+
3151+        mw.put_encprivkey(self.encprivkey)
3152+        mw.put_blockhashes(self.block_hash_tree)
3153+        mw.put_sharehashes(self.share_hash_chain)
3154+        mw.put_root_hash(self.root_hash)
3155+        mw.put_signature(self.signature)
3156+        mw.put_verification_key(self.verification_key)
3157+        d = mw.finish_publishing()
3158+        def _check_publish(results):
3159+            self.failUnlessEqual(len(results), 2)
3160+            result, ign = results
3161+            self.failUnless(result, "publish failed")
3162+            for i in xrange(6):
3163+                self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
3164+                                {0: [written_block]})
3165+
3166+            expected_private_key_offset = expected_sharedata_offset + \
3167+                                      len(written_block) * 6
3168+            self.failUnlessEqual(len(self.encprivkey), 7)
3169+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
3170+                                 {0: [self.encprivkey]})
3171+
3172+            expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
3173+            self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
3174+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
3175+                                 {0: [self.block_hash_tree_s]})
3176+
3177+            expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
3178+            self.failUnlessEqual(read("si1", [0], [(expected_share_hash_offset, (32 + 2) * 6)]),
3179+                                 {0: [self.share_hash_chain_s]})
3180+
3181+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
3182+                                 {0: [self.root_hash]})
3183+            expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
3184+            self.failUnlessEqual(len(self.signature), 9)
3185+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
3186+                                 {0: [self.signature]})
3187+
3188+            expected_verification_key_offset = expected_signature_offset + len(self.signature)
3189+            self.failUnlessEqual(len(self.verification_key), 6)
3190+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
3191+                                 {0: [self.verification_key]})
3192+
3193+            signable = mw.get_signable()
3194+            verno, seq, roothash, k, n, segsize, datalen = \
3195+                                            struct.unpack(">BQ32sBBQQ",
3196+                                                          signable)
3197+            self.failUnlessEqual(verno, 1)
3198+            self.failUnlessEqual(seq, 0)
3199+            self.failUnlessEqual(roothash, self.root_hash)
3200+            self.failUnlessEqual(k, 3)
3201+            self.failUnlessEqual(n, 10)
3202+            self.failUnlessEqual(segsize, 6)
3203+            self.failUnlessEqual(datalen, 36)
3204+            expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
3205+
3206+            # Check the version number to make sure that it is correct.
3207+            expected_version_number = struct.pack(">B", 1)
3208+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
3209+                                 {0: [expected_version_number]})
3210+            # Check the sequence number to make sure that it is correct
3211+            expected_sequence_number = struct.pack(">Q", 0)
3212+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
3213+                                 {0: [expected_sequence_number]})
3214+            # Check that the encoding parameters (k, N, segment size, data
3215+            # length) are what they should be. These are 3, 10, 6, 36.
3216+            expected_k = struct.pack(">B", 3)
3217+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
3218+                                 {0: [expected_k]})
3219+            expected_n = struct.pack(">B", 10)
3220+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
3221+                                 {0: [expected_n]})
3222+            expected_segment_size = struct.pack(">Q", 6)
3223+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
3224+                                 {0: [expected_segment_size]})
3225+            expected_data_length = struct.pack(">Q", 36)
3226+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
3227+                                 {0: [expected_data_length]})
3228+            expected_offset = struct.pack(">Q", expected_private_key_offset)
3229+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
3230+                                 {0: [expected_offset]})
3231+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
3232+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
3233+                                 {0: [expected_offset]})
3234+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
3235+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
3236+                                 {0: [expected_offset]})
3237+            expected_offset = struct.pack(">Q", expected_signature_offset)
3238+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
3239+                                 {0: [expected_offset]})
3240+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
3241+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
3242+                                 {0: [expected_offset]})
3243+            expected_offset = struct.pack(">Q", expected_eof_offset)
3244+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
3245+                                 {0: [expected_offset]})
3246+        d.addCallback(_check_publish)
3247+        return d
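+
+    # Summary of the fixed header offsets verified above: version byte
+    # at 0, seqnum at 1, root hash at 9, k at 41, N at 42, segment
+    # size at 43, data length at 51, and the six offset-table entries
+    # at 59, 67, 75, 83, 91, and 99 -- 107 bytes of header in all.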
3248+
3249+    def _make_new_mw(self, si, share, datalength=36):
3250+        # This is a file of size 36 bytes. Since it has a segment
3251+        # size of 6, we know that it has 6 byte segments, which will
3252+        # be split into blocks of 2 bytes because our FEC k
3253+        # parameter is 3.
3254+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
3255+                                6, datalength)
3256+        return mw
3257+
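+    # For reference, the positional arguments passed above appear to
+    # be: share number, remote reference, storage index, write
+    # secrets, seqnum (0), k (3), N (10), segment size (6), and data
+    # length.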
3258+
3259+    def test_write_rejected_with_too_many_blocks(self):
3260+        mw = self._make_new_mw("si0", 0)
3261+
3262+        # Try writing too many blocks. We should not be able to write
3263+        # more than 6 blocks into each share.
3265+        d = defer.succeed(None)
3266+        for i in xrange(6):
3267+            d.addCallback(lambda ignored, i=i:
3268+                mw.put_block(self.block, i, self.salt))
3269+        d.addCallback(lambda ignored:
3270+            self.shouldFail(LayoutInvalid, "too many blocks",
3271+                            None,
3272+                            mw.put_block, self.block, 7, self.salt))
3273+        return d
3274+
3275+
3276+    def test_write_rejected_with_invalid_salt(self):
3277+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
3278+        # less should cause an error.
3279+        mw = self._make_new_mw("si1", 0)
3280+        bad_salt = "a" * 17 # 17 bytes
3281+        d = defer.succeed(None)
3282+        d.addCallback(lambda ignored:
3283+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
3284+                            None, mw.put_block, self.block, 7, bad_salt))
3285+        return d
3286+
3287+
3288+    def test_write_rejected_with_invalid_root_hash(self):
3289+        # Try writing an invalid root hash. This should be SHA256d, and
3290+        # 32 bytes long as a result.
3291+        mw = self._make_new_mw("si2", 0)
3292+        # 17 bytes != 32 bytes
3293+        invalid_root_hash = "a" * 17
3294+        d = defer.succeed(None)
3295+        # Before this test can work, we need to put some blocks + salts,
3296+        # a block hash tree, and a share hash tree. Otherwise, we'll see
3297+        # failures that match what we are looking for, but are caused by
3298+        # the constraints imposed on operation ordering.
3299+        for i in xrange(6):
3300+            d.addCallback(lambda ignored, i=i:
3301+                mw.put_block(self.block, i, self.salt))
3302+        d.addCallback(lambda ignored:
3303+            mw.put_encprivkey(self.encprivkey))
3304+        d.addCallback(lambda ignored:
3305+            mw.put_blockhashes(self.block_hash_tree))
3306+        d.addCallback(lambda ignored:
3307+            mw.put_sharehashes(self.share_hash_chain))
3308+        d.addCallback(lambda ignored:
3309+            self.shouldFail(LayoutInvalid, "invalid root hash",
3310+                            None, mw.put_root_hash, invalid_root_hash))
3311+        return d
3312+
3313+
3314+    def test_write_rejected_with_invalid_blocksize(self):
3315+        # The blocksize implied by the writer that we get from
3316+        # _make_new_mw is 2 bytes -- any more or any less than this
3317+        # should cause a failure, unless it is the tail segment, in
3318+        # which case it need not fail.
3319+        invalid_block = "a"
3320+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
3321+                                             # one-byte blocks
3322+        # 1 byte != 2 bytes
3323+        d = defer.succeed(None)
3324+        d.addCallback(lambda ignored, invalid_block=invalid_block:
3325+            self.shouldFail(LayoutInvalid, "test blocksize too small",
3326+                            None, mw.put_block, invalid_block, 0,
3327+                            self.salt))
3328+        invalid_block = invalid_block * 3
3329+        # 3 bytes != 2 bytes
3330+        d.addCallback(lambda ignored:
3331+            self.shouldFail(LayoutInvalid, "test blocksize too large",
3332+                            None,
3333+                            mw.put_block, invalid_block, 0, self.salt))
3334+        for i in xrange(5):
3335+            d.addCallback(lambda ignored, i=i:
3336+                mw.put_block(self.block, i, self.salt))
3337+        # Try to put an invalid tail segment
3338+        d.addCallback(lambda ignored:
3339+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
3340+                            None,
3341+                            mw.put_block, self.block, 5, self.salt))
3342+        valid_block = "a"
3343+        d.addCallback(lambda ignored:
3344+            mw.put_block(valid_block, 5, self.salt))
3345+        return d
3346+
3347+
3348+    def test_write_enforces_order_constraints(self):
3349+        # We require that the MDMFSlotWriteProxy be interacted with in a
3350+        # specific way.
3351+        # That way is:
3352+        # 0: __init__
3353+        # 1: write blocks and salts
3354+        # 2: Write the encrypted private key
3355+        # 3: Write the block hashes
3356+        # 4: Write the share hashes
3357+        # 5: Write the root hash
3358+        # 6: Write the signature and verification key
3359+        # 7: Write the file.
3360+        #
3361+        # Some of these can be performed out-of-order, and some can't.
3362+        # The dependencies that I want to test here are:
3363+        #  - Private key before block hashes
3364+        #  - share hashes and block hashes before root hash
3365+        #  - root hash before signature
3366+        #  - signature before verification key
3367+        mw0 = self._make_new_mw("si0", 0)
3368+        # Write some shares
3369+        d = defer.succeed(None)
3370+        for i in xrange(6):
3371+            d.addCallback(lambda ignored, i=i:
3372+                mw0.put_block(self.block, i, self.salt))
3373+        # Try to write the block hashes before writing the encrypted
3374+        # private key
3375+        d.addCallback(lambda ignored:
3376+            self.shouldFail(LayoutInvalid, "block hashes before key",
3377+                            None, mw0.put_blockhashes,
3378+                            self.block_hash_tree))
3379+
3380+        # Write the private key.
3381+        d.addCallback(lambda ignored:
3382+            mw0.put_encprivkey(self.encprivkey))
3383+
3384+
3385+        # Try to write the share hash chain without writing the block
3386+        # hash tree
3387+        d.addCallback(lambda ignored:
3388+            self.shouldFail(LayoutInvalid, "share hash chain before "
3389+                                           "block hash tree",
3390+                            None,
3391+                            mw0.put_sharehashes, self.share_hash_chain))
3392+
3393+        # Try to write the root hash without writing either the
3394+        # block hashes or the share hashes
3395+        d.addCallback(lambda ignored:
3396+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
3397+                            None,
3398+                            mw0.put_root_hash, self.root_hash))
3399+
3400+        # Now write the block hashes and try again
3401+        d.addCallback(lambda ignored:
3402+            mw0.put_blockhashes(self.block_hash_tree))
3403+
3404+        d.addCallback(lambda ignored:
3405+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
3406+                            None, mw0.put_root_hash, self.root_hash))
3407+
3408+        # We haven't yet put the root hash on the share, so we shouldn't
3409+        # be able to sign it.
3410+        d.addCallback(lambda ignored:
3411+            self.shouldFail(LayoutInvalid, "signature before root hash",
3412+                            None, mw0.put_signature, self.signature))
3413+
3414+        d.addCallback(lambda ignored:
3415+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
3416+
3417+        # ..and, since that fails, we also shouldn't be able to put the
3418+        # verification key.
3419+        d.addCallback(lambda ignored:
3420+            self.shouldFail(LayoutInvalid, "key before signature",
3421+                            None, mw0.put_verification_key,
3422+                            self.verification_key))
3423+
3424+        # Now write the share hashes.
3425+        d.addCallback(lambda ignored:
3426+            mw0.put_sharehashes(self.share_hash_chain))
3427+        # We should be able to write the root hash now too
3428+        d.addCallback(lambda ignored:
3429+            mw0.put_root_hash(self.root_hash))
3430+
3431+        # We should still be unable to put the verification key
3432+        d.addCallback(lambda ignored:
3433+            self.shouldFail(LayoutInvalid, "key before signature",
3434+                            None, mw0.put_verification_key,
3435+                            self.verification_key))
3436+
3437+        d.addCallback(lambda ignored:
3438+            mw0.put_signature(self.signature))
3439+
3440+        # We shouldn't be able to write the offsets to the remote server
3441+        # until the offset table is finished; IOW, until we have written
3442+        # the verification key.
3443+        d.addCallback(lambda ignored:
3444+            self.shouldFail(LayoutInvalid, "offsets before verification key",
3445+                            None,
3446+                            mw0.finish_publishing))
3447+
3448+        d.addCallback(lambda ignored:
3449+            mw0.put_verification_key(self.verification_key))
3450+        return d
3451+
3452+
3453+    def test_end_to_end(self):
3454+        mw = self._make_new_mw("si1", 0)
3455+        # Write a share using the mutable writer, and make sure that the
3456+        # reader knows how to read everything back to us.
3457+        d = defer.succeed(None)
3458+        for i in xrange(6):
3459+            d.addCallback(lambda ignored, i=i:
3460+                mw.put_block(self.block, i, self.salt))
3461+        d.addCallback(lambda ignored:
3462+            mw.put_encprivkey(self.encprivkey))
3463+        d.addCallback(lambda ignored:
3464+            mw.put_blockhashes(self.block_hash_tree))
3465+        d.addCallback(lambda ignored:
3466+            mw.put_sharehashes(self.share_hash_chain))
3467+        d.addCallback(lambda ignored:
3468+            mw.put_root_hash(self.root_hash))
3469+        d.addCallback(lambda ignored:
3470+            mw.put_signature(self.signature))
3471+        d.addCallback(lambda ignored:
3472+            mw.put_verification_key(self.verification_key))
3473+        d.addCallback(lambda ignored:
3474+            mw.finish_publishing())
3475+
3476+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3477+        def _check_block_and_salt((block, salt)):
3478+            self.failUnlessEqual(block, self.block)
3479+            self.failUnlessEqual(salt, self.salt)
3480+
3481+        for i in xrange(6):
3482+            d.addCallback(lambda ignored, i=i:
3483+                mr.get_block_and_salt(i))
3484+            d.addCallback(_check_block_and_salt)
3485+
3486+        d.addCallback(lambda ignored:
3487+            mr.get_encprivkey())
3488+        d.addCallback(lambda encprivkey:
3489+            self.failUnlessEqual(self.encprivkey, encprivkey))
3490+
3491+        d.addCallback(lambda ignored:
3492+            mr.get_blockhashes())
3493+        d.addCallback(lambda blockhashes:
3494+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
3495+
3496+        d.addCallback(lambda ignored:
3497+            mr.get_sharehashes())
3498+        d.addCallback(lambda sharehashes:
3499+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
3500+
3501+        d.addCallback(lambda ignored:
3502+            mr.get_signature())
3503+        d.addCallback(lambda signature:
3504+            self.failUnlessEqual(signature, self.signature))
3505+
3506+        d.addCallback(lambda ignored:
3507+            mr.get_verification_key())
3508+        d.addCallback(lambda verification_key:
3509+            self.failUnlessEqual(verification_key, self.verification_key))
3510+
3511+        d.addCallback(lambda ignored:
3512+            mr.get_seqnum())
3513+        d.addCallback(lambda seqnum:
3514+            self.failUnlessEqual(seqnum, 0))
3515+
3516+        d.addCallback(lambda ignored:
3517+            mr.get_root_hash())
3518+        d.addCallback(lambda root_hash:
3519+            self.failUnlessEqual(self.root_hash, root_hash))
3520+
3521+        d.addCallback(lambda ignored:
3522+            mr.get_encoding_parameters())
3523+        def _check_encoding_parameters((k, n, segsize, datalen)):
3524+            self.failUnlessEqual(k, 3)
3525+            self.failUnlessEqual(n, 10)
3526+            self.failUnlessEqual(segsize, 6)
3527+            self.failUnlessEqual(datalen, 36)
3528+        d.addCallback(_check_encoding_parameters)
3529+
3530+        d.addCallback(lambda ignored:
3531+            mr.get_checkstring())
3532+        d.addCallback(lambda checkstring:
3533+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
3534+        return d
3535+
3536+
3537+    def test_is_sdmf(self):
3538+        # The MDMFSlotReadProxy should also know how to read SDMF files,
3539+        # since it will encounter them on the grid. Callers use the
3540+        # is_sdmf method to test this.
3541+        self.write_sdmf_share_to_server("si1")
3542+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3543+        d = mr.is_sdmf()
3544+        d.addCallback(lambda issdmf:
3545+            self.failUnless(issdmf))
3546+        return d
3547+
3548+
3549+    def test_reads_sdmf(self):
3550+        # The slot read proxy should, naturally, know how to tell us
3551+        # about data in the SDMF format
3552+        self.write_sdmf_share_to_server("si1")
3553+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3554+        d = defer.succeed(None)
3555+        d.addCallback(lambda ignored:
3556+            mr.is_sdmf())
3557+        d.addCallback(lambda issdmf:
3558+            self.failUnless(issdmf))
3559+
3560+        # What do we need to read?
3561+        #  - The sharedata
3562+        #  - The salt
3563+        d.addCallback(lambda ignored:
3564+            mr.get_block_and_salt(0))
3565+        def _check_block_and_salt(results):
3566+            block, salt = results
3567+            # Our original file is 36 bytes long, so each share is 12
3568+            # bytes in size. The share is composed entirely of the
3569+            # letter a. self.block contains two a's, so 6 * self.block
3570+            # is what we are looking for.
3571+            self.failUnlessEqual(block, self.block * 6)
3572+            self.failUnlessEqual(salt, self.salt)
3573+        d.addCallback(_check_block_and_salt)
3574+
3575+        #  - The blockhashes
3576+        d.addCallback(lambda ignored:
3577+            mr.get_blockhashes())
3578+        d.addCallback(lambda blockhashes:
3579+            self.failUnlessEqual(self.block_hash_tree,
3580+                                 blockhashes,
3581+                                 blockhashes))
3582+        #  - The sharehashes
3583+        d.addCallback(lambda ignored:
3584+            mr.get_sharehashes())
3585+        d.addCallback(lambda sharehashes:
3586+            self.failUnlessEqual(self.share_hash_chain,
3587+                                 sharehashes))
3588+        #  - The keys
3589+        d.addCallback(lambda ignored:
3590+            mr.get_encprivkey())
3591+        d.addCallback(lambda encprivkey:
3592+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
3593+        d.addCallback(lambda ignored:
3594+            mr.get_verification_key())
3595+        d.addCallback(lambda verification_key:
3596+            self.failUnlessEqual(verification_key,
3597+                                 self.verification_key,
3598+                                 verification_key))
3599+        #  - The signature
3600+        d.addCallback(lambda ignored:
3601+            mr.get_signature())
3602+        d.addCallback(lambda signature:
3603+            self.failUnlessEqual(signature, self.signature, signature))
3604+
3605+        #  - The sequence number
3606+        d.addCallback(lambda ignored:
3607+            mr.get_seqnum())
3608+        d.addCallback(lambda seqnum:
3609+            self.failUnlessEqual(seqnum, 0, seqnum))
3610+
3611+        #  - The root hash
3612+        d.addCallback(lambda ignored:
3613+            mr.get_root_hash())
3614+        d.addCallback(lambda root_hash:
3615+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
3616+        return d
3617+
3618+
3619+    def test_only_reads_one_segment_sdmf(self):
3620+        # SDMF shares have only one segment, so it doesn't make sense to
3621+        # read more segments than that. The reader should know this and
3622+        # complain if we try to do that.
3623+        self.write_sdmf_share_to_server("si1")
3624+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3625+        d = defer.succeed(None)
3626+        d.addCallback(lambda ignored:
3627+            mr.is_sdmf())
3628+        d.addCallback(lambda issdmf:
3629+            self.failUnless(issdmf))
3630+        d.addCallback(lambda ignored:
3631+            self.shouldFail(LayoutInvalid, "test bad segment",
3632+                            None,
3633+                            mr.get_block_and_salt, 1))
3634+        return d
3635+
3636+
3637+    def test_read_with_prefetched_mdmf_data(self):
3638+        # The MDMFSlotReadProxy will prefill certain fields if you pass
3639+        # it data that you have already fetched. This is useful for
3640+        # cases like the Servermap, which prefetches ~2kb of data while
3641+        # finding out which shares are on the remote peer so that it
3642+        # doesn't waste round trips.
3643+        mdmf_data = self.build_test_mdmf_share()
3644+        self.write_test_share_to_server("si1")
3645+        def _make_mr(ignored, length):
3646+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
3647+            return mr
3648+
3649+        d = defer.succeed(None)
3650+        # This should be enough to fill in both the encoding parameters
3651+        # and the table of offsets, which will complete the version
3652+        # information tuple.
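+        # (107 bytes = the 59-byte fixed header plus the 48-byte
+        # offset table; see the layout sketch after
+        # build_test_mdmf_share.)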
3653+        d.addCallback(_make_mr, 107)
3654+        d.addCallback(lambda mr:
3655+            mr.get_verinfo())
3656+        def _check_verinfo(verinfo):
3657+            self.failUnless(verinfo)
3658+            self.failUnlessEqual(len(verinfo), 9)
3659+            (seqnum,
3660+             root_hash,
3661+             salt_hash,
3662+             segsize,
3663+             datalen,
3664+             k,
3665+             n,
3666+             prefix,
3667+             offsets) = verinfo
3668+            self.failUnlessEqual(seqnum, 0)
3669+            self.failUnlessEqual(root_hash, self.root_hash)
3670+            self.failUnlessEqual(segsize, 6)
3671+            self.failUnlessEqual(datalen, 36)
3672+            self.failUnlessEqual(k, 3)
3673+            self.failUnlessEqual(n, 10)
3674+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
3675+                                          1,
3676+                                          seqnum,
3677+                                          root_hash,
3678+                                          k,
3679+                                          n,
3680+                                          segsize,
3681+                                          datalen)
3682+            self.failUnlessEqual(expected_prefix, prefix)
3683+            self.failUnlessEqual(self.rref.read_count, 0)
3684+        d.addCallback(_check_verinfo)
3685+        # This is not enough data to read a block and its salt, so the
3686+        # wrapper should fetch them from the remote server.
3687+        d.addCallback(_make_mr, 107)
3688+        d.addCallback(lambda mr:
3689+            mr.get_block_and_salt(0))
3690+        def _check_block_and_salt((block, salt)):
3691+            self.failUnlessEqual(block, self.block)
3692+            self.failUnlessEqual(salt, self.salt)
3693+            self.failUnlessEqual(self.rref.read_count, 1)
3694+        # This should be enough data to read one block.
3695+        d.addCallback(_make_mr, 249)
3696+        d.addCallback(lambda mr:
3697+            mr.get_block_and_salt(0))
3698+        d.addCallback(_check_block_and_salt)
3699+        return d
3700+
3701+
3702+    def test_read_with_prefetched_sdmf_data(self):
3703+        sdmf_data = self.build_test_sdmf_share()
3704+        self.write_sdmf_share_to_server("si1")
3705+        def _make_mr(ignored, length):
3706+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
3707+            return mr
3708+
3709+        d = defer.succeed(None)
3710+        # This should be enough to get us the encoding parameters,
3711+        # offset table, and everything else we need to build a verinfo
3712+        # string.
3713+        d.addCallback(_make_mr, 107)
3714+        d.addCallback(lambda mr:
3715+            mr.get_verinfo())
3716+        def _check_verinfo(verinfo):
3717+            self.failUnless(verinfo)
3718+            self.failUnlessEqual(len(verinfo), 9)
3719+            (seqnum,
3720+             root_hash,
3721+             salt,
3722+             segsize,
3723+             datalen,
3724+             k,
3725+             n,
3726+             prefix,
3727+             offsets) = verinfo
3728+            self.failUnlessEqual(seqnum, 0)
3729+            self.failUnlessEqual(root_hash, self.root_hash)
3730+            self.failUnlessEqual(salt, self.salt)
3731+            self.failUnlessEqual(segsize, 36)
3732+            self.failUnlessEqual(datalen, 36)
3733+            self.failUnlessEqual(k, 3)
3734+            self.failUnlessEqual(n, 10)
3735+            expected_prefix = struct.pack(SIGNED_PREFIX,
3736+                                          0,
3737+                                          seqnum,
3738+                                          root_hash,
3739+                                          salt,
3740+                                          k,
3741+                                          n,
3742+                                          segsize,
3743+                                          datalen)
3744+            self.failUnlessEqual(expected_prefix, prefix)
3745+            self.failUnlessEqual(self.rref.read_count, 0)
3746+        d.addCallback(_check_verinfo)
3747+        # This shouldn't be enough to read any share data.
3748+        d.addCallback(_make_mr, 107)
3749+        d.addCallback(lambda mr:
3750+            mr.get_block_and_salt(0))
3751+        def _check_block_and_salt((block, salt)):
3752+            self.failUnlessEqual(block, self.block * 6)
3753+            self.failUnlessEqual(salt, self.salt)
3754+            # TODO: Fix the read routine so that it reads only the data
3755+            #       that it has cached if it can't read all of it.
3756+            self.failUnlessEqual(self.rref.read_count, 2)
+        d.addCallback(_check_block_and_salt)
3757+
3758+        # This should be enough to read share data.
3759+        d.addCallback(_make_mr, self.offsets['share_data'])
3760+        d.addCallback(lambda mr:
3761+            mr.get_block_and_salt(0))
3762+        d.addCallback(_check_block_and_salt)
3763+        return d
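
The TODO above concerns the prefetch cache: a request wholly covered by the
prefetched bytes should cost zero remote reads, while anything else
currently falls through to the server for the whole span. A toy sketch of
that decision (a hypothetical helper, not the reader's actual code path):

    def read_with_cache(cache, remote_read, offset, length):
        # Served entirely from the prefetched bytes: no remote traffic.
        if offset + length <= len(cache):
            return cache[offset:offset+length]
        # Otherwise go to the server; per the TODO above, mixing cached
        # and remote bytes in one request is not implemented yet.
        return remote_read(offset, length)
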
3764+
3765+
3766+    def test_read_with_empty_mdmf_file(self):
3767+        # Some tests upload a file with no contents to exercise
3768+        # behavior that is unrelated to the file's content. The reader
3769+        # should behave sensibly in these cases.
3770+        self.write_test_share_to_server("si1", empty=True)
3771+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3772+        # We should be able to get the encoding parameters, and they
3773+        # should be correct.
3774+        d = defer.succeed(None)
3775+        d.addCallback(lambda ignored:
3776+            mr.get_encoding_parameters())
3777+        def _check_encoding_parameters(params):
3778+            self.failUnlessEqual(len(params), 4)
3779+            k, n, segsize, datalen = params
3780+            self.failUnlessEqual(k, 3)
3781+            self.failUnlessEqual(n, 10)
3782+            self.failUnlessEqual(segsize, 0)
3783+            self.failUnlessEqual(datalen, 0)
3784+        d.addCallback(_check_encoding_parameters)
3785+
3786+        # We should not be able to fetch a block, since there are no
3787+        # blocks to fetch
3788+        d.addCallback(lambda ignored:
3789+            self.shouldFail(LayoutInvalid, "get block on empty file",
3790+                            None,
3791+                            mr.get_block_and_salt, 0))
3792+        return d
3793+
3794+
3795+    def test_read_with_empty_sdmf_file(self):
3796+        self.write_sdmf_share_to_server("si1", empty=True)
3797+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3798+        # We should be able to get the encoding parameters, and they
3799+        # should be correct
3800+        d = defer.succeed(None)
3801+        d.addCallback(lambda ignored:
3802+            mr.get_encoding_parameters())
3803+        def _check_encoding_parameters(params):
3804+            self.failUnlessEqual(len(params), 4)
3805+            k, n, segsize, datalen = params
3806+            self.failUnlessEqual(k, 3)
3807+            self.failUnlessEqual(n, 10)
3808+            self.failUnlessEqual(segsize, 0)
3809+            self.failUnlessEqual(datalen, 0)
3810+        d.addCallback(_check_encoding_parameters)
3811+
3812+        # It does not make sense to get a block in this format, so we
3813+        # should not be able to.
3814+        d.addCallback(lambda ignored:
3815+            self.shouldFail(LayoutInvalid, "get block on an empty file",
3816+                            None,
3817+                            mr.get_block_and_salt, 0))
3818+        return d
3819+
3820+
3821+    def test_verinfo_with_sdmf_file(self):
3822+        self.write_sdmf_share_to_server("si1")
3823+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3824+        # We should be able to get the version information.
3825+        d = defer.succeed(None)
3826+        d.addCallback(lambda ignored:
3827+            mr.get_verinfo())
3828+        def _check_verinfo(verinfo):
3829+            self.failUnless(verinfo)
3830+            self.failUnlessEqual(len(verinfo), 9)
3831+            (seqnum,
3832+             root_hash,
3833+             salt,
3834+             segsize,
3835+             datalen,
3836+             k,
3837+             n,
3838+             prefix,
3839+             offsets) = verinfo
3840+            self.failUnlessEqual(seqnum, 0)
3841+            self.failUnlessEqual(root_hash, self.root_hash)
3842+            self.failUnlessEqual(salt, self.salt)
3843+            self.failUnlessEqual(segsize, 36)
3844+            self.failUnlessEqual(datalen, 36)
3845+            self.failUnlessEqual(k, 3)
3846+            self.failUnlessEqual(n, 10)
3847+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
3848+                                          0,
3849+                                          seqnum,
3850+                                          root_hash,
3851+                                          salt,
3852+                                          k,
3853+                                          n,
3854+                                          segsize,
3855+                                          datalen)
3856+            self.failUnlessEqual(prefix, expected_prefix)
3857+            self.failUnlessEqual(offsets, self.offsets)
3858+        d.addCallback(_check_verinfo)
3859+        return d
3860+
3861+
3862+    def test_verinfo_with_mdmf_file(self):
3863+        self.write_test_share_to_server("si1")
3864+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3865+        d = defer.succeed(None)
3866+        d.addCallback(lambda ignored:
3867+            mr.get_verinfo())
3868+        def _check_verinfo(verinfo):
3869+            self.failUnless(verinfo)
3870+            self.failUnlessEqual(len(verinfo), 9)
3871+            (seqnum,
3872+             root_hash,
3873+             IV,
3874+             segsize,
3875+             datalen,
3876+             k,
3877+             n,
3878+             prefix,
3879+             offsets) = verinfo
3880+            self.failUnlessEqual(seqnum, 0)
3881+            self.failUnlessEqual(root_hash, self.root_hash)
3882+            self.failIf(IV)
3883+            self.failUnlessEqual(segsize, 6)
3884+            self.failUnlessEqual(datalen, 36)
3885+            self.failUnlessEqual(k, 3)
3886+            self.failUnlessEqual(n, 10)
3887+            expected_prefix = struct.pack(">BQ32s BBQQ",
3888+                                          1,
3889+                                          seqnum,
3890+                                          root_hash,
3891+                                          k,
3892+                                          n,
3893+                                          segsize,
3894+                                          datalen)
3895+            self.failUnlessEqual(prefix, expected_prefix)
3896+            self.failUnlessEqual(offsets, self.offsets)
3897+        d.addCallback(_check_verinfo)
3898+        return d
3899+
3900+
3901+    def test_reader_queue(self):
3902+        self.write_test_share_to_server('si1')
3903+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3904+        d1 = mr.get_block_and_salt(0, queue=True)
3905+        d2 = mr.get_blockhashes(queue=True)
3906+        d3 = mr.get_sharehashes(queue=True)
3907+        d4 = mr.get_signature(queue=True)
3908+        d5 = mr.get_verification_key(queue=True)
3909+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
3910+        mr.flush()
3911+        def _print(results):
3912+            self.failUnlessEqual(len(results), 5)
3913+            # We have one read for version information and offsets, and
3914+            # one for everything else.
3915+            self.failUnlessEqual(self.rref.read_count, 2)
3916+            block, salt = results[0][1] # each result is a (success,
3917+                                           # value) pair; [0] says whether
3918+                                           # the call worked, [1] is its value.
3919+            self.failUnlessEqual(self.block, block)
3920+            self.failUnlessEqual(self.salt, salt)
3921+
3922+            blockhashes = results[1][1]
3923+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
3924+
3925+            sharehashes = results[2][1]
3926+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
3927+
3928+            signature = results[3][1]
3929+            self.failUnlessEqual(self.signature, signature)
3930+
3931+            verification_key = results[4][1]
3932+            self.failUnlessEqual(self.verification_key, verification_key)
3933+        dl.addCallback(_print)
3934+        return dl
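
The queue=True/flush() pattern above batches several logical reads into as
few remote calls as possible. Reduced to a toy model (a hypothetical class,
not MDMFSlotReadProxy's real internals, and synchronous where the real
proxy is not):

    from twisted.internet import defer

    class CoalescingReader(object):
        def __init__(self, fetch_many):
            self._fetch_many = fetch_many  # callable: [(offset, length)] -> [bytes]
            self._pending = []             # [(offset, length, Deferred)]

        def read(self, offset, length, queue=False):
            d = defer.Deferred()
            self._pending.append((offset, length, d))
            if not queue:
                self.flush()
            return d

        def flush(self):
            # One remote call satisfies every queued request.
            pending, self._pending = self._pending, []
            results = self._fetch_many([(o, l) for (o, l, _) in pending])
            for (_, _, d), data in zip(pending, results):
                d.callback(data)
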
3935+
3936+
3937+    def test_sdmf_writer(self):
3938+        # Go through the motions of writing an SDMF share to the storage
3939+        # server. Then read the share back from the storage server to
3940+        # verify that it was written the way we think it should have been.
3941+
3942+        # We do this first so that the necessary instance variables get
3943+        # set the way we want them for the tests below.
3944+        data = self.build_test_sdmf_share()
3945+        sdmfr = SDMFSlotWriteProxy(0,
3946+                                   self.rref,
3947+                                   "si1",
3948+                                   self.secrets,
3949+                                   0, 3, 10, 36, 36)
3950+        # Put the block and salt.
3951+        sdmfr.put_block(self.blockdata, 0, self.salt)
3952+
3953+        # Put the encprivkey
3954+        sdmfr.put_encprivkey(self.encprivkey)
3955+
3956+        # Put the block and share hash chains
3957+        sdmfr.put_blockhashes(self.block_hash_tree)
3958+        sdmfr.put_sharehashes(self.share_hash_chain)
3959+        sdmfr.put_root_hash(self.root_hash)
3960+
3961+        # Put the signature
3962+        sdmfr.put_signature(self.signature)
3963+
3964+        # Put the verification key
3965+        sdmfr.put_verification_key(self.verification_key)
3966+
3967+        # Now check to make sure that nothing has been written yet.
3968+        self.failUnlessEqual(self.rref.write_count, 0)
3969+
3970+        # Now finish publishing
3971+        d = sdmfr.finish_publishing()
3972+        def _then(ignored):
3973+            self.failUnlessEqual(self.rref.write_count, 1)
3974+            read = self.ss.remote_slot_readv
3975+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
3976+                                 {0: [data]})
3977+        d.addCallback(_then)
3978+        return d
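
The write_count assertions work because the SDMF write proxy buffers every
put_*() locally and sends the whole share in a single remote call when
finish_publishing() runs. Schematically (hypothetical names; the real
proxy issues one slot write against the same remote_slot_* interface these
tests read back from):

    class BufferedSlotWriter(object):
        def __init__(self, remote_write):
            self._remote_write = remote_write
            self._writev = []            # accumulated (offset, data) pairs

        def put(self, offset, data):
            # No remote traffic yet -- just remember what to write.
            self._writev.append((offset, data))

        def finish_publishing(self):
            # Exactly one remote call, covering everything buffered above.
            return self._remote_write(self._writev)
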
3979+
3980+
3981+    def test_sdmf_writer_preexisting_share(self):
3982+        data = self.build_test_sdmf_share()
3983+        self.write_sdmf_share_to_server("si1")
3984+
3985+        # Now there is a share on the storage server. To successfully
3986+        # write, we need to set the checkstring correctly. When we
3987+        # don't, no write should occur.
3988+        sdmfw = SDMFSlotWriteProxy(0,
3989+                                   self.rref,
3990+                                   "si1",
3991+                                   self.secrets,
3992+                                   1, 3, 10, 36, 36)
3993+        sdmfw.put_block(self.blockdata, 0, self.salt)
3994+
3995+        # Put the encprivkey
3996+        sdmfw.put_encprivkey(self.encprivkey)
3997+
3998+        # Put the block and share hash chains
3999+        sdmfw.put_blockhashes(self.block_hash_tree)
4000+        sdmfw.put_sharehashes(self.share_hash_chain)
4001+
4002+        # Put the root hash
4003+        sdmfw.put_root_hash(self.root_hash)
4004+
4005+        # Put the signature
4006+        sdmfw.put_signature(self.signature)
4007+
4008+        # Put the verification key
4009+        sdmfw.put_verification_key(self.verification_key)
4010+
4011+        # We shouldn't have a checkstring yet
4012+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
4013+
4014+        d = sdmfw.finish_publishing()
4015+        def _then(results):
4016+            self.failIf(results[0])
4017+            # this is the correct checkstring
4018+            self._expected_checkstring = results[1][0][0]
4019+            return self._expected_checkstring
4020+
4021+        d.addCallback(_then)
4022+        d.addCallback(sdmfw.set_checkstring)
4023+        d.addCallback(lambda ignored:
4024+            sdmfw.get_checkstring())
4025+        d.addCallback(lambda checkstring:
4026+            self.failUnlessEqual(checkstring, self._expected_checkstring))
4027+        d.addCallback(lambda ignored:
4028+            sdmfw.finish_publishing())
4029+        def _then_again(results):
4030+            self.failUnless(results[0])
4031+            read = self.ss.remote_slot_readv
4032+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
4033+                                 {0: [struct.pack(">Q", 1)]})
4034+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
4035+                                 {0: [data[9:]]})
4036+        d.addCallback(_then_again)
4037+        return d
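
This retry works because slot writes are test-and-set: the server compares
the share's current header against the checkstring the writer supplies,
rejects the write on a mismatch, and returns the current state so that the
writer can adopt it and try again. A sketch of the checkstring being
matched (the format is assumed from the prefix packing earlier in this
file, not quoted from layout.py):

    import struct

    # Assumed SDMF checkstring: the leading (version, seqnum, root_hash,
    # IV) slice of the signed prefix. The reads at the end of the test
    # confirm that the seqnum at offset 1 was bumped to 1.
    def make_checkstring(seqnum, root_hash, salt, version=0):
        return struct.pack(">BQ32s16s", version, seqnum, root_hash, salt)
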
4038+
4039+
4040 class Stats(unittest.TestCase):
4041 
4042     def setUp(self):
4043}
4044[immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
4045Kevan Carstensen <kevan@isnotajoke.com>**20100810000619
4046 Ignore-this: 93e536c0f8efb705310f13ff64621527
4047] {
4048hunk ./src/allmydata/immutable/filenode.py 8
4049 now = time.time
4050 from zope.interface import implements, Interface
4051 from twisted.internet import defer
4052-from twisted.internet.interfaces import IConsumer
4053 
4054hunk ./src/allmydata/immutable/filenode.py 9
4055-from allmydata.interfaces import IImmutableFileNode, IUploadResults
4056 from allmydata import uri
4057hunk ./src/allmydata/immutable/filenode.py 10
4058+from twisted.internet.interfaces import IConsumer
4059+from twisted.protocols import basic
4060+from foolscap.api import eventually
4061+from allmydata.interfaces import IImmutableFileNode, ICheckable, \
4062+     IDownloadTarget, IUploadResults
4063+from allmydata.util import dictutil, log, base32, consumer
4064+from allmydata.immutable.checker import Checker
4065 from allmydata.check_results import CheckResults, CheckAndRepairResults
4066 from allmydata.util.dictutil import DictOfSets
4067 from pycryptopp.cipher.aes import AES
4068hunk ./src/allmydata/immutable/filenode.py 296
4069         return self._cnode.check_and_repair(monitor, verify, add_lease)
4070     def check(self, monitor, verify=False, add_lease=False):
4071         return self._cnode.check(monitor, verify, add_lease)
4072+
4073+    def get_best_readable_version(self):
4074+        """
4075+        Return an IReadable of the best version of this file. Since
4076+        immutable files can have only one version, we just return the
4077+        current filenode.
4078+        """
4079+        return defer.succeed(self)
4080+
4081+
4082+    def download_best_version(self):
4083+        """
4084+        Download the best version of this file, returning its contents
4085+        as a bytestring. Since there is only one version of an immutable
4086+        file, we download and return the contents of this file.
4087+        """
4088+        d = consumer.download_to_data(self)
4089+        return d
4090+
4091+    # for an immutable file, download_to_data (specified in IReadable)
4092+    # is the same as download_best_version (specified in IFileNode). For
4093+    # mutable files, the difference is more meaningful, since they can
4094+    # have multiple versions.
4095+    download_to_data = download_best_version
4096+
4097+
4098+    # get_size() (IReadable), get_current_size() (IFilesystemNode), and
4099+    # get_size_of_best_version(IFileNode) are all the same for immutable
4100+    # files.
4101+    get_size_of_best_version = get_current_size
4102}
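
With the immutable node exporting the same surface as the mutable one,
callers can stay polymorphic. An illustrative consumer (not part of this
patch):

    def read_whole_file(node):
        # Works the same for mutable, immutable, and literal nodes once
        # they all provide get_best_readable_version()/download_to_data().
        d = node.get_best_readable_version()
        d.addCallback(lambda version: version.download_to_data())
        return d
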
4103[immutable/literal.py: implement the same interfaces as other filenodes
4104Kevan Carstensen <kevan@isnotajoke.com>**20100810000633
4105 Ignore-this: b50dd5df2d34ecd6477b8499a27aef13
4106] hunk ./src/allmydata/immutable/literal.py 106
4107         d.addCallback(lambda lastSent: consumer)
4108         return d
4109 
4110+    # IReadable, IFileNode, IFilesystemNode
4111+    def get_best_readable_version(self):
4112+        return defer.succeed(self)
4113+
4114+
4115+    def download_best_version(self):
4116+        return defer.succeed(self.u.data)
4117+
4118+
4119+    download_to_data = download_best_version
4120+    get_size_of_best_version = get_current_size
4121+
4122[mutable/publish.py: Modify the publish process to support MDMF
4123Kevan Carstensen <kevan@isnotajoke.com>**20100811233101
4124 Ignore-this: c2eb57cf67da7af5ad02be793e918bc6
4125 
4126 The inner workings of the publishing process needed to be reworked to a
4127 large extent to cope with segmented mutable files, and to cope with
4128 partial-file updates of mutable files. This patch does that. It also
4129 introduces wrappers for uploadable data, allowing the use of
4130 filehandle-like objects as data sources, in addition to strings. This
4131 reduces memory inefficiency when dealing with large files through the
4132 webapi, and clarifies update code there.
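 
 For illustration, the uploadable surface the publisher consumes is small
 -- read() returning a list of strings, plus get_size() -- so wrapping a
 filehandle is cheap (a hypothetical sketch, not this patch's actual
 wrapper classes):
 
     class FileHandleUploadable:
         # would declare implements(IMutableUploadable)
         def __init__(self, filehandle):
             self._f = filehandle
             self._f.seek(0, 2)            # seek to the end to learn the size
             self._size = self._f.tell()
             self._f.seek(0)
         def get_size(self):
             return self._size
         def read(self, length):
             # the publisher joins the returned list into a string
             return [self._f.read(length)]
         def close(self):
             pass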
4133] {
4134hunk ./src/allmydata/mutable/publish.py 4
4135 
4136 
4137 import os, struct, time
4138+from StringIO import StringIO
4139 from itertools import count
4140 from zope.interface import implements
4141 from twisted.internet import defer
4142hunk ./src/allmydata/mutable/publish.py 9
4143 from twisted.python import failure
4144-from allmydata.interfaces import IPublishStatus
4145+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
4146+                                 IMutableUploadable
4147 from allmydata.util import base32, hashutil, mathutil, idlib, log
4148 from allmydata import hashtree, codec
4149 from allmydata.storage.server import si_b2a
4150hunk ./src/allmydata/mutable/publish.py 21
4151      UncoordinatedWriteError, NotEnoughServersError
4152 from allmydata.mutable.servermap import ServerMap
4153 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
4154-     unpack_checkstring, SIGNED_PREFIX
4155+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
4156+     SDMFSlotWriteProxy
4157+
4158+KiB = 1024
4159+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
4160+PUSHING_BLOCKS_STATE = 0
4161+PUSHING_EVERYTHING_ELSE_STATE = 1
4162+DONE_STATE = 2
4163 
4164 class PublishStatus:
4165     implements(IPublishStatus)
4166hunk ./src/allmydata/mutable/publish.py 118
4167         self._status.set_helper(False)
4168         self._status.set_progress(0.0)
4169         self._status.set_active(True)
4170+        self._version = self._node.get_version()
4171+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
4172+
4173 
4174     def get_status(self):
4175         return self._status
4176hunk ./src/allmydata/mutable/publish.py 132
4177             kwargs["facility"] = "tahoe.mutable.publish"
4178         return log.msg(*args, **kwargs)
4179 
4180+
4181+    def update(self, data, offset, blockhashes, version):
4182+        """
4183+        I replace the contents of this file with the contents of data,
4184+        starting at offset. I return a Deferred that fires with None
4185+        when the replacement has been completed, or with an error if
4186+        something went wrong during the process.
4187+
4188+        Note that this process will not upload new shares. If the file
4189+        being updated is in need of repair, callers will have to repair
4190+        it on their own.
4191+        """
4192+        # How this works:
4193+        # 1. Make peer assignments. We'll assign each share that we know
4194+        # about on the grid to the peer that currently holds that
4195+        # share, and will not place any new shares.
4196+        # 2. Set up encoding parameters. Most of these will stay the same
4197+        # -- datalength will change, as will some of the offsets.
4198+        # 3. Upload the new segments.
4199+        # 4. Be done.
4200+        assert IMutableUploadable.providedBy(data)
4201+
4202+        self.data = data
4203+
4204+        # XXX: Use the MutableFileVersion instead.
4205+        self.datalength = self._node.get_size()
4206+        if data.get_size() > self.datalength:
4207+            self.datalength = data.get_size()
4208+
4209+        self.log("starting update")
4210+        self.log("adding new data of length %d at offset %d" % \
4211+                    (data.get_size(), offset))
4212+        self.log("new data length is %d" % self.datalength)
4213+        self._status.set_size(self.datalength)
4214+        self._status.set_status("Started")
4215+        self._started = time.time()
4216+
4217+        self.done_deferred = defer.Deferred()
4218+
4219+        self._writekey = self._node.get_writekey()
4220+        assert self._writekey, "need write capability to publish"
4221+
4222+        # first, which servers will we publish to? We require that the
4223+        # servermap was updated in MODE_WRITE, so we can depend upon the
4224+        # peerlist computed by that process instead of computing our own.
4225+        assert self._servermap
4226+        assert self._servermap.last_update_mode in (MODE_WRITE, MODE_CHECK)
4227+        # we will push a version that is one larger than anything present
4228+        # in the grid, according to the servermap.
4229+        self._new_seqnum = self._servermap.highest_seqnum() + 1
4230+        self._status.set_servermap(self._servermap)
4231+
4232+        self.log(format="new seqnum will be %(seqnum)d",
4233+                 seqnum=self._new_seqnum, level=log.NOISY)
4234+
4235+        # We're updating an existing file, so all of the following
4236+        # should be available.
4237+        self.readkey = self._node.get_readkey()
4238+        self.required_shares = self._node.get_required_shares()
4239+        assert self.required_shares is not None
4240+        self.total_shares = self._node.get_total_shares()
4241+        assert self.total_shares is not None
4242+        self._status.set_encoding(self.required_shares, self.total_shares)
4243+
4244+        self._pubkey = self._node.get_pubkey()
4245+        assert self._pubkey
4246+        self._privkey = self._node.get_privkey()
4247+        assert self._privkey
4248+        self._encprivkey = self._node.get_encprivkey()
4249+
4250+        sb = self._storage_broker
4251+        full_peerlist = sb.get_servers_for_index(self._storage_index)
4252+        self.full_peerlist = full_peerlist # for use later, immutable
4253+        self.bad_peers = set() # peerids who have errbacked/refused requests
4254+
4255+        # This will set self.segment_size, self.num_segments, and
4256+        # self.fec. TODO: Does it know how to do the offset? Probably
4257+        # not. So do that part next.
4258+        self.setup_encoding_parameters(offset=offset)
4259+
4260+        # if we experience any surprises (writes which were rejected because
4261+        # our test vector did not match, or shares which we didn't expect to
4262+        # see), we set this flag and report an UncoordinatedWriteError at the
4263+        # end of the publish process.
4264+        self.surprised = False
4265+
4266+        # we keep track of three tables. The first is our goal: which share
4267+        # we want to see on which servers. This is initially populated by the
4268+        # existing servermap.
4269+        self.goal = set() # pairs of (peerid, shnum) tuples
4270+
4271+        # the second table is our list of outstanding queries: those which
4272+        # are in flight and may or may not be delivered, accepted, or
4273+        # acknowledged. Items are added to this table when the request is
4274+        # sent, and removed when the response returns (or errbacks).
4275+        self.outstanding = set() # (peerid, shnum) tuples
4276+
4277+        # the third is a table of successes: shares which have actually been
4278+        # placed. These are populated when responses come back with success.
4279+        # When self.placed == self.goal, we're done.
4280+        self.placed = set() # (peerid, shnum) tuples
4281+
4282+        # we also keep a mapping from peerid to RemoteReference. Each time we
4283+        # pull a connection out of the full peerlist, we add it to this for
4284+        # use later.
4285+        self.connections = {}
4286+
4287+        self.bad_share_checkstrings = {}
4288+
4289+        # This is set at the last step of the publishing process.
4290+        self.versioninfo = ""
4291+
4292+        # we use the servermap to populate the initial goal: this way we will
4293+        # try to update each existing share in place. Since we're
4294+        # updating, we ignore damaged and missing shares -- callers must
4295+        # run a repair to recreate these.
4296+        for (peerid, shnum) in self._servermap.servermap:
4297+            self.goal.add( (peerid, shnum) )
4298+            self.connections[peerid] = self._servermap.connections[peerid]
4299+        self.writers = {}
4300+
4301+        # SDMF files are updated differently; this update path is MDMF-only.
4302+        self._version = MDMF_VERSION
4303+        writer_class = MDMFSlotWriteProxy
4304+
4305+        # For each (peerid, shnum) in self.goal, we make a
4306+        # write proxy for that peer. We'll use this to write
4307+        # shares to the peer.
4308+        for key in self.goal:
4309+            peerid, shnum = key
4310+            write_enabler = self._node.get_write_enabler(peerid)
4311+            renew_secret = self._node.get_renewal_secret(peerid)
4312+            cancel_secret = self._node.get_cancel_secret(peerid)
4313+            secrets = (write_enabler, renew_secret, cancel_secret)
4314+
4315+            self.writers[shnum] =  writer_class(shnum,
4316+                                                self.connections[peerid],
4317+                                                self._storage_index,
4318+                                                secrets,
4319+                                                self._new_seqnum,
4320+                                                self.required_shares,
4321+                                                self.total_shares,
4322+                                                self.segment_size,
4323+                                                self.datalength)
4324+            self.writers[shnum].peerid = peerid
4325+            assert (peerid, shnum) in self._servermap.servermap
4326+            old_versionid, old_timestamp = self._servermap.servermap[key]
4327+            (old_seqnum, old_root_hash, old_salt, old_segsize,
4328+             old_datalength, old_k, old_N, old_prefix,
4329+             old_offsets_tuple) = old_versionid
4330+            self.writers[shnum].set_checkstring(old_seqnum,
4331+                                                old_root_hash,
4332+                                                old_salt)
4333+
4334+        # Our remote shares will not have a complete checkstring until
4335+        # after we are done writing share data and have started to write
4336+        # blocks. In the meantime, we need to know what to look for when
4337+        # writing, so that we can detect UncoordinatedWriteErrors.
4338+        self._checkstring = self.writers.values()[0].get_checkstring()
4339+
4340+        # Now, we start pushing shares.
4341+        self._status.timings["setup"] = time.time() - self._started
4342+        # First, we encrypt, encode, and publish the shares that we need
4343+        # to encrypt, encode, and publish.
4344+
4345+        # Our update process fetched these for us. We need to update
4346+        # them in place as publishing happens.
4347+        self.blockhashes = {} # (shnum, [blockhashes])
4348+        for (i, bht) in blockhashes.iteritems():
4349+            # We need to extract the leaves from our old hash tree.
4350+            old_segcount = mathutil.div_ceil(version[4],
4351+                                             version[3])
4352+            h = hashtree.IncompleteHashTree(old_segcount)
4353+            bht = dict(enumerate(bht))
4354+            h.set_hashes(bht)
4355+            leaves = h[h.get_leaf_index(0):]
4356+            for j in xrange(self.num_segments - len(leaves)):
4357+                leaves.append(None)
4358+
4359+            assert len(leaves) >= self.num_segments
4360+            self.blockhashes[i] = leaves
4361+            # This list will now be the leaves that were set during the
4362+            # initial upload, plus enough empty hashes to pad its length
4363+            # out to a power of two. If we cross a power-of-two boundary,
4364+            # we should be re-encoding the file, and should not be
4365+            # here. So, we have
4366+            #assert len(self.blockhashes[i]) == \
4367+            #    hashtree.roundup_pow2(self.num_segments), \
4368+            #        len(self.blockhashes[i])
4369+            # XXX: Except this doesn't work. Figure out why.
4370+
4371+        # These are filled in later, after we've modified the block hash
4372+        # tree suitably.
4373+        self.sharehash_leaves = None # eventually [sharehashes]
4374+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
4375+                              # validate the share]
4376+
4377+        d = defer.succeed(None)
4378+        self.log("Starting push")
4379+
4380+        self._state = PUSHING_BLOCKS_STATE
4381+        self._push()
4382+
4383+        return self.done_deferred
4384+
4385+
4386     def publish(self, newdata):
4387         """Publish the filenode's current contents.  Returns a Deferred that
4388         fires (with None) when the publish has done as much work as it's ever
4389hunk ./src/allmydata/mutable/publish.py 345
4390         simultaneous write.
4391         """
4392 
4393-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
4394-        # 2: perform peer selection, get candidate servers
4395-        #  2a: send queries to n+epsilon servers, to determine current shares
4396-        #  2b: based upon responses, create target map
4397-        # 3: send slot_testv_and_readv_and_writev messages
4398-        # 4: as responses return, update share-dispatch table
4399-        # 4a: may need to run recovery algorithm
4400-        # 5: when enough responses are back, we're done
4401+        # 0. Setup encoding parameters, encoder, and other such things.
4402+        # 1. Encrypt, encode, and publish segments.
4403+        assert IMutableUploadable.providedBy(newdata)
4404 
4405hunk ./src/allmydata/mutable/publish.py 349
4406-        self.log("starting publish, datalen is %s" % len(newdata))
4407-        self._status.set_size(len(newdata))
4408+        self.data = newdata
4409+        self.datalength = newdata.get_size()
4410+        #if self.datalength >= DEFAULT_MAX_SEGMENT_SIZE:
4411+        #    self._version = MDMF_VERSION
4412+        #else:
4413+        #    self._version = SDMF_VERSION
4414+
4415+        self.log("starting publish, datalen is %s" % self.datalength)
4416+        self._status.set_size(self.datalength)
4417         self._status.set_status("Started")
4418         self._started = time.time()
4419 
4420hunk ./src/allmydata/mutable/publish.py 405
4421         self.full_peerlist = full_peerlist # for use later, immutable
4422         self.bad_peers = set() # peerids who have errbacked/refused requests
4423 
4424-        self.newdata = newdata
4425-        self.salt = os.urandom(16)
4426-
4427+        # This will set self.segment_size, self.num_segments, and
4428+        # self.fec.
4429         self.setup_encoding_parameters()
4430 
4431         # if we experience any surprises (writes which were rejected because
4432hunk ./src/allmydata/mutable/publish.py 415
4433         # end of the publish process.
4434         self.surprised = False
4435 
4436-        # as a failsafe, refuse to iterate through self.loop more than a
4437-        # thousand times.
4438-        self.looplimit = 1000
4439-
4440         # we keep track of three tables. The first is our goal: which share
4441         # we want to see on which servers. This is initially populated by the
4442         # existing servermap.
4443hunk ./src/allmydata/mutable/publish.py 438
4444 
4445         self.bad_share_checkstrings = {}
4446 
4447+        # This is set at the last step of the publishing process.
4448+        self.versioninfo = ""
4449+
4450         # we use the servermap to populate the initial goal: this way we will
4451         # try to update each existing share in place.
4452         for (peerid, shnum) in self._servermap.servermap:
4453hunk ./src/allmydata/mutable/publish.py 454
4454             self.bad_share_checkstrings[key] = old_checkstring
4455             self.connections[peerid] = self._servermap.connections[peerid]
4456 
4457-        # create the shares. We'll discard these as they are delivered. SDMF:
4458-        # we're allowed to hold everything in memory.
4459+        # TODO: Make this part do peer selection.
4460+        self.update_goal()
4461+        self.writers = {}
4462+        if self._version == MDMF_VERSION:
4463+            writer_class = MDMFSlotWriteProxy
4464+        else:
4465+            writer_class = SDMFSlotWriteProxy
4466 
4467hunk ./src/allmydata/mutable/publish.py 462
4468+        # For each (peerid, shnum) in self.goal, we make a
4469+        # write proxy for that peer. We'll use this to write
4470+        # shares to the peer.
4471+        for key in self.goal:
4472+            peerid, shnum = key
4473+            write_enabler = self._node.get_write_enabler(peerid)
4474+            renew_secret = self._node.get_renewal_secret(peerid)
4475+            cancel_secret = self._node.get_cancel_secret(peerid)
4476+            secrets = (write_enabler, renew_secret, cancel_secret)
4477+
4478+            self.writers[shnum] =  writer_class(shnum,
4479+                                                self.connections[peerid],
4480+                                                self._storage_index,
4481+                                                secrets,
4482+                                                self._new_seqnum,
4483+                                                self.required_shares,
4484+                                                self.total_shares,
4485+                                                self.segment_size,
4486+                                                self.datalength)
4487+            self.writers[shnum].peerid = peerid
4488+            if (peerid, shnum) in self._servermap.servermap:
4489+                old_versionid, old_timestamp = self._servermap.servermap[key]
4490+                (old_seqnum, old_root_hash, old_salt, old_segsize,
4491+                 old_datalength, old_k, old_N, old_prefix,
4492+                 old_offsets_tuple) = old_versionid
4493+                self.writers[shnum].set_checkstring(old_seqnum,
4494+                                                    old_root_hash,
4495+                                                    old_salt)
4496+            elif (peerid, shnum) in self.bad_share_checkstrings:
4497+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
4498+                self.writers[shnum].set_checkstring(old_checkstring)
4499+
4500+        # Our remote shares will not have a complete checkstring until
4501+        # after we are done writing share data and have started to write
4502+        # blocks. In the meantime, we need to know what to look for when
4503+        # writing, so that we can detect UncoordinatedWriteErrors.
4504+        self._checkstring = self.writers.values()[0].get_checkstring()
4505+
4506+        # Now, we start pushing shares.
4507         self._status.timings["setup"] = time.time() - self._started
4508hunk ./src/allmydata/mutable/publish.py 502
4509-        d = self._encrypt_and_encode()
4510-        d.addCallback(self._generate_shares)
4511-        def _start_pushing(res):
4512-            self._started_pushing = time.time()
4513-            return res
4514-        d.addCallback(_start_pushing)
4515-        d.addCallback(self.loop) # trigger delivery
4516-        d.addErrback(self._fatal_error)
4517+        # First, we encrypt, encode, and publish the shares that we need
4518+        # to encrypt, encode, and publish.
4519+
4520+        # This will eventually hold the block hash chain for each share
4521+        # that we publish. We define it this way so that empty publishes
4522+        # will still have something to write to the remote slot.
4523+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
4524+        for i in xrange(self.total_shares):
4525+            blocks = self.blockhashes[i]
4526+            for j in xrange(self.num_segments):
4527+                blocks.append(None)
4528+        self.sharehash_leaves = None # eventually [sharehashes]
4529+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
4530+                              # validate the share]
4531+
4532+        d = defer.succeed(None)
4533+        self.log("Starting push")
4534+
4535+        self._state = PUSHING_BLOCKS_STATE
4536+        self._push()
4537 
4538         return self.done_deferred
4539 
4540hunk ./src/allmydata/mutable/publish.py 525
4541-    def setup_encoding_parameters(self):
4542-        segment_size = len(self.newdata)
4543+
4544+    def _update_status(self):
4545+        self._status.set_status("Sending Shares: %d placed out of %d, "
4546+                                "%d messages outstanding" %
4547+                                (len(self.placed),
4548+                                 len(self.goal),
4549+                                 len(self.outstanding)))
4550+        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
4551+
4552+
4553+    def setup_encoding_parameters(self, offset=0):
4554+        if self._version == MDMF_VERSION:
4555+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
4556+        else:
4557+            segment_size = self.datalength # SDMF is only one segment
4558         # this must be a multiple of self.required_shares
4559         segment_size = mathutil.next_multiple(segment_size,
4560                                               self.required_shares)
4561hunk ./src/allmydata/mutable/publish.py 544
4562         self.segment_size = segment_size
4563+
4564+        # Calculate the starting segment for the upload.
4565         if segment_size:
4566hunk ./src/allmydata/mutable/publish.py 547
4567-            self.num_segments = mathutil.div_ceil(len(self.newdata),
4568+            self.num_segments = mathutil.div_ceil(self.datalength,
4569                                                   segment_size)
4570hunk ./src/allmydata/mutable/publish.py 549
4571+            self.starting_segment = mathutil.div_ceil(offset,
4572+                                                      segment_size)
4573+            self.starting_segment -= 1
4574+            if offset == 0:
4575+                self.starting_segment = 0
4576+
4577         else:
4578             self.num_segments = 0
4579hunk ./src/allmydata/mutable/publish.py 557
4580-        assert self.num_segments in [0, 1,] # SDMF restrictions
4581+            self.starting_segment = 0
4582+
4583+
4584+        self.log("building encoding parameters for file")
4585+        self.log("got segsize %d" % self.segment_size)
4586+        self.log("got %d segments" % self.num_segments)
4587+
4588+        if self._version == SDMF_VERSION:
4589+            assert self.num_segments in (0, 1) # SDMF
4590+        # calculate the tail segment size.
4591+
4592+        if segment_size and self.datalength:
4593+            self.tail_segment_size = self.datalength % segment_size
4594+            self.log("got tail segment size %d" % self.tail_segment_size)
4595+        else:
4596+            self.tail_segment_size = 0
4597+
4598+        if self.tail_segment_size == 0 and segment_size:
4599+            # The tail segment is the same size as the other segments.
4600+            self.tail_segment_size = segment_size
4601+
4602+        # Make FEC encoders
4603+        fec = codec.CRSEncoder()
4604+        fec.set_params(self.segment_size,
4605+                       self.required_shares, self.total_shares)
4606+        self.piece_size = fec.get_block_size()
4607+        self.fec = fec
4608+
4609+        if self.tail_segment_size == self.segment_size:
4610+            self.tail_fec = self.fec
4611+        else:
4612+            tail_fec = codec.CRSEncoder()
4613+            tail_fec.set_params(self.tail_segment_size,
4614+                                self.required_shares,
4615+                                self.total_shares)
4616+            self.tail_fec = tail_fec
4617+
4618+        self._current_segment = self.starting_segment
4619+        self.end_segment = self.num_segments - 1
4620+        # Now figure out where the last segment should be.
4621+        if self.data.get_size() != self.datalength:
4622+            end = self.data.get_size()
4623+            self.end_segment = mathutil.div_ceil(end,
4624+                                                 segment_size)
4625+            self.end_segment -= 1
4626+        self.log("got start segment %d" % self.starting_segment)
4627+        self.log("got end segment %d" % self.end_segment)
4628+
4629+
4630+    def _push(self, ignored=None):
4631+        """
4632+        I manage state transitions. In particular, I check that we still
4633+        have enough writers remaining to complete the upload
4634+        successfully.
4635+        """
4636+        # Can we still successfully publish this file?
4637+        # TODO: Keep track of outstanding queries before aborting the
4638+        #       process.
4639+        if len(self.writers) <= self.required_shares or self.surprised:
4640+            return self._failure()
4641+
4642+        # Figure out what we need to do next. Each of these needs to
4643+        # return a deferred so that we don't block execution when this
4644+        # is first called in the upload method.
4645+        if self._state == PUSHING_BLOCKS_STATE:
4646+            return self.push_segment(self._current_segment)
4647+
4648+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
4649+            return self.push_everything_else()
4650+
4651+        # If we make it to this point, we were successful in placing the
4652+        # file.
4653+        return self._done(None)
4654+
4655+
4656+    def push_segment(self, segnum):
4657+        if self.num_segments == 0 and self._version == SDMF_VERSION:
4658+            self._add_dummy_salts()
4659 
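
To make the segment arithmetic in setup_encoding_parameters concrete, a
worked example with illustrative numbers (using the same mathutil helpers
as the code above):

    from allmydata.util import mathutil

    datalength = 300 * 1024                               # a 300 KiB MDMF file
    segment_size = mathutil.next_multiple(128 * 1024, 3)  # 131073, a multiple of k=3
    num_segments = mathutil.div_ceil(datalength, segment_size)  # 3
    tail_segment_size = datalength % segment_size         # 45054
    # A tail of 0 would be bumped back up to segment_size, exactly as the
    # code above does.
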
4660hunk ./src/allmydata/mutable/publish.py 636
4661-    def _fatal_error(self, f):
4662-        self.log("error during loop", failure=f, level=log.UNUSUAL)
4663-        self._done(f)
4664+        if segnum > self.end_segment:
4665+            # We don't have any more segments to push.
4666+            self._state = PUSHING_EVERYTHING_ELSE_STATE
4667+            return self._push()
4668+
4669+        d = self._encode_segment(segnum)
4670+        d.addCallback(self._push_segment, segnum)
4671+        def _increment_segnum(ign):
4672+            self._current_segment += 1
4673+        # XXX: I don't think we need to do addBoth here -- any errbacks
4674+        # should be handled within push_segment.
4675+        d.addBoth(_increment_segnum)
4676+        d.addBoth(self._turn_barrier)
4677+        d.addBoth(self._push)
4678+
4679+
4680+    def _turn_barrier(self, result):
4681+        """
4682+        I help the publish process avoid the recursion limit issues
4683+        described in #237.
4684+        """
4685+        return fireEventually(result)
4686+
4687+
4688+    def _add_dummy_salts(self):
4689+        """
4690+        SDMF files need a salt even if they're empty, or the signature
4691+        won't make sense. This method adds a dummy salt to each of our
4692+        SDMF writers so that they can write the signature later.
4693+        """
4694+        salt = os.urandom(16)
4695+        assert self._version == SDMF_VERSION
4696+
4697+        for writer in self.writers.itervalues():
4698+            writer.put_salt(salt)
4699+
4700+
4701+    def _encode_segment(self, segnum):
4702+        """
4703+        I encrypt and encode the segment segnum.
4704+        """
4705+        started = time.time()
4706+
4707+        if segnum + 1 == self.num_segments:
4708+            segsize = self.tail_segment_size
4709+        else:
4710+            segsize = self.segment_size
4711+
4712+
4713+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
4714+        data = self.data.read(segsize)
4715+        # XXX: This is dumb. Why return a list?
4716+        data = "".join(data)
4717+
4718+        assert len(data) == segsize, len(data)
4719+
4720+        salt = os.urandom(16)
4721+
4722+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
4723+        self._status.set_status("Encrypting")
4724+        enc = AES(key)
4725+        crypttext = enc.process(data)
4726+        assert len(crypttext) == len(data)
4727+
4728+        now = time.time()
4729+        self._status.timings["encrypt"] = now - started
4730+        started = now
4731+
4732+        # now apply FEC
4733+        if segnum + 1 == self.num_segments:
4734+            fec = self.tail_fec
4735+        else:
4736+            fec = self.fec
4737+
4738+        self._status.set_status("Encoding")
4739+        crypttext_pieces = [None] * self.required_shares
4740+        piece_size = fec.get_block_size()
4741+        for i in range(len(crypttext_pieces)):
4742+            offset = i * piece_size
4743+            piece = crypttext[offset:offset+piece_size]
4744+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
4745+            crypttext_pieces[i] = piece
4746+            assert len(piece) == piece_size
4747+        d = fec.encode(crypttext_pieces)
4748+        def _done_encoding(res):
4749+            elapsed = time.time() - started
4750+            self._status.timings["encode"] = elapsed
4751+            return (res, salt)
4752+        d.addCallback(_done_encoding)
4753+        return d
4754+
4755+
4756+    def _push_segment(self, encoded_and_salt, segnum):
4757+        """
4758+        I push (data, salt) as segment number segnum.
4759+        """
4760+        results, salt = encoded_and_salt
4761+        shares, shareids = results
4762+        started = time.time()
4763+        self._status.set_status("Pushing segment")
4764+        for i in xrange(len(shares)):
4765+            sharedata = shares[i]
4766+            shareid = shareids[i]
4767+            if self._version == MDMF_VERSION:
4768+                hashed = salt + sharedata
4769+            else:
4770+                hashed = sharedata
4771+            block_hash = hashutil.block_hash(hashed)
4772+            old_hash = self.blockhashes[shareid][segnum]
4773+            self.blockhashes[shareid][segnum] = block_hash
4774+            # find the writer for this share
4775+            writer = self.writers[shareid]
4776+            writer.put_block(sharedata, segnum, salt)
4777+
4778+
4779+    def push_everything_else(self):
4780+        """
4781+        I put everything else associated with a share.
4782+        """
4783+        self._pack_started = time.time()
4784+        self.push_encprivkey()
4785+        self.push_blockhashes()
4786+        self.push_sharehashes()
4787+        self.push_toplevel_hashes_and_signature()
4788+        d = self.finish_publishing()
4789+        def _change_state(ignored):
4790+            self._state = DONE_STATE
4791+        d.addCallback(_change_state)
4792+        d.addCallback(self._push)
4793+        return d
4794+
4795+
4796+    def push_encprivkey(self):
4797+        encprivkey = self._encprivkey
4798+        self._status.set_status("Pushing encrypted private key")
4799+        for writer in self.writers.itervalues():
4800+            writer.put_encprivkey(encprivkey)
4801+
4802+
4803+    def push_blockhashes(self):
4804+        self.sharehash_leaves = [None] * len(self.blockhashes)
4805+        self._status.set_status("Building and pushing block hash tree")
4806+        for shnum, blockhashes in self.blockhashes.iteritems():
4807+            t = hashtree.HashTree(blockhashes)
4808+            self.blockhashes[shnum] = list(t)
4809+            # set the leaf for future use.
4810+            self.sharehash_leaves[shnum] = t[0]
4811+
4812+            writer = self.writers[shnum]
4813+            writer.put_blockhashes(self.blockhashes[shnum])
4814+
4815+
4816+    def push_sharehashes(self):
4817+        self._status.set_status("Building and pushing share hash chain")
4818+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
4819+        share_hash_chain = {}
4820+        for shnum in xrange(len(self.sharehash_leaves)):
4821+            needed_indices = share_hash_tree.needed_hashes(shnum)
4822+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
4823+                                             for i in needed_indices] )
4824+            writer = self.writers[shnum]
4825+            writer.put_sharehashes(self.sharehashes[shnum])
4826+        self.root_hash = share_hash_tree[0]
4827+
4828+
4829+    def push_toplevel_hashes_and_signature(self):
4830+        # We need to do three things here:
4831+        #   - Push the root hash and salt hash
4832+        #   - Get the checkstring of the resulting layout; sign that.
4833+        #   - Push the signature
4834+        self._status.set_status("Pushing root hashes and signature")
4835+        for shnum in xrange(self.total_shares):
4836+            writer = self.writers[shnum]
4837+            writer.put_root_hash(self.root_hash)
4838+        self._update_checkstring()
4839+        self._make_and_place_signature()
4840+
4841+
4842+    def _update_checkstring(self):
4843+        """
4844+        After putting the root hash, MDMF files will have the
4845+        checkstring written to the storage server. This means that we
4846+        can update our copy of the checkstring so we can detect
4847+        uncoordinated writes. SDMF files will have the same checkstring,
4848+        so we need not do anything.
4849+        """
4850+        self._checkstring = self.writers.values()[0].get_checkstring()
4851+
4852+
4853+    def _make_and_place_signature(self):
4854+        """
4855+        I create and place the signature.
4856+        """
4857+        started = time.time()
4858+        self._status.set_status("Signing prefix")
4859+        signable = self.writers[0].get_signable()
4860+        self.signature = self._privkey.sign(signable)
4861+
4862+        for (shnum, writer) in self.writers.iteritems():
4863+            writer.put_signature(self.signature)
4864+        self._status.timings['sign'] = time.time() - started
4865+
4866+
4867+    def finish_publishing(self):
4868+        # We're almost done -- we just need to put the verification key
4869+        # and the offsets
4870+        started = time.time()
4871+        self._status.set_status("Pushing shares")
4872+        self._started_pushing = started
4873+        ds = []
4874+        verification_key = self._pubkey.serialize()
4875+
4876+
4877+        # TODO: Bad, since we remove from this same dict. We need to
4878+        # make a copy, or just use a non-iterated value.
4879+        for (shnum, writer) in self.writers.iteritems():
4880+            writer.put_verification_key(verification_key)
4881+            d = writer.finish_publishing()
4882+            # Add the (peerid, shnum) tuple to our list of outstanding
4883+            # queries. This gets used by _loop if some of our queries
4884+            # fail to place shares.
4885+            self.outstanding.add((writer.peerid, writer.shnum))
4886+            d.addCallback(self._got_write_answer, writer, started)
4887+            d.addErrback(self._connection_problem, writer)
4888+            ds.append(d)
4889+        self._record_verinfo()
4890+        self._status.timings['pack'] = time.time() - started
4891+        return defer.DeferredList(ds)
4892+
4893+
4894+    def _record_verinfo(self):
4895+        self.versioninfo = self.writers.values()[0].get_verinfo()
4896+
4897+
4898+    def _connection_problem(self, f, writer):
4899+        """
4900+        We ran into a connection problem while working with writer, and
4901+        need to deal with that.
4902+        """
4903+        self.log("found problem: %s" % str(f))
4904+        self._last_failure = f
4905+        del(self.writers[writer.shnum])
4906 
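
_encode_segment and _push_segment above boil down to: draw a fresh salt,
derive the block key from it, AES-encrypt, zero-pad into k equal pieces,
and FEC-expand those k pieces into N blocks. A condensed sketch using the
same primitives this patch imports (status updates and error handling
omitted; not the publisher's exact code):

    import os
    from pycryptopp.cipher.aes import AES
    from allmydata.util import hashutil
    from allmydata import codec

    def encode_one_segment(plaintext, readkey, k, n):
        salt = os.urandom(16)
        key = hashutil.ssk_readkey_data_hash(salt, readkey)
        crypttext = AES(key).process(plaintext)

        fec = codec.CRSEncoder()
        fec.set_params(len(crypttext), k, n)
        piece_size = fec.get_block_size()
        pieces = [crypttext[i*piece_size:(i+1)*piece_size].ljust(piece_size, "\x00")
                  for i in range(k)]
        d = fec.encode(pieces)   # Deferred firing with (blocks, blockids)
        d.addCallback(lambda results: (results, salt))
        return d
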
4907hunk ./src/allmydata/mutable/publish.py 879
4908-    def _update_status(self):
4909-        self._status.set_status("Sending Shares: %d placed out of %d, "
4910-                                "%d messages outstanding" %
4911-                                (len(self.placed),
4912-                                 len(self.goal),
4913-                                 len(self.outstanding)))
4914-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
4915 
4916hunk ./src/allmydata/mutable/publish.py 880
4917-    def loop(self, ignored=None):
4918-        self.log("entering loop", level=log.NOISY)
4919-        if not self._running:
4920-            return
4921-
4922-        self.looplimit -= 1
4923-        if self.looplimit <= 0:
4924-            raise LoopLimitExceededError("loop limit exceeded")
4925-
4926-        if self.surprised:
4927-            # don't send out any new shares, just wait for the outstanding
4928-            # ones to be retired.
4929-            self.log("currently surprised, so don't send any new shares",
4930-                     level=log.NOISY)
4931-        else:
4932-            self.update_goal()
4933-            # how far are we from our goal?
4934-            needed = self.goal - self.placed - self.outstanding
4935-            self._update_status()
4936-
4937-            if needed:
4938-                # we need to send out new shares
4939-                self.log(format="need to send %(needed)d new shares",
4940-                         needed=len(needed), level=log.NOISY)
4941-                self._send_shares(needed)
4942-                return
4943-
4944-        if self.outstanding:
4945-            # queries are still pending, keep waiting
4946-            self.log(format="%(outstanding)d queries still outstanding",
4947-                     outstanding=len(self.outstanding),
4948-                     level=log.NOISY)
4949-            return
4950-
4951-        # no queries outstanding, no placements needed: we're done
4952-        self.log("no queries outstanding, no placements needed: done",
4953-                 level=log.OPERATIONAL)
4954-        now = time.time()
4955-        elapsed = now - self._started_pushing
4956-        self._status.timings["push"] = elapsed
4957-        return self._done(None)
4958-
4959     def log_goal(self, goal, message=""):
4960         logmsg = [message]
4961         for (shnum, peerid) in sorted([(s,p) for (p,s) in goal]):
4962hunk ./src/allmydata/mutable/publish.py 961
4963             self.log_goal(self.goal, "after update: ")
4964 
4965 
4966+    def _got_write_answer(self, answer, writer, started):
4967+        if not answer:
4968+            # SDMF writers only pretend to write when callers set their
4969+            # blocks, salts, and so on -- they actually just write once,
4970+            # at the end of the upload process. In fake writes, they
4971+            # return defer.succeed(None). If we see that, we shouldn't
4972+            # bother checking it.
4973+            return
4974 
4975hunk ./src/allmydata/mutable/publish.py 970
4976-    def _encrypt_and_encode(self):
4977-        # this returns a Deferred that fires with a list of (sharedata,
4978-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
4979-        # shares that we care about.
4980-        self.log("_encrypt_and_encode")
4981-
4982-        self._status.set_status("Encrypting")
4983-        started = time.time()
4984-
4985-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
4986-        enc = AES(key)
4987-        crypttext = enc.process(self.newdata)
4988-        assert len(crypttext) == len(self.newdata)
4989+        peerid = writer.peerid
4990+        lp = self.log("_got_write_answer from %s, share %d" %
4991+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
4992 
4993         now = time.time()
4994hunk ./src/allmydata/mutable/publish.py 975
4995-        self._status.timings["encrypt"] = now - started
4996-        started = now
4997-
4998-        # now apply FEC
4999-
5000-        self._status.set_status("Encoding")
5001-        fec = codec.CRSEncoder()
5002-        fec.set_params(self.segment_size,
5003-                       self.required_shares, self.total_shares)
5004-        piece_size = fec.get_block_size()
5005-        crypttext_pieces = [None] * self.required_shares
5006-        for i in range(len(crypttext_pieces)):
5007-            offset = i * piece_size
5008-            piece = crypttext[offset:offset+piece_size]
5009-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
5010-            crypttext_pieces[i] = piece
5011-            assert len(piece) == piece_size
5012-
5013-        d = fec.encode(crypttext_pieces)
5014-        def _done_encoding(res):
5015-            elapsed = time.time() - started
5016-            self._status.timings["encode"] = elapsed
5017-            return res
5018-        d.addCallback(_done_encoding)
5019-        return d
5020-
5021-    def _generate_shares(self, shares_and_shareids):
5022-        # this sets self.shares and self.root_hash
5023-        self.log("_generate_shares")
5024-        self._status.set_status("Generating Shares")
5025-        started = time.time()
5026-
5027-        # we should know these by now
5028-        privkey = self._privkey
5029-        encprivkey = self._encprivkey
5030-        pubkey = self._pubkey
5031-
5032-        (shares, share_ids) = shares_and_shareids
5033-
5034-        assert len(shares) == len(share_ids)
5035-        assert len(shares) == self.total_shares
5036-        all_shares = {}
5037-        block_hash_trees = {}
5038-        share_hash_leaves = [None] * len(shares)
5039-        for i in range(len(shares)):
5040-            share_data = shares[i]
5041-            shnum = share_ids[i]
5042-            all_shares[shnum] = share_data
5043-
5044-            # build the block hash tree. SDMF has only one leaf.
5045-            leaves = [hashutil.block_hash(share_data)]
5046-            t = hashtree.HashTree(leaves)
5047-            block_hash_trees[shnum] = list(t)
5048-            share_hash_leaves[shnum] = t[0]
5049-        for leaf in share_hash_leaves:
5050-            assert leaf is not None
5051-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
5052-        share_hash_chain = {}
5053-        for shnum in range(self.total_shares):
5054-            needed_hashes = share_hash_tree.needed_hashes(shnum)
5055-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
5056-                                              for i in needed_hashes ] )
5057-        root_hash = share_hash_tree[0]
5058-        assert len(root_hash) == 32
5059-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
5060-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
5061-
5062-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
5063-                             self.required_shares, self.total_shares,
5064-                             self.segment_size, len(self.newdata))
5065-
5066-        # now pack the beginning of the share. All shares are the same up
5067-        # to the signature, then they have divergent share hash chains,
5068-        # then completely different block hash trees + salt + share data,
5069-        # then they all share the same encprivkey at the end. The sizes
5070-        # of everything are the same for all shares.
5071-
5072-        sign_started = time.time()
5073-        signature = privkey.sign(prefix)
5074-        self._status.timings["sign"] = time.time() - sign_started
5075-
5076-        verification_key = pubkey.serialize()
5077-
5078-        final_shares = {}
5079-        for shnum in range(self.total_shares):
5080-            final_share = pack_share(prefix,
5081-                                     verification_key,
5082-                                     signature,
5083-                                     share_hash_chain[shnum],
5084-                                     block_hash_trees[shnum],
5085-                                     all_shares[shnum],
5086-                                     encprivkey)
5087-            final_shares[shnum] = final_share
5088-        elapsed = time.time() - started
5089-        self._status.timings["pack"] = elapsed
5090-        self.shares = final_shares
5091-        self.root_hash = root_hash
5092-
5093-        # we also need to build up the version identifier for what we're
5094-        # pushing. Extract the offsets from one of our shares.
5095-        assert final_shares
5096-        offsets = unpack_header(final_shares.values()[0])[-1]
5097-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
5098-        verinfo = (self._new_seqnum, root_hash, self.salt,
5099-                   self.segment_size, len(self.newdata),
5100-                   self.required_shares, self.total_shares,
5101-                   prefix, offsets_tuple)
5102-        self.versioninfo = verinfo
5103-
5104-
5105-
5106-    def _send_shares(self, needed):
5107-        self.log("_send_shares")
5108-
5109-        # we're finally ready to send out our shares. If we encounter any
5110-        # surprises here, it's because somebody else is writing at the same
5111-        # time. (Note: in the future, when we remove the _query_peers() step
5112-        # and instead speculate about [or remember] which shares are where,
5113-        # surprises here are *not* indications of UncoordinatedWriteError,
5114-        # and we'll need to respond to them more gracefully.)
5115-
5116-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
5117-        # organize it by peerid.
5118-
5119-        peermap = DictOfSets()
5120-        for (peerid, shnum) in needed:
5121-            peermap.add(peerid, shnum)
5122-
5123-        # the next thing is to build up a bunch of test vectors. The
5124-        # semantics of Publish are that we perform the operation if the world
5125-        # hasn't changed since the ServerMap was constructed (more or less).
5126-        # For every share we're trying to place, we create a test vector that
5127-        # tests to see if the server*share still corresponds to the
5128-        # map.
5129-
5130-        all_tw_vectors = {} # maps peerid to tw_vectors
5131-        sm = self._servermap.servermap
5132-
5133-        for key in needed:
5134-            (peerid, shnum) = key
5135-
5136-            if key in sm:
5137-                # an old version of that share already exists on the
5138-                # server, according to our servermap. We will create a
5139-                # request that attempts to replace it.
5140-                old_versionid, old_timestamp = sm[key]
5141-                (old_seqnum, old_root_hash, old_salt, old_segsize,
5142-                 old_datalength, old_k, old_N, old_prefix,
5143-                 old_offsets_tuple) = old_versionid
5144-                old_checkstring = pack_checkstring(old_seqnum,
5145-                                                   old_root_hash,
5146-                                                   old_salt)
5147-                testv = (0, len(old_checkstring), "eq", old_checkstring)
5148-
5149-            elif key in self.bad_share_checkstrings:
5150-                old_checkstring = self.bad_share_checkstrings[key]
5151-                testv = (0, len(old_checkstring), "eq", old_checkstring)
5152-
5153-            else:
5154-                # add a testv that requires the share not exist
5155-
5156-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
5157-                # constraints are handled. If the same object is referenced
5158-                # multiple times inside the arguments, foolscap emits a
5159-                # 'reference' token instead of a distinct copy of the
5160-                # argument. The bug is that these 'reference' tokens are not
5161-                # accepted by the inbound constraint code. To work around
5162-                # this, we need to prevent python from interning the
5163-                # (constant) tuple, by creating a new copy of this vector
5164-                # each time.
5165-
5166-                # This bug is fixed in foolscap-0.2.6, and even though this
5167-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
5168-                # supposed to be able to interoperate with older versions of
5169-                # Tahoe which are allowed to use older versions of foolscap,
5170-                # including foolscap-0.2.5 . In addition, I've seen other
5171-                # foolscap problems triggered by 'reference' tokens (see #541
5172-                # for details). So we must keep this workaround in place.
5173-
5174-                #testv = (0, 1, 'eq', "")
5175-                testv = tuple([0, 1, 'eq', ""])
5176-
5177-            testvs = [testv]
5178-            # the write vector is simply the share
5179-            writev = [(0, self.shares[shnum])]
5180-
5181-            if peerid not in all_tw_vectors:
5182-                all_tw_vectors[peerid] = {}
5183-                # maps shnum to (testvs, writevs, new_length)
5184-            assert shnum not in all_tw_vectors[peerid]
5185-
5186-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
5187-
5188-        # we read the checkstring back from each share, however we only use
5189-        # it to detect whether there was a new share that we didn't know
5190-        # about. The success or failure of the write will tell us whether
5191-        # there was a collision or not. If there is a collision, the first
5192-        # thing we'll do is update the servermap, which will find out what
5193-        # happened. We could conceivably reduce a roundtrip by using the
5194-        # readv checkstring to populate the servermap, but really we'd have
5195-        # to read enough data to validate the signatures too, so it wouldn't
5196-        # be an overall win.
5197-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
5198-
5199-        # ok, send the messages!
5200-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
5201-        started = time.time()
5202-        for (peerid, tw_vectors) in all_tw_vectors.items():
5203-
5204-            write_enabler = self._node.get_write_enabler(peerid)
5205-            renew_secret = self._node.get_renewal_secret(peerid)
5206-            cancel_secret = self._node.get_cancel_secret(peerid)
5207-            secrets = (write_enabler, renew_secret, cancel_secret)
5208-            shnums = tw_vectors.keys()
5209-
5210-            for shnum in shnums:
5211-                self.outstanding.add( (peerid, shnum) )
5212+        elapsed = now - started
5213 
5214hunk ./src/allmydata/mutable/publish.py 977
5215-            d = self._do_testreadwrite(peerid, secrets,
5216-                                       tw_vectors, read_vector)
5217-            d.addCallbacks(self._got_write_answer, self._got_write_error,
5218-                           callbackArgs=(peerid, shnums, started),
5219-                           errbackArgs=(peerid, shnums, started))
5220-            # tolerate immediate errback, like with DeadReferenceError
5221-            d.addBoth(fireEventually)
5222-            d.addCallback(self.loop)
5223-            d.addErrback(self._fatal_error)
5224+        self._status.add_per_server_time(peerid, elapsed)
5225 
5226hunk ./src/allmydata/mutable/publish.py 979
5227-        self._update_status()
5228-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
5229+        wrote, read_data = answer
5230 
5231hunk ./src/allmydata/mutable/publish.py 981
5232-    def _do_testreadwrite(self, peerid, secrets,
5233-                          tw_vectors, read_vector):
5234-        storage_index = self._storage_index
5235-        ss = self.connections[peerid]
5236+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
5237 
5238hunk ./src/allmydata/mutable/publish.py 983
5239-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
5240-        d = ss.callRemote("slot_testv_and_readv_and_writev",
5241-                          storage_index,
5242-                          secrets,
5243-                          tw_vectors,
5244-                          read_vector)
5245-        return d
5246+        # We need to remove from surprise_shares any shares that we
5247+        # know we are also writing to that peer via other writers.
5248 
5249hunk ./src/allmydata/mutable/publish.py 986
5250-    def _got_write_answer(self, answer, peerid, shnums, started):
5251-        lp = self.log("_got_write_answer from %s" %
5252-                      idlib.shortnodeid_b2a(peerid))
5253-        for shnum in shnums:
5254-            self.outstanding.discard( (peerid, shnum) )
5255+        # TODO: Precompute this.
5256+        known_shnums = [x.shnum for x in self.writers.values()
5257+                        if x.peerid == peerid]
5258+        surprise_shares -= set(known_shnums)
5259+        self.log("found the following surprise shares: %s" %
5260+                 str(surprise_shares))
5261 
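For example (hypothetical share numbers): if this writer placed share 3, our other writers are sending shares 3 and 7 to the same peer, and the readv answer reports shares {3, 7, 9}, then only share 9 is a genuine surprise:

    read_data = {3: ["chk3"], 7: ["chk7"], 9: ["chk9"]}  # shnum -> [checkstring]
    surprise = set(read_data.keys()) - set([3])          # drop this writer's share
    surprise -= set([3, 7])                              # drop our other writers' shares
    assert surprise == set([9])
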
5262hunk ./src/allmydata/mutable/publish.py 993
5263-        now = time.time()
5264-        elapsed = now - started
5265-        self._status.add_per_server_time(peerid, elapsed)
5266-
5267-        wrote, read_data = answer
5268-
5269-        surprise_shares = set(read_data.keys()) - set(shnums)
5270+        # Now surprise_shares contains all of the shares that we did
5271+        # not expect to be there.
5272 
5273         surprised = False
5274         for shnum in surprise_shares:
5275hunk ./src/allmydata/mutable/publish.py 1000
5276             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
5277             checkstring = read_data[shnum][0]
5278-            their_version_info = unpack_checkstring(checkstring)
5279-            if their_version_info == self._new_version_info:
5280+            # What we want to do here is to see if their (seqnum,
5281+            # roothash, salt) is the same as our (seqnum, roothash,
5282+            # salt), or the equivalent for MDMF. The best way to do this
5283+            # is to store a packed representation of our checkstring
5284+            # somewhere, then not bother unpacking the other
5285+            # checkstring.
5286+            if checkstring == self._checkstring:
5287                 # they have the right share, somehow
5288 
5289                 if (peerid,shnum) in self.goal:
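The packed-checkstring comparison suggested in the comment above works because the checkstring has a fixed layout. As an illustration (the struct format shown is the SDMF one; treat the sketch as an assumption, not a quote from this patch):

    import struct

    def pack_checkstring(seqnum, root_hash, IV):
        # 1-byte version, 8-byte big-endian seqnum, 32-byte root hash,
        # 16-byte salt/IV. Comparing two packed strings is equivalent to
        # comparing the unpacked (seqnum, root_hash, salt) tuples, with
        # no unpacking needed.
        return struct.pack(">BQ32s16s", 0, seqnum, root_hash, IV)
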
5290hunk ./src/allmydata/mutable/publish.py 1085
5291             self.log("our testv failed, so the write did not happen",
5292                      parent=lp, level=log.WEIRD, umid="8sc26g")
5293             self.surprised = True
5294-            self.bad_peers.add(peerid) # don't ask them again
5295+            self.bad_peers.add(writer) # don't ask them again
5296             # use the checkstring to add information to the log message
5297             for (shnum,readv) in read_data.items():
5298                 checkstring = readv[0]
5299hunk ./src/allmydata/mutable/publish.py 1107
5300                 # if expected_version==None, then we didn't expect to see a
5301                 # share on that peer, and the 'surprise_shares' clause above
5302                 # will have logged it.
5303-            # self.loop() will take care of finding new homes
5304             return
5305 
5306hunk ./src/allmydata/mutable/publish.py 1109
5307-        for shnum in shnums:
5308-            self.placed.add( (peerid, shnum) )
5309-            # and update the servermap
5310-            self._servermap.add_new_share(peerid, shnum,
5311+        # and update the servermap
5312+        # self.versioninfo is set during the last phase of publishing.
5313+        # If we get there, we know that responses correspond to placed
5314+        # shares, and can safely execute these statements.
5315+        if self.versioninfo:
5316+            self.log("wrote successfully: adding new share to servermap")
5317+            self._servermap.add_new_share(peerid, writer.shnum,
5318                                           self.versioninfo, started)
5319hunk ./src/allmydata/mutable/publish.py 1117
5320-
5321-        # self.loop() will take care of checking to see if we're done
5322+            self.placed.add( (peerid, writer.shnum) )
5323+        self._update_status()
5324+        # the next method in the deferred chain will check to see if
5325+        # we're done and successful.
5326         return
5327 
5328hunk ./src/allmydata/mutable/publish.py 1123
5329-    def _got_write_error(self, f, peerid, shnums, started):
5330-        for shnum in shnums:
5331-            self.outstanding.discard( (peerid, shnum) )
5332-        self.bad_peers.add(peerid)
5333-        if self._first_write_error is None:
5334-            self._first_write_error = f
5335-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
5336-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
5337-                 failure=f,
5338-                 level=log.UNUSUAL)
5339-        # self.loop() will take care of checking to see if we're done
5340-        return
5341-
5342 
5343     def _done(self, res):
5344         if not self._running:
5345hunk ./src/allmydata/mutable/publish.py 1130
5346         self._running = False
5347         now = time.time()
5348         self._status.timings["total"] = now - self._started
5349+
5350+        elapsed = now - self._started_pushing
5351+        self._status.timings['push'] = elapsed
5352+
5353         self._status.set_active(False)
5354hunk ./src/allmydata/mutable/publish.py 1135
5355-        if isinstance(res, failure.Failure):
5356-            self.log("Publish done, with failure", failure=res,
5357-                     level=log.WEIRD, umid="nRsR9Q")
5358-            self._status.set_status("Failed")
5359-        elif self.surprised:
5360-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
5361-            self._status.set_status("UncoordinatedWriteError")
5362-            # deliver a failure
5363-            res = failure.Failure(UncoordinatedWriteError())
5364-            # TODO: recovery
5365-        else:
5366-            self.log("Publish done, success")
5367-            self._status.set_status("Finished")
5368-            self._status.set_progress(1.0)
5369+        self.log("Publish done, success")
5370+        self._status.set_status("Finished")
5371+        self._status.set_progress(1.0)
5372         eventually(self.done_deferred.callback, res)
5373 
5374hunk ./src/allmydata/mutable/publish.py 1140
5375+    def _failure(self):
5376+
5377+        if not self.surprised:
5378+            # We ran out of servers
5379+            self.log("Publish ran out of good servers, "
5380+                     "last failure was: %s" % str(self._last_failure))
5381+            e = NotEnoughServersError("Ran out of non-bad servers, "
5382+                                      "last failure was %s" %
5383+                                      str(self._last_failure))
5384+        else:
5385+            # We ran into shares that we didn't recognize, which means
5386+            # that we need to return an UncoordinatedWriteError.
5387+            self.log("Publish failed with UncoordinatedWriteError")
5388+            e = UncoordinatedWriteError()
5389+        f = failure.Failure(e)
5390+        eventually(self.done_deferred.callback, f)
5391+
5392+
5393+class MutableFileHandle:
5394+    """
5395+    I am a mutable uploadable built around a filehandle-like object,
5396+    usually either a StringIO instance or a handle to an actual file.
5397+    """
5398+    implements(IMutableUploadable)
5399+
5400+    def __init__(self, filehandle):
5401+        # The filehandle is defined as a generally file-like object that
5402+        # has these two methods. We don't care beyond that.
5403+        assert hasattr(filehandle, "read")
5404+        assert hasattr(filehandle, "close")
5405+
5406+        self._filehandle = filehandle
5407+        # We must start reading at the beginning of the file, or we risk
5408+        # encountering errors when the data read does not match the size
5409+        # reported to the uploader.
5410+        self._filehandle.seek(0)
5411+
5412+        # We have not yet read anything, so our position is 0.
5413+        self._marker = 0
5414+
5415+
5416+    def get_size(self):
5417+        """
5418+        I return the amount of data in my filehandle.
5419+        """
5420+        if not hasattr(self, "_size"):
5421+            old_position = self._filehandle.tell()
5422+            # Seek to the end of the file by seeking 0 bytes from the
5423+            # file's end
5424+            self._filehandle.seek(0, 2) # 2 == os.SEEK_END in 2.5+
5425+            self._size = self._filehandle.tell()
5426+            # Restore the previous position, in case this was called
5427+            # after a read.
5428+            self._filehandle.seek(old_position)
5429+            assert self._filehandle.tell() == old_position
5430+
5431+        assert hasattr(self, "_size")
5432+        return self._size
5433+
5434+
5435+    def pos(self):
5436+        """
5437+        I return the position of my read marker -- i.e., how much data I
5438+        have already read and returned to callers.
5439+        """
5440+        return self._marker
5441+
5442+
5443+    def read(self, length):
5444+        """
5445+        I return some data (up to length bytes) from my filehandle.
5446+
5447+        In most cases, I return length bytes, but sometimes I won't --
5448+        for example, if I am asked to read beyond the end of a file, or
5449+        an error occurs.
5450+        """
5451+        results = self._filehandle.read(length)
5452+        self._marker += len(results)
5453+        return [results]
5454+
5455+
5456+    def close(self):
5457+        """
5458+        I close the underlying filehandle. Any further operations on the
5459+        filehandle fail at this point.
5460+        """
5461+        self._filehandle.close()
5462+
5463+
5464+class MutableData(MutableFileHandle):
5465+    """
5466+    I am a mutable uploadable built around a string, which I wrap in
5467+    a StringIO and treat as a filehandle.
5468+    """
5469+
5470+    def __init__(self, s):
5471+        # Take a string and return a file-like uploadable.
5472+        assert isinstance(s, str)
5473+
5474+        MutableFileHandle.__init__(self, StringIO(s))
5475+
5476+
5477+class TransformingUploadable:
5478+    """
5479+    I am an IMutableUploadable that wraps another IMutableUploadable,
5480+    and some segments that are already on the grid. When I am called to
5481+    read, I handle merging of boundary segments.
5482+    """
5483+    implements(IMutableUploadable)
5484+
5485+
5486+    def __init__(self, data, offset, segment_size, start, end):
5487+        assert IMutableUploadable.providedBy(data)
5488+
5489+        self._newdata = data
5490+        self._offset = offset
5491+        self._segment_size = segment_size
5492+        self._start = start
5493+        self._end = end
5494+
5495+        self._read_marker = 0
5496+
5497+        self._first_segment_offset = offset % segment_size
5498+
5499+        num = self.log("TransformingUploadable: starting", parent=None)
5500+        self._log_number = num
5501+        self.log("got fso: %d" % self._first_segment_offset)
5502+        self.log("got offset: %d" % self._offset)
5503+
5504+
5505+    def log(self, *args, **kwargs):
5506+        if 'parent' not in kwargs:
5507+            kwargs['parent'] = self._log_number
5508+        if "facility" not in kwargs:
5509+            kwargs["facility"] = "tahoe.mutable.transforminguploadable"
5510+        return log.msg(*args, **kwargs)
5511+
5512+
5513+    def get_size(self):
5514+        return self._offset + self._newdata.get_size()
5515+
5516+
5517+    def read(self, length):
5518+        # We can get data from 3 sources here.
5519+        #   1. The first of the segments provided to us.
5520+        #   2. The data that we're replacing things with.
5521+        #   3. The last of the segments provided to us.
5522+
5523+        # Are we still returning data from the first of those segments?
5524+        self.log("reading %d bytes" % length)
5525+
5526+        old_start_data = ""
5527+        old_data_length = self._first_segment_offset - self._read_marker
5528+        if old_data_length > 0:
5529+            if old_data_length > length:
5530+                old_data_length = length
5531+            self.log("returning %d bytes of old start data" % old_data_length)
5532+
5533+            old_data_end = old_data_length + self._read_marker
5534+            old_start_data = self._start[self._read_marker:old_data_end]
5535+            length -= old_data_length
5536+        else:
5537+            # otherwise calculations later get screwed up.
5538+            old_data_length = 0
5539+
5540+        # Is there enough new data to satisfy this read? If not, we need
5541+        # to pad the end of the data with data from our last segment.
5542+        old_end_length = length - \
5543+            (self._newdata.get_size() - self._newdata.pos())
5544+        old_end_data = ""
5545+        if old_end_length > 0:
5546+            self.log("reading %d bytes of old end data" % old_end_length)
5547+
5548+            # TODO: We're not explicitly checking for tail segment size
5549+            # here. Is that a problem?
5550+            old_data_offset = (length - old_end_length + \
5551+                               old_data_length) % self._segment_size
5552+            self.log("reading at offset %d" % old_data_offset)
5553+            old_end = old_data_offset + old_end_length
5554+            old_end_data = self._end[old_data_offset:old_end]
5555+            length -= old_end_length
5556+            assert length == self._newdata.get_size() - self._newdata.pos()
5557+
5558+        self.log("reading %d bytes of new data" % length)
5559+        new_data = self._newdata.read(length)
5560+        new_data = "".join(new_data)
5561+
5562+        self._read_marker += len(old_start_data + new_data + old_end_data)
5563+
5564+        return old_start_data + new_data + old_end_data
5565 
5566hunk ./src/allmydata/mutable/publish.py 1331
5567+    def close(self):
5568+        pass
5569}
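Before moving on to the retrieval patch, it may help to see the new uploadable wrappers in action. An illustrative walk-through (all values invented; note that MutableFileHandle.read returns a list of strings, while TransformingUploadable.read returns a plain string):

    u = MutableData("new!")            # wraps the string in a StringIO
    assert u.get_size() == 4
    first = "".join(u.read(2))         # "ne"

    # Replace 3 bytes at offset 6 of a file with segment_size 4. The old
    # segments touching the write are "ABCD" (start) and "EFGH" (end).
    t = TransformingUploadable(MutableData("XYZ"), 6, 4, "ABCD", "EFGH")
    t.read(2)   # -> "AB": old start data, since 6 % 4 == 2
    t.read(3)   # -> "XYZ": the replacement bytes
    t.read(3)   # -> "EFG": old end data used to pad out the read
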
5570[mutable/retrieve.py: Modify the retrieval process to support MDMF
5571Kevan Carstensen <kevan@isnotajoke.com>**20100811233125
5572 Ignore-this: bb5f95e1d0e8bb734d43d5ed1550ce
5573 
5574 The logic behind a mutable file download had to be adapted to work with
5575 segmented mutable files; this patch performs those adaptations. It also
5576 exposes some decoding and decrypting functionality to make partial-file
5577 updates a little easier, and supports efficient random-access downloads
5578 of parts of an MDMF file.
5579] {
5580hunk ./src/allmydata/mutable/retrieve.py 7
5581 from zope.interface import implements
5582 from twisted.internet import defer
5583 from twisted.python import failure
5584+from twisted.internet.interfaces import IPushProducer, IConsumer
5585 from foolscap.api import DeadReferenceError, eventually, fireEventually
5586hunk ./src/allmydata/mutable/retrieve.py 9
5587-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
5588-from allmydata.util import hashutil, idlib, log
5589+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
5590+                                 MDMF_VERSION, SDMF_VERSION
5591+from allmydata.util import hashutil, idlib, log, mathutil
5592 from allmydata import hashtree, codec
5593 from allmydata.storage.server import si_b2a
5594 from pycryptopp.cipher.aes import AES
5595hunk ./src/allmydata/mutable/retrieve.py 18
5596 from pycryptopp.publickey import rsa
5597 
5598 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
5599-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
5600+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
5601+                                     MDMFSlotReadProxy
5602 
5603 class RetrieveStatus:
5604     implements(IRetrieveStatus)
5605hunk ./src/allmydata/mutable/retrieve.py 86
5606     # times, and each will have a separate response chain. However the
5607     # Retrieve object will remain tied to a specific version of the file, and
5608     # will use a single ServerMap instance.
5609+    implements(IPushProducer)
5610 
5611hunk ./src/allmydata/mutable/retrieve.py 88
5612-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
5613+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
5614+                 verify=False):
5615         self._node = filenode
5616         assert self._node.get_pubkey()
5617         self._storage_index = filenode.get_storage_index()
5618hunk ./src/allmydata/mutable/retrieve.py 107
5619         self.verinfo = verinfo
5620         # during repair, we may be called upon to grab the private key, since
5621         # it wasn't picked up during a verify=False checker run, and we'll
5622-        # need it for repair to generate the a new version.
5623-        self._need_privkey = fetch_privkey
5624-        if self._node.get_privkey():
5625+        # need it for repair to generate a new version.
5626+        self._need_privkey = fetch_privkey or verify
5627+        if self._node.get_privkey() and not verify:
5628             self._need_privkey = False
5629 
5630hunk ./src/allmydata/mutable/retrieve.py 112
5631+        if self._need_privkey:
5632+            # TODO: Evaluate the need for this. We'll use it if we want
5633+            # to limit how many queries are on the wire for the privkey
5634+            # at once.
5635+            self._privkey_query_markers = [] # one Marker for each time we've
5636+                                             # tried to get the privkey.
5637+
5638+        # verify means that we are using the downloader logic to verify all
5639+        # of our shares. This tells the downloader a few things.
5640+        #
5641+        # 1. We need to download all of the shares.
5642+        # 2. We don't need to decode or decrypt the shares, since our
5643+        #    caller doesn't care about the plaintext, only the
5644+        #    information about which shares are or are not valid.
5645+        # 3. When we are validating readers, we need to validate the
5646+        #    signature on the prefix. (Do we need to? We already do
5647+        #    this in the servermap update.)
5648+        self._verify = bool(verify)
5651+
5652         self._status = RetrieveStatus()
5653         self._status.set_storage_index(self._storage_index)
5654         self._status.set_helper(False)
5655hunk ./src/allmydata/mutable/retrieve.py 142
5656          offsets_tuple) = self.verinfo
5657         self._status.set_size(datalength)
5658         self._status.set_encoding(k, N)
5659+        self.readers = {}
5660+        self._paused = False
5661+        self._pause_deferred = None
5662+        self._offset = None
5663+        self._read_length = None
5664+        self.log("got seqnum %d" % self.verinfo[0])
5665+
5666 
5667     def get_status(self):
5668         return self._status
5669hunk ./src/allmydata/mutable/retrieve.py 160
5670             kwargs["facility"] = "tahoe.mutable.retrieve"
5671         return log.msg(*args, **kwargs)
5672 
5673-    def download(self):
5674+
5675+    ###################
5676+    # IPushProducer
5677+
5678+    def pauseProducing(self):
5679+        """
5680+        I am called by my download target if we have produced too much
5681+        data for it to handle. I make the downloader stop producing new
5682+        data until my resumeProducing method is called.
5683+        """
5684+        if self._paused:
5685+            return
5686+
5687+        # fired when the download is unpaused.
5688+        self._old_status = self._status.get_status()
5689+        self._status.set_status("Paused")
5690+
5691+        self._pause_deferred = defer.Deferred()
5692+        self._paused = True
5693+
5694+
5695+    def resumeProducing(self):
5696+        """
5697+        I am called by my download target once it is ready to begin
5698+        receiving data again.
5699+        """
5700+        if not self._paused:
5701+            return
5702+
5703+        self._paused = False
5704+        p = self._pause_deferred
5705+        self._pause_deferred = None
5706+        self._status.set_status(self._old_status)
5707+
5708+        eventually(p.callback, None)
5709+
5710+
5711+    def _check_for_paused(self, res):
5712+        """
5713+        I am called just before a write to the consumer. I return a
5714+        Deferred that eventually fires with the data that is to be
5715+        written to the consumer. If the download has not been paused,
5716+        the Deferred fires immediately. Otherwise, the Deferred fires
5717+        when the downloader is unpaused.
5718+        """
5719+        if self._paused:
5720+            d = defer.Deferred()
5721+            self._pause_deferred.addCallback(lambda ignored: d.callback(res))
5722+            return d
5723+        return defer.succeed(res)
5724+
5725+
5726+    def download(self, consumer=None, offset=0, size=None):
5727+        assert IConsumer.providedBy(consumer) or self._verify
5728+
5729+        if consumer:
5730+            self._consumer = consumer
5731+            # we provide IPushProducer, so streaming=True, per
5732+            # IConsumer.
5733+            self._consumer.registerProducer(self, streaming=True)
5734+
5735         self._done_deferred = defer.Deferred()
5736         self._started = time.time()
5737         self._status.set_status("Retrieving Shares")
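The producer pause gate added above is a small, reusable pattern. A self-contained sketch (assumes Twisted; the class name is chosen for illustration):

    from twisted.internet import defer

    class PauseGate:
        """I delay delivery of results while a consumer has me paused."""
        def __init__(self):
            self._paused = False
            self._pause_deferred = None

        def pauseProducing(self):
            if not self._paused:
                self._paused = True
                self._pause_deferred = defer.Deferred()

        def resumeProducing(self):
            if self._paused:
                self._paused = False
                p, self._pause_deferred = self._pause_deferred, None
                p.callback(None)

        def _check_for_paused(self, res):
            # return a Deferred that fires with res once we are unpaused
            if self._paused:
                d = defer.Deferred()
                self._pause_deferred.addCallback(lambda ign: d.callback(res))
                return d
            return defer.succeed(res)
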
5738hunk ./src/allmydata/mutable/retrieve.py 225
5739 
5740+        self._offset = offset
5741+        self._read_length = size
5742+
5743         # first, which servers can we use?
5744         versionmap = self.servermap.make_versionmap()
5745         shares = versionmap[self.verinfo]
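Since download() now takes an IConsumer, a hypothetical caller looks like the following (the Accumulator class is invented for illustration; streaming=True because Retrieve provides IPushProducer):

    from zope.interface import implements
    from twisted.internet.interfaces import IConsumer

    class Accumulator:
        implements(IConsumer)
        def __init__(self):
            self.chunks = []
        def registerProducer(self, producer, streaming):
            self.producer = producer  # may later call producer.pauseProducing()
        def unregisterProducer(self):
            self.producer = None
        def write(self, data):
            self.chunks.append(data)

    # r is a Retrieve instance; fetch 3000 bytes starting at offset 5000
    c = Accumulator()
    d = r.download(consumer=c, offset=5000, size=3000)
    d.addCallback(lambda ign: "".join(c.chunks))
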
5746hunk ./src/allmydata/mutable/retrieve.py 235
5747         self.remaining_sharemap = DictOfSets()
5748         for (shnum, peerid, timestamp) in shares:
5749             self.remaining_sharemap.add(shnum, peerid)
5750+            # If the servermap update fetched anything, it fetched at least 1
5751+            # KiB, so we ask for that much.
5752+            # TODO: Change the cache methods to allow us to fetch all of the
5753+            # data that they have, then change this method to do that.
5754+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
5755+                                                               shnum,
5756+                                                               0,
5757+                                                               1000)
5758+            ss = self.servermap.connections[peerid]
5759+            reader = MDMFSlotReadProxy(ss,
5760+                                       self._storage_index,
5761+                                       shnum,
5762+                                       any_cache)
5763+            reader.peerid = peerid
5764+            self.readers[shnum] = reader
5765+
5766 
5767         self.shares = {} # maps shnum to validated blocks
5768hunk ./src/allmydata/mutable/retrieve.py 253
5769+        self._active_readers = [] # list of active readers for this dl.
5770+        self._validated_readers = set() # set of readers that we have
5771+                                        # validated the prefix of
5772+        self._block_hash_trees = {} # shnum => hashtree
5773 
5774         # how many shares do we need?
5775hunk ./src/allmydata/mutable/retrieve.py 259
5776-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
5777+        (seqnum,
5778+         root_hash,
5779+         IV,
5780+         segsize,
5781+         datalength,
5782+         k,
5783+         N,
5784+         prefix,
5785          offsets_tuple) = self.verinfo
5786hunk ./src/allmydata/mutable/retrieve.py 268
5787-        assert len(self.remaining_sharemap) >= k
5788-        # we start with the lowest shnums we have available, since FEC is
5789-        # faster if we're using "primary shares"
5790-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
5791-        for shnum in self.active_shnums:
5792-            # we use an arbitrary peer who has the share. If shares are
5793-            # doubled up (more than one share per peer), we could make this
5794-            # run faster by spreading the load among multiple peers. But the
5795-            # algorithm to do that is more complicated than I want to write
5796-            # right now, and a well-provisioned grid shouldn't have multiple
5797-            # shares per peer.
5798-            peerid = list(self.remaining_sharemap[shnum])[0]
5799-            self.get_data(shnum, peerid)
5800 
5801hunk ./src/allmydata/mutable/retrieve.py 269
5802-        # control flow beyond this point: state machine. Receiving responses
5803-        # from queries is the input. We might send out more queries, or we
5804-        # might produce a result.
5805 
5806hunk ./src/allmydata/mutable/retrieve.py 270
5807+        # We need one share hash tree for the entire file; its leaves
5808+        # are the roots of the block hash trees for the shares that
5809+        # comprise it, and its root is in the verinfo.
5810+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
5811+        self.share_hash_tree.set_hashes({0: root_hash})
5812+
5813+        # This will set up both the segment decoder and the tail segment
5814+        # decoder, as well as a variety of other instance variables that
5815+        # the download process will use.
5816+        self._setup_encoding_parameters()
5817+        assert len(self.remaining_sharemap) >= k
5818+
5819+        self.log("starting download")
5820+        self._paused = False
5821+        self._started_fetching = time.time()
5822+
5823+        self._add_active_peers()
5824+        # The download process beyond this is a state machine.
5825+        # _add_active_peers will select the peers that we want to use
5826+        # for the download, and then attempt to start downloading. After
5827+        # each segment, it will check for doneness, reacting to broken
5828+        # peers and corrupt shares as necessary. If it runs out of good
5829+        # peers before downloading all of the segments, _done_deferred
5830+        # will errback.  Otherwise, it will eventually callback with the
5831+        # contents of the mutable file.
5832         return self._done_deferred
5833 
5834hunk ./src/allmydata/mutable/retrieve.py 297
5835-    def get_data(self, shnum, peerid):
5836-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
5837-                 shnum=shnum,
5838-                 peerid=idlib.shortnodeid_b2a(peerid),
5839-                 level=log.NOISY)
5840-        ss = self.servermap.connections[peerid]
5841-        started = time.time()
5842-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
5843+
5844+    def decode(self, blocks_and_salts, segnum):
5845+        """
5846+        I am a helper method that the mutable file update process uses
5847+        as a shortcut to decode and decrypt the segments that it needs
5848+        to fetch in order to perform a file update. I take in a
5849+        collection of blocks and salts, and pick some of those to make a
5850+        segment with. I return the plaintext associated with that
5851+        segment.
5852+        """
5853+        # shnum => block hash tree. Unused, but _setup_encoding_parameters
5854+        # will want to set this.
5855+        # XXX: Make it so that it won't set this if we're just decoding.
5856+        self._block_hash_trees = {}
5857+        self._setup_encoding_parameters()
5858+        # This is the form expected by decode.
5859+        blocks_and_salts = blocks_and_salts.items()
5860+        blocks_and_salts = [(True, [d]) for d in blocks_and_salts]
5861+
5862+        d = self._decode_blocks(blocks_and_salts, segnum)
5863+        d.addCallback(self._decrypt_segment)
5864+        return d
5865+
5866+
5867+    def _setup_encoding_parameters(self):
5868+        """
5869+        I set up the encoding parameters, including k, n, the number
5870+        of segments associated with this file, and the segment decoder.
5871+        """
5872+        (seqnum,
5873+         root_hash,
5874+         IV,
5875+         segsize,
5876+         datalength,
5877+         k,
5878+         n,
5879+         known_prefix,
5880          offsets_tuple) = self.verinfo
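A note on decode()'s expected input, before the next hunks finish off _setup_encoding_parameters: blocks_and_salts is assumed (from the surrounding code) to map shnum to a (block, salt) tuple, which decode() repackages into the (success, [(shnum, (block, salt))]) shape that _decode_blocks expects from a DeferredList. A hypothetical call:

    blocks_and_salts = {0: ("\x00" * 33, "A" * 16),
                        1: ("\x01" * 33, "B" * 16),
                        2: ("\x02" * 33, "C" * 16)}  # k=3 blocks of one segment
    d = r.decode(blocks_and_salts, segnum=0)         # r: a Retrieve instance
    # d fires with the plaintext of segment 0
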
5881hunk ./src/allmydata/mutable/retrieve.py 335
5882-        offsets = dict(offsets_tuple)
5883+        self._required_shares = k
5884+        self._total_shares = n
5885+        self._segment_size = segsize
5886+        self._data_length = datalength
5887 
5888hunk ./src/allmydata/mutable/retrieve.py 340
5889-        # we read the checkstring, to make sure that the data we grab is from
5890-        # the right version.
5891-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
5892+        if not IV:
5893+            self._version = MDMF_VERSION
5894+        else:
5895+            self._version = SDMF_VERSION
5896 
5897hunk ./src/allmydata/mutable/retrieve.py 345
5898-        # We also read the data, and the hashes necessary to validate them
5899-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
5900-        # signature or the pubkey, since that was handled during the
5901-        # servermap phase, and we'll be comparing the share hash chain
5902-        # against the roothash that was validated back then.
5903+        if datalength and segsize:
5904+            self._num_segments = mathutil.div_ceil(datalength, segsize)
5905+            self._tail_data_size = datalength % segsize
5906+        else:
5907+            self._num_segments = 0
5908+            self._tail_data_size = 0
5909 
5910hunk ./src/allmydata/mutable/retrieve.py 352
5911-        readv.append( (offsets['share_hash_chain'],
5912-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
5913+        self._segment_decoder = codec.CRSDecoder()
5914+        self._segment_decoder.set_params(segsize, k, n)
5915 
5916hunk ./src/allmydata/mutable/retrieve.py 355
5917-        # if we need the private key (for repair), we also fetch that
5918-        if self._need_privkey:
5919-            readv.append( (offsets['enc_privkey'],
5920-                           offsets['EOF'] - offsets['enc_privkey']) )
5921+        if not self._tail_data_size:
5922+            self._tail_data_size = segsize
5923+
5924+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
5925+                                                         self._required_shares)
5926+        if self._tail_segment_size == self._segment_size:
5927+            self._tail_decoder = self._segment_decoder
5928+        else:
5929+            self._tail_decoder = codec.CRSDecoder()
5930+            self._tail_decoder.set_params(self._tail_segment_size,
5931+                                          self._required_shares,
5932+                                          self._total_shares)
5933 
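A quick numeric check of the segment arithmetic above (values assumed: a 10000-byte file, 4096-byte segments, k=3):

    from allmydata.util import mathutil

    num_segments = mathutil.div_ceil(10000, 4096)        # 3 segments
    tail_data_size = 10000 % 4096                        # 1808 real tail bytes
    tail_segment_size = mathutil.next_multiple(1808, 3)  # 1809, padded up to a
                                                         # multiple of k
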
5934hunk ./src/allmydata/mutable/retrieve.py 368
5935-        m = Marker()
5936-        self._outstanding_queries[m] = (peerid, shnum, started)
5937+        self.log("got encoding parameters: "
5938+                 "k: %d "
5939+                 "n: %d "
5940+                 "%d segments of %d bytes each (%d byte tail segment)" % \
5941+                 (k, n, self._num_segments, self._segment_size,
5942+                  self._tail_segment_size))
5943 
5944hunk ./src/allmydata/mutable/retrieve.py 375
5945-        # ask the cache first
5946-        got_from_cache = False
5947-        datavs = []
5948-        for (offset, length) in readv:
5949-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
5950-                                                            offset, length)
5951-            if data is not None:
5952-                datavs.append(data)
5953-        if len(datavs) == len(readv):
5954-            self.log("got data from cache")
5955-            got_from_cache = True
5956-            d = fireEventually({shnum: datavs})
5957-            # datavs is a dict mapping shnum to a pair of strings
5958+        for i in xrange(self._total_shares):
5959+            # So we don't have to do this later.
5960+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
5961+
5962+        # Our last task is to tell the downloader where to start and
5963+        # where to stop. We use three parameters for that:
5964+        #   - self._start_segment: the segment that we need to start
5965+        #     downloading from.
5966+        #   - self._current_segment: the next segment that we need to
5967+        #     download.
5968+        #   - self._last_segment: The last segment that we were asked to
5969+        #     download.
5970+        #
5971+        #  We say that the download is complete when
5972+        #  self._current_segment > self._last_segment. We use
5973+        #  self._start_segment and self._last_segment to know when to
5974+        #  strip things off of segments, and how much to strip.
5975+        if self._offset:
5976+            self.log("got offset: %d" % self._offset)
5977+            # our start segment is the first segment containing the
5978+            # offset we were given.
5979+            start = mathutil.div_ceil(self._offset,
5980+                                      self._segment_size)
5981+            # this gets us the first segment after self._offset. Then
5982+            # our start segment is the one before it.
5983+            start -= 1
5984+
5985+            assert start < self._num_segments
5986+            self._start_segment = start
5987+            self.log("got start segment: %d" % self._start_segment)
5988         else:
5989hunk ./src/allmydata/mutable/retrieve.py 406
5990-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
5991-        self.remaining_sharemap.discard(shnum, peerid)
5992+            self._start_segment = 0
5993 
5994hunk ./src/allmydata/mutable/retrieve.py 408
5995-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
5996-        d.addErrback(self._query_failed, m, peerid)
5997-        # errors that aren't handled by _query_failed (and errors caused by
5998-        # _query_failed) get logged, but we still want to check for doneness.
5999-        def _oops(f):
6000-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
6001-                     shnum=shnum,
6002-                     peerid=idlib.shortnodeid_b2a(peerid),
6003-                     failure=f,
6004-                     level=log.WEIRD, umid="W0xnQA")
6005-        d.addErrback(_oops)
6006-        d.addBoth(self._check_for_done)
6007-        # any error during _check_for_done means the download fails. If the
6008-        # download is successful, _check_for_done will fire _done by itself.
6009-        d.addErrback(self._done)
6010-        d.addErrback(log.err)
6011-        return d # purely for testing convenience
6012 
6013hunk ./src/allmydata/mutable/retrieve.py 409
6014-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
6015-        # isolate the callRemote to a separate method, so tests can subclass
6016-        # Publish and override it
6017-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
6018-        return d
6019+        if self._read_length:
6020+            # our end segment is the last segment containing part of the
6021+            # data that we were asked to read.
6022+            self.log("got read length %d" % self._read_length)
6023+            end_data = self._offset + self._read_length
6024+            end = mathutil.div_ceil(end_data,
6025+                                    self._segment_size)
6026+            end -= 1
6027+            assert end < self._num_segments
6028+            self._last_segment = end
6029+            self.log("got end segment: %d" % self._last_segment)
6030+        else:
6031+            self._last_segment = self._num_segments - 1
6032 
6033hunk ./src/allmydata/mutable/retrieve.py 423
6034-    def remove_peer(self, peerid):
6035-        for shnum in list(self.remaining_sharemap.keys()):
6036-            self.remaining_sharemap.discard(shnum, peerid)
6037+        self._current_segment = self._start_segment
6038 
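To make the bookkeeping concrete, an illustrative computation (assumed values: segsize=4096, offset=5000, read length 3000):

    from allmydata.util import mathutil

    start = mathutil.div_ceil(5000, 4096) - 1  # 1: segment holding byte 5000
    end_data = 5000 + 3000                     # read ends just before byte 8000
    last = mathutil.div_ceil(8000, 4096) - 1   # 1: segment holding the last byte

Only segment 1 is downloaded; the unwanted bytes at each end are stripped off after decoding.
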
6039hunk ./src/allmydata/mutable/retrieve.py 425
6040-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
6041-        now = time.time()
6042-        elapsed = now - started
6043-        if not got_from_cache:
6044-            self._status.add_fetch_timing(peerid, elapsed)
6045-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
6046-                 shares=len(datavs),
6047-                 peerid=idlib.shortnodeid_b2a(peerid),
6048-                 level=log.NOISY)
6049-        self._outstanding_queries.pop(marker, None)
6050-        if not self._running:
6051-            return
6052+    def _add_active_peers(self):
6053+        """
6054+        I populate self._active_readers with enough active readers to
6055+        retrieve the contents of this mutable file. I am called before
6056+        downloading starts, and (eventually) after each validation
6057+        error, connection error, or other problem in the download.
6058+        """
6059+        # TODO: It would be cool to investigate other heuristics for
6060+        # reader selection. For instance, the cost (in time the user
6061+        # spends waiting for their file) of selecting a really slow peer
6062+        # that happens to have a primary share is probably more than
6063+        # selecting a really fast peer that doesn't have a primary
6064+        # share. Maybe the servermap could be extended to provide this
6065+        # information; it could keep track of latency information while
6066+        # it gathers more important data, and then this routine could
6067+        # use that to select active readers.
6068+        #
6069+        # (these and other questions would be easier to answer with a
6070+        #  robust, configurable tahoe-lafs simulator, which modeled node
6071+        #  failures, differences in node speed, and other characteristics
6072+        #  that we expect storage servers to have.  You could have
6073+        #  presets for really stable grids (like allmydata.com),
6074+        #  friendnets, make it easy to configure your own settings, and
6075+        #  then simulate the effect of big changes on these use cases
6076+        #  instead of just reasoning about what the effect might be. Out
6077+        #  of scope for MDMF, though.)
6078 
6079hunk ./src/allmydata/mutable/retrieve.py 452
6080-        # note that we only ask for a single share per query, so we only
6081-        # expect a single share back. On the other hand, we use the extra
6082-        # shares if we get them.. seems better than an assert().
6083+        # We need at least self._required_shares readers to download a
6084+        # segment.
6085+        if self._verify:
6086+            needed = self._total_shares
6087+        else:
6088+            needed = self._required_shares - len(self._active_readers)
6089+        # XXX: Why don't format= log messages work here?
6090+        self.log("adding %d peers to the active peers list" % needed)
6091 
6092hunk ./src/allmydata/mutable/retrieve.py 461
6093-        for shnum,datav in datavs.items():
6094-            (prefix, hash_and_data) = datav[:2]
6095-            try:
6096-                self._got_results_one_share(shnum, peerid,
6097-                                            prefix, hash_and_data)
6098-            except CorruptShareError, e:
6099-                # log it and give the other shares a chance to be processed
6100-                f = failure.Failure()
6101-                self.log(format="bad share: %(f_value)s",
6102-                         f_value=str(f.value), failure=f,
6103-                         level=log.WEIRD, umid="7fzWZw")
6104-                self.notify_server_corruption(peerid, shnum, str(e))
6105-                self.remove_peer(peerid)
6106-                self.servermap.mark_bad_share(peerid, shnum, prefix)
6107-                self._bad_shares.add( (peerid, shnum) )
6108-                self._status.problems[peerid] = f
6109-                self._last_failure = f
6110-                pass
6111-            if self._need_privkey and len(datav) > 2:
6112-                lp = None
6113-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
6114-        # all done!
6115+        # We favor lower numbered shares, since FEC is faster with
6116+        # primary shares than with other shares, and lower-numbered
6117+        # shares are more likely to be primary than higher numbered
6118+        # shares.
6119+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
6120+        # We shouldn't consider adding shares that we already have;
6121+        # doing so would cause problems later.
6122+        active_shnums -= set([reader.shnum for reader in self._active_readers])
6123+        active_shnums = sorted(active_shnums)[:needed]
6124+        if len(active_shnums) < needed and not self._verify:
6125+            # We don't have enough readers to retrieve the file; fail.
6126+            return self._failed()
6127 
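A concrete run of the selection logic (share numbers invented): with k=3, shares {0, 2, 5, 7} remaining, and share 2 already active, we need two more readers and pick the lowest-numbered candidates:

    remaining = set([0, 2, 5, 7])       # shnums with known copies
    already_active = set([2])
    needed = 3 - len(already_active)    # k=3, one reader already active
    candidates = sorted(remaining - already_active)
    chosen = candidates[:needed]        # [0, 5]: favor primary (low) shares
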
6128hunk ./src/allmydata/mutable/retrieve.py 474
6129-    def notify_server_corruption(self, peerid, shnum, reason):
6130-        ss = self.servermap.connections[peerid]
6131-        ss.callRemoteOnly("advise_corrupt_share",
6132-                          "mutable", self._storage_index, shnum, reason)
6133+        for shnum in active_shnums:
6134+            self._active_readers.append(self.readers[shnum])
6135+            self.log("added reader for share %d" % shnum)
6136+        assert len(self._active_readers) >= self._required_shares
6137+        # Conceptually, this is part of the _add_active_peers step. It
6138+        # validates the prefixes of newly added readers to make sure
6139+        # that they match what we are expecting for self.verinfo. If
6140+        # validation is successful, _validate_active_prefixes will call
6141+        # _download_current_segment for us. If validation is
6142+        # unsuccessful, then _validate_active_prefixes will remove the
6143+        # peer and call _add_active_peers again, where we will attempt
6144+        # to rectify the problem by choosing another peer.
6145+        return self._validate_active_prefixes()
6146 
6147hunk ./src/allmydata/mutable/retrieve.py 488
6148-    def _got_results_one_share(self, shnum, peerid,
6149-                               got_prefix, got_hash_and_data):
6150-        self.log("_got_results: got shnum #%d from peerid %s"
6151-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
6152-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
6153-         offsets_tuple) = self.verinfo
6154-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
6155-        if got_prefix != prefix:
6156-            msg = "someone wrote to the data since we read the servermap: prefix changed"
6157-            raise UncoordinatedWriteError(msg)
6158-        (share_hash_chain, block_hash_tree,
6159-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
6160 
6161hunk ./src/allmydata/mutable/retrieve.py 489
6162-        assert isinstance(share_data, str)
6163-        # build the block hash tree. SDMF has only one leaf.
6164-        leaves = [hashutil.block_hash(share_data)]
6165-        t = hashtree.HashTree(leaves)
6166-        if list(t) != block_hash_tree:
6167-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
6168-        share_hash_leaf = t[0]
6169-        t2 = hashtree.IncompleteHashTree(N)
6170-        # root_hash was checked by the signature
6171-        t2.set_hashes({0: root_hash})
6172-        try:
6173-            t2.set_hashes(hashes=share_hash_chain,
6174-                          leaves={shnum: share_hash_leaf})
6175-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
6176-                IndexError), e:
6177-            msg = "corrupt hashes: %s" % (e,)
6178-            raise CorruptShareError(peerid, shnum, msg)
6179-        self.log(" data valid! len=%d" % len(share_data))
6180-        # each query comes down to this: placing validated share data into
6181-        # self.shares
6182-        self.shares[shnum] = share_data
6183+    def _validate_active_prefixes(self):
6184+        """
6185+        I check to make sure that the prefixes on the peers that I am
6186+        currently reading from match the prefix that we expect to see,
6187+        as recorded in self.verinfo.
6188 
6189hunk ./src/allmydata/mutable/retrieve.py 495
6190-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
6191+        If I find that all of the active peers have acceptable prefixes,
6192+        I pass control to _download_current_segment, which will use
6193+        those peers to do cool things. If I find that some of the active
6194+        peers have unacceptable prefixes, I will remove them from active
6195+        peers (and from further consideration) and call
6196+        _add_active_peers to attempt to rectify the situation. I keep
6197+        track of which peers I have already validated so that I don't
6198+        need to do so again.
6199+        """
6200+        assert self._active_readers, "No more active readers"
6201 
6202hunk ./src/allmydata/mutable/retrieve.py 506
6203-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
6204-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
6205-        if alleged_writekey != self._node.get_writekey():
6206-            self.log("invalid privkey from %s shnum %d" %
6207-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
6208-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
6209-            return
6210+        ds = []
6211+        new_readers = list(set(self._active_readers) - self._validated_readers)
6212+        self.log('validating %d newly-added active readers' % len(new_readers))
6213 
6214hunk ./src/allmydata/mutable/retrieve.py 510
6215-        # it's good
6216-        self.log("got valid privkey from shnum %d on peerid %s" %
6217-                 (shnum, idlib.shortnodeid_b2a(peerid)),
6218-                 parent=lp)
6219-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
6220-        self._node._populate_encprivkey(enc_privkey)
6221-        self._node._populate_privkey(privkey)
6222-        self._need_privkey = False
6223+        for reader in new_readers:
6224+            # We force a remote read here -- otherwise, we are relying
6225+            # on cached data that we already verified as valid, and we
6226+            # won't detect an uncoordinated write that has occurred
6227+            # since the last servermap update.
6228+            d = reader.get_prefix(force_remote=True)
6229+            d.addCallback(self._try_to_validate_prefix, reader)
6230+            ds.append(d)
6231+        dl = defer.DeferredList(ds, consumeErrors=True)
6232+        def _check_results(results):
6233+            # Each result in results will be of the form (success, msg).
6234+            # We don't care about msg, but success will tell us whether
6235+            # or not the checkstring validated. If it didn't, we need to
6236+            # remove the offending (peer,share) from our active readers,
6237+            # and ensure that active readers is again populated.
6238+            bad_readers = []
6239+            for i, result in enumerate(results):
6240+                if not result[0]:
6241+                    reader = new_readers[i]
6242+                    f = result[1]
6243+                    assert isinstance(f, failure.Failure)
6244 
6245hunk ./src/allmydata/mutable/retrieve.py 532
6246-    def _query_failed(self, f, marker, peerid):
6247-        self.log(format="query to [%(peerid)s] failed",
6248-                 peerid=idlib.shortnodeid_b2a(peerid),
6249-                 level=log.NOISY)
6250-        self._status.problems[peerid] = f
6251-        self._outstanding_queries.pop(marker, None)
6252-        if not self._running:
6253-            return
6254-        self._last_failure = f
6255-        self.remove_peer(peerid)
6256-        level = log.WEIRD
6257-        if f.check(DeadReferenceError):
6258-            level = log.UNUSUAL
6259-        self.log(format="error during query: %(f_value)s",
6260-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
6261+                    self.log("The reader %s failed to "
6262+                             "properly validate: %s" % \
6263+                             (reader, str(f.value)))
6264+                    bad_readers.append((reader, f))
6265+                else:
6266+                    reader = new_readers[i]
6267+                    self.log("the reader %s checks out, so we'll use it" % \
6268+                             reader)
6269+                    self._validated_readers.add(reader)
6270+                    # Each time we validate a reader, we check to see if
6271+                    # we need the private key. If we do, we politely ask
6272+                    # for it and then continue computing. If we find
6273+                    # that we haven't gotten it at the end of
6274+                    # segment decoding, then we'll take more drastic
6275+                    # measures.
6276+                    if self._need_privkey and not self._node.is_readonly():
6277+                        d = reader.get_encprivkey()
6278+                        d.addCallback(self._try_to_validate_privkey, reader)
6279+            if bad_readers:
6280+                # We do them all at once, or else we screw up list indexing.
6281+                for (reader, f) in bad_readers:
6282+                    self._mark_bad_share(reader, f)
6283+                if self._verify:
6284+                    if len(self._active_readers) >= self._required_shares:
6285+                        return self._download_current_segment()
6286+                    else:
6287+                        return self._failed()
6288+                else:
6289+                    return self._add_active_peers()
6290+            else:
6291+                return self._download_current_segment()
6294+        dl.addCallback(_check_results)
6295+        return dl
6296 
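The callback above relies on a DeferredList detail worth spelling out:
with consumeErrors=True, each entry of the results list is a (success,
value) pair, and value is a twisted Failure whenever success is False. A
standalone sketch of that classification (hypothetical helper, assuming
results[i] corresponds to readers[i]):

    from twisted.python import failure

    def classify(results, readers):
        good, bad = [], []
        for reader, (success, value) in zip(readers, results):
            if success:
                good.append(reader)
            else:
                assert isinstance(value, failure.Failure)
                bad.append((reader, value))
        return good, bad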
6297hunk ./src/allmydata/mutable/retrieve.py 568
6298-    def _check_for_done(self, res):
6299-        # exit paths:
6300-        #  return : keep waiting, no new queries
6301-        #  return self._send_more_queries(outstanding) : send some more queries
6302-        #  fire self._done(plaintext) : download successful
6303-        #  raise exception : download fails
6304 
6305hunk ./src/allmydata/mutable/retrieve.py 569
6306-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
6307-                 running=self._running, decoding=self._decoding,
6308-                 level=log.NOISY)
6309-        if not self._running:
6310-            return
6311-        if self._decoding:
6312-            return
6313-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
6314+    def _try_to_validate_prefix(self, prefix, reader):
6315+        """
6316+        I check that the prefix returned by a candidate server for
6317+        retrieval matches the prefix that the servermap knows about
6318+        (and, hence, the prefix that was validated earlier). If it does,
6319+        I return without incident, which means that I approve the use of
6320+        the candidate server for segment retrieval. If it doesn't, I raise
6321+        UncoordinatedWriteError, which means another server must be chosen.
6322+        """
6323+        (seqnum,
6324+         root_hash,
6325+         IV,
6326+         segsize,
6327+         datalength,
6328+         k,
6329+         N,
6330+         known_prefix,
6331          offsets_tuple) = self.verinfo
6332hunk ./src/allmydata/mutable/retrieve.py 587
6333+        if known_prefix != prefix:
6334+            self.log("prefix from share %d doesn't match" % reader.shnum)
6335+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
6336+                                          "indicate an uncoordinated write")
6337+        # Otherwise, we're okay -- no issues.
6338 
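For reference, the verinfo tuple unpacked above recurs throughout this
file; a sketch of its layout (the index constants are illustrative, not
defined by the patch):

    # verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N,
    #            prefix, offsets_tuple)
    SEQNUM, ROOT_HASH, IV, SEGSIZE, DATALENGTH, K, N, PREFIX, OFFSETS = range(9)
    # hence verinfo[-2], used by _mark_bad_share below, is the signed prefix.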
6339hunk ./src/allmydata/mutable/retrieve.py 593
6340-        if len(self.shares) < k:
6341-            # we don't have enough shares yet
6342-            return self._maybe_send_more_queries(k)
6343-        if self._need_privkey:
6344-            # we got k shares, but none of them had a valid privkey. TODO:
6345-            # look further. Adding code to do this is a bit complicated, and
6346-            # I want to avoid that complication, and this should be pretty
6347-            # rare (k shares with bitflips in the enc_privkey but not in the
6348-            # data blocks). If we actually do get here, the subsequent repair
6349-            # will fail for lack of a privkey.
6350-            self.log("got k shares but still need_privkey, bummer",
6351-                     level=log.WEIRD, umid="MdRHPA")
6352 
6353hunk ./src/allmydata/mutable/retrieve.py 594
6354-        # we have enough to finish. All the shares have had their hashes
6355-        # checked, so if something fails at this point, we don't know how
6356-        # to fix it, so the download will fail.
6357+    def _remove_reader(self, reader):
6358+        """
6359+        At various points, we will wish to remove a peer from
6360+        consideration and/or use. These include, but are not necessarily
6361+        limited to:
6362 
6363hunk ./src/allmydata/mutable/retrieve.py 600
6364-        self._decoding = True # avoid reentrancy
6365-        self._status.set_status("decoding")
6366-        now = time.time()
6367-        elapsed = now - self._started
6368-        self._status.timings["fetch"] = elapsed
6369+            - A connection error.
6370+            - A mismatched prefix (that is, a prefix that does not match
6371+              our conception of the version information string).
6372+            - A failing block hash, salt hash, or share hash, which can
6373+              indicate disk failure/bit flips, or network trouble.
6374 
6375hunk ./src/allmydata/mutable/retrieve.py 606
6376-        d = defer.maybeDeferred(self._decode)
6377-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
6378-        d.addBoth(self._done)
6379-        return d # purely for test convenience
6380+        This method will do that. I will make sure that the
6381+        (shnum,reader) combination represented by my reader argument is
6382+        not used for anything else during this download. I will not
6383+        advise the reader of any corruption, something that my callers
6384+        may wish to do on their own.
6385+        """
6386+        # TODO: When you're done writing this, see if this is ever
6387+        # actually used for something that _mark_bad_share isn't. I have
6388+        # a feeling that they will be used for very similar things, and
6389+        # that having them both here is just going to be an epic amount
6390+        # of code duplication.
6391+        #
6392+        # (well, okay, not epic, but meaningful)
6393+        self.log("removing reader %s" % reader)
6394+        # Remove the reader from _active_readers
6395+        self._active_readers.remove(reader)
6396+        # TODO: self.readers.remove(reader)?
6397+        for shnum in list(self.remaining_sharemap.keys()):
6398+            self.remaining_sharemap.discard(shnum, reader.peerid)
6399 
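The remaining_sharemap manipulated above behaves like a dict of sets that
prunes entries whose sets become empty (the retrieval code depends on
this). A toy model, assuming it mirrors allmydata's DictOfSets:

    class DictOfSets(dict):
        # maps key -> set(values); empty sets are removed from the dict.
        def add(self, key, value):
            self.setdefault(key, set()).add(value)
        def discard(self, key, value):
            if key in self:
                self[key].discard(value)
                if not self[key]:
                    del self[key]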
6400hunk ./src/allmydata/mutable/retrieve.py 626
6401-    def _maybe_send_more_queries(self, k):
6402-        # we don't have enough shares yet. Should we send out more queries?
6403-        # There are some number of queries outstanding, each for a single
6404-        # share. If we can generate 'needed_shares' additional queries, we do
6405-        # so. If we can't, then we know this file is a goner, and we raise
6406-        # NotEnoughSharesError.
6407-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
6408-                         "outstanding=%(outstanding)d"),
6409-                 have=len(self.shares), k=k,
6410-                 outstanding=len(self._outstanding_queries),
6411-                 level=log.NOISY)
6412 
6413hunk ./src/allmydata/mutable/retrieve.py 627
6414-        remaining_shares = k - len(self.shares)
6415-        needed = remaining_shares - len(self._outstanding_queries)
6416-        if not needed:
6417-            # we have enough queries in flight already
6418+    def _mark_bad_share(self, reader, f):
6419+        """
6420+        I mark the (peerid, shnum) encapsulated by my reader argument as
6421+        a bad share, which means that it will not be used anywhere else.
6422 
6423hunk ./src/allmydata/mutable/retrieve.py 632
6424-            # TODO: but if they've been in flight for a long time, and we
6425-            # have reason to believe that new queries might respond faster
6426-            # (i.e. we've seen other queries come back faster, then consider
6427-            # sending out new queries. This could help with peers which have
6428-            # silently gone away since the servermap was updated, for which
6429-            # we're still waiting for the 15-minute TCP disconnect to happen.
6430-            self.log("enough queries are in flight, no more are needed",
6431-                     level=log.NOISY)
6432-            return
6433+        There are several reasons to want to mark something as a bad
6434+        share. These include:
6435+
6436+            - A connection error to the peer.
6437+            - A mismatched prefix (that is, a prefix that does not match
6438+              our local conception of the version information string).
6439+            - A failing block hash, salt hash, share hash, or other
6440+              integrity check.
6441 
6442hunk ./src/allmydata/mutable/retrieve.py 641
6443-        outstanding_shnums = set([shnum
6444-                                  for (peerid, shnum, started)
6445-                                  in self._outstanding_queries.values()])
6446-        # prefer low-numbered shares, they are more likely to be primary
6447-        available_shnums = sorted(self.remaining_sharemap.keys())
6448-        for shnum in available_shnums:
6449-            if shnum in outstanding_shnums:
6450-                # skip ones that are already in transit
6451-                continue
6452-            if shnum not in self.remaining_sharemap:
6453-                # no servers for that shnum. note that DictOfSets removes
6454-                # empty sets from the dict for us.
6455-                continue
6456-            peerid = list(self.remaining_sharemap[shnum])[0]
6457-            # get_data will remove that peerid from the sharemap, and add the
6458-            # query to self._outstanding_queries
6459-            self._status.set_status("Retrieving More Shares")
6460-            self.get_data(shnum, peerid)
6461-            needed -= 1
6462-            if not needed:
6463+        This method will ensure that readers that we wish to mark bad
6464+        (for these reasons or other reasons) are not used for the rest
6465+        of the download. Additionally, it will attempt to tell the
6466+        remote peer (with no guarantee of success) that its share is
6467+        corrupt.
6468+        """
6469+        self.log("marking share %d on server %s as bad" % \
6470+                 (reader.shnum, reader))
6471+        prefix = self.verinfo[-2]
6472+        self.servermap.mark_bad_share(reader.peerid,
6473+                                      reader.shnum,
6474+                                      prefix)
6475+        self._remove_reader(reader)
6476+        self._bad_shares.add((reader.peerid, reader.shnum, f))
6477+        self._status.problems[reader.peerid] = f
6478+        self._last_failure = f
6479+        self.notify_server_corruption(reader.peerid, reader.shnum,
6480+                                      str(f.value))
6481+
6482+
6483+    def _download_current_segment(self):
6484+        """
6485+        I download, validate, decode, decrypt, and assemble the segment
6486+        that this Retrieve is currently responsible for downloading.
6487+        """
6488+        assert len(self._active_readers) >= self._required_shares
6489+        if self._current_segment <= self._last_segment:
6490+            d = self._process_segment(self._current_segment)
6491+        else:
6492+            d = defer.succeed(None)
6493+        d.addBoth(self._turn_barrier)
6494+        d.addCallback(self._check_for_done)
6495+        return d
6496+
6497+
6498+    def _turn_barrier(self, result):
6499+        """
6500+        I help the download process avoid the recursion limit issues
6501+        discussed in #237.
6502+        """
6503+        return fireEventually(result)
6504+
6505+
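The barrier above uses foolscap's fireEventually; the same effect can be
sketched with plain Twisted (an equivalent illustration, not the code the
patch uses):

    from twisted.internet import defer, reactor

    def turn_barrier(result):
        # Re-fire the callback chain on a later reactor turn so that a
        # long segment-by-segment chain cannot exhaust the recursion
        # limit (ticket #237).
        d = defer.Deferred()
        reactor.callLater(0, d.callback, result)
        return d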
6506+    def _process_segment(self, segnum):
6507+        """
6508+        I download, validate, decode, and decrypt one segment of the
6509+        file that this Retrieve is retrieving. This means coordinating
6510+        the process of getting k blocks of that file, validating them,
6511+        assembling them into one segment with the decoder, and then
6512+        decrypting them.
6513+        """
6514+        self.log("processing segment %d" % segnum)
6515+
6516+        # TODO: The old code uses a marker. Should this code do that
6517+        # too? What did the Marker do?
6518+        assert len(self._active_readers) >= self._required_shares
6519+
6520+        # We need to ask each of our active readers for its block and
6521+        # salt. We will then validate those. If validation is
6522+        # successful, we will assemble the results into plaintext.
6523+        ds = []
6524+        for reader in self._active_readers:
6525+            started = time.time()
6526+            d = reader.get_block_and_salt(segnum, queue=True)
6527+            d2 = self._get_needed_hashes(reader, segnum)
6528+            dl = defer.DeferredList([d, d2], consumeErrors=True)
6529+            dl.addCallback(self._validate_block, segnum, reader, started)
6530+            dl.addErrback(self._validation_or_decoding_failed, [reader])
6531+            ds.append(dl)
6532+            reader.flush()
6533+        dl = defer.DeferredList(ds)
6534+        if self._verify:
6535+            dl.addCallback(lambda ignored: "")
6536+            dl.addCallback(self._set_segment)
6537+        else:
6538+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
6539+        return dl
6540+
6541+
6542+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
6543+        """
6544+        I take the results of fetching and validating the blocks from a
6545+        callback chain in another method. If the results are such that
6546+        they tell me that validation and fetching succeeded without
6547+        incident, I will proceed with decoding and decryption.
6548+        Otherwise, I will do nothing.
6549+        """
6550+        self.log("trying to decode and decrypt segment %d" % segnum)
6551+        failures = False
6552+        for block_and_salt in blocks_and_salts:
6553+            if not block_and_salt[0] or block_and_salt[1] is None:
6554+                self.log("some validation operations failed; not proceeding")
6555+                failures = True
6556                 break
6557hunk ./src/allmydata/mutable/retrieve.py 735
6558+        if not failures:
6559+            self.log("everything looks ok, building segment %d" % segnum)
6560+            d = self._decode_blocks(blocks_and_salts, segnum)
6561+            d.addCallback(self._decrypt_segment)
6562+            d.addErrback(self._validation_or_decoding_failed,
6563+                         self._active_readers)
6564+            # check to see whether we've been paused before writing
6565+            # anything.
6566+            d.addCallback(self._check_for_paused)
6567+            d.addCallback(self._set_segment)
6568+            return d
6569+        else:
6570+            return defer.succeed(None)
6571+
6572+
6573+    def _set_segment(self, segment):
6574+        """
6575+        Given a plaintext segment, I register that segment with the
6576+        target that is handling the file download.
6577+        """
6578+        self.log("got plaintext for segment %d" % self._current_segment)
6579+        if self._current_segment == self._start_segment:
6580+            # We're on the first segment. It's possible that we want
6581+            # only some part of the end of this segment, and that we
6582+            # just downloaded the whole thing to get that part. If so,
6583+            # we need to account for that and give the reader just the
6584+            # data that they want.
6585+            n = self._offset % self._segment_size
6586+            self.log("stripping %d bytes off of the first segment" % n)
6587+            self.log("original segment length: %d" % len(segment))
6588+            segment = segment[n:]
6589+            self.log("new segment length: %d" % len(segment))
6590+
6591+        if self._current_segment == self._last_segment and self._read_length is not None:
6592+            # We're on the last segment. It's possible that we only want
6593+            # part of the beginning of this segment, and that we
6594+            # downloaded the whole thing anyway. Make sure to give the
6595+            # caller only the portion of the segment that they want to
6596+            # receive.
6597+            extra = self._read_length
6598+            if self._start_segment != self._last_segment:
6599+                extra -= self._segment_size - \
6600+                            (self._offset % self._segment_size)
6601+            extra %= self._segment_size
6602+            self.log("original segment length: %d" % len(segment))
6603+            segment = segment[:extra]
6604+            self.log("new segment length: %d" % len(segment))
6605+            self.log("only taking %d bytes of the last segment" % extra)
6606+
6607+        if not self._verify:
6608+            self._consumer.write(segment)
6609+        else:
6610+            # we don't care about the plaintext if we are doing a verify.
6611+            segment = None
6612+        self._current_segment += 1
6613 
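A worked example of the trimming above, with hypothetical numbers
(segment_size = 1000, offset = 2500, read_length = 800):

    # offset 2500 falls in segment 2, the start segment, so we strip
    # offset % segment_size = 500 leading bytes and keep 500 of it.
    # The read ends at byte 3300, inside segment 3, the last segment:
    #     extra = 800 - (1000 - 500) = 300
    #     extra %= 1000  ->  300
    # so we keep segment[:300]; 500 + 300 == read_length == 800 bytes
    # are delivered in total.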
6614hunk ./src/allmydata/mutable/retrieve.py 791
6615-        # at this point, we have as many outstanding queries as we can. If
6616-        # needed!=0 then we might not have enough to recover the file.
6617-        if needed:
6618-            format = ("ran out of peers: "
6619-                      "have %(have)d shares (k=%(k)d), "
6620-                      "%(outstanding)d queries in flight, "
6621-                      "need %(need)d more, "
6622-                      "found %(bad)d bad shares")
6623-            args = {"have": len(self.shares),
6624-                    "k": k,
6625-                    "outstanding": len(self._outstanding_queries),
6626-                    "need": needed,
6627-                    "bad": len(self._bad_shares),
6628-                    }
6629-            self.log(format=format,
6630-                     level=log.WEIRD, umid="ezTfjw", **args)
6631-            err = NotEnoughSharesError("%s, last failure: %s" %
6632-                                      (format % args, self._last_failure))
6633-            if self._bad_shares:
6634-                self.log("We found some bad shares this pass. You should "
6635-                         "update the servermap and try again to check "
6636-                         "more peers",
6637-                         level=log.WEIRD, umid="EFkOlA")
6638-                err.servermap = self.servermap
6639-            raise err
6640 
6641hunk ./src/allmydata/mutable/retrieve.py 792
6642+    def _validation_or_decoding_failed(self, f, readers):
6643+        """
6644+        I am called when a block or a salt fails to correctly validate, or when
6645+        the decryption or decoding operation fails for some reason.  I react to
6646+        this failure by notifying the remote server of corruption, and then
6647+        removing the remote peer from further activity.
6648+        """
6649+        assert isinstance(readers, list)
6650+        bad_shnums = [reader.shnum for reader in readers]
6651+
6652+        self.log("validation or decoding failed on share(s) %s, peer(s) %s, "
6653+                 "segment %d: %s" % \
6654+                 (bad_shnums, readers, self._current_segment, str(f)))
6655+        for reader in readers:
6656+            self._mark_bad_share(reader, f)
6657         return
6658 
6659hunk ./src/allmydata/mutable/retrieve.py 809
6660-    def _decode(self):
6661-        started = time.time()
6662-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
6663-         offsets_tuple) = self.verinfo
6664 
6665hunk ./src/allmydata/mutable/retrieve.py 810
6666-        # shares_dict is a dict mapping shnum to share data, but the codec
6667-        # wants two lists.
6668-        shareids = []; shares = []
6669-        for shareid, share in self.shares.items():
6670+    def _validate_block(self, results, segnum, reader, started):
6671+        """
6672+        I validate a block from one share on a remote server.
6673+        """
6674+        # Grab the part of the block hash tree that is necessary to
6675+        # validate this block, then generate the block hash root.
6676+        self.log("validating share %d for segment %d" % (reader.shnum,
6677+                                                             segnum))
6678+        self._status.add_fetch_timing(reader.peerid, started)
6679+        self._status.set_status("Validating blocks for segment %d" % segnum)
6680+        # Did we fail to fetch either of the things that we were
6681+        # supposed to? Fail if so.
6682+        if not results[0][0] or not results[1][0]:
6683+            # handled by the errback handler.
6684+
6685+            # These all get batched into one query, so the resulting
6686+            # failure should be the same for both; pick whichever of
6687+            # the two fetches actually failed.
6688+            f = [r[1] for r in results if not r[0]][0]
6689+
6690+            assert isinstance(f, failure.Failure)
6691+            raise CorruptShareError(reader.peerid,
6692+                                    reader.shnum,
6693+                                    "Connection error: %s" % str(f))
6694+
6695+        block_and_salt, block_and_sharehashes = results
6696+        block, salt = block_and_salt[1]
6697+        blockhashes, sharehashes = block_and_sharehashes[1]
6698+
6699+        blockhashes = dict(enumerate(blockhashes[1]))
6700+        self.log("the reader gave me the following blockhashes: %s" % \
6701+                 blockhashes.keys())
6702+        self.log("the reader gave me the following sharehashes: %s" % \
6703+                 sharehashes[1].keys())
6704+        bht = self._block_hash_trees[reader.shnum]
6705+
6706+        if bht.needed_hashes(segnum, include_leaf=True):
6707+            try:
6708+                bht.set_hashes(blockhashes)
6709+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
6710+                    IndexError), e:
6711+                raise CorruptShareError(reader.peerid,
6712+                                        reader.shnum,
6713+                                        "block hash tree failure: %s" % e)
6714+
6715+        if self._version == MDMF_VERSION:
6716+            blockhash = hashutil.block_hash(salt + block)
6717+        else:
6718+            blockhash = hashutil.block_hash(block)
6719+        # If this works without an error, then validation is
6720+        # successful.
6721+        try:
6722+            bht.set_hashes(leaves={segnum: blockhash})
6723+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
6724+                IndexError), e:
6725+            raise CorruptShareError(reader.peerid,
6726+                                    reader.shnum,
6727+                                    "block hash tree failure: %s" % e)
6728+
6729+        # Reaching this point means that we know that this segment
6730+        # is correct. Now we need to check to see whether the share
6731+        # hash chain is also correct.
6732+        # SDMF wrote share hash chains that didn't contain the
6733+        # leaves, which would be produced from the block hash tree.
6734+        # So we need to validate the block hash tree first. If
6735+        # successful, then bht[0] will contain the root for the
6736+        # shnum, which will be a leaf in the share hash tree, which
6737+        # will allow us to validate the rest of the tree.
6738+        if self.share_hash_tree.needed_hashes(reader.shnum,
6739+                                              include_leaf=True) or \
6740+                                              self._verify:
6741+            try:
6742+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
6743+                                            leaves={reader.shnum: bht[0]})
6744+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
6745+                    IndexError), e:
6746+                raise CorruptShareError(reader.peerid,
6747+                                        reader.shnum,
6748+                                        "corrupt hashes: %s" % e)
6749+
6750+        self.log('share %d is valid for segment %d' % (reader.shnum,
6751+                                                       segnum))
6752+        return {reader.shnum: (block, salt)}
6753+
6754+
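Condensing the per-block check above into a standalone sketch (the same
hashtree calls the patch uses; bht is the share's IncompleteHashTree and
blockhashes the dict of fetched interior hashes; the helper name is
hypothetical):

    from allmydata.util import hashutil

    def check_block(bht, segnum, blockhashes, block, salt, is_mdmf):
        if bht.needed_hashes(segnum, include_leaf=True):
            bht.set_hashes(blockhashes)   # may raise BadHashError etc.
        if is_mdmf:
            leaf = hashutil.block_hash(salt + block)
        else:
            leaf = hashutil.block_hash(block)
        bht.set_hashes(leaves={segnum: leaf})  # raises on a mismatch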
6755+    def _get_needed_hashes(self, reader, segnum):
6756+        """
6757+        I get the hashes needed to validate segnum from the reader, then return
6758+        to my caller when this is done.
6759+        """
6760+        bht = self._block_hash_trees[reader.shnum]
6761+        needed = bht.needed_hashes(segnum, include_leaf=True)
6762+        # The root of the block hash tree is also a leaf in the share
6763+        # hash tree. So we don't need to fetch it from the remote
6764+        # server. In the case of files with one segment, this means that
6765+        # we won't fetch any block hash tree from the remote server,
6766+        # since the hash of each share of the file is the entire block
6767+        # hash tree, and is a leaf in the share hash tree. This is fine,
6768+        # since any share corruption will be detected in the share hash
6769+        # tree.
6770+        #needed.discard(0)
6771+        self.log("getting blockhashes for segment %d, share %d: %s" % \
6772+                 (segnum, reader.shnum, str(needed)))
6773+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
6774+        if self.share_hash_tree.needed_hashes(reader.shnum):
6775+            need = self.share_hash_tree.needed_hashes(reader.shnum)
6776+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
6777+                                                                 str(need)))
6778+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
6779+        else:
6780+            d2 = defer.succeed({}) # the logic in the next method
6781+                                   # expects a dict
6782+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
6783+        return dl
6784+
6785+
6786+    def _decode_blocks(self, blocks_and_salts, segnum):
6787+        """
6788+        I take a list of k blocks and salts, and decode that into a
6789+        single encrypted segment.
6790+        """
6791+        d = {}
6792+        # We want to merge our dictionaries to the form
6793+        # {shnum: blocks_and_salts}
6794+        #
6795+        # The dictionaries come from _validate_block that way, so we just
6796+        # need to merge them.
6797+        for block_and_salt in blocks_and_salts:
6798+            d.update(block_and_salt[1])
6799+
6800+        # All of these blocks should have the same salt; in SDMF, it is
6801+        # the file-wide IV, while in MDMF it is the per-segment salt. In
6802+        # either case, we just need to get one of them and use it.
6803+        #
6804+        # d.items()[0] is like (shnum, (block, salt))
6805+        # d.items()[0][1] is like (block, salt)
6806+        # d.items()[0][1][1] is the salt.
6807+        salt = d.items()[0][1][1]
6808+        # Next, extract just the blocks from the dict. We'll use the
6809+        # salt in the next step.
6810+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
6811+        d2 = dict(share_and_shareids)
6812+        shareids = []
6813+        shares = []
6814+        for shareid, share in d2.items():
6815             shareids.append(shareid)
6816             shares.append(share)
6817 
6818hunk ./src/allmydata/mutable/retrieve.py 958
6819-        assert len(shareids) >= k, len(shareids)
6820+        self._status.set_status("Decoding")
6821+        started = time.time()
6822+        assert len(shareids) >= self._required_shares, len(shareids)
6823         # zfec really doesn't want extra shares
6824hunk ./src/allmydata/mutable/retrieve.py 962
6825-        shareids = shareids[:k]
6826-        shares = shares[:k]
6827-
6828-        fec = codec.CRSDecoder()
6829-        fec.set_params(segsize, k, N)
6830-
6831-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
6832-        self.log("about to decode, shareids=%s" % (shareids,))
6833-        d = defer.maybeDeferred(fec.decode, shares, shareids)
6834-        def _done(buffers):
6835-            self._status.timings["decode"] = time.time() - started
6836-            self.log(" decode done, %d buffers" % len(buffers))
6837+        shareids = shareids[:self._required_shares]
6838+        shares = shares[:self._required_shares]
6839+        self.log("decoding segment %d" % segnum)
6840+        if segnum == self._num_segments - 1:
6841+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
6842+        else:
6843+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
6844+        def _process(buffers):
6845             segment = "".join(buffers)
6846hunk ./src/allmydata/mutable/retrieve.py 971
6847+            self.log(format="decoded segment %(segnum)s of %(numsegs)s",
6848+                     segnum=segnum,
6849+                     numsegs=self._num_segments,
6850+                     level=log.NOISY)
6851             self.log(" joined length %d, datalength %d" %
6852hunk ./src/allmydata/mutable/retrieve.py 976
6853-                     (len(segment), datalength))
6854-            segment = segment[:datalength]
6855+                     (len(segment), self._data_length))
6856+            if segnum == self._num_segments - 1:
6857+                size_to_use = self._tail_data_size
6858+            else:
6859+                size_to_use = self._segment_size
6860+            segment = segment[:size_to_use]
6861             self.log(" segment len=%d" % len(segment))
6862hunk ./src/allmydata/mutable/retrieve.py 983
6863-            return segment
6864-        def _err(f):
6865-            self.log(" decode failed: %s" % f)
6866-            return f
6867-        d.addCallback(_done)
6868-        d.addErrback(_err)
6869+            self._status.timings.setdefault("decode", 0)
6870+            self._status.timings['decode'] += time.time() - started
6871+            return segment, salt
6872+        d.addCallback(_process)
6873         return d
6874 
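A standalone sketch of the merge performed above (hypothetical helper):
each DeferredList entry carries a {shnum: (block, salt)} dict from
_validate_block, and zfec wants parallel shareid/share lists truncated
to k entries.

    def merge_blocks(blocks_and_salts, k):
        merged = {}
        for (success, blockmap) in blocks_and_salts:
            merged.update(blockmap)
        # every block of a segment carries the same salt (the file-wide
        # IV for SDMF, the per-segment salt for MDMF), so take any one.
        shareids = sorted(merged.keys())[:k]
        salt = merged[shareids[0]][1]
        shares = [merged[shnum][0] for shnum in shareids]
        return shareids, shares, salt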
6875hunk ./src/allmydata/mutable/retrieve.py 989
6876-    def _decrypt(self, crypttext, IV, readkey):
6877+
6878+    def _decrypt_segment(self, segment_and_salt):
6879+        """
6880+        I take a single segment and its salt, and decrypt it. I return
6881+        the plaintext of the segment that is in my argument.
6882+        """
6883+        segment, salt = segment_and_salt
6884         self._status.set_status("decrypting")
6885hunk ./src/allmydata/mutable/retrieve.py 997
6886+        self.log("decrypting segment %d" % self._current_segment)
6887         started = time.time()
6888hunk ./src/allmydata/mutable/retrieve.py 999
6889-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
6890+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
6891         decryptor = AES(key)
6892hunk ./src/allmydata/mutable/retrieve.py 1001
6893-        plaintext = decryptor.process(crypttext)
6894-        self._status.timings["decrypt"] = time.time() - started
6895+        plaintext = decryptor.process(segment)
6896+        self._status.timings.setdefault("decrypt", 0)
6897+        self._status.timings['decrypt'] += time.time() - started
6898         return plaintext
6899 
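The decryption step reduces to one key derivation plus one AES pass; a
minimal sketch using the helpers this file already imports (assuming
pycryptopp's AES, as elsewhere in Tahoe):

    from allmydata.util import hashutil
    from pycryptopp.cipher.aes import AES

    def decrypt_segment(segment, salt, readkey):
        # one key per (salt, readkey) pair, so SDMF's file-wide IV and
        # MDMF's per-segment salts share a single code path.
        key = hashutil.ssk_readkey_data_hash(salt, readkey)
        return AES(key).process(segment)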
6900hunk ./src/allmydata/mutable/retrieve.py 1006
6901-    def _done(self, res):
6902-        if not self._running:
6903+
6904+    def notify_server_corruption(self, peerid, shnum, reason):
6905+        ss = self.servermap.connections[peerid]
6906+        ss.callRemoteOnly("advise_corrupt_share",
6907+                          "mutable", self._storage_index, shnum, reason)
6908+
6909+
6910+    def _try_to_validate_privkey(self, enc_privkey, reader):
6911+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
6912+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
6913+        if alleged_writekey != self._node.get_writekey():
6914+            self.log("invalid privkey from %s shnum %d" %
6915+                     (reader, reader.shnum),
6916+                     level=log.WEIRD, umid="YIw4tA")
6917+            if self._verify:
6918+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
6919+                                              self.verinfo[-2])
6920+                e = CorruptShareError(reader.peerid,
6921+                                      reader.shnum,
6922+                                      "invalid privkey")
6923+                f = failure.Failure(e)
6924+                self._bad_shares.add((reader.peerid, reader.shnum, f))
6925             return
6926hunk ./src/allmydata/mutable/retrieve.py 1029
6927+
6928+        # it's good
6929+        self.log("got valid privkey from shnum %d on reader %s" %
6930+                 (reader.shnum, reader))
6931+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
6932+        self._node._populate_encprivkey(enc_privkey)
6933+        self._node._populate_privkey(privkey)
6934+        self._need_privkey = False
6935+
6936+
6937+    def _check_for_done(self, res):
6938+        """
6939+        I check to see if this Retrieve object has successfully finished
6940+        its work.
6941+
6942+        I can exit in the following ways:
6943+            - If there are no more segments to download, then I exit by
6944+              causing self._done_deferred to fire with the plaintext
6945+              content requested by the caller.
6946+            - If there are still segments to be downloaded, and there
6947+              are enough active readers (readers which have not broken
6948+              and have not given us corrupt data) to continue
6949+              downloading, I send control back to
6950+              _download_current_segment.
6951+            - If there are still segments to be downloaded but there are
6952+              not enough active peers to download them, I ask
6953+              _add_active_peers to add more peers. If it is successful,
6954+              it will call _download_current_segment. If there are not
6955+              enough peers to retrieve the file, then that will cause
6956+              _done_deferred to errback.
6957+        """
6958+        self.log("checking for doneness")
6959+        if self._current_segment > self._last_segment:
6960+            # No more segments to download, we're done.
6961+            self.log("got plaintext, done")
6962+            return self._done()
6963+
6964+        if len(self._active_readers) >= self._required_shares:
6965+            # More segments to download, but we have enough good peers
6966+            # in self._active_readers that we can do that without issue,
6967+            # so go nab the next segment.
6968+            self.log("not done yet: on segment %d of %d" % \
6969+                     (self._current_segment + 1, self._num_segments))
6970+            return self._download_current_segment()
6971+
6972+        self.log("not done yet: on segment %d of %d, need to add peers" % \
6973+                 (self._current_segment + 1, self._num_segments))
6974+        return self._add_active_peers()
6975+
6976+
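The three exit paths described in the docstring above reduce to a small
decision; a distilled standalone sketch (hypothetical names):

    def next_step(current_segment, last_segment, num_active, k):
        if current_segment > last_segment:
            return "done"                     # fire _done_deferred
        if num_active >= k:
            return "download_current_segment"
        return "add_active_peers"             # may errback the download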
6977+    def _done(self):
6978+        """
6979+        I am called by _check_for_done when the download process has
6980+        finished successfully. After making some useful logging
6981+        statements, I return the decrypted contents to the owner of this
6982+        Retrieve object through self._done_deferred.
6983+        """
6984         self._running = False
6985         self._status.set_active(False)
6986hunk ./src/allmydata/mutable/retrieve.py 1088
6987-        self._status.timings["total"] = time.time() - self._started
6988-        # res is either the new contents, or a Failure
6989-        if isinstance(res, failure.Failure):
6990-            self.log("Retrieve done, with failure", failure=res,
6991-                     level=log.UNUSUAL)
6992-            self._status.set_status("Failed")
6993+        now = time.time()
6994+        self._status.timings['total'] = now - self._started
6995+        self._status.timings['fetch'] = now - self._started_fetching
6996+
6997+        if self._verify:
6998+            ret = list(self._bad_shares)
6999+            self.log("done verifying, found %d bad shares" % len(ret))
7000         else:
7001hunk ./src/allmydata/mutable/retrieve.py 1096
7002-            self.log("Retrieve done, success!")
7003-            self._status.set_status("Finished")
7004-            self._status.set_progress(1.0)
7005-            # remember the encoding parameters, use them again next time
7006-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7007-             offsets_tuple) = self.verinfo
7008-            self._node._populate_required_shares(k)
7009-            self._node._populate_total_shares(N)
7010-        eventually(self._done_deferred.callback, res)
7011+            # TODO: update status here?
7012+            ret = self._consumer
7013+            self._consumer.unregisterProducer()
7014+        eventually(self._done_deferred.callback, ret)
7015+
7016 
7017hunk ./src/allmydata/mutable/retrieve.py 1102
7018+    def _failed(self):
7019+        """
7020+        I am called by _add_active_peers when there are not enough
7021+        active peers left to complete the download. After making some
7022+        useful logging statements, I return an exception to that effect
7023+        to the caller of this Retrieve object through
7024+        self._done_deferred.
7025+        """
7026+        self._running = False
7027+        self._status.set_active(False)
7028+        now = time.time()
7029+        self._status.timings['total'] = now - self._started
7030+        self._status.timings['fetch'] = now - self._started_fetching
7031+
7032+        if self._verify:
7033+            ret = list(self._bad_shares)
7034+        else:
7035+            format = ("ran out of peers: "
7036+                      "have %(have)d of %(total)d segments, "
7037+                      "found %(bad)d bad shares, "
7038+                      "encoding %(k)d-of-%(n)d")
7039+            args = {"have": self._current_segment,
7040+                    "total": self._num_segments,
7041+                    "need": self._last_segment,
7042+                    "k": self._required_shares,
7043+                    "n": self._total_shares,
7044+                    "bad": len(self._bad_shares)}
7045+            e = NotEnoughSharesError("%s, last failure: %s" % \
7046+                                     (format % args, str(self._last_failure)))
7047+            f = failure.Failure(e)
7048+            ret = f
7049+        eventually(self._done_deferred.callback, ret)
7050}
7051[mutable/servermap.py: Alter the servermap updater to work with MDMF files
7052Kevan Carstensen <kevan@isnotajoke.com>**20100811233309
7053 Ignore-this: 5d2c922283c12cad93a5346e978cd691
7054 
7055 These modifications were basically all to the end of having the
7056 servermap updater use the unified MDMF + SDMF read interface whenever
7057 possible -- this reduces the complexity of the code, making it easier to
7058 read and maintain. To do this, I needed to modify the process of
7059 updating the servermap a little bit.
7060 
7061 To support partial-file updates, I also modified the servermap updater
7062 to fetch the block hash trees and certain segments of files while it
7063 performed a servermap update (this can be done without adding any new
7064 roundtrips because of batch-read functionality that the read proxy has).
7065 
7066] {
7067hunk ./src/allmydata/mutable/servermap.py 2
7068 
7069-import sys, time
7070+import sys, time, struct
7071 from zope.interface import implements
7072 from itertools import count
7073 from twisted.internet import defer
7074hunk ./src/allmydata/mutable/servermap.py 7
7075 from twisted.python import failure
7076-from foolscap.api import DeadReferenceError, RemoteException, eventually
7077-from allmydata.util import base32, hashutil, idlib, log
7078+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
7079+                         fireEventually
7080+from allmydata.util import base32, hashutil, idlib, log, deferredutil
7081 from allmydata.storage.server import si_b2a
7082 from allmydata.interfaces import IServermapUpdaterStatus
7083 from pycryptopp.publickey import rsa
7084hunk ./src/allmydata/mutable/servermap.py 17
7085 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
7086      DictOfSets, CorruptShareError, NeedMoreDataError
7087 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
7088-     SIGNED_PREFIX_LENGTH
7089+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
7090 
7091 class UpdateStatus:
7092     implements(IServermapUpdaterStatus)
7093hunk ./src/allmydata/mutable/servermap.py 124
7094         self.bad_shares = {} # maps (peerid,shnum) to old checkstring
7095         self.last_update_mode = None
7096         self.last_update_time = 0
7097+        self.update_data = {} # (verinfo,shnum) => data
7098 
7099     def copy(self):
7100         s = ServerMap()
7101hunk ./src/allmydata/mutable/servermap.py 255
7102         """Return a set of versionids, one for each version that is currently
7103         recoverable."""
7104         versionmap = self.make_versionmap()
7105-
7106         recoverable_versions = set()
7107         for (verinfo, shares) in versionmap.items():
7108             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7109hunk ./src/allmydata/mutable/servermap.py 340
7110         return False
7111 
7112 
7113+    def get_update_data_for_share_and_verinfo(self, shnum, verinfo):
7114+        """
7115+        I return the update data for the given shnum and verinfo.
7116+        """
7117+        update_data = self.update_data[shnum]
7118+        update_datum = [i[1] for i in update_data if i[0] == verinfo][0]
7119+        return update_datum
7120+
7121+
7122+    def set_update_data_for_share_and_verinfo(self, shnum, verinfo, data):
7123+        """
7124+        I record the update data for the given shnum and verinfo.
7125+        """
7126+        self.update_data.setdefault(shnum, []).append((verinfo, data))
7127+
7128+
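A toy usage sketch of the two accessors above (the verinfo and data
values are placeholders):

    from allmydata.mutable.servermap import ServerMap

    sm = ServerMap()
    verinfo = ("fake-verinfo",)               # placeholder
    data = ("blockhashes", "first-segment")   # placeholder
    sm.set_update_data_for_share_and_verinfo(3, verinfo, data)
    assert sm.get_update_data_for_share_and_verinfo(3, verinfo) == data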
7129 class ServermapUpdater:
7130     def __init__(self, filenode, storage_broker, monitor, servermap,
7131hunk ./src/allmydata/mutable/servermap.py 358
7132-                 mode=MODE_READ, add_lease=False):
7133+                 mode=MODE_READ, add_lease=False, update_range=None):
7134         """I update a servermap, locating a sufficient number of useful
7135         shares and remembering where they are located.
7136 
7137hunk ./src/allmydata/mutable/servermap.py 390
7138         #  * if we need the encrypted private key, we want [-1216ish:]
7139         #   * but we can't read from negative offsets
7140         #   * the offset table tells us the 'ish', also the positive offset
7141-        # A future version of the SMDF slot format should consider using
7142-        # fixed-size slots so we can retrieve less data. For now, we'll just
7143-        # read 2000 bytes, which also happens to read enough actual data to
7144-        # pre-fetch a 9-entry dirnode.
7145+        # MDMF:
7146+        #  * Checkstring? [0:72]
7147+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
7148+        #    the offset table will tell us for sure.
7149+        #  * If we need the verification key, we have to consult the offset
7150+        #    table as well.
7151+        # At this point, we don't know which we are. Our filenode can
7152+        # tell us, but it might be lying -- in some cases, we're
7153+        # responsible for telling it which kind of file it is.
7154         self._read_size = 4000
7155         if mode == MODE_CHECK:
7156             # we use unpack_prefix_and_signature, so we need 1k
7157hunk ./src/allmydata/mutable/servermap.py 410
7158         # to ask for it during the check, we'll have problems doing the
7159         # publish.
7160 
7161+        self.fetch_update_data = False
7162+        if mode == MODE_WRITE and update_range:
7163+            # We're updating the servermap in preparation for an
7164+            # in-place file update, so we need to fetch some additional
7165+            # data from each share that we find.
7166+            assert len(update_range) == 2
7167+
7168+            self.start_segment = update_range[0]
7169+            self.end_segment = update_range[1]
7170+            self.fetch_update_data = True
7171+
7172         prefix = si_b2a(self._storage_index)[:5]
7173         self._log_number = log.msg(format="SharemapUpdater(%(si)s): starting (%(mode)s)",
7174                                    si=prefix, mode=mode)
7175hunk ./src/allmydata/mutable/servermap.py 459
7176         self._queries_completed = 0
7177 
7178         sb = self._storage_broker
7179+        # All of the peers, permuted by the storage index, as usual.
7180         full_peerlist = sb.get_servers_for_index(self._storage_index)
7181         self.full_peerlist = full_peerlist # for use later, immutable
7182         self.extra_peers = full_peerlist[:] # peers are removed as we use them
7183hunk ./src/allmydata/mutable/servermap.py 466
7184         self._good_peers = set() # peers who had some shares
7185         self._empty_peers = set() # peers who don't have any shares
7186         self._bad_peers = set() # peers to whom our queries failed
7187+        self._readers = {} # peerid -> dict of share readers, filled in
7188+                           # after responses come in.
7189 
7190         k = self._node.get_required_shares()
7191hunk ./src/allmydata/mutable/servermap.py 470
7192+        # For what cases can these conditions work?
7193         if k is None:
7194             # make a guess
7195             k = 3
7196hunk ./src/allmydata/mutable/servermap.py 483
7197         self.num_peers_to_query = k + self.EPSILON
7198 
7199         if self.mode == MODE_CHECK:
7200+            # We want to query all of the peers.
7201             initial_peers_to_query = dict(full_peerlist)
7202             must_query = set(initial_peers_to_query.keys())
7203             self.extra_peers = []
7204hunk ./src/allmydata/mutable/servermap.py 491
7205             # we're planning to replace all the shares, so we want a good
7206             # chance of finding them all. We will keep searching until we've
7207             # seen epsilon that don't have a share.
7208+            # We don't query all of the peers because that could take a while.
7209             self.num_peers_to_query = N + self.EPSILON
7210             initial_peers_to_query, must_query = self._build_initial_querylist()
7211             self.required_num_empty_peers = self.EPSILON
7212hunk ./src/allmydata/mutable/servermap.py 501
7213             # might also avoid the round trip required to read the encrypted
7214             # private key.
7215 
7216-        else:
7217+        else: # MODE_READ, MODE_ANYTHING
7218+            # 2k peers is good enough.
7219             initial_peers_to_query, must_query = self._build_initial_querylist()
7220 
7221         # this is a set of peers that we are required to get responses from:
7222hunk ./src/allmydata/mutable/servermap.py 517
7223         # before we can consider ourselves finished, and self.extra_peers
7224         # contains the overflow (peers that we should tap if we don't get
7225         # enough responses)
7226+        # must_query is always a subset of initial_peers_to_query, as
7227+        # the assert below checks.
7228+        assert set(must_query).issubset(set(initial_peers_to_query))
7229 
7230         self._send_initial_requests(initial_peers_to_query)
7231         self._status.timings["initial_queries"] = time.time() - self._started
7232hunk ./src/allmydata/mutable/servermap.py 576
7233         # errors that aren't handled by _query_failed (and errors caused by
7234         # _query_failed) get logged, but we still want to check for doneness.
7235         d.addErrback(log.err)
7236-        d.addBoth(self._check_for_done)
7237         d.addErrback(self._fatal_error)
7238hunk ./src/allmydata/mutable/servermap.py 577
7239+        d.addCallback(self._check_for_done)
7240         return d
7241 
7242     def _do_read(self, ss, peerid, storage_index, shnums, readv):
7243hunk ./src/allmydata/mutable/servermap.py 596
7244         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
7245         return d
7246 
7247+
7248+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
7249+        """
7250+        I am called when a remote server returns a corrupt share in
7251+        response to one of our queries. By corrupt, I mean a share
7252+        without a valid signature. I then record the failure, notify the
7253+        server of the corruption, and record the share as bad.
7254+        """
7255+        f = failure.Failure(e)
7256+        self.log(format="bad share: %(f_value)s", f_value=str(f),
7257+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
7258+        # Notify the server that its share is corrupt.
7259+        self.notify_server_corruption(peerid, shnum, str(e))
7260+        # By flagging this as a bad peer, we won't count any of
7261+        # the other shares on that peer as valid, though if we
7262+        # happen to find a valid version string amongst those
7263+        # shares, we'll keep track of it so that we don't need
7264+        # to validate the signature on those again.
7265+        self._bad_peers.add(peerid)
7266+        self._last_failure = f
7267+        # XXX: Use the reader for this?
7268+        checkstring = data[:SIGNED_PREFIX_LENGTH]
7269+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
7270+        self._servermap.problems.append(f)
7271+
7272+
7273+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
7274+        """
7275+        If one of my queries returns successfully (which means that we
7276+        were able to and successfully did validate the signature), I
7277+        cache the data that we initially fetched from the storage
7278+        server. This will help reduce the number of roundtrips that need
7279+        to occur when the file is downloaded, or when the file is
7280+        updated.
7281+        """
7282+        if verinfo:
7283+            self._node._add_to_cache(verinfo, shnum, 0, data, now)
7284+
7285+
7286     def _got_results(self, datavs, peerid, readsize, stuff, started):
7287         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
7288                       peerid=idlib.shortnodeid_b2a(peerid),
7289hunk ./src/allmydata/mutable/servermap.py 642
7290                       level=log.NOISY)
7291         now = time.time()
7292         elapsed = now - started
7293-        self._queries_outstanding.discard(peerid)
7294-        self._servermap.reachable_peers.add(peerid)
7295-        self._must_query.discard(peerid)
7296-        self._queries_completed += 1
7297+        def _done_processing(ignored=None):
7298+            self._queries_outstanding.discard(peerid)
7299+            self._servermap.reachable_peers.add(peerid)
7300+            self._must_query.discard(peerid)
7301+            self._queries_completed += 1
7302         if not self._running:
7303             self.log("but we're not running, so we'll ignore it", parent=lp,
7304                      level=log.NOISY)
7305hunk ./src/allmydata/mutable/servermap.py 650
7306+            _done_processing()
7307             self._status.add_per_server_time(peerid, "late", started, elapsed)
7308             return
7309         self._status.add_per_server_time(peerid, "query", started, elapsed)
7310hunk ./src/allmydata/mutable/servermap.py 660
7311         else:
7312             self._empty_peers.add(peerid)
7313 
7314-        last_verinfo = None
7315-        last_shnum = None
7316+        ss, storage_index = stuff
7317+        ds = []
7318+
7319         for shnum,datav in datavs.items():
7320             data = datav[0]
7321hunk ./src/allmydata/mutable/servermap.py 665
7322-            try:
7323-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
7324-                last_verinfo = verinfo
7325-                last_shnum = shnum
7326-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
7327-            except CorruptShareError, e:
7328-                # log it and give the other shares a chance to be processed
7329-                f = failure.Failure()
7330-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
7331-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
7332-                self.notify_server_corruption(peerid, shnum, str(e))
7333-                self._bad_peers.add(peerid)
7334-                self._last_failure = f
7335-                checkstring = data[:SIGNED_PREFIX_LENGTH]
7336-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
7337-                self._servermap.problems.append(f)
7338-                pass
7339+            reader = MDMFSlotReadProxy(ss,
7340+                                       storage_index,
7341+                                       shnum,
7342+                                       data)
7343+            self._readers.setdefault(peerid, dict())[shnum] = reader
7344+            # our goal, with each response, is to validate the version
7345+            # information and share data as best we can at this point --
7346+            # we do this by validating the signature. To do this, we
7347+            # need to do the following:
7348+            #   - If we don't already have the public key, fetch the
7349+            #     public key. We use this to validate the signature.
7350+            if not self._node.get_pubkey():
7351+                # fetch and set the public key.
7352+                d = reader.get_verification_key(queue=True)
7353+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
7354+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
7355+                # XXX: Make self._pubkey_query_failed?
7356+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
7357+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
7358+            else:
7359+                # we already have the public key.
7360+                d = defer.succeed(None)
7361 
7362hunk ./src/allmydata/mutable/servermap.py 688
7363-        self._status.timings["cumulative_verify"] += (time.time() - now)
7364+            # Neither of these two branches returns anything of
7365+            # consequence, so the first entry in our DeferredList will
7366+            # be None.
7367 
7368hunk ./src/allmydata/mutable/servermap.py 692
7369-        if self._need_privkey and last_verinfo:
7370-            # send them a request for the privkey. We send one request per
7371-            # server.
7372-            lp2 = self.log("sending privkey request",
7373-                           parent=lp, level=log.NOISY)
7374-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7375-             offsets_tuple) = last_verinfo
7376-            o = dict(offsets_tuple)
7377+            # - Next, we need the version information. We almost
7378+            #   certainly got this by reading the first thousand or so
7379+            #   bytes of the share on the storage server, so we
7380+            #   shouldn't need to fetch anything at this step.
7381+            d2 = reader.get_verinfo()
7382+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
7383+                self._got_corrupt_share(error, shnum, peerid, data, lp))
7384+            # - Next, we need the signature. For an SDMF share, it is
7385+            #   likely that we fetched this when doing our initial fetch
7386+            #   to get the version information. In MDMF, this lives at
7387+            #   the end of the share, so unless the file is quite small,
7388+            #   we'll need to do a remote fetch to get it.
7389+            d3 = reader.get_signature(queue=True)
7390+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
7391+                self._got_corrupt_share(error, shnum, peerid, data, lp))
7392+            #  Once we have all three of these responses, we can move on
7393+            #  to validating the signature
7394 
7395hunk ./src/allmydata/mutable/servermap.py 710
7396-            self._queries_outstanding.add(peerid)
7397-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
7398-            ss = self._servermap.connections[peerid]
7399-            privkey_started = time.time()
7400-            d = self._do_read(ss, peerid, self._storage_index,
7401-                              [last_shnum], readv)
7402-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
7403-                          privkey_started, lp2)
7404-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
7405-            d.addErrback(log.err)
7406-            d.addCallback(self._check_for_done)
7407-            d.addErrback(self._fatal_error)
7408+            # Does the node already have a privkey? If not, we'll try to
7409+            # fetch it here.
7410+            if self._need_privkey:
7411+                d4 = reader.get_encprivkey(queue=True)
7412+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
7413+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
7414+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
7415+                    self._privkey_query_failed(error, peerid, shnum, lp))
7416+            else:
7417+                d4 = defer.succeed(None)
7418+
7419+
7420+            if self.fetch_update_data:
7421+                # fetch the block hash tree and first + last segment, as
7422+                # configured earlier, then record them in the servermap
7423+                # for later use by partial-file updates. Note that we use
7424+                # a separate list here; reusing ds would clobber the
7425+                # per-share deferreds collected below.
7426+                # XXX: We fetch the verinfo above, too. Is there a good
7427+                # way to make the two routines share the value without
7428+                # introducing more roundtrips?
7429+                update_ds = [reader.get_verinfo()]
7430+                update_ds.append(reader.get_blockhashes(queue=True))
7431+                update_ds.append(reader.get_block_and_salt(self.start_segment,
7432+                                                           queue=True))
7433+                update_ds.append(reader.get_block_and_salt(self.end_segment,
7434+                                                           queue=True))
7435+                d5 = deferredutil.gatherResults(update_ds)
7436+                d5.addCallback(self._got_update_results_one_share, shnum)
7437+            else:
7438+                d5 = defer.succeed(None)
7439 
7440hunk ./src/allmydata/mutable/servermap.py 742
7441+            dl = defer.DeferredList([d, d2, d3, d4, d5])
7442+            dl.addBoth(self._turn_barrier)
7443+            reader.flush()
7444+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
7445+                self._got_signature_one_share(results, shnum, peerid, lp))
7446+            dl.addErrback(lambda error, shnum=shnum, data=data:
7447+               self._got_corrupt_share(error, shnum, peerid, data, lp))
7448+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
7449+                self._cache_good_sharedata(verinfo, shnum, now, data))
7450+            ds.append(dl)
7451+        # dl is a DeferredList that will fire when all of the shares
7452+        # that we found on this peer are done processing. When dl
7453+        # fires, we know that processing is done, so we can mark this
7454+        # peer's query as complete via _done_processing above.
7455+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
7456+        # Are we done? Done means that there are no more queries to
7457+        # send, that there are no outstanding queries, and that no
7458+        # responses we have received are still being processed. If we
7459+        # are done, self._check_for_done will cause the done deferred
7460+        # that we returned to our caller to fire, which tells them that
7461+        # they have a complete servermap, and that we won't be touching
7462+        # the servermap anymore.
7463+        dl.addCallback(_done_processing)
7464+        dl.addCallback(self._check_for_done)
7465+        dl.addErrback(self._fatal_error)
7466         # all done!
7467         self.log("_got_results done", parent=lp, level=log.NOISY)
7468hunk ./src/allmydata/mutable/servermap.py 769
7469+        return dl
7470+
7471+
7472+    def _turn_barrier(self, result):
7473+        """
7474+        I help the servermap updater avoid the recursion limit issues
7475+        discussed in #237.
7476+        """
7477+        return fireEventually(result)
7478+
7479+
7480+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
7481+        if self._node.get_pubkey():
7482+            return # don't go through this again if we don't have to
7483+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
7484+        assert len(fingerprint) == 32
7485+        if fingerprint != self._node.get_fingerprint():
7486+            raise CorruptShareError(peerid, shnum,
7487+                                "pubkey doesn't match fingerprint")
7488+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
7489+        assert self._node.get_pubkey()
7490+
7491 
7492     def notify_server_corruption(self, peerid, shnum, reason):
7493         ss = self._servermap.connections[peerid]
7494hunk ./src/allmydata/mutable/servermap.py 797
7495         ss.callRemoteOnly("advise_corrupt_share",
7496                           "mutable", self._storage_index, shnum, reason)
7497 
7498-    def _got_results_one_share(self, shnum, data, peerid, lp):
7499+
7500+    def _got_signature_one_share(self, results, shnum, peerid, lp):
7501+        # It is our job to return the verinfo to our caller. We must
7502+        # raise CorruptShareError if the share is corrupt for any
7503+        # reason; our caller will handle that.
7504         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
7505                  shnum=shnum,
7506                  peerid=idlib.shortnodeid_b2a(peerid),
7507hunk ./src/allmydata/mutable/servermap.py 807
7508                  level=log.NOISY,
7509                  parent=lp)
7510+        if not self._running:
7511+            # We can't process the results, since we can't touch the
7512+            # servermap anymore.
7513+            self.log("but we're not running anymore.")
7514+            return None
7515 
7516hunk ./src/allmydata/mutable/servermap.py 813
7517-        # this might raise NeedMoreDataError, if the pubkey and signature
7518-        # live at some weird offset. That shouldn't happen, so I'm going to
7519-        # treat it as a bad share.
7520-        (seqnum, root_hash, IV, k, N, segsize, datalength,
7521-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
7522-
7523-        if not self._node.get_pubkey():
7524-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
7525-            assert len(fingerprint) == 32
7526-            if fingerprint != self._node.get_fingerprint():
7527-                raise CorruptShareError(peerid, shnum,
7528-                                        "pubkey doesn't match fingerprint")
7529-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
7530-
7531-        if self._need_privkey:
7532-            self._try_to_extract_privkey(data, peerid, shnum, lp)
7533-
7534-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
7535-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
7536+        _, verinfo, signature, __, ___ = results
7537+        (seqnum,
7538+         root_hash,
7539+         saltish,
7540+         segsize,
7541+         datalen,
7542+         k,
7543+         n,
7544+         prefix,
7545+         offsets) = verinfo[1]
7546         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
7547 
7548hunk ./src/allmydata/mutable/servermap.py 825
7549-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7550+        # XXX: get_verinfo should hand us the offsets as a tuple
7551+        # already; this conversion belongs in the reader.
7552+        verinfo = (seqnum,
7553+                   root_hash,
7554+                   saltish,
7555+                   segsize,
7556+                   datalen,
7557+                   k,
7558+                   n,
7559+                   prefix,
7560                    offsets_tuple)
7561hunk ./src/allmydata/mutable/servermap.py 836
7562+        # This tuple uniquely identifies a version of the file on the
7563+        # grid; we use it to track the versions we've already seen.
7564 
7565         if verinfo not in self._valid_versions:
7566hunk ./src/allmydata/mutable/servermap.py 840
7567-            # it's a new pair. Verify the signature.
7568-            valid = self._node.get_pubkey().verify(prefix, signature)
7569+            # This is a new version tuple, and we need to validate it
7570+            # against the public key before keeping track of it.
7571+            assert self._node.get_pubkey()
7572+            valid = self._node.get_pubkey().verify(prefix, signature[1])
7573             if not valid:
7574hunk ./src/allmydata/mutable/servermap.py 845
7575-                raise CorruptShareError(peerid, shnum, "signature is invalid")
7576+                raise CorruptShareError(peerid, shnum,
7577+                                        "signature is invalid")
7578 
7579hunk ./src/allmydata/mutable/servermap.py 848
7580-            # ok, it's a valid verinfo. Add it to the list of validated
7581-            # versions.
7582-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
7583-                     % (seqnum, base32.b2a(root_hash)[:4],
7584-                        idlib.shortnodeid_b2a(peerid), shnum,
7585-                        k, N, segsize, datalength),
7586-                     parent=lp)
7587-            self._valid_versions.add(verinfo)
7588-        # We now know that this is a valid candidate verinfo.
7589+        # ok, it's a valid verinfo. Add it to the list of validated
7590+        # versions.
7591+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
7592+                 % (seqnum, base32.b2a(root_hash)[:4],
7593+                    idlib.shortnodeid_b2a(peerid), shnum,
7594+                    k, n, segsize, datalen),
7595+                    parent=lp)
7596+        self._valid_versions.add(verinfo)
7597+        # We now know that this is a valid candidate verinfo. Whether
7598+        # this particular share is usable is decided below; at this
7599+        # point we just know that if we see this version info again,
7600+        # its signature checks out, so we can skip the
7601+        # signature-checking step.
7602 
7603hunk ./src/allmydata/mutable/servermap.py 862
7604+        # (peerid, shnum) are bound in the method invocation.
7605         if (peerid, shnum) in self._servermap.bad_shares:
7606             # we've been told that the rest of the data in this share is
7607             # unusable, so don't add it to the servermap.
7608hunk ./src/allmydata/mutable/servermap.py 875
7609         self._servermap.add_new_share(peerid, shnum, verinfo, timestamp)
7610         # and the versionmap
7611         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
7612+
7613+        # It's our job to set the protocol version of our parent
7614+        # filenode if it isn't already set.
7615+        if not self._node.get_version():
7616+            # The first byte of the prefix is the version.
7617+            v = struct.unpack(">B", prefix[:1])[0]
7618+            self.log("got version %d" % v)
7619+            self._node.set_version(v)
7620+
7621         return verinfo
7622 
7623hunk ./src/allmydata/mutable/servermap.py 886
7624-    def _deserialize_pubkey(self, pubkey_s):
7625-        verifier = rsa.create_verifying_key_from_string(pubkey_s)
7626-        return verifier
7627 
7628hunk ./src/allmydata/mutable/servermap.py 887
7629-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
7630-        try:
7631-            r = unpack_share(data)
7632-        except NeedMoreDataError, e:
7633-            # this share won't help us. oh well.
7634-            offset = e.encprivkey_offset
7635-            length = e.encprivkey_length
7636-            self.log("shnum %d on peerid %s: share was too short (%dB) "
7637-                     "to get the encprivkey; [%d:%d] ought to hold it" %
7638-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
7639-                      offset, offset+length),
7640-                     parent=lp)
7641-            # NOTE: if uncoordinated writes are taking place, someone might
7642-            # change the share (and most probably move the encprivkey) before
7643-            # we get a chance to do one of these reads and fetch it. This
7644-            # will cause us to see a NotEnoughSharesError(unable to fetch
7645-            # privkey) instead of an UncoordinatedWriteError . This is a
7646-            # nuisance, but it will go away when we move to DSA-based mutable
7647-            # files (since the privkey will be small enough to fit in the
7648-            # write cap).
7649+    def _got_update_results_one_share(self, results, share):
7650+        """
7651+        I record the update data fetched for this share in the servermap.
7652+        """
7653+        assert len(results) == 4
7654+        verinfo, blockhashes, start, end = results
7655+        (seqnum,
7656+         root_hash,
7657+         saltish,
7658+         segsize,
7659+         datalen,
7660+         k,
7661+         n,
7662+         prefix,
7663+         offsets) = verinfo
7664+        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
7665 
7666hunk ./src/allmydata/mutable/servermap.py 904
7667-            return
7668+        # XXX: get_verinfo should hand us the offsets as a tuple
7669+        # already; this conversion belongs in the reader.
7670+        verinfo = (seqnum,
7671+                   root_hash,
7672+                   saltish,
7673+                   segsize,
7674+                   datalen,
7675+                   k,
7676+                   n,
7677+                   prefix,
7678+                   offsets_tuple)
7679 
7680hunk ./src/allmydata/mutable/servermap.py 916
7681-        (seqnum, root_hash, IV, k, N, segsize, datalen,
7682-         pubkey, signature, share_hash_chain, block_hash_tree,
7683-         share_data, enc_privkey) = r
7684+        update_data = (blockhashes, start, end)
7685+        self._servermap.set_update_data_for_share_and_verinfo(share,
7686+                                                              verinfo,
7687+                                                              update_data)
7688 
7689hunk ./src/allmydata/mutable/servermap.py 921
7690-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
7691+
7692+    def _deserialize_pubkey(self, pubkey_s):
7693+        verifier = rsa.create_verifying_key_from_string(pubkey_s)
7694+        return verifier
7695 
7696hunk ./src/allmydata/mutable/servermap.py 926
7697-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
7698 
7699hunk ./src/allmydata/mutable/servermap.py 927
7700+    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
7701+        """
7702+        Given an encrypted private key from a remote server, I decrypt
7703+        it and check the derived writekey against the one stored in my
7704+        node. If they match, I set the node's privkey and encprivkey.
7705+        """
7706         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
7707         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
7708         if alleged_writekey != self._node.get_writekey():
7709hunk ./src/allmydata/mutable/servermap.py 1005
7710         self._queries_completed += 1
7711         self._last_failure = f
7712 
7713-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
7714-        now = time.time()
7715-        elapsed = now - started
7716-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
7717-        self._queries_outstanding.discard(peerid)
7718-        if not self._need_privkey:
7719-            return
7720-        if shnum not in datavs:
7721-            self.log("privkey wasn't there when we asked it",
7722-                     level=log.WEIRD, umid="VA9uDQ")
7723-            return
7724-        datav = datavs[shnum]
7725-        enc_privkey = datav[0]
7726-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
7727 
7728     def _privkey_query_failed(self, f, peerid, shnum, lp):
7729         self._queries_outstanding.discard(peerid)
7730hunk ./src/allmydata/mutable/servermap.py 1019
7731         self._servermap.problems.append(f)
7732         self._last_failure = f
7733 
7734+
7735     def _check_for_done(self, res):
7736         # exit paths:
7737         #  return self._send_more_queries(outstanding) : send some more queries
7738hunk ./src/allmydata/mutable/servermap.py 1025
7739         #  return self._done() : all done
7740         #  return : keep waiting, no new queries
7741-
7742         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
7743                               "%(outstanding)d queries outstanding, "
7744                               "%(extra)d extra peers available, "
7745hunk ./src/allmydata/mutable/servermap.py 1216
7746 
7747     def _done(self):
7748         if not self._running:
7749+            self.log("not running; we're already done")
7750             return
7751         self._running = False
7752         now = time.time()
7753hunk ./src/allmydata/mutable/servermap.py 1231
7754         self._servermap.last_update_time = self._started
7755         # the servermap will not be touched after this
7756         self.log("servermap: %s" % self._servermap.summarize_versions())
7757+
7758         eventually(self._done_deferred.callback, self._servermap)
7759 
7760     def _fatal_error(self, f):
7761}
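
The heart of the new query path above is the queue=True / flush()
pairing on MDMFSlotReadProxy: each get_*(queue=True) call only
registers a read, and reader.flush() then sends the queued reads to
the server together. A minimal sketch of that pattern in isolation
(the caller is hypothetical; the reader methods are those used in the
hunks above):

  from twisted.internet import defer

  def fetch_share_metadata(reader):
      # Queue several logical reads; nothing goes over the wire yet.
      d_key = reader.get_verification_key(queue=True)
      d_sig = reader.get_signature(queue=True)
      d_ver = reader.get_verinfo()
      dl = defer.DeferredList([d_key, d_sig, d_ver])
      # flush() sends the queued reads; each entry in dl's result is
      # a (success, value) pair, as in _got_signature_one_share.
      reader.flush()
      return dl
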
7762[scripts: tell 'tahoe put' about MDMF
7763Kevan Carstensen <kevan@isnotajoke.com>**20100813234957
7764 Ignore-this: c106b3384fc676bd3c0fb466d2a52b1b
7765] {
7766hunk ./src/allmydata/scripts/cli.py 156
7767     optFlags = [
7768         ("mutable", "m", "Create a mutable file instead of an immutable one."),
7769         ]
7770+    optParameters = [
7771+        ("mutable-type", None, False, "Create a mutable file in the given format. Valid formats are 'sdmf' for SDMF and 'mdmf' for MDMF"),
7772+        ]
7773 
7774     def parseArgs(self, arg1=None, arg2=None):
7775         # see Examples below
7776hunk ./src/allmydata/scripts/tahoe_put.py 21
7777     from_file = options.from_file
7778     to_file = options.to_file
7779     mutable = options['mutable']
7780+    mutable_type = False
7781+
7782+    if mutable:
7783+        mutable_type = options['mutable-type']
7784     if options['quiet']:
7785         verbosity = 0
7786     else:
7787hunk ./src/allmydata/scripts/tahoe_put.py 33
7788     stdout = options.stdout
7789     stderr = options.stderr
7790 
7791+    if mutable_type and mutable_type not in ('sdmf', 'mdmf'):
7792+        # Don't try to pass unsupported types to the webapi
7793+        print >>stderr, "error: %s is an invalid format" % mutable_type
7794+        return 1
7795+
7796     if nodeurl[-1] != "/":
7797         nodeurl += "/"
7798     if to_file:
7799hunk ./src/allmydata/scripts/tahoe_put.py 76
7800         url = nodeurl + "uri"
7801     if mutable:
7802         url += "?mutable=true"
7803+    if mutable_type:
7804+        assert mutable
7805+        url += "&mutable-type=%s" % mutable_type
7806+
7807     if from_file:
7808         infileobj = open(os.path.expanduser(from_file), "rb")
7809     else:
7810}
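
For illustration, the new option rides along with --mutable; a
hypothetical invocation (the file name is a placeholder):

  tahoe put --mutable --mutable-type=mdmf ./local-file.txt

Per tahoe_put.py above, an unsupported format is rejected client-side
before any request reaches the webapi, and --mutable-type is ignored
when --mutable is not given.
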
7811[web: Alter the webapi to get along with and take advantage of the MDMF changes
7812Kevan Carstensen <kevan@isnotajoke.com>**20100814081012
7813 Ignore-this: 96c2ed4e4a9f450fb84db5d711d10bd6
7814 
7815 The main benefit that the webapi gets from MDMF, at least initially, is
7816 the ability to do a streaming download of an MDMF mutable file. It also
7817 exposes a way (through the PUT verb) to append to or otherwise modify
7818 (in-place) an MDMF mutable file.
7819] {
7820hunk ./src/allmydata/web/common.py 12
7821 from allmydata.interfaces import ExistingChildError, NoSuchChildError, \
7822      FileTooLargeError, NotEnoughSharesError, NoSharesError, \
7823      EmptyPathnameComponentError, MustBeDeepImmutableError, \
7824-     MustBeReadonlyError, MustNotBeUnknownRWError
7825+     MustBeReadonlyError, MustNotBeUnknownRWError, SDMF_VERSION, MDMF_VERSION
7826 from allmydata.mutable.common import UnrecoverableFileError
7827 from allmydata.util import abbreviate
7828 from allmydata.util.encodingutil import to_str
7829hunk ./src/allmydata/web/common.py 34
7830     else:
7831         return boolean_of_arg(replace)
7832 
7833+
7834+def parse_mutable_type_arg(arg):
7835+    if not arg:
7836+        return None # interpreted by the caller as "let the nodemaker decide"
7837+
7838+    arg = arg.lower()
7839+    assert arg in ("mdmf", "sdmf")
7840+
7841+    if arg == "mdmf":
7842+        return MDMF_VERSION
7843+
7844+    return SDMF_VERSION
7845+
7846+
7847+def parse_offset_arg(offset):
7848+    # XXX: This will raise a ValueError when invoked on something
7849+    # that is not an integer. Is that okay, or do we want a better
7850+    # error message? Since this argument is aimed at programmers and
7851+    # their tools rather than at users (through the wui), raising
7852+    # ValueError seems acceptable for now.
7853+    offset = int(offset)
7854+    return offset
7855+
7856+
7857 def get_root(ctx_or_req):
7858     req = IRequest(ctx_or_req)
7859     # the addSlash=True gives us one extra (empty) segment
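
A quick sketch of the two helpers' behavior (illustrative values):

  parse_mutable_type_arg("mdmf")  # -> MDMF_VERSION
  parse_mutable_type_arg(None)    # -> None: let the nodemaker decide
  parse_offset_arg("35")          # -> 35
  parse_offset_arg(-1)            # -> -1, the "no offset" sentinel
                                  #    used by web/filenode.py below
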
7860hunk ./src/allmydata/web/directory.py 19
7861 from allmydata.uri import from_string_dirnode
7862 from allmydata.interfaces import IDirectoryNode, IFileNode, IFilesystemNode, \
7863      IImmutableFileNode, IMutableFileNode, ExistingChildError, \
7864-     NoSuchChildError, EmptyPathnameComponentError
7865+     NoSuchChildError, EmptyPathnameComponentError, SDMF_VERSION, MDMF_VERSION
7866 from allmydata.monitor import Monitor, OperationCancelledError
7867 from allmydata import dirnode
7868 from allmydata.web.common import text_plain, WebError, \
7869hunk ./src/allmydata/web/directory.py 153
7870         if not t:
7871             # render the directory as HTML, using the docFactory and Nevow's
7872             # whole templating thing.
7873-            return DirectoryAsHTML(self.node)
7874+            return DirectoryAsHTML(self.node,
7875+                                   self.client.mutable_file_default)
7876 
7877         if t == "json":
7878             return DirectoryJSONMetadata(ctx, self.node)
7879hunk ./src/allmydata/web/directory.py 556
7880     docFactory = getxmlfile("directory.xhtml")
7881     addSlash = True
7882 
7883-    def __init__(self, node):
7884+    def __init__(self, node, default_mutable_format):
7885         rend.Page.__init__(self)
7886         self.node = node
7887 
7888hunk ./src/allmydata/web/directory.py 560
7889+        assert default_mutable_format in (MDMF_VERSION, SDMF_VERSION)
7890+        self.default_mutable_format = default_mutable_format
7891+
7892     def beforeRender(self, ctx):
7893         # attempt to get the dirnode's children, stashing them (or the
7894         # failure that results) for later use
7895hunk ./src/allmydata/web/directory.py 780
7896             ]]
7897         forms.append(T.div(class_="freeform-form")[mkdir])
7898 
7899+        # Build input elements for mutable file type. We do this outside
7900+        # of the list so we can check the appropriate format, based on
7901+        # the default configured in the client (which reflects the
7902+        # default configured in tahoe.cfg)
7903+        if self.default_mutable_format == MDMF_VERSION:
7904+            mdmf_input = T.input(type='radio', name='mutable-type',
7905+                                 id='mutable-type-mdmf', value='mdmf',
7906+                                 checked='checked')
7907+        else:
7908+            mdmf_input = T.input(type='radio', name='mutable-type',
7909+                                 id='mutable-type-mdmf', value='mdmf')
7910+
7911+        if self.default_mutable_format == SDMF_VERSION:
7912+            sdmf_input = T.input(type='radio', name='mutable-type',
7913+                                 id='mutable-type-sdmf', value='sdmf',
7914+                                 checked="checked")
7915+        else:
7916+            sdmf_input = T.input(type='radio', name='mutable-type',
7917+                                 id='mutable-type-sdmf', value='sdmf')
7918+
7919         upload = T.form(action=".", method="post",
7920                         enctype="multipart/form-data")[
7921             T.fieldset[
7922hunk ./src/allmydata/web/directory.py 812
7923             T.input(type="submit", value="Upload"),
7924             " Mutable?:",
7925             T.input(type="checkbox", name="mutable"),
7926+            sdmf_input, T.label(for_="mutable-type-sdmf")["SDMF"],
7927+            mdmf_input,
7928+            T.label(for_="mutable-type-mdmf")["MDMF (experimental)"],
7929             ]]
7930         forms.append(T.div(class_="freeform-form")[upload])
7931 
7932hunk ./src/allmydata/web/directory.py 850
7933                 kiddata = ("filenode", {'size': childnode.get_size(),
7934                                         'mutable': childnode.is_mutable(),
7935                                         })
7936+                if childnode.is_mutable() and \
7937+                    childnode.get_version() is not None:
7938+                    mutable_type = childnode.get_version()
7939+                    assert mutable_type in (SDMF_VERSION, MDMF_VERSION)
7940+
7941+                    if mutable_type == MDMF_VERSION:
7942+                        mutable_type = "mdmf"
7943+                    else:
7944+                        mutable_type = "sdmf"
7945+                    kiddata[1]['mutable-type'] = mutable_type
7946+
7947             elif IDirectoryNode.providedBy(childnode):
7948                 kiddata = ("dirnode", {'mutable': childnode.is_mutable()})
7949             else:
7950hunk ./src/allmydata/web/filenode.py 9
7951 from nevow import url, rend
7952 from nevow.inevow import IRequest
7953 
7954-from allmydata.interfaces import ExistingChildError
7955+from allmydata.interfaces import ExistingChildError, SDMF_VERSION, MDMF_VERSION
7956 from allmydata.monitor import Monitor
7957 from allmydata.immutable.upload import FileHandle
7958hunk ./src/allmydata/web/filenode.py 12
7959+from allmydata.mutable.publish import MutableFileHandle
7960+from allmydata.mutable.common import MODE_READ
7961 from allmydata.util import log, base32
7962 
7963 from allmydata.web.common import text_plain, WebError, RenderMixin, \
7964hunk ./src/allmydata/web/filenode.py 18
7965      boolean_of_arg, get_arg, should_create_intermediate_directories, \
7966-     MyExceptionHandler, parse_replace_arg
7967+     MyExceptionHandler, parse_replace_arg, parse_offset_arg, \
7968+     parse_mutable_type_arg
7969 from allmydata.web.check_results import CheckResults, \
7970      CheckAndRepairResults, LiteralCheckResults
7971 from allmydata.web.info import MoreInfo
7972hunk ./src/allmydata/web/filenode.py 29
7973         # a new file is being uploaded in our place.
7974         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
7975         if mutable:
7976-            req.content.seek(0)
7977-            data = req.content.read()
7978-            d = client.create_mutable_file(data)
7979+            mutable_type = parse_mutable_type_arg(get_arg(req,
7980+                                                          "mutable-type",
7981+                                                          None))
7982+            data = MutableFileHandle(req.content)
7983+            d = client.create_mutable_file(data, version=mutable_type)
7984             def _uploaded(newnode):
7985                 d2 = self.parentnode.set_node(self.name, newnode,
7986                                               overwrite=replace)
7987hunk ./src/allmydata/web/filenode.py 66
7988         d.addCallback(lambda res: childnode.get_uri())
7989         return d
7990 
7991-    def _read_data_from_formpost(self, req):
7992-        # SDMF: files are small, and we can only upload data, so we read
7993-        # the whole file into memory before uploading.
7994-        contents = req.fields["file"]
7995-        contents.file.seek(0)
7996-        data = contents.file.read()
7997-        return data
7998 
7999     def replace_me_with_a_formpost(self, req, client, replace):
8000         # create a new file, maybe mutable, maybe immutable
8001hunk ./src/allmydata/web/filenode.py 71
8002         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
8003 
8004+        # create an immutable file
8005+        contents = req.fields["file"]
8006         if mutable:
8007hunk ./src/allmydata/web/filenode.py 74
8008-            data = self._read_data_from_formpost(req)
8009-            d = client.create_mutable_file(data)
8010+            mutable_type = parse_mutable_type_arg(get_arg(req, "mutable-type",
8011+                                                          None))
8012+            uploadable = MutableFileHandle(contents.file)
8013+            d = client.create_mutable_file(uploadable, version=mutable_type)
8014             def _uploaded(newnode):
8015                 d2 = self.parentnode.set_node(self.name, newnode,
8016                                               overwrite=replace)
8017hunk ./src/allmydata/web/filenode.py 85
8018                 return d2
8019             d.addCallback(_uploaded)
8020             return d
8021-        # create an immutable file
8022-        contents = req.fields["file"]
8023+
8024         uploadable = FileHandle(contents.file, convergence=client.convergence)
8025         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
8026         d.addCallback(lambda newnode: newnode.get_uri())
8027hunk ./src/allmydata/web/filenode.py 91
8028         return d
8029 
8030+
8031 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
8032     def __init__(self, client, parentnode, name):
8033         rend.Page.__init__(self)
8034hunk ./src/allmydata/web/filenode.py 174
8035             # properly. So we assume that at least the browser will agree
8036             # with itself, and echo back the same bytes that we were given.
8037             filename = get_arg(req, "filename", self.name) or "unknown"
8038-            if self.node.is_mutable():
8039-                # some day: d = self.node.get_best_version()
8040-                d = makeMutableDownloadable(self.node)
8041-            else:
8042-                d = defer.succeed(self.node)
8043+            d = self.node.get_best_readable_version()
8044             d.addCallback(lambda dn: FileDownloader(dn, filename))
8045             return d
8046         if t == "json":
8047hunk ./src/allmydata/web/filenode.py 178
8048-            if self.parentnode and self.name:
8049-                d = self.parentnode.get_metadata_for(self.name)
8050+            # We do this to make sure that fields like size and
8051+            # mutable-type (which depend on the file on the grid and not
8052+            # just on the cap) are filled in. The latter gets used in
8053+            # tests, in particular.
8054+            #
8055+            # TODO: Make it so that the servermap knows how to update in
8056+            # a mode specifically designed to fill in these fields, and
8057+            # then update it in that mode.
8058+            if self.node.is_mutable():
8059+                d = self.node.get_servermap(MODE_READ)
8060             else:
8061                 d = defer.succeed(None)
8062hunk ./src/allmydata/web/filenode.py 190
8063+            if self.parentnode and self.name:
8064+                d.addCallback(lambda ignored:
8065+                    self.parentnode.get_metadata_for(self.name))
8066+            else:
8067+                d.addCallback(lambda ignored: None)
8068             d.addCallback(lambda md: FileJSONMetadata(ctx, self.node, md))
8069             return d
8070         if t == "info":
8071hunk ./src/allmydata/web/filenode.py 211
8072         if t:
8073             raise WebError("GET file: bad t=%s" % t)
8074         filename = get_arg(req, "filename", self.name) or "unknown"
8075-        if self.node.is_mutable():
8076-            # some day: d = self.node.get_best_version()
8077-            d = makeMutableDownloadable(self.node)
8078-        else:
8079-            d = defer.succeed(self.node)
8080+        d = self.node.get_best_readable_version()
8081         d.addCallback(lambda dn: FileDownloader(dn, filename))
8082         return d
8083 
8084hunk ./src/allmydata/web/filenode.py 219
8085         req = IRequest(ctx)
8086         t = get_arg(req, "t", "").strip()
8087         replace = parse_replace_arg(get_arg(req, "replace", "true"))
8088+        offset = parse_offset_arg(get_arg(req, "offset", -1))
8089 
8090         if not t:
8091hunk ./src/allmydata/web/filenode.py 222
8092-            if self.node.is_mutable():
8093+            if self.node.is_mutable() and offset >= 0:
8094+                return self.update_my_contents(req, offset)
8095+
8096+            elif self.node.is_mutable():
8097                 return self.replace_my_contents(req)
8098             if not replace:
8099                 # this is the early trap: if someone else modifies the
8100hunk ./src/allmydata/web/filenode.py 232
8101                 # directory while we're uploading, the add_file(overwrite=)
8102                 # call in replace_me_with_a_child will do the late trap.
8103                 raise ExistingChildError()
8104+            if offset >= 0:
8105+                raise WebError("PUT to a file: append operation invoked "
8106+                               "on an immutable cap")
8107+
8108+
8109             assert self.parentnode and self.name
8110             return self.replace_me_with_a_child(req, self.client, replace)
8111         if t == "uri":
8112hunk ./src/allmydata/web/filenode.py 299
8113 
8114     def replace_my_contents(self, req):
8115         req.content.seek(0)
8116-        new_contents = req.content.read()
8117+        new_contents = MutableFileHandle(req.content)
8118         d = self.node.overwrite(new_contents)
8119         d.addCallback(lambda res: self.node.get_uri())
8120         return d
8121hunk ./src/allmydata/web/filenode.py 304
8122 
8123+
8124+    def update_my_contents(self, req, offset):
8125+        req.content.seek(0)
8126+        added_contents = MutableFileHandle(req.content)
8127+
8128+        d = self.node.get_best_mutable_version()
8129+        d.addCallback(lambda mv:
8130+            mv.update(added_contents, offset))
8131+        d.addCallback(lambda ignored:
8132+            self.node.get_uri())
8133+        return d
8134+
8135+
8136     def replace_my_contents_with_a_formpost(self, req):
8137         # we have a mutable file. Get the data from the formpost, and replace
8138         # the mutable file's contents with it.
8139hunk ./src/allmydata/web/filenode.py 320
8140-        new_contents = self._read_data_from_formpost(req)
8141+        new_contents = req.fields['file']
8142+        new_contents = MutableFileHandle(new_contents.file)
8143+
8144         d = self.node.overwrite(new_contents)
8145         d.addCallback(lambda res: self.node.get_uri())
8146         return d
8147hunk ./src/allmydata/web/filenode.py 327
8148 
8149-class MutableDownloadable:
8150-    #implements(IDownloadable)
8151-    def __init__(self, size, node):
8152-        self.size = size
8153-        self.node = node
8154-    def get_size(self):
8155-        return self.size
8156-    def is_mutable(self):
8157-        return True
8158-    def read(self, consumer, offset=0, size=None):
8159-        d = self.node.download_best_version()
8160-        d.addCallback(self._got_data, consumer, offset, size)
8161-        return d
8162-    def _got_data(self, contents, consumer, offset, size):
8163-        start = offset
8164-        if size is not None:
8165-            end = offset+size
8166-        else:
8167-            end = self.size
8168-        # SDMF: we can write the whole file in one big chunk
8169-        consumer.write(contents[start:end])
8170-        return consumer
8171-
8172-def makeMutableDownloadable(n):
8173-    d = defer.maybeDeferred(n.get_size_of_best_version)
8174-    d.addCallback(MutableDownloadable, n)
8175-    return d
8176 
8177 class FileDownloader(rend.Page):
8178     # since we override the rendering process (to let the tahoe Downloader
8179hunk ./src/allmydata/web/filenode.py 492
8180     data[1]['mutable'] = filenode.is_mutable()
8181     if edge_metadata is not None:
8182         data[1]['metadata'] = edge_metadata
8183+
8184+    if filenode.is_mutable() and filenode.get_version() is not None:
8185+        mutable_type = filenode.get_version()
8186+        assert mutable_type in (MDMF_VERSION, SDMF_VERSION)
8187+        if mutable_type == MDMF_VERSION:
8188+            mutable_type = "mdmf"
8189+        else:
8190+            mutable_type = "sdmf"
8191+        data[1]['mutable-type'] = mutable_type
8192+
8193     return text_plain(simplejson.dumps(data, indent=1) + "\n", ctx)
8194 
8195 def FileURI(ctx, filenode):
8196hunk ./src/allmydata/web/root.py 15
8197 from allmydata import get_package_versions_string
8198 from allmydata import provisioning
8199 from allmydata.util import idlib, log
8200-from allmydata.interfaces import IFileNode
8201+from allmydata.interfaces import IFileNode, MDMF_VERSION, SDMF_VERSION
8202 from allmydata.web import filenode, directory, unlinked, status, operations
8203 from allmydata.web import reliability, storage
8204 from allmydata.web.common import abbreviate_size, getxmlfile, WebError, \
8205hunk ./src/allmydata/web/root.py 19
8206-     get_arg, RenderMixin, boolean_of_arg
8207+     get_arg, RenderMixin, boolean_of_arg, parse_mutable_type_arg
8208 
8209 
8210 class URIHandler(RenderMixin, rend.Page):
8211hunk ./src/allmydata/web/root.py 50
8212         if t == "":
8213             mutable = boolean_of_arg(get_arg(req, "mutable", "false").strip())
8214             if mutable:
8215-                return unlinked.PUTUnlinkedSSK(req, self.client)
8216+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
8217+                                                 None))
8218+                return unlinked.PUTUnlinkedSSK(req, self.client, version)
8219             else:
8220                 return unlinked.PUTUnlinkedCHK(req, self.client)
8221         if t == "mkdir":
8222hunk ./src/allmydata/web/root.py 70
8223         if t in ("", "upload"):
8224             mutable = bool(get_arg(req, "mutable", "").strip())
8225             if mutable:
8226-                return unlinked.POSTUnlinkedSSK(req, self.client)
8227+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
8228+                                                         None))
8229+                return unlinked.POSTUnlinkedSSK(req, self.client, version)
8230             else:
8231                 return unlinked.POSTUnlinkedCHK(req, self.client)
8232         if t == "mkdir":
8233hunk ./src/allmydata/web/root.py 324
8234 
8235     def render_upload_form(self, ctx, data):
8236         # this is a form where users can upload unlinked files
8237+        #
8238+        # for mutable files, users can choose the format by selecting
8239+        # MDMF or SDMF from a radio button. They can also configure a
8240+        # default format in tahoe.cfg, which they rightly expect us to
8241+        # obey. We convey to them that we are obeying their choice by
8242+        # ensuring that the one that they've chosen is selected in the
8243+        # interface.
8244+        if self.client.mutable_file_default == MDMF_VERSION:
8245+            mdmf_input = T.input(type='radio', name='mutable-type',
8246+                                 value='mdmf', id='mutable-type-mdmf',
8247+                                 checked='checked')
8248+        else:
8249+            mdmf_input = T.input(type='radio', name='mutable-type',
8250+                                 value='mdmf', id='mutable-type-mdmf')
8251+
8252+        if self.client.mutable_file_default == SDMF_VERSION:
8253+            sdmf_input = T.input(type='radio', name='mutable-type',
8254+                                 value='sdmf', id='mutable-type-sdmf',
8255+                                 checked='checked')
8256+        else:
8257+            sdmf_input = T.input(type='radio', name='mutable-type',
8258+                                 value='sdmf', id='mutable-type-sdmf')
8259+
8260+
8261         form = T.form(action="uri", method="post",
8262                       enctype="multipart/form-data")[
8263             T.fieldset[
8264hunk ./src/allmydata/web/root.py 356
8265                   T.input(type="file", name="file", class_="freeform-input-file")],
8266             T.input(type="hidden", name="t", value="upload"),
8267             T.div[T.input(type="checkbox", name="mutable"), T.label(for_="mutable")["Create mutable file"],
8268+                  sdmf_input, T.label(for_="mutable-type-sdmf")["SDMF"],
8269+                  mdmf_input,
8270+                  T.label(for_='mutable-type-mdmf')['MDMF (experimental)'],
8271                   " ", T.input(type="submit", value="Upload!")],
8272             ]]
8273         return T.div[form]
8274hunk ./src/allmydata/web/unlinked.py 7
8275 from twisted.internet import defer
8276 from nevow import rend, url, tags as T
8277 from allmydata.immutable.upload import FileHandle
8278+from allmydata.mutable.publish import MutableFileHandle
8279 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
8280      convert_children_json, WebError
8281 from allmydata.web import status
8282hunk ./src/allmydata/web/unlinked.py 20
8283     # that fires with the URI of the new file
8284     return d
8285 
8286-def PUTUnlinkedSSK(req, client):
8287+def PUTUnlinkedSSK(req, client, version):
8288     # SDMF: files are small, and we can only upload data
8289     req.content.seek(0)
8290hunk ./src/allmydata/web/unlinked.py 23
8291-    data = req.content.read()
8292-    d = client.create_mutable_file(data)
8293+    data = MutableFileHandle(req.content)
8294+    d = client.create_mutable_file(data, version=version)
8295     d.addCallback(lambda n: n.get_uri())
8296     return d
8297 
8298hunk ./src/allmydata/web/unlinked.py 83
8299                       ["/uri/" + res.uri])
8300         return d
8301 
8302-def POSTUnlinkedSSK(req, client):
8303+def POSTUnlinkedSSK(req, client, version):
8304     # "POST /uri", to create an unlinked file.
8305     # SDMF: files are small, and we can only upload data
8306hunk ./src/allmydata/web/unlinked.py 86
8307-    contents = req.fields["file"]
8308-    contents.file.seek(0)
8309-    data = contents.file.read()
8310-    d = client.create_mutable_file(data)
8311+    contents = req.fields["file"].file
8312+    data = MutableFileHandle(contents)
8313+    d = client.create_mutable_file(data, version=version)
8314     d.addCallback(lambda n: n.get_uri())
8315     return d
8316 
8317}
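
As a concrete illustration of the new in-place update path: writing
bytes at an offset within an existing mutable file over the webapi
could look like this minimal Python 2 sketch (host, port, and cap are
placeholders; error handling is elided):

  import httplib, urllib

  def update_mutable(writecap, data, offset,
                     host="127.0.0.1", port=3456):
      # PUT the body bytes at `offset` within the mutable file.
      path = "/uri/%s?offset=%d" % (urllib.quote(writecap), offset)
      conn = httplib.HTTPConnection(host, port)
      conn.request("PUT", path, data)
      resp = conn.getresponse()
      return resp.status, resp.read()

As the docs patch below notes, this is relatively efficient for MDMF
files and relatively inefficient (though still supported) for SDMF.
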
8318[docs: update docs to mention MDMF
8319Kevan Carstensen <kevan@isnotajoke.com>**20100814225644
8320 Ignore-this: 1c3caa3cd44831007dcfbef297814308
8321] {
8322hunk ./docs/configuration.txt 293
8323  (Mutable files use a different share placement algorithm that does not
8324   consider this parameter.)
8325 
8326+mutable.format = sdmf or mdmf
8327+
8328+ This value tells Tahoe-LAFS what the default mutable file format should
8329+ be. If mutable.format=sdmf, then newly created mutable files will be in
8330+ the old SDMF format. This is desirable for clients that operate on
8331+ grids where some peers run older versions of Tahoe-LAFS, as these older
8332+ versions cannot read the new MDMF mutable file format. If
8333+ mutable.format = mdmf, then newly created mutable files will use the
8334+ new MDMF format, which supports efficient in-place modification and
8335+ streaming downloads. You can override this value using the
8336+ mutable-type parameter in the webapi. If you do not specify a value
8337+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
8338+
8339+ Note that this parameter only applies to mutable files. Mutable
8340+ directories, which are stored as mutable files, are not controlled by
8341+ this parameter and will always use SDMF. We may revisit this decision
8342+ in future versions of Tahoe-LAFS.
8343 
8344 == Storage Server Configuration ==
8345 
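
For example, a client that should create MDMF mutable files by default
would carry this in its tahoe.cfg ([client] is the section consulted by
the client.py patch below):

  [client]
  mutable.format = mdmf
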
8346hunk ./docs/frontends/webapi.txt 324
8347  writeable mutable file, that file's contents will be overwritten in-place. If
8348  it is a read-cap for a mutable file, an error will occur. If it is an
8349  immutable file, the old file will be discarded, and a new one will be put in
8350- its place.
8351+ its place. If the target file is a writable mutable file, you may also
8352+ specify an "offset" parameter -- a byte offset that determines where in
8353+ the mutable file the data from the HTTP request body is placed. This
8354+ operation is relatively efficient for MDMF mutable files, and is
8355+ relatively inefficient (but still supported) for SDMF mutable files.
8356 
8357  When creating a new file, if "mutable=true" is in the query arguments, the
8358  operation will create a mutable file instead of an immutable one.
8359hunk ./docs/frontends/webapi.txt 349
8360 
8361  If "mutable=true" is in the query arguments, the operation will create a
8362  mutable file, and return its write-cap in the HTTP respose. The default is
8363- to create an immutable file, returning the read-cap as a response.
8364+ to create an immutable file, returning the read-cap as a response. If
8365+ you create a mutable file, you can also use the "mutable-type" query
8366+ parameter. If "mutable-type=sdmf", then the mutable file will be created
8367+ in the old SDMF mutable file format. This is desirable for files that
8368+ need to be read by old clients. If "mutable-type=mdmf", then the file
8369+ will be created in the new MDMF mutable file format. MDMF mutable files
8370+ can be downloaded more efficiently, and modified in-place efficiently,
8371+ but are not compatible with older versions of Tahoe-LAFS. If no
8372+ "mutable-type" argument is given, the file is created in whatever
8373+ format was configured in tahoe.cfg.
8374 
8375 === Creating A New Directory ===
8376 
8377hunk ./docs/frontends/webapi.txt 1020
8378  If a "mutable=true" argument is provided, the operation will create a
8379  mutable file, and the response body will contain the write-cap instead of
8380  the upload results page. The default is to create an immutable file,
8381- returning the upload results page as a response.
8382+ returning the upload results page as a response. If you create a
8383+ mutable file, you may choose to specify the format of that mutable file
8384+ with the "mutable-type" parameter. If "mutable-type=mdmf", then the
8385+ file will be created as an MDMF mutable file. If "mutable-type=sdmf",
8386+ then the file will be created as an SDMF mutable file. If no value is
8387+ specified, the file will be created in whatever format is specified in
8388+ tahoe.cfg.
8389 
8390 
8391 POST /uri/$DIRCAP/[SUBDIRS../]?t=upload
8392}
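
For instance, creating an unlinked MDMF mutable file through the webapi
described above could look like this (Python 2 sketch; host and port
are placeholders):

  import httplib

  conn = httplib.HTTPConnection("127.0.0.1", 3456)
  conn.request("PUT", "/uri?mutable=true&mutable-type=mdmf",
               "initial contents")
  writecap = conn.getresponse().read()

The response body is the new file's write-cap.
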
8393[client.py: learn how to create different kinds of mutable files
8394Kevan Carstensen <kevan@isnotajoke.com>**20100814225711
8395 Ignore-this: 61ff665bc050cba5f58bf2ed779d692b
8396] {
8397hunk ./src/allmydata/client.py 25
8398 from allmydata.util.time_format import parse_duration, parse_date
8399 from allmydata.stats import StatsProvider
8400 from allmydata.history import History
8401-from allmydata.interfaces import IStatsProducer, RIStubClient
8402+from allmydata.interfaces import IStatsProducer, RIStubClient, \
8403+                                 SDMF_VERSION, MDMF_VERSION
8404 from allmydata.nodemaker import NodeMaker
8405 
8406 
8407hunk ./src/allmydata/client.py 357
8408                                    self.terminator,
8409                                    self.get_encoding_parameters(),
8410                                    self._key_generator)
8411+        default = self.get_config("client", "mutable.format", default="sdmf")
8412+        if default == "mdmf":
8413+            self.mutable_file_default = MDMF_VERSION
8414+        else:
8415+            self.mutable_file_default = SDMF_VERSION
8416 
8417     def get_history(self):
8418         return self.history
8419hunk ./src/allmydata/client.py 500
8420     def create_immutable_dirnode(self, children, convergence=None):
8421         return self.nodemaker.create_immutable_directory(children, convergence)
8422 
8423-    def create_mutable_file(self, contents=None, keysize=None):
8424-        return self.nodemaker.create_mutable_file(contents, keysize)
8425+    def create_mutable_file(self, contents=None, keysize=None, version=None):
8426+        if not version:
8427+            version = self.mutable_file_default
8428+        return self.nodemaker.create_mutable_file(contents, keysize,
8429+                                                  version=version)
8430 
8431     def upload(self, uploadable):
8432         uploader = self.getServiceNamed("uploader")
8433}
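
Illustrative use of the extended create_mutable_file signature (a
sketch; assumes a client instance, and uses the MutableData wrapper
introduced with the publish changes):

  from allmydata.interfaces import MDMF_VERSION
  from allmydata.mutable.publish import MutableData

  d = client.create_mutable_file(MutableData("initial contents"),
                                 version=MDMF_VERSION)
  d.addCallback(lambda node: node.get_uri())

When version is None, the client falls back to mutable_file_default,
which in turn follows mutable.format in tahoe.cfg.
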
8434[mutable/filenode.py: add versions and partial-file updates to the mutable file node
8435Kevan Carstensen <kevan@isnotajoke.com>**20100814225738
8436 Ignore-this: 3f33454d119b4b82de54063bc57205e3
8437 
8438 One of the goals of MDMF as a GSoC project is to lay the groundwork for
8439 LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
8440 multiple versions of a single cap on the grid. In line with this, there
8441 is now a distinction between an overriding mutable file (which can be
8442 thought to correspond to the cap/unique identifier for that mutable
8443 file) and versions of the mutable file (which we can download, update,
8444 and so on). All download, upload, and modification operations end up
8445 happening on a particular version of a mutable file, but there are
8446 shortcut methods on the object representing the overriding mutable file
8447 that perform these operations on the best version of the mutable file
8448 (which is what code should be doing until we have LDMF and better
8449 support for other paradigms).
8450 
8451 Another goal of MDMF was to take advantage of segmentation to give
8452 callers more efficient partial file updates or appends. This patch
8453 implements methods that do that, too.
8454 
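 As a shorthand illustration of the versioned API (method names as
 implemented in this patch; the caller code is hypothetical):
 
   d = node.get_best_mutable_version()
   d.addCallback(lambda version:
       version.update(MutableData(new_data), offset))
 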
8455] {
8456hunk ./src/allmydata/mutable/filenode.py 7
8457 from zope.interface import implements
8458 from twisted.internet import defer, reactor
8459 from foolscap.api import eventually
8460-from allmydata.interfaces import IMutableFileNode, \
8461-     ICheckable, ICheckResults, NotEnoughSharesError
8462-from allmydata.util import hashutil, log
8463+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
8464+     NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION, IMutableUploadable, \
8465+     IMutableFileVersion, IWritable
8466+from allmydata import hashtree
8467+from allmydata.util import hashutil, log, consumer, deferredutil, mathutil
8468 from allmydata.util.assertutil import precondition
8469 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
8470 from allmydata.monitor import Monitor
8471hunk ./src/allmydata/mutable/filenode.py 17
8472 from pycryptopp.cipher.aes import AES
8473 
8474-from allmydata.mutable.publish import Publish
8475+from allmydata.mutable.publish import Publish, MutableFileHandle, \
8476+                                      MutableData,\
8477+                                      DEFAULT_MAX_SEGMENT_SIZE, \
8478+                                      TransformingUploadable
8479 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
8480      ResponseCache, UncoordinatedWriteError
8481 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
8482hunk ./src/allmydata/mutable/filenode.py 72
8483         self._sharemap = {} # known shares, shnum-to-[nodeids]
8484         self._cache = ResponseCache()
8485         self._most_recent_size = None
8486+        # filled in after __init__ if we're being created for the first time;
8487+        # filled in by the servermap updater before publishing, otherwise.
8488+        # left as None in case neither of those things happens, or in
8489+        # case the servermap can't find any shares to tell us what to
8490+        # publish as. None means "protocol version not yet known";
8491+        # set_version() and the servermap updater are responsible
8492+        # for filling it in.
8493+        self._protocol_version = None
8494 
8495         # all users of this MutableFileNode go through the serializer. This
8496         # takes advantage of the fact that Deferreds discard the callbacks
8497hunk ./src/allmydata/mutable/filenode.py 136
8498         return self._upload(initial_contents, None)
8499 
8500     def _get_initial_contents(self, contents):
8501-        if isinstance(contents, str):
8502-            return contents
8503         if contents is None:
8504hunk ./src/allmydata/mutable/filenode.py 137
8505-            return ""
8506+            return MutableData("")
8507+
8508+        if IMutableUploadable.providedBy(contents):
8509+            return contents
8510+
8511         assert callable(contents), "%s should be callable, not %s" % \
8512                (contents, type(contents))
8513         return contents(self)
8514hunk ./src/allmydata/mutable/filenode.py 211
8515 
8516     def get_size(self):
8517         return self._most_recent_size
8518+
8519     def get_current_size(self):
8520         d = self.get_size_of_best_version()
8521         d.addCallback(self._stash_size)
8522hunk ./src/allmydata/mutable/filenode.py 216
8523         return d
8524+
8525     def _stash_size(self, size):
8526         self._most_recent_size = size
8527         return size
8528hunk ./src/allmydata/mutable/filenode.py 275
8529             return cmp(self.__class__, them.__class__)
8530         return cmp(self._uri, them._uri)
8531 
8532-    def _do_serialized(self, cb, *args, **kwargs):
8533-        # note: to avoid deadlock, this callable is *not* allowed to invoke
8534-        # other serialized methods within this (or any other)
8535-        # MutableFileNode. The callable should be a bound method of this same
8536-        # MFN instance.
8537-        d = defer.Deferred()
8538-        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
8539-        # we need to put off d.callback until this Deferred is finished being
8540-        # processed. Otherwise the caller's subsequent activities (like,
8541-        # doing other things with this node) can cause reentrancy problems in
8542-        # the Deferred code itself
8543-        self._serializer.addBoth(lambda res: eventually(d.callback, res))
8544-        # add a log.err just in case something really weird happens, because
8545-        # self._serializer stays around forever, therefore we won't see the
8546-        # usual Unhandled Error in Deferred that would give us a hint.
8547-        self._serializer.addErrback(log.err)
8548-        return d
8549 
8550     #################################
8551     # ICheckable
8552hunk ./src/allmydata/mutable/filenode.py 300
8553 
8554 
8555     #################################
8556-    # IMutableFileNode
8557+    # IFileNode
8558+
8559+    def get_best_readable_version(self):
8560+        """
8561+        I return a Deferred that fires with a MutableFileVersion
8562+        representing the best readable version of the file that I
8563+        represent.
8564+        """
8565+        return self.get_readable_version()
8566+
8567+
8568+    def get_readable_version(self, servermap=None, version=None):
8569+        """
8570+        I return a Deferred that fires with a MutableFileVersion for my
8571+        version argument, if there is a recoverable file of that version
8572+        on the grid. If there is no recoverable version, I fire with an
8573+        UnrecoverableFileError.
8574+
8575+        If a servermap is provided, I look in there for the requested
8576+        version. If no servermap is provided, I create and update a new
8577+        one.
8578+
8579+        If no version is provided, then I return a MutableFileVersion
8580+        representing the best recoverable version of the file.
8581+        """
8582+        d = self._get_version_from_servermap(MODE_READ, servermap, version)
8583+        def _build_version((servermap, their_version)):
8584+            assert their_version in servermap.recoverable_versions()
8585+            assert their_version in servermap.make_versionmap()
8586+
8587+            mfv = MutableFileVersion(self,
8588+                                     servermap,
8589+                                     their_version,
8590+                                     self._storage_index,
8591+                                     self._storage_broker,
8592+                                     self._readkey,
8593+                                     history=self._history)
8594+            assert mfv.is_readonly()
8595+            # our caller can use this to download the contents of the
8596+            # mutable file.
8597+            return mfv
8598+        return d.addCallback(_build_version)
8599+
8600+
8601+    def _get_version_from_servermap(self,
8602+                                    mode,
8603+                                    servermap=None,
8604+                                    version=None):
8605+        """
8606+        I return a Deferred that fires with (servermap, version).
8607+
8608+        This function performs validation and a servermap update. If it
8609+        returns (servermap, version), the caller can assume that:
8610+            - servermap was last updated in mode.
8611+            - version is recoverable, and corresponds to the servermap.
8612+
8613+        If version and servermap are provided to me, I will validate
8614+        that version exists in the servermap, and that the servermap was
8615+        updated correctly.
8616+
8617+        If version is not provided, but servermap is, I will validate
8618+        the servermap and return the best recoverable version that I can
8619+        find in the servermap.
8620+
8621+        If the version is provided but the servermap isn't, I will
8622+        obtain a servermap that has been updated in the correct mode and
8623+        validate that version is found and recoverable.
8624+
8625+        If neither servermap nor version are provided, I will obtain a
8626+        servermap updated in the correct mode, and return the best
8627+        recoverable version that I can find in there.
8628+        """
8629+        # XXX: wording ^^^^
8630+        if servermap and servermap.last_update_mode == mode:
8631+            d = defer.succeed(servermap)
8632+        else:
8633+            d = self._get_servermap(mode)
8634+
8635+        def _get_version(servermap, v):
8636+            if v and v not in servermap.recoverable_versions():
8637+                v = None
8638+            elif not v:
8639+                v = servermap.best_recoverable_version()
8640+            if not v:
8641+                raise UnrecoverableFileError("no recoverable versions")
8642+
8643+            return (servermap, v)
8644+        return d.addCallback(_get_version, version)
8645+
8646 
8647     def download_best_version(self):
8648hunk ./src/allmydata/mutable/filenode.py 391
8649+        """
8650+        I return a Deferred that fires with the contents of the best
8651+        version of this mutable file.
8652+        """
8653         return self._do_serialized(self._download_best_version)
8654hunk ./src/allmydata/mutable/filenode.py 396
8655+
8656+
8657     def _download_best_version(self):
8658hunk ./src/allmydata/mutable/filenode.py 399
8659-        servermap = ServerMap()
8660-        d = self._try_once_to_download_best_version(servermap, MODE_READ)
8661-        def _maybe_retry(f):
8662-            f.trap(NotEnoughSharesError)
8663-            # the download is worth retrying once. Make sure to use the
8664-            # old servermap, since it is what remembers the bad shares,
8665-            # but use MODE_WRITE to make it look for even more shares.
8666-            # TODO: consider allowing this to retry multiple times.. this
8667-            # approach will let us tolerate about 8 bad shares, I think.
8668-            return self._try_once_to_download_best_version(servermap,
8669-                                                           MODE_WRITE)
8670+        """
8671+        I am the serialized sibling of download_best_version.
8672+        """
8673+        d = self.get_best_readable_version()
8674+        d.addCallback(self._record_size)
8675+        d.addCallback(lambda version: version.download_to_data())
8676+
8677+        # It is possible that the download will fail because there
8678+        # aren't enough shares to be had. If so, we will try again after
8679+        # updating the servermap in MODE_WRITE, which may find more
8680+        # shares than updating in MODE_READ, as we just did. We can do
8681+        # this by getting the best mutable version and downloading from
8682+        # that -- the best mutable version will be a MutableFileVersion
8683+        # with a servermap that was last updated in MODE_WRITE, as we
8684+        # want. If this fails, then we give up.
8685+        def _maybe_retry(failure):
8686+            failure.trap(NotEnoughSharesError)
8687+
8688+            d = self.get_best_mutable_version()
8689+            d.addCallback(self._record_size)
8690+            d.addCallback(lambda version: version.download_to_data())
8691+            return d
8692+
8693         d.addErrback(_maybe_retry)
8694         return d
8695hunk ./src/allmydata/mutable/filenode.py 424
8696-    def _try_once_to_download_best_version(self, servermap, mode):
8697-        d = self._update_servermap(servermap, mode)
8698-        d.addCallback(self._once_updated_download_best_version, servermap)
8699-        return d
8700-    def _once_updated_download_best_version(self, ignored, servermap):
8701-        goal = servermap.best_recoverable_version()
8702-        if not goal:
8703-            raise UnrecoverableFileError("no recoverable versions")
8704-        return self._try_once_to_download_version(servermap, goal)
8705+
8706+
8707+    def _record_size(self, mfv):
8708+        """
8709+        I record the size of a mutable file version.
8710+        """
8711+        self._most_recent_size = mfv.get_size()
8712+        return mfv
8713+
8714 
8715     def get_size_of_best_version(self):
8716hunk ./src/allmydata/mutable/filenode.py 435
8717-        d = self.get_servermap(MODE_READ)
8718-        def _got_servermap(smap):
8719-            ver = smap.best_recoverable_version()
8720-            if not ver:
8721-                raise UnrecoverableFileError("no recoverable version")
8722-            return smap.size_of_version(ver)
8723-        d.addCallback(_got_servermap)
8724-        return d
8725+        """
8726+        I return the size of the best version of this mutable file.
8727 
8728hunk ./src/allmydata/mutable/filenode.py 438
8729+        This is equivalent to calling get_size() on the result of
8730+        get_best_readable_version().
8731+        """
8732+        d = self.get_best_readable_version()
8733+        return d.addCallback(lambda mfv: mfv.get_size())
8734+
8735+
8736+    #################################
8737+    # IMutableFileNode
8738+
8739+    def get_best_mutable_version(self, servermap=None):
8740+        """
8741+        I return a Deferred that fires with a MutableFileVersion
8742+        representing the best readable version of the file that I
8743+        represent. I am like get_best_readable_version, except that I
8744+        will try to make a writable version if I can.
8745+        """
8746+        return self.get_mutable_version(servermap=servermap)
8747+
8748+
8749+    def get_mutable_version(self, servermap=None, version=None):
8750+        """
8751+        I return a version of this mutable file as a Deferred that
8752+        fires with a MutableFileVersion.
8753+
8754+        If version is provided, the Deferred will fire with a
8755+        MutableFileVersion initialized with that version. Otherwise, it
8756+        will fire with the best version that I can recover.
8757+
8758+        If servermap is provided, I will use that to find versions
8759+        instead of performing my own servermap update.
8760+        """
8761+        if self.is_readonly():
8762+            return self.get_readable_version(servermap=servermap,
8763+                                             version=version)
8764+
8765+        # get_mutable_version => write intent, so we require that the
8766+        # servermap is updated in MODE_WRITE
8767+        d = self._get_version_from_servermap(MODE_WRITE, servermap, version)
8768+        def _build_version((servermap, smap_version)):
8769+            # these should have been set by the servermap update.
8770+            assert self._secret_holder
8771+            assert self._writekey
8772+
8773+            mfv = MutableFileVersion(self,
8774+                                     servermap,
8775+                                     smap_version,
8776+                                     self._storage_index,
8777+                                     self._storage_broker,
8778+                                     self._readkey,
8779+                                     self._writekey,
8780+                                     self._secret_holder,
8781+                                     history=self._history)
8782+            assert not mfv.is_readonly()
8783+            return mfv
8784+
8785+        return d.addCallback(_build_version)
8786+
8787+
8788+    # XXX: I'm uncomfortable with the difference between upload and
8789+    #      overwrite, which, FWICT, is basically that you don't have to
8790+    #      do a servermap update before you overwrite. We split them up
8791+    #      that way anyway, so I guess there's no real difficulty in
8792+    #      offering both ways to callers, but it also makes the
8793+    #      public-facing API cluttery, and makes it hard to discern the
8794+    #      right way of doing things.
8795+
8796+    # In general, we leave it to callers to ensure that they aren't
8797+    # going to cause UncoordinatedWriteErrors when working with
8798+    # MutableFileVersions. We know that the next three operations
8799+    # (upload, overwrite, and modify) will all operate on the same
8800+    # version, so we say that only one of them can be going on at once,
8801+    # and serialize them to ensure that that actually happens, since as
8802+    # the caller in this situation it is our job to do that.
8803     def overwrite(self, new_contents):
8804hunk ./src/allmydata/mutable/filenode.py 513
8805+        """
8806+        I overwrite the contents of the best recoverable version of this
8807+        mutable file with new_contents. This is equivalent to calling
8808+        overwrite on the result of get_best_mutable_version with
8809+        new_contents as an argument. I return a Deferred that eventually
8810+        fires with the results of my replacement process.
8811+        """
8812         return self._do_serialized(self._overwrite, new_contents)
8813hunk ./src/allmydata/mutable/filenode.py 521
8814+
8815+
8816     def _overwrite(self, new_contents):
8817hunk ./src/allmydata/mutable/filenode.py 524
8818+        """
8819+        I am the serialized sibling of overwrite.
8820+        """
8821+        d = self.get_best_mutable_version()
8822+        d.addCallback(lambda mfv: mfv.overwrite(new_contents))
8823+        d.addCallback(self._did_upload, new_contents.get_size())
8824+        return d
8825+
8826+
8827+
8828+    def upload(self, new_contents, servermap):
8829+        """
8830+        I overwrite the contents of the best recoverable version of this
8831+        mutable file with new_contents, using servermap instead of
8832+        creating/updating our own servermap. I return a Deferred that
8833+        fires with the results of my upload.
8834+        """
8835+        return self._do_serialized(self._upload, new_contents, servermap)
8836+
8837+
8838+    def modify(self, modifier, backoffer=None):
8839+        """
8840+        I modify the contents of the best recoverable version of this
8841+        mutable file with the modifier. This is equivalent to calling
8842+        modify on the result of get_best_mutable_version. I return a
8843+        Deferred that eventually fires with an UploadResults instance
8844+        describing this process.
8845+        """
8846+        return self._do_serialized(self._modify, modifier, backoffer)
8847+
8848+
8849+    def _modify(self, modifier, backoffer):
8850+        """
8851+        I am the serialized sibling of modify.
8852+        """
8853+        d = self.get_best_mutable_version()
8854+        d.addCallback(lambda mfv: mfv.modify(modifier, backoffer))
8855+        return d
8856+
8857+
8858+    def download_version(self, servermap, version, fetch_privkey=False):
8859+        """
8860+        Download the specified version of this mutable file. I return a
8861+        Deferred that fires with the contents of the specified version
8862+        as a bytestring, or errbacks if the file is not recoverable.
8863+        """
8864+        d = self.get_readable_version(servermap, version)
8865+        return d.addCallback(lambda mfv: mfv.download_to_data(fetch_privkey))
8866+
8867+
8868+    def get_servermap(self, mode):
8869+        """
8870+        I return a servermap that has been updated in mode.
8871+
8872+        mode should be one of MODE_READ, MODE_WRITE, MODE_CHECK or
8873+        MODE_ANYTHING. See servermap.py for more on what these mean.
8874+        """
8875+        return self._do_serialized(self._get_servermap, mode)
8876+
8877+
8878+    def _get_servermap(self, mode):
8879+        """
8880+        I am a serialized twin to get_servermap.
8881+        """
8882         servermap = ServerMap()
8883hunk ./src/allmydata/mutable/filenode.py 589
8884-        d = self._update_servermap(servermap, mode=MODE_WRITE)
8885-        d.addCallback(lambda ignored: self._upload(new_contents, servermap))
8886+        d = self._update_servermap(servermap, mode)
8887+        # The servermap will tell us about the most recent size of the
8888+        # file, so we may as well record it so that callers can get
8889+        # more information about us.
8890+        if not self._most_recent_size:
8891+            d.addCallback(self._get_size_from_servermap)
8892+        return d
8893+
8894+
8895+    def _get_size_from_servermap(self, servermap):
8896+        """
8897+        I extract the size of the best version of this file and record
8898+        it in self._most_recent_size. I return the servermap that I was
8899+        given.
8900+        """
8901+        if servermap.recoverable_versions():
8902+            v = servermap.best_recoverable_version()
8903+            size = v[4] # verinfo[4] == size
8904+            self._most_recent_size = size
8905+        return servermap
8906+
8907+
8908+    def _update_servermap(self, servermap, mode):
8909+        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
8910+                             mode)
8911+        if self._history:
8912+            self._history.notify_mapupdate(u.get_status())
8913+        return u.update()
8914+
8915+
8916+    def set_version(self, version):
8917+        # I can be set in two ways:
8918+        #  1. When the node is created.
8919+        #  2. (for an existing share) when the Servermap is updated
8920+        #     before I am read.
8921+        assert version in (MDMF_VERSION, SDMF_VERSION)
8922+        self._protocol_version = version
8923+
8924+
8925+    def get_version(self):
8926+        return self._protocol_version
8927+
8928+
8929+    def _do_serialized(self, cb, *args, **kwargs):
8930+        # note: to avoid deadlock, this callable is *not* allowed to invoke
8931+        # other serialized methods within this (or any other)
8932+        # MutableFileNode. The callable should be a bound method of this same
8933+        # MFN instance.
8934+        d = defer.Deferred()
8935+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
8936+        # we need to put off d.callback until this Deferred is finished being
8937+        # processed. Otherwise the caller's subsequent activities (like,
8938+        # doing other things with this node) can cause reentrancy problems in
8939+        # the Deferred code itself
8940+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
8941+        # add a log.err just in case something really weird happens, because
8942+        # self._serializer stays around forever, therefore we won't see the
8943+        # usual Unhandled Error in Deferred that would give us a hint.
8944+        self._serializer.addErrback(log.err)
8945         return d
8946 
8947 
8948hunk ./src/allmydata/mutable/filenode.py 651
8949+    def _upload(self, new_contents, servermap):
8950+        """
8951+        A MutableFileNode still has to have some way of getting
8952+        published initially, which is what I am here for. After that,
8953+        all publishing, updating, modifying and so on happens through
8954+        MutableFileVersions.
8955+        """
8956+        assert self._pubkey, "update_servermap must be called before publish"
8957+
8958+        p = Publish(self, self._storage_broker, servermap)
8959+        if self._history:
8960+            self._history.notify_publish(p.get_status(),
8961+                                         new_contents.get_size())
8962+        d = p.publish(new_contents)
8963+        d.addCallback(self._did_upload, new_contents.get_size())
8964+        return d
8965+
8966+
8967+    def _did_upload(self, res, size):
8968+        self._most_recent_size = size
8969+        return res
8970+
8971+
8972+class MutableFileVersion:
8973+    """
8974+    I represent a specific version (most likely the best version) of a
8975+    mutable file.
8976+
8977+    Since I implement IReadable, instances which hold a
8978+    reference to an instance of me are guaranteed the ability (absent
8979+    connection difficulties or unrecoverable versions) to read the file
8980+    that I represent. Depending on whether I was initialized with a
8981+    write capability or not, I may also provide callers the ability to
8982+    overwrite or modify the contents of the mutable file that I
8983+    reference.
8984+    """
8985+    implements(IMutableFileVersion, IWritable)
8986+
8987+    def __init__(self,
8988+                 node,
8989+                 servermap,
8990+                 version,
8991+                 storage_index,
8992+                 storage_broker,
8993+                 readcap,
8994+                 writekey=None,
8995+                 write_secrets=None,
8996+                 history=None):
8997+
8998+        self._node = node
8999+        self._servermap = servermap
9000+        self._version = version
9001+        self._storage_index = storage_index
9002+        self._write_secrets = write_secrets
9003+        self._history = history
9004+        self._storage_broker = storage_broker
9005+
9006+        #assert isinstance(readcap, IURI)
9007+        self._readcap = readcap
9008+
9009+        self._writekey = writekey
9010+        self._serializer = defer.succeed(None)
9011+
9012+
9013+    def get_sequence_number(self):
9014+        """
9015+        Get the sequence number of the mutable version that I represent.
9016+        """
9017+        return self._version[0] # verinfo[0] == the sequence number
9018+
9019+
9020+    # TODO: Terminology?
9021+    def get_writekey(self):
9022+        """
9023+        I return a writekey or None if I don't have a writekey.
9024+        """
9025+        return self._writekey
9026+
9027+
9028+    def overwrite(self, new_contents):
9029+        """
9030+        I overwrite the contents of this mutable file version with the
9031+        data in new_contents.
9032+        """
9033+        assert not self.is_readonly()
9034+
9035+        return self._do_serialized(self._overwrite, new_contents)
9036+
9037+
9038+    def _overwrite(self, new_contents):
9039+        assert IMutableUploadable.providedBy(new_contents)
9040+        assert self._servermap.last_update_mode == MODE_WRITE
9041+
9042+        return self._upload(new_contents)
9043+
9044+
9045     def modify(self, modifier, backoffer=None):
9046         """I use a modifier callback to apply a change to the mutable file.
9047         I implement the following pseudocode::
9048hunk ./src/allmydata/mutable/filenode.py 787
9049         backoffer should not invoke any methods on this MutableFileNode
9050         instance, and it needs to be highly conscious of deadlock issues.
9051         """
9052+        assert not self.is_readonly()
9053+
9054         return self._do_serialized(self._modify, modifier, backoffer)
9055hunk ./src/allmydata/mutable/filenode.py 790
9056+
9057+
9058     def _modify(self, modifier, backoffer):
9059hunk ./src/allmydata/mutable/filenode.py 793
9060-        servermap = ServerMap()
9061         if backoffer is None:
9062             backoffer = BackoffAgent().delay
9063hunk ./src/allmydata/mutable/filenode.py 795
9064-        return self._modify_and_retry(servermap, modifier, backoffer, True)
9065-    def _modify_and_retry(self, servermap, modifier, backoffer, first_time):
9066-        d = self._modify_once(servermap, modifier, first_time)
9067+        return self._modify_and_retry(modifier, backoffer, True)
9068+
9069+
9070+    def _modify_and_retry(self, modifier, backoffer, first_time):
9071+        """
9072+        I try to apply modifier to the contents of this version of the
9073+        mutable file. If I succeed, I return an UploadResults instance
9074+        describing my success. If I fail, I try again after waiting for
9075+        a little bit.
9076+        """
9077+        log.msg("doing modify")
9078+        d = self._modify_once(modifier, first_time)
9079         def _retry(f):
9080             f.trap(UncoordinatedWriteError)
9081             d2 = defer.maybeDeferred(backoffer, self, f)
9082hunk ./src/allmydata/mutable/filenode.py 811
9083             d2.addCallback(lambda ignored:
9084-                           self._modify_and_retry(servermap, modifier,
9085+                           self._modify_and_retry(modifier,
9086                                                   backoffer, False))
9087             return d2
9088         d.addErrback(_retry)
9089hunk ./src/allmydata/mutable/filenode.py 816
9090         return d
9091-    def _modify_once(self, servermap, modifier, first_time):
9092-        d = self._update_servermap(servermap, MODE_WRITE)
9093-        d.addCallback(self._once_updated_download_best_version, servermap)
9094+
9095+
9096+    def _modify_once(self, modifier, first_time):
9097+        """
9098+        I attempt to apply a modifier to the contents of the mutable
9099+        file.
9100+        """
9101+        # XXX: This is wrong -- we could get more servers if we updated
9102+        # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to
9103+        # assert that the last update wasn't MODE_READ
9104+        assert self._servermap.last_update_mode == MODE_WRITE
9105+
9106+        # download_to_data is serialized, so we have to call this to
9107+        # avoid deadlock.
9108+        d = self._try_to_download_data()
9109         def _apply(old_contents):
9110hunk ./src/allmydata/mutable/filenode.py 832
9111-            new_contents = modifier(old_contents, servermap, first_time)
9112+            new_contents = modifier(old_contents, self._servermap, first_time)
9113+            precondition((isinstance(new_contents, str) or
9114+                          new_contents is None),
9115+                         "Modifier function must return a string "
9116+                         "or None")
9117+
9118             if new_contents is None or new_contents == old_contents:
9119hunk ./src/allmydata/mutable/filenode.py 839
9120+                log.msg("no changes")
9121                 # no changes need to be made
9122                 if first_time:
9123                     return
9124hunk ./src/allmydata/mutable/filenode.py 847
9125                 # recovery when it observes UCWE, we need to do a second
9126                 # publish. See #551 for details. We'll basically loop until
9127                 # we managed an uncontested publish.
9128-                new_contents = old_contents
9129-            precondition(isinstance(new_contents, str),
9130-                         "Modifier function must return a string or None")
9131-            return self._upload(new_contents, servermap)
9132+                old_uploadable = MutableData(old_contents)
9133+                new_contents = old_uploadable
9134+            else:
9135+                new_contents = MutableData(new_contents)
9136+
9137+            return self._upload(new_contents)
9138         d.addCallback(_apply)
9139         return d
9140 
9141hunk ./src/allmydata/mutable/filenode.py 856
9142-    def get_servermap(self, mode):
9143-        return self._do_serialized(self._get_servermap, mode)
9144-    def _get_servermap(self, mode):
9145-        servermap = ServerMap()
9146-        return self._update_servermap(servermap, mode)
9147-    def _update_servermap(self, servermap, mode):
9148-        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
9149-                             mode)
9150-        if self._history:
9151-            self._history.notify_mapupdate(u.get_status())
9152-        return u.update()
9153 
9154hunk ./src/allmydata/mutable/filenode.py 857
9155-    def download_version(self, servermap, version, fetch_privkey=False):
9156-        return self._do_serialized(self._try_once_to_download_version,
9157-                                   servermap, version, fetch_privkey)
9158-    def _try_once_to_download_version(self, servermap, version,
9159-                                      fetch_privkey=False):
9160-        r = Retrieve(self, servermap, version, fetch_privkey)
9161+    def is_readonly(self):
9162+        """
9163+        I return True if this MutableFileVersion provides no write
9164+        access to the file that it encapsulates, and False if it
9165+        provides the ability to modify the file.
9166+        """
9167+        return self._writekey is None
9168+
9169+
9170+    def is_mutable(self):
9171+        """
9172+        I return True, since mutable files are always mutable by
9173+        somebody.
9174+        """
9175+        return True
9176+
9177+
9178+    def get_storage_index(self):
9179+        """
9180+        I return the storage index of the reference that I encapsulate.
9181+        """
9182+        return self._storage_index
9183+
9184+
9185+    def get_size(self):
9186+        """
9187+        I return the length, in bytes, of this readable object.
9188+        """
9189+        return self._servermap.size_of_version(self._version)
9190+
9191+
9192+    def download_to_data(self, fetch_privkey=False):
9193+        """
9194+        I return a Deferred that fires with the contents of this
9195+        readable object as a byte string.
9196+
9197+        """
9198+        c = consumer.MemoryConsumer()
9199+        d = self.read(c, fetch_privkey=fetch_privkey)
9200+        d.addCallback(lambda mc: "".join(mc.chunks))
9201+        return d
9202+
9203+
9204+    def _try_to_download_data(self):
9205+        """
9206+        I am an unserialized cousin of download_to_data; I am called
9207+        from the children of modify() to download the data associated
9208+        with this mutable version.
9209+        """
9210+        c = consumer.MemoryConsumer()
9211+        # modify will almost certainly write, so we need the privkey.
9212+        d = self._read(c, fetch_privkey=True)
9213+        d.addCallback(lambda mc: "".join(mc.chunks))
9214+        return d
9215+
9216+
9217+    def read(self, consumer, offset=0, size=None, fetch_privkey=False):
9218+        """
9219+        I read a portion (possibly all) of the mutable file that I
9220+        reference into consumer.
9221+        """
9222+        return self._do_serialized(self._read, consumer, offset, size,
9223+                                   fetch_privkey)
9224+
9225+
9226+    def _read(self, consumer, offset=0, size=None, fetch_privkey=False):
9227+        """
9228+        I am the serialized companion of read.
9229+        """
9230+        r = Retrieve(self._node, self._servermap, self._version, fetch_privkey)
9231         if self._history:
9232             self._history.notify_retrieve(r.get_status())
9233hunk ./src/allmydata/mutable/filenode.py 929
9234-        d = r.download()
9235-        d.addCallback(self._downloaded_version)
9236+        d = r.download(consumer, offset, size)
9237         return d
9238hunk ./src/allmydata/mutable/filenode.py 931
9239-    def _downloaded_version(self, data):
9240-        self._most_recent_size = len(data)
9241-        return data
9242 
9243hunk ./src/allmydata/mutable/filenode.py 932
9244-    def upload(self, new_contents, servermap):
9245-        return self._do_serialized(self._upload, new_contents, servermap)
9246-    def _upload(self, new_contents, servermap):
9247-        assert self._pubkey, "update_servermap must be called before publish"
9248-        p = Publish(self, self._storage_broker, servermap)
9249+
9250+    def _do_serialized(self, cb, *args, **kwargs):
9251+        # note: to avoid deadlock, this callable is *not* allowed to invoke
9252+        # other serialized methods within this (or any other)
9253+        # MutableFileVersion. The callable should be a bound method of
9254+        # this same MutableFileVersion instance.
9255+        d = defer.Deferred()
9256+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
9257+        # we need to put off d.callback until this Deferred is finished being
9258+        # processed. Otherwise the caller's subsequent activities (like,
9259+        # doing other things with this node) can cause reentrancy problems in
9260+        # the Deferred code itself
9261+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
9262+        # add a log.err just in case something really weird happens, because
9263+        # self._serializer stays around forever, therefore we won't see the
9264+        # usual Unhandled Error in Deferred that would give us a hint.
9265+        self._serializer.addErrback(log.err)
9266+        return d
9267+
9268+
9269+    def _upload(self, new_contents):
9270+        #assert self._pubkey, "update_servermap must be called before publish"
9271+        p = Publish(self._node, self._storage_broker, self._servermap)
9272         if self._history:
9273hunk ./src/allmydata/mutable/filenode.py 956
9274-            self._history.notify_publish(p.get_status(), len(new_contents))
9275+            self._history.notify_publish(p.get_status(),
9276+                                         new_contents.get_size())
9277         d = p.publish(new_contents)
9278hunk ./src/allmydata/mutable/filenode.py 959
9279-        d.addCallback(self._did_upload, len(new_contents))
9280+        d.addCallback(self._did_upload, new_contents.get_size())
9281         return d
9282hunk ./src/allmydata/mutable/filenode.py 961
9283+
9284+
9285     def _did_upload(self, res, size):
9286         self._most_recent_size = size
9287         return res
9288hunk ./src/allmydata/mutable/filenode.py 966
9289+
9290+    def update(self, data, offset):
9291+        """
9292+        Do an update of this mutable file version by inserting data at
9293+        offset within the file. If offset equals the current EOF, this is
9294+        an append operation. I return a Deferred that fires with the
9295+        results of the update operation when it has completed.
9296+
9297+        In cases where the update appends no data, or appends too few
9298+        blocks to push the block count across a power-of-two boundary,
9299+        this operation will use roughly O(data.get_size())
9300+        memory/bandwidth/CPU to perform the update.
9301+        Otherwise, it must download, re-encode, and upload the entire
9302+        file again, which will use O(filesize) resources.
9303+        """
9304+        return self._do_serialized(self._update, data, offset)
9305+
9306+
9307+    def _update(self, data, offset):
9308+        """
9309+        I update the mutable file version represented by this particular
9310+        IMutableVersion by inserting the given data at the given
9311+        offset. I return a Deferred that fires when this has been
9312+        completed.
9313+        """
9314+        # We have two cases here:
9315+        # 1. The new data will add few enough segments so that it does
9316+        #    not cross into the next power-of-two boundary.
9317+        # 2. It doesn't.
9318+        #
9319+        # In the former case, we can modify the file in place. In the
9320+        # latter case, we need to re-encode the file.
9321+        new_size = data.get_size() + offset
9322+        old_size = self.get_size()
9323+        segment_size = self._version[3]
9324+        num_old_segments = mathutil.div_ceil(old_size,
9325+                                             segment_size)
9326+        num_new_segments = mathutil.div_ceil(new_size,
9327+                                             segment_size)
9328+        log.msg("got %d old segments, %d new segments" % \
9329+                        (num_old_segments, num_new_segments))
9330+
9331+        # We also do a whole file re-encode if the file is an SDMF file.
9332+        if self._version[2]: # version[2] == SDMF salt, which MDMF lacks
9333+            log.msg("doing re-encode instead of in-place update")
9334+            return self._do_modify_update(data, offset)
9335+
9336+        log.msg("updating in place")
9337+        d = self._do_update_update(data, offset)
9338+        d.addCallback(self._decode_and_decrypt_segments, data, offset)
9339+        d.addCallback(self._build_uploadable_and_finish, data, offset)
9340+        return d
9341+
9342+
9343+    def _do_modify_update(self, data, offset):
9344+        """
9345+        I perform a file update by modifying the contents of the file
9346+        after downloading it, then reuploading it. I am less efficient
9347+        than _do_update_update, but am necessary for certain updates.
9348+        """
9349+        def m(old, servermap, first_time):
9350+            start = offset
9351+            rest = offset + data.get_size()
9352+            new = old[:start]
9353+            new += "".join(data.read(data.get_size()))
9354+            new += old[rest:]
9355+            return new
9356+        return self._modify(m, None)
9357+
9358+
9359+    def _do_update_update(self, data, offset):
9360+        """
9361+        I start the Servermap update that gets us the data we need to
9362+        continue the update process. I return a Deferred that fires when
9363+        the servermap update is done.
9364+        """
9365+        assert IMutableUploadable.providedBy(data)
9366+        assert self.is_mutable()
9367+        # offset == self.get_size() is valid and means that we are
9368+        # appending data to the file.
9369+        assert offset <= self.get_size()
9370+
9371+        datasize = data.get_size()
9372+        # We'll need the segment that the data starts in, regardless of
9373+        # what we'll do later.
9374+        start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE)
9375+        start_segment -= 1
9376+
9377+        # We only need the end segment if the data we append does not go
9378+        # beyond the current end-of-file.
9379+        end_segment = start_segment
9380+        if offset + data.get_size() < self.get_size():
9381+            end_data = offset + data.get_size()
9382+            end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE)
9383+            end_segment -= 1
9384+        self._start_segment = start_segment
9385+        self._end_segment = end_segment
9386+
9387+        # Now ask for the servermap to be updated in MODE_WRITE with
9388+        # this update range.
9389+        u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
9390+                             self._servermap,
9391+                             mode=MODE_WRITE,
9392+                             update_range=(start_segment, end_segment))
9393+        return u.update()
9394+
9395+
9396+    def _decode_and_decrypt_segments(self, ignored, data, offset):
9397+        """
9398+        After the servermap update, I take the encrypted and encoded
9399+        data that the servermap fetched while doing its update and
9400+        transform it into decoded-and-decrypted plaintext that can be
9401+        used by the new uploadable. I return a Deferred that fires with
9402+        the segments.
9403+        """
9404+        r = Retrieve(self._node, self._servermap, self._version)
9405+        # decode: takes in our blocks and salts from the servermap,
9406+        # returns a Deferred that fires with the corresponding plaintext
9407+        # segments. Does not download -- simply takes advantage of
9408+        # existing infrastructure within the Retrieve class to avoid
9409+        # duplicating code.
9410+        sm = self._servermap
9411+        # XXX: If the methods in the servermap don't work as
9412+        # abstractions, you should rewrite them instead of going around
9413+        # them.
9414+        update_data = sm.update_data
9415+        start_segments = {} # shnum -> start segment
9416+        end_segments = {} # shnum -> end segment
9417+        blockhashes = {} # shnum -> blockhash tree
9418+        for (shnum, data) in update_data.iteritems():
9419+            data = [d[1] for d in data if d[0] == self._version]
9420+
9421+            # Every data entry in our list should now refer to share shnum
9422+            # of a single version of the mutable file, so all of the
9423+            # entries should be identical.
9424+            datum = data[0]
9425+            assert filter(lambda x: x != datum, data) == []
9426+
9427+            blockhashes[shnum] = datum[0]
9428+            start_segments[shnum] = datum[1]
9429+            end_segments[shnum] = datum[2]
9430+
9431+        d1 = r.decode(start_segments, self._start_segment)
9432+        d2 = r.decode(end_segments, self._end_segment)
9433+        d3 = defer.succeed(blockhashes)
9434+        return deferredutil.gatherResults([d1, d2, d3])
9435+
9436+
9437+    def _build_uploadable_and_finish(self, segments_and_bht, data, offset):
9438+        """
9439+        After the process has the plaintext segments, I build the
9440+        TransformingUploadable that the publisher will eventually
9441+        re-upload to the grid. I then invoke the publisher with that
9442+        uploadable, and return a Deferred that fires when the publish
9443+        operation has completed without issue.
9444+        """
9445+        u = TransformingUploadable(data, offset,
9446+                                   self._version[3],
9447+                                   segments_and_bht[0],
9448+                                   segments_and_bht[1])
9449+        p = Publish(self._node, self._storage_broker, self._servermap)
9450+        return p.update(u, offset, segments_and_bht[2], self._version)
9451}
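Taken together, the new entry points compose as follows. This is a hedged caller-side sketch, not part of the patch: it assumes node is an existing writable MutableFileNode, and uses only APIs introduced above (MutableData, get_best_mutable_version, get_size, update). Inserting at offset == get_size() is the documented append case, which MDMF can often service without re-encoding the whole file.

    from allmydata.mutable.publish import MutableData

    def append_to_mutable_file(node, new_bytes):
        # Fires with a writable MutableFileVersion whose servermap was
        # updated in MODE_WRITE behind the scenes.
        d = node.get_best_mutable_version()
        def _got_version(version):
            # offset == version.get_size() means "append".
            return version.update(MutableData(new_bytes),
                                  version.get_size())
        d.addCallback(_got_version)
        return d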
9452[nodemaker.py: Make nodemaker expose a way to create MDMF files
9453Kevan Carstensen <kevan@isnotajoke.com>**20100814225829
9454 Ignore-this: fceefa3045dda1931ce28f9dd1bf121b
9455] {
9456hunk ./src/allmydata/nodemaker.py 3
9457 import weakref
9458 from zope.interface import implements
9459-from allmydata.interfaces import INodeMaker
9460+from allmydata.util.assertutil import precondition
9461+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
9462+                                 SDMF_VERSION, MDMF_VERSION
9463 from allmydata.immutable.literal import LiteralFileNode
9464 from allmydata.immutable.filenode import ImmutableFileNode, CiphertextFileNode
9465 from allmydata.immutable.upload import Data
9466hunk ./src/allmydata/nodemaker.py 10
9467 from allmydata.mutable.filenode import MutableFileNode
9468+from allmydata.mutable.publish import MutableData
9469 from allmydata.dirnode import DirectoryNode, pack_children
9470 from allmydata.unknown import UnknownNode
9471 from allmydata import uri
9472hunk ./src/allmydata/nodemaker.py 93
9473             return self._create_dirnode(filenode)
9474         return None
9475 
9476-    def create_mutable_file(self, contents=None, keysize=None):
9477+    def create_mutable_file(self, contents=None, keysize=None,
9478+                            version=SDMF_VERSION):
9479         n = MutableFileNode(self.storage_broker, self.secret_holder,
9480                             self.default_encoding_parameters, self.history)
9481hunk ./src/allmydata/nodemaker.py 97
9482+        n.set_version(version)
9483         d = self.key_generator.generate(keysize)
9484         d.addCallback(n.create_with_keys, contents)
9485         d.addCallback(lambda res: n)
9486hunk ./src/allmydata/nodemaker.py 104
9487         return d
9488 
9489     def create_new_mutable_directory(self, initial_children={}):
9490+        # mutable directories will always be SDMF for now, to help
9491+        # compatibility with older clients.
9492+        version = SDMF_VERSION
9493+        # initial_children must have metadata (i.e. {} instead of None)
9494+        for (name, (node, metadata)) in initial_children.iteritems():
9495+            precondition(isinstance(metadata, dict),
9496+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
9497+            node.raise_error()
9498         d = self.create_mutable_file(lambda n:
9499hunk ./src/allmydata/nodemaker.py 113
9500-                                     pack_children(initial_children, n.get_writekey()))
9501+                                     MutableData(pack_children(initial_children,
9502+                                                    n.get_writekey())),
9503+                                     version=version)
9504         d.addCallback(self._create_dirnode)
9505         return d
9506 
9507}
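A short usage sketch of the new nodemaker entry point (assumptions: nodemaker is an already-constructed NodeMaker, and MutableData wraps a string as an IMutableUploadable, per the filenode patch above):

    from allmydata.interfaces import MDMF_VERSION
    from allmydata.mutable.publish import MutableData

    def make_mdmf_file(nodemaker, contents):
        # version=MDMF_VERSION overrides the SDMF default in the
        # signature above; mutable directories deliberately ignore
        # this knob and remain SDMF for compatibility.
        return nodemaker.create_mutable_file(MutableData(contents),
                                             version=MDMF_VERSION)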
9508[tests:
9509Kevan Carstensen <kevan@isnotajoke.com>**20100814225848
9510 Ignore-this: b8ca299c4286c38ac1703aadf61c755d
9511 
9512     - A lot of existing tests relied on aspects of the mutable file
9513       implementation that were changed. This patch updates those tests
9514       to work with the changes.
9515     - This patch also adds tests for new features.
9516] {
9517hunk ./src/allmydata/test/common.py 12
9518 from allmydata import uri, dirnode, client
9519 from allmydata.introducer.server import IntroducerNode
9520 from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
9521-     FileTooLargeError, NotEnoughSharesError, ICheckable
9522+     FileTooLargeError, NotEnoughSharesError, ICheckable, \
9523+     IMutableUploadable, SDMF_VERSION, MDMF_VERSION
9524 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
9525      DeepCheckResults, DeepCheckAndRepairResults
9526 from allmydata.mutable.common import CorruptShareError
9527hunk ./src/allmydata/test/common.py 18
9528 from allmydata.mutable.layout import unpack_header
9529+from allmydata.mutable.publish import MutableData
9530 from allmydata.storage.server import storage_index_to_dir
9531 from allmydata.storage.mutable import MutableShareFile
9532 from allmydata.util import hashutil, log, fileutil, pollmixin
9533hunk ./src/allmydata/test/common.py 152
9534         consumer.write(data[start:end])
9535         return consumer
9536 
9537+
9538+    def get_best_readable_version(self):
9539+        return defer.succeed(self)
9540+
9541+
9542+    def download_to_data(self):
9543+        # delegate to the module-level download_to_data helper
9544+        return download_to_data(self)
9545+
9546+    download_best_version = download_to_data
9547+
9548+
9549+    def get_size_of_best_version(self):
9550+        return defer.succeed(self.get_size())
9551+
9552+
9553 def make_chk_file_cap(size):
9554     return uri.CHKFileURI(key=os.urandom(16),
9555                           uri_extension_hash=os.urandom(32),
9556hunk ./src/allmydata/test/common.py 192
9557     MUTABLE_SIZELIMIT = 10000
9558     all_contents = {}
9559     bad_shares = {}
9560+    file_types = {} # storage index => MDMF_VERSION or SDMF_VERSION
9561 
9562     def __init__(self, storage_broker, secret_holder,
9563                  default_encoding_parameters, history):
9564hunk ./src/allmydata/test/common.py 199
9565         self.init_from_cap(make_mutable_file_cap())
9566     def create(self, contents, key_generator=None, keysize=None):
9567         initial_contents = self._get_initial_contents(contents)
9568-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
9569-            raise FileTooLargeError("SDMF is limited to one segment, and "
9570-                                    "%d > %d" % (len(initial_contents),
9571-                                                 self.MUTABLE_SIZELIMIT))
9572-        self.all_contents[self.storage_index] = initial_contents
9573+        data = initial_contents.read(initial_contents.get_size())
9574+        data = "".join(data)
9575+        self.all_contents[self.storage_index] = data
9576         return defer.succeed(self)
9577     def _get_initial_contents(self, contents):
9578hunk ./src/allmydata/test/common.py 204
9579-        if isinstance(contents, str):
9580-            return contents
9581         if contents is None:
9582hunk ./src/allmydata/test/common.py 205
9583-            return ""
9584+            return MutableData("")
9585+
9586+        if IMutableUploadable.providedBy(contents):
9587+            return contents
9588+
9589         assert callable(contents), "%s should be callable, not %s" % \
9590                (contents, type(contents))
9591         return contents(self)
9592hunk ./src/allmydata/test/common.py 257
9593     def get_storage_index(self):
9594         return self.storage_index
9595 
9596+    def get_servermap(self, mode):
9597+        return defer.succeed(None)
9598+
9599+    def set_version(self, version):
9600+        assert version in (SDMF_VERSION, MDMF_VERSION)
9601+        self.file_types[self.storage_index] = version
9602+
9603+    def get_version(self):
9604+        assert self.storage_index in self.file_types
9605+        return self.file_types[self.storage_index]
9606+
9607     def check(self, monitor, verify=False, add_lease=False):
9608         r = CheckResults(self.my_uri, self.storage_index)
9609         is_bad = self.bad_shares.get(self.storage_index, None)
9610hunk ./src/allmydata/test/common.py 326
9611         return d
9612 
9613     def download_best_version(self):
9614+        return defer.succeed(self._download_best_version())
9615+
9616+
9617+    def _download_best_version(self, ignored=None):
9618         if isinstance(self.my_uri, uri.LiteralFileURI):
9619hunk ./src/allmydata/test/common.py 331
9620-            return defer.succeed(self.my_uri.data)
9621+            return self.my_uri.data
9622         if self.storage_index not in self.all_contents:
9623hunk ./src/allmydata/test/common.py 333
9624-            return defer.fail(NotEnoughSharesError(None, 0, 3))
9625-        return defer.succeed(self.all_contents[self.storage_index])
9626+            raise NotEnoughSharesError(None, 0, 3)
9627+        return self.all_contents[self.storage_index]
9628+
9629 
9630     def overwrite(self, new_contents):
9631hunk ./src/allmydata/test/common.py 338
9632-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
9633-            raise FileTooLargeError("SDMF is limited to one segment, and "
9634-                                    "%d > %d" % (len(new_contents),
9635-                                                 self.MUTABLE_SIZELIMIT))
9636         assert not self.is_readonly()
9637hunk ./src/allmydata/test/common.py 339
9638-        self.all_contents[self.storage_index] = new_contents
9639+        new_data = new_contents.read(new_contents.get_size())
9640+        new_data = "".join(new_data)
9641+        self.all_contents[self.storage_index] = new_data
9642         return defer.succeed(None)
9643     def modify(self, modifier):
9644         # this does not implement FileTooLargeError, but the real one does
9645hunk ./src/allmydata/test/common.py 349
9646     def _modify(self, modifier):
9647         assert not self.is_readonly()
9648         old_contents = self.all_contents[self.storage_index]
9649-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
9650+        new_data = modifier(old_contents, None, True)
9651+        self.all_contents[self.storage_index] = new_data
9652         return None
9653 
9654hunk ./src/allmydata/test/common.py 353
9655+    # As actually implemented, MutableFileNode and MutableFileVersion
9656+    # are distinct. However, nothing in the webapi uses (yet) that
9657+    # distinction -- it just uses the unified download interface
9658+    # provided by get_best_readable_version and read. When we start
9659+    # doing cooler things like LDMF, we will want to revise this code to
9660+    # be less simplistic.
9661+    def get_best_readable_version(self):
9662+        return defer.succeed(self)
9663+
9664+
9665+    def get_best_mutable_version(self):
9666+        return defer.succeed(self)
9667+
9668+    # Ditto for this, which is an implementation of IWritable.
9669+    # XXX: Declare that the same is implemented.
9670+    def update(self, data, offset):
9671+        assert not self.is_readonly()
9672+        def modifier(old, servermap, first_time):
9673+            new = old[:offset] + "".join(data.read(data.get_size()))
9674+            new += old[len(new):]
9675+            return new
9676+        return self.modify(modifier)
9677+
9678+
9679+    def read(self, consumer, offset=0, size=None):
9680+        data = self._download_best_version()
9681+        if size:
9682+            data = data[offset:offset+size]
9683+        consumer.write(data)
9684+        return defer.succeed(consumer)
9685+
9686+
9687 def make_mutable_file_cap():
9688     return uri.WriteableSSKFileURI(writekey=os.urandom(16),
9689                                    fingerprint=os.urandom(32))
9690hunk ./src/allmydata/test/test_checker.py 11
9691 from allmydata.test.no_network import GridTestMixin
9692 from allmydata.immutable.upload import Data
9693 from allmydata.test.common_web import WebRenderingMixin
9694+from allmydata.mutable.publish import MutableData
9695 
9696 class FakeClient:
9697     def get_storage_broker(self):
9698hunk ./src/allmydata/test/test_checker.py 291
9699         def _stash_immutable(ur):
9700             self.imm = c0.create_node_from_uri(ur.uri)
9701         d.addCallback(_stash_immutable)
9702-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
9703+        d.addCallback(lambda ign:
9704+            c0.create_mutable_file(MutableData("contents")))
9705         def _stash_mutable(node):
9706             self.mut = node
9707         d.addCallback(_stash_mutable)
9708hunk ./src/allmydata/test/test_cli.py 11
9709 from allmydata.util import fileutil, hashutil, base32
9710 from allmydata import uri
9711 from allmydata.immutable import upload
9712+from allmydata.mutable.publish import MutableData
9713 from allmydata.dirnode import normalize
9714 
9715 # Test that the scripts can be imported -- although the actual tests of their
9725hunk ./src/allmydata/test/test_cli.py 949
9726         d.addCallback(lambda (rc,out,err): self.failUnlessReallyEqual(out, DATA2))
9727         return d
9728 
9729+    def test_mutable_type(self):
9730+        self.basedir = "cli/Put/mutable_type"
9731+        self.set_up_grid()
9732+        data = "data" * 100000
9733+        fn1 = os.path.join(self.basedir, "data")
9734+        fileutil.write(fn1, data)
9735+        d = self.do_cli("create-alias", "tahoe")
9736+        d.addCallback(lambda ignored:
9737+            self.do_cli("put", "--mutable", "--mutable-type=mdmf",
9738+                        fn1, "tahoe:uploaded.txt"))
9739+        d.addCallback(lambda ignored:
9740+            self.do_cli("ls", "--json", "tahoe:uploaded.txt"))
9741+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
9742+        d.addCallback(lambda ignored:
9743+            self.do_cli("put", "--mutable", "--mutable-type=sdmf",
9744+                        fn1, "tahoe:uploaded2.txt"))
9745+        d.addCallback(lambda ignored:
9746+            self.do_cli("ls", "--json", "tahoe:uploaded2.txt"))
9747+        d.addCallback(lambda (rc, json, err):
9748+            self.failUnlessIn("sdmf", json))
9749+        return d
9750+
9751+    def test_mutable_type_unlinked(self):
9752+        self.basedir = "cli/Put/mutable_type_unlinked"
9753+        self.set_up_grid()
9754+        data = "data" * 100000
9755+        fn1 = os.path.join(self.basedir, "data")
9756+        fileutil.write(fn1, data)
9757+        d = self.do_cli("put", "--mutable", "--mutable-type=mdmf", fn1)
9758+        d.addCallback(lambda (rc, cap, err):
9759+            self.do_cli("ls", "--json", cap))
9760+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
9761+        d.addCallback(lambda ignored:
9762+            self.do_cli("put", "--mutable", "--mutable-type=sdmf", fn1))
9763+        d.addCallback(lambda (rc, cap, err):
9764+            self.do_cli("ls", "--json", cap))
9765+        d.addCallback(lambda (rc, json, err):
9766+            self.failUnlessIn("sdmf", json))
9767+        return d
9768+
9769+    def test_mutable_type_invalid_format(self):
9770+        self.basedir = "cli/Put/mutable_type_invalid_format"
9771+        self.set_up_grid()
9772+        data = "data" * 100000
9773+        fn1 = os.path.join(self.basedir, "data")
9774+        fileutil.write(fn1, data)
9775+        d = self.do_cli("put", "--mutable", "--mutable-type=ldmf", fn1)
9776+        def _check_failure((rc, out, err)):
9777+            self.failIfEqual(rc, 0)
9778+            self.failUnlessIn("invalid", err)
9779+        d.addCallback(_check_failure)
9780+        return d
9781+
9782     def test_put_with_nonexistent_alias(self):
9783         # when invoked with an alias that doesn't exist, 'tahoe put'
9784         # should output a useful error message, not a stack trace
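
For reference, a hedged sketch of what the "ls --json" checks in the
--mutable-type tests above are matching against. The tests only assert
that "mdmf" or "sdmf" appears somewhere in the JSON; the
["filenode", {...}] shape and the "mutable-type" key name are
assumptions here, not something these tests pin down:

    import simplejson

    def mutable_type_of(json_output):
        # "tahoe ls --json" is assumed to emit a (nodetype, data) pair
        nodetype, data = simplejson.loads(json_output)
        return data.get("mutable-type")  # e.g. "mdmf" or "sdmf"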
9785hunk ./src/allmydata/test/test_cli.py 2028
9786         self.set_up_grid()
9787         c0 = self.g.clients[0]
9788         DATA = "data" * 100
9789-        d = c0.create_mutable_file(DATA)
9790+        DATA_uploadable = MutableData(DATA)
9791+        d = c0.create_mutable_file(DATA_uploadable)
9792         def _stash_uri(n):
9793             self.uri = n.get_uri()
9794         d.addCallback(_stash_uri)
9795hunk ./src/allmydata/test/test_cli.py 2130
9796                                            upload.Data("literal",
9797                                                         convergence="")))
9798         d.addCallback(_stash_uri, "small")
9799-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
9800+        d.addCallback(lambda ign:
9801+            c0.create_mutable_file(MutableData(DATA+"1")))
9802         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9803         d.addCallback(_stash_uri, "mutable")
9804 
9805hunk ./src/allmydata/test/test_cli.py 2149
9806         # root/small
9807         # root/mutable
9808 
9809+        # We haven't broken anything yet, so this should all be healthy.
9810         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
9811                                               self.rooturi))
9812         def _check2((rc, out, err)):
9813hunk ./src/allmydata/test/test_cli.py 2164
9814                             in lines, out)
9815         d.addCallback(_check2)
9816 
9817+        # Similarly, all of these results should be as we expect them to
9818+        # be for a healthy file layout.
9819         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
9820         def _check_stats((rc, out, err)):
9821             self.failUnlessReallyEqual(err, "")
9822hunk ./src/allmydata/test/test_cli.py 2181
9823             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
9824         d.addCallback(_check_stats)
9825 
9826+        # Now we break things.
9827         def _clobber_shares(ignored):
9828             shares = self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"])
9829             self.failUnlessReallyEqual(len(shares), 10)
9830hunk ./src/allmydata/test/test_cli.py 2206
9831 
9832         d.addCallback(lambda ign:
9833                       self.do_cli("deep-check", "--verbose", self.rooturi))
9834+        # This should reveal the missing share, but not the corrupt
9835+        # share, since we didn't tell the deep check operation to also
9836+        # verify.
9837         def _check3((rc, out, err)):
9838             self.failUnlessReallyEqual(err, "")
9839             self.failUnlessReallyEqual(rc, 0)
9840hunk ./src/allmydata/test/test_cli.py 2257
9841                                   "--verbose", "--verify", "--repair",
9842                                   self.rooturi))
9843         def _check6((rc, out, err)):
9844+            # We've just repaired the directory. There is no reason for
9845+            # that repair to be unsuccessful.
9846             self.failUnlessReallyEqual(err, "")
9847             self.failUnlessReallyEqual(rc, 0)
9848             lines = out.splitlines()
9849hunk ./src/allmydata/test/test_deepcheck.py 9
9850 from twisted.internet import threads # CLI tests use deferToThread
9851 from allmydata.immutable import upload
9852 from allmydata.mutable.common import UnrecoverableFileError
9853+from allmydata.mutable.publish import MutableData
9854 from allmydata.util import idlib
9855 from allmydata.util import base32
9856 from allmydata.scripts import runner
9857hunk ./src/allmydata/test/test_deepcheck.py 38
9858         self.basedir = "deepcheck/MutableChecker/good"
9859         self.set_up_grid()
9860         CONTENTS = "a little bit of data"
9861-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9862+        CONTENTS_uploadable = MutableData(CONTENTS)
9863+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9864         def _created(node):
9865             self.node = node
9866             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9867hunk ./src/allmydata/test/test_deepcheck.py 61
9868         self.basedir = "deepcheck/MutableChecker/corrupt"
9869         self.set_up_grid()
9870         CONTENTS = "a little bit of data"
9871-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9872+        CONTENTS_uploadable = MutableData(CONTENTS)
9873+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9874         def _stash_and_corrupt(node):
9875             self.node = node
9876             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9877hunk ./src/allmydata/test/test_deepcheck.py 99
9878         self.basedir = "deepcheck/MutableChecker/delete_share"
9879         self.set_up_grid()
9880         CONTENTS = "a little bit of data"
9881-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9882+        CONTENTS_uploadable = MutableData(CONTENTS)
9883+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9884         def _stash_and_delete(node):
9885             self.node = node
9886             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9887hunk ./src/allmydata/test/test_deepcheck.py 223
9888             self.root = n
9889             self.root_uri = n.get_uri()
9890         d.addCallback(_created_root)
9891-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
9892+        d.addCallback(lambda ign:
9893+            c0.create_mutable_file(MutableData("mutable file contents")))
9894         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
9895         def _created_mutable(n):
9896             self.mutable = n
9897hunk ./src/allmydata/test/test_deepcheck.py 965
9898     def create_mangled(self, ignored, name):
9899         nodetype, mangletype = name.split("-", 1)
9900         if nodetype == "mutable":
9901-            d = self.g.clients[0].create_mutable_file("mutable file contents")
9902+            mutable_uploadable = MutableData("mutable file contents")
9903+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
9904             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
9905         elif nodetype == "large":
9906             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
9907hunk ./src/allmydata/test/test_dirnode.py 1304
9908     implements(IMutableFileNode)
9909     counter = 0
9910     def __init__(self, initial_contents=""):
9911-        self.data = self._get_initial_contents(initial_contents)
9912+        data = self._get_initial_contents(initial_contents)
9913+        self.data = data.read(data.get_size())
9914+        self.data = "".join(self.data)
9915+
9916         counter = FakeMutableFile.counter
9917         FakeMutableFile.counter += 1
9918         writekey = hashutil.ssk_writekey_hash(str(counter))
9919hunk ./src/allmydata/test/test_dirnode.py 1354
9920         pass
9921 
9922     def modify(self, modifier):
9923-        self.data = modifier(self.data, None, True)
9924+        data = modifier(self.data, None, True)
9925+        self.data = data
9926         return defer.succeed(None)
9927 
9928 class FakeNodeMaker(NodeMaker):
9929hunk ./src/allmydata/test/test_dirnode.py 1359
9930-    def create_mutable_file(self, contents="", keysize=None):
9931+    def create_mutable_file(self, contents="", keysize=None, version=None):
9932         return defer.succeed(FakeMutableFile(contents))
9933 
9934 class FakeClient2(Client):
9935hunk ./src/allmydata/test/test_filenode.py 98
9936         def _check_segment(res):
9937             self.failUnlessEqual(res, DATA[1:1+5])
9938         d.addCallback(_check_segment)
9939+        d.addCallback(lambda ignored: fn1.get_best_readable_version())
9940+        d.addCallback(lambda fn2: self.failUnlessEqual(fn1, fn2))
9941+        d.addCallback(lambda ignored:
9942+            fn1.get_size_of_best_version())
9943+        d.addCallback(lambda size:
9944+            self.failUnlessEqual(size, len(DATA)))
9945+        d.addCallback(lambda ignored:
9946+            fn1.download_to_data())
9947+        d.addCallback(lambda data:
9948+            self.failUnlessEqual(data, DATA))
9949+        d.addCallback(lambda ignored:
9950+            fn1.download_best_version())
9951+        d.addCallback(lambda data:
9952+            self.failUnlessEqual(data, DATA))
9953 
9954         return d
9955 
9956hunk ./src/allmydata/test/test_hung_server.py 10
9957 from allmydata.util.consumer import download_to_data
9958 from allmydata.immutable import upload
9959 from allmydata.mutable.common import UnrecoverableFileError
9960+from allmydata.mutable.publish import MutableData
9961 from allmydata.storage.common import storage_index_to_dir
9962 from allmydata.test.no_network import GridTestMixin
9963 from allmydata.test.common import ShouldFailMixin
9964hunk ./src/allmydata/test/test_hung_server.py 108
9965         self.servers = [(id, ss) for (id, ss) in nm.storage_broker.get_all_servers()]
9966 
9967         if mutable:
9968-            d = nm.create_mutable_file(mutable_plaintext)
9969+            uploadable = MutableData(mutable_plaintext)
9970+            d = nm.create_mutable_file(uploadable)
9971             def _uploaded_mutable(node):
9972                 self.uri = node.get_uri()
9973                 self.shares = self.find_uri_shares(self.uri)
9974hunk ./src/allmydata/test/test_immutable.py 4
9975 from allmydata.test import common
9976 from allmydata.interfaces import NotEnoughSharesError
9977 from allmydata.util.consumer import download_to_data
9978-from twisted.internet import defer
9979+from twisted.internet import defer, base
9980 from twisted.trial import unittest
9981 import random
9982 
9983hunk ./src/allmydata/test/test_immutable.py 143
9984         d.addCallback(_after_attempt)
9985         return d
9986 
9987+    def test_download_to_data(self):
9988+        d = self.n.download_to_data()
9989+        d.addCallback(lambda data:
9990+            self.failUnlessEqual(data, common.TEST_DATA))
9991+        return d
9992 
9993hunk ./src/allmydata/test/test_immutable.py 149
9994+
9995+    def test_download_best_version(self):
9996+        d = self.n.download_best_version()
9997+        d.addCallback(lambda data:
9998+            self.failUnlessEqual(data, common.TEST_DATA))
9999+        return d
10000+
10001+
10002+    def test_get_best_readable_version(self):
10003+        d = self.n.get_best_readable_version()
10004+        d.addCallback(lambda n2:
10005+            self.failUnlessEqual(n2, self.n))
10006+        return d
10007+
10008+    def test_get_size_of_best_version(self):
10009+        d = self.n.get_size_of_best_version()
10010+        d.addCallback(lambda size:
10011+            self.failUnlessEqual(size, len(common.TEST_DATA)))
10012+        return d
10013+
10014+
10015 # XXX extend these tests to show bad behavior of various kinds from servers:
10016 # raising exception from each remove_foo() method, for example
10017 
10018hunk ./src/allmydata/test/test_mutable.py 2
10019 
10020-import struct
10021+import struct, os
10022 from cStringIO import StringIO
10023 from twisted.trial import unittest
10024 from twisted.internet import defer, reactor
10025hunk ./src/allmydata/test/test_mutable.py 8
10026 from allmydata import uri, client
10027 from allmydata.nodemaker import NodeMaker
10028-from allmydata.util import base32
10029+from allmydata.util import base32, consumer, mathutil
10030 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
10031      ssk_pubkey_fingerprint_hash
10032hunk ./src/allmydata/test/test_mutable.py 11
10033+from allmydata.util.deferredutil import gatherResults
10034 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
10035hunk ./src/allmydata/test/test_mutable.py 13
10036-     NotEnoughSharesError
10037+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
10038 from allmydata.monitor import Monitor
10039 from allmydata.test.common import ShouldFailMixin
10040 from allmydata.test.no_network import GridTestMixin
10041hunk ./src/allmydata/test/test_mutable.py 27
10042      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
10043      NotEnoughServersError, CorruptShareError
10044 from allmydata.mutable.retrieve import Retrieve
10045-from allmydata.mutable.publish import Publish
10046+from allmydata.mutable.publish import Publish, MutableFileHandle, \
10047+                                      MutableData, \
10048+                                      DEFAULT_MAX_SEGMENT_SIZE
10049 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
10050hunk ./src/allmydata/test/test_mutable.py 31
10051-from allmydata.mutable.layout import unpack_header, unpack_share
10052+from allmydata.mutable.layout import unpack_header, unpack_share, \
10053+                                     MDMFSlotReadProxy
10054 from allmydata.mutable.repairer import MustForceRepairError
10055 
10056 import allmydata.test.common_util as testutil
10057hunk ./src/allmydata/test/test_mutable.py 101
10058         self.storage = storage
10059         self.queries = 0
10060     def callRemote(self, methname, *args, **kwargs):
10061+        self.queries += 1
10062         def _call():
10063             meth = getattr(self, methname)
10064             return meth(*args, **kwargs)
10065hunk ./src/allmydata/test/test_mutable.py 108
10066         d = fireEventually()
10067         d.addCallback(lambda res: _call())
10068         return d
10069+
10070     def callRemoteOnly(self, methname, *args, **kwargs):
10071hunk ./src/allmydata/test/test_mutable.py 110
10072+        self.queries += 1
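+        # note that self.callRemote below increments self.queries again,
+        # so a single callRemoteOnly is counted as two queries here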
10073         d = self.callRemote(methname, *args, **kwargs)
10074         d.addBoth(lambda ignore: None)
10075         pass
10076hunk ./src/allmydata/test/test_mutable.py 158
10077             chr(ord(original[byte_offset]) ^ 0x01) +
10078             original[byte_offset+1:])
10079 
10080+def add_two(original, byte_offset):
10081+    # It isn't enough to simply flip the low bit of the version number,
10082+    # because 1 (MDMF) is just as valid a version number as 0 (SDMF).
10083+    # XORing with 0x02 adds two to either value, yielding an invalid
10084+    # version number in both cases.
10085+    return (original[:byte_offset] +
10086+            chr(ord(original[byte_offset]) ^ 0x02) +
10087+            original[byte_offset+1:])
10086+
10087 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
10088     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
10089     # list of shnums to corrupt.
10090hunk ./src/allmydata/test/test_mutable.py 168
10091+    ds = []
10092     for peerid in s._peers:
10093         shares = s._peers[peerid]
10094         for shnum in shares:
10095hunk ./src/allmydata/test/test_mutable.py 176
10096                 and shnum not in shnums_to_corrupt):
10097                 continue
10098             data = shares[shnum]
10099-            (version,
10100-             seqnum,
10101-             root_hash,
10102-             IV,
10103-             k, N, segsize, datalen,
10104-             o) = unpack_header(data)
10105-            if isinstance(offset, tuple):
10106-                offset1, offset2 = offset
10107-            else:
10108-                offset1 = offset
10109-                offset2 = 0
10110-            if offset1 == "pubkey":
10111-                real_offset = 107
10112-            elif offset1 in o:
10113-                real_offset = o[offset1]
10114-            else:
10115-                real_offset = offset1
10116-            real_offset = int(real_offset) + offset2 + offset_offset
10117-            assert isinstance(real_offset, int), offset
10118-            shares[shnum] = flip_bit(data, real_offset)
10119-    return res
10120+            # We're feeding the reader all of the share data up front,
10121+            # so it won't need the rref or the storage index that we
10122+            # didn't provide. We use MDMFSlotReadProxy here because it
10123+            # can parse both MDMF and SDMF shares.
10124+            reader = MDMFSlotReadProxy(None, None, shnum, data)
10125+            # We need to get the offsets for the next part.
10126+            d = reader.get_verinfo()
10127+            def _do_corruption(verinfo, data, shnum):
10128+                (seqnum,
10129+                 root_hash,
10130+                 IV,
10131+                 segsize,
10132+                 datalen,
10133+                 k, n, prefix, o) = verinfo
10134+                if isinstance(offset, tuple):
10135+                    offset1, offset2 = offset
10136+                else:
10137+                    offset1 = offset
10138+                    offset2 = 0
10139+                if offset1 == "pubkey" and IV:
10140+                    real_offset = 107
10141+                elif offset1 == "share_data" and not IV:
10142+                    real_offset = 107
10143+                elif offset1 in o:
10144+                    real_offset = o[offset1]
10145+                else:
10146+                    real_offset = offset1
10147+                real_offset = int(real_offset) + offset2 + offset_offset
10148+                assert isinstance(real_offset, int), offset
10149+                if offset1 == 0: # verbyte
10150+                    f = add_two
10151+                else:
10152+                    f = flip_bit
10153+                shares[shnum] = f(data, real_offset)
10154+            d.addCallback(_do_corruption, data, shnum)
10155+            ds.append(d)
10156+    dl = defer.DeferredList(ds)
10157+    dl.addCallback(lambda ignored: res)
10158+    return dl
10159 
10160 def make_storagebroker(s=None, num_peers=10):
10161     if not s:
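
A tiny sanity check of the verbyte corruption logic above, assuming (as
in interfaces.py) that SDMF_VERSION == 0 and MDMF_VERSION == 1:

    for verbyte in (0, 1):
        assert (verbyte ^ 0x01) in (0, 1)      # flip_bit: still valid
        assert (verbyte ^ 0x02) not in (0, 1)  # add_two: invalid either way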
10162hunk ./src/allmydata/test/test_mutable.py 257
10163             self.failUnlessEqual(len(shnums), 1)
10164         d.addCallback(_created)
10165         return d
10166+    test_create.timeout = 15
10167+
10168+
10169+    def test_create_mdmf(self):
10170+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
10171+        def _created(n):
10172+            self.failUnless(isinstance(n, MutableFileNode))
10173+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
10174+            sb = self.nodemaker.storage_broker
10175+            peer0 = sorted(sb.get_all_serverids())[0]
10176+            shnums = self._storage._peers[peer0].keys()
10177+            self.failUnlessEqual(len(shnums), 1)
10178+        d.addCallback(_created)
10179+        return d
10180+
10181 
10182     def test_serialize(self):
10183         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
10184hunk ./src/allmydata/test/test_mutable.py 302
10185             d.addCallback(lambda smap: smap.dump(StringIO()))
10186             d.addCallback(lambda sio:
10187                           self.failUnless("3-of-10" in sio.getvalue()))
10188-            d.addCallback(lambda res: n.overwrite("contents 1"))
10189+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
10190             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
10191             d.addCallback(lambda res: n.download_best_version())
10192             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10193hunk ./src/allmydata/test/test_mutable.py 309
10194             d.addCallback(lambda res: n.get_size_of_best_version())
10195             d.addCallback(lambda size:
10196                           self.failUnlessEqual(size, len("contents 1")))
10197-            d.addCallback(lambda res: n.overwrite("contents 2"))
10198+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10199             d.addCallback(lambda res: n.download_best_version())
10200             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10201             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10202hunk ./src/allmydata/test/test_mutable.py 313
10203-            d.addCallback(lambda smap: n.upload("contents 3", smap))
10204+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
10205             d.addCallback(lambda res: n.download_best_version())
10206             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
10207             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
10208hunk ./src/allmydata/test/test_mutable.py 325
10209             # mapupdate-to-retrieve data caching (i.e. make the shares larger
10210             # than the default readsize, which is 2000 bytes). A 15kB file
10211             # will have 5kB shares.
10212-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
10213+            d.addCallback(lambda res: n.overwrite(MutableData("large size file" * 1000)))
10214             d.addCallback(lambda res: n.download_best_version())
10215             d.addCallback(lambda res:
10216                           self.failUnlessEqual(res, "large size file" * 1000))
10217hunk ./src/allmydata/test/test_mutable.py 333
10218         d.addCallback(_created)
10219         return d
10220 
10221+
10222+    def test_upload_and_download_mdmf(self):
10223+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
10224+        def _created(n):
10225+            d = defer.succeed(None)
10226+            d.addCallback(lambda ignored:
10227+                n.get_servermap(MODE_READ))
10228+            def _then(servermap):
10229+                dumped = servermap.dump(StringIO())
10230+                self.failUnlessIn("3-of-10", dumped.getvalue())
10231+            d.addCallback(_then)
10232+            # Now overwrite the contents with some new contents. We want
10233+            # to make them big enough to force the file to be uploaded
10234+            # in more than one segment.
10235+            big_contents = "contents1" * 100000 # about 900 KiB
10236+            big_contents_uploadable = MutableData(big_contents)
10237+            d.addCallback(lambda ignored:
10238+                n.overwrite(big_contents_uploadable))
10239+            d.addCallback(lambda ignored:
10240+                n.download_best_version())
10241+            d.addCallback(lambda data:
10242+                self.failUnlessEqual(data, big_contents))
10243+            # Overwrite the contents again with some new contents. As
10244+            # before, they need to be big enough to force multiple
10245+            # segments, so that we make the downloader deal with
10246+            # multiple segments.
10247+            bigger_contents = "contents2" * 1000000 # about 9MiB
10248+            bigger_contents_uploadable = MutableData(bigger_contents)
10249+            d.addCallback(lambda ignored:
10250+                n.overwrite(bigger_contents_uploadable))
10251+            d.addCallback(lambda ignored:
10252+                n.download_best_version())
10253+            d.addCallback(lambda data:
10254+                self.failUnlessEqual(data, bigger_contents))
10255+            return d
10256+        d.addCallback(_created)
10257+        return d
10258+
10259+
10260+    def test_mdmf_write_count(self):
10261+        # Publishing an MDMF file should only cause one write for each
10262+        # share that is to be published. Otherwise, we introduce
10263+        # undesirable semantics that are a regression from SDMF
10264+        upload = MutableData("MDMF" * 100000) # about 400 KiB
10265+        d = self.nodemaker.create_mutable_file(upload,
10266+                                               version=MDMF_VERSION)
10267+        def _check_server_write_counts(ignored):
10268+            sb = self.nodemaker.storage_broker
10269+            peers = sb.test_servers.values()
10270+            for peer in peers:
10271+                self.failUnlessEqual(peer.queries, 1)
10272+        d.addCallback(_check_server_write_counts)
10273+        return d
10274+
10275+
10276     def test_create_with_initial_contents(self):
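
A back-of-the-envelope check that the payloads in
test_upload_and_download_mdmf above really span multiple segments; this
assumes DEFAULT_MAX_SEGMENT_SIZE (imported from mutable.publish earlier)
is 128 KiB:

    from allmydata.util import mathutil

    segsize = 128 * 1024
    for payload in ("contents1" * 100000, "contents2" * 1000000):
        # roughly 900 kB and 9 MB: 7 and 69 segments respectively
        assert mathutil.div_ceil(len(payload), segsize) > 1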
10277hunk ./src/allmydata/test/test_mutable.py 389
10278-        d = self.nodemaker.create_mutable_file("contents 1")
10279+        upload1 = MutableData("contents 1")
10280+        d = self.nodemaker.create_mutable_file(upload1)
10281         def _created(n):
10282             d = n.download_best_version()
10283             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10284hunk ./src/allmydata/test/test_mutable.py 394
10285-            d.addCallback(lambda res: n.overwrite("contents 2"))
10286+            upload2 = MutableData("contents 2")
10287+            d.addCallback(lambda res: n.overwrite(upload2))
10288             d.addCallback(lambda res: n.download_best_version())
10289             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10290             return d
10291hunk ./src/allmydata/test/test_mutable.py 401
10292         d.addCallback(_created)
10293         return d
10294+    test_create_with_initial_contents.timeout = 15
10295+
10296+
10297+    def test_create_mdmf_with_initial_contents(self):
10298+        initial_contents = "foobarbaz" * 131072 # about 1.1 MiB
10299+        initial_contents_uploadable = MutableData(initial_contents)
10300+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
10301+                                               version=MDMF_VERSION)
10302+        def _created(n):
10303+            d = n.download_best_version()
10304+            d.addCallback(lambda data:
10305+                self.failUnlessEqual(data, initial_contents))
10306+            uploadable2 = MutableData(initial_contents + "foobarbaz")
10307+            d.addCallback(lambda ignored:
10308+                n.overwrite(uploadable2))
10309+            d.addCallback(lambda ignored:
10310+                n.download_best_version())
10311+            d.addCallback(lambda data:
10312+                self.failUnlessEqual(data, initial_contents +
10313+                                           "foobarbaz"))
10314+            return d
10315+        d.addCallback(_created)
10316+        return d
10317+    test_create_mdmf_with_initial_contents.timeout = 20
10318+
10319 
10320     def test_create_with_initial_contents_function(self):
10321         data = "initial contents"
10322hunk ./src/allmydata/test/test_mutable.py 434
10323             key = n.get_writekey()
10324             self.failUnless(isinstance(key, str), key)
10325             self.failUnlessEqual(len(key), 16) # AES key size
10326-            return data
10327+            return MutableData(data)
10328         d = self.nodemaker.create_mutable_file(_make_contents)
10329         def _created(n):
10330             return n.download_best_version()
10331hunk ./src/allmydata/test/test_mutable.py 442
10332         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
10333         return d
10334 
10335+
10336+    def test_create_mdmf_with_initial_contents_function(self):
10337+        data = "initial contents" * 100000
10338+        def _make_contents(n):
10339+            self.failUnless(isinstance(n, MutableFileNode))
10340+            key = n.get_writekey()
10341+            self.failUnless(isinstance(key, str), key)
10342+            self.failUnlessEqual(len(key), 16)
10343+            return MutableData(data)
10344+        d = self.nodemaker.create_mutable_file(_make_contents,
10345+                                               version=MDMF_VERSION)
10346+        d.addCallback(lambda n:
10347+            n.download_best_version())
10348+        d.addCallback(lambda data2:
10349+            self.failUnlessEqual(data2, data))
10350+        return d
10351+
10352+
10353     def test_create_with_too_large_contents(self):
10354         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
10355hunk ./src/allmydata/test/test_mutable.py 462
10356-        d = self.nodemaker.create_mutable_file(BIG)
10357+        BIG_uploadable = MutableData(BIG)
10358+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
10359         def _created(n):
10360hunk ./src/allmydata/test/test_mutable.py 465
10361-            d = n.overwrite(BIG)
10362+            other_BIG_uploadable = MutableData(BIG)
10363+            d = n.overwrite(other_BIG_uploadable)
10364             return d
10365         d.addCallback(_created)
10366         return d
10367hunk ./src/allmydata/test/test_mutable.py 480
10368 
10369     def test_modify(self):
10370         def _modifier(old_contents, servermap, first_time):
10371-            return old_contents + "line2"
10372+            new_contents = old_contents + "line2"
10373+            return new_contents
10374         def _non_modifier(old_contents, servermap, first_time):
10375             return old_contents
10376         def _none_modifier(old_contents, servermap, first_time):
10377hunk ./src/allmydata/test/test_mutable.py 489
10378         def _error_modifier(old_contents, servermap, first_time):
10379             raise ValueError("oops")
10380         def _toobig_modifier(old_contents, servermap, first_time):
10381-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
10382+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
10383+            return new_content
10384         calls = []
10385         def _ucw_error_modifier(old_contents, servermap, first_time):
10386             # simulate an UncoordinatedWriteError once
10387hunk ./src/allmydata/test/test_mutable.py 497
10388             calls.append(1)
10389             if len(calls) <= 1:
10390                 raise UncoordinatedWriteError("simulated")
10391-            return old_contents + "line3"
10392+            new_contents = old_contents + "line3"
10393+            return new_contents
10394         def _ucw_error_non_modifier(old_contents, servermap, first_time):
10395             # simulate an UncoordinatedWriteError once, and don't actually
10396             # modify the contents on subsequent invocations
10397hunk ./src/allmydata/test/test_mutable.py 507
10398                 raise UncoordinatedWriteError("simulated")
10399             return old_contents
10400 
10401-        d = self.nodemaker.create_mutable_file("line1")
10402+        initial_contents = "line1"
10403+        d = self.nodemaker.create_mutable_file(MutableData(initial_contents))
10404         def _created(n):
10405             d = n.modify(_modifier)
10406             d.addCallback(lambda res: n.download_best_version())
10407hunk ./src/allmydata/test/test_mutable.py 565
10408             return d
10409         d.addCallback(_created)
10410         return d
10411+    test_modify.timeout = 15
10412+
10413 
10414     def test_modify_backoffer(self):
10415         def _modifier(old_contents, servermap, first_time):
10416hunk ./src/allmydata/test/test_mutable.py 592
10417         giveuper._delay = 0.1
10418         giveuper.factor = 1
10419 
10420-        d = self.nodemaker.create_mutable_file("line1")
10421+        d = self.nodemaker.create_mutable_file(MutableData("line1"))
10422         def _created(n):
10423             d = n.modify(_modifier)
10424             d.addCallback(lambda res: n.download_best_version())
10425hunk ./src/allmydata/test/test_mutable.py 642
10426             d.addCallback(lambda smap: smap.dump(StringIO()))
10427             d.addCallback(lambda sio:
10428                           self.failUnless("3-of-10" in sio.getvalue()))
10429-            d.addCallback(lambda res: n.overwrite("contents 1"))
10430+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
10431             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
10432             d.addCallback(lambda res: n.download_best_version())
10433             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10434hunk ./src/allmydata/test/test_mutable.py 646
10435-            d.addCallback(lambda res: n.overwrite("contents 2"))
10436+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10437             d.addCallback(lambda res: n.download_best_version())
10438             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10439             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10440hunk ./src/allmydata/test/test_mutable.py 650
10441-            d.addCallback(lambda smap: n.upload("contents 3", smap))
10442+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
10443             d.addCallback(lambda res: n.download_best_version())
10444             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
10445             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
10446hunk ./src/allmydata/test/test_mutable.py 663
10447         return d
10448 
10449 
10450-class MakeShares(unittest.TestCase):
10451-    def test_encrypt(self):
10452-        nm = make_nodemaker()
10453-        CONTENTS = "some initial contents"
10454-        d = nm.create_mutable_file(CONTENTS)
10455-        def _created(fn):
10456-            p = Publish(fn, nm.storage_broker, None)
10457-            p.salt = "SALT" * 4
10458-            p.readkey = "\x00" * 16
10459-            p.newdata = CONTENTS
10460-            p.required_shares = 3
10461-            p.total_shares = 10
10462-            p.setup_encoding_parameters()
10463-            return p._encrypt_and_encode()
10464+    def test_size_after_servermap_update(self):
10465+        # after a servermap update, a mutable file node should be able
10466+        # to report its size, since the update tells us how large the
10467+        # best recoverable version of the file is.
10468+        d = self.nodemaker.create_mutable_file()
10469+        def _created(n):
10470+            self.n = n
10471+            return n.get_servermap(MODE_READ)
10472+        d.addCallback(_created)
10473+        d.addCallback(lambda ignored:
10474+            self.failUnlessEqual(self.n.get_size(), 0))
10475+        d.addCallback(lambda ignored:
10476+            self.n.overwrite(MutableData("foobarbaz")))
10477+        d.addCallback(lambda ignored:
10478+            self.failUnlessEqual(self.n.get_size(), 9))
10479+        d.addCallback(lambda ignored:
10480+            self.nodemaker.create_mutable_file(MutableData("foobarbaz")))
10481+        d.addCallback(_created)
10482+        d.addCallback(lambda ignored:
10483+            self.failUnlessEqual(self.n.get_size(), 9))
10484+        return d
10485+
10486+
10487+class PublishMixin:
10488+    def publish_one(self):
10489+        # publish a file and create shares, which can then be manipulated
10490+        # later.
10491+        self.CONTENTS = "New contents go here" * 1000
10492+        self.uploadable = MutableData(self.CONTENTS)
10493+        self._storage = FakeStorage()
10494+        self._nodemaker = make_nodemaker(self._storage)
10495+        self._storage_broker = self._nodemaker.storage_broker
10496+        d = self._nodemaker.create_mutable_file(self.uploadable)
10497+        def _created(node):
10498+            self._fn = node
10499+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10500         d.addCallback(_created)
10501hunk ./src/allmydata/test/test_mutable.py 700
10502-        def _done(shares_and_shareids):
10503-            (shares, share_ids) = shares_and_shareids
10504-            self.failUnlessEqual(len(shares), 10)
10505-            for sh in shares:
10506-                self.failUnless(isinstance(sh, str))
10507-                self.failUnlessEqual(len(sh), 7)
10508-            self.failUnlessEqual(len(share_ids), 10)
10509-        d.addCallback(_done)
10510         return d
10511 
10512hunk ./src/allmydata/test/test_mutable.py 702
10513-    def test_generate(self):
10514-        nm = make_nodemaker()
10515-        CONTENTS = "some initial contents"
10516-        d = nm.create_mutable_file(CONTENTS)
10517-        def _created(fn):
10518-            self._fn = fn
10519-            p = Publish(fn, nm.storage_broker, None)
10520-            self._p = p
10521-            p.newdata = CONTENTS
10522-            p.required_shares = 3
10523-            p.total_shares = 10
10524-            p.setup_encoding_parameters()
10525-            p._new_seqnum = 3
10526-            p.salt = "SALT" * 4
10527-            # make some fake shares
10528-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
10529-            p._privkey = fn.get_privkey()
10530-            p._encprivkey = fn.get_encprivkey()
10531-            p._pubkey = fn.get_pubkey()
10532-            return p._generate_shares(shares_and_ids)
10533+    def publish_mdmf(self):
10534+        # like publish_one, except that the result is guaranteed to be
10535+        # an MDMF file.
10536+        # self.CONTENTS should have more than one segment.
10537+        self.CONTENTS = "This is an MDMF file" * 100000
10538+        self.uploadable = MutableData(self.CONTENTS)
10539+        self._storage = FakeStorage()
10540+        self._nodemaker = make_nodemaker(self._storage)
10541+        self._storage_broker = self._nodemaker.storage_broker
10542+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
10543+        def _created(node):
10544+            self._fn = node
10545+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10546         d.addCallback(_created)
10547hunk ./src/allmydata/test/test_mutable.py 716
10548-        def _generated(res):
10549-            p = self._p
10550-            final_shares = p.shares
10551-            root_hash = p.root_hash
10552-            self.failUnlessEqual(len(root_hash), 32)
10553-            self.failUnless(isinstance(final_shares, dict))
10554-            self.failUnlessEqual(len(final_shares), 10)
10555-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
10556-            for i,sh in final_shares.items():
10557-                self.failUnless(isinstance(sh, str))
10558-                # feed the share through the unpacker as a sanity-check
10559-                pieces = unpack_share(sh)
10560-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
10561-                 pubkey, signature, share_hash_chain, block_hash_tree,
10562-                 share_data, enc_privkey) = pieces
10563-                self.failUnlessEqual(u_seqnum, 3)
10564-                self.failUnlessEqual(u_root_hash, root_hash)
10565-                self.failUnlessEqual(k, 3)
10566-                self.failUnlessEqual(N, 10)
10567-                self.failUnlessEqual(segsize, 21)
10568-                self.failUnlessEqual(datalen, len(CONTENTS))
10569-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
10570-                sig_material = struct.pack(">BQ32s16s BBQQ",
10571-                                           0, p._new_seqnum, root_hash, IV,
10572-                                           k, N, segsize, datalen)
10573-                self.failUnless(p._pubkey.verify(sig_material, signature))
10574-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
10575-                self.failUnless(isinstance(share_hash_chain, dict))
10576-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
10577-                for shnum,share_hash in share_hash_chain.items():
10578-                    self.failUnless(isinstance(shnum, int))
10579-                    self.failUnless(isinstance(share_hash, str))
10580-                    self.failUnlessEqual(len(share_hash), 32)
10581-                self.failUnless(isinstance(block_hash_tree, list))
10582-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
10583-                self.failUnlessEqual(IV, "SALT"*4)
10584-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
10585-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
10586-        d.addCallback(_generated)
10587         return d
10588 
10589hunk ./src/allmydata/test/test_mutable.py 718
10590-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
10591-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
10592-    # when we publish to zero peers, we should get a NotEnoughSharesError
10593 
10594hunk ./src/allmydata/test/test_mutable.py 719
10595-class PublishMixin:
10596-    def publish_one(self):
10597-        # publish a file and create shares, which can then be manipulated
10598-        # later.
10599-        self.CONTENTS = "New contents go here" * 1000
10600+    def publish_sdmf(self):
10601+        # like publish_one, except that the result is guaranteed to be
10602+        # an SDMF file
10603+        self.CONTENTS = "This is an SDMF file" * 1000
10604+        self.uploadable = MutableData(self.CONTENTS)
10605         self._storage = FakeStorage()
10606         self._nodemaker = make_nodemaker(self._storage)
10607         self._storage_broker = self._nodemaker.storage_broker
10608hunk ./src/allmydata/test/test_mutable.py 727
10609-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
10610+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
10611         def _created(node):
10612             self._fn = node
10613             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10614hunk ./src/allmydata/test/test_mutable.py 734
10615         d.addCallback(_created)
10616         return d
10617 
10618-    def publish_multiple(self):
10619+
10620+    def publish_multiple(self, version=0):
10621         self.CONTENTS = ["Contents 0",
10622                          "Contents 1",
10623                          "Contents 2",
10624hunk ./src/allmydata/test/test_mutable.py 741
10625                          "Contents 3a",
10626                          "Contents 3b"]
10627+        self.uploadables = [MutableData(d) for d in self.CONTENTS]
10628         self._copied_shares = {}
10629         self._storage = FakeStorage()
10630         self._nodemaker = make_nodemaker(self._storage)
10631hunk ./src/allmydata/test/test_mutable.py 745
10632-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
10633+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
10634         def _created(node):
10635             self._fn = node
10636             # now create multiple versions of the same file, and accumulate
10637hunk ./src/allmydata/test/test_mutable.py 752
10638             # their shares, so we can mix and match them later.
10639             d = defer.succeed(None)
10640             d.addCallback(self._copy_shares, 0)
10641-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
10642+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
10643             d.addCallback(self._copy_shares, 1)
10644hunk ./src/allmydata/test/test_mutable.py 754
10645-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
10646+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
10647             d.addCallback(self._copy_shares, 2)
10648hunk ./src/allmydata/test/test_mutable.py 756
10649-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
10650+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
10651             d.addCallback(self._copy_shares, 3)
10652             # now we replace all the shares with version s3, and upload a new
10653             # version to get s4b.
10654hunk ./src/allmydata/test/test_mutable.py 762
10655             rollback = dict([(i,2) for i in range(10)])
10656             d.addCallback(lambda res: self._set_versions(rollback))
10657-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
10658+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
10659             d.addCallback(self._copy_shares, 4)
10660             # we leave the storage in state 4
10661             return d
10662hunk ./src/allmydata/test/test_mutable.py 769
10663         d.addCallback(_created)
10664         return d
10665 
10666+
10667     def _copy_shares(self, ignored, index):
10668         shares = self._storage._peers
10669         # we need a deep copy
10670hunk ./src/allmydata/test/test_mutable.py 793
10671                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
10672 
10673 
10674+
10675+
10676 class Servermap(unittest.TestCase, PublishMixin):
10677     def setUp(self):
10678         return self.publish_one()
10679hunk ./src/allmydata/test/test_mutable.py 799
10680 
10681-    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None):
10682+    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None,
10683+                       update_range=None):
10684         if fn is None:
10685             fn = self._fn
10686         if sb is None:
10687hunk ./src/allmydata/test/test_mutable.py 806
10688             sb = self._storage_broker
10689         smu = ServermapUpdater(fn, sb, Monitor(),
10690-                               ServerMap(), mode)
10691+                               ServerMap(), mode, update_range=update_range)
10692         d = smu.update()
10693         return d
10694 
10695hunk ./src/allmydata/test/test_mutable.py 872
10696         # create a new file, which is large enough to knock the privkey out
10697         # of the early part of the file
10698         LARGE = "These are Larger contents" * 200 # about 5KB
10699-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
10700+        LARGE_uploadable = MutableData(LARGE)
10701+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
10702         def _created(large_fn):
10703             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
10704             return self.make_servermap(MODE_WRITE, large_fn2)
10705hunk ./src/allmydata/test/test_mutable.py 881
10706         d.addCallback(lambda sm: self.failUnlessOneRecoverable(sm, 10))
10707         return d
10708 
10709+
10710     def test_mark_bad(self):
10711         d = defer.succeed(None)
10712         ms = self.make_servermap
10722hunk ./src/allmydata/test/test_mutable.py 979
10723         return d
10724 
10725 
10726+    def test_servermapupdater_finds_mdmf_files(self):
10727+        # setUp already published an MDMF file for us. We just need to
10728+        # make sure that when we run the ServermapUpdater, the file is
10729+        # reported to have one recoverable version.
10730+        d = defer.succeed(None)
10731+        d.addCallback(lambda ignored:
10732+            self.publish_mdmf())
10733+        d.addCallback(lambda ignored:
10734+            self.make_servermap(mode=MODE_CHECK))
10735+        # Calling make_servermap also updates the servermap in the mode
10736+        # that we specify, so we just need to see what it says.
10737+        def _check_servermap(sm):
10738+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
10739+        d.addCallback(_check_servermap)
10740+        return d
10741+
10742+
10743+    def test_fetch_update(self):
10744+        d = defer.succeed(None)
10745+        d.addCallback(lambda ignored:
10746+            self.publish_mdmf())
10747+        d.addCallback(lambda ignored:
10748+            self.make_servermap(mode=MODE_WRITE, update_range=(1, 2)))
10749+        def _check_servermap(sm):
10750+            # 10 shares
10751+            self.failUnlessEqual(len(sm.update_data), 10)
10752+            # one version
10753+            for data in sm.update_data.itervalues():
10754+                self.failUnlessEqual(len(data), 1)
10755+        d.addCallback(_check_servermap)
10756+        return d
10757+
10758+
10759+    def test_servermapupdater_finds_sdmf_files(self):
10760+        d = defer.succeed(None)
10761+        d.addCallback(lambda ignored:
10762+            self.publish_sdmf())
10763+        d.addCallback(lambda ignored:
10764+            self.make_servermap(mode=MODE_CHECK))
10765+        d.addCallback(lambda servermap:
10766+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
10767+        return d
10768+
10769 
10770 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
10771     def setUp(self):
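
test_fetch_update above leans on the new update_range argument to
ServermapUpdater. A hedged sketch of that call pattern, treating
update_range as a (start, end) pair of segment indices (which is what
the test implies, not something spelled out here); "fn" and "sb" stand
in for a filenode and a storage broker:

    smu = ServermapUpdater(fn, sb, Monitor(), ServerMap(), MODE_WRITE,
                           update_range=(1, 2))
    d = smu.update()
    # the resulting map carries per-share data for in-place updates
    d.addCallback(lambda sm: sm.update_data)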
10772hunk ./src/allmydata/test/test_mutable.py 1062
10773         if version is None:
10774             version = servermap.best_recoverable_version()
10775         r = Retrieve(self._fn, servermap, version)
10776-        return r.download()
10777+        c = consumer.MemoryConsumer()
10778+        d = r.download(consumer=c)
10779+        d.addCallback(lambda mc: "".join(mc.chunks))
10780+        return d
10781+
10782 
10783     def test_basic(self):
10784         d = self.make_servermap()
10785hunk ./src/allmydata/test/test_mutable.py 1143
10786         return d
10787     test_no_servers_download.timeout = 15
10788 
10789+
10790     def _test_corrupt_all(self, offset, substring,
10791hunk ./src/allmydata/test/test_mutable.py 1145
10792-                          should_succeed=False, corrupt_early=True,
10793-                          failure_checker=None):
10794+                          should_succeed=False,
10795+                          corrupt_early=True,
10796+                          failure_checker=None,
10797+                          fetch_privkey=False):
10798         d = defer.succeed(None)
10799         if corrupt_early:
10800             d.addCallback(corrupt, self._storage, offset)
10801hunk ./src/allmydata/test/test_mutable.py 1165
10802                     self.failUnlessIn(substring, "".join(allproblems))
10803                 return servermap
10804             if should_succeed:
10805-                d1 = self._fn.download_version(servermap, ver)
10806+                d1 = self._fn.download_version(servermap, ver,
10807+                                               fetch_privkey)
10808                 d1.addCallback(lambda new_contents:
10809                                self.failUnlessEqual(new_contents, self.CONTENTS))
10810             else:
10811hunk ./src/allmydata/test/test_mutable.py 1173
10812                 d1 = self.shouldFail(NotEnoughSharesError,
10813                                      "_corrupt_all(offset=%s)" % (offset,),
10814                                      substring,
10815-                                     self._fn.download_version, servermap, ver)
10816+                                     self._fn.download_version, servermap,
10817+                                                                ver,
10818+                                                                fetch_privkey)
10819             if failure_checker:
10820                 d1.addCallback(failure_checker)
10821             d1.addCallback(lambda res: servermap)
10822hunk ./src/allmydata/test/test_mutable.py 1184
10823         return d
10824 
10825     def test_corrupt_all_verbyte(self):
10826-        # when the version byte is not 0, we hit an UnknownVersionError error
10827-        # in unpack_share().
10828+        # when the version byte is not 0 or 1, we hit an UnknownVersionError
10829+        # error in unpack_share().
10830         d = self._test_corrupt_all(0, "UnknownVersionError")
10831         def _check_servermap(servermap):
10832             # and the dump should mention the problems
10833hunk ./src/allmydata/test/test_mutable.py 1191
10834             s = StringIO()
10835             dump = servermap.dump(s).getvalue()
10836-            self.failUnless("10 PROBLEMS" in dump, dump)
10837+            self.failUnless("30 PROBLEMS" in dump, dump)
10838         d.addCallback(_check_servermap)
10839         return d
10840 
10841hunk ./src/allmydata/test/test_mutable.py 1261
10842         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
10843 
10844 
10845+    def test_corrupt_all_encprivkey_late(self):
10846+        # this should work for the same reason as above, but we corrupt
10847+        # after the servermap update to exercise the error handling
10848+        # code.
10849+        # We need to remove the privkey from the node, or the retrieve
10850+        # process won't know to update it.
10851+        self._fn._privkey = None
10852+        return self._test_corrupt_all("enc_privkey",
10853+                                      None, # this shouldn't fail
10854+                                      should_succeed=True,
10855+                                      corrupt_early=False,
10856+                                      fetch_privkey=True)
10857+
10858+
10859     def test_corrupt_all_seqnum_late(self):
10860         # corrupting the seqnum between mapupdate and retrieve should result
10861         # in NotEnoughSharesError, since each share will look invalid
10862hunk ./src/allmydata/test/test_mutable.py 1281
10863         def _check(res):
10864             f = res[0]
10865             self.failUnless(f.check(NotEnoughSharesError))
10866-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
10867+            self.failUnless("uncoordinated write" in str(f))
10868         return self._test_corrupt_all(1, "ran out of peers",
10869                                       corrupt_early=False,
10870                                       failure_checker=_check)
10871hunk ./src/allmydata/test/test_mutable.py 1325
10872                             in str(servermap.problems[0]))
10873             ver = servermap.best_recoverable_version()
10874             r = Retrieve(self._fn, servermap, ver)
10875-            return r.download()
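+            # Retrieve.download() now delivers plaintext to an IConsumer;
+            # MemoryConsumer just accumulates the chunks, which we join below.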
10876+            c = consumer.MemoryConsumer()
10877+            return r.download(c)
10878         d.addCallback(_do_retrieve)
10879hunk ./src/allmydata/test/test_mutable.py 1328
10880+        d.addCallback(lambda mc: "".join(mc.chunks))
10881         d.addCallback(lambda new_contents:
10882                       self.failUnlessEqual(new_contents, self.CONTENTS))
10883         return d
10884hunk ./src/allmydata/test/test_mutable.py 1333
10885 
10886-    def test_corrupt_some(self):
10887-        # corrupt the data of first five shares (so the servermap thinks
10888-        # they're good but retrieve marks them as bad), so that the
10889-        # MODE_READ set of 6 will be insufficient, forcing node.download to
10890-        # retry with more servers.
10891-        corrupt(None, self._storage, "share_data", range(5))
10892-        d = self.make_servermap()
10893+
10894+    def _test_corrupt_some(self, offset, mdmf=False):
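+        # Corrupt 'offset' (either a field name or a (field, byte-offset)
+        # tuple) in the first five shares, then check that downloading the
+        # best recoverable version still returns the original contents.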
10895+        if mdmf:
10896+            d = self.publish_mdmf()
10897+        else:
10898+            d = defer.succeed(None)
10899+        d.addCallback(lambda ignored:
10900+            corrupt(None, self._storage, offset, range(5)))
10901+        d.addCallback(lambda ignored:
10902+            self.make_servermap())
10903         def _do_retrieve(servermap):
10904             ver = servermap.best_recoverable_version()
10905             self.failUnless(ver)
10906hunk ./src/allmydata/test/test_mutable.py 1349
10907             return self._fn.download_best_version()
10908         d.addCallback(_do_retrieve)
10909         d.addCallback(lambda new_contents:
10910-                      self.failUnlessEqual(new_contents, self.CONTENTS))
10911+            self.failUnlessEqual(new_contents, self.CONTENTS))
10912         return d
10913 
10914hunk ./src/allmydata/test/test_mutable.py 1352
10915+
10916+    def test_corrupt_some(self):
10917+        # corrupt the data of first five shares (so the servermap thinks
10918+        # they're good but retrieve marks them as bad), so that the
10919+        # MODE_READ set of 6 will be insufficient, forcing node.download to
10920+        # retry with more servers.
10921+        return self._test_corrupt_some("share_data")
10922+
10923+
10924     def test_download_fails(self):
10925hunk ./src/allmydata/test/test_mutable.py 1362
10926-        corrupt(None, self._storage, "signature")
10927-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
10928+        d = corrupt(None, self._storage, "signature")
10929+        d.addCallback(lambda ignored:
10930+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
10931                             "no recoverable versions",
10932hunk ./src/allmydata/test/test_mutable.py 1366
10933-                            self._fn.download_best_version)
10934+                            self._fn.download_best_version))
10935         return d
10936 
10937 
10938hunk ./src/allmydata/test/test_mutable.py 1370
10939+
10940+    def test_corrupt_mdmf_block_hash_tree(self):
10941+        d = self.publish_mdmf()
10942+        d.addCallback(lambda ignored:
10943+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
10944+                                   "block hash tree failure",
10945+                                   corrupt_early=True,
10946+                                   should_succeed=False))
10947+        return d
10948+
10949+
10950+    def test_corrupt_mdmf_block_hash_tree_late(self):
10951+        d = self.publish_mdmf()
10952+        d.addCallback(lambda ignored:
10953+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
10954+                                   "block hash tree failure",
10955+                                   corrupt_early=False,
10956+                                   should_succeed=False))
10957+        return d
10958+
10959+
10960+    def test_corrupt_mdmf_share_data(self):
10961+        d = self.publish_mdmf()
10962+        d.addCallback(lambda ignored:
10963+            # TODO: Find out what the block size is and corrupt a
10964+            # specific block, rather than just guessing.
10965+            self._test_corrupt_all(("share_data", 12 * 40),
10966+                                    "block hash tree failure",
10967+                                    corrupt_early=True,
10968+                                    should_succeed=False))
10969+        return d
10970+
10971+
10972+    def test_corrupt_some_mdmf(self):
10973+        return self._test_corrupt_some(("share_data", 12 * 40),
10974+                                       mdmf=True)
10975+
10976+
10977 class CheckerMixin:
10978     def check_good(self, r, where):
10979         self.failUnless(r.is_healthy(), where)
10980hunk ./src/allmydata/test/test_mutable.py 1438
10981         d.addCallback(self.check_good, "test_check_good")
10982         return d
10983 
10984+    def test_check_mdmf_good(self):
10985+        d = self.publish_mdmf()
10986+        d.addCallback(lambda ignored:
10987+            self._fn.check(Monitor()))
10988+        d.addCallback(self.check_good, "test_check_mdmf_good")
10989+        return d
10990+
10991     def test_check_no_shares(self):
10992         for shares in self._storage._peers.values():
10993             shares.clear()
10994hunk ./src/allmydata/test/test_mutable.py 1452
10995         d.addCallback(self.check_bad, "test_check_no_shares")
10996         return d
10997 
10998+    def test_check_mdmf_no_shares(self):
10999+        d = self.publish_mdmf()
11000+        def _then(ignored):
11001+            for shares in self._storage._peers.values():
11002+                shares.clear()
11003+        d.addCallback(_then)
11004+        d.addCallback(lambda ignored:
11005+            self._fn.check(Monitor()))
11006+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
11007+        return d
11008+
11009     def test_check_not_enough_shares(self):
11010         for shares in self._storage._peers.values():
11011             for shnum in shares.keys():
11012hunk ./src/allmydata/test/test_mutable.py 1472
11013         d.addCallback(self.check_bad, "test_check_not_enough_shares")
11014         return d
11015 
11016+    def test_check_mdmf_not_enough_shares(self):
11017+        d = self.publish_mdmf()
11018+        def _then(ignored):
11019+            for shares in self._storage._peers.values():
11020+                for shnum in shares.keys():
11021+                    if shnum > 0:
11022+                        del shares[shnum]
11023+        d.addCallback(_then)
11024+        d.addCallback(lambda ignored:
11025+            self._fn.check(Monitor()))
11026+        d.addCallback(self.check_bad, "test_check_mdmf_not_enougH_shares")
11027+        return d
11028+
11029+
11030     def test_check_all_bad_sig(self):
11031hunk ./src/allmydata/test/test_mutable.py 1487
11032-        corrupt(None, self._storage, 1) # bad sig
11033-        d = self._fn.check(Monitor())
11034+        d = corrupt(None, self._storage, 1) # bad sig
11035+        d.addCallback(lambda ignored:
11036+            self._fn.check(Monitor()))
11037         d.addCallback(self.check_bad, "test_check_all_bad_sig")
11038         return d
11039 
11040hunk ./src/allmydata/test/test_mutable.py 1493
11041+    def test_check_mdmf_all_bad_sig(self):
11042+        d = self.publish_mdmf()
11043+        d.addCallback(lambda ignored:
11044+            corrupt(None, self._storage, 1))
11045+        d.addCallback(lambda ignored:
11046+            self._fn.check(Monitor()))
11047+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
11048+        return d
11049+
11050     def test_check_all_bad_blocks(self):
11051hunk ./src/allmydata/test/test_mutable.py 1503
11052-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
11053+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
11054         # the Checker won't notice this.. it doesn't look at actual data
11055hunk ./src/allmydata/test/test_mutable.py 1505
11056-        d = self._fn.check(Monitor())
11057+        d.addCallback(lambda ignored:
11058+            self._fn.check(Monitor()))
11059         d.addCallback(self.check_good, "test_check_all_bad_blocks")
11060         return d
11061 
11062hunk ./src/allmydata/test/test_mutable.py 1510
11063+
11064+    def test_check_mdmf_all_bad_blocks(self):
11065+        d = self.publish_mdmf()
11066+        d.addCallback(lambda ignored:
11067+            corrupt(None, self._storage, "share_data"))
11068+        d.addCallback(lambda ignored:
11069+            self._fn.check(Monitor()))
11070+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
11071+        return d
11072+
11073     def test_verify_good(self):
11074         d = self._fn.check(Monitor(), verify=True)
11075         d.addCallback(self.check_good, "test_verify_good")
11076hunk ./src/allmydata/test/test_mutable.py 1524
11077         return d
11078+    test_verify_good.timeout = 15
11079 
11080     def test_verify_all_bad_sig(self):
11081hunk ./src/allmydata/test/test_mutable.py 1527
11082-        corrupt(None, self._storage, 1) # bad sig
11083-        d = self._fn.check(Monitor(), verify=True)
11084+        d = corrupt(None, self._storage, 1) # bad sig
11085+        d.addCallback(lambda ignored:
11086+            self._fn.check(Monitor(), verify=True))
11087         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
11088         return d
11089 
11090hunk ./src/allmydata/test/test_mutable.py 1534
11091     def test_verify_one_bad_sig(self):
11092-        corrupt(None, self._storage, 1, [9]) # bad sig
11093-        d = self._fn.check(Monitor(), verify=True)
11094+        d = corrupt(None, self._storage, 1, [9]) # bad sig
11095+        d.addCallback(lambda ignored:
11096+            self._fn.check(Monitor(), verify=True))
11097         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
11098         return d
11099 
11100hunk ./src/allmydata/test/test_mutable.py 1541
11101     def test_verify_one_bad_block(self):
11102-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
11103+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
11104         # the Verifier *will* notice this, since it examines every byte
11105hunk ./src/allmydata/test/test_mutable.py 1543
11106-        d = self._fn.check(Monitor(), verify=True)
11107+        d.addCallback(lambda ignored:
11108+            self._fn.check(Monitor(), verify=True))
11109         d.addCallback(self.check_bad, "test_verify_one_bad_block")
11110         d.addCallback(self.check_expected_failure,
11111                       CorruptShareError, "block hash tree failure",
11112hunk ./src/allmydata/test/test_mutable.py 1552
11113         return d
11114 
11115     def test_verify_one_bad_sharehash(self):
11116-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
11117-        d = self._fn.check(Monitor(), verify=True)
11118+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
11119+        d.addCallback(lambda ignored:
11120+            self._fn.check(Monitor(), verify=True))
11121         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
11122         d.addCallback(self.check_expected_failure,
11123                       CorruptShareError, "corrupt hashes",
11124hunk ./src/allmydata/test/test_mutable.py 1562
11125         return d
11126 
11127     def test_verify_one_bad_encprivkey(self):
11128-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11129-        d = self._fn.check(Monitor(), verify=True)
11130+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11131+        d.addCallback(lambda ignored:
11132+            self._fn.check(Monitor(), verify=True))
11133         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
11134         d.addCallback(self.check_expected_failure,
11135                       CorruptShareError, "invalid privkey",
11136hunk ./src/allmydata/test/test_mutable.py 1572
11137         return d
11138 
11139     def test_verify_one_bad_encprivkey_uncheckable(self):
11140-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11141+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11142         readonly_fn = self._fn.get_readonly()
11143         # a read-only node has no way to validate the privkey
11144hunk ./src/allmydata/test/test_mutable.py 1575
11145-        d = readonly_fn.check(Monitor(), verify=True)
11146+        d.addCallback(lambda ignored:
11147+            readonly_fn.check(Monitor(), verify=True))
11148         d.addCallback(self.check_good,
11149                       "test_verify_one_bad_encprivkey_uncheckable")
11150         return d
11151hunk ./src/allmydata/test/test_mutable.py 1581
11152 
11153+
11154+    def test_verify_mdmf_good(self):
11155+        d = self.publish_mdmf()
11156+        d.addCallback(lambda ignored:
11157+            self._fn.check(Monitor(), verify=True))
11158+        d.addCallback(self.check_good, "test_verify_mdmf_good")
11159+        return d
11160+
11161+
11162+    def test_verify_mdmf_one_bad_block(self):
11163+        d = self.publish_mdmf()
11164+        d.addCallback(lambda ignored:
11165+            corrupt(None, self._storage, "share_data", [1]))
11166+        d.addCallback(lambda ignored:
11167+            self._fn.check(Monitor(), verify=True))
11168+        # We should find one bad block here
11169+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
11170+        d.addCallback(self.check_expected_failure,
11171+                      CorruptShareError, "block hash tree failure",
11172+                      "test_verify_mdmf_one_bad_block")
11173+        return d
11174+
11175+
11176+    def test_verify_mdmf_bad_encprivkey(self):
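+        # the verifier examines every byte, including the encrypted
+        # private key, so corrupting it should be noticed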
11177+        d = self.publish_mdmf()
11178+        d.addCallback(lambda ignored:
11179+            corrupt(None, self._storage, "enc_privkey", [1]))
11180+        d.addCallback(lambda ignored:
11181+            self._fn.check(Monitor(), verify=True))
11182+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
11183+        d.addCallback(self.check_expected_failure,
11184+                      CorruptShareError, "privkey",
11185+                      "test_verify_mdmf_bad_encprivkey")
11186+        return d
11187+
11188+
11189+    def test_verify_mdmf_bad_sig(self):
11190+        d = self.publish_mdmf()
11191+        d.addCallback(lambda ignored:
11192+            corrupt(None, self._storage, 1, [1]))
11193+        d.addCallback(lambda ignored:
11194+            self._fn.check(Monitor(), verify=True))
11195+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
11196+        return d
11197+
11198+
11199+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
11200+        d = self.publish_mdmf()
11201+        d.addCallback(lambda ignored:
11202+            corrupt(None, self._storage, "enc_privkey", [1]))
11203+        d.addCallback(lambda ignored:
11204+            self._fn.get_readonly())
11205+        d.addCallback(lambda fn:
11206+            fn.check(Monitor(), verify=True))
11207+        d.addCallback(self.check_good,
11208+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
11209+        return d
11210+
11211+
11212 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
11213 
11214     def get_shares(self, s):
11215hunk ./src/allmydata/test/test_mutable.py 1705
11216         current_shares = self.old_shares[-1]
11217         self.failUnlessEqual(old_shares, current_shares)
11218 
11219+
11220     def test_unrepairable_0shares(self):
11221         d = self.publish_one()
11222         def _delete_all_shares(ign):
11223hunk ./src/allmydata/test/test_mutable.py 1720
11224         d.addCallback(_check)
11225         return d
11226 
11227+    def test_mdmf_unrepairable_0shares(self):
11228+        d = self.publish_mdmf()
11229+        def _delete_all_shares(ign):
11230+            shares = self._storage._peers
11231+            for peerid in shares:
11232+                shares[peerid] = {}
11233+        d.addCallback(_delete_all_shares)
11234+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11235+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11236+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
11237+        return d
11238+
11239+
11240     def test_unrepairable_1share(self):
11241         d = self.publish_one()
11242         def _delete_all_shares(ign):
11243hunk ./src/allmydata/test/test_mutable.py 1749
11244         d.addCallback(_check)
11245         return d
11246 
11247+    def test_mdmf_unrepairable_1share(self):
11248+        d = self.publish_mdmf()
11249+        def _delete_all_shares(ign):
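+            # leave only share 0; with k=3, one share is not enough to
+            # recover or repair the file, so repair should fail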
11250+            shares = self._storage._peers
11251+            for peerid in shares:
11252+                for shnum in list(shares[peerid]):
11253+                    if shnum > 0:
11254+                        del shares[peerid][shnum]
11255+        d.addCallback(_delete_all_shares)
11256+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11257+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11258+        def _check(crr):
11259+            self.failUnlessEqual(crr.get_successful(), False)
11260+        d.addCallback(_check)
11261+        return d
11262+
11263+    def test_repairable_5shares(self):
11264+        d = self.publish_one()
11265+        def _delete_some_shares(ign):
11266+            shares = self._storage._peers
11267+            for peerid in shares:
11268+                for shnum in list(shares[peerid]):
11269+                    if shnum > 4:
11270+                        del shares[peerid][shnum]
11271+        d.addCallback(_delete_some_shares)
11272+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11273+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11274+        def _check(crr):
11275+            self.failUnlessEqual(crr.get_successful(), True)
11276+        d.addCallback(_check)
11277+        return d
11278+
11279+    def test_mdmf_repairable_5shares(self):
11280+        d = self.publish_mdmf()
11281+        def _delete_some_shares(ign):
11282+            shares = self._storage._peers
11283+            for peerid in shares:
11284+                for shnum in list(shares[peerid]):
11285+                    if shnum > 5:
11286+                        del shares[peerid][shnum]
11287+        d.addCallback(_delete_some_shares)
11288+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11289+        def _check(cr):
11290+            self.failIf(cr.is_healthy())
11291+            self.failUnless(cr.is_recoverable())
11292+            return cr
11293+        d.addCallback(_check)
11294+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11295+        def _check1(crr):
11296+            self.failUnlessEqual(crr.get_successful(), True)
11297+        d.addCallback(_check1)
11298+        return d
11299+
11300+
11301     def test_merge(self):
11302         self.old_shares = []
11303         d = self.publish_multiple()
11304hunk ./src/allmydata/test/test_mutable.py 1917
11305 class MultipleEncodings(unittest.TestCase):
11306     def setUp(self):
11307         self.CONTENTS = "New contents go here"
11308+        self.uploadable = MutableData(self.CONTENTS)
11309         self._storage = FakeStorage()
11310         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
11311         self._storage_broker = self._nodemaker.storage_broker
11312hunk ./src/allmydata/test/test_mutable.py 1921
11313-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
11314+        d = self._nodemaker.create_mutable_file(self.uploadable)
11315         def _created(node):
11316             self._fn = node
11317         d.addCallback(_created)
11318hunk ./src/allmydata/test/test_mutable.py 1927
11319         return d
11320 
11321-    def _encode(self, k, n, data):
11322+    def _encode(self, k, n, data, version=SDMF_VERSION):
11323         # encode 'data' into a peerid->shares dict.
11324 
11325         fn = self._fn
11326hunk ./src/allmydata/test/test_mutable.py 1943
11327         # and set the encoding parameters to something completely different
11328         fn2._required_shares = k
11329         fn2._total_shares = n
11330+        # Normally a servermap update would occur before a publish.
11331+        # Here, it doesn't, so we have to do it ourselves.
11332+        fn2.set_version(version)
11333 
11334         s = self._storage
11335         s._peers = {} # clear existing storage
11336hunk ./src/allmydata/test/test_mutable.py 1950
11337         p2 = Publish(fn2, self._storage_broker, None)
11338-        d = p2.publish(data)
11339+        uploadable = MutableData(data)
11340+        d = p2.publish(uploadable)
11341         def _published(res):
11342             shares = s._peers
11343             s._peers = {}
11344hunk ./src/allmydata/test/test_mutable.py 2253
11345         self.basedir = "mutable/Problems/test_publish_surprise"
11346         self.set_up_grid()
11347         nm = self.g.clients[0].nodemaker
11348-        d = nm.create_mutable_file("contents 1")
11349+        d = nm.create_mutable_file(MutableData("contents 1"))
11350         def _created(n):
11351             d = defer.succeed(None)
11352             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
11353hunk ./src/allmydata/test/test_mutable.py 2263
11354             d.addCallback(_got_smap1)
11355             # then modify the file, leaving the old map untouched
11356             d.addCallback(lambda res: log.msg("starting winning write"))
11357-            d.addCallback(lambda res: n.overwrite("contents 2"))
11358+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11359             # now attempt to modify the file with the old servermap. This
11360             # will look just like an uncoordinated write, in which every
11361             # single share got updated between our mapupdate and our publish
11362hunk ./src/allmydata/test/test_mutable.py 2272
11363                           self.shouldFail(UncoordinatedWriteError,
11364                                           "test_publish_surprise", None,
11365                                           n.upload,
11366-                                          "contents 2a", self.old_map))
11367+                                          MutableData("contents 2a"), self.old_map))
11368             return d
11369         d.addCallback(_created)
11370         return d
11371hunk ./src/allmydata/test/test_mutable.py 2281
11372         self.basedir = "mutable/Problems/test_retrieve_surprise"
11373         self.set_up_grid()
11374         nm = self.g.clients[0].nodemaker
11375-        d = nm.create_mutable_file("contents 1")
11376+        d = nm.create_mutable_file(MutableData("contents 1"))
11377         def _created(n):
11378             d = defer.succeed(None)
11379             d.addCallback(lambda res: n.get_servermap(MODE_READ))
11380hunk ./src/allmydata/test/test_mutable.py 2291
11381             d.addCallback(_got_smap1)
11382             # then modify the file, leaving the old map untouched
11383             d.addCallback(lambda res: log.msg("starting winning write"))
11384-            d.addCallback(lambda res: n.overwrite("contents 2"))
11385+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11386             # now attempt to retrieve the old version with the old servermap.
11387             # This will look like someone has changed the file since we
11388             # updated the servermap.
11389hunk ./src/allmydata/test/test_mutable.py 2300
11390             d.addCallback(lambda res:
11391                           self.shouldFail(NotEnoughSharesError,
11392                                           "test_retrieve_surprise",
11393-                                          "ran out of peers: have 0 shares (k=3)",
11394+                                          "ran out of peers: have 0 of 1",
11395                                           n.download_version,
11396                                           self.old_map,
11397                                           self.old_map.best_recoverable_version(),
11398hunk ./src/allmydata/test/test_mutable.py 2309
11399         d.addCallback(_created)
11400         return d
11401 
11402+
11403     def test_unexpected_shares(self):
11404         # upload the file, take a servermap, shut down one of the servers,
11405         # upload it again (causing shares to appear on a new server), then
11406hunk ./src/allmydata/test/test_mutable.py 2319
11407         self.basedir = "mutable/Problems/test_unexpected_shares"
11408         self.set_up_grid()
11409         nm = self.g.clients[0].nodemaker
11410-        d = nm.create_mutable_file("contents 1")
11411+        d = nm.create_mutable_file(MutableData("contents 1"))
11412         def _created(n):
11413             d = defer.succeed(None)
11414             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
11415hunk ./src/allmydata/test/test_mutable.py 2331
11416                 self.g.remove_server(peer0)
11417                 # then modify the file, leaving the old map untouched
11418                 log.msg("starting winning write")
11419-                return n.overwrite("contents 2")
11420+                return n.overwrite(MutableData("contents 2"))
11421             d.addCallback(_got_smap1)
11422             # now attempt to modify the file with the old servermap. This
11423             # will look just like an uncoordinated write, in which every
11424hunk ./src/allmydata/test/test_mutable.py 2341
11425                           self.shouldFail(UncoordinatedWriteError,
11426                                           "test_surprise", None,
11427                                           n.upload,
11428-                                          "contents 2a", self.old_map))
11429+                                          MutableData("contents 2a"), self.old_map))
11430             return d
11431         d.addCallback(_created)
11432         return d
11433hunk ./src/allmydata/test/test_mutable.py 2345
11434+    test_unexpected_shares.timeout = 15
11435 
11436     def test_bad_server(self):
11437         # Break one server, then create the file: the initial publish should
11438hunk ./src/allmydata/test/test_mutable.py 2381
11439         d.addCallback(_break_peer0)
11440         # now "create" the file, using the pre-established key, and let the
11441         # initial publish finally happen
11442-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
11443+        d.addCallback(lambda res: nm.create_mutable_file(MutableData("contents 1")))
11444         # that ought to work
11445         def _got_node(n):
11446             d = n.download_best_version()
11447hunk ./src/allmydata/test/test_mutable.py 2390
11448             def _break_peer1(res):
11449                 self.connection1.broken = True
11450             d.addCallback(_break_peer1)
11451-            d.addCallback(lambda res: n.overwrite("contents 2"))
11452+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11453             # that ought to work too
11454             d.addCallback(lambda res: n.download_best_version())
11455             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
11456hunk ./src/allmydata/test/test_mutable.py 2422
11457         peerids = [serverid for (serverid,ss) in sb.get_all_servers()]
11458         self.g.break_server(peerids[0])
11459 
11460-        d = nm.create_mutable_file("contents 1")
11461+        d = nm.create_mutable_file(MutableData("contents 1"))
11462         def _created(n):
11463             d = n.download_best_version()
11464             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
11465hunk ./src/allmydata/test/test_mutable.py 2430
11466             def _break_second_server(res):
11467                 self.g.break_server(peerids[1])
11468             d.addCallback(_break_second_server)
11469-            d.addCallback(lambda res: n.overwrite("contents 2"))
11470+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11471             # that ought to work too
11472             d.addCallback(lambda res: n.download_best_version())
11473             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
11474hunk ./src/allmydata/test/test_mutable.py 2449
11475         d = self.shouldFail(NotEnoughServersError,
11476                             "test_publish_all_servers_bad",
11477                             "Ran out of non-bad servers",
11478-                            nm.create_mutable_file, "contents")
11479+                            nm.create_mutable_file, MutableData("contents"))
11480         return d
11481 
11482     def test_publish_no_servers(self):
11483hunk ./src/allmydata/test/test_mutable.py 2461
11484         d = self.shouldFail(NotEnoughServersError,
11485                             "test_publish_no_servers",
11486                             "Ran out of non-bad servers",
11487-                            nm.create_mutable_file, "contents")
11488+                            nm.create_mutable_file, MutableData("contents"))
11489         return d
11490     test_publish_no_servers.timeout = 30
11491 
11492hunk ./src/allmydata/test/test_mutable.py 2479
11493         # we need some contents that are large enough to push the privkey out
11494         # of the early part of the file
11495         LARGE = "These are Larger contents" * 2000 # about 50KB
11496-        d = nm.create_mutable_file(LARGE)
11497+        LARGE_uploadable = MutableData(LARGE)
11498+        d = nm.create_mutable_file(LARGE_uploadable)
11499         def _created(n):
11500             self.uri = n.get_uri()
11501             self.n2 = nm.create_from_cap(self.uri)
11502hunk ./src/allmydata/test/test_mutable.py 2515
11503         self.basedir = "mutable/Problems/test_privkey_query_missing"
11504         self.set_up_grid(num_servers=20)
11505         nm = self.g.clients[0].nodemaker
11506-        LARGE = "These are Larger contents" * 2000 # about 50KB
11507+        LARGE = "These are Larger contents" * 2000 # about 50KiB
11508+        LARGE_uploadable = MutableData(LARGE)
11509         nm._node_cache = DevNullDictionary() # disable the nodecache
11510 
11511hunk ./src/allmydata/test/test_mutable.py 2519
11512-        d = nm.create_mutable_file(LARGE)
11513+        d = nm.create_mutable_file(LARGE_uploadable)
11514         def _created(n):
11515             self.uri = n.get_uri()
11516             self.n2 = nm.create_from_cap(self.uri)
11517hunk ./src/allmydata/test/test_mutable.py 2529
11518         d.addCallback(_created)
11519         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
11520         return d
11521+
11522+
11523+    def test_block_and_hash_query_error(self):
11524+        # This tests for what happens when a query to a remote server
11525+        # fails in either the hash validation step or the block getting
11526+        # step (because of batching, this is the same actual query).
11527+        # We need to have the storage server persist up until the point
11528+        # that its prefix is validated, then suddenly die. This
11529+        # exercises some exception handling code in Retrieve.
11530+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
11531+        self.set_up_grid(num_servers=20)
11532+        nm = self.g.clients[0].nodemaker
11533+        CONTENTS = "contents" * 2000
11534+        CONTENTS_uploadable = MutableData(CONTENTS)
11535+        d = nm.create_mutable_file(CONTENTS_uploadable)
11536+        def _created(node):
11537+            self._node = node
11538+        d.addCallback(_created)
11539+        d.addCallback(lambda ignored:
11540+            self._node.get_servermap(MODE_READ))
11541+        def _then(servermap):
11542+            # we have our servermap. Now we set up the servers like the
11543+            # tests above -- the first one that gets a read call should
11544+            # start throwing errors, but only after returning its prefix
11545+            # for validation. Since we'll download without fetching the
11546+            # private key, the next query to the remote server will be
11547+            # for either a block and salt or for hashes, either of which
11548+            # will exercise the error handling code.
11549+            killer = FirstServerGetsKilled()
11550+            for (serverid, ss) in nm.storage_broker.get_all_servers():
11551+                ss.post_call_notifier = killer.notify
11552+            ver = servermap.best_recoverable_version()
11553+            assert ver
11554+            return self._node.download_version(servermap, ver)
11555+        d.addCallback(_then)
11556+        d.addCallback(lambda data:
11557+            self.failUnlessEqual(data, CONTENTS))
11558+        return d
11559+
11560+
11561+class FileHandle(unittest.TestCase):
11562+    def setUp(self):
11563+        self.test_data = "Test Data" * 50000
11564+        self.sio = StringIO(self.test_data)
11565+        self.uploadable = MutableFileHandle(self.sio)
11566+
11567+
11568+    def test_filehandle_read(self):
11569+        self.basedir = "mutable/FileHandle/test_filehandle_read"
11570+        chunk_size = 10
11571+        for i in xrange(0, len(self.test_data), chunk_size):
11572+            data = self.uploadable.read(chunk_size)
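+            # read() returns a list of strings, so flatten it before
+            # comparing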
11573+            data = "".join(data)
11574+            start = i
11575+            end = i + chunk_size
11576+            self.failUnlessEqual(data, self.test_data[start:end])
11577+
11578+
11579+    def test_filehandle_get_size(self):
11580+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
11581+        actual_size = len(self.test_data)
11582+        size = self.uploadable.get_size()
11583+        self.failUnlessEqual(size, actual_size)
11584+
11585+
11586+    def test_filehandle_get_size_out_of_order(self):
11587+        # We should be able to call get_size whenever we want without
11588+        # disturbing the location of the seek pointer.
11589+        chunk_size = 100
11590+        data = self.uploadable.read(chunk_size)
11591+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
11592+
11593+        # Now get the size.
11594+        size = self.uploadable.get_size()
11595+        self.failUnlessEqual(size, len(self.test_data))
11596+
11597+        # Now get more data. We should be right where we left off.
11598+        more_data = self.uploadable.read(chunk_size)
11599+        start = chunk_size
11600+        end = chunk_size * 2
11601+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
11602+
11603+
11604+    def test_filehandle_file(self):
11605+        # Make sure that the MutableFileHandle works on a file as well
11606+        # as a StringIO object, since in some cases it will be asked to
11607+        # deal with files.
11608+        self.basedir = self.mktemp()
11609+        # mktemp() returns a fresh path without creating it, so we
+        # have to make the directory ourselves
11610+        os.mkdir(self.basedir)
11611+        f_path = os.path.join(self.basedir, "test_file")
11612+        f = open(f_path, "w")
11613+        f.write(self.test_data)
11614+        f.close()
11615+        f = open(f_path, "r")
11616+
11617+        uploadable = MutableFileHandle(f)
11618+
11619+        data = uploadable.read(len(self.test_data))
11620+        self.failUnlessEqual("".join(data), self.test_data)
11621+        size = uploadable.get_size()
11622+        self.failUnlessEqual(size, len(self.test_data))
11623+
11624+
11625+    def test_close(self):
11626+        # Make sure that the MutableFileHandle closes its handle when
11627+        # told to do so.
11628+        self.uploadable.close()
11629+        self.failUnless(self.sio.closed)
11630+
11631+
11632+class DataHandle(unittest.TestCase):
11633+    def setUp(self):
11634+        self.test_data = "Test Data" * 50000
11635+        self.uploadable = MutableData(self.test_data)
11636+
11637+
11638+    def test_datahandle_read(self):
11639+        chunk_size = 10
11640+        for i in xrange(0, len(self.test_data), chunk_size):
11641+            data = self.uploadable.read(chunk_size)
11642+            data = "".join(data)
11643+            start = i
11644+            end = i + chunk_size
11645+            self.failUnlessEqual(data, self.test_data[start:end])
11646+
11647+
11648+    def test_datahandle_get_size(self):
11649+        actual_size = len(self.test_data)
11650+        size = self.uploadable.get_size()
11651+        self.failUnlessEqual(size, actual_size)
11652+
11653+
11654+    def test_datahandle_get_size_out_of_order(self):
11655+        # We should be able to call get_size whenever we want without
11656+        # disturbing the location of the seek pointer.
11657+        chunk_size = 100
11658+        data = self.uploadable.read(chunk_size)
11659+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
11660+
11661+        # Now get the size.
11662+        size = self.uploadable.get_size()
11663+        self.failUnlessEqual(size, len(self.test_data))
11664+
11665+        # Now get more data. We should be right where we left off.
11666+        more_data = self.uploadable.read(chunk_size)
11667+        start = chunk_size
11668+        end = chunk_size * 2
11669+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
11670+
11671+
11672+class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
11673+              PublishMixin):
11674+    def setUp(self):
11675+        GridTestMixin.setUp(self)
11676+        self.basedir = self.mktemp()
11677+        self.set_up_grid()
11678+        self.c = self.g.clients[0]
11679+        self.nm = self.c.nodemaker
11680+        self.data = "test data" * 100000 # about 900 KiB; MDMF
11681+        self.small_data = "test data" * 10 # about 90 B; SDMF
11682+        return self.do_upload()
11683+
11684+
11685+    def do_upload(self):
11686+        d1 = self.nm.create_mutable_file(MutableData(self.data),
11687+                                         version=MDMF_VERSION)
11688+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
11689+        dl = gatherResults([d1, d2])
11690+        def _then((n1, n2)):
11691+            assert isinstance(n1, MutableFileNode)
11692+            assert isinstance(n2, MutableFileNode)
11693+
11694+            self.mdmf_node = n1
11695+            self.sdmf_node = n2
11696+        dl.addCallback(_then)
11697+        return dl
11698+
11699+
11700+    def test_get_readonly_mutable_version(self):
11701+        # Attempting to get a mutable version of a mutable file from a
11702+        # filenode initialized with a readcap should return a readonly
11703+        # version of that same node.
11704+        ro = self.mdmf_node.get_readonly()
11705+        d = ro.get_best_mutable_version()
11706+        d.addCallback(lambda version:
11707+            self.failUnless(version.is_readonly()))
11708+        d.addCallback(lambda ignored:
11709+            self.sdmf_node.get_readonly())
11710+        d.addCallback(lambda version:
11711+            self.failUnless(version.is_readonly()))
11712+        return d
11713+
11714+
11715+    def test_get_sequence_number(self):
11716+        d = self.mdmf_node.get_best_readable_version()
11717+        d.addCallback(lambda bv:
11718+            self.failUnlessEqual(bv.get_sequence_number(), 1))
11719+        d.addCallback(lambda ignored:
11720+            self.sdmf_node.get_best_readable_version())
11721+        d.addCallback(lambda bv:
11722+            self.failUnlessEqual(bv.get_sequence_number(), 1))
11723+        # Now update. The sequence number in both cases should then
11724+        # be 2, since each file has been written twice.
11725+        def _do_update(ignored):
11726+            new_data = MutableData("foo bar baz" * 100000)
11727+            new_small_data = MutableData("foo bar baz" * 10)
11728+            d1 = self.mdmf_node.overwrite(new_data)
11729+            d2 = self.sdmf_node.overwrite(new_small_data)
11730+            dl = gatherResults([d1, d2])
11731+            return dl
11732+        d.addCallback(_do_update)
11733+        d.addCallback(lambda ignored:
11734+            self.mdmf_node.get_best_readable_version())
11735+        d.addCallback(lambda bv:
11736+            self.failUnlessEqual(bv.get_sequence_number(), 2))
11737+        d.addCallback(lambda ignored:
11738+            self.sdmf_node.get_best_readable_version())
11739+        d.addCallback(lambda bv:
11740+            self.failUnlessEqual(bv.get_sequence_number(), 2))
11741+        return d
11742+
11743+
11744+    def test_get_writekey(self):
11745+        d = self.mdmf_node.get_best_mutable_version()
11746+        d.addCallback(lambda bv:
11747+            self.failUnlessEqual(bv.get_writekey(),
11748+                                 self.mdmf_node.get_writekey()))
11749+        d.addCallback(lambda ignored:
11750+            self.sdmf_node.get_best_mutable_version())
11751+        d.addCallback(lambda bv:
11752+            self.failUnlessEqual(bv.get_writekey(),
11753+                                 self.sdmf_node.get_writekey()))
11754+        return d
11755+
11756+
11757+    def test_get_storage_index(self):
11758+        d = self.mdmf_node.get_best_mutable_version()
11759+        d.addCallback(lambda bv:
11760+            self.failUnlessEqual(bv.get_storage_index(),
11761+                                 self.mdmf_node.get_storage_index()))
11762+        d.addCallback(lambda ignored:
11763+            self.sdmf_node.get_best_mutable_version())
11764+        d.addCallback(lambda bv:
11765+            self.failUnlessEqual(bv.get_storage_index(),
11766+                                 self.sdmf_node.get_storage_index()))
11767+        return d
11768+
11769+
11770+    def test_get_readonly_version(self):
11771+        d = self.mdmf_node.get_best_readable_version()
11772+        d.addCallback(lambda bv:
11773+            self.failUnless(bv.is_readonly()))
11774+        d.addCallback(lambda ignored:
11775+            self.sdmf_node.get_best_readable_version())
11776+        d.addCallback(lambda bv:
11777+            self.failUnless(bv.is_readonly()))
11778+        return d
11779+
11780+
11781+    def test_get_mutable_version(self):
11782+        d = self.mdmf_node.get_best_mutable_version()
11783+        d.addCallback(lambda bv:
11784+            self.failIf(bv.is_readonly()))
11785+        d.addCallback(lambda ignored:
11786+            self.sdmf_node.get_best_mutable_version())
11787+        d.addCallback(lambda bv:
11788+            self.failIf(bv.is_readonly()))
11789+        return d
11790+
11791+
11792+    def test_toplevel_overwrite(self):
11793+        new_data = MutableData("foo bar baz" * 100000)
11794+        new_small_data = MutableData("foo bar baz" * 10)
11795+        d = self.mdmf_node.overwrite(new_data)
11796+        d.addCallback(lambda ignored:
11797+            self.mdmf_node.download_best_version())
11798+        d.addCallback(lambda data:
11799+            self.failUnlessEqual(data, "foo bar baz" * 100000))
11800+        d.addCallback(lambda ignored:
11801+            self.sdmf_node.overwrite(new_small_data))
11802+        d.addCallback(lambda ignored:
11803+            self.sdmf_node.download_best_version())
11804+        d.addCallback(lambda data:
11805+            self.failUnlessEqual(data, "foo bar baz" * 10))
11806+        return d
11807+
11808+
11809+    def test_toplevel_modify(self):
11810+        def modifier(old_contents, servermap, first_time):
11811+            return old_contents + "modified"
11812+        d = self.mdmf_node.modify(modifier)
11813+        d.addCallback(lambda ignored:
11814+            self.mdmf_node.download_best_version())
11815+        d.addCallback(lambda data:
11816+            self.failUnlessIn("modified", data))
11817+        d.addCallback(lambda ignored:
11818+            self.sdmf_node.modify(modifier))
11819+        d.addCallback(lambda ignored:
11820+            self.sdmf_node.download_best_version())
11821+        d.addCallback(lambda data:
11822+            self.failUnlessIn("modified", data))
11823+        return d
11824+
11825+
11826+    def test_version_modify(self):
11827+        # TODO: When we can publish multiple versions, alter this test
11828+        # to modify a version other than the best usable version, then
11829+        # test to see that the best recoverable version is that.
11830+        def modifier(old_contents, servermap, first_time):
11831+            return old_contents + "modified"
11832+        d = self.mdmf_node.modify(modifier)
11833+        d.addCallback(lambda ignored:
11834+            self.mdmf_node.download_best_version())
11835+        d.addCallback(lambda data:
11836+            self.failUnlessIn("modified", data))
11837+        d.addCallback(lambda ignored:
11838+            self.sdmf_node.modify(modifier))
11839+        d.addCallback(lambda ignored:
11840+            self.sdmf_node.download_best_version())
11841+        d.addCallback(lambda data:
11842+            self.failUnlessIn("modified", data))
11843+        return d
11844+
11845+
11846+    def test_download_version(self):
11847+        d = self.publish_multiple()
11848+        # We want to have two recoverable versions on the grid.
11849+        d.addCallback(lambda res:
11850+                      self._set_versions({0:0,2:0,4:0,6:0,8:0,
11851+                                          1:1,3:1,5:1,7:1,9:1}))
11852+        # Now try to download each version. We should get the plaintext
11853+        # associated with that version.
11854+        d.addCallback(lambda ignored:
11855+            self._fn.get_servermap(mode=MODE_READ))
11856+        def _got_servermap(smap):
11857+            versions = smap.recoverable_versions()
11858+            assert len(versions) == 2
11859+
11860+            self.servermap = smap
11861+            self.version1, self.version2 = versions
11862+            assert self.version1 != self.version2
11863+
11864+            self.version1_seqnum = self.version1[0]
11865+            self.version2_seqnum = self.version2[0]
11866+            self.version1_index = self.version1_seqnum - 1
11867+            self.version2_index = self.version2_seqnum - 1
11868+
11869+        d.addCallback(_got_servermap)
11870+        d.addCallback(lambda ignored:
11871+            self._fn.download_version(self.servermap, self.version1))
11872+        d.addCallback(lambda results:
11873+            self.failUnlessEqual(self.CONTENTS[self.version1_index],
11874+                                 results))
11875+        d.addCallback(lambda ignored:
11876+            self._fn.download_version(self.servermap, self.version2))
11877+        d.addCallback(lambda results:
11878+            self.failUnlessEqual(self.CONTENTS[self.version2_index],
11879+                                 results))
11880+        return d
11881+
11882+
11883+    def test_download_nonexistent_version(self):
11884+        d = self.mdmf_node.get_servermap(mode=MODE_WRITE)
11885+        def _set_servermap(servermap):
11886+            self.servermap = servermap
11887+        d.addCallback(_set_servermap)
11888+        d.addCallback(lambda ignored:
11889+           self.shouldFail(UnrecoverableFileError, "nonexistent version",
11890+                           None,
11891+                           self.mdmf_node.download_version, self.servermap,
11892+                           "not a version"))
11893+        return d
11894+
11895+
11896+    def test_partial_read(self):
11897+        # read the file in 10000-byte chunks, and check that the
11898+        # reassembled result is what we expect.
11899+        d = self.mdmf_node.get_best_readable_version()
11900+        def _read_data(version):
11901+            c = consumer.MemoryConsumer()
11902+            d2 = defer.succeed(None)
11903+            for i in xrange(0, len(self.data), 10000):
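+                # bind i now with a default argument; a bare closure would
+                # see only the final value of i when the callbacks fire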
11904+                d2.addCallback(lambda ignored, i=i: version.read(c, i, 10000))
11905+            d2.addCallback(lambda ignored:
11906+                self.failUnlessEqual(self.data, "".join(c.chunks)))
11907+            return d2
11908+        d.addCallback(_read_data)
11909+        return d
11910+
11911+
11912+    def test_read(self):
11913+        d = self.mdmf_node.get_best_readable_version()
11914+        def _read_data(version):
11915+            c = consumer.MemoryConsumer()
11916+            d2 = defer.succeed(None)
11917+            d2.addCallback(lambda ignored: version.read(c))
11918+            d2.addCallback(lambda ignored:
11919+                self.failUnlessEqual("".join(c.chunks), self.data))
11920+            return d2
11921+        d.addCallback(_read_data)
11922+        return d
11923+
11924+
11925+    def test_download_best_version(self):
11926+        d = self.mdmf_node.download_best_version()
11927+        d.addCallback(lambda data:
11928+            self.failUnlessEqual(data, self.data))
11929+        d.addCallback(lambda ignored:
11930+            self.sdmf_node.download_best_version())
11931+        d.addCallback(lambda data:
11932+            self.failUnlessEqual(data, self.small_data))
11933+        return d
11934+
11935+
11936+class Update(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin):
11937+    def setUp(self):
11938+        GridTestMixin.setUp(self)
11939+        self.basedir = self.mktemp()
11940+        self.set_up_grid()
11941+        self.c = self.g.clients[0]
11942+        self.nm = self.c.nodemaker
11943+        self.data = "test data" * 100000 # about 900 KiB; MDMF
11944+        self.small_data = "test data" * 10 # about 90 B; SDMF
11945+        return self.do_upload()
11946+
11947+
11948+    def do_upload(self):
11949+        d1 = self.nm.create_mutable_file(MutableData(self.data),
11950+                                         version=MDMF_VERSION)
11951+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
11952+        dl = gatherResults([d1, d2])
11953+        def _then((n1, n2)):
11954+            assert isinstance(n1, MutableFileNode)
11955+            assert isinstance(n2, MutableFileNode)
11956+
11957+            self.mdmf_node = n1
11958+            self.sdmf_node = n2
11959+        dl.addCallback(_then)
11960+        return dl
11961+
11962+
11963+    def test_append(self):
11964+        # We should be able to append data to the end of a mutable
11965+        # file and get what we expect.
11966+        new_data = self.data + "appended"
11967+        d = self.mdmf_node.get_best_mutable_version()
11968+        d.addCallback(lambda mv:
11969+            mv.update(MutableData("appended"), len(self.data)))
11970+        d.addCallback(lambda ignored:
11971+            self.mdmf_node.download_best_version())
11972+        d.addCallback(lambda results:
11973+            self.failUnlessEqual(results, new_data))
11974+        return d
11975+    test_append.timeout = 15
11976+
11977+
11978+    def test_replace(self):
11979+        # We should be able to replace data in the middle of a mutable
11980+        # file and get what we expect back.
11981+        new_data = self.data[:100]
11982+        new_data += "appended"
11983+        new_data += self.data[108:]
11984+        d = self.mdmf_node.get_best_mutable_version()
11985+        d.addCallback(lambda mv:
11986+            mv.update(MutableData("appended"), 100))
11987+        d.addCallback(lambda ignored:
11988+            self.mdmf_node.download_best_version())
11989+        d.addCallback(lambda results:
11990+            self.failUnlessEqual(results, new_data))
11991+        return d
11992+
11993+
11994+    def test_replace_and_extend(self):
11995+        # We should be able to replace data in the middle of a mutable
11996+        # file and extend that mutable file and get what we expect.
11997+        new_data = self.data[:100]
11998+        new_data += "modified " * 100000
11999+        d = self.mdmf_node.get_best_mutable_version()
12000+        d.addCallback(lambda mv:
12001+            mv.update(MutableData("modified " * 100000), 100))
12002+        d.addCallback(lambda ignored:
12003+            self.mdmf_node.download_best_version())
12004+        d.addCallback(lambda results:
12005+            self.failUnlessEqual(results, new_data))
12006+        return d
12007+
12008+
12009+    def test_append_power_of_two(self):
12010+        # If we attempt to extend a mutable file so that its segment
12011+        # count crosses a power-of-two boundary, the update operation
12012+        # should know how to reencode the file.
12013+
12014+        # Note that the data populating self.mdmf_node is about 900 KiB
12015+        # long -- that's 7 segments at the default segment size. So we
12016+        # need to add 2 segments worth of data to push it over a
12017+        # power-of-two boundary.
12018+        segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
12019+        new_data = self.data + (segment * 2)
12020+        d = self.mdmf_node.get_best_mutable_version()
12021+        d.addCallback(lambda mv:
12022+            mv.update(MutableData(segment * 2), len(self.data)))
12023+        d.addCallback(lambda ignored:
12024+            self.mdmf_node.download_best_version())
12025+        d.addCallback(lambda results:
12026+            self.failUnlessEqual(results, new_data))
12027+        return d
12028+    test_append_power_of_two.timeout = 15
12029+
12030+
12031+    def test_update_sdmf(self):
12032+        # Running update on a single-segment file should still work.
12033+        new_data = self.small_data + "appended"
12034+        d = self.sdmf_node.get_best_mutable_version()
12035+        d.addCallback(lambda mv:
12036+            mv.update(MutableData("appended"), len(self.small_data)))
12037+        d.addCallback(lambda ignored:
12038+            self.sdmf_node.download_best_version())
12039+        d.addCallback(lambda results:
12040+            self.failUnlessEqual(results, new_data))
12041+        return d
12042+
12043+    def test_replace_in_last_segment(self):
12044+        # The wrapper should know how to handle the tail segment
12045+        # appropriately.
12046+        replace_offset = len(self.data) - 100
12047+        new_data = self.data[:replace_offset] + "replaced"
12048+        rest_offset = replace_offset + len("replaced")
12049+        new_data += self.data[rest_offset:]
12050+        d = self.mdmf_node.get_best_mutable_version()
12051+        d.addCallback(lambda mv:
12052+            mv.update(MutableData("replaced"), replace_offset))
12053+        d.addCallback(lambda ignored:
12054+            self.mdmf_node.download_best_version())
12055+        d.addCallback(lambda results:
12056+            self.failUnlessEqual(results, new_data))
12057+        return d
12058+
12059+
12060+    def test_multiple_segment_replace(self):
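+        # start the replacement exactly on a segment boundary, and make
+        # it span two full segments plus part of a third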
12061+        replace_offset = 2 * DEFAULT_MAX_SEGMENT_SIZE
12062+        new_data = self.data[:replace_offset]
12063+        new_segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
12064+        new_data += 2 * new_segment
12065+        new_data += "replaced"
12066+        rest_offset = len(new_data)
12067+        new_data += self.data[rest_offset:]
12068+        d = self.mdmf_node.get_best_mutable_version()
12069+        d.addCallback(lambda mv:
12070+            mv.update(MutableData((2 * new_segment) + "replaced"),
12071+                      replace_offset))
12072+        d.addCallback(lambda ignored:
12073+            self.mdmf_node.download_best_version())
12074+        d.addCallback(lambda results:
12075+            self.failUnlessEqual(results, new_data))
12076+        return d
12077hunk ./src/allmydata/test/test_sftp.py 32
12078 
12079 from allmydata.util.consumer import download_to_data
12080 from allmydata.immutable import upload
12081+from allmydata.mutable import publish
12082 from allmydata.test.no_network import GridTestMixin
12083 from allmydata.test.common import ShouldFailMixin
12084 from allmydata.test.common_util import ReallyEqualMixin
12085hunk ./src/allmydata/test/test_sftp.py 84
12086         return d
12087 
12088     def _set_up_tree(self):
12089-        d = self.client.create_mutable_file("mutable file contents")
12090+        u = publish.MutableData("mutable file contents")
12091+        d = self.client.create_mutable_file(u)
12092         d.addCallback(lambda node: self.root.set_node(u"mutable", node))
12093         def _created_mutable(n):
12094             self.mutable = n
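This hunk shows the calling convention introduced by the publish rework: contents are wrapped in a MutableData uploadable before being handed to create_mutable_file(), and filehandle-like objects are accepted the same way as strings. A minimal sketch of the pattern (make_mutable_node is a hypothetical helper):

    from allmydata.mutable.publish import MutableData

    def make_mutable_node(client, contents):
        # MutableData adapts a string (or filehandle-like object) to
        # the uploadable interface create_mutable_file() now expects
        return client.create_mutable_file(MutableData(contents))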
12095hunk ./src/allmydata/test/test_sftp.py 1334
12096         d.addCallback(lambda ign: self.failUnlessEqual(sftpd.all_heisenfiles, {}))
12097         d.addCallback(lambda ign: self.failUnlessEqual(self.handler._heisenfiles, {}))
12098         return d
12099+    test_makeDirectory.timeout = 15
12100 
12101     def test_execCommand_and_openShell(self):
12102         class FakeProtocol:
12103hunk ./src/allmydata/test/test_system.py 25
12104 from allmydata.monitor import Monitor
12105 from allmydata.mutable.common import NotWriteableError
12106 from allmydata.mutable import layout as mutable_layout
12107+from allmydata.mutable.publish import MutableData
12108 from foolscap.api import DeadReferenceError
12109 from twisted.python.failure import Failure
12110 from twisted.web.client import getPage
12111hunk ./src/allmydata/test/test_system.py 463
12112     def test_mutable(self):
12113         self.basedir = "system/SystemTest/test_mutable"
12114         DATA = "initial contents go here."  # 25 bytes % 3 != 0
12115+        DATA_uploadable = MutableData(DATA)
12116         NEWDATA = "new contents yay"
12117hunk ./src/allmydata/test/test_system.py 465
12118+        NEWDATA_uploadable = MutableData(NEWDATA)
12119         NEWERDATA = "this is getting old"
12120hunk ./src/allmydata/test/test_system.py 467
12121+        NEWERDATA_uploadable = MutableData(NEWERDATA)
12122 
12123         d = self.set_up_nodes(use_key_generator=True)
12124 
12125hunk ./src/allmydata/test/test_system.py 474
12126         def _create_mutable(res):
12127             c = self.clients[0]
12128             log.msg("starting create_mutable_file")
12129-            d1 = c.create_mutable_file(DATA)
12130+            d1 = c.create_mutable_file(DATA_uploadable)
12131             def _done(res):
12132                 log.msg("DONE: %s" % (res,))
12133                 self._mutable_node_1 = res
12134hunk ./src/allmydata/test/test_system.py 561
12135             self.failUnlessEqual(res, DATA)
12136             # replace the data
12137             log.msg("starting replace1")
12138-            d1 = newnode.overwrite(NEWDATA)
12139+            d1 = newnode.overwrite(NEWDATA_uploadable)
12140             d1.addCallback(lambda res: newnode.download_best_version())
12141             return d1
12142         d.addCallback(_check_download_3)
12143hunk ./src/allmydata/test/test_system.py 575
12144             newnode2 = self.clients[3].create_node_from_uri(uri)
12145             self._newnode3 = self.clients[3].create_node_from_uri(uri)
12146             log.msg("starting replace2")
12147-            d1 = newnode1.overwrite(NEWERDATA)
12148+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
12149             d1.addCallback(lambda res: newnode2.download_best_version())
12150             return d1
12151         d.addCallback(_check_download_4)
12152hunk ./src/allmydata/test/test_system.py 645
12153         def _check_empty_file(res):
12154             # make sure we can create empty files, this usually screws up the
12155             # segsize math
12156-            d1 = self.clients[2].create_mutable_file("")
12157+            d1 = self.clients[2].create_mutable_file(MutableData(""))
12158             d1.addCallback(lambda newnode: newnode.download_best_version())
12159             d1.addCallback(lambda res: self.failUnlessEqual("", res))
12160             return d1
12161hunk ./src/allmydata/test/test_system.py 676
12162                                  self.key_generator_svc.key_generator.pool_size + size_delta)
12163 
12164         d.addCallback(check_kg_poolsize, 0)
12165-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
12166+        d.addCallback(lambda junk:
12167+            self.clients[3].create_mutable_file(MutableData('hello, world')))
12168         d.addCallback(check_kg_poolsize, -1)
12169         d.addCallback(lambda junk: self.clients[3].create_dirnode())
12170         d.addCallback(check_kg_poolsize, -2)
12171hunk ./src/allmydata/test/test_web.py 28
12172 from allmydata.util.encodingutil import to_str
12173 from allmydata.test.common import FakeCHKFileNode, FakeMutableFileNode, \
12174      create_chk_filenode, WebErrorMixin, ShouldFailMixin, make_mutable_file_uri
12175-from allmydata.interfaces import IMutableFileNode
12176+from allmydata.interfaces import IMutableFileNode, SDMF_VERSION, MDMF_VERSION
12177 from allmydata.mutable import servermap, publish, retrieve
12178 import allmydata.test.common_util as testutil
12179 from allmydata.test.no_network import GridTestMixin
12180hunk ./src/allmydata/test/test_web.py 57
12181         return FakeCHKFileNode(cap)
12182     def _create_mutable(self, cap):
12183         return FakeMutableFileNode(None, None, None, None).init_from_cap(cap)
12184-    def create_mutable_file(self, contents="", keysize=None):
12185+    def create_mutable_file(self, contents="", keysize=None,
12186+                            version=SDMF_VERSION):
12187         n = FakeMutableFileNode(None, None, None, None)
12188hunk ./src/allmydata/test/test_web.py 60
12189+        n.set_version(version)
12190         return n.create(contents)
12191 
12192 class FakeUploader(service.Service):
12193hunk ./src/allmydata/test/test_web.py 153
12194         self.nodemaker = FakeNodeMaker(None, self._secret_holder, None,
12195                                        self.uploader, None,
12196                                        None, None)
12197+        self.mutable_file_default = SDMF_VERSION
12198 
12199     def startService(self):
12200         return service.MultiService.startService(self)
12201hunk ./src/allmydata/test/test_web.py 756
12202                              self.PUT, base + "/@@name=/blah.txt", "")
12203         return d
12204 
12205+
12206     def test_GET_DIRURL_named_bad(self):
12207         base = "/file/%s" % urllib.quote(self._foo_uri)
12208         d = self.shouldFail2(error.Error, "test_PUT_DIRURL_named_bad",
12209hunk ./src/allmydata/test/test_web.py 872
12210                                                       self.NEWFILE_CONTENTS))
12211         return d
12212 
12213+    def test_PUT_NEWFILEURL_unlinked_mdmf(self):
12214+        # This should give us an MDMF mutable file spanning several
12215+        # segments, whose reported format we can then check.
12216+        contents = self.NEWFILE_CONTENTS * 300000
12217+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
12218+                     contents)
12219+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12220+        d.addCallback(lambda json: self.failUnlessIn("mdmf", json))
12221+        return d
12222+
12223+    def test_PUT_NEWFILEURL_unlinked_sdmf(self):
12224+        contents = self.NEWFILE_CONTENTS * 300000
12225+        d = self.PUT("/uri?mutable=true&mutable-type=sdmf",
12226+                     contents)
12227+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12228+        d.addCallback(lambda json: self.failUnlessIn("sdmf", json))
12229+        return d
12230+
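Outside the test harness, the same unlinked creation can be driven against a running gateway with a plain HTTP PUT. A hedged sketch, assuming a local webapi at http://127.0.0.1:3456; the gateway URL and the put_unlinked_mutable helper are illustrative, not part of the patch:

    import urllib2

    def put_unlinked_mutable(contents, mutable_type="mdmf",
                             gateway="http://127.0.0.1:3456"):
        # PUT /uri?mutable=true&mutable-type=... returns the new filecap
        url = "%s/uri?mutable=true&mutable-type=%s" % (gateway, mutable_type)
        req = urllib2.Request(url, data=contents)
        req.get_method = lambda: "PUT"
        return urllib2.urlopen(req).read()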
12231     def test_PUT_NEWFILEURL_range_bad(self):
12232         headers = {"content-range": "bytes 1-10/%d" % len(self.NEWFILE_CONTENTS)}
12233         target = self.public_url + "/foo/new.txt"
12234hunk ./src/allmydata/test/test_web.py 922
12235         return d
12236 
12237     def test_PUT_NEWFILEURL_mutable_toobig(self):
12238-        d = self.shouldFail2(error.Error, "test_PUT_NEWFILEURL_mutable_toobig",
12239-                             "413 Request Entity Too Large",
12240-                             "SDMF is limited to one segment, and 10001 > 10000",
12241-                             self.PUT,
12242-                             self.public_url + "/foo/new.txt?mutable=true",
12243-                             "b" * (self.s.MUTABLE_SIZELIMIT+1))
12244+        # The SDMF size limit has been removed, so uploading a large
12245+        # mutable file should now succeed.
12246+        d = self.PUT(self.public_url + "/foo/new.txt?mutable=true",
12247+                     "b" * (self.s.MUTABLE_SIZELIMIT + 1))
12248         return d
12249 
12250     def test_PUT_NEWFILEURL_replace(self):
12251hunk ./src/allmydata/test/test_web.py 1020
12252         d.addCallback(_check1)
12253         return d
12254 
12255+    def test_GET_FILEURL_json_mutable_type(self):
12256+        # The JSON should include mutable-type, which says whether the
12257+        # file is SDMF or MDMF.
12258+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
12259+                     self.NEWFILE_CONTENTS * 300000)
12260+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12261+        def _got_json(json, version):
12262+            data = simplejson.loads(json)
12263+            assert "filenode" == data[0]
12264+            data = data[1]
12265+            assert isinstance(data, dict)
12266+
12267+            self.failUnlessIn("mutable-type", data)
12268+            self.failUnlessEqual(data['mutable-type'], version)
12269+
12270+        d.addCallback(_got_json, "mdmf")
12271+        # Now make an SDMF file and check that it is reported correctly.
12272+        d.addCallback(lambda ignored:
12273+            self.PUT("/uri?mutable=true&mutable-type=sdmf",
12274+                      self.NEWFILE_CONTENTS * 300000))
12275+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12276+        d.addCallback(_got_json, "sdmf")
12277+        return d
12278+
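The JSON shape _got_json relies on is a two-element list, ["filenode", metadata], where the metadata dict now carries a mutable-type key. A self-contained example with illustrative literal values:

    import simplejson

    json = '["filenode", {"mutable": true, "mutable-type": "mdmf"}]'
    nodetype, data = simplejson.loads(json)
    assert nodetype == "filenode"
    assert data["mutable-type"] in ("sdmf", "mdmf")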
12279     def test_GET_FILEURL_json_missing(self):
12280         d = self.GET(self.public_url + "/foo/missing?json")
12281         d.addBoth(self.should404, "test_GET_FILEURL_json_missing")
12282hunk ./src/allmydata/test/test_web.py 1082
12283         d.addBoth(self.should404, "test_GET_FILEURL_uri_missing")
12284         return d
12285 
12286-    def test_GET_DIRECTORY_html_banner(self):
12287+    def test_GET_DIRECTORY_html(self):
12288         d = self.GET(self.public_url + "/foo", followRedirect=True)
12289         def _check(res):
12290             self.failUnlessIn('<div class="toolbar-item"><a href="../../..">Return to Welcome page</a></div>',res)
12291hunk ./src/allmydata/test/test_web.py 1086
12292+            self.failUnlessIn("mutable-type-mdmf", res)
12293+            self.failUnlessIn("mutable-type-sdmf", res)
12294         d.addCallback(_check)
12295         return d
12296 
12297hunk ./src/allmydata/test/test_web.py 1091
12298+    def test_GET_root_html(self):
12299+        # make sure that we have the option to upload an unlinked
12300+        # mutable file in SDMF and MDMF formats.
12301+        d = self.GET("/")
12302+        def _got_html(html):
12303+            # These are radio buttons that allow the user to choose
12304+            # whether a particular mutable file is MDMF or SDMF.
12305+            self.failUnlessIn("mutable-type-mdmf", html)
12306+            self.failUnlessIn("mutable-type-sdmf", html)
12307+        d.addCallback(_got_html)
12308+        return d
12309+
12310+    def test_mutable_type_defaults(self):
12311+        # The checked="checked" attribute of the inputs corresponding to
12312+        # the mutable-type parameter should change as expected with the
12313+        # value configured in tahoe.cfg.
12314+        #
12315+        # By default, the client is configured with SDMF_VERSION, so
12316+        # the SDMF radio button should be checked.
12317+        assert self.s.mutable_file_default == SDMF_VERSION
12318+
12319+        d = self.GET("/")
12320+        def _got_html(html, value):
12321+            i = 'input checked="checked" type="radio" id="mutable-type-%s"'
12322+            self.failUnlessIn(i % value, html)
12323+        d.addCallback(_got_html, "sdmf")
12324+        d.addCallback(lambda ignored:
12325+            self.GET(self.public_url + "/foo", followRedirect=True))
12326+        d.addCallback(_got_html, "sdmf")
12327+        # Now switch the configuration value to MDMF. The MDMF radio
12328+        # buttons should now be checked on these pages.
12329+        def _swap_values(ignored):
12330+            self.s.mutable_file_default = MDMF_VERSION
12331+        d.addCallback(_swap_values)
12332+        d.addCallback(lambda ignored: self.GET("/"))
12333+        d.addCallback(_got_html, "mdmf")
12334+        d.addCallback(lambda ignored:
12335+            self.GET(self.public_url + "/foo", followRedirect=True))
12336+        d.addCallback(_got_html, "mdmf")
12337+        return d
12338+
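The value being toggled here ultimately comes from the client's configuration. A sketch of the selection logic the templates are assumed to apply; checked_value is hypothetical, but it mirrors what the test flips via self.s.mutable_file_default:

    from allmydata.interfaces import MDMF_VERSION

    def checked_value(mutable_file_default):
        # which mutable-type radio button should carry checked="checked"
        if mutable_file_default == MDMF_VERSION:
            return "mdmf"
        return "sdmf"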
12339     def test_GET_DIRURL(self):
12340         # the addSlash means we get a redirect here
12341         # from /uri/$URI/foo/ , we need ../../../ to get back to the root
12342hunk ./src/allmydata/test/test_web.py 1221
12343         d.addCallback(self.failUnlessIsFooJSON)
12344         return d
12345 
12346+    def test_GET_DIRURL_json_mutable_type(self):
12347+        d = self.PUT(self.public_url + \
12348+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
12349+                     self.NEWFILE_CONTENTS * 300000)
12350+        d.addCallback(lambda ignored:
12351+            self.PUT(self.public_url + \
12352+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
12353+                     self.NEWFILE_CONTENTS * 300000))
12354+        # Now we have an MDMF file and an SDMF file in the directory. If
12355+        # we GET the directory's JSON, we should see each child's mutable-type.
12356+        d.addCallback(lambda ignored:
12357+            self.GET(self.public_url + "/foo?t=json"))
12358+        def _got_json(json):
12359+            data = simplejson.loads(json)
12360+            assert data[0] == "dirnode"
12361+
12362+            data = data[1]
12363+            kids = data['children']
12364+
12365+            mdmf_data = kids['mdmf.txt'][1]
12366+            self.failUnlessIn("mutable-type", mdmf_data)
12367+            self.failUnlessEqual(mdmf_data['mutable-type'], "mdmf")
12368+
12369+            sdmf_data = kids['sdmf.txt'][1]
12370+            self.failUnlessIn("mutable-type", sdmf_data)
12371+            self.failUnlessEqual(sdmf_data['mutable-type'], "sdmf")
12372+        d.addCallback(_got_json)
12373+        return d
12374+
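Analogously to the filenode case, a directory's t=json is ["dirnode", metadata], where metadata["children"] maps child names to [nodetype, metadata] pairs. A self-contained example of the lookup the test performs, with illustrative values:

    import simplejson

    json = '''["dirnode", {"children": {
        "mdmf.txt": ["filenode", {"mutable-type": "mdmf"}],
        "sdmf.txt": ["filenode", {"mutable-type": "sdmf"}]}}]'''
    kids = simplejson.loads(json)[1]["children"]
    assert kids["mdmf.txt"][1]["mutable-type"] == "mdmf"
    assert kids["sdmf.txt"][1]["mutable-type"] == "sdmf"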
12375 
12376     def test_POST_DIRURL_manifest_no_ophandle(self):
12377         d = self.shouldFail2(error.Error,
12378hunk ./src/allmydata/test/test_web.py 1804
12379         return d
12380 
12381     def test_POST_upload_no_link_mutable_toobig(self):
12382-        d = self.shouldFail2(error.Error,
12383-                             "test_POST_upload_no_link_mutable_toobig",
12384-                             "413 Request Entity Too Large",
12385-                             "SDMF is limited to one segment, and 10001 > 10000",
12386-                             self.POST,
12387-                             "/uri", t="upload", mutable="true",
12388-                             file=("new.txt",
12389-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
12390+        # The SDMF size limit is no longer in place, so we should be
12391+        # able to upload mutable files that are as large as we want them
12392+        # to be.
12393+        d = self.POST("/uri", t="upload", mutable="true",
12394+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
12395         return d
12396 
12397hunk ./src/allmydata/test/test_web.py 1811
12398+
12399+    def test_POST_upload_mutable_type_unlinked(self):
12400+        d = self.POST("/uri?t=upload&mutable=true&mutable-type=sdmf",
12401+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
12402+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12403+        def _got_json(json, version):
12404+            data = simplejson.loads(json)
12405+            data = data[1]
12406+
12407+            self.failUnlessIn("mutable-type", data)
12408+            self.failUnlessEqual(data['mutable-type'], version)
12409+        d.addCallback(_got_json, "sdmf")
12410+        d.addCallback(lambda ignored:
12411+            self.POST("/uri?t=upload&mutable=true&mutable-type=mdmf",
12412+                      file=('mdmf.txt', self.NEWFILE_CONTENTS * 300000)))
12413+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12414+        d.addCallback(_got_json, "mdmf")
12415+        return d
12416+
12417+    def test_POST_upload_mutable_type(self):
12418+        d = self.POST(self.public_url + \
12419+                      "/foo?t=upload&mutable=true&mutable-type=sdmf",
12420+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
12421+        fn = self._foo_node
12422+        def _got_cap(filecap, filename):
12423+            filenameu = unicode(filename)
12424+            self.failUnlessURIMatchesRWChild(filecap, fn, filenameu)
12425+            return self.GET(self.public_url + "/foo/%s?t=json" % filename)
12426+        d.addCallback(_got_cap, "sdmf.txt")
12427+        def _got_json(json, version):
12428+            data = simplejson.loads(json)
12429+            data = data[1]
12430+
12431+            self.failUnlessIn("mutable-type", data)
12432+            self.failUnlessEqual(data['mutable-type'], version)
12433+        d.addCallback(_got_json, "sdmf")
12434+        d.addCallback(lambda ignored:
12435+            self.POST(self.public_url + \
12436+                      "/foo?t=upload&mutable=true&mutable-type=mdmf",
12437+                      file=("mdmf.txt", self.NEWFILE_CONTENTS * 300000)))
12438+        d.addCallback(_got_cap, "mdmf.txt")
12439+        d.addCallback(_got_json, "mdmf")
12440+        return d
12441+
12442     def test_POST_upload_mutable(self):
12443         # this creates a mutable file
12444         d = self.POST(self.public_url + "/foo", t="upload", mutable="true",
12445hunk ./src/allmydata/test/test_web.py 1979
12446             self.failUnlessReallyEqual(headers["content-type"], ["text/plain"])
12447         d.addCallback(_got_headers)
12448 
12449-        # make sure that size errors are displayed correctly for overwrite
12450-        d.addCallback(lambda res:
12451-                      self.shouldFail2(error.Error,
12452-                                       "test_POST_upload_mutable-toobig",
12453-                                       "413 Request Entity Too Large",
12454-                                       "SDMF is limited to one segment, and 10001 > 10000",
12455-                                       self.POST,
12456-                                       self.public_url + "/foo", t="upload",
12457-                                       mutable="true",
12458-                                       file=("new.txt",
12459-                                             "b" * (self.s.MUTABLE_SIZELIMIT+1)),
12460-                                       ))
12461-
12462+        # make sure the outdated SDMF size limit isn't enforced anymore.
12463+        d.addCallback(lambda ignored:
12464+            self.POST(self.public_url + "/foo", t="upload",
12465+                      mutable="true",
12466+                      file=("new.txt",
12467+                            "b" * (self.s.MUTABLE_SIZELIMIT+1))))
12468         d.addErrback(self.dump_error)
12469         return d
12470 
12471hunk ./src/allmydata/test/test_web.py 1989
12472     def test_POST_upload_mutable_toobig(self):
12473-        d = self.shouldFail2(error.Error,
12474-                             "test_POST_upload_mutable_toobig",
12475-                             "413 Request Entity Too Large",
12476-                             "SDMF is limited to one segment, and 10001 > 10000",
12477-                             self.POST,
12478-                             self.public_url + "/foo",
12479-                             t="upload", mutable="true",
12480-                             file=("new.txt",
12481-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
12482+        # SDMF had a size limit that was removed a while ago. MDMF has
12483+        # never had a size limit. Test to make sure that we do not
12484+        # encounter errors when trying to upload large mutable files,
12485+        # since no size checks should remain in the code for large
12486+        # mutable files.
12487+        d = self.POST(self.public_url + "/foo",
12488+                      t="upload", mutable="true",
12489+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
12490         return d
12491 
12492     def dump_error(self, f):
12493hunk ./src/allmydata/test/test_web.py 2999
12494                                                       contents))
12495         return d
12496 
12497+    def test_PUT_NEWFILEURL_mdmf(self):
12498+        new_contents = self.NEWFILE_CONTENTS * 300000
12499+        d = self.PUT(self.public_url + \
12500+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
12501+                     new_contents)
12502+        d.addCallback(lambda ignored:
12503+            self.GET(self.public_url + "/foo/mdmf.txt?t=json"))
12504+        def _got_json(json):
12505+            data = simplejson.loads(json)
12506+            data = data[1]
12507+            self.failUnlessIn("mutable-type", data)
12508+            self.failUnlessEqual(data['mutable-type'], "mdmf")
12509+        d.addCallback(_got_json)
12510+        return d
12511+
12512+    def test_PUT_NEWFILEURL_sdmf(self):
12513+        new_contents = self.NEWFILE_CONTENTS * 300000
12514+        d = self.PUT(self.public_url + \
12515+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
12516+                     new_contents)
12517+        d.addCallback(lambda ignored:
12518+            self.GET(self.public_url + "/foo/sdmf.txt?t=json"))
12519+        def _got_json(json):
12520+            data = simplejson.loads(json)
12521+            data = data[1]
12522+            self.failUnlessIn("mutable-type", data)
12523+            self.failUnlessEqual(data['mutable-type'], "sdmf")
12524+        d.addCallback(_got_json)
12525+        return d
12526+
12527     def test_PUT_NEWFILEURL_uri_replace(self):
12528         contents, n, new_uri = self.makefile(8)
12529         d = self.PUT(self.public_url + "/foo/bar.txt?t=uri", new_uri)
12530hunk ./src/allmydata/test/test_web.py 3150
12531         d.addCallback(_done)
12532         return d
12533 
12534+
12535+    def test_PUT_update_at_offset(self):
12536+        file_contents = "test file" * 100000 # 900000 bytes (about 879 KiB)
12537+        d = self.PUT("/uri?mutable=true", file_contents)
12538+        def _then(filecap):
12539+            self.filecap = filecap
12540+            new_data = file_contents[:100]
12541+            new = "replaced and so on"
12542+            new_data += new
12543+            new_data += file_contents[len(new_data):]
12544+            assert len(new_data) == len(file_contents)
12545+            self.new_data = new_data
12546+        d.addCallback(_then)
12547+        d.addCallback(lambda ignored:
12548+            self.PUT("/uri/%s?replace=True&offset=100" % self.filecap,
12549+                     "replaced and so on"))
12550+        def _get_data(filecap):
12551+            n = self.s.create_node_from_uri(filecap)
12552+            return n.download_best_version()
12553+        d.addCallback(_get_data)
12554+        d.addCallback(lambda results:
12555+            self.failUnlessEqual(results, self.new_data))
12556+        # Now try appending things to the file
12557+        d.addCallback(lambda ignored:
12558+            self.PUT("/uri/%s?offset=%d" % (self.filecap, len(self.new_data)),
12559+                     "puppies" * 100))
12560+        d.addCallback(_get_data)
12561+        d.addCallback(lambda results:
12562+            self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
12563+        return d
12564+
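A worked example of the offset semantics asserted above, as a pure-Python model. The splice rule is the assumption the test encodes; the query parameters (replace, offset) are taken from the PUTs in the test itself:

    old = "test file" * 100000          # 900000 bytes, as in the test
    body = "replaced and so on"
    new = old[:100] + body + old[100 + len(body):]
    assert len(new) == len(old)         # an in-range replace keeps the size
    appended = new + ("puppies" * 100)  # offset == len(new) is an append
    assert len(appended) == len(new) + 700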
12565+
12566+    def test_PUT_update_at_offset_immutable(self):
12567+        file_contents = "Test file" * 100000
12568+        d = self.PUT("/uri", file_contents)
12569+        def _then(filecap):
12570+            self.filecap = filecap
12571+        d.addCallback(_then)
12572+        d.addCallback(lambda ignored:
12573+            self.shouldHTTPError("test immutable update",
12574+                                 400, "Bad Request",
12575+                                 "immutable",
12576+                                 self.PUT,
12577+                                 "/uri/%s?offset=50" % self.filecap,
12578+                                 "foo"))
12579+        return d
12580+
12581+
12582     def test_bad_method(self):
12583         url = self.webish_url + self.public_url + "/foo/bar.txt"
12584         d = self.shouldHTTPError("test_bad_method",
12585hunk ./src/allmydata/test/test_web.py 3451
12586         def _stash_mutable_uri(n, which):
12587             self.uris[which] = n.get_uri()
12588             assert isinstance(self.uris[which], str)
12589-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
12590+        d.addCallback(lambda ign:
12591+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
12592         d.addCallback(_stash_mutable_uri, "corrupt")
12593         d.addCallback(lambda ign:
12594                       c0.upload(upload.Data("literal", convergence="")))
12595hunk ./src/allmydata/test/test_web.py 3598
12596         def _stash_mutable_uri(n, which):
12597             self.uris[which] = n.get_uri()
12598             assert isinstance(self.uris[which], str)
12599-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
12600+        d.addCallback(lambda ign:
12601+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
12602         d.addCallback(_stash_mutable_uri, "corrupt")
12603 
12604         def _compute_fileurls(ignored):
12605hunk ./src/allmydata/test/test_web.py 4261
12606         def _stash_mutable_uri(n, which):
12607             self.uris[which] = n.get_uri()
12608             assert isinstance(self.uris[which], str)
12609-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
12610+        d.addCallback(lambda ign:
12611+            c0.create_mutable_file(publish.MutableData(DATA+"2")))
12612         d.addCallback(_stash_mutable_uri, "mutable")
12613 
12614         def _compute_fileurls(ignored):
12615hunk ./src/allmydata/test/test_web.py 4361
12616                                                         convergence="")))
12617         d.addCallback(_stash_uri, "small")
12618 
12619-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
12620+        d.addCallback(lambda ign:
12621+            c0.create_mutable_file(publish.MutableData("mutable")))
12622         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
12623         d.addCallback(_stash_uri, "mutable")
12624 
12625}
12626
12627Context:
12628
12629[docs: doc of the download status page
12630zooko@zooko.com**20100814054117
12631 Ignore-this: a82ec33da3c39a7c0d47a7a6b5f81bbb
12632 ref: http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1169#comment:1
12633] 
12634[docs: NEWS: edit English usage, remove ticket numbers for regressions vs. 1.7.1 that were fixed again before 1.8.0c2
12635zooko@zooko.com**20100811071758
12636 Ignore-this: 993f5a1e6a9535f5b7a0bd77b93b66d0
12637] 
12638[docs: NEWS: more detail about new-downloader
12639zooko@zooko.com**20100811071303
12640 Ignore-this: 9f07da4dce9d794ce165aae287f29a1e
12641] 
12642[TAG allmydata-tahoe-1.8.0c2
12643david-sarah@jacaranda.org**20100810073847
12644 Ignore-this: c37f732b0e45f9ebfdc2f29c0899aeec
12645] 
12646[quickstart.html: update tarball link.
12647david-sarah@jacaranda.org**20100810073832
12648 Ignore-this: 4fcf9a7ec9d0de297c8ed4f29af50d71
12649] 
12650[webapi.txt: fix grammatical error.
12651david-sarah@jacaranda.org**20100810064127
12652 Ignore-this: 64f66aa71682195f82ac1066fe947e35
12653] 
12654[relnotes.txt: update revision of NEWS.
12655david-sarah@jacaranda.org**20100810063243
12656 Ignore-this: cf9eb342802d19f3a8004acd123fd46e
12657] 
12658[NEWS, relnotes and known-issues for 1.8.0c2.
12659david-sarah@jacaranda.org**20100810062851
12660 Ignore-this: bf319506558f6ba053fd896823c96a20
12661] 
12662[DownloadStatus: put real numbers in progress/status rows, not placeholders.
12663Brian Warner <warner@lothar.com>**20100810060603
12664 Ignore-this: 1f9dcd47c06cb356fc024d7bb8e24115
12665 Improve tests.
12666] 
12667[web download-status: tolerate DYHBs that haven't retired yet. Fixes #1160.
12668Brian Warner <warner@lothar.com>**20100809225100
12669 Ignore-this: cb0add71adde0a2e24f4bcc00abf9938
12670 
12671 Also add a better unit test for it.
12672] 
12673[immutable/filenode.py: put off DownloadStatus creation until first read() call
12674Brian Warner <warner@lothar.com>**20100809225055
12675 Ignore-this: 48564598f236eb73e96cd2d2a21a2445
12676 
12677 This avoids spamming the "recent uploads and downloads" /status page from
12678 FileNode instances that were created for a directory read but which nobody is
12679 ever going to read from. I also cleaned up the way DownloadStatus instances
12680 are made to only ever do it in the CiphertextFileNode, not in the
12681 higher-level plaintext FileNode. Also fixed DownloadStatus handling of read
12682 size, thanks to David-Sarah for the catch.
12683] 
12684[Share: hush log entries in the main loop() after the fetch has been completed.
12685Brian Warner <warner@lothar.com>**20100809204359
12686 Ignore-this: 72b9e262980edf5a967873ebbe1e9479
12687] 
12688[test_runner.py: correct and simplify normalization of package directory for case-insensitive filesystems.
12689david-sarah@jacaranda.org**20100808185005
12690 Ignore-this: fba96e967d4e7f33f301c7d56b577de
12691] 
12692[test_runner.py: make test_path work for test-from-installdir.
12693david-sarah@jacaranda.org**20100808171340
12694 Ignore-this: 46328d769ae6ec8d191c3cddacc91dc9
12695] 
12696[src/allmydata/__init__.py: make the package paths more accurate when we fail to get them from setuptools.
12697david-sarah@jacaranda.org**20100808171235
12698 Ignore-this: 8d534d2764d64f7434880bd70696cd75
12699] 
12700[test_runner.py: another try at calculating the rootdir correctly for test-from-egg and test-from-prefixdir.
12701david-sarah@jacaranda.org**20100808154307
12702 Ignore-this: 66737313935f2a0313d1de9b2ed68d0
12703] 
12704[test_runner.py: calculate the location of bin/tahoe correctly for test-from-prefixdir (by copying code from misc/build_helpers/run_trial.py). Also fix the false-positive check for Unicode paths in test_the_right_code, which was causing skips that should have been failures.
12705david-sarah@jacaranda.org**20100808042817
12706 Ignore-this: 1b7dfff07cbfb1a74f94141b18da2c3f
12707] 
12708[TAG allmydata-tahoe-1.8.0c1
12709david-sarah@jacaranda.org**20100807004546
12710 Ignore-this: 484ff2513774f3b48ca49c992e878b89
12711] 
12712[how_to_make_a_tahoe-lafs_release.txt: add step to check that release will report itself as the intended version.
12713david-sarah@jacaranda.org**20100807004254
12714 Ignore-this: 7709322e883f4118f38c7f042f5a9a2
12715] 
12716[relnotes.txt: 1.8.0c1 release
12717david-sarah@jacaranda.org**20100807003646
12718 Ignore-this: 1994ffcaf55089eb05e96c23c037dfee
12719] 
12720[NEWS, quickstart.html and known_issues.txt for 1.8.0c1 release.
12721david-sarah@jacaranda.org**20100806235111
12722 Ignore-this: 777cea943685cf2d48b6147a7648fca0
12723] 
12724[TAG allmydata-tahoe-1.8.0rc1
12725warner@lothar.com**20100806080450] 
12726[update NEWS and other docs in preparation for 1.8.0rc1
12727Brian Warner <warner@lothar.com>**20100806080228
12728 Ignore-this: 6ebdf11806f6dfbfde0b61115421a459
12729 
12730 in particular, merge the various 1.8.0b1/b2 sections, and remove the
12731 datestamp. NEWS gets updated just before a release, doesn't need to precisely
12732 describe pre-release candidates, and the datestamp gets updated just before
12733 the final release is tagged
12734 
12735 Also, I removed the BOM from some files. My toolchain made it hard to retain,
12736 and BOMs in UTF-8 don't make a whole lot of sense anyway. Sorry if that
12737 messes anything up.
12738] 
12739[downloader.Segmentation: unregisterProducer when asked to stopProducing, this
12740Brian Warner <warner@lothar.com>**20100806070705
12741 Ignore-this: a0a71dcf83df8a6f727deb9a61fa4fdf
12742 seems to avoid the #1155 log message which reveals the URI (and filecap).
12743 
12744 Also add an [ERROR] marker to the flog entry, since unregisterProducer also
12745 makes interrupted downloads appear "200 OK"; this makes it more obvious that
12746 the download did not complete.
12747] 
12748[TAG allmydata-tahoe-1.8.0b2
12749david-sarah@jacaranda.org**20100806052415
12750 Ignore-this: 2c1af8df5e25a6ebd90a32b49b8486dc
12751] 
12752[relnotes.txt and docs/known_issues.txt for 1.8.0beta2.
12753david-sarah@jacaranda.org**20100806040823
12754 Ignore-this: 862ad55d93ee37259ded9e2c9da78eb9
12755] 
12756[test_util.py: use SHA-256 from pycryptopp instead of MD5 from hashlib (for uses in which any hash will do), since hashlib was only added to the stdlib in Python 2.5.
12757david-sarah@jacaranda.org**20100806050051
12758 Ignore-this: 552049b5d190a5ca775a8240030dbe3f
12759] 
12760[test_runner.py: increase timeout to cater for Francois' ARM buildslave.
12761david-sarah@jacaranda.org**20100806042601
12762 Ignore-this: 6ee618cf00ac1c99cb7ddb60fd7ef078
12763] 
12764[test_util.py: remove use of 'a if p else b' syntax that requires Python 2.5.
12765david-sarah@jacaranda.org**20100806041616
12766 Ignore-this: 5fecba9aa530ef352797fcfa70d5c592
12767] 
12768[NEWS and docs/quickstart.html for 1.8.0beta2.
12769david-sarah@jacaranda.org**20100806035112
12770 Ignore-this: 3a593cfdc2ae265da8f64c6c8aebae4
12771] 
12772[docs/quickstart.html: remove link to tahoe-lafs-ticket798-1.8.0b.zip, due to appname regression. refs #1159
12773david-sarah@jacaranda.org**20100806002435
12774 Ignore-this: bad61b30cdcc3d93b4165d5800047b85
12775] 
12776[test_download.DownloadTest.test_simultaneous_goodguess: enable some disabled
12777Brian Warner <warner@lothar.com>**20100805185507
12778 Ignore-this: ac53d44643805412238ccbfae920d20c
12779 checks that used to fail but work now.
12780] 
12781[DownloadNode: fix lost-progress in fetch_failed, tolerate cancel when no segment-fetch is active. Fixes #1154.
12782Brian Warner <warner@lothar.com>**20100805185507
12783 Ignore-this: 35fd36b273b21b6dca12ab3d11ee7d2d
12784 
12785  The lost-progress bug occurred when two simultaneous read() calls fetched
12786 different segments, and the first one failed (due to corruption, or the other
12787 bugs in #1154): the second read() would never complete. While in this state,
12788  cancelling the second read by having its consumer call stopProducing() would
12789 trigger the cancel-intolerance bug. Finally, in downloader.node.Cancel,
12790 prevent late cancels by adding an 'active' flag
12791] 
12792[util/spans.py: __nonzero__ cannot return a long either. for #1154
12793Brian Warner <warner@lothar.com>**20100805185507
12794 Ignore-this: 6f87fead8252e7a820bffee74a1c51a2
12795] 
12796[test_storage.py: change skip note for test_large_share to say that Windows doesn't support sparse files. refs #569
12797david-sarah@jacaranda.org**20100805022612
12798 Ignore-this: 85c807a536dc4eeb8bf14980028bb05b
12799] 
12800[One fix for bug #1154: webapi GETs with a 'Range' header broke new-downloader.
12801Brian Warner <warner@lothar.com>**20100804184549
12802 Ignore-this: ffa3e703093a905b416af125a7923b7b
12803 
12804 The Range header causes n.read() to be called with an offset= of type 'long',
12805 which eventually got used in a Spans/DataSpans object's __len__ method.
12806 Apparently python doesn't permit __len__() to return longs, only ints.
12807 Rewrote Spans/DataSpans to use s.len() instead of len(s) aka s.__len__() .
12808 Added a test in test_download. Note that test_web didn't catch this because
12809 it uses mock FileNodes for speed: it's probably time to rewrite that.
12810 
12811 There is still an unresolved error-recovery problem in #1154, so I'm not
12812 closing the ticket quite yet.
12813] 
12814[test_download: minor cleanup
12815Brian Warner <warner@lothar.com>**20100804175555
12816 Ignore-this: f4aec3c77f6a0d7f7b2c07f302755cc1
12817] 
12818[fetcher.py: improve comments
12819Brian Warner <warner@lothar.com>**20100804072814
12820 Ignore-this: 8bf74c21aef55cf0b0642e55ee4e7c5f
12821] 
12822[lazily create DownloadNode upon first read()/get_segment()
12823Brian Warner <warner@lothar.com>**20100804072808
12824 Ignore-this: 4bb1c49290cefac1dadd9d42fac46ba2
12825] 
12826[test_hung_server: update comments, remove dead "stage_4_d" code
12827Brian Warner <warner@lothar.com>**20100804072800
12828 Ignore-this: 4d18b374b568237603466f93346d00db
12829] 
12830[copy the rest of David-Sarah's changes to make my tree match 1.8.0beta
12831Brian Warner <warner@lothar.com>**20100804072752
12832 Ignore-this: 9ac7f21c9b27e53452371096146be5bb
12833] 
12834[ShareFinder: add 10s OVERDUE timer, send new requests to replace overdue ones
12835Brian Warner <warner@lothar.com>**20100804072741
12836 Ignore-this: 7fa674edbf239101b79b341bb2944349
12837 
12838 The fixed 10-second timer will eventually be replaced with a per-server
12839 value, calculated based on observed response times.
12840 
12841 test_hung_server.py: enhance to exercise DYHB=OVERDUE state. Split existing
12842 mutable+immutable tests into two pieces for clarity. Reenabled several tests.
12843 Deleted the now-obsolete "test_failover_during_stage_4".
12844] 
12845[Rewrite immutable downloader (#798). This patch adds and updates unit tests.
12846Brian Warner <warner@lothar.com>**20100804072710
12847 Ignore-this: c3c838e124d67b39edaa39e002c653e1
12848] 
12849[Rewrite immutable downloader (#798). This patch includes higher-level
12850Brian Warner <warner@lothar.com>**20100804072702
12851 Ignore-this: 40901ddb07d73505cb58d06d9bff73d9
12852 integration into the NodeMaker, and updates the web-status display to handle
12853 the new download events.
12854] 
12855[Rewrite immutable downloader (#798). This patch rearranges the rest of src/allmydata/immutable/ .
12856Brian Warner <warner@lothar.com>**20100804072639
12857 Ignore-this: 302b1427a39985bfd11ccc14a1199ea4
12858] 
12859[Rewrite immutable downloader (#798). This patch adds the new downloader itself.
12860Brian Warner <warner@lothar.com>**20100804072629
12861 Ignore-this: e9102460798123dd55ddca7653f4fc16
12862] 
12863[util/observer.py: add EventStreamObserver
12864Brian Warner <warner@lothar.com>**20100804072612
12865 Ignore-this: fb9d205f34a6db7580b9be33414dfe21
12866] 
12867[Add a byte-spans utility class, like perl's Set::IntSpan for .newsrc files.
12868Brian Warner <warner@lothar.com>**20100804072600
12869 Ignore-this: bbad42104aeb2f26b8dd0779de546128
12870 Also a data-spans class, which records a byte (instead of a bit) for each
12871 index.
12872] 
12873[check-umids: oops, forgot to add the tool
12874Brian Warner <warner@lothar.com>**20100804071713
12875 Ignore-this: bbeb74d075414f3713fabbdf66189faf
12876] 
12877[coverage tools: ignore errors, display lines-uncovered in elisp mode. Fix Makefile paths.
12878"Brian Warner <warner@lothar.com>"**20100804071131] 
12879[check-umids: new tool to check uniqueness of umids
12880"Brian Warner <warner@lothar.com>"**20100804071042] 
12881[misc/simulators/sizes.py: update, we now use SHA256 (not SHA1), so large-file overhead grows to 0.5%
12882"Brian Warner <warner@lothar.com>"**20100804070942] 
12883[storage-overhead: try to fix, probably still broken
12884"Brian Warner <warner@lothar.com>"**20100804070815] 
12885[docs/quickstart.html: link to 1.8.0beta zip, and note 'bin\tahoe' on Windows.
12886david-sarah@jacaranda.org**20100803233254
12887 Ignore-this: 3c11f249efc42a588e3a7056349739ed
12888] 
12889[docs: relnotes.txt for 1.8.0β
12890zooko@zooko.com**20100803154913
12891 Ignore-this: d9101f72572b18da3cfac3c0e272c907
12892] 
12893[test_storage.py: avoid spurious test failure by accepting either 'Next crawl in 59 minutes' or 'Next crawl in 60 minutes'. fixes #1140
12894david-sarah@jacaranda.org**20100803102058
12895 Ignore-this: aa2419fc295727e4fbccec3c7b780e76
12896] 
12897[misc/build_helpers/show-tool-versions.py: get sys.std{out,err}.encoding and 'as' version correctly, and improve formatting.
12898david-sarah@jacaranda.org**20100803101128
12899 Ignore-this: 4fd2907d86da58eb220e104010e9c6a
12900] 
12901[misc/build_helpers/show-tool-versions.py: avoid error message when 'as -version' does not create a.out.
12902david-sarah@jacaranda.org**20100803094812
12903 Ignore-this: 38fc2d639f30b4e123b9551e6931998d
12904] 
12905[CLI: further improve consistency of basedir options and add tests. addresses #118
12906david-sarah@jacaranda.org**20100803085416
12907 Ignore-this: d8f8f55738abb5ea44ed4cf24d750efe
12908] 
12909[CLI: make the synopsis for 'tahoe unlink' say unlink instead of rm.
12910david-sarah@jacaranda.org**20100803085359
12911 Ignore-this: c35d3f99f906dfab61df8f5e81a42c92
12912] 
12913[CLI: make all of the option descriptions imperative sentences.
12914david-sarah@jacaranda.org**20100803084801
12915 Ignore-this: ec80c7d2a10c6452d190fee4e1a60739
12916] 
12917[test_cli.py: make 'tahoe mkdir' tests slightly less dumb (check for 'URI:' in the output).
12918david-sarah@jacaranda.org**20100803084720
12919 Ignore-this: 31a4ae4fb5f7c123bc6b6e36a9e3911e
12920] 
12921[test_cli.py: use u-escapes instead of UTF-8.
12922david-sarah@jacaranda.org**20100803083538
12923 Ignore-this: a48af66942defe8491c6e1811c7809b5
12924] 
12925[NEWS: remove XXX comment and separate description of #890.
12926david-sarah@jacaranda.org**20100803050827
12927 Ignore-this: 6d308f34dc9d929d3d0811f7a1f5c786
12928] 
12929[docs: more updates to NEWS for 1.8.0β
12930zooko@zooko.com**20100803044618
12931 Ignore-this: 8193a1be38effe2bdcc632fdb570e9fc
12932] 
12933[docs: incomplete beginnings of a NEWS update for v1.8β
12934zooko@zooko.com**20100802072840
12935 Ignore-this: cb00fcd4f1e0eaed8c8341014a2ba4d4
12936] 
12937[docs/quickstart.html: extra step to open a new Command Prompt or log out/in on Windows.
12938david-sarah@jacaranda.org**20100803004938
12939 Ignore-this: 1334a2cd01f77e0c9eddaeccfeff2370
12940] 
12941[update bundled zetuptools with doc changes, change to script setup for Windows XP, and to have the 'develop' command run script setup.
12942david-sarah@jacaranda.org**20100803003815
12943 Ignore-this: 73c86e154f4d3f7cc9855eb31a20b1ed
12944] 
12945[bundled setuptools/command/scriptsetup.py: use SendMessageTimeoutW, to test whether that broadcasts environment changes any better.
12946david-sarah@jacaranda.org**20100802224505
12947 Ignore-this: 7788f7c2f9355e7852a376ec94182056
12948] 
12949[bundled zetuptoolz: add missing setuptools/command/scriptsetup.py
12950david-sarah@jacaranda.org**20100802072129
12951 Ignore-this: 794b1c411f6cdec76eeb716223a55d0
12952] 
12953[test_runner.py: add test_run_with_python_options, which checks that the Windows script changes haven't broken 'python <options> bin/tahoe'.
12954david-sarah@jacaranda.org**20100802062558
12955 Ignore-this: 812a2ccb7d9c7a8e01d5ca04d875aba5
12956] 
12957[test_runner.py: fix missing import of get_filesystem_encoding
12958david-sarah@jacaranda.org**20100802060902
12959 Ignore-this: 2e9e439b7feb01e0c3c94b54e802503b
12960] 
12961[Bundle setuptools-0.6c16dev (with Windows script changes, and the change to only warn if site.py wasn't generated by setuptools) instead of 0.6c15dev. addresses #565, #1073, #1074
12962david-sarah@jacaranda.org**20100802060602
12963 Ignore-this: 34ee2735e49e2c05b57e353d48f83050
12964] 
12965[.darcs-boringfile: changes needed to take account of egg directories being bundled. Also, make _trial_temp a prefix rather than exact match.
12966david-sarah@jacaranda.org**20100802050313
12967 Ignore-this: 8de6a8dbaba014ba88dec6c792fc5a9d
12968] 
12969[.darcs-boringfile: changes needed to take account of pyscript wrappers on Windows.
12970david-sarah@jacaranda.org**20100802050128
12971 Ignore-this: 7366b631e2095166696e6da5765d9180
12972] 
12973 [misc/build_helpers/run_trial.py: check that the root from which the module we are testing was loaded is the current directory. This version of the patch folds in later fixes to the logic for calculating the directories to compare, and improvements to error messages. addresses #1137
12974david-sarah@jacaranda.org**20100802045535
12975 Ignore-this: 9d3c1447f0539c6308127413098eb646
12976] 
12977[Skip option arguments to the python interpreter when reconstructing Unicode argv on Windows.
12978david-sarah@jacaranda.org**20100728062731
12979 Ignore-this: 2b17fc43860bcc02a66bb6e5e050ea7c
12980] 
12981[windows/fixups.py: improve comments and reference some relevant Python bugs.
12982david-sarah@jacaranda.org**20100727181921
12983 Ignore-this: 32e61cf98dfc2e3dac60b750dda6429b
12984] 
12985[windows/fixups.py: make errors reported to original_stderr have enough information to debug even if we can't see the traceback.
12986david-sarah@jacaranda.org**20100726221904
12987 Ignore-this: e30b4629a7aa5d71554237c7e809c080
12988] 
12989[windows/fixups.py: fix paste-o in name of Unicode stderr wrapper.
12990david-sarah@jacaranda.org**20100726214736
12991 Ignore-this: cb220931f1683eb53b0c7269e18a38be
12992] 
12993[windows/fixups.py: Don't rely on buggy MSVCRT library for Unicode output, use the Win32 API instead. This should make it work on XP. Also, change how we handle the case where sys.stdout and sys.stderr are redirected, since the .encoding attribute isn't necessarily writeable.
12994david-sarah@jacaranda.org**20100726045019
12995 Ignore-this: 69267abc5065cbd5b86ca71fe4921fb6
12996] 
12997[test_runner.py: change to code for locating the bin/tahoe script that was missed when rebasing the patch for #1074.
12998david-sarah@jacaranda.org**20100725182008
12999 Ignore-this: d891a93989ecc3f4301a17110c3d196c
13000] 
13001[Add missing windows/fixups.py (for setting up Unicode args and output on Windows).
13002david-sarah@jacaranda.org**20100725092849
13003 Ignore-this: 35a1e8aeb4e1dea6e81433bf0825a6f6
13004] 
13005[Changes to Tahoe needed to work with new zetuptoolz (that does not use .exe wrappers on Windows), and to support Unicode arguments and stdout/stderr -- v5
13006david-sarah@jacaranda.org**20100725083216
13007 Ignore-this: 5041a634b1328f041130658233f6a7ce
13008] 
13009[scripts/common.py: fix an error introduced when rebasing to the ticket798 branch, which caused base directories to be duplicated in self.basedirs.
13010david-sarah@jacaranda.org**20100802064929
13011 Ignore-this: 116fd437d1f91a647879fe8d9510f513
13012] 
13013[Basedir/node directory option improvements for ticket798 branch. addresses #188, #706, #715, #772, #890
13014david-sarah@jacaranda.org**20100802043004
13015 Ignore-this: d19fc24349afa19833406518595bfdf7
13016] 
13017[scripts/create_node.py: allow nickname to be Unicode. Also ensure webport is validly encoded in config file.
13018david-sarah@jacaranda.org**20100802000212
13019 Ignore-this: fb236169280507dd1b3b70d459155f6e
13020] 
13021[test_runner.py: Fix error in message arguments to 'fail' calls.
13022david-sarah@jacaranda.org**20100802013526
13023 Ignore-this: 3bfdef19ae3cf993194811367da5d020
13024] 
13025[Additional Unicode basedir changes for ticket798 branch.
13026david-sarah@jacaranda.org**20100802010552
13027 Ignore-this: 7090d8c6b04eb6275345a55e75142028
13028] 
13029[Unicode basedir changes for ticket798 branch.
13030david-sarah@jacaranda.org**20100801235310
13031 Ignore-this: a00717eaeae8650847b5395801e04c45
13032] 
13033[fileutil: change WindowsError to OSError in abspath_expanduser_unicode, because WindowsError might not exist.
13034david-sarah@jacaranda.org**20100725222603
13035 Ignore-this: e125d503670ed049a9ade0322faa0c51
13036] 
13037[test_system: correct a failure in _test_runner caused by Unicode basedir patch on non-Unicode platforms.
13038david-sarah@jacaranda.org**20100724032123
13039 Ignore-this: 399b3953104fdd1bbed3f7564d163553
13040] 
13041[Fix test failures due to Unicode basedir patches.
13042david-sarah@jacaranda.org**20100725010318
13043 Ignore-this: fe92cd439eb3e60a56c007ae452784ed
13044] 
13045[util.encodingutil: change quote_output to do less unnecessary escaping, and to use double-quotes more consistently when needed. This version avoids u-escaping for characters that are representable in the output encoding, when double quotes are used, and includes tests. fixes #1135
13046david-sarah@jacaranda.org**20100723075314
13047 Ignore-this: b82205834d17db61612dd16436b7c5a2
13048] 
13049[Replace uses of os.path.abspath with abspath_expanduser_unicode where necessary. This makes basedir paths consistently represented as Unicode.
13050david-sarah@jacaranda.org**20100722001418
13051 Ignore-this: 9f8cb706540e695550e0dbe303c01f52
13052] 
13053[util.fileutil, test.test_util: add abspath_expanduser_unicode function, to work around <http://bugs.python.org/issue3426>. util.encodingutil: add a convenience function argv_to_abspath.
13054david-sarah@jacaranda.org**20100721231507
13055 Ignore-this: eee6904d1f65a733ff35190879844d08
13056] 
13057[setup: increase requirement on foolscap from >= 0.4.1 to >= 0.5.1 to avoid the foolscap performance bug with transferring large mutable files
13058zooko@zooko.com**20100802071748
13059 Ignore-this: 53b5b8571ebfee48e6b11e3f3a5efdb7
13060] 
13061[upload: tidy up logging messages
13062zooko@zooko.com**20100802070212
13063 Ignore-this: b3532518326f6d808d085da52c14b661
13064 reformat code to be less than 100 chars wide, refactor formatting of logging messages, add log levels to some logging messages, M-x whitespace-cleanup
13065] 
13066[tests: remove debug print
13067zooko@zooko.com**20100802063339
13068 Ignore-this: b13b8c15e946556bffca9d7ad7c890f5
13069] 
13070[docs: update the list of forums to announce Tahoe-LAFS too, add empty checkboxes
13071zooko@zooko.com**20100802063314
13072 Ignore-this: 89d0e8bd43f1749a9e85fcee2205bb04
13073] 
13074[immutable: tidy-up some code by using a set instead of list to hold homeless_shares
13075zooko@zooko.com**20100802062004
13076 Ignore-this: a70bda3cf6c48ab0f0688756b015cf8d
13077] 
13078[setup: fix a couple instances of hard-coded 'allmydata-tahoe' in the scripts, tighten the tests (as suggested by David-Sarah)
13079zooko@zooko.com**20100801164207
13080 Ignore-this: 50265b562193a9a3797293123ed8ba5c
13081] 
13082[setup: replace hardcoded 'allmydata-tahoe' with allmydata.__appname__
13083zooko@zooko.com**20100801160517
13084 Ignore-this: 55e1a98515300d228f02df10975f7ba
13085] 
13086[NEWS: describe #1055
13087zooko@zooko.com**20100801034338
13088 Ignore-this: 3a16cfa387c2b245c610ea1d1ad8d7f1
13089] 
13090[immutable: use PrefixingLogMixin to organize logging in Tahoe2PeerSelector and add more detailed messages about peer
13091zooko@zooko.com**20100719082000
13092 Ignore-this: e034c4988b327f7e138a106d913a3082
13093] 
13094[benchmarking: update bench_dirnode to be correct and use the shiniest new pyutil.benchutil features concerning what units you measure in
13095zooko@zooko.com**20100719044948
13096 Ignore-this: b72059e4ff921741b490e6b47ec687c6
13097] 
13098[trivial: rename and add in-line doc to clarify "used_peers" => "upload_servers"
13099zooko@zooko.com**20100719044744
13100 Ignore-this: 93c42081676e0dea181e55187cfc506d
13101] 
13102[abbreviate time edge case python2.5 unit test
13103jacob.lyles@gmail.com**20100729210638
13104 Ignore-this: 80f9b1dc98ee768372a50be7d0ef66af
13105] 
13106[docs: add Jacob Lyles to CREDITS
13107zooko@zooko.com**20100730230500
13108 Ignore-this: 9dbbd6a591b4b1a5a8dcb69b7b757792
13109] 
13110[web: don't use %d formatting on a potentially large negative float -- there is a bug in Python 2.5 in that case
13111jacob.lyles@gmail.com**20100730220550
13112 Ignore-this: 7080eb4bddbcce29cba5447f8f4872ee
13113 fixes #1055
13114] 
13115[test_upload.py: rename test_problem_layout_ticket1124 to test_problem_layout_ticket_1124 -- fix .todo reference.
13116david-sarah@jacaranda.org**20100729152927
13117 Ignore-this: c8fe1047edcc83c87b9feb47f4aa587b
13118] 
13119[test_upload.py: rename test_problem_layout_ticket1124 to test_problem_layout_ticket_1124 for consistency.
13120david-sarah@jacaranda.org**20100729142250
13121 Ignore-this: bc3aad5919ae9079ceb9968ad0f5ea5a
13122] 
13123[docs: fix licensing typo that was earlier fixed in [20090921164651-92b7f-7f97b58101d93dc588445c52a9aaa56a2c7ae336]
13124zooko@zooko.com**20100729052923
13125 Ignore-this: a975d79115911688e5469d4d869e1664
13126 I wish we didn't have copies of this licensing text in several different files, since changes can be accidentally omitted from some of them.
13127] 
13128[misc/build_helpers/run-with-pythonpath.py: fix stale comment, and remove 'trial' example that is not the right way to run trial.
13129david-sarah@jacaranda.org**20100726225729
13130 Ignore-this: a61f55557ad69a1633bfb2b8172cce97
13131] 
13132[docs/specifications/dirnodes.txt: 'mesh'->'grid'.
13133david-sarah@jacaranda.org**20100723061616
13134 Ignore-this: 887bcf921ef00afba8e05e9239035bca
13135] 
13136[docs/specifications/dirnodes.txt: bring layer terminology up-to-date with architecture.txt, and a few other updates (e.g. note that the MAC is no longer verified, and that URIs can be unknown). Also 'Tahoe'->'Tahoe-LAFS'.
13137david-sarah@jacaranda.org**20100723054703
13138 Ignore-this: f3b98183e7d0a0f391225b8b93ac6c37
13139] 
13140[docs: use current cap to Zooko's wiki page in example text
13141zooko@zooko.com**20100721010543
13142 Ignore-this: 4f36f36758f9fdbaf9eb73eac23b6652
13143 fixes #1134
13144] 
13145[__init__.py: silence DeprecationWarning about BaseException.message globally. fixes #1129
13146david-sarah@jacaranda.org**20100720011939
13147 Ignore-this: 38808986ba79cb2786b010504a22f89
13148] 
13149[test_runner: test that 'tahoe --version' outputs no noise (e.g. DeprecationWarnings).
13150david-sarah@jacaranda.org**20100720011345
13151 Ignore-this: dd358b7b2e5d57282cbe133e8069702e
13152] 
13153[TAG allmydata-tahoe-1.7.1
13154zooko@zooko.com**20100719131352
13155 Ignore-this: 6942056548433dc653a746703819ad8c
13156] 
13157Patch bundle hash:
13158d4748b4f49d18057c690695eaa8563b6e985f428