Ticket #393: 393status16.dpatch

File 393status16.dpatch, 430.9 KB (added by kevan at 2010-07-08T00:32:39Z)
1Thu Jun 24 16:46:37 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
2  * Misc. changes to support the work I'm doing
3 
4      - Add a notion of file version number to interfaces.py
5      - Alter mutable file node interfaces to have a notion of version,
6        though this may be changed later.
7      - Alter mutable/filenode.py to conform to these changes.
8      - Add a salt hasher to util/hashutil.py
9
10Thu Jun 24 16:48:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
11  * nodemaker.py: create MDMF files when asked to
12
13Thu Jun 24 16:49:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
14  * storage/server.py: minor code cleanup
15
16Thu Jun 24 16:49:24 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
17  * test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
18
19Fri Jun 25 17:35:20 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
20  * test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files; change users of corrupt() to use the new definition
21
22Sat Jun 26 16:41:18 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
23  * Alter the ServermapUpdater to find MDMF files
24 
25  The ServermapUpdater should find MDMF files on a grid in the same way
26  that it finds SDMF files. This patch makes it do that.
27
28Sat Jun 26 16:42:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
29  * Make a segmented mutable uploader
30 
31  The mutable file uploader should be able to publish files with one
32  segment and files with multiple segments. This patch makes it do that.
33  This is still incomplete, and rather ugly -- I need to flesh out error
34  handling, I need to write tests, and I need to remove some of the uglier
35  kludges in the process before I can call this done.
36
37Sat Jun 26 16:43:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
38  * Write a segmented mutable downloader
39 
40  The segmented mutable downloader can deal with MDMF files (files with
41  one or more segments in MDMF format) and SDMF files (files with one
42  segment in SDMF format). It is backwards compatible with the old
43  file format.
44 
45  This patch also contains tests for the segmented mutable downloader.
46
47Mon Jun 28 15:50:48 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
48  * mutable/checker.py: check MDMF files
49 
50  This patch adapts the mutable file checker and verifier to check and
51  verify MDMF files. It does this by using the new segmented downloader,
52  which is trained to perform verification operations on request. This
53  removes some code duplication.
54
55Mon Jun 28 15:52:01 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
56  * mutable/retrieve.py: learn how to verify mutable files
57
58Wed Jun 30 11:33:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
59  * interfaces.py: add IMutableSlotWriter
60
61Thu Jul  1 16:28:06 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
62  * test/test_mutable.py: temporarily disable two tests that are now irrelevant
63
64Fri Jul  2 15:55:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
65  * Add MDMF reader and writer, and SDMF writer
66 
67  The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
68  object proxies that exist for immutable files. They abstract away
69  details of connection, state, and caching from their callers (in this
70  case, the downloader, servermap updater, and uploader), and expose methods
71  to get and set information on the remote server.
72 
73  MDMFSlotReadProxy reads a mutable file from the server, doing the right
74  thing (in most cases) regardless of whether the file is MDMF or SDMF. It
75  allows callers to tell it how to batch and flush reads.
76 
77  MDMFSlotWriteProxy writes an MDMF mutable file to a server.
78 
79  SDMFSlotWriteProxy writes an SDMF mutable file to a server.
80 
81  This patch also includes tests for MDMFSlotReadProxy,
82  SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
83
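The read proxy and the MDMF write proxy both show up later in this bundle: corrupt() and the ServermapUpdater use the former, and the segmented uploader constructs the latter, one per destination share. A hedged sketch of the write proxy's constructor call as it appears there, with each argument standing in for a value the uploader already tracks (secrets is the (write_enabler, renew_secret, cancel_secret) triple for the target server):

    from allmydata.mutable.layout import MDMFSlotWriteProxy

    def make_writer(rref, storage_index, secrets, seqnum, shnum,
                    required_shares, total_shares, segment_size, datalength):
        # Argument order mirrors the call in the segmented-uploader patch
        # below; rref is the storage server's remote reference.
        return MDMFSlotWriteProxy(shnum, rref, storage_index, secrets,
                                  seqnum, required_shares, total_shares,
                                  segment_size, datalength)
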
84Fri Jul  2 15:55:54 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
85  * mutable/publish.py: cleanup + simplification
86
87Fri Jul  2 15:57:10 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
88  * test/test_mutable.py: remove tests that are no longer relevant
89
90Tue Jul  6 14:52:17 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
91  * interfaces.py: create IMutableUploadable
92
93Tue Jul  6 14:52:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
94  * mutable/publish.py: add MutableDataHandle and MutableFileHandle
95
96Tue Jul  6 14:55:41 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
97  * mutable/publish.py: reorganize in preparation of file-like uploadables
98
99Tue Jul  6 14:56:49 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
100  * test/test_mutable.py: write tests for MutableFileHandle and MutableDataHandle
101
102Wed Jul  7 17:00:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
103  * Alter tests to work with the new APIs
104
105Wed Jul  7 17:07:32 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
106  * Alter mutable files to use file-like objects for publishing instead of strings.
107
108New patches:
109
110[Misc. changes to support the work I'm doing
111Kevan Carstensen <kevan@isnotajoke.com>**20100624234637
112 Ignore-this: fdd18fa8cc05f4b4b15ff53ee24a1819
113 
114     - Add a notion of file version number to interfaces.py
115     - Alter mutable file node interfaces to have a notion of version,
116       though this may be changed later.
117     - Alter mutable/filenode.py to conform to these changes.
118     - Add a salt hasher to util/hashutil.py
119] {
120hunk ./src/allmydata/interfaces.py 7
121      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
122 
123 HASH_SIZE=32
124+SALT_SIZE=16
125+
126+SDMF_VERSION=0
127+MDMF_VERSION=1
128 
129 Hash = StringConstraint(maxLength=HASH_SIZE,
130                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
131hunk ./src/allmydata/interfaces.py 811
132         writer-visible data using this writekey.
133         """
134 
135+    def set_version(version):
136+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
137+        we upload in SDMF for reasons of compatibility. If you want to
138+        change this, set_version will let you do that.
139+
140+        To say that this file should be uploaded in SDMF, pass in a 0. To
141+        say that the file should be uploaded as MDMF, pass in a 1.
142+        """
143+
144+    def get_version():
145+        """Returns the mutable file protocol version."""
146+
147 class NotEnoughSharesError(Exception):
148     """Download was unable to get enough shares"""
149 
150hunk ./src/allmydata/mutable/filenode.py 8
151 from twisted.internet import defer, reactor
152 from foolscap.api import eventually
153 from allmydata.interfaces import IMutableFileNode, \
154-     ICheckable, ICheckResults, NotEnoughSharesError
155+     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
156 from allmydata.util import hashutil, log
157 from allmydata.util.assertutil import precondition
158 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
159hunk ./src/allmydata/mutable/filenode.py 67
160         self._sharemap = {} # known shares, shnum-to-[nodeids]
161         self._cache = ResponseCache()
162         self._most_recent_size = None
163+        # filled in after __init__ if we're being created for the first time;
164+        # filled in by the servermap updater before publishing, otherwise.
165+        # set to this default value in case neither of those things happen,
166+        # or in case the servermap can't find any shares to tell us what
167+        # to publish as.
168+        # TODO: Set this back to None, and find out why the tests fail
169+        #       with it set to None.
170+        self._protocol_version = SDMF_VERSION
171 
172         # all users of this MutableFileNode go through the serializer. This
173         # takes advantage of the fact that Deferreds discard the callbacks
174hunk ./src/allmydata/mutable/filenode.py 472
175     def _did_upload(self, res, size):
176         self._most_recent_size = size
177         return res
178+
179+
180+    def set_version(self, version):
181+        # I can be set in two ways:
182+        #  1. When the node is created.
183+        #  2. (for an existing share) when the Servermap is updated
184+        #     before I am read.
185+        assert version in (MDMF_VERSION, SDMF_VERSION)
186+        self._protocol_version = version
187+
188+
189+    def get_version(self):
190+        return self._protocol_version
191hunk ./src/allmydata/util/hashutil.py 90
192 MUTABLE_READKEY_TAG = "allmydata_mutable_writekey_to_readkey_v1"
193 MUTABLE_DATAKEY_TAG = "allmydata_mutable_readkey_to_datakey_v1"
194 MUTABLE_STORAGEINDEX_TAG = "allmydata_mutable_readkey_to_storage_index_v1"
195+MUTABLE_SALT_TAG = "allmydata_mutable_segment_salt_v1"
196 
197 # dirnodes
198 DIRNODE_CHILD_WRITECAP_TAG = "allmydata_mutable_writekey_and_salt_to_dirnode_child_capkey_v1"
199hunk ./src/allmydata/util/hashutil.py 134
200 def plaintext_segment_hasher():
201     return tagged_hasher(PLAINTEXT_SEGMENT_TAG)
202 
203+def mutable_salt_hash(data):
204+    return tagged_hash(MUTABLE_SALT_TAG, data)
205+def mutable_salt_hasher():
206+    return tagged_hasher(MUTABLE_SALT_TAG)
207+
208 KEYLEN = 16
209 IVLEN = 16
210 
211}
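Taken together, these hunks add the SDMF_VERSION/MDMF_VERSION constants, the set_version()/get_version() accessors on mutable file nodes, and a mutable_salt_hash() helper. A minimal usage sketch, assuming a Tahoe-LAFS tree with this patch applied (node stands for any MutableFileNode):

    from allmydata.interfaces import SDMF_VERSION, MDMF_VERSION
    from allmydata.util import hashutil

    def describe_protocol(node):
        # get_version() returns one of the two constants above; set_version()
        # accepts only those two values and asserts on anything else.
        return "MDMF" if node.get_version() == MDMF_VERSION else "SDMF"

    # The new salt hasher is just a tagged hash over the supplied data.
    salt = hashutil.mutable_salt_hash("segment 0 plaintext")
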
212[nodemaker.py: create MDMF files when asked to
213Kevan Carstensen <kevan@isnotajoke.com>**20100624234833
214 Ignore-this: 26c16aaca9ddab7a7ce37a4530bc970
215] {
216hunk ./src/allmydata/nodemaker.py 3
217 import weakref
218 from zope.interface import implements
219-from allmydata.interfaces import INodeMaker
220+from allmydata.util.assertutil import precondition
221+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
222+                                 SDMF_VERSION, MDMF_VERSION
223 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
224 from allmydata.immutable.upload import Data
225 from allmydata.mutable.filenode import MutableFileNode
226hunk ./src/allmydata/nodemaker.py 92
227             return self._create_dirnode(filenode)
228         return None
229 
230-    def create_mutable_file(self, contents=None, keysize=None):
231+    def create_mutable_file(self, contents=None, keysize=None,
232+                            version=SDMF_VERSION):
233         n = MutableFileNode(self.storage_broker, self.secret_holder,
234                             self.default_encoding_parameters, self.history)
235hunk ./src/allmydata/nodemaker.py 96
236+        n.set_version(version)
237         d = self.key_generator.generate(keysize)
238         d.addCallback(n.create_with_keys, contents)
239         d.addCallback(lambda res: n)
240hunk ./src/allmydata/nodemaker.py 102
241         return d
242 
243-    def create_new_mutable_directory(self, initial_children={}):
244+    def create_new_mutable_directory(self, initial_children={},
245+                                     version=SDMF_VERSION):
246+        # initial_children must have metadata (i.e. {} instead of None)
247+        for (name, (node, metadata)) in initial_children.iteritems():
248+            precondition(isinstance(metadata, dict),
249+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
250+            node.raise_error()
251         d = self.create_mutable_file(lambda n:
252hunk ./src/allmydata/nodemaker.py 110
253-                                     pack_children(n, initial_children))
254+                                     pack_children(n, initial_children),
255+                                     version)
256         d.addCallback(self._create_dirnode)
257         return d
258 
259}
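The effect of this patch is that the requested protocol version is threaded through the node maker into MutableFileNode.set_version() before key generation. A short usage sketch, assuming a patched tree and an existing NodeMaker instance (here called nodemaker):

    from allmydata.interfaces import MDMF_VERSION

    def create_mdmf_file(nodemaker, contents):
        # version defaults to SDMF_VERSION, so MDMF must be asked for explicitly.
        d = nodemaker.create_mutable_file(contents, version=MDMF_VERSION)
        return d  # fires with the newly created MutableFileNode
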
260[storage/server.py: minor code cleanup
261Kevan Carstensen <kevan@isnotajoke.com>**20100624234905
262 Ignore-this: 2358c531c39e48d3c8e56b62b5768228
263] {
264hunk ./src/allmydata/storage/server.py 569
265                                          self)
266         return share
267 
268-    def remote_slot_readv(self, storage_index, shares, readv):
269+    def remote_slot_readv(self, storage_index, shares, readvs):
270         start = time.time()
271         self.count("readv")
272         si_s = si_b2a(storage_index)
273hunk ./src/allmydata/storage/server.py 590
274             if sharenum in shares or not shares:
275                 filename = os.path.join(bucketdir, sharenum_s)
276                 msf = MutableShareFile(filename, self)
277-                datavs[sharenum] = msf.readv(readv)
278+                datavs[sharenum] = msf.readv(readvs)
279         log.msg("returning shares %s" % (datavs.keys(),),
280                 facility="tahoe.storage", level=log.NOISY, parent=lp)
281         self.add_latency("readv", time.time() - start)
282}
283[test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
284Kevan Carstensen <kevan@isnotajoke.com>**20100624234924
285 Ignore-this: afb86ec1fbdbfe1a5ef6f46f350273c0
286] {
287hunk ./src/allmydata/test/test_mutable.py 151
288             chr(ord(original[byte_offset]) ^ 0x01) +
289             original[byte_offset+1:])
290 
291+def add_two(original, byte_offset):
292+    # It isn't enough to simply flip the bit for the version number,
293+    # because 1 is a valid version number. So we add two instead.
294+    return (original[:byte_offset] +
295+            chr(ord(original[byte_offset]) ^ 0x02) +
296+            original[byte_offset+1:])
297+
298 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
299     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
300     # list of shnums to corrupt.
301hunk ./src/allmydata/test/test_mutable.py 187
302                 real_offset = offset1
303             real_offset = int(real_offset) + offset2 + offset_offset
304             assert isinstance(real_offset, int), offset
305-            shares[shnum] = flip_bit(data, real_offset)
306+            if offset1 == 0: # verbyte
307+                f = add_two
308+            else:
309+                f = flip_bit
310+            shares[shnum] = f(data, real_offset)
311     return res
312 
313 def make_storagebroker(s=None, num_peers=10):
314hunk ./src/allmydata/test/test_mutable.py 423
315         d.addCallback(_created)
316         return d
317 
318+
319     def test_modify_backoffer(self):
320         def _modifier(old_contents, servermap, first_time):
321             return old_contents + "line2"
322hunk ./src/allmydata/test/test_mutable.py 658
323         d.addCallback(_created)
324         return d
325 
326+
327     def _copy_shares(self, ignored, index):
328         shares = self._storage._peers
329         # we need a deep copy
330}
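A note on the add_two() helper introduced above: it XORs the version byte with 0x02 rather than literally adding two, but for the only two valid version values the result is the same, and that is the point -- flipping the low bit would merely turn a valid SDMF verbyte into an equally valid MDMF one, so the corruption would go unnoticed. A quick check of that reasoning:

    for verbyte in (0, 1):                       # SDMF_VERSION, MDMF_VERSION
        assert (verbyte ^ 0x01) in (0, 1)        # bit-flip still looks valid
        assert (verbyte ^ 0x02) == verbyte + 2   # same as adding two here
        assert (verbyte ^ 0x02) not in (0, 1)    # no longer a known version
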
331[test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files; change users of corrupt() to use the new definition
332Kevan Carstensen <kevan@isnotajoke.com>**20100626003520
333 Ignore-this: 836e59e2fde0535f6b4bea3468dc8244
334] {
335hunk ./src/allmydata/test/test_mutable.py 168
336                 and shnum not in shnums_to_corrupt):
337                 continue
338             data = shares[shnum]
339-            (version,
340-             seqnum,
341-             root_hash,
342-             IV,
343-             k, N, segsize, datalen,
344-             o) = unpack_header(data)
345-            if isinstance(offset, tuple):
346-                offset1, offset2 = offset
347-            else:
348-                offset1 = offset
349-                offset2 = 0
350-            if offset1 == "pubkey":
351-                real_offset = 107
352-            elif offset1 in o:
353-                real_offset = o[offset1]
354-            else:
355-                real_offset = offset1
356-            real_offset = int(real_offset) + offset2 + offset_offset
357-            assert isinstance(real_offset, int), offset
358-            if offset1 == 0: # verbyte
359-                f = add_two
360-            else:
361-                f = flip_bit
362-            shares[shnum] = f(data, real_offset)
363-    return res
364+            # We're feeding the reader all of the share data, so it
365+            # won't need to use the rref that we didn't provide, nor the
366+            # storage index that we didn't provide. We do this because
367+            # the reader will work for both MDMF and SDMF.
368+            reader = MDMFSlotReadProxy(None, None, shnum, data)
369+            # We need to get the offsets for the next part.
370+            d = reader.get_verinfo()
371+            def _do_corruption(verinfo, data, shnum):
372+                (seqnum,
373+                 root_hash,
374+                 IV,
375+                 segsize,
376+                 datalen,
377+                 k, n, prefix, o) = verinfo
378+                if isinstance(offset, tuple):
379+                    offset1, offset2 = offset
380+                else:
381+                    offset1 = offset
382+                    offset2 = 0
383+                if offset1 == "pubkey":
384+                    real_offset = 107
385+                elif offset1 in o:
386+                    real_offset = o[offset1]
387+                else:
388+                    real_offset = offset1
389+                real_offset = int(real_offset) + offset2 + offset_offset
390+                assert isinstance(real_offset, int), offset
391+                if offset1 == 0: # verbyte
392+                    f = add_two
393+                else:
394+                    f = flip_bit
395+                shares[shnum] = f(data, real_offset)
396+            d.addCallback(_do_corruption, data, shnum)
397+            ds.append(d)
398+    dl = defer.DeferredList(ds)
399+    dl.addCallback(lambda ignored: res)
400+    return dl
401 
402 def make_storagebroker(s=None, num_peers=10):
403     if not s:
404hunk ./src/allmydata/test/test_mutable.py 1177
405         return d
406 
407     def test_download_fails(self):
408-        corrupt(None, self._storage, "signature")
409-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
410+        d = corrupt(None, self._storage, "signature")
411+        d.addCallback(lambda ignored:
412+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
413                             "no recoverable versions",
414                             self._fn.download_best_version)
415         return d
416hunk ./src/allmydata/test/test_mutable.py 1232
417         return d
418 
419     def test_check_all_bad_sig(self):
420-        corrupt(None, self._storage, 1) # bad sig
421-        d = self._fn.check(Monitor())
422+        d = corrupt(None, self._storage, 1) # bad sig
423+        d.addCallback(lambda ignored:
424+            self._fn.check(Monitor()))
425         d.addCallback(self.check_bad, "test_check_all_bad_sig")
426         return d
427 
428hunk ./src/allmydata/test/test_mutable.py 1239
429     def test_check_all_bad_blocks(self):
430-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
431+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
432         # the Checker won't notice this.. it doesn't look at actual data
433hunk ./src/allmydata/test/test_mutable.py 1241
434-        d = self._fn.check(Monitor())
435+        d.addCallback(lambda ignored:
436+            self._fn.check(Monitor()))
437         d.addCallback(self.check_good, "test_check_all_bad_blocks")
438         return d
439 
440hunk ./src/allmydata/test/test_mutable.py 1252
441         return d
442 
443     def test_verify_all_bad_sig(self):
444-        corrupt(None, self._storage, 1) # bad sig
445-        d = self._fn.check(Monitor(), verify=True)
446+        d = corrupt(None, self._storage, 1) # bad sig
447+        d.addCallback(lambda ignored:
448+            self._fn.check(Monitor(), verify=True))
449         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
450         return d
451 
452hunk ./src/allmydata/test/test_mutable.py 1259
453     def test_verify_one_bad_sig(self):
454-        corrupt(None, self._storage, 1, [9]) # bad sig
455-        d = self._fn.check(Monitor(), verify=True)
456+        d = corrupt(None, self._storage, 1, [9]) # bad sig
457+        d.addCallback(lambda ignored:
458+            self._fn.check(Monitor(), verify=True))
459         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
460         return d
461 
462hunk ./src/allmydata/test/test_mutable.py 1266
463     def test_verify_one_bad_block(self):
464-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
465+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
466         # the Verifier *will* notice this, since it examines every byte
467hunk ./src/allmydata/test/test_mutable.py 1268
468-        d = self._fn.check(Monitor(), verify=True)
469+        d.addCallback(lambda ignored:
470+            self._fn.check(Monitor(), verify=True))
471         d.addCallback(self.check_bad, "test_verify_one_bad_block")
472         d.addCallback(self.check_expected_failure,
473                       CorruptShareError, "block hash tree failure",
474hunk ./src/allmydata/test/test_mutable.py 1277
475         return d
476 
477     def test_verify_one_bad_sharehash(self):
478-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
479-        d = self._fn.check(Monitor(), verify=True)
480+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
481+        d.addCallback(lambda ignored:
482+            self._fn.check(Monitor(), verify=True))
483         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
484         d.addCallback(self.check_expected_failure,
485                       CorruptShareError, "corrupt hashes",
486hunk ./src/allmydata/test/test_mutable.py 1287
487         return d
488 
489     def test_verify_one_bad_encprivkey(self):
490-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
491-        d = self._fn.check(Monitor(), verify=True)
492+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
493+        d.addCallback(lambda ignored:
494+            self._fn.check(Monitor(), verify=True))
495         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
496         d.addCallback(self.check_expected_failure,
497                       CorruptShareError, "invalid privkey",
498hunk ./src/allmydata/test/test_mutable.py 1297
499         return d
500 
501     def test_verify_one_bad_encprivkey_uncheckable(self):
502-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
503+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
504         readonly_fn = self._fn.get_readonly()
505         # a read-only node has no way to validate the privkey
506hunk ./src/allmydata/test/test_mutable.py 1300
507-        d = readonly_fn.check(Monitor(), verify=True)
508+        d.addCallback(lambda ignored:
509+            readonly_fn.check(Monitor(), verify=True))
510         d.addCallback(self.check_good,
511                       "test_verify_one_bad_encprivkey_uncheckable")
512         return d
513}
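The structural change in this patch is that corrupt() now consults MDMFSlotReadProxy.get_verinfo() for the offset table, so it returns a Deferred instead of mutating the shares synchronously; every caller therefore has to chain on it, as the hunks above show. The pattern, sketched as a hypothetical test method (self._storage, self._fn, Monitor, and check_bad all come from the surrounding test fixtures):

    def test_verify_after_signature_corruption(self):
        d = corrupt(None, self._storage, 1)      # offset 1 corrupts the signature
        d.addCallback(lambda ignored:
            self._fn.check(Monitor(), verify=True))
        d.addCallback(self.check_bad, "test_verify_after_signature_corruption")
        return d
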
514[Alter the ServermapUpdater to find MDMF files
515Kevan Carstensen <kevan@isnotajoke.com>**20100626234118
516 Ignore-this: 25f6278209c2983ba8f307cfe0fde0
517 
518 The ServermapUpdater should find MDMF files on a grid in the same way
519 that it finds SDMF files. This patch makes it do that.
520] {
521hunk ./src/allmydata/mutable/servermap.py 7
522 from itertools import count
523 from twisted.internet import defer
524 from twisted.python import failure
525-from foolscap.api import DeadReferenceError, RemoteException, eventually
526+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
527+                         fireEventually
528 from allmydata.util import base32, hashutil, idlib, log
529 from allmydata.storage.server import si_b2a
530 from allmydata.interfaces import IServermapUpdaterStatus
531hunk ./src/allmydata/mutable/servermap.py 17
532 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
533      DictOfSets, CorruptShareError, NeedMoreDataError
534 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
535-     SIGNED_PREFIX_LENGTH
536+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
537 
538 class UpdateStatus:
539     implements(IServermapUpdaterStatus)
540hunk ./src/allmydata/mutable/servermap.py 254
541         """Return a set of versionids, one for each version that is currently
542         recoverable."""
543         versionmap = self.make_versionmap()
544-
545         recoverable_versions = set()
546         for (verinfo, shares) in versionmap.items():
547             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
548hunk ./src/allmydata/mutable/servermap.py 366
549         self._servers_responded = set()
550 
551         # how much data should we read?
552+        # SDMF:
553         #  * if we only need the checkstring, then [0:75]
554         #  * if we need to validate the checkstring sig, then [543ish:799ish]
555         #  * if we need the verification key, then [107:436ish]
556hunk ./src/allmydata/mutable/servermap.py 374
557         #  * if we need the encrypted private key, we want [-1216ish:]
558         #   * but we can't read from negative offsets
559         #   * the offset table tells us the 'ish', also the positive offset
560-        # A future version of the SMDF slot format should consider using
561-        # fixed-size slots so we can retrieve less data. For now, we'll just
562-        # read 2000 bytes, which also happens to read enough actual data to
563-        # pre-fetch a 9-entry dirnode.
564+        # MDMF:
565+        #  * Checkstring? [0:72]
566+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
567+        #    the offset table will tell us for sure.
568+        #  * If we need the verification key, we have to consult the offset
569+        #    table as well.
570+        # At this point, we don't know which we are. Our filenode can
571+        # tell us, but it might be lying -- in some cases, we're
572+        # responsible for telling it which kind of file it is.
573         self._read_size = 4000
574         if mode == MODE_CHECK:
575             # we use unpack_prefix_and_signature, so we need 1k
576hunk ./src/allmydata/mutable/servermap.py 432
577         self._queries_completed = 0
578 
579         sb = self._storage_broker
580+        # All of the peers, permuted by the storage index, as usual.
581         full_peerlist = sb.get_servers_for_index(self._storage_index)
582         self.full_peerlist = full_peerlist # for use later, immutable
583         self.extra_peers = full_peerlist[:] # peers are removed as we use them
584hunk ./src/allmydata/mutable/servermap.py 439
585         self._good_peers = set() # peers who had some shares
586         self._empty_peers = set() # peers who don't have any shares
587         self._bad_peers = set() # peers to whom our queries failed
588+        self._readers = {} # peerid -> dict(sharewriters), filled in
589+                           # after responses come in.
590 
591         k = self._node.get_required_shares()
592hunk ./src/allmydata/mutable/servermap.py 443
593+        # For what cases can these conditions work?
594         if k is None:
595             # make a guess
596             k = 3
597hunk ./src/allmydata/mutable/servermap.py 456
598         self.num_peers_to_query = k + self.EPSILON
599 
600         if self.mode == MODE_CHECK:
601+            # We want to query all of the peers.
602             initial_peers_to_query = dict(full_peerlist)
603             must_query = set(initial_peers_to_query.keys())
604             self.extra_peers = []
605hunk ./src/allmydata/mutable/servermap.py 464
606             # we're planning to replace all the shares, so we want a good
607             # chance of finding them all. We will keep searching until we've
608             # seen epsilon that don't have a share.
609+            # We don't query all of the peers because that could take a while.
610             self.num_peers_to_query = N + self.EPSILON
611             initial_peers_to_query, must_query = self._build_initial_querylist()
612             self.required_num_empty_peers = self.EPSILON
613hunk ./src/allmydata/mutable/servermap.py 474
614             # might also avoid the round trip required to read the encrypted
615             # private key.
616 
617-        else:
618+        else: # MODE_READ, MODE_ANYTHING
619+            # 2k peers is good enough.
620             initial_peers_to_query, must_query = self._build_initial_querylist()
621 
622         # this is a set of peers that we are required to get responses from:
623hunk ./src/allmydata/mutable/servermap.py 490
624         # before we can consider ourselves finished, and self.extra_peers
625         # contains the overflow (peers that we should tap if we don't get
626         # enough responses)
627+        # I guess that self._must_query is a subset of
628+        # initial_peers_to_query?
629+        assert set(must_query).issubset(set(initial_peers_to_query))
630 
631         self._send_initial_requests(initial_peers_to_query)
632         self._status.timings["initial_queries"] = time.time() - self._started
633hunk ./src/allmydata/mutable/servermap.py 549
634         # errors that aren't handled by _query_failed (and errors caused by
635         # _query_failed) get logged, but we still want to check for doneness.
636         d.addErrback(log.err)
637-        d.addBoth(self._check_for_done)
638         d.addErrback(self._fatal_error)
639hunk ./src/allmydata/mutable/servermap.py 550
640+        d.addCallback(self._check_for_done)
641         return d
642 
643     def _do_read(self, ss, peerid, storage_index, shnums, readv):
644hunk ./src/allmydata/mutable/servermap.py 569
645         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
646         return d
647 
648+
649+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
650+        """
651+        I am called when a remote server returns a corrupt share in
652+        response to one of our queries. By corrupt, I mean a share
653+        without a valid signature. I then record the failure, notify the
654+        server of the corruption, and record the share as bad.
655+        """
656+        f = failure.Failure(e)
657+        self.log(format="bad share: %(f_value)s", f_value=str(f),
658+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
659+        # Notify the server that its share is corrupt.
660+        self.notify_server_corruption(peerid, shnum, str(e))
661+        # By flagging this as a bad peer, we won't count any of
662+        # the other shares on that peer as valid, though if we
663+        # happen to find a valid version string amongst those
664+        # shares, we'll keep track of it so that we don't need
665+        # to validate the signature on those again.
666+        self._bad_peers.add(peerid)
667+        self._last_failure = f
668+        # XXX: Use the reader for this?
669+        checkstring = data[:SIGNED_PREFIX_LENGTH]
670+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
671+        self._servermap.problems.append(f)
672+
673+
674+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
675+        """
676+        If one of my queries returns successfully (which means that we
677+        were able to and successfully did validate the signature), I
678+        cache the data that we initially fetched from the storage
679+        server. This will help reduce the number of roundtrips that need
680+        to occur when the file is downloaded, or when the file is
681+        updated.
682+        """
683+        self._node._add_to_cache(verinfo, shnum, 0, data, now)
684+
685+
686     def _got_results(self, datavs, peerid, readsize, stuff, started):
687         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
688                       peerid=idlib.shortnodeid_b2a(peerid),
689hunk ./src/allmydata/mutable/servermap.py 630
690         else:
691             self._empty_peers.add(peerid)
692 
693-        last_verinfo = None
694-        last_shnum = None
695+        ss, storage_index = stuff
696+        ds = []
697+
698         for shnum,datav in datavs.items():
699             data = datav[0]
700hunk ./src/allmydata/mutable/servermap.py 635
701-            try:
702-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
703-                last_verinfo = verinfo
704-                last_shnum = shnum
705-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
706-            except CorruptShareError, e:
707-                # log it and give the other shares a chance to be processed
708-                f = failure.Failure()
709-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
710-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
711-                self.notify_server_corruption(peerid, shnum, str(e))
712-                self._bad_peers.add(peerid)
713-                self._last_failure = f
714-                checkstring = data[:SIGNED_PREFIX_LENGTH]
715-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
716-                self._servermap.problems.append(f)
717-                pass
718-
719-        self._status.timings["cumulative_verify"] += (time.time() - now)
720+            reader = MDMFSlotReadProxy(ss,
721+                                       storage_index,
722+                                       shnum,
723+                                       data)
724+            self._readers.setdefault(peerid, dict())[shnum] = reader
725+            # our goal, with each response, is to validate the version
726+            # information and share data as best we can at this point --
727+            # we do this by validating the signature. To do this, we
728+            # need to do the following:
729+            #   - If we don't already have the public key, fetch the
730+            #     public key. We use this to validate the signature.
731+            if not self._node.get_pubkey():
732+                # fetch and set the public key.
733+                d = reader.get_verification_key()
734+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
735+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
736+                # XXX: Make self._pubkey_query_failed?
737+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
738+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
739+            else:
740+                # we already have the public key.
741+                d = defer.succeed(None)
742+            # Neither of these two branches return anything of
743+            # consequence, so the first entry in our deferredlist will
744+            # be None.
745 
746hunk ./src/allmydata/mutable/servermap.py 661
747-        if self._need_privkey and last_verinfo:
748-            # send them a request for the privkey. We send one request per
749-            # server.
750-            lp2 = self.log("sending privkey request",
751-                           parent=lp, level=log.NOISY)
752-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
753-             offsets_tuple) = last_verinfo
754-            o = dict(offsets_tuple)
755+            # - Next, we need the version information. We almost
756+            #   certainly got this by reading the first thousand or so
757+            #   bytes of the share on the storage server, so we
758+            #   shouldn't need to fetch anything at this step.
759+            d2 = reader.get_verinfo()
760+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
761+                self._got_corrupt_share(error, shnum, peerid, data, lp))
762+            # - Next, we need the signature. For an SDMF share, it is
763+            #   likely that we fetched this when doing our initial fetch
764+            #   to get the version information. In MDMF, this lives at
765+            #   the end of the share, so unless the file is quite small,
766+            #   we'll need to do a remote fetch to get it.
767+            d3 = reader.get_signature()
768+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
769+                self._got_corrupt_share(error, shnum, peerid, data, lp))
770+            #  Once we have all three of these responses, we can move on
771+            #  to validating the signature
772 
773hunk ./src/allmydata/mutable/servermap.py 679
774-            self._queries_outstanding.add(peerid)
775-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
776-            ss = self._servermap.connections[peerid]
777-            privkey_started = time.time()
778-            d = self._do_read(ss, peerid, self._storage_index,
779-                              [last_shnum], readv)
780-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
781-                          privkey_started, lp2)
782-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
783-            d.addErrback(log.err)
784-            d.addCallback(self._check_for_done)
785-            d.addErrback(self._fatal_error)
786+            # Does the node already have a privkey? If not, we'll try to
787+            # fetch it here.
788+            if self._need_privkey:
789+                d4 = reader.get_encprivkey()
790+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
791+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
792+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
793+                    self._privkey_query_failed(error, shnum, data, lp))
794+            else:
795+                d4 = defer.succeed(None)
796 
797hunk ./src/allmydata/mutable/servermap.py 690
798+            dl = defer.DeferredList([d, d2, d3, d4])
799+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
800+                self._got_signature_one_share(results, shnum, peerid, lp))
801+            dl.addErrback(lambda error, shnum=shnum, data=data:
802+               self._got_corrupt_share(error, shnum, peerid, data, lp))
803+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
804+                self._cache_good_sharedata(verinfo, shnum, now, data))
805+            ds.append(dl)
806+        # dl is a deferred list that will fire when all of the shares
807+        # that we found on this peer are done processing. When dl fires,
808+        # we know that processing is done, so we can decrement the
809+        # semaphore-like thing that we incremented earlier.
810+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
811+        # Are we done? Done means that there are no more queries to
812+        # send, that there are no outstanding queries, and that we
813+        # haven't received any queries that are still processing. If we
814+        # are done, self._check_for_done will cause the done deferred
815+        # that we returned to our caller to fire, which tells them that
816+        # they have a complete servermap, and that we won't be touching
817+        # the servermap anymore.
818+        dl.addCallback(self._check_for_done)
819+        dl.addErrback(self._fatal_error)
820         # all done!
821         self.log("_got_results done", parent=lp, level=log.NOISY)
822hunk ./src/allmydata/mutable/servermap.py 714
823+        return dl
824+
825+
826+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
827+        if self._node.get_pubkey():
828+            return # don't go through this again if we don't have to
829+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
830+        assert len(fingerprint) == 32
831+        if fingerprint != self._node.get_fingerprint():
832+            raise CorruptShareError(peerid, shnum,
833+                                "pubkey doesn't match fingerprint")
834+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
835+        assert self._node.get_pubkey()
836+
837 
838     def notify_server_corruption(self, peerid, shnum, reason):
839         ss = self._servermap.connections[peerid]
840hunk ./src/allmydata/mutable/servermap.py 734
841         ss.callRemoteOnly("advise_corrupt_share",
842                           "mutable", self._storage_index, shnum, reason)
843 
844-    def _got_results_one_share(self, shnum, data, peerid, lp):
845+
846+    def _got_signature_one_share(self, results, shnum, peerid, lp):
847+        # It is our job to give versioninfo to our caller. We need to
848+        # raise CorruptShareError if the share is corrupt for any
849+        # reason, something that our caller will handle.
850         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
851                  shnum=shnum,
852                  peerid=idlib.shortnodeid_b2a(peerid),
853hunk ./src/allmydata/mutable/servermap.py 744
854                  level=log.NOISY,
855                  parent=lp)
856-
857-        # this might raise NeedMoreDataError, if the pubkey and signature
858-        # live at some weird offset. That shouldn't happen, so I'm going to
859-        # treat it as a bad share.
860-        (seqnum, root_hash, IV, k, N, segsize, datalength,
861-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
862-
863-        if not self._node.get_pubkey():
864-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
865-            assert len(fingerprint) == 32
866-            if fingerprint != self._node.get_fingerprint():
867-                raise CorruptShareError(peerid, shnum,
868-                                        "pubkey doesn't match fingerprint")
869-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
870-
871-        if self._need_privkey:
872-            self._try_to_extract_privkey(data, peerid, shnum, lp)
873-
874-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
875-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
876+        _, verinfo, signature, __ = results
877+        (seqnum,
878+         root_hash,
879+         saltish,
880+         segsize,
881+         datalen,
882+         k,
883+         n,
884+         prefix,
885+         offsets) = verinfo[1]
886         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
887 
888hunk ./src/allmydata/mutable/servermap.py 756
889-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
890+        # XXX: This should be done for us in the method, so
891+        # presumably you can go in there and fix it.
892+        verinfo = (seqnum,
893+                   root_hash,
894+                   saltish,
895+                   segsize,
896+                   datalen,
897+                   k,
898+                   n,
899+                   prefix,
900                    offsets_tuple)
901hunk ./src/allmydata/mutable/servermap.py 767
902+        # This tuple uniquely identifies a share on the grid; we use it
903+        # to keep track of the ones that we've already seen.
904 
905         if verinfo not in self._valid_versions:
906hunk ./src/allmydata/mutable/servermap.py 771
907-            # it's a new pair. Verify the signature.
908-            valid = self._node.get_pubkey().verify(prefix, signature)
909+            # This is a new version tuple, and we need to validate it
910+            # against the public key before keeping track of it.
911+            assert self._node.get_pubkey()
912+            valid = self._node.get_pubkey().verify(prefix, signature[1])
913             if not valid:
914hunk ./src/allmydata/mutable/servermap.py 776
915-                raise CorruptShareError(peerid, shnum, "signature is invalid")
916+                raise CorruptShareError(peerid, shnum,
917+                                        "signature is invalid")
918 
919hunk ./src/allmydata/mutable/servermap.py 779
920-            # ok, it's a valid verinfo. Add it to the list of validated
921-            # versions.
922-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
923-                     % (seqnum, base32.b2a(root_hash)[:4],
924-                        idlib.shortnodeid_b2a(peerid), shnum,
925-                        k, N, segsize, datalength),
926-                     parent=lp)
927-            self._valid_versions.add(verinfo)
928-        # We now know that this is a valid candidate verinfo.
929+        # ok, it's a valid verinfo. Add it to the list of validated
930+        # versions.
931+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
932+                 % (seqnum, base32.b2a(root_hash)[:4],
933+                    idlib.shortnodeid_b2a(peerid), shnum,
934+                    k, n, segsize, datalen),
935+                    parent=lp)
936+        self._valid_versions.add(verinfo)
937+        # We now know that this is a valid candidate verinfo. Whether or
938+        # not this instance of it is valid is a matter for the next
939+        # statement; at this point, we just know that if we see this
940+        # version info again, that its signature checks out and that
941+        # we're okay to skip the signature-checking step.
942 
943hunk ./src/allmydata/mutable/servermap.py 793
944+        # (peerid, shnum) are bound in the method invocation.
945         if (peerid, shnum) in self._servermap.bad_shares:
946             # we've been told that the rest of the data in this share is
947             # unusable, so don't add it to the servermap.
948hunk ./src/allmydata/mutable/servermap.py 808
949         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
950         return verinfo
951 
952+
953     def _deserialize_pubkey(self, pubkey_s):
954         verifier = rsa.create_verifying_key_from_string(pubkey_s)
955         return verifier
956hunk ./src/allmydata/mutable/servermap.py 813
957 
958-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
959-        try:
960-            r = unpack_share(data)
961-        except NeedMoreDataError, e:
962-            # this share won't help us. oh well.
963-            offset = e.encprivkey_offset
964-            length = e.encprivkey_length
965-            self.log("shnum %d on peerid %s: share was too short (%dB) "
966-                     "to get the encprivkey; [%d:%d] ought to hold it" %
967-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
968-                      offset, offset+length),
969-                     parent=lp)
970-            # NOTE: if uncoordinated writes are taking place, someone might
971-            # change the share (and most probably move the encprivkey) before
972-            # we get a chance to do one of these reads and fetch it. This
973-            # will cause us to see a NotEnoughSharesError(unable to fetch
974-            # privkey) instead of an UncoordinatedWriteError . This is a
975-            # nuisance, but it will go away when we move to DSA-based mutable
976-            # files (since the privkey will be small enough to fit in the
977-            # write cap).
978-
979-            return
980-
981-        (seqnum, root_hash, IV, k, N, segsize, datalen,
982-         pubkey, signature, share_hash_chain, block_hash_tree,
983-         share_data, enc_privkey) = r
984-
985-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
986 
987     def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
988hunk ./src/allmydata/mutable/servermap.py 815
989-
990+        """
991+        Given a writekey from a remote server, I validate it against the
992+        writekey stored in my node. If it is valid, then I set the
993+        privkey and encprivkey properties of the node.
994+        """
995         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
996         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
997         if alleged_writekey != self._node.get_writekey():
998hunk ./src/allmydata/mutable/servermap.py 892
999         self._queries_completed += 1
1000         self._last_failure = f
1001 
1002-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
1003-        now = time.time()
1004-        elapsed = now - started
1005-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
1006-        self._queries_outstanding.discard(peerid)
1007-        if not self._need_privkey:
1008-            return
1009-        if shnum not in datavs:
1010-            self.log("privkey wasn't there when we asked it",
1011-                     level=log.WEIRD, umid="VA9uDQ")
1012-            return
1013-        datav = datavs[shnum]
1014-        enc_privkey = datav[0]
1015-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
1016 
1017     def _privkey_query_failed(self, f, peerid, shnum, lp):
1018         self._queries_outstanding.discard(peerid)
1019hunk ./src/allmydata/mutable/servermap.py 906
1020         self._servermap.problems.append(f)
1021         self._last_failure = f
1022 
1023+
1024     def _check_for_done(self, res):
1025         # exit paths:
1026         #  return self._send_more_queries(outstanding) : send some more queries
1027hunk ./src/allmydata/mutable/servermap.py 912
1028         #  return self._done() : all done
1029         #  return : keep waiting, no new queries
1030-
1031         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
1032                               "%(outstanding)d queries outstanding, "
1033                               "%(extra)d extra peers available, "
1034hunk ./src/allmydata/mutable/servermap.py 1117
1035         self._servermap.last_update_time = self._started
1036         # the servermap will not be touched after this
1037         self.log("servermap: %s" % self._servermap.summarize_versions())
1038+
1039         eventually(self._done_deferred.callback, self._servermap)
1040 
1041     def _fatal_error(self, f):
1042hunk ./src/allmydata/test/test_mutable.py 637
1043         d.addCallback(_created)
1044         return d
1045 
1046-    def publish_multiple(self):
1047+    def publish_mdmf(self):
1048+        # like publish_one, except that the result is guaranteed to be
1049+        # an MDMF file.
1050+        # self.CONTENTS should have more than one segment.
1051+        self.CONTENTS = "This is an MDMF file" * 100000
1052+        self._storage = FakeStorage()
1053+        self._nodemaker = make_nodemaker(self._storage)
1054+        self._storage_broker = self._nodemaker.storage_broker
1055+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
1056+        def _created(node):
1057+            self._fn = node
1058+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1059+        d.addCallback(_created)
1060+        return d
1061+
1062+
1063+    def publish_sdmf(self):
1064+        # like publish_one, except that the result is guaranteed to be
1065+        # an SDMF file
1066+        self.CONTENTS = "This is an SDMF file" * 1000
1067+        self._storage = FakeStorage()
1068+        self._nodemaker = make_nodemaker(self._storage)
1069+        self._storage_broker = self._nodemaker.storage_broker
1070+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
1071+        def _created(node):
1072+            self._fn = node
1073+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1074+        d.addCallback(_created)
1075+        return d
1076+
1077+
1078+    def publish_multiple(self, version=0):
1079         self.CONTENTS = ["Contents 0",
1080                          "Contents 1",
1081                          "Contents 2",
1082hunk ./src/allmydata/test/test_mutable.py 677
1083         self._copied_shares = {}
1084         self._storage = FakeStorage()
1085         self._nodemaker = make_nodemaker(self._storage)
1086-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
1087+        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
1088         def _created(node):
1089             self._fn = node
1090             # now create multiple versions of the same file, and accumulate
1091hunk ./src/allmydata/test/test_mutable.py 906
1092         return d
1093 
1094 
1095+    def test_servermapupdater_finds_mdmf_files(self):
1096+        # setUp already published an MDMF file for us. We just need to
1097+        # make sure that when we run the ServermapUpdater, the file is
1098+        # reported to have one recoverable version.
1099+        d = defer.succeed(None)
1100+        d.addCallback(lambda ignored:
1101+            self.publish_mdmf())
1102+        d.addCallback(lambda ignored:
1103+            self.make_servermap(mode=MODE_CHECK))
1104+        # Calling make_servermap also updates the servermap in the mode
1105+        # that we specify, so we just need to see what it says.
1106+        def _check_servermap(sm):
1107+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
1108+        d.addCallback(_check_servermap)
1109+        return d
1110+
1111+
1112+    def test_servermapupdater_finds_sdmf_files(self):
1113+        d = defer.succeed(None)
1114+        d.addCallback(lambda ignored:
1115+            self.publish_sdmf())
1116+        d.addCallback(lambda ignored:
1117+            self.make_servermap(mode=MODE_CHECK))
1118+        d.addCallback(lambda servermap:
1119+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
1120+        return d
1121+
1122 
1123 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
1124     def setUp(self):
1125hunk ./src/allmydata/test/test_mutable.py 1050
1126         return d
1127     test_no_servers_download.timeout = 15
1128 
1129+
1130     def _test_corrupt_all(self, offset, substring,
1131                           should_succeed=False, corrupt_early=True,
1132                           failure_checker=None):
1133}
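For each share that comes back from a query, the updater above now wraps the response in an MDMFSlotReadProxy and fetches the verification key (if the node lacks one), the version info, the signature, and optionally the encrypted private key as separate deferreds, gathering them in a DeferredList before validating the signature. A condensed sketch of that per-share flow, assuming the proxy from the later layout patch is importable (rref and data stand for the storage server's remote reference and the bytes returned by the initial slot_readv):

    from twisted.internet import defer
    from allmydata.mutable.layout import MDMFSlotReadProxy

    def inspect_share(rref, storage_index, shnum, data):
        # The proxy answers from the cached 'data' when it can, and falls back
        # to the remote reference for anything (such as an MDMF signature at
        # the end of the share) outside the initially fetched range.
        reader = MDMFSlotReadProxy(rref, storage_index, shnum, data)
        d_key = reader.get_verification_key()
        d_verinfo = reader.get_verinfo()   # (seqnum, root_hash, salt/IV, segsize,
                                           #  datalen, k, n, prefix, offsets)
        d_sig = reader.get_signature()
        return defer.DeferredList([d_key, d_verinfo, d_sig])
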
1134[Make a segmented mutable uploader
1135Kevan Carstensen <kevan@isnotajoke.com>**20100626234204
1136 Ignore-this: d199af8ab0bc64d8ed2bc19c5437bfba
1137 
1138 The mutable file uploader should be able to publish files with one
1139 segment and files with multiple segments. This patch makes it do that.
1140 This is still incomplete, and rather ugly -- I need to flesh out error
1141 handling, I need to write tests, and I need to remove some of the uglier
1142 kludges in the process before I can call this done.
1143] {
1144hunk ./src/allmydata/mutable/publish.py 8
1145 from zope.interface import implements
1146 from twisted.internet import defer
1147 from twisted.python import failure
1148-from allmydata.interfaces import IPublishStatus
1149+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
1150 from allmydata.util import base32, hashutil, mathutil, idlib, log
1151 from allmydata import hashtree, codec
1152 from allmydata.storage.server import si_b2a
1153hunk ./src/allmydata/mutable/publish.py 19
1154      UncoordinatedWriteError, NotEnoughServersError
1155 from allmydata.mutable.servermap import ServerMap
1156 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
1157-     unpack_checkstring, SIGNED_PREFIX
1158+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
1159+
1160+KiB = 1024
1161+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
1162 
1163 class PublishStatus:
1164     implements(IPublishStatus)
1165hunk ./src/allmydata/mutable/publish.py 112
1166         self._status.set_helper(False)
1167         self._status.set_progress(0.0)
1168         self._status.set_active(True)
1169+        # We use this to control how the file is written.
1170+        version = self._node.get_version()
1171+        assert version in (SDMF_VERSION, MDMF_VERSION)
1172+        self._version = version
1173 
1174     def get_status(self):
1175         return self._status
1176hunk ./src/allmydata/mutable/publish.py 134
1177         simultaneous write.
1178         """
1179 
1180-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
1181-        # 2: perform peer selection, get candidate servers
1182-        #  2a: send queries to n+epsilon servers, to determine current shares
1183-        #  2b: based upon responses, create target map
1184-        # 3: send slot_testv_and_readv_and_writev messages
1185-        # 4: as responses return, update share-dispatch table
1186-        # 4a: may need to run recovery algorithm
1187-        # 5: when enough responses are back, we're done
1188+        # 0. Setup encoding parameters, encoder, and other such things.
1189+        # 1. Encrypt, encode, and publish segments.
1190 
1191         self.log("starting publish, datalen is %s" % len(newdata))
1192         self._status.set_size(len(newdata))
1193hunk ./src/allmydata/mutable/publish.py 187
1194         self.bad_peers = set() # peerids who have errbacked/refused requests
1195 
1196         self.newdata = newdata
1197-        self.salt = os.urandom(16)
1198 
1199hunk ./src/allmydata/mutable/publish.py 188
1200+        # This will set self.segment_size, self.num_segments, and
1201+        # self.fec.
1202         self.setup_encoding_parameters()
1203 
1204         # if we experience any surprises (writes which were rejected because
1205hunk ./src/allmydata/mutable/publish.py 238
1206             self.bad_share_checkstrings[key] = old_checkstring
1207             self.connections[peerid] = self._servermap.connections[peerid]
1208 
1209-        # create the shares. We'll discard these as they are delivered. SDMF:
1210-        # we're allowed to hold everything in memory.
1211+        # Now the process forks -- if this is an SDMF file, we need
1212+        # to write an SDMF file. Otherwise, we need to write an MDMF
1213+        # file.
1214+        if self._version == MDMF_VERSION:
1215+            return self._publish_mdmf()
1216+        else:
1217+            return self._publish_sdmf()
1218+        #return self.done_deferred
1219+
1220+    def _publish_mdmf(self):
1221+        # Next, we find homes for all of the shares that we don't have
1222+        # homes for yet.
1223+        # TODO: Make this part do peer selection.
1224+        self.update_goal()
1225+        self.writers = {}
1226+        # For each (peerid, shnum) in self.goal, we make an
1227+        # MDMFSlotWriteProxy for that peer. We'll use this to write
1228+        # shares to the peer.
1229+        for key in self.goal:
1230+            peerid, shnum = key
1231+            write_enabler = self._node.get_write_enabler(peerid)
1232+            renew_secret = self._node.get_renewal_secret(peerid)
1233+            cancel_secret = self._node.get_cancel_secret(peerid)
1234+            secrets = (write_enabler, renew_secret, cancel_secret)
1235+
1236+            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
1237+                                                      self.connections[peerid],
1238+                                                      self._storage_index,
1239+                                                      secrets,
1240+                                                      self._new_seqnum,
1241+                                                      self.required_shares,
1242+                                                      self.total_shares,
1243+                                                      self.segment_size,
1244+                                                      len(self.newdata))
1245+            if (peerid, shnum) in self._servermap.servermap:
1246+                old_versionid, old_timestamp = self._servermap.servermap[key]
1247+                (old_seqnum, old_root_hash, old_salt, old_segsize,
1248+                 old_datalength, old_k, old_N, old_prefix,
1249+                 old_offsets_tuple) = old_versionid
1250+                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
1251+
1252+        # Now, we start pushing shares.
1253+        self._status.timings["setup"] = time.time() - self._started
1254+        def _start_pushing(res):
1255+            self._started_pushing = time.time()
1256+            return res
1257+
1258+        # First, we encrypt, encode, and publish the shares that we need
1259+        # to encrypt, encode, and publish.
1260+
1261+        # This will eventually hold the block hash chain for each share
1262+        # that we publish. We define it this way so that empty publishes
1263+        # will still have something to write to the remote slot.
1264+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
1265+        self.sharehash_leaves = None # eventually [sharehashes]
1266+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
1267+                              # validate the share]
1268 
1269hunk ./src/allmydata/mutable/publish.py 296
1270+        d = defer.succeed(None)
1271+        self.log("Starting push")
1272+        for i in xrange(self.num_segments - 1):
1273+            d.addCallback(lambda ignored, i=i:
1274+                self.push_segment(i))
1275+            d.addCallback(self._turn_barrier)
1276+        # If we have at least one segment, the last one is the tail segment
1277+        if self.num_segments > 0:
1278+            d.addCallback(lambda ignored:
1279+                self.push_tail_segment())
1280+
1281+        d.addCallback(lambda ignored:
1282+            self.push_encprivkey())
1283+        d.addCallback(lambda ignored:
1284+            self.push_blockhashes())
1285+        d.addCallback(lambda ignored:
1286+            self.push_sharehashes())
1287+        d.addCallback(lambda ignored:
1288+            self.push_toplevel_hashes_and_signature())
1289+        d.addCallback(lambda ignored:
1290+            self.finish_publishing())
1291+        return d
1292+
1293+
1294+    def _publish_sdmf(self):
1295         self._status.timings["setup"] = time.time() - self._started
1296hunk ./src/allmydata/mutable/publish.py 322
1297+        self.salt = os.urandom(16)
1298+
1299         d = self._encrypt_and_encode()
1300         d.addCallback(self._generate_shares)
1301         def _start_pushing(res):
1302hunk ./src/allmydata/mutable/publish.py 335
1303 
1304         return self.done_deferred
1305 
1306+
1307     def setup_encoding_parameters(self):
1308hunk ./src/allmydata/mutable/publish.py 337
1309-        segment_size = len(self.newdata)
1310+        if self._version == MDMF_VERSION:
1311+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
1312+        else:
1313+            segment_size = len(self.newdata) # SDMF is only one segment
1314         # this must be a multiple of self.required_shares
1315         segment_size = mathutil.next_multiple(segment_size,
1316                                               self.required_shares)
1317hunk ./src/allmydata/mutable/publish.py 350
1318                                                   segment_size)
1319         else:
1320             self.num_segments = 0
1321-        assert self.num_segments in [0, 1,] # SDMF restrictions
1322+        if self._version == SDMF_VERSION:
1323+            assert self.num_segments in (0, 1) # SDMF
1324+            return
1325+        # calculate the tail segment size.
1326+        self.tail_segment_size = len(self.newdata) % segment_size
1327+
1328+        if self.tail_segment_size == 0:
1329+            # The tail segment is the same size as the other segments.
1330+            self.tail_segment_size = segment_size
1331+
1332+        # We'll make an encoder ahead-of-time for the normal-sized
1333+        # segments (defined as any segment of segment_size size). The
1334+        # part of the code that pushes the tail segment will make its
1335+        # own encoder for that part.
1336+        fec = codec.CRSEncoder()
1337+        fec.set_params(self.segment_size,
1338+                       self.required_shares, self.total_shares)
1339+        self.piece_size = fec.get_block_size()
1340+        self.fec = fec
1341+
1342+
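For reference, a minimal stand-alone sketch (not part of the patch; the two
helpers are stand-ins for mathutil.div_ceil and mathutil.next_multiple, and the
file size is hypothetical) of how the segment bookkeeping above works out for
an MDMF upload:

    def div_ceil(n, d):
        return (n + d - 1) // d

    def next_multiple(n, k):
        return div_ceil(n, k) * k

    datalength = 900 * 1024                  # hypothetical file size
    required_shares = 3                      # k
    segment_size = next_multiple(128 * 1024, required_shares)
    num_segments = div_ceil(datalength, segment_size)
    tail_segment_size = datalength % segment_size or segment_size
    # _publish_mdmf pushes num_segments - 1 full-size segments, then
    # push_tail_segment handles the final tail_segment_size bytes with its
    # own encoder.
    print "%d segments, %d-byte tail" % (num_segments, tail_segment_size)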
1343+    def push_segment(self, segnum):
1344+        started = time.time()
1345+        segsize = self.segment_size
1346+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
1347+        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
1348+        assert len(data) == segsize
1349+
1350+        salt = os.urandom(16)
1351+
1352+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1353+        enc = AES(key)
1354+        crypttext = enc.process(data)
1355+        assert len(crypttext) == len(data)
1356+
1357+        now = time.time()
1358+        self._status.timings["encrypt"] = now - started
1359+        started = now
1360+
1361+        # now apply FEC
1362+
1363+        self._status.set_status("Encoding")
1364+        crypttext_pieces = [None] * self.required_shares
1365+        piece_size = self.piece_size
1366+        for i in range(len(crypttext_pieces)):
1367+            offset = i * piece_size
1368+            piece = crypttext[offset:offset+piece_size]
1369+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1370+            crypttext_pieces[i] = piece
1371+            assert len(piece) == piece_size
1372+        d = self.fec.encode(crypttext_pieces)
1373+        def _done_encoding(res):
1374+            elapsed = time.time() - started
1375+            self._status.timings["encode"] = elapsed
1376+            return res
1377+        d.addCallback(_done_encoding)
1378+
1379+        def _push_shares_and_salt(results):
1380+            shares, shareids = results
1381+            dl = []
1382+            for i in xrange(len(shares)):
1383+                sharedata = shares[i]
1384+                shareid = shareids[i]
1385+                block_hash = hashutil.block_hash(salt + sharedata)
1386+                self.blockhashes[shareid].append(block_hash)
1387+
1388+                # find the writer for this share
1389+                d = self.writers[shareid].put_block(sharedata, segnum, salt)
1390+                dl.append(d)
1391+            # TODO: Naturally, we need to check on the results of these.
1392+            return defer.DeferredList(dl)
1393+        d.addCallback(_push_shares_and_salt)
1394+        return d
1395+
1396+
1397+    def push_tail_segment(self):
1398+        # This is essentially the same as push_segment, except that we
1399+        # don't use the cached encoder that we use elsewhere.
1400+        self.log("Pushing tail segment")
1401+        started = time.time()
1402+        segsize = self.segment_size
1403+        data = self.newdata[segsize * (self.num_segments-1):]
1404+        assert len(data) == self.tail_segment_size
1405+        salt = os.urandom(16)
1406+
1407+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1408+        enc = AES(key)
1409+        crypttext = enc.process(data)
1410+        assert len(crypttext) == len(data)
1411+
1412+        now = time.time()
1413+        self._status.timings['encrypt'] = now - started
1414+        started = now
1415+
1416+        self._status.set_status("Encoding")
1417+        tail_fec = codec.CRSEncoder()
1418+        tail_fec.set_params(self.tail_segment_size,
1419+                            self.required_shares,
1420+                            self.total_shares)
1421+
1422+        crypttext_pieces = [None] * self.required_shares
1423+        piece_size = tail_fec.get_block_size()
1424+        for i in range(len(crypttext_pieces)):
1425+            offset = i * piece_size
1426+            piece = crypttext[offset:offset+piece_size]
1427+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1428+            crypttext_pieces[i] = piece
1429+            assert len(piece) == piece_size
1430+        d = tail_fec.encode(crypttext_pieces)
1431+        def _push_shares_and_salt(results):
1432+            shares, shareids = results
1433+            dl = []
1434+            for i in xrange(len(shares)):
1435+                sharedata = shares[i]
1436+                shareid = shareids[i]
1437+                block_hash = hashutil.block_hash(salt + sharedata)
1438+                self.blockhashes[shareid].append(block_hash)
1439+                # find the writer for this share
1440+                d = self.writers[shareid].put_block(sharedata,
1441+                                                    self.num_segments - 1,
1442+                                                    salt)
1443+                dl.append(d)
1444+            # TODO: Naturally, we need to check on the results of these.
1445+            return defer.DeferredList(dl)
1446+        d.addCallback(_push_shares_and_salt)
1447+        return d
1448+
1449+
1450+    def push_encprivkey(self):
1451+        started = time.time()
1452+        encprivkey = self._encprivkey
1453+        dl = []
1454+        def _spy_on_writer(results):
1455+            print results
1456+            return results
1457+        for shnum, writer in self.writers.iteritems():
1458+            d = writer.put_encprivkey(encprivkey)
1459+            dl.append(d)
1460+        d = defer.DeferredList(dl)
1461+        return d
1462+
1463+
1464+    def push_blockhashes(self):
1465+        started = time.time()
1466+        dl = []
1467+        def _spy_on_results(results):
1468+            print results
1469+            return results
1470+        self.sharehash_leaves = [None] * len(self.blockhashes)
1471+        for shnum, blockhashes in self.blockhashes.iteritems():
1472+            t = hashtree.HashTree(blockhashes)
1473+            self.blockhashes[shnum] = list(t)
1474+            # set the leaf for future use.
1475+            self.sharehash_leaves[shnum] = t[0]
1476+            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
1477+            dl.append(d)
1478+        d = defer.DeferredList(dl)
1479+        return d
1480+
1481+
1482+    def push_sharehashes(self):
1483+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
1484+        share_hash_chain = {}
1485+        ds = []
1486+        def _spy_on_results(results):
1487+            print results
1488+            return results
1489+        for shnum in xrange(len(self.sharehash_leaves)):
1490+            needed_indices = share_hash_tree.needed_hashes(shnum)
1491+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
1492+                                             for i in needed_indices] )
1493+            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
1494+            ds.append(d)
1495+        self.root_hash = share_hash_tree[0]
1496+        d = defer.DeferredList(ds)
1497+        return d
1498+
1499+
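For reference, a hedged stand-alone sketch (hypothetical share and segment
counts, illustrative variable names) of the structures that push_blockhashes
and push_sharehashes build: one block hash tree per share, whose roots become
the leaves of the file-wide share hash tree, whose root is in turn the value
that gets signed.

    from allmydata import hashtree
    from allmydata.util import hashutil

    # Hypothetical example: 3 shares, 2 segments each.
    blockhashes = {}
    for shnum in range(3):
        blockhashes[shnum] = [hashutil.block_hash("salt%d" % seg +
                                                  "share%d-block%d" % (shnum, seg))
                              for seg in range(2)]

    # One block hash tree per share; its root is that share's leaf in the
    # file-wide share hash tree.
    block_hash_trees = dict((shnum, hashtree.HashTree(leaves))
                            for (shnum, leaves) in blockhashes.items())
    sharehash_leaves = [block_hash_trees[s][0] for s in sorted(block_hash_trees)]
    share_hash_tree = hashtree.HashTree(sharehash_leaves)
    root_hash = share_hash_tree[0]           # the value that gets signed

    # Each share stores only the chain of hashes it needs to prove its own
    # leaf's membership in the share hash tree.
    for shnum in sorted(block_hash_trees):
        needed = share_hash_tree.needed_hashes(shnum)
        share_hash_chain = dict((i, share_hash_tree[i]) for i in needed)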
1500+    def push_toplevel_hashes_and_signature(self):
1501+        # We need to do three things here:
1502+        #   - Push the root hash and salt hash
1503+        #   - Get the checkstring of the resulting layout; sign that.
1504+        #   - Push the signature
1505+        ds = []
1506+        def _spy_on_results(results):
1507+            print results
1508+            return results
1509+        for shnum in xrange(self.total_shares):
1510+            d = self.writers[shnum].put_root_hash(self.root_hash)
1511+            ds.append(d)
1512+        d = defer.DeferredList(ds)
1513+        def _make_and_place_signature(ignored):
1514+            signable = self.writers[0].get_signable()
1515+            self.signature = self._privkey.sign(signable)
1516+
1517+            ds = []
1518+            for (shnum, writer) in self.writers.iteritems():
1519+                d = writer.put_signature(self.signature)
1520+                ds.append(d)
1521+            return defer.DeferredList(ds)
1522+        d.addCallback(_make_and_place_signature)
1523+        return d
1524+
1525+
1526+    def finish_publishing(self):
1527+        # We're almost done -- we just need to put the verification key
1528+        # and the offsets
1529+        ds = []
1530+        verification_key = self._pubkey.serialize()
1531+
1532+        def _spy_on_results(results):
1533+            print results
1534+            return results
1535+        for (shnum, writer) in self.writers.iteritems():
1536+            d = writer.put_verification_key(verification_key)
1537+            d.addCallback(lambda ignored, writer=writer:
1538+                writer.finish_publishing())
1539+            ds.append(d)
1540+        return defer.DeferredList(ds)
1541+
1542+
1543+    def _turn_barrier(self, res):
1544+        # putting this method in a Deferred chain imposes a guaranteed
1545+        # reactor turn between the pre- and post- portions of that chain.
1546+        # This can be useful to limit memory consumption: since Deferreds do
1547+        # not do tail recursion, code which uses defer.succeed(result) for
1548+        # consistency will cause objects to live for longer than you might
1549+        # normally expect.
1550+        return fireEventually(res)
1551+
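As a quick illustration of the comment above (a hedged sketch, not patch code;
push_one_segment is a stand-in for the real per-segment work), the barrier is
just a Deferred that fires on a later reactor turn, inserted between the
per-segment callbacks:

    from foolscap.api import fireEventually
    from twisted.internet import defer

    def push_all_segments(num_segments, push_one_segment):
        d = defer.succeed(None)
        for i in xrange(num_segments):
            d.addCallback(lambda ignored, i=i: push_one_segment(i))
            # fireEventually(res) returns a Deferred that fires with res on a
            # subsequent reactor turn, so this chain never runs synchronously
            # from start to finish and intermediate results can be
            # garbage-collected along the way.
            d.addCallback(fireEventually)
        return d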
1552 
1553     def _fatal_error(self, f):
1554         self.log("error during loop", failure=f, level=log.UNUSUAL)
1555hunk ./src/allmydata/mutable/publish.py 716
1556             self.log_goal(self.goal, "after update: ")
1557 
1558 
1559-
1560     def _encrypt_and_encode(self):
1561         # this returns a Deferred that fires with a list of (sharedata,
1562         # sharenum) tuples. TODO: cache the ciphertext, only produce the
1563hunk ./src/allmydata/mutable/publish.py 757
1564         d.addCallback(_done_encoding)
1565         return d
1566 
1567+
1568     def _generate_shares(self, shares_and_shareids):
1569         # this sets self.shares and self.root_hash
1570         self.log("_generate_shares")
1571hunk ./src/allmydata/mutable/publish.py 1145
1572             self._status.set_progress(1.0)
1573         eventually(self.done_deferred.callback, res)
1574 
1575-
1576hunk ./src/allmydata/test/test_mutable.py 248
1577         d.addCallback(_created)
1578         return d
1579 
1580+
1581+    def test_create_mdmf(self):
1582+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
1583+        def _created(n):
1584+            self.failUnless(isinstance(n, MutableFileNode))
1585+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
1586+            sb = self.nodemaker.storage_broker
1587+            peer0 = sorted(sb.get_all_serverids())[0]
1588+            shnums = self._storage._peers[peer0].keys()
1589+            self.failUnlessEqual(len(shnums), 1)
1590+        d.addCallback(_created)
1591+        return d
1592+
1593+
1594     def test_serialize(self):
1595         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
1596         calls = []
1597hunk ./src/allmydata/test/test_mutable.py 334
1598         d.addCallback(_created)
1599         return d
1600 
1601+
1602+    def test_create_mdmf_with_initial_contents(self):
1603+        initial_contents = "foobarbaz" * 131072 # 9 * 131072 bytes = 1152 KiB
1604+        d = self.nodemaker.create_mutable_file(initial_contents,
1605+                                               version=MDMF_VERSION)
1606+        def _created(n):
1607+            d = n.download_best_version()
1608+            d.addCallback(lambda data:
1609+                self.failUnlessEqual(data, initial_contents))
1610+            d.addCallback(lambda ignored:
1611+                n.overwrite(initial_contents + "foobarbaz"))
1612+            d.addCallback(lambda ignored:
1613+                n.download_best_version())
1614+            d.addCallback(lambda data:
1615+                self.failUnlessEqual(data, initial_contents +
1616+                                           "foobarbaz"))
1617+            return d
1618+        d.addCallback(_created)
1619+        return d
1620+
1621+
1622     def test_create_with_initial_contents_function(self):
1623         data = "initial contents"
1624         def _make_contents(n):
1625hunk ./src/allmydata/test/test_mutable.py 370
1626         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
1627         return d
1628 
1629+
1630+    def test_create_mdmf_with_initial_contents_function(self):
1631+        data = "initial contents" * 100000
1632+        def _make_contents(n):
1633+            self.failUnless(isinstance(n, MutableFileNode))
1634+            key = n.get_writekey()
1635+            self.failUnless(isinstance(key, str), key)
1636+            self.failUnlessEqual(len(key), 16)
1637+            return data
1638+        d = self.nodemaker.create_mutable_file(_make_contents,
1639+                                               version=MDMF_VERSION)
1640+        d.addCallback(lambda n:
1641+            n.download_best_version())
1642+        d.addCallback(lambda data2:
1643+            self.failUnlessEqual(data2, data))
1644+        return d
1645+
1646+
1647     def test_create_with_too_large_contents(self):
1648         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
1649         d = self.nodemaker.create_mutable_file(BIG)
1650}
1651[Write a segmented mutable downloader
1652Kevan Carstensen <kevan@isnotajoke.com>**20100626234314
1653 Ignore-this: d2bef531cde1b5c38f2eb28afdd4b17c
1654 
1655 The segmented mutable downloader can deal with MDMF files (files with
1656 one or more segments in MDMF format) and SDMF files (files with one
1657 segment in SDMF format). It is backwards compatible with the old
1658 file format.
1659 
1660 This patch also contains tests for the segmented mutable downloader.
1661] {
1662hunk ./src/allmydata/mutable/retrieve.py 8
1663 from twisted.internet import defer
1664 from twisted.python import failure
1665 from foolscap.api import DeadReferenceError, eventually, fireEventually
1666-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
1667-from allmydata.util import hashutil, idlib, log
1668+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
1669+                                 MDMF_VERSION, SDMF_VERSION
1670+from allmydata.util import hashutil, idlib, log, mathutil
1671 from allmydata import hashtree, codec
1672 from allmydata.storage.server import si_b2a
1673 from pycryptopp.cipher.aes import AES
1674hunk ./src/allmydata/mutable/retrieve.py 17
1675 from pycryptopp.publickey import rsa
1676 
1677 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
1678-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
1679+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
1680+                                     MDMFSlotReadProxy
1681 
1682 class RetrieveStatus:
1683     implements(IRetrieveStatus)
1684hunk ./src/allmydata/mutable/retrieve.py 104
1685         self.verinfo = verinfo
1686         # during repair, we may be called upon to grab the private key, since
1687         # it wasn't picked up during a verify=False checker run, and we'll
1688-        # need it for repair to generate the a new version.
1689+        # need it for repair to generate a new version.
1690         self._need_privkey = fetch_privkey
1691         if self._node.get_privkey():
1692             self._need_privkey = False
1693hunk ./src/allmydata/mutable/retrieve.py 109
1694 
1695+        if self._need_privkey:
1696+            # TODO: Evaluate the need for this. We'll use it if we want
1697+            # to limit how many queries are on the wire for the privkey
1698+            # at once.
1699+            self._privkey_query_markers = [] # one Marker for each time we've
1700+                                             # tried to get the privkey.
1701+
1702         self._status = RetrieveStatus()
1703         self._status.set_storage_index(self._storage_index)
1704         self._status.set_helper(False)
1705hunk ./src/allmydata/mutable/retrieve.py 125
1706          offsets_tuple) = self.verinfo
1707         self._status.set_size(datalength)
1708         self._status.set_encoding(k, N)
1709+        self.readers = {}
1710 
1711     def get_status(self):
1712         return self._status
1713hunk ./src/allmydata/mutable/retrieve.py 149
1714         self.remaining_sharemap = DictOfSets()
1715         for (shnum, peerid, timestamp) in shares:
1716             self.remaining_sharemap.add(shnum, peerid)
1717+            # If the servermap update fetched anything, it fetched at least 1
1718+            # KiB, so we ask for that much.
1719+            # TODO: Change the cache methods to allow us to fetch all of the
1720+            # data that they have, then change this method to do that.
1721+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
1722+                                                               shnum,
1723+                                                               0,
1724+                                                               1000)
1725+            ss = self.servermap.connections[peerid]
1726+            reader = MDMFSlotReadProxy(ss,
1727+                                       self._storage_index,
1728+                                       shnum,
1729+                                       any_cache)
1730+            reader.peerid = peerid
1731+            self.readers[shnum] = reader
1732+
1733 
1734         self.shares = {} # maps shnum to validated blocks
1735hunk ./src/allmydata/mutable/retrieve.py 167
1736+        self._active_readers = [] # list of active readers for this dl.
1737+        self._validated_readers = set() # set of readers that we have
1738+                                        # validated the prefix of
1739+        self._block_hash_trees = {} # shnum => hashtree
1740+        # TODO: Make this into a file-backed consumer or something to
1741+        # conserve memory.
1742+        self._plaintext = ""
1743 
1744         # how many shares do we need?
1745hunk ./src/allmydata/mutable/retrieve.py 176
1746-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1747+        (seqnum,
1748+         root_hash,
1749+         IV,
1750+         segsize,
1751+         datalength,
1752+         k,
1753+         N,
1754+         prefix,
1755          offsets_tuple) = self.verinfo
1756hunk ./src/allmydata/mutable/retrieve.py 185
1757-        assert len(self.remaining_sharemap) >= k
1758-        # we start with the lowest shnums we have available, since FEC is
1759-        # faster if we're using "primary shares"
1760-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
1761-        for shnum in self.active_shnums:
1762-            # we use an arbitrary peer who has the share. If shares are
1763-            # doubled up (more than one share per peer), we could make this
1764-            # run faster by spreading the load among multiple peers. But the
1765-            # algorithm to do that is more complicated than I want to write
1766-            # right now, and a well-provisioned grid shouldn't have multiple
1767-            # shares per peer.
1768-            peerid = list(self.remaining_sharemap[shnum])[0]
1769-            self.get_data(shnum, peerid)
1770 
1771hunk ./src/allmydata/mutable/retrieve.py 186
1772-        # control flow beyond this point: state machine. Receiving responses
1773-        # from queries is the input. We might send out more queries, or we
1774-        # might produce a result.
1775 
1776hunk ./src/allmydata/mutable/retrieve.py 187
1777+        # We need one share hash tree for the entire file; its leaves
1778+        # are the roots of the block hash trees for the shares that
1779+        # comprise it, and its root is in the verinfo.
1780+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
1781+        self.share_hash_tree.set_hashes({0: root_hash})
1782+
1783+        # This will set up both the segment decoder and the tail segment
1784+        # decoder, as well as a variety of other instance variables that
1785+        # the download process will use.
1786+        self._setup_encoding_parameters()
1787+        assert len(self.remaining_sharemap) >= k
1788+
1789+        self.log("starting download")
1790+        self._add_active_peers()
1791+        # The download process beyond this is a state machine.
1792+        # _add_active_peers will select the peers that we want to use
1793+        # for the download, and then attempt to start downloading. After
1794+        # each segment, it will check for doneness, reacting to broken
1795+        # peers and corrupt shares as necessary. If it runs out of good
1796+        # peers before downloading all of the segments, _done_deferred
1797+        # will errback.  Otherwise, it will eventually callback with the
1798+        # contents of the mutable file.
1799         return self._done_deferred
1800 
1801hunk ./src/allmydata/mutable/retrieve.py 211
1802-    def get_data(self, shnum, peerid):
1803-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
1804-                 shnum=shnum,
1805-                 peerid=idlib.shortnodeid_b2a(peerid),
1806-                 level=log.NOISY)
1807-        ss = self.servermap.connections[peerid]
1808-        started = time.time()
1809-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1810+
1811+    def _setup_encoding_parameters(self):
1812+        """
1813+        I set up the encoding parameters, including k, n, the number
1814+        of segments associated with this file, and the segment decoder.
1815+        """
1816+        (seqnum,
1817+         root_hash,
1818+         IV,
1819+         segsize,
1820+         datalength,
1821+         k,
1822+         n,
1823+         known_prefix,
1824          offsets_tuple) = self.verinfo
1825hunk ./src/allmydata/mutable/retrieve.py 226
1826-        offsets = dict(offsets_tuple)
1827+        self._required_shares = k
1828+        self._total_shares = n
1829+        self._segment_size = segsize
1830+        self._data_length = datalength
1831+
1832+        if not IV:
1833+            self._version = MDMF_VERSION
1834+        else:
1835+            self._version = SDMF_VERSION
1836+
1837+        if datalength and segsize:
1838+            self._num_segments = mathutil.div_ceil(datalength, segsize)
1839+            self._tail_data_size = datalength % segsize
1840+        else:
1841+            self._num_segments = 0
1842+            self._tail_data_size = 0
1843 
1844hunk ./src/allmydata/mutable/retrieve.py 243
1845-        # we read the checkstring, to make sure that the data we grab is from
1846-        # the right version.
1847-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
1848+        self._segment_decoder = codec.CRSDecoder()
1849+        self._segment_decoder.set_params(segsize, k, n)
1850+        self._current_segment = 0
1851 
1852hunk ./src/allmydata/mutable/retrieve.py 247
1853-        # We also read the data, and the hashes necessary to validate them
1854-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
1855-        # signature or the pubkey, since that was handled during the
1856-        # servermap phase, and we'll be comparing the share hash chain
1857-        # against the roothash that was validated back then.
1858+        if not self._tail_data_size:
1859+            self._tail_data_size = segsize
1860 
1861hunk ./src/allmydata/mutable/retrieve.py 250
1862-        readv.append( (offsets['share_hash_chain'],
1863-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
1864+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
1865+                                                         self._required_shares)
1866+        if self._tail_segment_size == self._segment_size:
1867+            self._tail_decoder = self._segment_decoder
1868+        else:
1869+            self._tail_decoder = codec.CRSDecoder()
1870+            self._tail_decoder.set_params(self._tail_segment_size,
1871+                                          self._required_shares,
1872+                                          self._total_shares)
1873 
1874hunk ./src/allmydata/mutable/retrieve.py 260
1875-        # if we need the private key (for repair), we also fetch that
1876-        if self._need_privkey:
1877-            readv.append( (offsets['enc_privkey'],
1878-                           offsets['EOF'] - offsets['enc_privkey']) )
1879+        self.log("got encoding parameters: "
1880+                 "k: %d "
1881+                 "n: %d "
1882+                 "%d segments of %d bytes each (%d byte tail segment)" % \
1883+                 (k, n, self._num_segments, self._segment_size,
1884+                  self._tail_segment_size))
1885 
1886hunk ./src/allmydata/mutable/retrieve.py 267
1887-        m = Marker()
1888-        self._outstanding_queries[m] = (peerid, shnum, started)
1889+        for i in xrange(self._total_shares):
1890+            # So we don't have to do this later.
1891+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
1892 
1893hunk ./src/allmydata/mutable/retrieve.py 271
1894-        # ask the cache first
1895-        got_from_cache = False
1896-        datavs = []
1897-        for (offset, length) in readv:
1898-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
1899-                                                            offset, length)
1900-            if data is not None:
1901-                datavs.append(data)
1902-        if len(datavs) == len(readv):
1903-            self.log("got data from cache")
1904-            got_from_cache = True
1905-            d = fireEventually({shnum: datavs})
1906-            # datavs is a dict mapping shnum to a pair of strings
1907-        else:
1908-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1909-        self.remaining_sharemap.discard(shnum, peerid)
1910+        # If we have more than one segment, we are an MDMF file, which
1911+        # means that we need to validate the salts as we receive them.
1912+        self._salt_hash_tree = hashtree.IncompleteHashTree(self._num_segments)
1913+        self._salt_hash_tree[0] = IV # from the prefix.
1914 
1915hunk ./src/allmydata/mutable/retrieve.py 276
1916-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
1917-        d.addErrback(self._query_failed, m, peerid)
1918-        # errors that aren't handled by _query_failed (and errors caused by
1919-        # _query_failed) get logged, but we still want to check for doneness.
1920-        def _oops(f):
1921-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
1922-                     shnum=shnum,
1923-                     peerid=idlib.shortnodeid_b2a(peerid),
1924-                     failure=f,
1925-                     level=log.WEIRD, umid="W0xnQA")
1926-        d.addErrback(_oops)
1927-        d.addBoth(self._check_for_done)
1928-        # any error during _check_for_done means the download fails. If the
1929-        # download is successful, _check_for_done will fire _done by itself.
1930-        d.addErrback(self._done)
1931-        d.addErrback(log.err)
1932-        return d # purely for testing convenience
1933 
1934hunk ./src/allmydata/mutable/retrieve.py 277
1935-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1936-        # isolate the callRemote to a separate method, so tests can subclass
1937-        # Publish and override it
1938-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1939-        return d
1940+    def _add_active_peers(self):
1941+        """
1942+        I populate self._active_readers with enough active readers to
1943+        retrieve the contents of this mutable file. I am called before
1944+        downloading starts, and (eventually) after each validation
1945+        error, connection error, or other problem in the download.
1946+        """
1947+        # TODO: It would be cool to investigate other heuristics for
1948+        # reader selection. For instance, the cost (in time the user
1949+        # spends waiting for their file) of selecting a really slow peer
1950+        # that happens to have a primary share is probably higher than
1951+        # the cost of selecting a really fast peer that doesn't have a
1952+        # primary share. Maybe the servermap could be extended to provide this
1953+        # information; it could keep track of latency information while
1954+        # it gathers more important data, and then this routine could
1955+        # use that to select active readers.
1956+        #
1957+        # (these and other questions would be easier to answer with a
1958+        #  robust, configurable tahoe-lafs simulator, which modeled node
1959+        #  failures, differences in node speed, and other characteristics
1960+        #  that we expect storage servers to have.  You could have
1961+        #  presets for really stable grids (like allmydata.com),
1962+        #  friendnets, make it easy to configure your own settings, and
1963+        #  then simulate the effect of big changes on these use cases
1964+        #  instead of just reasoning about what the effect might be. Out
1965+        #  of scope for MDMF, though.)
1966 
1967hunk ./src/allmydata/mutable/retrieve.py 304
1968-    def remove_peer(self, peerid):
1969-        for shnum in list(self.remaining_sharemap.keys()):
1970-            self.remaining_sharemap.discard(shnum, peerid)
1971+        # We need at least self._required_shares readers to download a
1972+        # segment.
1973+        needed = self._required_shares - len(self._active_readers)
1974+        # XXX: Why don't format= log messages work here?
1975+        self.log("adding %d peers to the active peers list" % needed)
1976 
1977hunk ./src/allmydata/mutable/retrieve.py 310
1978-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
1979-        now = time.time()
1980-        elapsed = now - started
1981-        if not got_from_cache:
1982-            self._status.add_fetch_timing(peerid, elapsed)
1983-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
1984-                 shares=len(datavs),
1985-                 peerid=idlib.shortnodeid_b2a(peerid),
1986-                 level=log.NOISY)
1987-        self._outstanding_queries.pop(marker, None)
1988-        if not self._running:
1989-            return
1990+        # We favor lower numbered shares, since FEC is faster with
1991+        # primary shares than with other shares, and lower-numbered
1992+        # shares are more likely to be primary than higher numbered
1993+        # shares.
1994+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
1995+        # We shouldn't consider adding shares that we already have; this
1996+        # will cause problems later.
1997+        active_shnums -= set([reader.shnum for reader in self._active_readers])
1998+        active_shnums = list(active_shnums)[:needed]
1999+        if len(active_shnums) < needed:
2000+            # We don't have enough readers to retrieve the file; fail.
2001+            return self._failed()
2002 
2003hunk ./src/allmydata/mutable/retrieve.py 323
2004-        # note that we only ask for a single share per query, so we only
2005-        # expect a single share back. On the other hand, we use the extra
2006-        # shares if we get them.. seems better than an assert().
2007+        for shnum in active_shnums:
2008+            self._active_readers.append(self.readers[shnum])
2009+            self.log("added reader for share %d" % shnum)
2010+        assert len(self._active_readers) == self._required_shares
2011+        # Conceptually, this is part of the _add_active_peers step. It
2012+        # validates the prefixes of newly added readers to make sure
2013+        # that they match what we are expecting for self.verinfo. If
2014+        # validation is successful, _validate_active_prefixes will call
2015+        # _download_current_segment for us. If validation is
2016+        # unsuccessful, then _validate_active_prefixes will remove the peer and
2017+        # call _add_active_peers again, where we will attempt to rectify
2018+        # the problem by choosing another peer.
2019+        return self._validate_active_prefixes()
2020 
2021hunk ./src/allmydata/mutable/retrieve.py 337
2022-        for shnum,datav in datavs.items():
2023-            (prefix, hash_and_data) = datav[:2]
2024-            try:
2025-                self._got_results_one_share(shnum, peerid,
2026-                                            prefix, hash_and_data)
2027-            except CorruptShareError, e:
2028-                # log it and give the other shares a chance to be processed
2029-                f = failure.Failure()
2030-                self.log(format="bad share: %(f_value)s",
2031-                         f_value=str(f.value), failure=f,
2032-                         level=log.WEIRD, umid="7fzWZw")
2033-                self.notify_server_corruption(peerid, shnum, str(e))
2034-                self.remove_peer(peerid)
2035-                self.servermap.mark_bad_share(peerid, shnum, prefix)
2036-                self._bad_shares.add( (peerid, shnum) )
2037-                self._status.problems[peerid] = f
2038-                self._last_failure = f
2039-                pass
2040-            if self._need_privkey and len(datav) > 2:
2041-                lp = None
2042-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
2043-        # all done!
2044 
2045hunk ./src/allmydata/mutable/retrieve.py 338
2046-    def notify_server_corruption(self, peerid, shnum, reason):
2047-        ss = self.servermap.connections[peerid]
2048-        ss.callRemoteOnly("advise_corrupt_share",
2049-                          "mutable", self._storage_index, shnum, reason)
2050+    def _validate_active_prefixes(self):
2051+        """
2052+        I check to make sure that the prefixes on the peers that I am
2053+        currently reading from match the prefix that we want to see, as
2054+        said in self.verinfo.
2055 
2056hunk ./src/allmydata/mutable/retrieve.py 344
2057-    def _got_results_one_share(self, shnum, peerid,
2058-                               got_prefix, got_hash_and_data):
2059-        self.log("_got_results: got shnum #%d from peerid %s"
2060-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
2061-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2062+        If I find that all of the active peers have acceptable prefixes,
2063+        I pass control to _download_current_segment, which will use
2064+        those peers to do cool things. If I find that some of the active
2065+        peers have unacceptable prefixes, I will remove them from active
2066+        peers (and from further consideration) and call
2067+        _add_active_peers to attempt to rectify the situation. I keep
2068+        track of which peers I have already validated so that I don't
2069+        need to do so again.
2070+        """
2071+        assert self._active_readers, "No more active readers"
2072+
2073+        ds = []
2074+        new_readers = set(self._active_readers) - self._validated_readers
2075+        self.log('validating %d newly-added active readers' % len(new_readers))
2076+
2077+        for reader in new_readers:
2078+            # We force a remote read here -- otherwise, we are relying
2079+            # on cached data that we already verified as valid, and we
2080+            # won't detect an uncoordinated write that has occurred
2081+            # since the last servermap update.
2082+            d = reader.get_prefix(force_remote=True)
2083+            d.addCallback(self._try_to_validate_prefix, reader)
2084+            ds.append(d)
2085+        dl = defer.DeferredList(ds, consumeErrors=True)
2086+        def _check_results(results):
2087+            # Each result in results will be of the form (success, msg).
2088+            # We don't care about msg, but success will tell us whether
2089+            # or not the checkstring validated. If it didn't, we need to
2090+            # remove the offending (peer,share) from our active readers,
2091+            # and ensure that active readers is again populated.
2092+            bad_readers = []
2093+            for i, result in enumerate(results):
2094+                if not result[0]:
2095+                    reader = self._active_readers[i]
2096+                    f = result[1]
2097+                    assert isinstance(f, failure.Failure)
2098+
2099+                    self.log("The reader %s failed to "
2100+                             "properly validate: %s" % \
2101+                             (reader, str(f.value)))
2102+                    bad_readers.append((reader, f))
2103+                else:
2104+                    reader = self._active_readers[i]
2105+                    self.log("the reader %s checks out, so we'll use it" % \
2106+                             reader)
2107+                    self._validated_readers.add(reader)
2108+                    # Each time we validate a reader, we check to see if
2109+                    # we need the private key. If we do, we politely ask
2110+                    # for it and then continue computing. If we find
2111+                    # that we haven't gotten it at the end of
2112+                    # segment decoding, then we'll take more drastic
2113+                    # measures.
2114+                    if self._need_privkey:
2115+                        d = reader.get_encprivkey()
2116+                        d.addCallback(self._try_to_validate_privkey, reader)
2117+            if bad_readers:
2118+                # We do them all at once, or else we screw up list indexing.
2119+                for (reader, f) in bad_readers:
2120+                    self._mark_bad_share(reader, f)
2121+                return self._add_active_peers()
2122+            else:
2123+                return self._download_current_segment()
2124+            # The next step will assert that it has enough active
2125+            # readers to fetch shares; we just need to remove it.
2126+        dl.addCallback(_check_results)
2127+        return dl
2128+
2129+
2130+    def _try_to_validate_prefix(self, prefix, reader):
2131+        """
2132+        I check that the prefix returned by a candidate server for
2133+        retrieval matches the prefix that the servermap knows about
2134+        (and, hence, the prefix that was validated earlier). If it does,
2135+        I return without incident, meaning that I approve of the use of
2136+        the candidate server for segment retrieval. If it doesn't, I raise
2137+        UncoordinatedWriteError, and another server must be chosen.
2138+        """
2139+        (seqnum,
2140+         root_hash,
2141+         IV,
2142+         segsize,
2143+         datalength,
2144+         k,
2145+         N,
2146+         known_prefix,
2147          offsets_tuple) = self.verinfo
2148hunk ./src/allmydata/mutable/retrieve.py 430
2149-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
2150-        if got_prefix != prefix:
2151-            msg = "someone wrote to the data since we read the servermap: prefix changed"
2152-            raise UncoordinatedWriteError(msg)
2153-        (share_hash_chain, block_hash_tree,
2154-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
2155+        if known_prefix != prefix:
2156+            self.log("prefix from share %d doesn't match" % reader.shnum)
2157+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
2158+                                          "indicate an uncoordinated write")
2159+        # Otherwise, we're okay -- no issues.
2160 
2161hunk ./src/allmydata/mutable/retrieve.py 436
2162-        assert isinstance(share_data, str)
2163-        # build the block hash tree. SDMF has only one leaf.
2164-        leaves = [hashutil.block_hash(share_data)]
2165-        t = hashtree.HashTree(leaves)
2166-        if list(t) != block_hash_tree:
2167-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
2168-        share_hash_leaf = t[0]
2169-        t2 = hashtree.IncompleteHashTree(N)
2170-        # root_hash was checked by the signature
2171-        t2.set_hashes({0: root_hash})
2172-        try:
2173-            t2.set_hashes(hashes=share_hash_chain,
2174-                          leaves={shnum: share_hash_leaf})
2175-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
2176-                IndexError), e:
2177-            msg = "corrupt hashes: %s" % (e,)
2178-            raise CorruptShareError(peerid, shnum, msg)
2179-        self.log(" data valid! len=%d" % len(share_data))
2180-        # each query comes down to this: placing validated share data into
2181-        # self.shares
2182-        self.shares[shnum] = share_data
2183 
2184hunk ./src/allmydata/mutable/retrieve.py 437
2185-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
2186+    def _remove_reader(self, reader):
2187+        """
2188+        At various points, we will wish to remove a peer from
2189+        consideration and/or use. These include, but are not necessarily
2190+        limited to:
2191 
2192hunk ./src/allmydata/mutable/retrieve.py 443
2193-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2194-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2195-        if alleged_writekey != self._node.get_writekey():
2196-            self.log("invalid privkey from %s shnum %d" %
2197-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
2198-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
2199-            return
2200+            - A connection error.
2201+            - A mismatched prefix (that is, a prefix that does not match
2202+              our conception of the version information string).
2203+            - A failing block hash, salt hash, or share hash, which can
2204+              indicate disk failure/bit flips, or network trouble.
2205 
2206hunk ./src/allmydata/mutable/retrieve.py 449
2207-        # it's good
2208-        self.log("got valid privkey from shnum %d on peerid %s" %
2209-                 (shnum, idlib.shortnodeid_b2a(peerid)),
2210-                 parent=lp)
2211-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2212-        self._node._populate_encprivkey(enc_privkey)
2213-        self._node._populate_privkey(privkey)
2214-        self._need_privkey = False
2215+        This method will do that. I will make sure that the
2216+        (shnum,reader) combination represented by my reader argument is
2217+        not used for anything else during this download. I will not
2218+        advise the reader of any corruption, something that my callers
2219+        may wish to do on their own.
2220+        """
2221+        # TODO: When you're done writing this, see if this is ever
2222+        # actually used for something that _mark_bad_share isn't. I have
2223+        # a feeling that they will be used for very similar things, and
2224+        # that having them both here is just going to be an epic amount
2225+        # of code duplication.
2226+        #
2227+        # (well, okay, not epic, but meaningful)
2228+        self.log("removing reader %s" % reader)
2229+        # Remove the reader from _active_readers
2230+        self._active_readers.remove(reader)
2231+        # TODO: self.readers.remove(reader)?
2232+        for shnum in list(self.remaining_sharemap.keys()):
2233+            self.remaining_sharemap.discard(shnum, reader.peerid)
2234 
2235hunk ./src/allmydata/mutable/retrieve.py 469
2236-    def _query_failed(self, f, marker, peerid):
2237-        self.log(format="query to [%(peerid)s] failed",
2238-                 peerid=idlib.shortnodeid_b2a(peerid),
2239-                 level=log.NOISY)
2240-        self._status.problems[peerid] = f
2241-        self._outstanding_queries.pop(marker, None)
2242-        if not self._running:
2243-            return
2244-        self._last_failure = f
2245-        self.remove_peer(peerid)
2246-        level = log.WEIRD
2247-        if f.check(DeadReferenceError):
2248-            level = log.UNUSUAL
2249-        self.log(format="error during query: %(f_value)s",
2250-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
2251 
2252hunk ./src/allmydata/mutable/retrieve.py 470
2253-    def _check_for_done(self, res):
2254-        # exit paths:
2255-        #  return : keep waiting, no new queries
2256-        #  return self._send_more_queries(outstanding) : send some more queries
2257-        #  fire self._done(plaintext) : download successful
2258-        #  raise exception : download fails
2259+    def _mark_bad_share(self, reader, f):
2260+        """
2261+        I mark the (peerid, shnum) encapsulated by my reader argument as
2262+        a bad share, which means that it will not be used anywhere else.
2263 
2264hunk ./src/allmydata/mutable/retrieve.py 475
2265-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
2266-                 running=self._running, decoding=self._decoding,
2267-                 level=log.NOISY)
2268-        if not self._running:
2269-            return
2270-        if self._decoding:
2271-            return
2272-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2273-         offsets_tuple) = self.verinfo
2274+        There are several reasons to want to mark something as a bad
2275+        share. These include:
2276 
2277hunk ./src/allmydata/mutable/retrieve.py 478
2278-        if len(self.shares) < k:
2279-            # we don't have enough shares yet
2280-            return self._maybe_send_more_queries(k)
2281-        if self._need_privkey:
2282-            # we got k shares, but none of them had a valid privkey. TODO:
2283-            # look further. Adding code to do this is a bit complicated, and
2284-            # I want to avoid that complication, and this should be pretty
2285-            # rare (k shares with bitflips in the enc_privkey but not in the
2286-            # data blocks). If we actually do get here, the subsequent repair
2287-            # will fail for lack of a privkey.
2288-            self.log("got k shares but still need_privkey, bummer",
2289-                     level=log.WEIRD, umid="MdRHPA")
2290+            - A connection error to the peer.
2291+            - A mismatched prefix (that is, a prefix that does not match
2292+              our local conception of the version information string).
2293+            - A failing block hash, salt hash, share hash, or other
2294+              integrity check.
2295 
2296hunk ./src/allmydata/mutable/retrieve.py 484
2297-        # we have enough to finish. All the shares have had their hashes
2298-        # checked, so if something fails at this point, we don't know how
2299-        # to fix it, so the download will fail.
2300+        This method will ensure that readers that we wish to mark bad
2301+        (for these reasons or other reasons) are not used for the rest
2302+        of the download. Additionally, it will attempt to tell the
2303+        remote peer (with no guarantee of success) that its share is
2304+        corrupt.
2305+        """
2306+        self.log("marking share %d on server %s as bad" % \
2307+                 (reader.shnum, reader))
2308+        self._remove_reader(reader)
2309+        self._bad_shares.add((reader.peerid, reader.shnum))
2310+        self._status.problems[reader.peerid] = f
2311+        self._last_failure = f
2312+        self.notify_server_corruption(reader.peerid, reader.shnum,
2313+                                      str(f.value))
2314 
2315hunk ./src/allmydata/mutable/retrieve.py 499
2316-        self._decoding = True # avoid reentrancy
2317-        self._status.set_status("decoding")
2318-        now = time.time()
2319-        elapsed = now - self._started
2320-        self._status.timings["fetch"] = elapsed
2321 
2322hunk ./src/allmydata/mutable/retrieve.py 500
2323-        d = defer.maybeDeferred(self._decode)
2324-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
2325-        d.addBoth(self._done)
2326-        return d # purely for test convenience
2327+    def _download_current_segment(self):
2328+        """
2329+        I download, validate, decode, decrypt, and assemble the segment
2330+        that this Retrieve is currently responsible for downloading.
2331+        """
2332+        assert len(self._active_readers) >= self._required_shares
2333+        if self._current_segment < self._num_segments:
2334+            d = self._process_segment(self._current_segment)
2335+        else:
2336+            d = defer.succeed(None)
2337+        d.addCallback(self._check_for_done)
2338+        return d
2339 
2340hunk ./src/allmydata/mutable/retrieve.py 513
2341-    def _maybe_send_more_queries(self, k):
2342-        # we don't have enough shares yet. Should we send out more queries?
2343-        # There are some number of queries outstanding, each for a single
2344-        # share. If we can generate 'needed_shares' additional queries, we do
2345-        # so. If we can't, then we know this file is a goner, and we raise
2346-        # NotEnoughSharesError.
2347-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
2348-                         "outstanding=%(outstanding)d"),
2349-                 have=len(self.shares), k=k,
2350-                 outstanding=len(self._outstanding_queries),
2351-                 level=log.NOISY)
2352 
2353hunk ./src/allmydata/mutable/retrieve.py 514
2354-        remaining_shares = k - len(self.shares)
2355-        needed = remaining_shares - len(self._outstanding_queries)
2356-        if not needed:
2357-            # we have enough queries in flight already
2358+    def _process_segment(self, segnum):
2359+        """
2360+        I download, validate, decode, and decrypt one segment of the
2361+        file that this Retrieve is retrieving. This means coordinating
2362+        the process of getting k blocks of that file, validating them,
2363+        assembling them into one segment with the decoder, and then
2364+        decrypting them.
2365+        """
2366+        self.log("processing segment %d" % segnum)
2367 
2368hunk ./src/allmydata/mutable/retrieve.py 524
2369-            # TODO: but if they've been in flight for a long time, and we
2370-            # have reason to believe that new queries might respond faster
2371-            # (i.e. we've seen other queries come back faster, then consider
2372-            # sending out new queries. This could help with peers which have
2373-            # silently gone away since the servermap was updated, for which
2374-            # we're still waiting for the 15-minute TCP disconnect to happen.
2375-            self.log("enough queries are in flight, no more are needed",
2376-                     level=log.NOISY)
2377-            return
2378+        # TODO: The old code uses a marker. Should this code do that
2379+        # too? What did the Marker do?
2380+        assert len(self._active_readers) >= self._required_shares
2381+
2382+        # We need to ask each of our active readers for its block and
2383+        # salt. We will then validate those. If validation is
2384+        # successful, we will assemble the results into plaintext.
2385+        ds = []
2386+        for reader in self._active_readers:
2387+            d = reader.get_block_and_salt(segnum, queue=True)
2388+            d2 = self._get_needed_hashes(reader, segnum)
2389+            dl = defer.DeferredList([d, d2], consumeErrors=True)
2390+            dl.addCallback(self._validate_block, segnum, reader)
2391+            dl.addErrback(self._validation_or_decoding_failed, [reader])
2392+            ds.append(dl)
2393+            reader.flush()
2394+        dl = defer.DeferredList(ds)
2395+        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
2396+        return dl
2397 
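Editor's note: because _get_needed_hashes (below) itself returns a DeferredList, the `results` that _validate_block receives are nested (success, value) pairs. A runnable illustration of that shape, with made-up block, salt, and hash values:

    from twisted.internet import defer

    # Stand-ins for get_block_and_salt and _get_needed_hashes.
    d_block = defer.succeed(("block bytes", "16-byte salt"))
    d_hashes = defer.DeferredList([defer.succeed(["bh0", "bh1"]),  # blockhashes
                                   defer.succeed({0: "sh0"})],     # sharehashes
                                  consumeErrors=True)
    dl = defer.DeferredList([d_block, d_hashes], consumeErrors=True)

    def show(results):
        # Same unpacking order that _validate_block uses.
        (ok1, block_and_salt), (ok2, hash_results) = results
        block, salt = block_and_salt
        (ok3, blockhashes), (ok4, sharehashes) = hash_results
        assert ok1 and ok2 and ok3 and ok4
        assert salt == "16-byte salt" and sharehashes == {0: "sh0"}
    dl.addCallback(show)
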
2398hunk ./src/allmydata/mutable/retrieve.py 544
2399-        outstanding_shnums = set([shnum
2400-                                  for (peerid, shnum, started)
2401-                                  in self._outstanding_queries.values()])
2402-        # prefer low-numbered shares, they are more likely to be primary
2403-        available_shnums = sorted(self.remaining_sharemap.keys())
2404-        for shnum in available_shnums:
2405-            if shnum in outstanding_shnums:
2406-                # skip ones that are already in transit
2407-                continue
2408-            if shnum not in self.remaining_sharemap:
2409-                # no servers for that shnum. note that DictOfSets removes
2410-                # empty sets from the dict for us.
2411-                continue
2412-            peerid = list(self.remaining_sharemap[shnum])[0]
2413-            # get_data will remove that peerid from the sharemap, and add the
2414-            # query to self._outstanding_queries
2415-            self._status.set_status("Retrieving More Shares")
2416-            self.get_data(shnum, peerid)
2417-            needed -= 1
2418-            if not needed:
2419+
2420+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
2421+        """
2422+        I take the results of fetching and validating the blocks from a
2423+        callback chain in another method. If the results are such that
2424+        they tell me that validation and fetching succeeded without
2425+        incident, I will proceed with decoding and decryption.
2426+        Otherwise, I will do nothing.
2427+        """
2428+        self.log("trying to decode and decrypt segment %d" % segnum)
2429+        failures = False
2430+        for block_and_salt in blocks_and_salts:
2431+            if not block_and_salt[0] or block_and_salt[1] is None:
2432+                self.log("some validation operations failed; not proceeding")
2433+                failures = True
2434                 break
2435hunk ./src/allmydata/mutable/retrieve.py 560
2436+        if not failures:
2437+            self.log("everything looks ok, building segment %d" % segnum)
2438+            d = self._decode_blocks(blocks_and_salts, segnum)
2439+            d.addCallback(self._decrypt_segment)
2440+            d.addErrback(self._validation_or_decoding_failed,
2441+                         self._active_readers)
2442+            d.addCallback(self._set_segment)
2443+            return d
2444+        else:
2445+            return defer.succeed(None)
2446+
2447+
2448+    def _set_segment(self, segment):
2449+        """
2450+        Given a plaintext segment, I append that segment to the
2451+        plaintext that I am assembling for the caller of this Retrieve.
2452+        """
2453+        self.log("got plaintext for segment %d" % self._current_segment)
2454+        self._plaintext += segment
2455+        self._current_segment += 1
2456 
2457hunk ./src/allmydata/mutable/retrieve.py 581
2458-        # at this point, we have as many outstanding queries as we can. If
2459-        # needed!=0 then we might not have enough to recover the file.
2460-        if needed:
2461-            format = ("ran out of peers: "
2462-                      "have %(have)d shares (k=%(k)d), "
2463-                      "%(outstanding)d queries in flight, "
2464-                      "need %(need)d more, "
2465-                      "found %(bad)d bad shares")
2466-            args = {"have": len(self.shares),
2467-                    "k": k,
2468-                    "outstanding": len(self._outstanding_queries),
2469-                    "need": needed,
2470-                    "bad": len(self._bad_shares),
2471-                    }
2472-            self.log(format=format,
2473-                     level=log.WEIRD, umid="ezTfjw", **args)
2474-            err = NotEnoughSharesError("%s, last failure: %s" %
2475-                                      (format % args, self._last_failure))
2476-            if self._bad_shares:
2477-                self.log("We found some bad shares this pass. You should "
2478-                         "update the servermap and try again to check "
2479-                         "more peers",
2480-                         level=log.WEIRD, umid="EFkOlA")
2481-                err.servermap = self.servermap
2482-            raise err
2483 
2484hunk ./src/allmydata/mutable/retrieve.py 582
2485+    def _validation_or_decoding_failed(self, f, readers):
2486+        """
2487+        I am called when a block or a salt fails to correctly validate, or when
2488+        the decryption or decoding operation fails for some reason.  I react to
2489+        this failure by notifying the remote server of corruption, and then
2490+        removing the remote peer from further activity.
2491+        """
2492+        assert isinstance(readers, list)
2493+        bad_shnums = [reader.shnum for reader in readers]
2494+
2495+        self.log("validation or decoding failed on share(s) %s, peer(s) %s, "
2496+                 "segment %d: %s" % \
2497+                 (bad_shnums, readers, self._current_segment, str(f)))
2498+        for reader in readers:
2499+            self._mark_bad_share(reader, f)
2500         return
2501 
2502hunk ./src/allmydata/mutable/retrieve.py 599
2503-    def _decode(self):
2504-        started = time.time()
2505-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2506-         offsets_tuple) = self.verinfo
2507 
2508hunk ./src/allmydata/mutable/retrieve.py 600
2509-        # shares_dict is a dict mapping shnum to share data, but the codec
2510-        # wants two lists.
2511-        shareids = []; shares = []
2512-        for shareid, share in self.shares.items():
2513+    def _validate_block(self, results, segnum, reader):
2514+        """
2515+        I validate a block from one share on a remote server.
2516+        """
2517+        # Grab the part of the block hash tree that is necessary to
2518+        # validate this block, then generate the block hash root.
2519+        self.log("validating share %d for segment %d" % (reader.shnum,
2520+                                                             segnum))
2521+        # Did we fail to fetch either of the things that we were
2522+        # supposed to? Fail if so.
2523+        if not results[0][0] or not results[1][0]:
2524+            # handled by the errback handler.
2525+
2526+            # These all get batched into one query, so the resulting
2527+            # failure should be the same for both; just use whichever
2528+            # one we find first.
2529+            if not results[0][0]:
2530+                f = results[0][1]
2531+            else:
2532+                f = results[1][1]
2532+            raise CorruptShareError(reader.peerid,
2533+                                    reader.shnum,
2534+                                    "Connection error: %s" % str(f))
2535+
2536+        block_and_salt, block_and_sharehashes = results
2537+        block, salt = block_and_salt[1]
2538+        blockhashes, sharehashes = block_and_sharehashes[1]
2539+
2540+        blockhashes = dict(enumerate(blockhashes[1]))
2541+        self.log("the reader gave me the following blockhashes: %s" % \
2542+                 blockhashes.keys())
2543+        self.log("the reader gave me the following sharehashes: %s" % \
2544+                 sharehashes[1].keys())
2545+        bht = self._block_hash_trees[reader.shnum]
2546+
2547+        if bht.needed_hashes(segnum, include_leaf=True):
2548+            try:
2549+                bht.set_hashes(blockhashes)
2550+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2551+                    IndexError), e:
2552+                raise CorruptShareError(reader.peerid,
2553+                                        reader.shnum,
2554+                                        "block hash tree failure: %s" % e)
2555+
2556+        if self._version == MDMF_VERSION:
2557+            blockhash = hashutil.block_hash(salt + block)
2558+        else:
2559+            blockhash = hashutil.block_hash(block)
2560+        # If this works without an error, then validation is
2561+        # successful.
2562+        try:
2563+            bht.set_hashes(leaves={segnum: blockhash})
2564+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2565+                IndexError), e:
2566+            raise CorruptShareError(reader.peerid,
2567+                                    reader.shnum,
2568+                                    "block hash tree failure: %s" % e)
2569+
2570+        # Reaching this point means that we know that this block
2571+        # is correct. Now we need to check to see whether the share
2572+        # hash chain is also correct.
2573+        # SDMF wrote share hash chains that didn't contain the
2574+        # leaves, which would be produced from the block hash tree.
2575+        # So we need to validate the block hash tree first. If
2576+        # successful, then bht[0] will contain the root for the
2577+        # shnum, which will be a leaf in the share hash tree, which
2578+        # will allow us to validate the rest of the tree.
2579+        if self.share_hash_tree.needed_hashes(reader.shnum,
2580+                                               include_leaf=True):
2581+            try:
2582+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
2583+                                            leaves={reader.shnum: bht[0]})
2584+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2585+                    IndexError), e:
2586+                raise CorruptShareError(reader.peerid,
2587+                                        reader.shnum,
2588+                                        "corrupt hashes: %s" % e)
2589+
2590+        # TODO: Validate the salt, too.
2591+        self.log('share %d is valid for segment %d' % (reader.shnum,
2592+                                                       segnum))
2593+        return {reader.shnum: (block, salt)}
2594+
2595+
2596+    def _get_needed_hashes(self, reader, segnum):
2597+        """
2598+        I get the hashes needed to validate segnum from the reader, then return
2599+        to my caller when this is done.
2600+        """
2601+        bht = self._block_hash_trees[reader.shnum]
2602+        needed = bht.needed_hashes(segnum, include_leaf=True)
2603+        # The root of the block hash tree is also a leaf in the share
2604+        # hash tree. So we don't need to fetch it from the remote
2605+        # server. In the case of files with one segment, this means that
2606+        # we won't fetch any block hash tree from the remote server,
2607+        # since the hash of each share of the file is the entire block
2608+        # hash tree, and is a leaf in the share hash tree. This is fine,
2609+        # since any share corruption will be detected in the share hash
2610+        # tree.
2611+        #needed.discard(0)
2612+        self.log("getting blockhashes for segment %d, share %d: %s" % \
2613+                 (segnum, reader.shnum, str(needed)))
2614+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
2615+        if self.share_hash_tree.needed_hashes(reader.shnum):
2616+            need = self.share_hash_tree.needed_hashes(reader.shnum)
2617+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
2618+                                                                 str(need)))
2619+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
2620+        else:
2621+            d2 = defer.succeed({}) # the logic in the next method
2622+                                   # expects a dict
2623+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
2624+        return dl
2625+
2626+
2627+    def _decode_blocks(self, blocks_and_salts, segnum):
2628+        """
2629+        I take a list of k blocks and salts, and decode that into a
2630+        single encrypted segment.
2631+        """
2632+        d = {}
2633+        # We want to merge our dictionaries to the form
2634+        # {shnum: blocks_and_salts}
2635+        #
2636+        # The dictionaries come from _validate_block in that form, so we just
2637+        # need to merge them.
2638+        for block_and_salt in blocks_and_salts:
2639+            d.update(block_and_salt[1])
2640+
2641+        # All of these blocks should have the same salt; in SDMF, it is
2642+        # the file-wide IV, while in MDMF it is the per-segment salt. In
2643+        # either case, we just need to get one of them and use it.
2644+        #
2645+        # d.items()[0] is like (shnum, (block, salt))
2646+        # d.items()[0][1] is like (block, salt)
2647+        # d.items()[0][1][1] is the salt.
2648+        salt = d.items()[0][1][1]
2649+        # Next, extract just the blocks from the dict. We'll use the
2650+        # salt in the next step.
2651+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
2652+        d2 = dict(share_and_shareids)
2653+        shareids = []
2654+        shares = []
2655+        for shareid, share in d2.items():
2656             shareids.append(shareid)
2657             shares.append(share)
2658 
2659hunk ./src/allmydata/mutable/retrieve.py 746
2660-        assert len(shareids) >= k, len(shareids)
2661+        assert len(shareids) >= self._required_shares, len(shareids)
2662         # zfec really doesn't want extra shares
2663hunk ./src/allmydata/mutable/retrieve.py 748
2664-        shareids = shareids[:k]
2665-        shares = shares[:k]
2666-
2667-        fec = codec.CRSDecoder()
2668-        fec.set_params(segsize, k, N)
2669-
2670-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
2671-        self.log("about to decode, shareids=%s" % (shareids,))
2672-        d = defer.maybeDeferred(fec.decode, shares, shareids)
2673-        def _done(buffers):
2674-            self._status.timings["decode"] = time.time() - started
2675-            self.log(" decode done, %d buffers" % len(buffers))
2676+        shareids = shareids[:self._required_shares]
2677+        shares = shares[:self._required_shares]
2678+        self.log("decoding segment %d" % segnum)
2679+        if segnum == self._num_segments - 1:
2680+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
2681+        else:
2682+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
2683+        def _process(buffers):
2684             segment = "".join(buffers)
2685hunk ./src/allmydata/mutable/retrieve.py 757
2686+            self.log(format="decoded segment %(segnum)s of %(numsegs)s",
2687+                     segnum=segnum,
2688+                     numsegs=self._num_segments,
2689+                     level=log.NOISY)
2690             self.log(" joined length %d, datalength %d" %
2691hunk ./src/allmydata/mutable/retrieve.py 762
2692-                     (len(segment), datalength))
2693-            segment = segment[:datalength]
2694+                     (len(segment), self._data_length))
2695+            if segnum == self._num_segments - 1:
2696+                size_to_use = self._tail_data_size
2697+            else:
2698+                size_to_use = self._segment_size
2699+            segment = segment[:size_to_use]
2700             self.log(" segment len=%d" % len(segment))
2701hunk ./src/allmydata/mutable/retrieve.py 769
2702-            return segment
2703-        def _err(f):
2704-            self.log(" decode failed: %s" % f)
2705-            return f
2706-        d.addCallback(_done)
2707-        d.addErrback(_err)
2708+            return segment, salt
2709+        d.addCallback(_process)
2710         return d
2711 
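Editor's note: _decode_blocks above assumes that self._segment_decoder and self._tail_decoder were already configured; that setup is not part of this hunk. Based on the SDMF decode path removed above (codec.CRSDecoder with set_params(segsize, k, N)), the setup presumably resembles the following sketch; the helper name and the tail_segment_size argument are assumptions, not the actual retrieve.py code:

    from allmydata import codec

    def make_decoders(segment_size, tail_segment_size, k, n):
        # One decoder for full-sized segments, and (if the file does not
        # divide evenly) a second one sized for the short tail segment.
        segment_decoder = codec.CRSDecoder()
        segment_decoder.set_params(segment_size, k, n)
        if tail_segment_size == segment_size:
            return segment_decoder, segment_decoder
        tail_decoder = codec.CRSDecoder()
        tail_decoder.set_params(tail_segment_size, k, n)
        return segment_decoder, tail_decoder
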
2712hunk ./src/allmydata/mutable/retrieve.py 773
2713-    def _decrypt(self, crypttext, IV, readkey):
2714+
2715+    def _decrypt_segment(self, segment_and_salt):
2716+        """
2717+        I take a single segment and its salt, and decrypt it. I return
2718+        the plaintext of the segment that is in my argument.
2719+        """
2720+        segment, salt = segment_and_salt
2721         self._status.set_status("decrypting")
2722hunk ./src/allmydata/mutable/retrieve.py 781
2723+        self.log("decrypting segment %d" % self._current_segment)
2724         started = time.time()
2725hunk ./src/allmydata/mutable/retrieve.py 783
2726-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
2727+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
2728         decryptor = AES(key)
2729hunk ./src/allmydata/mutable/retrieve.py 785
2730-        plaintext = decryptor.process(crypttext)
2731+        plaintext = decryptor.process(segment)
2732         self._status.timings["decrypt"] = time.time() - started
2733         return plaintext
2734 
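Editor's note: the change above derives the AES key from the per-segment salt rather than from a single file-wide IV, so each MDMF segment is decrypted with its own key. A small illustration of that consequence, reusing the hashutil call already present in this file; the salt and readkey values are made up:

    from allmydata.util import hashutil

    readkey = "\x00" * 16     # made-up readkey
    salt_seg0 = "\x01" * 16   # made-up per-segment salts
    salt_seg1 = "\x02" * 16

    key0 = hashutil.ssk_readkey_data_hash(salt_seg0, readkey)
    key1 = hashutil.ssk_readkey_data_hash(salt_seg1, readkey)
    assert key0 != key1       # different salt, different segment key

In SDMF, where the single segment uses the file-wide IV as its salt, both calls would use the same salt and produce the same key.
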
2735hunk ./src/allmydata/mutable/retrieve.py 789
2736-    def _done(self, res):
2737-        if not self._running:
2738+
2739+    def notify_server_corruption(self, peerid, shnum, reason):
2740+        ss = self.servermap.connections[peerid]
2741+        ss.callRemoteOnly("advise_corrupt_share",
2742+                          "mutable", self._storage_index, shnum, reason)
2743+
2744+
2745+    def _try_to_validate_privkey(self, enc_privkey, reader):
2746+
2747+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2748+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2749+        if alleged_writekey != self._node.get_writekey():
2750+            self.log("invalid privkey from %s shnum %d" %
2751+                     (reader, reader.shnum),
2752+                     level=log.WEIRD, umid="YIw4tA")
2753             return
2754hunk ./src/allmydata/mutable/retrieve.py 805
2755-        self._running = False
2756-        self._status.set_active(False)
2757-        self._status.timings["total"] = time.time() - self._started
2758-        # res is either the new contents, or a Failure
2759-        if isinstance(res, failure.Failure):
2760-            self.log("Retrieve done, with failure", failure=res,
2761-                     level=log.UNUSUAL)
2762-            self._status.set_status("Failed")
2763-        else:
2764-            self.log("Retrieve done, success!")
2765-            self._status.set_status("Finished")
2766-            self._status.set_progress(1.0)
2767-            # remember the encoding parameters, use them again next time
2768-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2769-             offsets_tuple) = self.verinfo
2770-            self._node._populate_required_shares(k)
2771-            self._node._populate_total_shares(N)
2772-        eventually(self._done_deferred.callback, res)
2773 
2774hunk ./src/allmydata/mutable/retrieve.py 806
2775+        # it's good
2776+        self.log("got valid privkey from shnum %d on reader %s" %
2777+                 (reader.shnum, reader))
2778+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2779+        self._node._populate_encprivkey(enc_privkey)
2780+        self._node._populate_privkey(privkey)
2781+        self._need_privkey = False
2782+
2783+
2784+    def _check_for_done(self, res):
2785+        """
2786+        I check to see if this Retrieve object has successfully finished
2787+        its work.
2788+
2789+        I can exit in the following ways:
2790+            - If there are no more segments to download, then I exit by
2791+              causing self._done_deferred to fire with the plaintext
2792+              content requested by the caller.
2793+            - If there are still segments to be downloaded, and there
2794+              are enough active readers (readers which have not broken
2795+              and have not given us corrupt data) to continue
2796+              downloading, I send control back to
2797+              _download_current_segment.
2798+            - If there are still segments to be downloaded but there are
2799+              not enough active peers to download them, I ask
2800+              _add_active_peers to add more peers. If it is successful,
2801+              it will call _download_current_segment. If there are not
2802+              enough peers to retrieve the file, then that will cause
2803+              _done_deferred to errback.
2804+        """
2805+        self.log("checking for doneness")
2806+        if self._current_segment == self._num_segments:
2807+            # No more segments to download, we're done.
2808+            self.log("got plaintext, done")
2809+            return self._done()
2810+
2811+        if len(self._active_readers) >= self._required_shares:
2812+            # More segments to download, but we have enough good peers
2813+            # in self._active_readers that we can do that without issue,
2814+            # so go nab the next segment.
2815+            self.log("not done yet: on segment %d of %d" % \
2816+                     (self._current_segment + 1, self._num_segments))
2817+            return self._download_current_segment()
2818+
2819+        self.log("not done yet: on segment %d of %d, need to add peers" % \
2820+                 (self._current_segment + 1, self._num_segments))
2821+        return self._add_active_peers()
2822+
2823+
2824+    def _done(self):
2825+        """
2826+        I am called by _check_for_done when the download process has
2827+        finished successfully. I return the decrypted contents to the
2828+        owner of this Retrieve object through self._done_deferred.
2830+        """
2831+        eventually(self._done_deferred.callback, self._plaintext)
2832+
2833+
2834+    def _failed(self):
2835+        """
2836+        I am called by _add_active_peers when there are not enough
2837+        active peers left to complete the download. I fire
2838+        self._done_deferred with a Failure wrapping NotEnoughSharesError,
2839+        so that the caller of this Retrieve object sees the failure.
2841+        """
2842+        format = ("ran out of peers: "
2843+                  "have %(have)d of %(total)d segments "
2844+                  "found %(bad)d bad shares "
2845+                  "encoding %(k)d-of-%(n)d")
2846+        args = {"have": self._current_segment,
2847+                "total": self._num_segments,
2848+                "k": self._required_shares,
2849+                "n": self._total_shares,
2850+                "bad": len(self._bad_shares)}
2851+        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
2852+                                                        str(self._last_failure)))
2853+        f = failure.Failure(e)
2854+        eventually(self._done_deferred.callback, f)
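Editor's note: _failed above fires self._done_deferred via callback() with a Failure. In Twisted, handing a Failure to callback() routes it down the errback chain, which is how callers of Retrieve end up seeing NotEnoughSharesError. A standalone illustration (the exception class here is a stand-in for the real one in allmydata.interfaces):

    from twisted.internet import defer
    from twisted.python import failure

    class NotEnoughSharesError(Exception):
        pass

    d = defer.Deferred()
    d.addCallback(lambda res: "never reached: callbacks are skipped")
    d.addErrback(lambda f: "caught: %s" % f.getErrorMessage())
    d.callback(failure.Failure(NotEnoughSharesError("ran out of peers")))
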
2855hunk ./src/allmydata/test/test_mutable.py 12
2856 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
2857      ssk_pubkey_fingerprint_hash
2858 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
2859-     NotEnoughSharesError
2860+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
2861 from allmydata.monitor import Monitor
2862 from allmydata.test.common import ShouldFailMixin
2863 from allmydata.test.no_network import GridTestMixin
2864hunk ./src/allmydata/test/test_mutable.py 28
2865 from allmydata.mutable.retrieve import Retrieve
2866 from allmydata.mutable.publish import Publish
2867 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
2868-from allmydata.mutable.layout import unpack_header, unpack_share
2869+from allmydata.mutable.layout import unpack_header, unpack_share, \
2870+                                     MDMFSlotReadProxy
2871 from allmydata.mutable.repairer import MustForceRepairError
2872 
2873 import allmydata.test.common_util as testutil
2874hunk ./src/allmydata/test/test_mutable.py 104
2875         d = fireEventually()
2876         d.addCallback(lambda res: _call())
2877         return d
2878+
2879     def callRemoteOnly(self, methname, *args, **kwargs):
2880         d = self.callRemote(methname, *args, **kwargs)
2881         d.addBoth(lambda ignore: None)
2882hunk ./src/allmydata/test/test_mutable.py 163
2883 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
2884     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
2885     # list of shnums to corrupt.
2886+    ds = []
2887     for peerid in s._peers:
2888         shares = s._peers[peerid]
2889         for shnum in shares:
2890hunk ./src/allmydata/test/test_mutable.py 190
2891                 else:
2892                     offset1 = offset
2893                     offset2 = 0
2894-                if offset1 == "pubkey":
2895+                if offset1 == "pubkey" and IV:
2896                     real_offset = 107
2897hunk ./src/allmydata/test/test_mutable.py 192
2898+                elif offset1 == "share_data" and not IV:
2899+                    real_offset = 104
2900                 elif offset1 in o:
2901                     real_offset = o[offset1]
2902                 else:
2903hunk ./src/allmydata/test/test_mutable.py 327
2904         d.addCallback(_created)
2905         return d
2906 
2907+
2908+    def test_upload_and_download_mdmf(self):
2909+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
2910+        def _created(n):
2911+            d = defer.succeed(None)
2912+            d.addCallback(lambda ignored:
2913+                n.get_servermap(MODE_READ))
2914+            def _then(servermap):
2915+                dumped = servermap.dump(StringIO())
2916+                self.failUnlessIn("3-of-10", dumped.getvalue())
2917+            d.addCallback(_then)
2918+            # Now overwrite the contents with some new contents. We want
2919+            # to make them big enough to force the file to be uploaded
2920+            # in more than one segment.
2921+            big_contents = "contents1" * 100000 # about 900 KiB
2922+            d.addCallback(lambda ignored:
2923+                n.overwrite(big_contents))
2924+            d.addCallback(lambda ignored:
2925+                n.download_best_version())
2926+            d.addCallback(lambda data:
2927+                self.failUnlessEqual(data, big_contents))
2928+            # Overwrite the contents again with some new contents. As
2929+            # before, they need to be big enough to force multiple
2930+            # segments, so that we make the downloader deal with
2931+            # multiple segments.
2932+            bigger_contents = "contents2" * 1000000 # about 9MiB
2933+            d.addCallback(lambda ignored:
2934+                n.overwrite(bigger_contents))
2935+            d.addCallback(lambda ignored:
2936+                n.download_best_version())
2937+            d.addCallback(lambda data:
2938+                self.failUnlessEqual(data, bigger_contents))
2939+            return d
2940+        d.addCallback(_created)
2941+        return d
2942+
2943+
2944     def test_create_with_initial_contents(self):
2945         d = self.nodemaker.create_mutable_file("contents 1")
2946         def _created(n):
2947hunk ./src/allmydata/test/test_mutable.py 1147
2948 
2949 
2950     def _test_corrupt_all(self, offset, substring,
2951-                          should_succeed=False, corrupt_early=True,
2952-                          failure_checker=None):
2953+                          should_succeed=False,
2954+                          corrupt_early=True,
2955+                          failure_checker=None,
2956+                          fetch_privkey=False):
2957         d = defer.succeed(None)
2958         if corrupt_early:
2959             d.addCallback(corrupt, self._storage, offset)
2960hunk ./src/allmydata/test/test_mutable.py 1167
2961                     self.failUnlessIn(substring, "".join(allproblems))
2962                 return servermap
2963             if should_succeed:
2964-                d1 = self._fn.download_version(servermap, ver)
2965+                d1 = self._fn.download_version(servermap, ver,
2966+                                               fetch_privkey)
2967                 d1.addCallback(lambda new_contents:
2968                                self.failUnlessEqual(new_contents, self.CONTENTS))
2969             else:
2970hunk ./src/allmydata/test/test_mutable.py 1175
2971                 d1 = self.shouldFail(NotEnoughSharesError,
2972                                      "_corrupt_all(offset=%s)" % (offset,),
2973                                      substring,
2974-                                     self._fn.download_version, servermap, ver)
2975+                                     self._fn.download_version, servermap,
2976+                                                                ver,
2977+                                                                fetch_privkey)
2978             if failure_checker:
2979                 d1.addCallback(failure_checker)
2980             d1.addCallback(lambda res: servermap)
2981hunk ./src/allmydata/test/test_mutable.py 1186
2982         return d
2983 
2984     def test_corrupt_all_verbyte(self):
2985-        # when the version byte is not 0, we hit an UnknownVersionError error
2986-        # in unpack_share().
2987+        # when the version byte is not 0 or 1, we hit an UnknownVersionError
2988+        # error in unpack_share().
2989         d = self._test_corrupt_all(0, "UnknownVersionError")
2990         def _check_servermap(servermap):
2991             # and the dump should mention the problems
2992hunk ./src/allmydata/test/test_mutable.py 1193
2993             s = StringIO()
2994             dump = servermap.dump(s).getvalue()
2995-            self.failUnless("10 PROBLEMS" in dump, dump)
2996+            self.failUnless("30 PROBLEMS" in dump, dump)
2997         d.addCallback(_check_servermap)
2998         return d
2999 
3000hunk ./src/allmydata/test/test_mutable.py 1263
3001         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
3002 
3003 
3004+    def test_corrupt_all_encprivkey_late(self):
3005+        # this should work for the same reason as above, but we corrupt
3006+        # after the servermap update to exercise the error handling
3007+        # code.
3008+        # We need to remove the privkey from the node, or the retrieve
3009+        # process won't know to update it.
3010+        self._fn._privkey = None
3011+        return self._test_corrupt_all("enc_privkey",
3012+                                      None, # this shouldn't fail
3013+                                      should_succeed=True,
3014+                                      corrupt_early=False,
3015+                                      fetch_privkey=True)
3016+
3017+
3018     def test_corrupt_all_seqnum_late(self):
3019         # corrupting the seqnum between mapupdate and retrieve should result
3020         # in NotEnoughSharesError, since each share will look invalid
3021hunk ./src/allmydata/test/test_mutable.py 1283
3022         def _check(res):
3023             f = res[0]
3024             self.failUnless(f.check(NotEnoughSharesError))
3025-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
3026+            self.failUnless("uncoordinated write" in str(f))
3027         return self._test_corrupt_all(1, "ran out of peers",
3028                                       corrupt_early=False,
3029                                       failure_checker=_check)
3030hunk ./src/allmydata/test/test_mutable.py 1333
3031                       self.failUnlessEqual(new_contents, self.CONTENTS))
3032         return d
3033 
3034-    def test_corrupt_some(self):
3035-        # corrupt the data of first five shares (so the servermap thinks
3036-        # they're good but retrieve marks them as bad), so that the
3037-        # MODE_READ set of 6 will be insufficient, forcing node.download to
3038-        # retry with more servers.
3039-        corrupt(None, self._storage, "share_data", range(5))
3040-        d = self.make_servermap()
3041+
3042+    def _test_corrupt_some(self, offset, mdmf=False):
3043+        if mdmf:
3044+            d = self.publish_mdmf()
3045+        else:
3046+            d = defer.succeed(None)
3047+        d.addCallback(lambda ignored:
3048+            corrupt(None, self._storage, offset, range(5)))
3049+        d.addCallback(lambda ignored:
3050+            self.make_servermap())
3051         def _do_retrieve(servermap):
3052             ver = servermap.best_recoverable_version()
3053             self.failUnless(ver)
3054hunk ./src/allmydata/test/test_mutable.py 1349
3055             return self._fn.download_best_version()
3056         d.addCallback(_do_retrieve)
3057         d.addCallback(lambda new_contents:
3058-                      self.failUnlessEqual(new_contents, self.CONTENTS))
3059+            self.failUnlessEqual(new_contents, self.CONTENTS))
3060         return d
3061 
3062hunk ./src/allmydata/test/test_mutable.py 1352
3063+
3064+    def test_corrupt_some(self):
3065+        # corrupt the data of first five shares (so the servermap thinks
3066+        # they're good but retrieve marks them as bad), so that the
3067+        # MODE_READ set of 6 will be insufficient, forcing node.download to
3068+        # retry with more servers.
3069+        return self._test_corrupt_some("share_data")
3070+
3071+
3072     def test_download_fails(self):
3073         d = corrupt(None, self._storage, "signature")
3074         d.addCallback(lambda ignored:
3075hunk ./src/allmydata/test/test_mutable.py 1366
3076             self.shouldFail(UnrecoverableFileError, "test_download_anyway",
3077                             "no recoverable versions",
3078-                            self._fn.download_best_version)
3079+                            self._fn.download_best_version))
3080         return d
3081 
3082 
3083hunk ./src/allmydata/test/test_mutable.py 1370
3084+
3085+    def test_corrupt_mdmf_block_hash_tree(self):
3086+        d = self.publish_mdmf()
3087+        d.addCallback(lambda ignored:
3088+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3089+                                   "block hash tree failure",
3090+                                   corrupt_early=False,
3091+                                   should_succeed=False))
3092+        return d
3093+
3094+
3095+    def test_corrupt_mdmf_block_hash_tree_late(self):
3096+        d = self.publish_mdmf()
3097+        d.addCallback(lambda ignored:
3098+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3099+                                   "block hash tree failure",
3100+                                   corrupt_early=True,
3101+                                   should_succeed=False))
3102+        return d
3103+
3104+
3105+    def test_corrupt_mdmf_share_data(self):
3106+        d = self.publish_mdmf()
3107+        d.addCallback(lambda ignored:
3108+            # TODO: Find out what the block size is and corrupt a
3109+            # specific block, rather than just guessing.
3110+            self._test_corrupt_all(("share_data", 12 * 40),
3111+                                    "block hash tree failure",
3112+                                    corrupt_early=True,
3113+                                    should_succeed=False))
3114+        return d
3115+
3116+
3117+    def test_corrupt_some_mdmf(self):
3118+        return self._test_corrupt_some(("share_data", 12 * 40),
3119+                                       mdmf=True)
3120+
3121+
3122 class CheckerMixin:
3123     def check_good(self, r, where):
3124         self.failUnless(r.is_healthy(), where)
3125hunk ./src/allmydata/test/test_mutable.py 2116
3126             d.addCallback(lambda res:
3127                           self.shouldFail(NotEnoughSharesError,
3128                                           "test_retrieve_surprise",
3129-                                          "ran out of peers: have 0 shares (k=3)",
3130+                                          "ran out of peers: have 0 of 1",
3131                                           n.download_version,
3132                                           self.old_map,
3133                                           self.old_map.best_recoverable_version(),
3134hunk ./src/allmydata/test/test_mutable.py 2125
3135         d.addCallback(_created)
3136         return d
3137 
3138+
3139     def test_unexpected_shares(self):
3140         # upload the file, take a servermap, shut down one of the servers,
3141         # upload it again (causing shares to appear on a new server), then
3142hunk ./src/allmydata/test/test_mutable.py 2329
3143         self.basedir = "mutable/Problems/test_privkey_query_missing"
3144         self.set_up_grid(num_servers=20)
3145         nm = self.g.clients[0].nodemaker
3146-        LARGE = "These are Larger contents" * 2000 # about 50KB
3147+        LARGE = "These are Larger contents" * 2000 # about 50KiB
3148         nm._node_cache = DevNullDictionary() # disable the nodecache
3149 
3150         d = nm.create_mutable_file(LARGE)
3151hunk ./src/allmydata/test/test_mutable.py 2342
3152         d.addCallback(_created)
3153         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
3154         return d
3155+
3156+
3157+    def test_block_and_hash_query_error(self):
3158+        # This tests for what happens when a query to a remote server
3159+        # fails in either the hash validation step or the block getting
3160+        # step (because of batching, this is the same actual query).
3161+        # We need to have the storage server persist up until the point
3162+        # that its prefix is validated, then suddenly die. This
3163+        # exercises some exception handling code in Retrieve.
3164+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
3165+        self.set_up_grid(num_servers=20)
3166+        nm = self.g.clients[0].nodemaker
3167+        CONTENTS = "contents" * 2000
3168+        d = nm.create_mutable_file(CONTENTS)
3169+        def _created(node):
3170+            self._node = node
3171+        d.addCallback(_created)
3172+        d.addCallback(lambda ignored:
3173+            self._node.get_servermap(MODE_READ))
3174+        def _then(servermap):
3175+            # we have our servermap. Now we set up the servers like the
3176+            # tests above -- the first one that gets a read call should
3177+            # start throwing errors, but only after returning its prefix
3178+            # for validation. Since we'll download without fetching the
3179+            # private key, the next query to the remote server will be
3180+            # for either a block and salt or for hashes, either of which
3181+            # will exercise the error handling code.
3182+            killer = FirstServerGetsKilled()
3183+            for (serverid, ss) in nm.storage_broker.get_all_servers():
3184+                ss.post_call_notifier = killer.notify
3185+            ver = servermap.best_recoverable_version()
3186+            assert ver
3187+            return self._node.download_version(servermap, ver)
3188+        d.addCallback(_then)
3189+        d.addCallback(lambda data:
3190+            self.failUnlessEqual(data, CONTENTS))
3191+        return d
3192}
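Editor's note (between patches): the downloader patch above introduces one small but load-bearing format difference: MDMF block-hash-tree leaves cover the per-segment salt together with the block, while SDMF leaves cover the block alone. The verifier patch below inherits this check by delegating to the same downloader. A two-line sketch using the hashutil helper both patches already use; the block and salt values are made up:

    from allmydata.util import hashutil

    block = "erasure-coded block bytes"
    salt = "\x00" * 16
    mdmf_leaf = hashutil.block_hash(salt + block)   # MDMF: salt is hashed in
    sdmf_leaf = hashutil.block_hash(block)          # SDMF: the block alone
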
3193[mutable/checker.py: check MDMF files
3194Kevan Carstensen <kevan@isnotajoke.com>**20100628225048
3195 Ignore-this: fb697b36285d60552df6ca5ac6a37629
3196 
3197 This patch adapts the mutable file checker and verifier to check and
3198 verify MDMF files. It does this by using the new segmented downloader,
3199 which is trained to perform verification operations on request. This
3200 removes some code duplication.
3201] {
3202hunk ./src/allmydata/mutable/checker.py 12
3203 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
3204 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
3205 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
3206+from allmydata.mutable.retrieve import Retrieve # for verifying
3207 
3208 class MutableChecker:
3209 
3210hunk ./src/allmydata/mutable/checker.py 29
3211 
3212     def check(self, verify=False, add_lease=False):
3213         servermap = ServerMap()
3214+        # Updating the servermap in MODE_CHECK will stand a good chance
3215+        # of finding all of the shares, and getting a good idea of
3216+        # recoverability, etc, without verifying.
3217         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
3218                              servermap, MODE_CHECK, add_lease=add_lease)
3219         if self._history:
3220hunk ./src/allmydata/mutable/checker.py 55
3221         if num_recoverable:
3222             self.best_version = servermap.best_recoverable_version()
3223 
3224+        # The file is unhealthy and needs to be repaired if:
3225+        # - There are unrecoverable versions.
3226         if servermap.unrecoverable_versions():
3227             self.need_repair = True
3228hunk ./src/allmydata/mutable/checker.py 59
3229+        # - There isn't a recoverable version.
3230         if num_recoverable != 1:
3231             self.need_repair = True
3232hunk ./src/allmydata/mutable/checker.py 62
3233+        # - The best recoverable version is missing some shares.
3234         if self.best_version:
3235             available_shares = servermap.shares_available()
3236             (num_distinct_shares, k, N) = available_shares[self.best_version]
3237hunk ./src/allmydata/mutable/checker.py 73
3238 
3239     def _verify_all_shares(self, servermap):
3240         # read every byte of each share
3241+        #
3242+        # This logic is going to be very nearly the same as the
3243+        # downloader. I bet we could pass the downloader a flag that
3244+        # makes it do this, and piggyback onto that instead of
3245+        # duplicating a bunch of code.
3246+        #
3247+        # Like:
3248+        #  r = Retrieve(blah, blah, blah, verify=True)
3249+        #  d = r.download()
3250+        #  (wait, wait, wait, d.callback)
3251+        # 
3252+        #  Then, when it has finished, we can check the servermap (which
3253+        #  we provided to Retrieve) to figure out which shares are bad,
3254+        #  since the Retrieve process will have updated the servermap as
3255+        #  it went along.
3256+        #
3257+        #  By passing the verify=True flag to the constructor, we are
3258+        #  telling the downloader a few things.
3259+        #
3260+        #  1. It needs to download all N shares, not just K shares.
3261+        #  2. It doesn't need to decrypt or decode the shares, only
3262+        #     verify them.
3263         if not self.best_version:
3264             return
3265hunk ./src/allmydata/mutable/checker.py 97
3266-        versionmap = servermap.make_versionmap()
3267-        shares = versionmap[self.best_version]
3268-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3269-         offsets_tuple) = self.best_version
3270-        offsets = dict(offsets_tuple)
3271-        readv = [ (0, offsets["EOF"]) ]
3272-        dl = []
3273-        for (shnum, peerid, timestamp) in shares:
3274-            ss = servermap.connections[peerid]
3275-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
3276-            d.addCallback(self._got_answer, peerid, servermap)
3277-            dl.append(d)
3278-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
3279 
3280hunk ./src/allmydata/mutable/checker.py 98
3281-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
3282-        # isolate the callRemote to a separate method, so tests can subclass
3283-        # Publish and override it
3284-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
3285+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
3286+        d = r.download()
3287+        d.addCallback(self._process_bad_shares)
3288         return d
3289 
3290hunk ./src/allmydata/mutable/checker.py 103
3291-    def _got_answer(self, datavs, peerid, servermap):
3292-        for shnum,datav in datavs.items():
3293-            data = datav[0]
3294-            try:
3295-                self._got_results_one_share(shnum, peerid, data)
3296-            except CorruptShareError:
3297-                f = failure.Failure()
3298-                self.need_repair = True
3299-                self.bad_shares.append( (peerid, shnum, f) )
3300-                prefix = data[:SIGNED_PREFIX_LENGTH]
3301-                servermap.mark_bad_share(peerid, shnum, prefix)
3302-                ss = servermap.connections[peerid]
3303-                self.notify_server_corruption(ss, shnum, str(f.value))
3304-
3305-    def check_prefix(self, peerid, shnum, data):
3306-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3307-         offsets_tuple) = self.best_version
3308-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
3309-        if got_prefix != prefix:
3310-            raise CorruptShareError(peerid, shnum,
3311-                                    "prefix mismatch: share changed while we were reading it")
3312-
3313-    def _got_results_one_share(self, shnum, peerid, data):
3314-        self.check_prefix(peerid, shnum, data)
3315-
3316-        # the [seqnum:signature] pieces are validated by _compare_prefix,
3317-        # which checks their signature against the pubkey known to be
3318-        # associated with this file.
3319 
3320hunk ./src/allmydata/mutable/checker.py 104
3321-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
3322-         share_hash_chain, block_hash_tree, share_data,
3323-         enc_privkey) = unpack_share(data)
3324-
3325-        # validate [share_hash_chain,block_hash_tree,share_data]
3326-
3327-        leaves = [hashutil.block_hash(share_data)]
3328-        t = hashtree.HashTree(leaves)
3329-        if list(t) != block_hash_tree:
3330-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
3331-        share_hash_leaf = t[0]
3332-        t2 = hashtree.IncompleteHashTree(N)
3333-        # root_hash was checked by the signature
3334-        t2.set_hashes({0: root_hash})
3335-        try:
3336-            t2.set_hashes(hashes=share_hash_chain,
3337-                          leaves={shnum: share_hash_leaf})
3338-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
3339-                IndexError), e:
3340-            msg = "corrupt hashes: %s" % (e,)
3341-            raise CorruptShareError(peerid, shnum, msg)
3342-
3343-        # validate enc_privkey: only possible if we have a write-cap
3344-        if not self._node.is_readonly():
3345-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3346-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3347-            if alleged_writekey != self._node.get_writekey():
3348-                raise CorruptShareError(peerid, shnum, "invalid privkey")
3349+    def _process_bad_shares(self, bad_shares):
3350+        if bad_shares:
3351+            self.need_repair = True
3352+        self.bad_shares = bad_shares
3353 
3354hunk ./src/allmydata/mutable/checker.py 109
3355-    def notify_server_corruption(self, ss, shnum, reason):
3356-        ss.callRemoteOnly("advise_corrupt_share",
3357-                          "mutable", self._storage_index, shnum, reason)
3358 
3359     def _count_shares(self, smap, version):
3360         available_shares = smap.shares_available()
3361hunk ./src/allmydata/test/test_mutable.py 193
3362                 if offset1 == "pubkey" and IV:
3363                     real_offset = 107
3364                 elif offset1 == "share_data" and not IV:
3365-                    real_offset = 104
3366+                    real_offset = 107
3367                 elif offset1 in o:
3368                     real_offset = o[offset1]
3369                 else:
3370hunk ./src/allmydata/test/test_mutable.py 395
3371             return d
3372         d.addCallback(_created)
3373         return d
3374+    test_create_mdmf_with_initial_contents.timeout = 20
3375 
3376 
3377     def test_create_with_initial_contents_function(self):
3378hunk ./src/allmydata/test/test_mutable.py 700
3379                                            k, N, segsize, datalen)
3380                 self.failUnless(p._pubkey.verify(sig_material, signature))
3381                 #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
3382-                self.failUnless(isinstance(share_hash_chain, dict))
3383-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3384+                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3385                 for shnum,share_hash in share_hash_chain.items():
3386                     self.failUnless(isinstance(shnum, int))
3387                     self.failUnless(isinstance(share_hash, str))
3388hunk ./src/allmydata/test/test_mutable.py 820
3389                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
3390 
3391 
3392+
3393+
3394 class Servermap(unittest.TestCase, PublishMixin):
3395     def setUp(self):
3396         return self.publish_one()
3397hunk ./src/allmydata/test/test_mutable.py 951
3398         self._storage._peers = {} # delete all shares
3399         ms = self.make_servermap
3400         d = defer.succeed(None)
3401-
3402+#
3403         d.addCallback(lambda res: ms(mode=MODE_CHECK))
3404         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
3405 
3406hunk ./src/allmydata/test/test_mutable.py 1440
3407         d.addCallback(self.check_good, "test_check_good")
3408         return d
3409 
3410+    def test_check_mdmf_good(self):
3411+        d = self.publish_mdmf()
3412+        d.addCallback(lambda ignored:
3413+            self._fn.check(Monitor()))
3414+        d.addCallback(self.check_good, "test_check_mdmf_good")
3415+        return d
3416+
3417     def test_check_no_shares(self):
3418         for shares in self._storage._peers.values():
3419             shares.clear()
3420hunk ./src/allmydata/test/test_mutable.py 1454
3421         d.addCallback(self.check_bad, "test_check_no_shares")
3422         return d
3423 
3424+    def test_check_mdmf_no_shares(self):
3425+        d = self.publish_mdmf()
3426+        def _then(ignored):
3427+            for share in self._storage._peers.values():
3428+                share.clear()
3429+        d.addCallback(_then)
3430+        d.addCallback(lambda ignored:
3431+            self._fn.check(Monitor()))
3432+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
3433+        return d
3434+
3435     def test_check_not_enough_shares(self):
3436         for shares in self._storage._peers.values():
3437             for shnum in shares.keys():
3438hunk ./src/allmydata/test/test_mutable.py 1474
3439         d.addCallback(self.check_bad, "test_check_not_enough_shares")
3440         return d
3441 
3442+    def test_check_mdmf_not_enough_shares(self):
3443+        d = self.publish_mdmf()
3444+        def _then(ignored):
3445+            for shares in self._storage._peers.values():
3446+                for shnum in shares.keys():
3447+                    if shnum > 0:
3448+                        del shares[shnum]
3449+        d.addCallback(_then)
3450+        d.addCallback(lambda ignored:
3451+            self._fn.check(Monitor()))
3452+        d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares")
3453+        return d
3454+
3455+
3456     def test_check_all_bad_sig(self):
3457         d = corrupt(None, self._storage, 1) # bad sig
3458         d.addCallback(lambda ignored:
3459hunk ./src/allmydata/test/test_mutable.py 1495
3460         d.addCallback(self.check_bad, "test_check_all_bad_sig")
3461         return d
3462 
3463+    def test_check_mdmf_all_bad_sig(self):
3464+        d = self.publish_mdmf()
3465+        d.addCallback(lambda ignored:
3466+            corrupt(None, self._storage, 1))
3467+        d.addCallback(lambda ignored:
3468+            self._fn.check(Monitor()))
3469+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
3470+        return d
3471+
3472     def test_check_all_bad_blocks(self):
3473         d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
3474         # the Checker won't notice this.. it doesn't look at actual data
3475hunk ./src/allmydata/test/test_mutable.py 1512
3476         d.addCallback(self.check_good, "test_check_all_bad_blocks")
3477         return d
3478 
3479+
3480+    def test_check_mdmf_all_bad_blocks(self):
3481+        d = self.publish_mdmf()
3482+        d.addCallback(lambda ignored:
3483+            corrupt(None, self._storage, "share_data"))
3484+        d.addCallback(lambda ignored:
3485+            self._fn.check(Monitor()))
3486+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
3487+        return d
3488+
3489     def test_verify_good(self):
3490         d = self._fn.check(Monitor(), verify=True)
3491         d.addCallback(self.check_good, "test_verify_good")
3492hunk ./src/allmydata/test/test_mutable.py 1582
3493                       "test_verify_one_bad_encprivkey_uncheckable")
3494         return d
3495 
3496+
3497+    def test_verify_mdmf_good(self):
3498+        d = self.publish_mdmf()
3499+        d.addCallback(lambda ignored:
3500+            self._fn.check(Monitor(), verify=True))
3501+        d.addCallback(self.check_good, "test_verify_mdmf_good")
3502+        return d
3503+
3504+
3505+    def test_verify_mdmf_one_bad_block(self):
3506+        d = self.publish_mdmf()
3507+        d.addCallback(lambda ignored:
3508+            corrupt(None, self._storage, "share_data", [1]))
3509+        d.addCallback(lambda ignored:
3510+            self._fn.check(Monitor(), verify=True))
3511+        # We should find one bad block here
3512+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
3513+        d.addCallback(self.check_expected_failure,
3514+                      CorruptShareError, "block hash tree failure",
3515+                      "test_verify_mdmf_one_bad_block")
3516+        return d
3517+
3518+
3519+    def test_verify_mdmf_bad_encprivkey(self):
3520+        d = self.publish_mdmf()
3521+        d.addCallback(lambda ignored:
3522+            corrupt(None, self._storage, "enc_privkey", [1]))
3523+        d.addCallback(lambda ignored:
3524+            self._fn.check(Monitor(), verify=True))
3525+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
3526+        d.addCallback(self.check_expected_failure,
3527+                      CorruptShareError, "privkey",
3528+                      "test_verify_mdmf_bad_encprivkey")
3529+        return d
3530+
3531+
3532+    def test_verify_mdmf_bad_sig(self):
3533+        d = self.publish_mdmf()
3534+        d.addCallback(lambda ignored:
3535+            corrupt(None, self._storage, 1, [1]))
3536+        d.addCallback(lambda ignored:
3537+            self._fn.check(Monitor(), verify=True))
3538+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
3539+        return d
3540+
3541+
3542+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
3543+        d = self.publish_mdmf()
3544+        d.addCallback(lambda ignored:
3545+            corrupt(None, self._storage, "enc_privkey", [1]))
3546+        d.addCallback(lambda ignored:
3547+            self._fn.get_readonly())
3548+        d.addCallback(lambda fn:
3549+            fn.check(Monitor(), verify=True))
3550+        d.addCallback(self.check_good,
3551+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
3552+        return d
3553+
3554+
3555 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
3556 
3557     def get_shares(self, s):
3558hunk ./src/allmydata/test/test_mutable.py 1706
3559         current_shares = self.old_shares[-1]
3560         self.failUnlessEqual(old_shares, current_shares)
3561 
3562+
3563     def test_unrepairable_0shares(self):
3564         d = self.publish_one()
3565         def _delete_all_shares(ign):
3566hunk ./src/allmydata/test/test_mutable.py 1721
3567         d.addCallback(_check)
3568         return d
3569 
3570+    def test_mdmf_unrepairable_0shares(self):
3571+        d = self.publish_mdmf()
3572+        def _delete_all_shares(ign):
3573+            shares = self._storage._peers
3574+            for peerid in shares:
3575+                shares[peerid] = {}
3576+        d.addCallback(_delete_all_shares)
3577+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3578+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3579+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
3580+        return d
3581+
3582+
3583     def test_unrepairable_1share(self):
3584         d = self.publish_one()
3585         def _delete_all_shares(ign):
3586hunk ./src/allmydata/test/test_mutable.py 1750
3587         d.addCallback(_check)
3588         return d
3589 
3590+    def test_mdmf_unrepairable_1share(self):
3591+        d = self.publish_mdmf()
3592+        def _delete_all_shares(ign):
3593+            shares = self._storage._peers
3594+            for peerid in shares:
3595+                for shnum in list(shares[peerid]):
3596+                    if shnum > 0:
3597+                        del shares[peerid][shnum]
3598+        d.addCallback(_delete_all_shares)
3599+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3600+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3601+        def _check(crr):
3602+            self.failUnlessEqual(crr.get_successful(), False)
3603+        d.addCallback(_check)
3604+        return d
3605+
3606+    def test_repairable_5shares(self):
3607+        d = self.publish_one()
3608+        def _delete_all_shares(ign):
3609+            shares = self._storage._peers
3610+            for peerid in shares:
3611+                for shnum in list(shares[peerid]):
3612+                    if shnum > 4:
3613+                        del shares[peerid][shnum]
3614+        d.addCallback(_delete_all_shares)
3615+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3616+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3617+        def _check(crr):
3618+            self.failUnlessEqual(crr.get_successful(), True)
3619+        d.addCallback(_check)
3620+        return d
3621+
3622+    def test_mdmf_repairable_5shares(self):
3623+        d = self.publish_mdmf()
3624+        def _delete_all_shares(ign):
3625+            shares = self._storage._peers
3626+            for peerid in shares:
3627+                for shnum in list(shares[peerid]):
3628+                    if shnum > 5:
3629+                        del shares[peerid][shnum]
3630+        d.addCallback(_delete_all_shares)
3631+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3632+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3633+        def _check(crr):
3634+            self.failUnlessEqual(crr.get_successful(), True)
3635+        d.addCallback(_check)
3636+        return d
3637+
3638+
3639     def test_merge(self):
3640         self.old_shares = []
3641         d = self.publish_multiple()
3642}
3643[mutable/retrieve.py: learn how to verify mutable files
3644Kevan Carstensen <kevan@isnotajoke.com>**20100628225201
3645 Ignore-this: 989af7800c47589620918461ec989483
3646] {
3647hunk ./src/allmydata/mutable/retrieve.py 86
3648     # Retrieve object will remain tied to a specific version of the file, and
3649     # will use a single ServerMap instance.
3650 
3651-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
3652+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
3653+                 verify=False):
3654         self._node = filenode
3655         assert self._node.get_pubkey()
3656         self._storage_index = filenode.get_storage_index()
3657hunk ./src/allmydata/mutable/retrieve.py 106
3658         # during repair, we may be called upon to grab the private key, since
3659         # it wasn't picked up during a verify=False checker run, and we'll
3660         # need it for repair to generate a new version.
3661-        self._need_privkey = fetch_privkey
3662-        if self._node.get_privkey():
3663+        self._need_privkey = fetch_privkey or verify
3664+        if self._node.get_privkey() and not verify:
3665             self._need_privkey = False
3666 
3667         if self._need_privkey:
3668hunk ./src/allmydata/mutable/retrieve.py 117
3669             self._privkey_query_markers = [] # one Marker for each time we've
3670                                              # tried to get the privkey.
3671 
3672+        # verify means that we are using the downloader logic to verify all
3673+        # of our shares. This tells the downloader a few things.
3674+        #
3675+        # 1. We need to download all of the shares.
3676+        # 2. We don't need to decode or decrypt the shares, since our
3677+        #    caller doesn't care about the plaintext, only the
3678+        #    information about which shares are or are not valid.
3679+        # 3. When we are validating readers, we need to validate the
3680+        #    signature on the prefix. Do we? We already do this in the
3681+        #    servermap update?
3682+        #
3683+        # (just work on 1 and 2 for now, I guess)
3684+        self._verify = False
3685+        if verify:
3686+            self._verify = True
3687+
3688         self._status = RetrieveStatus()
3689         self._status.set_storage_index(self._storage_index)
3690         self._status.set_helper(False)
3691hunk ./src/allmydata/mutable/retrieve.py 323
3692 
3693         # We need at least self._required_shares readers to download a
3694         # segment.
3695-        needed = self._required_shares - len(self._active_readers)
3696+        if self._verify:
3697+            needed = self._total_shares
3698+        else:
3699+            needed = self._required_shares - len(self._active_readers)
3700         # XXX: Why don't format= log messages work here?
3701         self.log("adding %d peers to the active peers list" % needed)
3702 
3703hunk ./src/allmydata/mutable/retrieve.py 339
3704         # will cause problems later.
3705         active_shnums -= set([reader.shnum for reader in self._active_readers])
3706         active_shnums = list(active_shnums)[:needed]
3707-        if len(active_shnums) < needed:
3708+        if len(active_shnums) < needed and not self._verify:
3709             # We don't have enough readers to retrieve the file; fail.
3710             return self._failed()
3711 
3712hunk ./src/allmydata/mutable/retrieve.py 346
3713         for shnum in active_shnums:
3714             self._active_readers.append(self.readers[shnum])
3715             self.log("added reader for share %d" % shnum)
3716-        assert len(self._active_readers) == self._required_shares
3717+        assert len(self._active_readers) >= self._required_shares
3718         # Conceptually, this is part of the _add_active_peers step. It
3719         # validates the prefixes of newly added readers to make sure
3720         # that they match what we are expecting for self.verinfo. If
3721hunk ./src/allmydata/mutable/retrieve.py 416
3722                     # that we haven't gotten it at the end of
3723                     # segment decoding, then we'll take more drastic
3724                     # measures.
3725-                    if self._need_privkey:
3726+                    if self._need_privkey and not self._node.is_readonly():
3727                         d = reader.get_encprivkey()
3728                         d.addCallback(self._try_to_validate_privkey, reader)
3729             if bad_readers:
3730hunk ./src/allmydata/mutable/retrieve.py 423
3731                 # We do them all at once, or else we screw up list indexing.
3732                 for (reader, f) in bad_readers:
3733                     self._mark_bad_share(reader, f)
3734-                return self._add_active_peers()
3735+                if self._verify:
3736+                    if len(self._active_readers) >= self._required_shares:
3737+                        return self._download_current_segment()
3738+                    else:
3739+                        return self._failed()
3740+                else:
3741+                    return self._add_active_peers()
3742             else:
3743                 return self._download_current_segment()
3744             # The next step will assert that it has enough active
3745hunk ./src/allmydata/mutable/retrieve.py 518
3746         """
3747         self.log("marking share %d on server %s as bad" % \
3748                  (reader.shnum, reader))
3749+        prefix = self.verinfo[-2]
3750+        self.servermap.mark_bad_share(reader.peerid,
3751+                                      reader.shnum,
3752+                                      prefix)
3753         self._remove_reader(reader)
3754hunk ./src/allmydata/mutable/retrieve.py 523
3755-        self._bad_shares.add((reader.peerid, reader.shnum))
3756+        self._bad_shares.add((reader.peerid, reader.shnum, f))
3757         self._status.problems[reader.peerid] = f
3758         self._last_failure = f
3759         self.notify_server_corruption(reader.peerid, reader.shnum,
3760hunk ./src/allmydata/mutable/retrieve.py 571
3761             ds.append(dl)
3762             reader.flush()
3763         dl = defer.DeferredList(ds)
3764-        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3765+        if self._verify:
3766+            dl.addCallback(lambda ignored: "")
3767+            dl.addCallback(self._set_segment)
3768+        else:
3769+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3770         return dl
3771 
3772 
3773hunk ./src/allmydata/mutable/retrieve.py 701
3774         # shnum, which will be a leaf in the share hash tree, which
3775         # will allow us to validate the rest of the tree.
3776         if self.share_hash_tree.needed_hashes(reader.shnum,
3777-                                               include_leaf=True):
3778+                                              include_leaf=True) or \
3779+                                              self._verify:
3780             try:
3781                 self.share_hash_tree.set_hashes(hashes=sharehashes[1],
3782                                             leaves={reader.shnum: bht[0]})
3783hunk ./src/allmydata/mutable/retrieve.py 832
3784 
3785 
3786     def _try_to_validate_privkey(self, enc_privkey, reader):
3787-
3788         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3789         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3790         if alleged_writekey != self._node.get_writekey():
3791hunk ./src/allmydata/mutable/retrieve.py 838
3792             self.log("invalid privkey from %s shnum %d" %
3793                      (reader, reader.shnum),
3794                      level=log.WEIRD, umid="YIw4tA")
3795+            if self._verify:
3796+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
3797+                                              self.verinfo[-2])
3798+                e = CorruptShareError(reader.peerid,
3799+                                      reader.shnum,
3800+                                      "invalid privkey")
3801+                f = failure.Failure(e)
3802+                self._bad_shares.add((reader.peerid, reader.shnum, f))
3803             return
3804 
3805         # it's good
3806hunk ./src/allmydata/mutable/retrieve.py 904
3807         statements, I return the decrypted contents to the owner of this
3808         Retrieve object through self._done_deferred.
3809         """
3810-        eventually(self._done_deferred.callback, self._plaintext)
3811+        if self._verify:
3812+            ret = list(self._bad_shares)
3813+            self.log("done verifying, found %d bad shares" % len(ret))
3814+        else:
3815+            ret = self._plaintext
3816+        eventually(self._done_deferred.callback, ret)
3817 
3818 
3819     def _failed(self):
3820hunk ./src/allmydata/mutable/retrieve.py 920
3821         to the caller of this Retrieve object through
3822         self._done_deferred.
3823         """
3824-        format = ("ran out of peers: "
3825-                  "have %(have)d of %(total)d segments "
3826-                  "found %(bad)d bad shares "
3827-                  "encoding %(k)d-of-%(n)d")
3828-        args = {"have": self._current_segment,
3829-                "total": self._num_segments,
3830-                "k": self._required_shares,
3831-                "n": self._total_shares,
3832-                "bad": len(self._bad_shares)}
3833-        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
3834-                                                        str(self._last_failure)))
3835-        f = failure.Failure(e)
3836-        eventually(self._done_deferred.callback, f)
3837+        if self._verify:
3838+            ret = list(self._bad_shares)
3839+        else:
3840+            format = ("ran out of peers: "
3841+                      "have %(have)d of %(total)d segments "
3842+                      "found %(bad)d bad shares "
3843+                      "encoding %(k)d-of-%(n)d")
3844+            args = {"have": self._current_segment,
3845+                    "total": self._num_segments,
3846+                    "k": self._required_shares,
3847+                    "n": self._total_shares,
3848+                    "bad": len(self._bad_shares)}
3849+            e = NotEnoughSharesError("%s, last failure: %s" % \
3850+                                     (format % args, str(self._last_failure)))
3851+            f = failure.Failure(e)
3852+            ret = f
3853+        eventually(self._done_deferred.callback, ret)
3854}
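
The hunks above give Retrieve a verify mode: it fetches every share, skips decoding and decryption, and fires its Deferred with the list of (peerid, shnum, failure) tuples collected in self._bad_shares instead of plaintext. As a rough usage sketch (not part of the patch), a checker-style caller might drive it as below; the filenode, servermap, and verinfo values are assumed to come from an earlier servermap update, and download() is assumed to remain the entry point:

    # Sketch only -- illustrates the verify=True behavior described above.
    from allmydata.mutable.retrieve import Retrieve

    def verify_version(filenode, servermap, verinfo):
        r = Retrieve(filenode, servermap, verinfo, verify=True)
        d = r.download()
        def _report(bad_shares):
            # each entry is (peerid, shnum, failure), per _mark_bad_share
            for (peerid, shnum, f) in bad_shares:
                print "share %d on %s failed verification: %s" % (shnum, peerid, f)
            return bad_shares
        d.addCallback(_report)
        return d
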
3855[interfaces.py: add IMutableSlotWriter
3856Kevan Carstensen <kevan@isnotajoke.com>**20100630183305
3857 Ignore-this: ff9dca96ef1a009ae85485682f81ea5
3858] hunk ./src/allmydata/interfaces.py 418
3859         """
3860 
3861 
3862+class IMutableSlotWriter(Interface):
3863+    """
3864+    The interface for a writer around a mutable slot on a remote server.
3865+    """
3866+    def set_checkstring(checkstring, *args):
3867+        """
3868+        Set the checkstring that I will pass to the remote server when
3869+        writing.
3870+
3871+            @param checkstring: A packed checkstring to use.
3872+
3873+        Note that implementations can differ in which semantics they
3874+        wish to support for set_checkstring -- they can, for example,
3875+        build the checkstring themselves from its constituents, or
3876+        some other thing.
3877+        """
3878+
3879+    def get_checkstring():
3880+        """
3881+        Get the checkstring that I think currently exists on the remote
3882+        server.
3883+        """
3884+
3885+    def put_block(data, segnum, salt):
3886+        """
3887+        Add a block and salt to the share.
3888+        """
3889+
3890+    def put_encprivkey(encprivkey):
3891+        """
3892+        Add the encrypted private key to the share.
3893+        """
3894+
3895+    def put_blockhashes(blockhashes=list):
3896+        """
3897+        Add the block hash tree to the share.
3898+        """
3899+
3900+    def put_sharehashes(sharehashes=dict):
3901+        """
3902+        Add the share hash chain to the share.
3903+        """
3904+
3905+    def get_signable():
3906+        """
3907+        Return the part of the share that needs to be signed.
3908+        """
3909+
3910+    def put_signature(signature):
3911+        """
3912+        Add the signature to the share.
3913+        """
3914+
3915+    def put_verification_key(verification_key):
3916+        """
3917+        Add the verification key to the share.
3918+        """
3919+
3920+    def finish_publishing():
3921+        """
3922+        Do anything necessary to finish writing the share to a remote
3923+        server. I require that no further publishing needs to take place
3924+        after this method has been called.
3925+        """
3926+
3927+
3928 class IURI(Interface):
3929     def init_from_string(uri):
3930         """Accept a string (as created by my to_string() method) and populate
3931[test/test_mutable.py: temporarily disable two tests that are now irrelevant
3932Kevan Carstensen <kevan@isnotajoke.com>**20100701232806
3933 Ignore-this: 701e143567f3954812ca6960af1d6ac7
3934] {
3935hunk ./src/allmydata/test/test_mutable.py 651
3936             self.failUnlessEqual(len(share_ids), 10)
3937         d.addCallback(_done)
3938         return d
3939+    test_encrypt.todo = "Write an equivalent of this for the new uploader"
3940 
3941     def test_generate(self):
3942         nm = make_nodemaker()
3943hunk ./src/allmydata/test/test_mutable.py 713
3944                 self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
3945         d.addCallback(_generated)
3946         return d
3947+    test_generate.todo = "Write an equivalent of this for the new uploader"
3948 
3949     # TODO: when we publish to 20 peers, we should get one share per peer on 10
3950     # when we publish to 3 peers, we should get either 3 or 4 shares per peer
3951}
3952[Add MDMF reader and writer, and SDMF writer
3953Kevan Carstensen <kevan@isnotajoke.com>**20100702225531
3954 Ignore-this: bf6276a91d27dcb4e779b0eb82ea1843
3955 
3956 The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
3957 object proxies that exist for immutable files. They abstract away
3958 details of connection, state, and caching from their callers (in this
3959 case, the download, servermap updater, and uploader), and expose methods
3960 to get and set information on the remote server.
3961 
3962 MDMFSlotReadProxy reads a mutable file from the server, doing the right
3963 thing (in most cases) regardless of whether the file is MDMF or SDMF. It
3964 allows callers to tell it how to batch and flush reads.
3965 
3966 MDMFSlotWriteProxy writes an MDMF mutable file to a server.
3967 
3968 SDMFSlotWriteProxy writes an SDMF mutable file to a server.
3969 
3970 This patch also includes tests for MDMFSlotReadProxy,
3971 SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
3972] {
3973hunk ./src/allmydata/mutable/layout.py 4
3974 
3975 import struct
3976 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
3977+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
3978+                                 MDMF_VERSION, IMutableSlotWriter
3979+from allmydata.util import mathutil, observer
3980+from twisted.python import failure
3981+from twisted.internet import defer
3982+from zope.interface import implements
3983+
3984+
3985+# These strings describe the format of the packed structs they help process
3986+# Here's what they mean:
3987+#
3988+#  PREFIX:
3989+#    >: Big-endian byte order; the most significant byte is first (leftmost).
3990+#    B: The version information; an 8 bit version identifier. Stored as
3991+#       an unsigned char. This is currently 00 00 00 00; our modifications
3992+#       will turn it into 00 00 00 01.
3993+#    Q: The sequence number; this is sort of like a revision history for
3994+#       mutable files; they start at 1 and increase as they are changed after
3995+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
3996+#       length.
3997+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
3998+#       characters = 32 bytes to store the value.
3999+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
4000+#       16 characters.
4001+#
4002+#  SIGNED_PREFIX additions, things that are covered by the signature:
4003+#    B: The "k" encoding parameter. We store this as an 8-bit character,
4004+#       which is convenient because our erasure coding scheme cannot
4005+#       encode if you ask for more than 255 pieces.
4006+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
4007+#       same reasons as above.
4008+#    Q: The segment size of the uploaded file. This will essentially be the
4009+#       length of the file in SDMF. An unsigned long long, so we can store
4010+#       files of quite large size.
4011+#    Q: The data length of the uploaded file. Modulo padding, this will be
4012+#       the same as the segment size field. Like the segment size field, it is
4013+#       an unsigned long long and can be quite large.
4014+#
4015+#   HEADER additions:
4016+#     L: The offset of the signature of this. An unsigned long.
4017+#     L: The offset of the share hash chain. An unsigned long.
4018+#     L: The offset of the block hash tree. An unsigned long.
4019+#     L: The offset of the share data. An unsigned long.
4020+#     Q: The offset of the encrypted private key. An unsigned long long, to
4021+#        account for the possibility of a lot of share data.
4022+#     Q: The offset of the EOF. An unsigned long long, to account for the
4023+#        possibility of a lot of share data.
4024+#
4025+#  After all of these, we have the following:
4026+#    - The verification key: Occupies the space between the end of the header
4027+#      and the start of the signature (i.e.: data[HEADER_LENGTH:o['signature']]).
4028+#    - The signature, which goes from the signature offset to the share hash
4029+#      chain offset.
4030+#    - The share hash chain, which goes from the share hash chain offset to
4031+#      the block hash tree offset.
4032+#    - The share data, which goes from the share data offset to the encrypted
4033+#      private key offset.
4034+#    - The encrypted private key, which goes from its offset until the end of the file.
4035+#
4036+#  The block hash tree in this encoding has only one leaf, so the offset of
4037+#  the share data will be 32 bytes more than the offset of the block hash tree.
4038+#  Given this, we may need to check to see how many bytes a reasonably sized
4039+#  block hash tree will take up.
4040 
4041 PREFIX = ">BQ32s16s" # each version has a different prefix
4042 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
4043hunk ./src/allmydata/mutable/layout.py 73
4044 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
4045 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
4046 HEADER_LENGTH = struct.calcsize(HEADER)
4047+OFFSETS = ">LLLLQQ"
4048+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
4049 
4050 def unpack_header(data):
4051     o = {}
4052hunk ./src/allmydata/mutable/layout.py 194
4053     return (share_hash_chain, block_hash_tree, share_data)
4054 
4055 
4056-def pack_checkstring(seqnum, root_hash, IV):
4057+def pack_checkstring(seqnum, root_hash, IV, version=0):
4058     return struct.pack(PREFIX,
4059hunk ./src/allmydata/mutable/layout.py 196
4060-                       0, # version,
4061+                       version,
4062                        seqnum,
4063                        root_hash,
4064                        IV)
4065hunk ./src/allmydata/mutable/layout.py 269
4066                            encprivkey])
4067     return final_share
4068 
4069+def pack_prefix(seqnum, root_hash, IV,
4070+                required_shares, total_shares,
4071+                segment_size, data_length):
4072+    prefix = struct.pack(SIGNED_PREFIX,
4073+                         0, # version,
4074+                         seqnum,
4075+                         root_hash,
4076+                         IV,
4077+                         required_shares,
4078+                         total_shares,
4079+                         segment_size,
4080+                         data_length,
4081+                         )
4082+    return prefix
4083+
4084+
4085+class SDMFSlotWriteProxy:
4086+    implements(IMutableSlotWriter)
4087+    """
4088+    I represent a remote write slot for an SDMF mutable file. I build a
4089+    share in memory, and then write it in one piece to the remote
4090+    server. This mimics how SDMF shares were built before MDMF (and the
4091+    new MDMF uploader), but provides that functionality in a way that
4092+    allows the MDMF uploader to be built without much special-casing for
4093+    file format, which makes the uploader code more readable.
4094+    """
4095+    def __init__(self,
4096+                 shnum,
4097+                 rref, # a remote reference to a storage server
4098+                 storage_index,
4099+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4100+                 seqnum, # the sequence number of the mutable file
4101+                 required_shares,
4102+                 total_shares,
4103+                 segment_size,
4104+                 data_length): # the length of the original file
4105+        self.shnum = shnum
4106+        self._rref = rref
4107+        self._storage_index = storage_index
4108+        self._secrets = secrets
4109+        self._seqnum = seqnum
4110+        self._required_shares = required_shares
4111+        self._total_shares = total_shares
4112+        self._segment_size = segment_size
4113+        self._data_length = data_length
4114+
4115+        # This is an SDMF file, so it should have only one segment, so,
4116+        # modulo padding of the data length, the segment size and the
4117+        # data length should be the same.
4118+        expected_segment_size = mathutil.next_multiple(data_length,
4119+                                                       self._required_shares)
4120+        assert expected_segment_size == segment_size
4121+
4122+        self._block_size = self._segment_size / self._required_shares
4123+
4124+        # This is meant to mimic how SDMF files were built before MDMF
4125+        # entered the picture: we generate each share in its entirety,
4126+        # then push it off to the storage server in one write. When
4127+        # callers call set_*, they are just populating this dict.
4128+        # finish_publishing will stitch these pieces together into a
4129+        # coherent share, and then write the coherent share to the
4130+        # storage server.
4131+        self._share_pieces = {}
4132+
4133+        # This tells the write logic what checkstring to use when
4134+        # writing remote shares.
4135+        self._testvs = []
4136+
4137+        self._readvs = [(0, struct.calcsize(PREFIX))]
4138+
4139+
4140+    def set_checkstring(self, checkstring_or_seqnum,
4141+                              root_hash=None,
4142+                              salt=None):
4143+        """
4144+        Set the checkstring that I will pass to the remote server when
4145+        writing.
4146+
4147+            @param checkstring_or_seqnum: A packed checkstring to use,
4148+                   or a sequence number; without root_hash and salt, I treat it as a packed checkstring.
4149+
4150+        Note that implementations can differ in which semantics they
4151+        wish to support for set_checkstring -- they can, for example,
4152+        build the checkstring themselves from its constituents, or
4153+        some other thing.
4154+        """
4155+        if root_hash and salt:
4156+            checkstring = struct.pack(PREFIX,
4157+                                      0,
4158+                                      checkstring_or_seqnum,
4159+                                      root_hash,
4160+                                      salt)
4161+        else:
4162+            checkstring = checkstring_or_seqnum
4163+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
4164+
4165+
4166+    def get_checkstring(self):
4167+        """
4168+        Get the checkstring that I think currently exists on the remote
4169+        server.
4170+        """
4171+        if self._testvs:
4172+            return self._testvs[0][3]
4173+        return ""
4174+
4175+
4176+    def put_block(self, data, segnum, salt):
4177+        """
4178+        Add a block and salt to the share.
4179+        """
4180+        # SDMF files have only one segment
4181+        assert segnum == 0
4182+        assert len(data) == self._block_size
4183+        assert len(salt) == SALT_SIZE
4184+
4185+        self._share_pieces['sharedata'] = data
4186+        self._share_pieces['salt'] = salt
4187+
4188+        # TODO: Figure out something intelligent to return.
4189+        return defer.succeed(None)
4190+
4191+
4192+    def put_encprivkey(self, encprivkey):
4193+        """
4194+        Add the encrypted private key to the share.
4195+        """
4196+        self._share_pieces['encprivkey'] = encprivkey
4197+
4198+        return defer.succeed(None)
4199+
4200+
4201+    def put_blockhashes(self, blockhashes):
4202+        """
4203+        Add the block hash tree to the share.
4204+        """
4205+        assert isinstance(blockhashes, list)
4206+        for h in blockhashes:
4207+            assert len(h) == HASH_SIZE
4208+
4209+        # serialize the blockhashes, then set them.
4210+        blockhashes_s = "".join(blockhashes)
4211+        self._share_pieces['block_hash_tree'] = blockhashes_s
4212+
4213+        return defer.succeed(None)
4214+
4215+
4216+    def put_sharehashes(self, sharehashes):
4217+        """
4218+        Add the share hash chain to the share.
4219+        """
4220+        assert isinstance(sharehashes, dict)
4221+        for h in sharehashes.itervalues():
4222+            assert len(h) == HASH_SIZE
4223+
4224+        # serialize the sharehashes, then set them.
4225+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4226+                                 for i in sorted(sharehashes.keys())])
4227+        self._share_pieces['share_hash_chain'] = sharehashes_s
4228+
4229+        return defer.succeed(None)
4230+
4231+
4232+    def put_root_hash(self, root_hash):
4233+        """
4234+        Add the root hash to the share.
4235+        """
4236+        assert len(root_hash) == HASH_SIZE
4237+
4238+        self._share_pieces['root_hash'] = root_hash
4239+
4240+        return defer.succeed(None)
4241+
4242+
4243+    def put_salt(self, salt):
4244+        """
4245+        Add a salt to an empty SDMF file.
4246+        """
4247+        assert len(salt) == SALT_SIZE
4248+
4249+        self._share_pieces['salt'] = salt
4250+        self._share_pieces['sharedata'] = ""
4251+
4252+
4253+    def get_signable(self):
4254+        """
4255+        Return the part of the share that needs to be signed.
4256+
4257+        SDMF writers need to sign the packed representation of the
4258+        first eight fields of the remote share, that is:
4259+            - version number (0)
4260+            - sequence number
4261+            - root of the share hash tree
4262+            - salt
4263+            - k
4264+            - n
4265+            - segsize
4266+            - datalen
4267+
4268+        This method is responsible for returning that to callers.
4269+        """
4270+        return struct.pack(SIGNED_PREFIX,
4271+                           0,
4272+                           self._seqnum,
4273+                           self._share_pieces['root_hash'],
4274+                           self._share_pieces['salt'],
4275+                           self._required_shares,
4276+                           self._total_shares,
4277+                           self._segment_size,
4278+                           self._data_length)
4279+
4280+
4281+    def put_signature(self, signature):
4282+        """
4283+        Add the signature to the share.
4284+        """
4285+        self._share_pieces['signature'] = signature
4286+
4287+        return defer.succeed(None)
4288+
4289+
4290+    def put_verification_key(self, verification_key):
4291+        """
4292+        Add the verification key to the share.
4293+        """
4294+        self._share_pieces['verification_key'] = verification_key
4295+
4296+        return defer.succeed(None)
4297+
4298+
4299+    def get_verinfo(self):
4300+        """
4301+        I return my verinfo tuple. This is used by the ServermapUpdater
4302+        to keep track of versions of mutable files.
4303+
4304+        The verinfo tuple for MDMF files contains:
4305+            - seqnum
4306+            - root hash
4307+            - a blank (nothing)
4308+            - segsize
4309+            - datalen
4310+            - k
4311+            - n
4312+            - prefix (the thing that you sign)
4313+            - a tuple of offsets
4314+
4315+        We include the nonce in MDMF to simplify processing of version
4316+        information tuples.
4317+
4318+        The verinfo tuple for SDMF files is the same, but contains a
4319+        16-byte IV instead of a hash of salts.
4320+        """
4321+        return (self._seqnum,
4322+                self._share_pieces['root_hash'],
4323+                self._share_pieces['salt'],
4324+                self._segment_size,
4325+                self._data_length,
4326+                self._required_shares,
4327+                self._total_shares,
4328+                self.get_signable(),
4329+                self._get_offsets_tuple())
4330+
4331+    def _get_offsets_dict(self):
4332+        post_offset = HEADER_LENGTH
4333+        offsets = {}
4334+
4335+        verification_key_length = len(self._share_pieces['verification_key'])
4336+        o1 = offsets['signature'] = post_offset + verification_key_length
4337+
4338+        signature_length = len(self._share_pieces['signature'])
4339+        o2 = offsets['share_hash_chain'] = o1 + signature_length
4340+
4341+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
4342+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
4343+
4344+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
4345+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
4346+
4347+        share_data_length = len(self._share_pieces['sharedata'])
4348+        o5 = offsets['enc_privkey'] = o4 + share_data_length
4349+
4350+        encprivkey_length = len(self._share_pieces['encprivkey'])
4351+        offsets['EOF'] = o5 + encprivkey_length
4352+        return offsets
4353+
4354+
4355+    def _get_offsets_tuple(self):
4356+        offsets = self._get_offsets_dict()
4357+        return tuple([(key, value) for key, value in offsets.items()])
4358+
4359+
4360+    def _pack_offsets(self):
4361+        offsets = self._get_offsets_dict()
4362+        return struct.pack(">LLLLQQ",
4363+                           offsets['signature'],
4364+                           offsets['share_hash_chain'],
4365+                           offsets['block_hash_tree'],
4366+                           offsets['share_data'],
4367+                           offsets['enc_privkey'],
4368+                           offsets['EOF'])
4369+
4370+
4371+    def finish_publishing(self):
4372+        """
4373+        Do anything necessary to finish writing the share to a remote
4374+        server. I require that no further publishing needs to take place
4375+        after this method has been called.
4376+        """
4377+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
4378+                  "share_hash_chain", "block_hash_tree"]:
4379+            assert k in self._share_pieces
4380+        # This is the only method that actually writes something to the
4381+        # remote server.
4382+        # First, we need to pack the share into data that we can write
4383+        # to the remote server in one write.
4384+        offsets = self._pack_offsets()
4385+        prefix = self.get_signable()
4386+        final_share = "".join([prefix,
4387+                               offsets,
4388+                               self._share_pieces['verification_key'],
4389+                               self._share_pieces['signature'],
4390+                               self._share_pieces['share_hash_chain'],
4391+                               self._share_pieces['block_hash_tree'],
4392+                               self._share_pieces['sharedata'],
4393+                               self._share_pieces['encprivkey']])
4394+
4395+        # Our only data vector is going to be writing the final share,
4396+        # in its entirety.
4397+        datavs = [(0, final_share)]
4398+
4399+        if not self._testvs:
4400+            # Our caller has not provided us with another checkstring
4401+            # yet, so we assume that we are writing a new share, and set
4402+            # a test vector that will allow a new share to be written.
4403+            self._testvs = []
4404+            self._testvs.append(tuple([0, 1, "eq", ""]))
4405+            new_share = True
4406+
4407+        tw_vectors = {}
4408+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
4409+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
4410+                                     self._storage_index,
4411+                                     self._secrets,
4412+                                     tw_vectors,
4413+                                     # TODO is it useful to read something?
4414+                                     self._readvs)
4415+
4416+
4417+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
4418+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
4419+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
4420+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4421+MDMFCHECKSTRING = ">BQ32s"
4422+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
4423+MDMFOFFSETS = ">QQQQQQ"
4424+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
4425+
4426+class MDMFSlotWriteProxy:
4427+    implements(IMutableSlotWriter)
4428+
4429+    """
4430+    I represent a remote write slot for an MDMF mutable file.
4431+
4432+    I abstract away from my caller the details of block and salt
4433+    management, and the implementation of the on-disk format for MDMF
4434+    shares.
4435+    """
4436+    # Expected layout, MDMF:
4437+    # offset:     size:       name:
4438+    #-- signed part --
4439+    # 0           1           version number (01)
4440+    # 1           8           sequence number
4441+    # 9           32          share tree root hash
4442+    # 41          1           The "k" encoding parameter
4443+    # 42          1           The "N" encoding parameter
4444+    # 43          8           The segment size of the uploaded file
4445+    # 51          8           The data length of the original plaintext
4446+    #-- end signed part --
4447+    # 59          8           The offset of the encrypted private key
4448+    # 67          8           The offset of the block hash tree
4449+    # 75          8           The offset of the share hash chain
4450+    # 83          8           The offset of the signature
4451+    # 91          8           The offset of the verification key
4452+    # 99          8           The offset of the EOF
4453+    #
4454+    # followed by salts and share data, the encrypted private key, the
4455+    # block hash tree, the salt hash tree, the share hash chain, a
4456+    # signature over the first eight fields, and a verification key.
4457+    #
4458+    # The checkstring is the first three fields -- the version number,
4459+    # sequence number, and root hash. This is consistent
4460+    # in meaning to what we have with SDMF files, except now instead of
4461+    # using the literal salt, we use a value derived from all of the
4462+    # salts -- the share hash root.
4463+    #
4464+    # The salt is stored before the block for each segment. The block
4465+    # hash tree is computed over the combination of block and salt for
4466+    # each segment. In this way, we get integrity checking for both
4467+    # block and salt with the current block hash tree arrangement.
4468+    #
4469+    # The ordering of the offsets is different to reflect the dependencies
4470+    # that we'll run into with an MDMF file. The expected write flow is
4471+    # something like this:
4472+    #
4473+    #   0: Initialize with the sequence number, encoding parameters and
4474+    #      data length. From this, we can deduce the number of segments,
4475+    #      and where they should go. We can also figure out where the
4476+    #      encrypted private key should go, because we can figure out how
4477+    #      big the share data will be.
4478+    #
4479+    #   1: Encrypt, encode, and upload the file in chunks. Do something
4480+    #      like
4481+    #
4482+    #       put_block(data, segnum, salt)
4483+    #
4484+    #      to write a block and a salt to the disk. We can do both of
4485+    #      these operations now because we have enough of the offsets to
4486+    #      know where to put them.
4487+    #
4488+    #   2: Put the encrypted private key. Use:
4489+    #
4490+    #        put_encprivkey(encprivkey)
4491+    #
4492+    #      Now that we know the length of the private key, we can fill
4493+    #      in the offset for the block hash tree.
4494+    #
4495+    #   3: We're now in a position to upload the block hash tree for
4496+    #      a share. Put that using something like:
4497+    #       
4498+    #        put_blockhashes(block_hash_tree)
4499+    #
4500+    #      Note that block_hash_tree is a list of hashes -- we'll take
4501+    #      care of the details of serializing that appropriately. When
4502+    #      we get the block hash tree, we are also in a position to
4503+    #      calculate the offset for the share hash chain, and fill that
4504+    #      into the offsets table.
4505+    #
4506+    #   4: At the same time, we're in a position to upload the salt hash
4507+    #      tree. This is a Merkle tree over all of the salts. We use a
4508+    #      Merkle tree so that we can validate each block,salt pair as
4509+    #      we download them later. We do this using
4510+    #
4511+    #        put_salthashes(salt_hash_tree)
4512+    #
4513+    #      When you do this, I automatically put the root of the tree
4514+    #      (the hash at index 0 of the list) in its appropriate slot in
4515+    #      the signed prefix of the share.
4516+    #
4517+    #   5: We're now in a position to upload the share hash chain for
4518+    #      a share. Do that with something like:
4519+    #     
4520+    #        put_sharehashes(share_hash_chain)
4521+    #
4522+    #      share_hash_chain should be a dictionary mapping shnums to
4523+    #      32-byte hashes -- the wrapper handles serialization.
4524+    #      We'll know where to put the signature at this point, also.
4525+    #      The root of this tree will be put explicitly in the next
4526+    #      step.
4527+    #
4528+    #      TODO: Why? Why not just include it in the tree here?
4529+    #
4530+    #   6: Before putting the signature, we must first put the
4531+    #      root_hash. Do this with:
4532+    #
4533+    #        put_root_hash(root_hash).
4534+    #     
4535+    #      In terms of knowing where to put this value, it was always
4536+    #      possible to place it, but it makes sense semantically to
4537+    #      place it after the share hash tree, so that's why you do it
4538+    #      in this order.
4539+    #
4540+    #   7: With the root hash put, we can now sign the header. Use:
4541+    #
4542+    #        get_signable()
4543+    #
4544+    #      to get the part of the header that you want to sign, and use:
4545+    #       
4546+    #        put_signature(signature)
4547+    #
4548+    #      to write your signature to the remote server.
4549+    #
4550+    #   8: Add the verification key, and finish. Do:
4551+    #
4552+    #        put_verification_key(key)
4553+    #
4554+    #      and
4555+    #
4556+    #        finish_publishing()
4557+    #
4558+    # Checkstring management:
4559+    #
4560+    # To write to a mutable slot, we have to provide test vectors to ensure
4561+    # that we are writing to the same data that we think we are. These
4562+    # vectors allow us to detect uncoordinated writes; that is, writes
4563+    # where both we and some other shareholder are writing to the
4564+    # mutable slot, and to report those back to the parts of the program
4565+    # doing the writing.
4566+    #
4567+    # With SDMF, this was easy -- all of the share data was written in
4568+    # one go, so it was easy to detect uncoordinated writes, and we only
4569+    # had to do it once. With MDMF, not all of the file is written at
4570+    # once.
4571+    #
4572+    # If a share is new, we write out as much of the header as we can
4573+    # before writing out anything else. This gives other writers a
4574+    # canary that they can use to detect uncoordinated writes, and, if
4575+    # they do the same thing, gives us the same canary. We then update
4576+    # the share. We won't be able to write out two fields of the header
4577+    # -- the share tree hash and the salt hash -- until we finish
4578+    # writing out the share. We only require the writer to provide the
4579+    # initial checkstring, and keep track of what it should be after
4580+    # updates ourselves.
4581+    #
4582+    # If we haven't written anything yet, then on the first write (which
4583+    # will probably be a block + salt of a share), we'll also write out
4584+    # the header. On subsequent passes, we'll expect to see the header.
4585+    # This changes in two places:
4586+    #
4587+    #   - When we write out the salt hash
4588+    #   - When we write out the root of the share hash tree
4589+    #
4590+    # since these values will change the header. It is possible that we
4591+    # can just make those be written in one operation to minimize
4592+    # disruption.
4593+    def __init__(self,
4594+                 shnum,
4595+                 rref, # a remote reference to a storage server
4596+                 storage_index,
4597+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4598+                 seqnum, # the sequence number of the mutable file
4599+                 required_shares,
4600+                 total_shares,
4601+                 segment_size,
4602+                 data_length): # the length of the original file
4603+        self.shnum = shnum
4604+        self._rref = rref
4605+        self._storage_index = storage_index
4606+        self._seqnum = seqnum
4607+        self._required_shares = required_shares
4608+        assert self.shnum >= 0 and self.shnum < total_shares
4609+        self._total_shares = total_shares
4610+        # We build up the offset table as we write things. It is the
4611+        # last thing we write to the remote server.
4612+        self._offsets = {}
4613+        self._testvs = []
4614+        self._secrets = secrets
4615+        # The segment size needs to be a multiple of the k parameter --
4616+        # any padding should have been carried out by the publisher
4617+        # already.
4618+        assert segment_size % required_shares == 0
4619+        self._segment_size = segment_size
4620+        self._data_length = data_length
4621+
4622+        # These are set later -- we define them here so that we can
4623+        # check for their existence easily
4624+
4625+        # This is the root of the share hash tree -- the Merkle tree
4626+        # over the roots of the block hash trees computed for shares in
4627+        # this upload.
4628+        self._root_hash = None
4629+
4630+        # We haven't yet written anything to the remote bucket. By
4631+        # setting this, we tell the _write method as much. The write
4632+        # method will then know that it also needs to add a write vector
4633+        # for the checkstring (or what we have of it) to the first write
4634+        # request. We'll then record that value for future use.  If
4635+        # we're expecting something to be there already, we need to call
4636+        # set_checkstring before we write anything to tell the first
4637+        # write about that.
4638+        self._written = False
4639+
4640+        # When writing data to the storage servers, we get a read vector
4641+        # for free. We'll read the checkstring, which will help us
4642+        # figure out what's gone wrong if a write fails.
4643+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
4644+
4645+        # We calculate the number of segments because it tells us
4646+        # where the salt part of the file ends/share segment begins,
4647+        # and also because it provides a useful amount of bounds checking.
4648+        self._num_segments = mathutil.div_ceil(self._data_length,
4649+                                               self._segment_size)
4650+        self._block_size = self._segment_size / self._required_shares
4651+        # We also calculate the share size, to help us with block
4652+        # constraints later.
4653+        tail_size = self._data_length % self._segment_size
4654+        if not tail_size:
4655+            self._tail_block_size = self._block_size
4656+        else:
4657+            self._tail_block_size = mathutil.next_multiple(tail_size,
4658+                                                           self._required_shares)
4659+            self._tail_block_size /= self._required_shares
4660+
4661+        # We already know where the sharedata starts; right after the end
4662+        # of the header (which is defined as the signable part + the offsets)
4663+        # We can also calculate where the encrypted private key begins
4664+        # from what we now know.
4665+        self._actual_block_size = self._block_size + SALT_SIZE
4666+        data_size = self._actual_block_size * (self._num_segments - 1)
4667+        data_size += self._tail_block_size
4668+        data_size += SALT_SIZE
4669+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
4670+        self._offsets['enc_privkey'] += data_size
4671+        # We'll wait for the rest. Callers can now call my "put_block" and
4672+        # "set_checkstring" methods.
4673+
4674+
4675+    def set_checkstring(self,
4676+                        seqnum_or_checkstring,
4677+                        root_hash=None,
4678+                        salt=None):
4679+        """
4680+        Set the checkstring for the given shnum.
4681+
4682+        This can be invoked in one of two ways.
4683+
4684+        With one argument, I assume that you are giving me a literal
4685+        checkstring -- e.g., the output of get_checkstring. I will then
4686+        set that checkstring as it is. This form is used by unit tests.
4687+
4688+        With two arguments, I assume that you are giving me a sequence
4689+        number and root hash to make a checkstring from. In that case, I
4690+        will build a checkstring and set it for you. This form is used
4691+        by the publisher.
4692+
4693+        By default, I assume that I am writing new shares to the grid.
4694+        If you don't explicitly set your own checkstring, I will use
4695+        one that requires that the remote share not exist. You will want
4696+        to use this method if you are updating a share in-place;
4697+        otherwise, writes will fail.
4698+        """
4699+        # You're allowed to overwrite checkstrings with this method;
4700+        # I assume that users know what they are doing when they call
4701+        # it.
4702+        if root_hash:
4703+            checkstring = struct.pack(MDMFCHECKSTRING,
4704+                                      1,
4705+                                      seqnum_or_checkstring,
4706+                                      root_hash)
4707+        else:
4708+            checkstring = seqnum_or_checkstring
4709+
4710+        if checkstring == "":
4711+            # We special-case this, since len("") = 0, but we need
4712+            # length of 1 for the case of an empty share to work on the
4713+            # storage server, which is what a checkstring that is the
4714+            # empty string means.
4715+            self._testvs = []
4716+        else:
4717+            self._testvs = []
4718+            self._testvs.append((0, len(checkstring), "eq", checkstring))
4719+
4720+
4721+    def __repr__(self):
4722+        return "MDMFSlotWriteProxy for share %d" % self.shnum
4723+
4724+
4725+    def get_checkstring(self):
4726+        """
4727+        Given a share number, I return a representation of what the
4728+        checkstring for that share on the server will look like.
4729+
4730+        I am mostly used for tests.
4731+        """
4732+        if self._root_hash:
4733+            roothash = self._root_hash
4734+        else:
4735+            roothash = "\x00" * 32
4736+        return struct.pack(MDMFCHECKSTRING,
4737+                           1,
4738+                           self._seqnum,
4739+                           roothash)
4740+
4741+
4742+    def put_block(self, data, segnum, salt):
4743+        """
4744+        Put the encrypted-and-encoded data segment in the slot, along
4745+        with the salt.
4746+        """
4747+        if segnum >= self._num_segments:
4748+            raise LayoutInvalid("I won't overwrite the private key")
4749+        if len(salt) != SALT_SIZE:
4750+            raise LayoutInvalid("I was given a salt of size %d, but "
4751+                                "I wanted a salt of size %d")
4752+        if segnum + 1 == self._num_segments:
4753+            if len(data) != self._tail_block_size:
4754+                raise LayoutInvalid("I was given the wrong size block to write")
4755+        elif len(data) != self._block_size:
4756+            raise LayoutInvalid("I was given the wrong size block to write")
4757+
4758+        # We want to write at len(MDMFHEADER) + segnum * block_size.
4759+
4760+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
4761+        data = salt + data
4762+
4763+        datavs = [tuple([offset, data])]
4764+        return self._write(datavs)
4765+
4766+
4767+    def put_encprivkey(self, encprivkey):
4768+        """
4769+        Put the encrypted private key in the remote slot.
4770+        """
4771+        assert self._offsets
4772+        assert self._offsets['enc_privkey']
4773+        # You shouldn't re-write the encprivkey after the block hash
4774+        # tree is written, since that could cause the private key to run
4775+        # into the block hash tree. Before it writes the block hash
4776+        # tree, the block hash tree writing method writes the offset of
4777+        # the share hash chain. So that's a good indicator of whether or
4778+        # not the block hash tree has been written.
4779+        if "share_hash_chain" in self._offsets:
4780+            raise LayoutInvalid("You must write this before the block hash tree")
4781+
4782+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + len(encprivkey)
4783+        datavs = [(tuple([self._offsets['enc_privkey'], encprivkey]))]
4784+        def _on_failure():
4785+            del(self._offsets['block_hash_tree'])
4786+        return self._write(datavs, on_failure=_on_failure)
4787+
4788+
4789+    def put_blockhashes(self, blockhashes):
4790+        """
4791+        Put the block hash tree in the remote slot.
4792+
4793+        The encrypted private key must be put before the block hash
4794+        tree, since we need to know how large it is to know where the
4795+        block hash tree should go. The block hash tree must be put
4796+        before the share hash chain, since its size determines the
4797+        offset of the share hash chain.
4798+        """
4799+        assert self._offsets
4800+        assert isinstance(blockhashes, list)
4801+        if "block_hash_tree" not in self._offsets:
4802+            raise LayoutInvalid("You must put the encrypted private key "
4803+                                "before you put the block hash tree")
4804+        # If written, the share hash chain causes the signature offset
4805+        # to be defined.
4806+        if "signature" in self._offsets:
4807+            raise LayoutInvalid("You must put the block hash tree before "
4808+                                "you put the share hash chain")
4809+        blockhashes_s = "".join(blockhashes)
4810+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
4811+        datavs = []
4812+        datavs.append(tuple([self._offsets['block_hash_tree'], blockhashes_s]))
4813+        def _on_failure():
4814+            del(self._offsets['share_hash_chain'])
4815+        return self._write(datavs, on_failure=_on_failure)
4816+
4817+
4818+    def put_sharehashes(self, sharehashes):
4819+        """
4820+        Put the share hash chain in the remote slot.
4821+
4822+        The block hash tree must be put before the share hash chain,
4823+        since we need to know where the block hash tree ends before we
4824+        can know where the share hash chain starts. The share hash chain
4825+        must be put before the signature, since the length of the packed
4826+        share hash chain determines the offset of the signature. Also,
4827+        semantically, you must know what the root of the salt hash tree
4828+        is before you can generate a valid signature.
4829+        """
4830+        assert isinstance(sharehashes, dict)
4831+        if "share_hash_chain" not in self._offsets:
4832+            raise LayoutInvalid("You need to put the block hash tree before "
4833+                                "you can put the share hash chain")
4834+        # The signature comes after the share hash chain. If the
4835+        # signature has already been written, we must not write another
4836+        # share hash chain. The signature writes the verification key
4837+        # offset when it gets sent to the remote server, so we look for
4838+        # that.
4839+        if "verification_key" in self._offsets:
4840+            raise LayoutInvalid("You must write the share hash chain "
4841+                                "before you write the signature")
4842+        datavs = []
4843+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4844+                                  for i in sorted(sharehashes.keys())])
4845+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
4846+        datavs.append(tuple([self._offsets['share_hash_chain'], sharehashes_s]))
4847+        def _on_failure():
4848+            del(self._offsets['signature'])
4849+        return self._write(datavs, on_failure=_on_failure)
4850+
4851+
4852+    def put_root_hash(self, roothash):
4853+        """
4854+        Put the root hash (the root of the share hash tree) in the
4855+        remote slot.
4856+        """
4857+        # It does not make sense to be able to put the root
4858+        # hash without first putting the share hashes, since you need
4859+        # the share hashes to generate the root hash.
4860+        #
4861+        # Signature is defined by the routine that places the share hash
4862+        # chain, so it's a good thing to look for in finding out whether
4863+        # or not the share hash chain exists on the remote server.
4864+        if "signature" not in self._offsets:
4865+            raise LayoutInvalid("You need to put the share hash chain "
4866+                                "before you can put the root share hash")
4867+        if len(roothash) != HASH_SIZE:
4868+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
4869+                                 % HASH_SIZE)
4870+        datavs = []
4871+        self._root_hash = roothash
4872+        # To write the root hash, we update the checkstring on the
4873+        # remote server, since the checkstring includes it.
4874+        checkstring = self.get_checkstring()
4875+        datavs.append(tuple([0, checkstring]))
4876+        # This write, if successful, changes the checkstring, so we need
4877+        # to update our internal checkstring to be consistent with the
4878+        # one on the server.
4879+        def _on_success():
4880+            self._testvs = [(0, len(checkstring), "eq", checkstring)]
4881+        def _on_failure():
4882+            self._root_hash = None
4883+        return self._write(datavs,
4884+                           on_success=_on_success,
4885+                           on_failure=_on_failure)
4886+
4887+
4888+    def get_signable(self):
4889+        """
4890+        Get the first seven fields of the mutable file; the parts that
4891+        are signed.
4892+        """
4893+        if not self._root_hash:
4894+            raise LayoutInvalid("You need to set the root hash "
4895+                                "before getting something to "
4896+                                "sign")
4897+        return struct.pack(MDMFSIGNABLEHEADER,
4898+                           1,
4899+                           self._seqnum,
4900+                           self._root_hash,
4901+                           self._required_shares,
4902+                           self._total_shares,
4903+                           self._segment_size,
4904+                           self._data_length)
4905+
4906+
4907+    def put_signature(self, signature):
4908+        """
4909+        Put the signature field to the remote slot.
4910+
4911+        I require that the root hash and share hash chain have been put
4912+        to the grid before I will write the signature to the grid.
4913+        """
4914+        # It does not make sense to put a signature without first
4915+        # putting the root hash and the share hash chain, since the
4916+        # signature would otherwise be incomplete, so we don't allow it.
4917+        if "signature" not in self._offsets:
4918+            raise LayoutInvalid("You must put the share hash chain "
4919+                                "before putting the signature")
4920+        if not self._root_hash:
4921+            raise LayoutInvalid("You must complete the signed prefix "
4922+                                "before computing a signature")
4923+        # If we put the signature after we put the verification key, we
4924+        # could end up running into the verification key, and will
4925+        # probably screw up the offsets as well. So we don't allow that.
4926+        # The method that writes the verification key defines the EOF
4927+        # offset before writing the verification key, so look for that.
4928+        if "EOF" in self._offsets:
4929+            raise LayoutInvalid("You must write the signature before the verification key")
4930+
4931+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
4932+        datavs = []
4933+        datavs.append(tuple([self._offsets['signature'], signature]))
4934+        def _on_failure():
4935+            del(self._offsets['verification_key'])
4936+        return self._write(datavs, on_failure=_on_failure)
4937+
4938+
4939+    def put_verification_key(self, verification_key):
4940+        """
4941+        Put the verification key into the remote slot.
4942+
4943+        I require that the signature have been written to the storage
4944+        server before I allow the verification key to be written to the
4945+        remote server.
4946+        """
4947+        if "verification_key" not in self._offsets:
4948+            raise LayoutInvalid("You must put the signature before you "
4949+                                "can put the verification key")
4950+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
4951+        datavs = []
4952+        datavs.append(tuple([self._offsets['verification_key'], verification_key]))
4953+        def _on_failure():
4954+            del(self._offsets['EOF'])
4955+        return self._write(datavs, on_failure=_on_failure)
4956+
4957+    def _get_offsets_tuple(self):
4958+        return tuple([(key, value) for key, value in self._offsets.items()])
4959+
4960+    def get_verinfo(self):
4961+        return (self._seqnum,
4962+                self._root_hash,
4963+                self._required_shares,
4964+                self._total_shares,
4965+                self._segment_size,
4966+                self._data_length,
4967+                self.get_signable(),
4968+                self._get_offsets_tuple())
4969+
4970+
4971+    def finish_publishing(self):
4972+        """
4973+        Write the offset table and encoding parameters to the remote
4974+        slot, since that's the only thing we have yet to publish at this
4975+        point.
4976+        """
4977+        if "EOF" not in self._offsets:
4978+            raise LayoutInvalid("You must put the verification key before "
4979+                                "you can publish the offsets")
4980+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4981+        offsets = struct.pack(MDMFOFFSETS,
4982+                              self._offsets['enc_privkey'],
4983+                              self._offsets['block_hash_tree'],
4984+                              self._offsets['share_hash_chain'],
4985+                              self._offsets['signature'],
4986+                              self._offsets['verification_key'],
4987+                              self._offsets['EOF'])
4988+        datavs = []
4989+        datavs.append(tuple([offsets_offset, offsets]))
4990+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
4991+        params = struct.pack(">BBQQ",
4992+                             self._required_shares,
4993+                             self._total_shares,
4994+                             self._segment_size,
4995+                             self._data_length)
4996+        datavs.append(tuple([encoding_parameters_offset, params]))
4997+        return self._write(datavs)
4998+
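+    # Taken together, the LayoutInvalid guards above enforce one publish
+    # order per share, which the MDMFProxies tests follow: put_block for
+    # each segment, then put_encprivkey, put_blockhashes, put_sharehashes,
+    # put_root_hash, put_signature, put_verification_key, and finally
+    # finish_publishing to write the offset table and encoding parameters.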
4999+
5000+    def _write(self, datavs, on_failure=None, on_success=None):
5001+        """I write the data vectors in datavs to the remote slot."""
5002+        tw_vectors = {}
5003+        new_share = False
5004+        if not self._testvs:
5005+            self._testvs = []
5006+            self._testvs.append(tuple([0, 1, "eq", ""]))
5007+            new_share = True
5008+        if not self._written:
5009+            # Write a new checkstring to the share when we write it, so
5010+            # that we have something to check later.
5011+            new_checkstring = self.get_checkstring()
5012+            datavs.append((0, new_checkstring))
5013+            def _first_write():
5014+                self._written = True
5015+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
5016+            on_success = _first_write
5017+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5018+        datalength = sum([len(x[1]) for x in datavs])
5019+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5020+                                  self._storage_index,
5021+                                  self._secrets,
5022+                                  tw_vectors,
5023+                                  self._readv)
5024+        def _result(results):
5025+            if isinstance(results, failure.Failure) or not results[0]:
5026+                # Do nothing; the write was unsuccessful.
5027+                if on_failure: on_failure()
5028+            else:
5029+                if on_success: on_success()
5030+            return results
5031+        d.addCallback(_result)
5032+        return d
5033+
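+    # Every put_* method above funnels through _write, which guards the
+    # write with a test vector comparing the expected checkstring against
+    # what is on the server. If an uncoordinated writer has changed the
+    # share, the test vector fails, results[0] is False, and the
+    # on_failure callback undoes the offset bookkeeping that the caller
+    # had speculatively recorded.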
5034+
5035+class MDMFSlotReadProxy:
5036+    """
5037+    I read from a mutable slot filled with data written in the MDMF data
5038+    format (which is described above).
5039+
5040+    I can be initialized with some amount of data, which I will use (if
5041+    it is valid) to eliminate some of the need to fetch it from servers.
5042+    """
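+    # Typical use, mirroring the MDMFProxies tests in test_storage.py
+    # (process_block below stands in for whatever handler the caller
+    # supplies):
+    #
+    #   mr = MDMFSlotReadProxy(rref, storage_index, shnum)
+    #   d = mr.get_block_and_salt(0)  # Deferred firing with (block, salt)
+    #   d.addCallback(process_block)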
5043+    def __init__(self,
5044+                 rref,
5045+                 storage_index,
5046+                 shnum,
5047+                 data=""):
5048+        # Start the initialization process.
5049+        self._rref = rref
5050+        self._storage_index = storage_index
5051+        self.shnum = shnum
5052+
5053+        # Before doing anything, the reader is probably going to want to
5054+        # verify that the signature is correct. To do that, they'll need
5055+        # the verification key, and the signature. To get those, we'll
5056+        # need the offset table. So fetch the offset table on the
5057+        # assumption that that will be the first thing that a reader is
5058+        # going to do.
5059+
5060+        # The fact that these encoding parameters are None tells us
5061+        # that we haven't yet fetched them from the remote share, so we
5062+        # should. We could just not set them, but the checks will be
5063+        # easier to read if we don't have to use hasattr.
5064+        self._version_number = None
5065+        self._sequence_number = None
5066+        self._root_hash = None
5067+        # Filled in if we're dealing with an SDMF file. Unused
5068+        # otherwise.
5069+        self._salt = None
5070+        self._required_shares = None
5071+        self._total_shares = None
5072+        self._segment_size = None
5073+        self._data_length = None
5074+        self._offsets = None
5075+
5076+        # If the user has chosen to initialize us with some data, we'll
5077+        # try to satisfy subsequent data requests with that data before
5078+        # asking the storage server for it.
5079+        self._data = data
5080+        # The filenode's cache hands us None when there isn't any
5081+        # cached data, but the way we index the cached data requires a
5082+        # string, so convert None to "".
5083+        if self._data == None:
5084+            self._data = ""
5085+
5086+        self._queue_observers = observer.ObserverList()
5087+        self._queue_errbacks = observer.ObserverList()
5088+        self._readvs = []
5089+
5090+
5091+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
5092+        """
5093+        I fetch the offset table and the header from the remote slot if
5094+        I don't already have them. If I do have them, I do nothing and
5095+        return an empty Deferred.
5096+        """
5097+        if self._offsets:
5098+            return defer.succeed(None)
5099+        # At this point, we may be either SDMF or MDMF. Fetching 107
5100+        # bytes will be enough to get header and offsets for both SDMF and
5101+        # MDMF, though we'll be left with 4 more bytes than we
5102+        # need if this ends up being MDMF. This is probably less
5103+        # expensive than the cost of a second roundtrip.
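+        # (For reference, the SDMF arithmetic: the 75-byte signed prefix,
+        # struct.calcsize(">BQ32s16s BBQQ"), plus the 32-byte offset
+        # table, struct.calcsize(">LLLLQQ"), comes to exactly 107 bytes.)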
5104+        readvs = [(0, 107)]
5105+        d = self._read(readvs, force_remote)
5106+        d.addCallback(self._process_encoding_parameters)
5107+        d.addCallback(self._process_offsets)
5108+        return d
5109+
5110+
5111+    def _process_encoding_parameters(self, encoding_parameters):
5112+        assert self.shnum in encoding_parameters
5113+        encoding_parameters = encoding_parameters[self.shnum][0]
5114+        # The first byte is the version number. It will tell us what
5115+        # to do next.
5116+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
5117+        if verno == MDMF_VERSION:
5118+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
5119+            (verno,
5120+             seqnum,
5121+             root_hash,
5122+             k,
5123+             n,
5124+             segsize,
5125+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
5126+                                      encoding_parameters[:read_size])
5127+            if segsize == 0 and datalen == 0:
5128+                # Empty file, no segments.
5129+                self._num_segments = 0
5130+            else:
5131+                self._num_segments = mathutil.div_ceil(datalen, segsize)
5132+
5133+        elif verno == SDMF_VERSION:
5134+            read_size = SIGNED_PREFIX_LENGTH
5135+            (verno,
5136+             seqnum,
5137+             root_hash,
5138+             salt,
5139+             k,
5140+             n,
5141+             segsize,
5142+             datalen) = struct.unpack(">BQ32s16s BBQQ",
5143+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
5144+            self._salt = salt
5145+            if segsize == 0 and datalen == 0:
5146+                # empty file
5147+                self._num_segments = 0
5148+            else:
5149+                # non-empty SDMF files have one segment.
5150+                self._num_segments = 1
5151+        else:
5152+            raise UnknownVersionError("You asked me to read mutable file "
5153+                                      "version %d, but I only understand "
5154+                                      "%d and %d" % (verno, SDMF_VERSION,
5155+                                                     MDMF_VERSION))
5156+
5157+        self._version_number = verno
5158+        self._sequence_number = seqnum
5159+        self._root_hash = root_hash
5160+        self._required_shares = k
5161+        self._total_shares = n
5162+        self._segment_size = segsize
5163+        self._data_length = datalen
5164+
5165+        self._block_size = self._segment_size / self._required_shares
5166+        # We can upload empty files, and need to account for this fact
5167+        # so as to avoid zero-division and zero-modulo errors.
5168+        if datalen > 0:
5169+            tail_size = self._data_length % self._segment_size
5170+        else:
5171+            tail_size = 0
5172+        if not tail_size:
5173+            self._tail_block_size = self._block_size
5174+        else:
5175+            self._tail_block_size = mathutil.next_multiple(tail_size,
5176+                                                    self._required_shares)
5177+            self._tail_block_size /= self._required_shares
5178+
5179+        return encoding_parameters
5180+
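+    # A worked example of the tail computation, using the data from the
+    # tail-segment test below: with k=3, segsize=6, and datalen=33,
+    # tail_size = 33 % 6 = 3, so the tail block is
+    # mathutil.next_multiple(3, 3) / 3 = 1 byte, while a full block is
+    # 6 / 3 = 2 bytes.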
5181+
5182+    def _process_offsets(self, offsets):
5183+        if self._version_number == 0:
5184+            read_size = OFFSETS_LENGTH
5185+            read_offset = SIGNED_PREFIX_LENGTH
5186+            end = read_size + read_offset
5187+            (signature,
5188+             share_hash_chain,
5189+             block_hash_tree,
5190+             share_data,
5191+             enc_privkey,
5192+             EOF) = struct.unpack(">LLLLQQ",
5193+                                  offsets[read_offset:end])
5194+            self._offsets = {}
5195+            self._offsets['signature'] = signature
5196+            self._offsets['share_data'] = share_data
5197+            self._offsets['block_hash_tree'] = block_hash_tree
5198+            self._offsets['share_hash_chain'] = share_hash_chain
5199+            self._offsets['enc_privkey'] = enc_privkey
5200+            self._offsets['EOF'] = EOF
5201+
5202+        elif self._version_number == 1:
5203+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
5204+            read_length = MDMFOFFSETS_LENGTH
5205+            end = read_offset + read_length
5206+            (encprivkey,
5207+             blockhashes,
5208+             sharehashes,
5209+             signature,
5210+             verification_key,
5211+             eof) = struct.unpack(MDMFOFFSETS,
5212+                                  offsets[read_offset:end])
5213+            self._offsets = {}
5214+            self._offsets['enc_privkey'] = encprivkey
5215+            self._offsets['block_hash_tree'] = blockhashes
5216+            self._offsets['share_hash_chain'] = sharehashes
5217+            self._offsets['signature'] = signature
5218+            self._offsets['verification_key'] = verification_key
5219+            self._offsets['EOF'] = eof
5220+
5221+
5222+    def get_block_and_salt(self, segnum, queue=False):
5223+        """
5224+        I return (block, salt), where block is the block data and
5225+        salt is the salt used to encrypt that segment.
5226+        """
5227+        d = self._maybe_fetch_offsets_and_header()
5228+        def _then(ignored):
5229+            if self._version_number == 1:
5230+                base_share_offset = MDMFHEADERSIZE
5231+            else:
5232+                base_share_offset = self._offsets['share_data']
5233+
5234+            if segnum + 1 > self._num_segments:
5235+                raise LayoutInvalid("Not a valid segment number")
5236+
5237+            if self._version_number == 0:
5238+                share_offset = base_share_offset + self._block_size * segnum
5239+            else:
5240+                share_offset = base_share_offset + (self._block_size + \
5241+                                                    SALT_SIZE) * segnum
5242+            if segnum + 1 == self._num_segments:
5243+                data = self._tail_block_size
5244+            else:
5245+                data = self._block_size
5246+
5247+            if self._version_number == 1:
5248+                data += SALT_SIZE
5249+
5250+            readvs = [(share_offset, data)]
5251+            return readvs
5252+        d.addCallback(_then)
5253+        d.addCallback(lambda readvs:
5254+            self._read(readvs, queue=queue))
5255+        def _process_results(results):
5256+            assert self.shnum in results
5257+            if self._version_number == 0:
5258+                # We only read the share data, but we know the salt from
5259+                # when we fetched the header
5260+                data = results[self.shnum]
5261+                if not data:
5262+                    data = ""
5263+                else:
5264+                    assert len(data) == 1
5265+                    data = data[0]
5266+                salt = self._salt
5267+            else:
5268+                data = results[self.shnum]
5269+                if not data:
5270+                    salt = data = ""
5271+                else:
5272+                    salt_and_data = results[self.shnum][0]
5273+                    salt = salt_and_data[:SALT_SIZE]
5274+                    data = salt_and_data[SALT_SIZE:]
5275+            return data, salt
5276+        d.addCallback(_process_results)
5277+        return d
5278+
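+    # Offset arithmetic for the MDMF branch above: each segment stores
+    # SALT_SIZE bytes of salt followed by the block, so segment i starts
+    # at MDMFHEADERSIZE + (block_size + SALT_SIZE) * i. With the test
+    # data below (2-byte blocks, 16-byte salts) that is 18 bytes per
+    # segment.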
5279+
5280+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
5281+        """
5282+        I return the block hash tree
5283+
5284+        I take an optional argument, needed, which is a set of indices
5285+        corresponding to hashes that I should fetch. If this argument is
5286+        missing, I will fetch the entire block hash tree; otherwise, I
5287+        may attempt to fetch fewer hashes, based on what needed says
5288+        that I should do. Note that I may fetch as many hashes as I
5289+        want, so long as the set of hashes that I do fetch is a superset
5290+        of the ones that I am asked for, so callers should be prepared
5291+        to tolerate additional hashes.
5292+        """
5293+        # TODO: Return only the parts of the block hash tree necessary
5294+        # to validate the blocknum provided?
5295+        # This is a good idea, but it is hard to implement correctly. It
5296+        # is bad to fetch any one block hash more than once, so we
5297+        # probably just want to fetch the whole thing at once and then
5298+        # serve it.
5299+        if needed == set([]):
5300+            return defer.succeed([])
5301+        d = self._maybe_fetch_offsets_and_header()
5302+        def _then(ignored):
5303+            blockhashes_offset = self._offsets['block_hash_tree']
5304+            if self._version_number == 1:
5305+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
5306+            else:
5307+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
5308+            readvs = [(blockhashes_offset, blockhashes_length)]
5309+            return readvs
5310+        d.addCallback(_then)
5311+        d.addCallback(lambda readvs:
5312+            self._read(readvs, queue=queue, force_remote=force_remote))
5313+        def _build_block_hash_tree(results):
5314+            assert self.shnum in results
5315+
5316+            rawhashes = results[self.shnum][0]
5317+            results = [rawhashes[i:i+HASH_SIZE]
5318+                       for i in range(0, len(rawhashes), HASH_SIZE)]
5319+            return results
5320+        d.addCallback(_build_block_hash_tree)
5321+        return d
5322+
5323+
5324+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
5325+        """
5326+        I return the part of the share hash chain placed to validate
5327+        this share.
5328+
5329+        I take an optional argument, needed. Needed is a set of indices
5330+        that correspond to the hashes that I should fetch. If needed is
5331+        not present, I will fetch and return the entire share hash
5332+        chain. Otherwise, I may fetch and return any part of the share
5333+        hash chain that is a superset of the part that I am asked to
5334+        fetch. Callers should be prepared to deal with more hashes than
5335+        they've asked for.
5336+        """
5337+        if needed == set([]):
5338+            return defer.succeed([])
5339+        d = self._maybe_fetch_offsets_and_header()
5340+
5341+        def _make_readvs(ignored):
5342+            sharehashes_offset = self._offsets['share_hash_chain']
5343+            if self._version_number == 0:
5344+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
5345+            else:
5346+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
5347+            readvs = [(sharehashes_offset, sharehashes_length)]
5348+            return readvs
5349+        d.addCallback(_make_readvs)
5350+        d.addCallback(lambda readvs:
5351+            self._read(readvs, queue=queue, force_remote=force_remote))
5352+        def _build_share_hash_chain(results):
5353+            assert self.shnum in results
5354+
5355+            sharehashes = results[self.shnum][0]
5356+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
5357+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
5358+            results = dict([struct.unpack(">H32s", data)
5359+                            for data in results])
5360+            return results
5361+        d.addCallback(_build_share_hash_chain)
5362+        return d
5363+
5364+
5365+    def get_encprivkey(self, queue=False):
5366+        """
5367+        I return the encrypted private key.
5368+        """
5369+        d = self._maybe_fetch_offsets_and_header()
5370+
5371+        def _make_readvs(ignored):
5372+            privkey_offset = self._offsets['enc_privkey']
5373+            if self._version_number == 0:
5374+                privkey_length = self._offsets['EOF'] - privkey_offset
5375+            else:
5376+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
5377+            readvs = [(privkey_offset, privkey_length)]
5378+            return readvs
5379+        d.addCallback(_make_readvs)
5380+        d.addCallback(lambda readvs:
5381+            self._read(readvs, queue=queue))
5382+        def _process_results(results):
5383+            assert self.shnum in results
5384+            privkey = results[self.shnum][0]
5385+            return privkey
5386+        d.addCallback(_process_results)
5387+        return d
5388+
5389+
5390+    def get_signature(self, queue=False):
5391+        """
5392+        I return the signature of my share.
5393+        """
5394+        d = self._maybe_fetch_offsets_and_header()
5395+
5396+        def _make_readvs(ignored):
5397+            signature_offset = self._offsets['signature']
5398+            if self._version_number == 1:
5399+                signature_length = self._offsets['verification_key'] - signature_offset
5400+            else:
5401+                signature_length = self._offsets['share_hash_chain'] - signature_offset
5402+            readvs = [(signature_offset, signature_length)]
5403+            return readvs
5404+        d.addCallback(_make_readvs)
5405+        d.addCallback(lambda readvs:
5406+            self._read(readvs, queue=queue))
5407+        def _process_results(results):
5408+            assert self.shnum in results
5409+            signature = results[self.shnum][0]
5410+            return signature
5411+        d.addCallback(_process_results)
5412+        return d
5413+
5414+
5415+    def get_verification_key(self, queue=False):
5416+        """
5417+        I return the verification key.
5418+        """
5419+        d = self._maybe_fetch_offsets_and_header()
5420+
5421+        def _make_readvs(ignored):
5422+            if self._version_number == 1:
5423+                vk_offset = self._offsets['verification_key']
5424+                vk_length = self._offsets['EOF'] - vk_offset
5425+            else:
5426+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5427+                vk_length = self._offsets['signature'] - vk_offset
5428+            readvs = [(vk_offset, vk_length)]
5429+            return readvs
5430+        d.addCallback(_make_readvs)
5431+        d.addCallback(lambda readvs:
5432+            self._read(readvs, queue=queue))
5433+        def _process_results(results):
5434+            assert self.shnum in results
5435+            verification_key = results[self.shnum][0]
5436+            return verification_key
5437+        d.addCallback(_process_results)
5438+        return d
5439+
5440+
5441+    def get_encoding_parameters(self):
5442+        """
5443+        I return (k, n, segsize, datalen)
5444+        """
5445+        d = self._maybe_fetch_offsets_and_header()
5446+        d.addCallback(lambda ignored:
5447+            (self._required_shares,
5448+             self._total_shares,
5449+             self._segment_size,
5450+             self._data_length))
5451+        return d
5452+
5453+
5454+    def get_seqnum(self):
5455+        """
5456+        I return the sequence number for this share.
5457+        """
5458+        d = self._maybe_fetch_offsets_and_header()
5459+        d.addCallback(lambda ignored:
5460+            self._sequence_number)
5461+        return d
5462+
5463+
5464+    def get_root_hash(self):
5465+        """
5466+        I return the root of the block hash tree
5467+        """
5468+        d = self._maybe_fetch_offsets_and_header()
5469+        d.addCallback(lambda ignored: self._root_hash)
5470+        return d
5471+
5472+
5473+    def get_checkstring(self):
5474+        """
5475+        I return the packed representation of the following:
5476+
5477+            - version number
5478+            - sequence number
5479+            - root hash
5480+            - salt hash
5481+
5482+        which my users use as a checkstring to detect other writers.
5483+        """
5484+        d = self._maybe_fetch_offsets_and_header()
5485+        def _build_checkstring(ignored):
5486+            if self._salt:
5487+                checkstring = struct.pack(PREFIX,
5488+                                         self._version_number,
5489+                                         self._sequence_number,
5490+                                         self._root_hash,
5491+                                         self._salt)
5492+            else:
5493+                checkstring = struct.pack(MDMFCHECKSTRING,
5494+                                          self._version_number,
5495+                                          self._sequence_number,
5496+                                          self._root_hash)
5497+
5498+            return checkstring
5499+        d.addCallback(_build_checkstring)
5500+        return d
5501+
5502+
5503+    def get_prefix(self, force_remote):
5504+        d = self._maybe_fetch_offsets_and_header(force_remote)
5505+        d.addCallback(lambda ignored:
5506+            self._build_prefix())
5507+        return d
5508+
5509+
5510+    def _build_prefix(self):
5511+        # The prefix is another name for the part of the remote share
5512+        # that gets signed. It consists of everything up to and
5513+        # including the datalength, packed by struct.
5514+        if self._version_number == SDMF_VERSION:
5515+            return struct.pack(SIGNED_PREFIX,
5516+                           self._version_number,
5517+                           self._sequence_number,
5518+                           self._root_hash,
5519+                           self._salt,
5520+                           self._required_shares,
5521+                           self._total_shares,
5522+                           self._segment_size,
5523+                           self._data_length)
5524+
5525+        else:
5526+            return struct.pack(MDMFSIGNABLEHEADER,
5527+                           self._version_number,
5528+                           self._sequence_number,
5529+                           self._root_hash,
5530+                           self._required_shares,
5531+                           self._total_shares,
5532+                           self._segment_size,
5533+                           self._data_length)
5534+
5535+
5536+    def _get_offsets_tuple(self):
5537+        # The offsets tuple is another component of the version
5538+        # information tuple. It is basically our offsets dictionary,
5539+        # itemized and in a tuple.
5540+        return tuple([(key, value) for key, value in self._offsets.items()])
5541+
5542+
5543+    def get_verinfo(self):
5544+        """
5545+        I return my verinfo tuple. This is used by the ServermapUpdater
5546+        to keep track of versions of mutable files.
5547+
5548+        The verinfo tuple for MDMF files contains:
5549+            - seqnum
5550+            - root hash
5551+            - a blank (nothing)
5552+            - segsize
5553+            - datalen
5554+            - k
5555+            - n
5556+            - prefix (the thing that you sign)
5557+            - a tuple of offsets
5558+
5559+        We include the blank entry in MDMF so that the tuple has the
5560+        same shape as the SDMF verinfo tuple, which simplifies
5561+        processing of version information.
5562+
5563+        The SDMF tuple carries its 16-byte IV in place of the blank.
5564+        """
5565+        d = self._maybe_fetch_offsets_and_header()
5566+        def _build_verinfo(ignored):
5567+            if self._version_number == SDMF_VERSION:
5568+                salt_to_use = self._salt
5569+            else:
5570+                salt_to_use = None
5571+            return (self._sequence_number,
5572+                    self._root_hash,
5573+                    salt_to_use,
5574+                    self._segment_size,
5575+                    self._data_length,
5576+                    self._required_shares,
5577+                    self._total_shares,
5578+                    self._build_prefix(),
5579+                    self._get_offsets_tuple())
5580+        d.addCallback(_build_verinfo)
5581+        return d
5582+
5583+
5584+    def flush(self):
5585+        """
5586+        I flush my queue of read vectors.
5587+        """
5588+        d = self._read(self._readvs)
5589+        def _then(results):
5590+            self._readvs = []
5591+            if isinstance(results, failure.Failure):
5592+                self._queue_errbacks.notify(results)
5593+            else:
5594+                self._queue_observers.notify(results)
5595+            self._queue_observers = observer.ObserverList()
5596+            self._queue_errbacks = observer.ObserverList()
5597+        d.addBoth(_then)
5598+
5599+
5600+    def _read(self, readvs, force_remote=False, queue=False):
5601+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
5602+        # TODO: It's entirely possible to tweak this so that it just
5603+        # fulfills the requests that it can, and not demand that all
5604+        # requests are satisfiable before running it.
5605+        if not unsatisfiable and not force_remote:
5606+            results = [self._data[offset:offset+length]
5607+                       for (offset, length) in readvs]
5608+            results = {self.shnum: results}
5609+            return defer.succeed(results)
5610+        else:
5611+            if queue:
5612+                start = len(self._readvs)
5613+                self._readvs += readvs
5614+                end = len(self._readvs)
5615+                def _get_results(results, start, end):
5616+                    if not self.shnum in results:
5617+                        return {self.shnum: [""]}
5618+                    return {self.shnum: results[self.shnum][start:end]}
5619+                d = defer.Deferred()
5620+                d.addCallback(_get_results, start, end)
5621+                self._queue_observers.subscribe(d.callback)
5622+                self._queue_errbacks.subscribe(d.errback)
5623+                return d
5624+            return self._rref.callRemote("slot_readv",
5625+                                         self._storage_index,
5626+                                         [self.shnum],
5627+                                         readvs)
5628+
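+    # When queue=True, the readvs are only appended to self._readvs and
+    # the returned Deferred does not fire until flush() issues the
+    # batched slot_readv call and notifies the queue observers with the
+    # combined results.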
5629+
5630+    def is_sdmf(self):
5631+        """I tell my caller whether my remote file is SDMF or MDMF
5632+        """
5633+        d = self._maybe_fetch_offsets_and_header()
5634+        d.addCallback(lambda ignored:
5635+            self._version_number == 0)
5636+        return d
5637+
5638+
5639+class LayoutInvalid(Exception):
5640+    """
5641+    This isn't a valid MDMF mutable file
5642+    """
5643hunk ./src/allmydata/test/test_storage.py 2
5644 
5645-import time, os.path, stat, re, simplejson, struct
5646+import time, os.path, stat, re, simplejson, struct, shutil
5647 
5648 from twisted.trial import unittest
5649 
5650hunk ./src/allmydata/test/test_storage.py 22
5651 from allmydata.storage.expirer import LeaseCheckingCrawler
5652 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
5653      ReadBucketProxy
5654-from allmydata.interfaces import BadWriteEnablerError
5655-from allmydata.test.common import LoggingServiceParent
5656+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
5657+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
5658+                                     SIGNED_PREFIX, MDMFHEADER, \
5659+                                     MDMFOFFSETS, SDMFSlotWriteProxy
5660+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
5661+                                 SDMF_VERSION
5662+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
5663 from allmydata.test.common_web import WebRenderingMixin
5664 from allmydata.web.storage import StorageStatus, remove_prefix
5665 
5666hunk ./src/allmydata/test/test_storage.py 106
5667 
5668 class RemoteBucket:
5669 
5670+    def __init__(self):
5671+        self.read_count = 0
5672+        self.write_count = 0
5673+
5674     def callRemote(self, methname, *args, **kwargs):
5675         def _call():
5676             meth = getattr(self.target, "remote_" + methname)
5677hunk ./src/allmydata/test/test_storage.py 114
5678             return meth(*args, **kwargs)
5679+
5680+        if methname == "slot_readv":
5681+            self.read_count += 1
5682+        if "writev" in methname:
5683+            self.write_count += 1
5684+
5685         return defer.maybeDeferred(_call)
5686 
5687hunk ./src/allmydata/test/test_storage.py 122
5688+
5689 class BucketProxy(unittest.TestCase):
5690     def make_bucket(self, name, size):
5691         basedir = os.path.join("storage", "BucketProxy", name)
5692hunk ./src/allmydata/test/test_storage.py 1299
5693         self.failUnless(os.path.exists(prefixdir), prefixdir)
5694         self.failIf(os.path.exists(bucketdir), bucketdir)
5695 
5696+
5697+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
5698+    def setUp(self):
5699+        self.sparent = LoggingServiceParent()
5700+        self._lease_secret = itertools.count()
5701+        self.ss = self.create("MDMFProxies storage test server")
5702+        self.rref = RemoteBucket()
5703+        self.rref.target = self.ss
5704+        self.secrets = (self.write_enabler("we_secret"),
5705+                        self.renew_secret("renew_secret"),
5706+                        self.cancel_secret("cancel_secret"))
5707+        self.segment = "aaaaaa"
5708+        self.block = "aa"
5709+        self.salt = "a" * 16
5710+        self.block_hash = "a" * 32
5711+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
5712+        self.share_hash = self.block_hash
5713+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
5714+        self.signature = "foobarbaz"
5715+        self.verification_key = "vvvvvv"
5716+        self.encprivkey = "private"
5717+        self.root_hash = self.block_hash
5718+        self.salt_hash = self.root_hash
5719+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
5720+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
5721+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
5722+        # blockhashes and salt hashes are serialized in the same way,
5723+        # only we lop off the first element and store that in the
5724+        # header.
5725+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
5726+
5727+
5728+    def tearDown(self):
5729+        self.sparent.stopService()
5730+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
5731+
5732+
5733+    def write_enabler(self, we_tag):
5734+        return hashutil.tagged_hash("we_blah", we_tag)
5735+
5736+
5737+    def renew_secret(self, tag):
5738+        return hashutil.tagged_hash("renew_blah", str(tag))
5739+
5740+
5741+    def cancel_secret(self, tag):
5742+        return hashutil.tagged_hash("cancel_blah", str(tag))
5743+
5744+
5745+    def workdir(self, name):
5746+        basedir = os.path.join("storage", "MutableServer", name)
5747+        return basedir
5748+
5749+
5750+    def create(self, name):
5751+        workdir = self.workdir(name)
5752+        ss = StorageServer(workdir, "\x00" * 20)
5753+        ss.setServiceParent(self.sparent)
5754+        return ss
5755+
5756+
5757+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
5758+        # Start with the checkstring
5759+        data = struct.pack(">BQ32s",
5760+                           1,
5761+                           0,
5762+                           self.root_hash)
5763+        self.checkstring = data
5764+        # Next, the encoding parameters
5765+        if tail_segment:
5766+            data += struct.pack(">BBQQ",
5767+                                3,
5768+                                10,
5769+                                6,
5770+                                33)
5771+        elif empty:
5772+            data += struct.pack(">BBQQ",
5773+                                3,
5774+                                10,
5775+                                0,
5776+                                0)
5777+        else:
5778+            data += struct.pack(">BBQQ",
5779+                                3,
5780+                                10,
5781+                                6,
5782+                                36)
5783+        # Now we'll build the offsets.
5784+        sharedata = ""
5785+        if not tail_segment and not empty:
5786+            for i in xrange(6):
5787+                sharedata += self.salt + self.block
5788+        elif tail_segment:
5789+            for i in xrange(5):
5790+                sharedata += self.salt + self.block
5791+            sharedata += self.salt + "a"
5792+
5793+        # The encrypted private key comes after the shares + salts
5794+        offset_size = struct.calcsize(MDMFOFFSETS)
5795+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
5796+        # The blockhashes come after the private key
5797+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
5798+        # The sharehashes come after the block hashes
5799+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
5800+        # The signature comes after the share hash chain
5801+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
5802+        # The verification key comes after the signature
5803+        verification_offset = signature_offset + len(self.signature)
5804+        # The EOF comes after the verification key
5805+        eof_offset = verification_offset + len(self.verification_key)
5806+        data += struct.pack(MDMFOFFSETS,
5807+                            encrypted_private_key_offset,
5808+                            blockhashes_offset,
5809+                            sharehashes_offset,
5810+                            signature_offset,
5811+                            verification_offset,
5812+                            eof_offset)
5813+        self.offsets = {}
5814+        self.offsets['enc_privkey'] = encrypted_private_key_offset
5815+        self.offsets['block_hash_tree'] = blockhashes_offset
5816+        self.offsets['share_hash_chain'] = sharehashes_offset
5817+        self.offsets['signature'] = signature_offset
5818+        self.offsets['verification_key'] = verification_offset
5819+        self.offsets['EOF'] = eof_offset
5820+        # Next, we'll add in the salts and share data,
5821+        data += sharedata
5822+        # the private key,
5823+        data += self.encprivkey
5824+        # the block hash tree,
5825+        data += self.block_hash_tree_s
5826+        # the share hash chain,
5827+        data += self.share_hash_chain_s
5828+        # the signature,
5829+        data += self.signature
5830+        # and the verification key
5831+        data += self.verification_key
5832+        return data
5833+
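+    # The synthetic share assembled above is laid out as:
+    #   checkstring | encoding parameters | offset table |
+    #   (salt + block) per segment | encprivkey | block hash tree |
+    #   share hash chain | signature | verification key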
5834+
5835+    def write_test_share_to_server(self,
5836+                                   storage_index,
5837+                                   tail_segment=False,
5838+                                   empty=False):
5839+        """
5840+        I write some data for the read tests to read to self.ss
5841+
5842+        If tail_segment=True, then I will write a share that has a
5843+        smaller tail segment than other segments.
5844+        """
5845+        write = self.ss.remote_slot_testv_and_readv_and_writev
5846+        data = self.build_test_mdmf_share(tail_segment, empty)
5847+        # Finally, we write the whole thing to the storage server in one
5848+        # pass.
5849+        testvs = [(0, 1, "eq", "")]
5850+        tws = {}
5851+        tws[0] = (testvs, [(0, data)], None)
5852+        readv = [(0, 1)]
5853+        results = write(storage_index, self.secrets, tws, readv)
5854+        self.failUnless(results[0])
5855+
5856+
5857+    def build_test_sdmf_share(self, empty=False):
5858+        if empty:
5859+            sharedata = ""
5860+        else:
5861+            sharedata = self.segment * 6
5862+        self.sharedata = sharedata
5863+        blocksize = len(sharedata) / 3
5864+        block = sharedata[:blocksize]
5865+        self.blockdata = block
5866+        prefix = struct.pack(">BQ32s16s BBQQ",
5867+                             0, # version,
5868+                             0,
5869+                             self.root_hash,
5870+                             self.salt,
5871+                             3,
5872+                             10,
5873+                             len(sharedata),
5874+                             len(sharedata),
5875+                            )
5876+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5877+        signature_offset = post_offset + len(self.verification_key)
5878+        sharehashes_offset = signature_offset + len(self.signature)
5879+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
5880+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
5881+        encprivkey_offset = sharedata_offset + len(block)
5882+        eof_offset = encprivkey_offset + len(self.encprivkey)
5883+        offsets = struct.pack(">LLLLQQ",
5884+                              signature_offset,
5885+                              sharehashes_offset,
5886+                              blockhashes_offset,
5887+                              sharedata_offset,
5888+                              encprivkey_offset,
5889+                              eof_offset)
5890+        final_share = "".join([prefix,
5891+                           offsets,
5892+                           self.verification_key,
5893+                           self.signature,
5894+                           self.share_hash_chain_s,
5895+                           self.block_hash_tree_s,
5896+                           block,
5897+                           self.encprivkey])
5898+        self.offsets = {}
5899+        self.offsets['signature'] = signature_offset
5900+        self.offsets['share_hash_chain'] = sharehashes_offset
5901+        self.offsets['block_hash_tree'] = blockhashes_offset
5902+        self.offsets['share_data'] = sharedata_offset
5903+        self.offsets['enc_privkey'] = encprivkey_offset
5904+        self.offsets['EOF'] = eof_offset
5905+        return final_share
5906+
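+    # The synthetic SDMF share assembled above is laid out as:
+    #   signed prefix | offset table | verification key | signature |
+    #   share hash chain | block hash tree | share data | encprivkey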
5907+
5908+    def write_sdmf_share_to_server(self,
5909+                                   storage_index,
5910+                                   empty=False):
5911+        # Some tests need SDMF shares to verify that we can still
5912+        # read them. This method writes one that resembles, but is not identical to, a real SDMF share.
5913+        assert self.rref
5914+        write = self.ss.remote_slot_testv_and_readv_and_writev
5915+        share = self.build_test_sdmf_share(empty)
5916+        testvs = [(0, 1, "eq", "")]
5917+        tws = {}
5918+        tws[0] = (testvs, [(0, share)], None)
5919+        readv = []
5920+        results = write(storage_index, self.secrets, tws, readv)
5921+        self.failUnless(results[0])
5922+
5923+
5924+    def test_read(self):
5925+        self.write_test_share_to_server("si1")
5926+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5927+        # Check that every method equals what we expect it to.
5928+        d = defer.succeed(None)
5929+        def _check_block_and_salt((block, salt)):
5930+            self.failUnlessEqual(block, self.block)
5931+            self.failUnlessEqual(salt, self.salt)
5932+
5933+        for i in xrange(6):
5934+            d.addCallback(lambda ignored, i=i:
5935+                mr.get_block_and_salt(i))
5936+            d.addCallback(_check_block_and_salt)
5937+
5938+        d.addCallback(lambda ignored:
5939+            mr.get_encprivkey())
5940+        d.addCallback(lambda encprivkey:
5941+            self.failUnlessEqual(self.encprivkey, encprivkey))
5942+
5943+        d.addCallback(lambda ignored:
5944+            mr.get_blockhashes())
5945+        d.addCallback(lambda blockhashes:
5946+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
5947+
5948+        d.addCallback(lambda ignored:
5949+            mr.get_sharehashes())
5950+        d.addCallback(lambda sharehashes:
5951+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
5952+
5953+        d.addCallback(lambda ignored:
5954+            mr.get_signature())
5955+        d.addCallback(lambda signature:
5956+            self.failUnlessEqual(signature, self.signature))
5957+
5958+        d.addCallback(lambda ignored:
5959+            mr.get_verification_key())
5960+        d.addCallback(lambda verification_key:
5961+            self.failUnlessEqual(verification_key, self.verification_key))
5962+
5963+        d.addCallback(lambda ignored:
5964+            mr.get_seqnum())
5965+        d.addCallback(lambda seqnum:
5966+            self.failUnlessEqual(seqnum, 0))
5967+
5968+        d.addCallback(lambda ignored:
5969+            mr.get_root_hash())
5970+        d.addCallback(lambda root_hash:
5971+            self.failUnlessEqual(self.root_hash, root_hash))
5972+
5973+        d.addCallback(lambda ignored:
5974+            mr.get_seqnum())
5975+        d.addCallback(lambda seqnum:
5976+            self.failUnlessEqual(0, seqnum))
5977+
5978+        d.addCallback(lambda ignored:
5979+            mr.get_encoding_parameters())
5980+        def _check_encoding_parameters((k, n, segsize, datalen)):
5981+            self.failUnlessEqual(k, 3)
5982+            self.failUnlessEqual(n, 10)
5983+            self.failUnlessEqual(segsize, 6)
5984+            self.failUnlessEqual(datalen, 36)
5985+        d.addCallback(_check_encoding_parameters)
5986+
5987+        d.addCallback(lambda ignored:
5988+            mr.get_checkstring())
5989+        d.addCallback(lambda checkstring:
5990+            self.failUnlessEqual(checkstring, self.checkstring))
5991+        return d
5992+
5993+
5994+    def test_read_with_different_tail_segment_size(self):
5995+        self.write_test_share_to_server("si1", tail_segment=True)
5996+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5997+        d = mr.get_block_and_salt(5)
5998+        def _check_tail_segment(results):
5999+            block, salt = results
6000+            self.failUnlessEqual(len(block), 1)
6001+            self.failUnlessEqual(block, "a")
6002+        d.addCallback(_check_tail_segment)
6003+        return d
6004+
6005+
6006+    def test_get_block_with_invalid_segnum(self):
6007+        self.write_test_share_to_server("si1")
6008+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6009+        d = defer.succeed(None)
6010+        d.addCallback(lambda ignored:
6011+            self.shouldFail(LayoutInvalid, "test invalid segnum",
6012+                            None,
6013+                            mr.get_block_and_salt, 7))
6014+        return d
6015+
6016+
6017+    def test_get_encoding_parameters_first(self):
6018+        self.write_test_share_to_server("si1")
6019+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6020+        d = mr.get_encoding_parameters()
6021+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6022+            self.failUnlessEqual(k, 3)
6023+            self.failUnlessEqual(n, 10)
6024+            self.failUnlessEqual(segment_size, 6)
6025+            self.failUnlessEqual(datalen, 36)
6026+        d.addCallback(_check_encoding_parameters)
6027+        return d
6028+
6029+
6030+    def test_get_seqnum_first(self):
6031+        self.write_test_share_to_server("si1")
6032+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6033+        d = mr.get_seqnum()
6034+        d.addCallback(lambda seqnum:
6035+            self.failUnlessEqual(seqnum, 0))
6036+        return d
6037+
6038+
6039+    def test_get_root_hash_first(self):
6040+        self.write_test_share_to_server("si1")
6041+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6042+        d = mr.get_root_hash()
6043+        d.addCallback(lambda root_hash:
6044+            self.failUnlessEqual(root_hash, self.root_hash))
6045+        return d
6046+
6047+
6048+    def test_get_checkstring_first(self):
6049+        self.write_test_share_to_server("si1")
6050+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6051+        d = mr.get_checkstring()
6052+        d.addCallback(lambda checkstring:
6053+            self.failUnlessEqual(checkstring, self.checkstring))
6054+        return d
6055+
6056+
6057+    def test_write_read_vectors(self):
6058+        # When writing for us, the storage server will return to us a
6059+        # read vector, along with its result. If a write fails because
6060+        # the test vectors failed, this read vector can help us to
6061+        # diagnose the problem. This test ensures that the read vector
6062+        # is working appropriately.
6063+        mw = self._make_new_mw("si1", 0)
6064+        d = defer.succeed(None)
6065+
6066+        # Write one block. The read vector returned should be empty,
6067+        # since there is no data on the server yet.
6068+        d.addCallback(lambda ignored:
6069+            mw.put_block(self.block, 0, self.salt))
6070+        def _check_first_write(results):
6071+            result, readvs = results
6072+            self.failUnless(result)
6073+            self.failIf(readvs)
6074+        d.addCallback(_check_first_write)
6075+        # Now, there should be a different checkstring returned when
6076+        # we write other blocks
6077+        d.addCallback(lambda ignored:
6078+            mw.put_block(self.block, 1, self.salt))
6079+        def _check_next_write(results):
6080+            result, readvs = results
6081+            self.failUnless(result)
6082+            self.expected_checkstring = mw.get_checkstring()
6083+            self.failUnlessIn(0, readvs)
6084+            self.failUnlessEqual(readvs[0][0], self.expected_checkstring)
6085+        d.addCallback(_check_next_write)
6086+        # Add the remaining four blocks
6087+        for i in xrange(2, 6):
6088+            d.addCallback(lambda ignored, i=i:
6089+                mw.put_block(self.block, i, self.salt))
6090+            d.addCallback(_check_next_write)
6091+        # Add the encrypted private key
6092+        d.addCallback(lambda ignored:
6093+            mw.put_encprivkey(self.encprivkey))
6094+        d.addCallback(_check_next_write)
6095+        # Add the block hash tree and share hash tree
6096+        d.addCallback(lambda ignored:
6097+            mw.put_blockhashes(self.block_hash_tree))
6098+        d.addCallback(_check_next_write)
6099+        d.addCallback(lambda ignored:
6100+            mw.put_sharehashes(self.share_hash_chain))
6101+        d.addCallback(_check_next_write)
6102+        # Add the root hash. This should change the
6103+        # checkstring, but not in a way that we'll be able to see right
6104+        # now, since the read vectors are applied before the write
6105+        # vectors.
6106+        d.addCallback(lambda ignored:
6107+            mw.put_root_hash(self.root_hash))
6108+        def _check_old_testv_after_new_one_is_written(results):
6109+            result, readvs = results
6110+            self.failUnless(result)
6111+            self.failUnlessIn(0, readvs)
6112+            self.failUnlessEqual(self.expected_checkstring,
6113+                                 readvs[0][0])
6114+            new_checkstring = mw.get_checkstring()
6115+            self.failIfEqual(new_checkstring,
6116+                             readvs[0][0])
6117+        d.addCallback(_check_old_testv_after_new_one_is_written)
6118+        # Now add the signature. This should succeed, meaning that the
6119+        # data gets written and the read vector matches what the writer
6120+        # thinks should be there.
6121+        d.addCallback(lambda ignored:
6122+            mw.put_signature(self.signature))
6123+        d.addCallback(_check_next_write)
6124+        # The checkstring remains the same for the rest of the process.
6125+        return d
6126+
6127+
6128+    def test_blockhashes_after_share_hash_chain(self):
6129+        mw = self._make_new_mw("si1", 0)
6130+        d = defer.succeed(None)
6131+        # Put everything up to and including the share hash chain
6132+        for i in xrange(6):
6133+            d.addCallback(lambda ignored, i=i:
6134+                mw.put_block(self.block, i, self.salt))
6135+        d.addCallback(lambda ignored:
6136+            mw.put_encprivkey(self.encprivkey))
6137+        d.addCallback(lambda ignored:
6138+            mw.put_blockhashes(self.block_hash_tree))
6139+        d.addCallback(lambda ignored:
6140+            mw.put_sharehashes(self.share_hash_chain))
6141+
6142+        # Now try to put the block hash tree again.
6143+        d.addCallback(lambda ignored:
6144+            self.shouldFail(LayoutInvalid, "test repeat salthashes",
6145+                            None,
6146+                            mw.put_blockhashes, self.block_hash_tree))
6147+        return d
6148+
6149+
6150+    def test_encprivkey_after_blockhashes(self):
6151+        mw = self._make_new_mw("si1", 0)
6152+        d = defer.succeed(None)
6153+        # Put everything up to and including the block hash tree
6154+        for i in xrange(6):
6155+            d.addCallback(lambda ignored, i=i:
6156+                mw.put_block(self.block, i, self.salt))
6157+        d.addCallback(lambda ignored:
6158+            mw.put_encprivkey(self.encprivkey))
6159+        d.addCallback(lambda ignored:
6160+            mw.put_blockhashes(self.block_hash_tree))
6161+        d.addCallback(lambda ignored:
6162+            self.shouldFail(LayoutInvalid, "out of order private key",
6163+                            None,
6164+                            mw.put_encprivkey, self.encprivkey))
6165+        return d
6166+
6167+
6168+    def test_share_hash_chain_after_signature(self):
6169+        mw = self._make_new_mw("si1", 0)
6170+        d = defer.succeed(None)
6171+        # Put everything up to and including the signature
6172+        for i in xrange(6):
6173+            d.addCallback(lambda ignored, i=i:
6174+                mw.put_block(self.block, i, self.salt))
6175+        d.addCallback(lambda ignored:
6176+            mw.put_encprivkey(self.encprivkey))
6177+        d.addCallback(lambda ignored:
6178+            mw.put_blockhashes(self.block_hash_tree))
6179+        d.addCallback(lambda ignored:
6180+            mw.put_sharehashes(self.share_hash_chain))
6181+        d.addCallback(lambda ignored:
6182+            mw.put_root_hash(self.root_hash))
6183+        d.addCallback(lambda ignored:
6184+            mw.put_signature(self.signature))
6185+        # Now try to put the share hash chain again. This should fail
6186+        d.addCallback(lambda ignored:
6187+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
6188+                            None,
6189+                            mw.put_sharehashes, self.share_hash_chain))
6190+        return d
6191+
6192+
6193+    def test_signature_after_verification_key(self):
6194+        mw = self._make_new_mw("si1", 0)
6195+        d = defer.succeed(None)
6196+        # Put everything up to and including the verification key.
6197+        for i in xrange(6):
6198+            d.addCallback(lambda ignored, i=i:
6199+                mw.put_block(self.block, i, self.salt))
6200+        d.addCallback(lambda ignored:
6201+            mw.put_encprivkey(self.encprivkey))
6202+        d.addCallback(lambda ignored:
6203+            mw.put_blockhashes(self.block_hash_tree))
6204+        d.addCallback(lambda ignored:
6205+            mw.put_sharehashes(self.share_hash_chain))
6206+        d.addCallback(lambda ignored:
6207+            mw.put_root_hash(self.root_hash))
6208+        d.addCallback(lambda ignored:
6209+            mw.put_signature(self.signature))
6210+        d.addCallback(lambda ignored:
6211+            mw.put_verification_key(self.verification_key))
6212+        # Now try to put the signature again. This should fail
6213+        d.addCallback(lambda ignored:
6214+            self.shouldFail(LayoutInvalid, "signature after verification",
6215+                            None,
6216+                            mw.put_signature, self.signature))
6217+        return d
6218+
6219+
6220+    def test_uncoordinated_write(self):
6221+        # Make two mutable writers, both pointing to the same storage
6222+        # server, both at the same storage index, and try writing to the
6223+        # same share.
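+        # (Both writers start out expecting to find no share in place --
+        # the empty-checkstring case exercised in test_write_test_vectors
+        # below -- so once mw1's write lands, mw2's test vector no longer
+        # matches and its write is presumably rejected.)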
6224+        mw1 = self._make_new_mw("si1", 0)
6225+        mw2 = self._make_new_mw("si1", 0)
6226+        d = defer.succeed(None)
6227+        def _check_success(results):
6228+            result, readvs = results
6229+            self.failUnless(result)
6230+
6231+        def _check_failure(results):
6232+            result, readvs = results
6233+            self.failIf(result)
6234+
6235+        d.addCallback(lambda ignored:
6236+            mw1.put_block(self.block, 0, self.salt))
6237+        d.addCallback(_check_success)
6238+        d.addCallback(lambda ignored:
6239+            mw2.put_block(self.block, 0, self.salt))
6240+        d.addCallback(_check_failure)
6241+        return d
6242+
6243+
6244+    def test_invalid_salt_size(self):
6245+        # Salts need to be 16 bytes in size. Writes that use a salt of
6246+        # any other size should be rejected.
6247+        mw = self._make_new_mw("si1", 0)
6248+        invalid_salt = "a" * 17 # 17 bytes
6249+        another_invalid_salt = "b" * 15 # 15 bytes
6250+        d = defer.succeed(None)
6251+        d.addCallback(lambda ignored:
6252+            self.shouldFail(LayoutInvalid, "salt too big",
6253+                            None,
6254+                            mw.put_block, self.block, 0, invalid_salt))
6255+        d.addCallback(lambda ignored:
6256+            self.shouldFail(LayoutInvalid, "salt too small",
6257+                            None,
6258+                            mw.put_block, self.block, 0,
6259+                            another_invalid_salt))
6260+        return d
6261+
6262+
6263+    def test_write_test_vectors(self):
6264+        # If we give the write proxy a bogus test vector at
6265+        # any point during the process, it should fail to write.
6266+        mw = self._make_new_mw("si1", 0)
6267+        mw.set_checkstring("this is a lie")
6268+        # The initial write should be expecting to find the improbable
6269+        # checkstring above in place; finding nothing, it should fail.
6270+        d = defer.succeed(None)
6271+        d.addCallback(lambda ignored:
6272+            mw.put_block(self.block, 0, self.salt))
6273+        def _check_failure(results):
6274+            result, readv = results
6275+            self.failIf(result)
6276+        d.addCallback(_check_failure)
6277+        # Now set the checkstring to the empty string, which
6278+        # indicates that no share is there.
6279+        d.addCallback(lambda ignored:
6280+            mw.set_checkstring(""))
6281+        d.addCallback(lambda ignored:
6282+            mw.put_block(self.block, 0, self.salt))
6283+        def _check_success(results):
6284+            result, readv = results
6285+            self.failUnless(result)
6286+        d.addCallback(_check_success)
6287+        # Now set the checkstring to something wrong
6288+        d.addCallback(lambda ignored:
6289+            mw.set_checkstring("something wrong"))
6290+        # This should fail to do anything
6291+        d.addCallback(lambda ignored:
6292+            mw.put_block(self.block, 1, self.salt))
6293+        d.addCallback(_check_failure)
6294+        # Now set it back to what it should be.
6295+        d.addCallback(lambda ignored:
6296+            mw.set_checkstring(mw.get_checkstring()))
6297+        for i in xrange(1, 6):
6298+            d.addCallback(lambda ignored, i=i:
6299+                mw.put_block(self.block, i, self.salt))
6300+            d.addCallback(_check_success)
6301+        d.addCallback(lambda ignored:
6302+            mw.put_encprivkey(self.encprivkey))
6303+        d.addCallback(_check_success)
6304+        d.addCallback(lambda ignored:
6305+            mw.put_blockhashes(self.block_hash_tree))
6306+        d.addCallback(_check_success)
6307+        d.addCallback(lambda ignored:
6308+            mw.put_sharehashes(self.share_hash_chain))
6309+        d.addCallback(_check_success)
6310+        def _keep_old_checkstring(ignored):
6311+            self.old_checkstring = mw.get_checkstring()
6312+            mw.set_checkstring("foobarbaz")
6313+        d.addCallback(_keep_old_checkstring)
6314+        d.addCallback(lambda ignored:
6315+            mw.put_root_hash(self.root_hash))
6316+        d.addCallback(_check_failure)
6317+        d.addCallback(lambda ignored:
6318+            self.failUnlessEqual(self.old_checkstring, mw.get_checkstring()))
6319+        def _restore_old_checkstring(ignored):
6320+            mw.set_checkstring(self.old_checkstring)
6321+        d.addCallback(_restore_old_checkstring)
6322+        d.addCallback(lambda ignored:
6323+            mw.put_root_hash(self.root_hash))
6324+        d.addCallback(_check_success)
6325+        # The checkstring should have been set appropriately for us on
6326+        # the last write; if we try to change it to something else,
6327+        # that change should cause the signature step to fail.
6328+        d.addCallback(lambda ignored:
6329+            mw.set_checkstring("something else"))
6330+        d.addCallback(lambda ignored:
6331+            mw.put_signature(self.signature))
6332+        d.addCallback(_check_failure)
6333+        d.addCallback(lambda ignored:
6334+            mw.set_checkstring(mw.get_checkstring()))
6335+        d.addCallback(lambda ignored:
6336+            mw.put_signature(self.signature))
6337+        d.addCallback(_check_success)
6338+        d.addCallback(lambda ignored:
6339+            mw.put_verification_key(self.verification_key))
6340+        d.addCallback(_check_success)
6341+        return d
6342+
6343+
6344+    def test_offset_only_set_on_success(self):
6345+        # The write proxy should be smart enough to detect when a write
6346+        # has failed, and to temper its definition of progress based on
6347+        # that.
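+        # The pattern used below: break the checkstring so that a write
+        # fails, then check that the writer did not advance -- the next
+        # put_* call in the sequence should still be rejected as
+        # out-of-order. (A rough summary of the steps that follow, not a
+        # statement about the proxy's internals.)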
6348+        mw = self._make_new_mw("si1", 0)
6349+        d = defer.succeed(None)
6350+        for i in xrange(1, 6):
6351+            d.addCallback(lambda ignored, i=i:
6352+                mw.put_block(self.block, i, self.salt))
6353+        def _break_checkstring(ignored):
6354+            self._old_checkstring = mw.get_checkstring()
6355+            mw.set_checkstring("foobarbaz")
6356+
6357+        def _fix_checkstring(ignored):
6358+            mw.set_checkstring(self._old_checkstring)
6359+
6360+        d.addCallback(_break_checkstring)
6361+
6362+        # Setting the encrypted private key shouldn't work now, which is
6363+        # to be expected and is tested elsewhere. We also want to make
6364+        # sure that we can't add the block hash tree after a failed
6365+        # write of this sort.
6366+        d.addCallback(lambda ignored:
6367+            mw.put_encprivkey(self.encprivkey))
6368+        d.addCallback(lambda ignored:
6369+            self.shouldFail(LayoutInvalid, "test out-of-order blockhashes",
6370+                            None,
6371+                            mw.put_blockhashes, self.block_hash_tree))
6372+        d.addCallback(_fix_checkstring)
6373+        d.addCallback(lambda ignored:
6374+            mw.put_encprivkey(self.encprivkey))
6375+        d.addCallback(_break_checkstring)
6376+        d.addCallback(lambda ignored:
6377+            mw.put_blockhashes(self.block_hash_tree))
6378+        d.addCallback(lambda ignored:
6379+            self.shouldFail(LayoutInvalid, "test out-of-order sharehashes",
6380+                            None,
6381+                            mw.put_sharehashes, self.share_hash_chain))
6382+        d.addCallback(_fix_checkstring)
6383+        d.addCallback(lambda ignored:
6384+            mw.put_blockhashes(self.block_hash_tree))
6385+        d.addCallback(_break_checkstring)
6386+        d.addCallback(lambda ignored:
6387+            mw.put_sharehashes(self.share_hash_chain))
6388+        d.addCallback(lambda ignored:
6389+            self.shouldFail(LayoutInvalid, "out-of-order root hash",
6390+                            None,
6391+                            mw.put_root_hash, self.root_hash))
6392+        d.addCallback(_fix_checkstring)
6393+        d.addCallback(lambda ignored:
6394+            mw.put_sharehashes(self.share_hash_chain))
6395+        d.addCallback(_break_checkstring)
6396+        d.addCallback(lambda ignored:
6397+            mw.put_root_hash(self.root_hash))
6398+        d.addCallback(lambda ignored:
6399+            self.shouldFail(LayoutInvalid, "out-of-order signature",
6400+                            None,
6401+                            mw.put_signature, self.signature))
6402+        d.addCallback(_fix_checkstring)
6403+        d.addCallback(lambda ignored:
6404+            mw.put_root_hash(self.root_hash))
6405+        d.addCallback(_break_checkstring)
6406+        d.addCallback(lambda ignored:
6407+            mw.put_signature(self.signature))
6408+        d.addCallback(lambda ignored:
6409+            self.shouldFail(LayoutInvalid, "out-of-order verification key",
6410+                            None,
6411+                            mw.put_verification_key,
6412+                            self.verification_key))
6413+        d.addCallback(_fix_checkstring)
6414+        d.addCallback(lambda ignored:
6415+            mw.put_signature(self.signature))
6416+        d.addCallback(_break_checkstring)
6417+        d.addCallback(lambda ignored:
6418+            mw.put_verification_key(self.verification_key))
6419+        d.addCallback(lambda ignored:
6420+            self.shouldFail(LayoutInvalid, "out-of-order finish",
6421+                            None,
6422+                            mw.finish_publishing))
6423+        return d
6424+
6425+
6426+    def serialize_blockhashes(self, blockhashes):
6427+        return "".join(blockhashes)
6428+
6429+
6430+    def serialize_sharehashes(self, sharehashes):
6431+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
6432+                        for i in sorted(sharehashes.keys())])
6433+        return ret
6434+
6435+
6436+    def test_write(self):
6437+        # This translates to a file with 6 6-byte segments, and with 2-byte
6438+        # blocks.
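+        # (The arithmetic behind that comment: datalen 36 / segsize 6 =
+        # 6 segments, and segsize 6 / k 3 = 2-byte blocks per share.)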
6439+        mw = self._make_new_mw("si1", 0)
6440+        mw2 = self._make_new_mw("si1", 1)
6441+        # Test writing some blocks.
6442+        read = self.ss.remote_slot_readv
6443+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
6444+        written_block_size = 2 + len(self.salt)
6445+        written_block = self.block + self.salt
6446+        def _check_block_write(i, share):
6447+            self.failUnlessEqual(read("si1", [share], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
6448+                                {share: [written_block]})
6449+        d = defer.succeed(None)
6450+        for i in xrange(6):
6451+            d.addCallback(lambda ignored, i=i:
6452+                mw.put_block(self.block, i, self.salt))
6453+            d.addCallback(lambda ignored, i=i:
6454+                _check_block_write(i, 0))
6455+        # Now try the same thing, but with share 1 instead of share 0.
6456+        for i in xrange(6):
6457+            d.addCallback(lambda ignored, i=i:
6458+                mw2.put_block(self.block, i, self.salt))
6459+            d.addCallback(lambda ignored, i=i:
6460+                _check_block_write(i, 1))
6461+
6462+        # Next, we make a fake encrypted private key, and put it onto the
6463+        # storage server.
6464+        d.addCallback(lambda ignored:
6465+            mw.put_encprivkey(self.encprivkey))
6466+        expected_private_key_offset = expected_sharedata_offset + \
6467+                                      len(written_block) * 6
6468+        self.failUnlessEqual(len(self.encprivkey), 7)
6469+        d.addCallback(lambda ignored:
6470+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
6471+                                 {0: [self.encprivkey]}))
6472+
6473+        # Next, we put a fake block hash tree.
6474+        d.addCallback(lambda ignored:
6475+            mw.put_blockhashes(self.block_hash_tree))
6476+        expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
6477+        self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
6478+        d.addCallback(lambda ignored:
6479+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
6480+                                 {0: [self.block_hash_tree_s]}))
6481+
6482+        # Next, put a fake share hash chain
6483+        d.addCallback(lambda ignored:
6484+            mw.put_sharehashes(self.share_hash_chain))
6485+        expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
6486+        d.addCallback(lambda ignored:
6487+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
6488+                                 {0: [self.share_hash_chain_s]}))
6489+
6490+        # Next, we put what stands in for the root hash of our share
6491+        # hash tree (for this test it is just a placeholder value).
6492+        d.addCallback(lambda ignored:
6493+            mw.put_root_hash(self.root_hash))
6494+        # The root hash gets inserted at byte 9 (its position is in the header,
6495+        # and is fixed).
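+        # (Byte 9 is consistent with the header layout checked in
+        # _check_offsets below: 1 byte of version number plus 8 bytes of
+        # sequence number, with the 32-byte root hash running up to byte
+        # 41, where the encoding parameters begin.)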
6496+        def _check(ignored):
6497+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
6498+                                 {0: [self.root_hash]})
6499+        d.addCallback(_check)
6500+
6501+        # Next, we put a signature of the header block.
6502+        d.addCallback(lambda ignored:
6503+            mw.put_signature(self.signature))
6504+        expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
6505+        self.failUnlessEqual(len(self.signature), 9)
6506+        d.addCallback(lambda ignored:
6507+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
6508+                                 {0: [self.signature]}))
6509+
6510+        # Next, we put the verification key
6511+        d.addCallback(lambda ignored:
6512+            mw.put_verification_key(self.verification_key))
6513+        expected_verification_key_offset = expected_signature_offset + len(self.signature)
6514+        self.failUnlessEqual(len(self.verification_key), 6)
6515+        d.addCallback(lambda ignored:
6516+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
6517+                                 {0: [self.verification_key]}))
6518+
6519+        def _check_signable(ignored):
6520+            # Make sure that the signable is what we think it should be.
6521+            signable = mw.get_signable()
6522+            verno, seq, roothash, k, n, segsize, datalen = \
6523+                                            struct.unpack(">BQ32sBBQQ",
6524+                                                          signable)
6525+            self.failUnlessEqual(verno, 1)
6526+            self.failUnlessEqual(seq, 0)
6527+            self.failUnlessEqual(roothash, self.root_hash)
6528+            self.failUnlessEqual(k, 3)
6529+            self.failUnlessEqual(n, 10)
6530+            self.failUnlessEqual(segsize, 6)
6531+            self.failUnlessEqual(datalen, 36)
6532+        d.addCallback(_check_signable)
6533+        # Next, we cause the offset table to be published.
6534+        d.addCallback(lambda ignored:
6535+            mw.finish_publishing())
6536+        expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
6537+
6538+        def _check_offsets(ignored):
6539+            # Check the version number to make sure that it is correct.
6540+            expected_version_number = struct.pack(">B", 1)
6541+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
6542+                                 {0: [expected_version_number]})
6543+            # Check the sequence number to make sure that it is correct
6544+            expected_sequence_number = struct.pack(">Q", 0)
6545+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
6546+                                 {0: [expected_sequence_number]})
6547+            # Check that the encoding parameters (k, N, segment size, data
6548+            # length) are what they should be. These are 3, 10, 6, 36
6549+            expected_k = struct.pack(">B", 3)
6550+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
6551+                                 {0: [expected_k]})
6552+            expected_n = struct.pack(">B", 10)
6553+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
6554+                                 {0: [expected_n]})
6555+            expected_segment_size = struct.pack(">Q", 6)
6556+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
6557+                                 {0: [expected_segment_size]})
6558+            expected_data_length = struct.pack(">Q", 36)
6559+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
6560+                                 {0: [expected_data_length]})
6561+            expected_offset = struct.pack(">Q", expected_private_key_offset)
6562+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
6563+                                 {0: [expected_offset]})
6564+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
6565+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
6566+                                 {0: [expected_offset]})
6567+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
6568+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
6569+                                 {0: [expected_offset]})
6570+            expected_offset = struct.pack(">Q", expected_signature_offset)
6571+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
6572+                                 {0: [expected_offset]})
6573+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
6574+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
6575+                                 {0: [expected_offset]})
6576+            expected_offset = struct.pack(">Q", expected_eof_offset)
6577+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
6578+                                 {0: [expected_offset]})
6579+        d.addCallback(_check_offsets)
6580+        return d
6581+
6582+    def _make_new_mw(self, si, share, datalength=36):
6583+        # By default, this is a file of size 36 bytes. Since it has a
6584+        # segment size of 6, we know that it has 6-byte segments, which
6585+        # will be split into blocks of 2 bytes because our FEC k
6586+        # parameter is 3.
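+        # The positional arguments after the secrets are presumably
+        # (seqnum, k, N, segment size, data length) -- here
+        # (0, 3, 10, 6, datalength) -- matching the encoding parameters
+        # asserted throughout these tests.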
6587+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
6588+                                6, datalength)
6589+        return mw
6590+
6591+
6592+    def test_write_rejected_with_too_many_blocks(self):
6593+        mw = self._make_new_mw("si0", 0)
6594+
6595+        # Try writing too many blocks. We should not be able to write
6596+        # more than 6 blocks into each share.
6598+        d = defer.succeed(None)
6599+        for i in xrange(6):
6600+            d.addCallback(lambda ignored, i=i:
6601+                mw.put_block(self.block, i, self.salt))
6602+        d.addCallback(lambda ignored:
6603+            self.shouldFail(LayoutInvalid, "too many blocks",
6604+                            None,
6605+                            mw.put_block, self.block, 7, self.salt))
6606+        return d
6607+
6608+
6609+    def test_write_rejected_with_invalid_salt(self):
6610+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
6611+        # less should cause an error.
6612+        mw = self._make_new_mw("si1", 0)
6613+        bad_salt = "a" * 17 # 17 bytes
6614+        d = defer.succeed(None)
6615+        d.addCallback(lambda ignored:
6616+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
6617+                            None, mw.put_block, self.block, 7, bad_salt))
6618+        return d
6619+
6620+
6621+    def test_write_rejected_with_invalid_root_hash(self):
6622+        # Try writing an invalid root hash. This should be SHA256d, and
6623+        # 32 bytes long as a result.
6624+        mw = self._make_new_mw("si2", 0)
6625+        # 17 bytes != 32 bytes
6626+        invalid_root_hash = "a" * 17
6627+        d = defer.succeed(None)
6628+        # Before this test can work, we need to put some blocks + salts,
6629+        # a block hash tree, and a share hash tree. Otherwise, we'll see
6630+        # failures that match what we are looking for, but are caused by
6631+        # the constraints imposed on operation ordering.
6632+        for i in xrange(6):
6633+            d.addCallback(lambda ignored, i=i:
6634+                mw.put_block(self.block, i, self.salt))
6635+        d.addCallback(lambda ignored:
6636+            mw.put_encprivkey(self.encprivkey))
6637+        d.addCallback(lambda ignored:
6638+            mw.put_blockhashes(self.block_hash_tree))
6639+        d.addCallback(lambda ignored:
6640+            mw.put_sharehashes(self.share_hash_chain))
6641+        d.addCallback(lambda ignored:
6642+            self.shouldFail(LayoutInvalid, "invalid root hash",
6643+                            None, mw.put_root_hash, invalid_root_hash))
6644+        return d
6645+
6646+
6647+    def test_write_rejected_with_invalid_blocksize(self):
6648+        # The blocksize implied by the writer that we get from
6649+        # _make_new_mw is 2 bytes -- any more or any less than this
6650+        # should cause a failure, unless it is the tail segment, in
6651+        # which case the block may legitimately be smaller.
6652+        invalid_block = "a"
6653+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
6654+                                             # one byte blocks
6655+        # 1 byte != 2 bytes
6656+        d = defer.succeed(None)
6657+        d.addCallback(lambda ignored, invalid_block=invalid_block:
6658+            self.shouldFail(LayoutInvalid, "test blocksize too small",
6659+                            None, mw.put_block, invalid_block, 0,
6660+                            self.salt))
6661+        invalid_block = invalid_block * 3
6662+        # 3 bytes != 2 bytes
6663+        d.addCallback(lambda ignored:
6664+            self.shouldFail(LayoutInvalid, "test blocksize too large",
6665+                            None,
6666+                            mw.put_block, invalid_block, 0, self.salt))
6667+        for i in xrange(5):
6668+            d.addCallback(lambda ignored, i=i:
6669+                mw.put_block(self.block, i, self.salt))
6670+        # Try to put an invalid tail segment
6671+        d.addCallback(lambda ignored:
6672+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
6673+                            None,
6674+                            mw.put_block, self.block, 5, self.salt))
6675+        valid_block = "a"
6676+        d.addCallback(lambda ignored:
6677+            mw.put_block(valid_block, 5, self.salt))
6678+        return d
6679+
6680+
6681+    def test_write_enforces_order_constraints(self):
6682+        # We require that the MDMFSlotWriteProxy be interacted with in a
6683+        # specific order. That order is:
6685+        # 0: __init__
6686+        # 1: write blocks and salts
6687+        # 2: Write the encrypted private key
6688+        # 3: Write the block hashes
6689+        # 4: Write the share hashes
6690+        # 5: Write the root hash
6691+        # 6: Write the signature and verification key
6692+        # 7: Write the file.
6693+        #
6694+        # Some of these can be performed out-of-order, and some can't.
6695+        # The dependencies that I want to test here are:
6696+        #  - Private key before block hashes
6697+        #  - share hashes and block hashes before root hash
6698+        #  - root hash before signature
6699+        #  - signature before verification key
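+        # In code form, the happy-path order (as exercised end-to-end in
+        # test_end_to_end below) is roughly:
+        #   put_block * 6 -> put_encprivkey -> put_blockhashes ->
+        #   put_sharehashes -> put_root_hash -> put_signature ->
+        #   put_verification_key -> finish_publishing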
6700+        mw0 = self._make_new_mw("si0", 0)
6701+        # Write some shares
6702+        d = defer.succeed(None)
6703+        for i in xrange(6):
6704+            d.addCallback(lambda ignored, i=i:
6705+                mw0.put_block(self.block, i, self.salt))
6706+        # Try to write the block hashes before writing the encrypted
6707+        # private key
6708+        d.addCallback(lambda ignored:
6709+            self.shouldFail(LayoutInvalid, "block hashes before key",
6710+                            None, mw0.put_blockhashes,
6711+                            self.block_hash_tree))
6712+
6713+        # Write the private key.
6714+        d.addCallback(lambda ignored:
6715+            mw0.put_encprivkey(self.encprivkey))
6716+
6717+
6718+        # Try to write the share hash chain without writing the block
6719+        # hash tree
6720+        d.addCallback(lambda ignored:
6721+            self.shouldFail(LayoutInvalid, "share hash chain before "
6722+                                           "salt hash tree",
6723+                            None,
6724+                            mw0.put_sharehashes, self.share_hash_chain))
6725+
6726+        # Try to write the root hash without writing either the
6727+        # block hashes or the share hashes
6728+        d.addCallback(lambda ignored:
6729+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6730+                            None,
6731+                            mw0.put_root_hash, self.root_hash))
6732+
6733+        # Now write the block hashes and try again
6734+        d.addCallback(lambda ignored:
6735+            mw0.put_blockhashes(self.block_hash_tree))
6736+
6737+        d.addCallback(lambda ignored:
6738+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6739+                            None, mw0.put_root_hash, self.root_hash))
6740+
6741+        # We haven't yet put the root hash on the share, so we shouldn't
6742+        # be able to sign it.
6743+        d.addCallback(lambda ignored:
6744+            self.shouldFail(LayoutInvalid, "signature before root hash",
6745+                            None, mw0.put_signature, self.signature))
6746+
6747+        d.addCallback(lambda ignored:
6748+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
6749+
6750+        # ...and, since that fails, we also shouldn't be able to put the
6751+        # verification key.
6752+        d.addCallback(lambda ignored:
6753+            self.shouldFail(LayoutInvalid, "key before signature",
6754+                            None, mw0.put_verification_key,
6755+                            self.verification_key))
6756+
6757+        # Now write the share hashes.
6758+        d.addCallback(lambda ignored:
6759+            mw0.put_sharehashes(self.share_hash_chain))
6760+        # We should be able to write the root hash now too
6761+        d.addCallback(lambda ignored:
6762+            mw0.put_root_hash(self.root_hash))
6763+
6764+        # We should still be unable to put the verification key
6765+        d.addCallback(lambda ignored:
6766+            self.shouldFail(LayoutInvalid, "key before signature",
6767+                            None, mw0.put_verification_key,
6768+                            self.verification_key))
6769+
6770+        d.addCallback(lambda ignored:
6771+            mw0.put_signature(self.signature))
6772+
6773+        # We shouldn't be able to write the offsets to the remote server
6774+        # until the offset table is finished; IOW, until we have written
6775+        # the verification key.
6776+        d.addCallback(lambda ignored:
6777+            self.shouldFail(LayoutInvalid, "offsets before verification key",
6778+                            None,
6779+                            mw0.finish_publishing))
6780+
6781+        d.addCallback(lambda ignored:
6782+            mw0.put_verification_key(self.verification_key))
6783+        return d
6784+
6785+
6786+    def test_end_to_end(self):
6787+        mw = self._make_new_mw("si1", 0)
6788+        # Write a share using the mutable writer, and make sure that the
6789+        # reader knows how to read everything back to us.
6790+        d = defer.succeed(None)
6791+        for i in xrange(6):
6792+            d.addCallback(lambda ignored, i=i:
6793+                mw.put_block(self.block, i, self.salt))
6794+        d.addCallback(lambda ignored:
6795+            mw.put_encprivkey(self.encprivkey))
6796+        d.addCallback(lambda ignored:
6797+            mw.put_blockhashes(self.block_hash_tree))
6798+        d.addCallback(lambda ignored:
6799+            mw.put_sharehashes(self.share_hash_chain))
6800+        d.addCallback(lambda ignored:
6801+            mw.put_root_hash(self.root_hash))
6802+        d.addCallback(lambda ignored:
6803+            mw.put_signature(self.signature))
6804+        d.addCallback(lambda ignored:
6805+            mw.put_verification_key(self.verification_key))
6806+        d.addCallback(lambda ignored:
6807+            mw.finish_publishing())
6808+
6809+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6810+        def _check_block_and_salt((block, salt)):
6811+            self.failUnlessEqual(block, self.block)
6812+            self.failUnlessEqual(salt, self.salt)
6813+
6814+        for i in xrange(6):
6815+            d.addCallback(lambda ignored, i=i:
6816+                mr.get_block_and_salt(i))
6817+            d.addCallback(_check_block_and_salt)
6818+
6819+        d.addCallback(lambda ignored:
6820+            mr.get_encprivkey())
6821+        d.addCallback(lambda encprivkey:
6822+            self.failUnlessEqual(self.encprivkey, encprivkey))
6823+
6824+        d.addCallback(lambda ignored:
6825+            mr.get_blockhashes())
6826+        d.addCallback(lambda blockhashes:
6827+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6828+
6829+        d.addCallback(lambda ignored:
6830+            mr.get_sharehashes())
6831+        d.addCallback(lambda sharehashes:
6832+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6833+
6834+        d.addCallback(lambda ignored:
6835+            mr.get_signature())
6836+        d.addCallback(lambda signature:
6837+            self.failUnlessEqual(signature, self.signature))
6838+
6839+        d.addCallback(lambda ignored:
6840+            mr.get_verification_key())
6841+        d.addCallback(lambda verification_key:
6842+            self.failUnlessEqual(verification_key, self.verification_key))
6843+
6844+        d.addCallback(lambda ignored:
6845+            mr.get_seqnum())
6846+        d.addCallback(lambda seqnum:
6847+            self.failUnlessEqual(seqnum, 0))
6848+
6849+        d.addCallback(lambda ignored:
6850+            mr.get_root_hash())
6851+        d.addCallback(lambda root_hash:
6852+            self.failUnlessEqual(self.root_hash, root_hash))
6853+
6854+        d.addCallback(lambda ignored:
6855+            mr.get_encoding_parameters())
6856+        def _check_encoding_parameters((k, n, segsize, datalen)):
6857+            self.failUnlessEqual(k, 3)
6858+            self.failUnlessEqual(n, 10)
6859+            self.failUnlessEqual(segsize, 6)
6860+            self.failUnlessEqual(datalen, 36)
6861+        d.addCallback(_check_encoding_parameters)
6862+
6863+        d.addCallback(lambda ignored:
6864+            mr.get_checkstring())
6865+        d.addCallback(lambda checkstring:
6866+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
6867+        return d
6868+
6869+
6870+    def test_is_sdmf(self):
6871+        # The MDMFSlotReadProxy should also know how to read SDMF files,
6872+        # since it will encounter them on the grid. Callers use the
6873+        # is_sdmf method to test this.
6874+        self.write_sdmf_share_to_server("si1")
6875+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6876+        d = mr.is_sdmf()
6877+        d.addCallback(lambda issdmf:
6878+            self.failUnless(issdmf))
6879+        return d
6880+
6881+
6882+    def test_reads_sdmf(self):
6883+        # The slot read proxy should, naturally, know how to tell us
6884+        # about data in the SDMF format
6885+        self.write_sdmf_share_to_server("si1")
6886+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6887+        d = defer.succeed(None)
6888+        d.addCallback(lambda ignored:
6889+            mr.is_sdmf())
6890+        d.addCallback(lambda issdmf:
6891+            self.failUnless(issdmf))
6892+
6893+        # What do we need to read?
6894+        #  - The sharedata
6895+        #  - The salt
6896+        d.addCallback(lambda ignored:
6897+            mr.get_block_and_salt(0))
6898+        def _check_block_and_salt(results):
6899+            block, salt = results
6900+            # Our original file is 36 bytes long, so each share is 12
6901+            # bytes in size. The share is composed entirely of the
6902+            # letter "a". self.block contains two of them, so 6 *
6903+            # self.block is what we are looking for.
6904+            self.failUnlessEqual(block, self.block * 6)
6905+            self.failUnlessEqual(salt, self.salt)
6906+        d.addCallback(_check_block_and_salt)
6907+
6908+        #  - The blockhashes
6909+        d.addCallback(lambda ignored:
6910+            mr.get_blockhashes())
6911+        d.addCallback(lambda blockhashes:
6912+            self.failUnlessEqual(self.block_hash_tree,
6913+                                 blockhashes,
6914+                                 blockhashes))
6915+        #  - The sharehashes
6916+        d.addCallback(lambda ignored:
6917+            mr.get_sharehashes())
6918+        d.addCallback(lambda sharehashes:
6919+            self.failUnlessEqual(self.share_hash_chain,
6920+                                 sharehashes))
6921+        #  - The keys
6922+        d.addCallback(lambda ignored:
6923+            mr.get_encprivkey())
6924+        d.addCallback(lambda encprivkey:
6925+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
6926+        d.addCallback(lambda ignored:
6927+            mr.get_verification_key())
6928+        d.addCallback(lambda verification_key:
6929+            self.failUnlessEqual(verification_key,
6930+                                 self.verification_key,
6931+                                 verification_key))
6932+        #  - The signature
6933+        d.addCallback(lambda ignored:
6934+            mr.get_signature())
6935+        d.addCallback(lambda signature:
6936+            self.failUnlessEqual(signature, self.signature, signature))
6937+
6938+        #  - The sequence number
6939+        d.addCallback(lambda ignored:
6940+            mr.get_seqnum())
6941+        d.addCallback(lambda seqnum:
6942+            self.failUnlessEqual(seqnum, 0, seqnum))
6943+
6944+        #  - The root hash
6945+        d.addCallback(lambda ignored:
6946+            mr.get_root_hash())
6947+        d.addCallback(lambda root_hash:
6948+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
6949+        return d
6950+
6951+
6952+    def test_only_reads_one_segment_sdmf(self):
6953+        # SDMF shares have only one segment, so it doesn't make sense to
6954+        # read more segments than that. The reader should know this and
6955+        # complain if we try to do that.
6956+        self.write_sdmf_share_to_server("si1")
6957+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6958+        d = defer.succeed(None)
6959+        d.addCallback(lambda ignored:
6960+            mr.is_sdmf())
6961+        d.addCallback(lambda issdmf:
6962+            self.failUnless(issdmf))
6963+        d.addCallback(lambda ignored:
6964+            self.shouldFail(LayoutInvalid, "test bad segment",
6965+                            None,
6966+                            mr.get_block_and_salt, 1))
6967+        return d
6968+
6969+
6970+    def test_read_with_prefetched_mdmf_data(self):
6971+        # The MDMFSlotReadProxy will prefill certain fields if you pass
6972+        # it data that you have already fetched. This is useful for
6973+        # cases like the Servermap, which prefetches ~2kb of data while
6974+        # finding out which shares are on the remote peer so that it
6975+        # doesn't waste round trips.
6976+        mdmf_data = self.build_test_mdmf_share()
6977+        self.write_test_share_to_server("si1")
6978+        def _make_mr(ignored, length):
6979+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
6980+            return mr
6981+
6982+        d = defer.succeed(None)
6983+        # This should be enough to fill in both the encoding parameters
6984+        # and the table of offsets, which will complete the version
6985+        # information tuple.
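+        # (107 bytes appears to correspond to the fixed header plus the
+        # offset table: the last offset field checked in test_write ends
+        # at byte 99 + 8 = 107.)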
6986+        d.addCallback(_make_mr, 107)
6987+        d.addCallback(lambda mr:
6988+            mr.get_verinfo())
6989+        def _check_verinfo(verinfo):
6990+            self.failUnless(verinfo)
6991+            self.failUnlessEqual(len(verinfo), 9)
6992+            (seqnum,
6993+             root_hash,
6994+             salt_hash,
6995+             segsize,
6996+             datalen,
6997+             k,
6998+             n,
6999+             prefix,
7000+             offsets) = verinfo
7001+            self.failUnlessEqual(seqnum, 0)
7002+            self.failUnlessEqual(root_hash, self.root_hash)
7003+            self.failUnlessEqual(segsize, 6)
7004+            self.failUnlessEqual(datalen, 36)
7005+            self.failUnlessEqual(k, 3)
7006+            self.failUnlessEqual(n, 10)
7007+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
7008+                                          1,
7009+                                          seqnum,
7010+                                          root_hash,
7011+                                          k,
7012+                                          n,
7013+                                          segsize,
7014+                                          datalen)
7015+            self.failUnlessEqual(expected_prefix, prefix)
7016+            self.failUnlessEqual(self.rref.read_count, 0)
7017+        d.addCallback(_check_verinfo)
7018+        # This is not enough data to read a block and its salt, so the
7019+        # wrapper should attempt to read this from the remote server.
7020+        d.addCallback(_make_mr, 107)
7021+        d.addCallback(lambda mr:
7022+            mr.get_block_and_salt(0))
7023+        def _check_block_and_salt((block, salt)):
7024+            self.failUnlessEqual(block, self.block)
7025+            self.failUnlessEqual(salt, self.salt)
7026+            self.failUnlessEqual(self.rref.read_count, 1)
7027+        # This should be enough data to read one block.
7028+        d.addCallback(_make_mr, 249)
7029+        d.addCallback(lambda mr:
7030+            mr.get_block_and_salt(0))
7031+        d.addCallback(_check_block_and_salt)
7032+        return d
7033+
7034+
7035+    def test_read_with_prefetched_sdmf_data(self):
7036+        sdmf_data = self.build_test_sdmf_share()
7037+        self.write_sdmf_share_to_server("si1")
7038+        def _make_mr(ignored, length):
7039+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7040+            return mr
7041+
7042+        d = defer.succeed(None)
7043+        # This should be enough to get us the encoding parameters,
7044+        # offset table, and everything else we need to build a verinfo
7045+        # string.
7046+        d.addCallback(_make_mr, 107)
7047+        d.addCallback(lambda mr:
7048+            mr.get_verinfo())
7049+        def _check_verinfo(verinfo):
7050+            self.failUnless(verinfo)
7051+            self.failUnlessEqual(len(verinfo), 9)
7052+            (seqnum,
7053+             root_hash,
7054+             salt,
7055+             segsize,
7056+             datalen,
7057+             k,
7058+             n,
7059+             prefix,
7060+             offsets) = verinfo
7061+            self.failUnlessEqual(seqnum, 0)
7062+            self.failUnlessEqual(root_hash, self.root_hash)
7063+            self.failUnlessEqual(salt, self.salt)
7064+            self.failUnlessEqual(segsize, 36)
7065+            self.failUnlessEqual(datalen, 36)
7066+            self.failUnlessEqual(k, 3)
7067+            self.failUnlessEqual(n, 10)
7068+            expected_prefix = struct.pack(SIGNED_PREFIX,
7069+                                          0,
7070+                                          seqnum,
7071+                                          root_hash,
7072+                                          salt,
7073+                                          k,
7074+                                          n,
7075+                                          segsize,
7076+                                          datalen)
7077+            self.failUnlessEqual(expected_prefix, prefix)
7078+            self.failUnlessEqual(self.rref.read_count, 0)
7079+        d.addCallback(_check_verinfo)
7080+        # This shouldn't be enough to read any share data.
7081+        d.addCallback(_make_mr, 107)
7082+        d.addCallback(lambda mr:
7083+            mr.get_block_and_salt(0))
7084+        def _check_block_and_salt((block, salt)):
7085+            self.failUnlessEqual(block, self.block * 6)
7086+            self.failUnlessEqual(salt, self.salt)
7087+            # TODO: Fix the read routine so that it reads only the data
7088+            #       that it has cached if it can't read all of it.
7089+            self.failUnlessEqual(self.rref.read_count, 2)
7090+
7091+        # This should be enough to read share data.
7092+        d.addCallback(_make_mr, self.offsets['share_data'])
7093+        d.addCallback(lambda mr:
7094+            mr.get_block_and_salt(0))
7095+        d.addCallback(_check_block_and_salt)
7096+        return d
7097+
7098+
7099+    def test_read_with_empty_mdmf_file(self):
7100+        # Some tests upload a file with no contents to test things
7101+        # unrelated to the actual handling of the content of the file.
7102+        # The reader should behave intelligently in these cases.
7103+        self.write_test_share_to_server("si1", empty=True)
7104+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7105+        # We should be able to get the encoding parameters, and they
7106+        # should be correct.
7107+        d = defer.succeed(None)
7108+        d.addCallback(lambda ignored:
7109+            mr.get_encoding_parameters())
7110+        def _check_encoding_parameters(params):
7111+            self.failUnlessEqual(len(params), 4)
7112+            k, n, segsize, datalen = params
7113+            self.failUnlessEqual(k, 3)
7114+            self.failUnlessEqual(n, 10)
7115+            self.failUnlessEqual(segsize, 0)
7116+            self.failUnlessEqual(datalen, 0)
7117+        d.addCallback(_check_encoding_parameters)
7118+
7119+        # We should not be able to fetch a block, since there are no
7120+        # blocks to fetch
7121+        d.addCallback(lambda ignored:
7122+            self.shouldFail(LayoutInvalid, "get block on empty file",
7123+                            None,
7124+                            mr.get_block_and_salt, 0))
7125+        return d
7126+
7127+
7128+    def test_read_with_empty_sdmf_file(self):
7129+        self.write_sdmf_share_to_server("si1", empty=True)
7130+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7131+        # We should be able to get the encoding parameters, and they
7132+        # should be correct
7133+        d = defer.succeed(None)
7134+        d.addCallback(lambda ignored:
7135+            mr.get_encoding_parameters())
7136+        def _check_encoding_parameters(params):
7137+            self.failUnlessEqual(len(params), 4)
7138+            k, n, segsize, datalen = params
7139+            self.failUnlessEqual(k, 3)
7140+            self.failUnlessEqual(n, 10)
7141+            self.failUnlessEqual(segsize, 0)
7142+            self.failUnlessEqual(datalen, 0)
7143+        d.addCallback(_check_encoding_parameters)
7144+
7145+        # It does not make sense to get a block in this format, so we
7146+        # should not be able to.
7147+        d.addCallback(lambda ignored:
7148+            self.shouldFail(LayoutInvalid, "get block on an empty file",
7149+                            None,
7150+                            mr.get_block_and_salt, 0))
7151+        return d
7152+
7153+
7154+    def test_verinfo_with_sdmf_file(self):
7155+        self.write_sdmf_share_to_server("si1")
7156+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7157+        # We should be able to get the version information.
7158+        d = defer.succeed(None)
7159+        d.addCallback(lambda ignored:
7160+            mr.get_verinfo())
7161+        def _check_verinfo(verinfo):
7162+            self.failUnless(verinfo)
7163+            self.failUnlessEqual(len(verinfo), 9)
7164+            (seqnum,
7165+             root_hash,
7166+             salt,
7167+             segsize,
7168+             datalen,
7169+             k,
7170+             n,
7171+             prefix,
7172+             offsets) = verinfo
7173+            self.failUnlessEqual(seqnum, 0)
7174+            self.failUnlessEqual(root_hash, self.root_hash)
7175+            self.failUnlessEqual(salt, self.salt)
7176+            self.failUnlessEqual(segsize, 36)
7177+            self.failUnlessEqual(datalen, 36)
7178+            self.failUnlessEqual(k, 3)
7179+            self.failUnlessEqual(n, 10)
7180+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
7181+                                          0,
7182+                                          seqnum,
7183+                                          root_hash,
7184+                                          salt,
7185+                                          k,
7186+                                          n,
7187+                                          segsize,
7188+                                          datalen)
7189+            self.failUnlessEqual(prefix, expected_prefix)
7190+            self.failUnlessEqual(offsets, self.offsets)
7191+        d.addCallback(_check_verinfo)
7192+        return d
7193+
7194+
7195+    def test_verinfo_with_mdmf_file(self):
7196+        self.write_test_share_to_server("si1")
7197+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7198+        d = defer.succeed(None)
7199+        d.addCallback(lambda ignored:
7200+            mr.get_verinfo())
7201+        def _check_verinfo(verinfo):
7202+            self.failUnless(verinfo)
7203+            self.failUnlessEqual(len(verinfo), 9)
7204+            (seqnum,
7205+             root_hash,
7206+             IV,
7207+             segsize,
7208+             datalen,
7209+             k,
7210+             n,
7211+             prefix,
7212+             offsets) = verinfo
7213+            self.failUnlessEqual(seqnum, 0)
7214+            self.failUnlessEqual(root_hash, self.root_hash)
7215+            self.failIf(IV)
7216+            self.failUnlessEqual(segsize, 6)
7217+            self.failUnlessEqual(datalen, 36)
7218+            self.failUnlessEqual(k, 3)
7219+            self.failUnlessEqual(n, 10)
7220+            expected_prefix = struct.pack(">BQ32s BBQQ",
7221+                                          1,
7222+                                          seqnum,
7223+                                          root_hash,
7224+                                          k,
7225+                                          n,
7226+                                          segsize,
7227+                                          datalen)
7228+            self.failUnlessEqual(prefix, expected_prefix)
7229+            self.failUnlessEqual(offsets, self.offsets)
7230+        d.addCallback(_check_verinfo)
7231+        return d
7232+
7233+
7234+    def test_reader_queue(self):
7235+        self.write_test_share_to_server('si1')
7236+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7237+        d1 = mr.get_block_and_salt(0, queue=True)
7238+        d2 = mr.get_blockhashes(queue=True)
7239+        d3 = mr.get_sharehashes(queue=True)
7240+        d4 = mr.get_signature(queue=True)
7241+        d5 = mr.get_verification_key(queue=True)
7242+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
7243+        mr.flush()
7244+        def _print(results):
7245+            self.failUnlessEqual(len(results), 5)
7246+            # We have one read for version information and offsets, and
7247+            # one for everything else.
7248+            self.failUnlessEqual(self.rref.read_count, 2)
7249+            block, salt = results[0][1] # each entry in results is a
7250+                                           # (success, value) pair; [0] is a
7251+                                           # boolean, [1] is the value.
7252+            self.failUnlessEqual(self.block, block)
7253+            self.failUnlessEqual(self.salt, salt)
7254+
7255+            blockhashes = results[1][1]
7256+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
7257+
7258+            sharehashes = results[2][1]
7259+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
7260+
7261+            signature = results[3][1]
7262+            self.failUnlessEqual(self.signature, signature)
7263+
7264+            verification_key = results[4][1]
7265+            self.failUnlessEqual(self.verification_key, verification_key)
7266+        dl.addCallback(_print)
7267+        return dl
7268+
7269+
7270+    def test_sdmf_writer(self):
7271+        # Go through the motions of writing an SDMF share to the storage
7272+        # server. Then read the storage server to see that the share got
7273+        # written in the way that we think it should have.
7274+
7275+        # We do this first so that the necessary instance variables get
7276+        # set the way we want them for the tests below.
7277+        data = self.build_test_sdmf_share()
7278+        sdmfr = SDMFSlotWriteProxy(0,
7279+                                   self.rref,
7280+                                   "si1",
7281+                                   self.secrets,
7282+                                   0, 3, 10, 36, 36)
7283+        # Put the block and salt.
7284+        sdmfr.put_block(self.blockdata, 0, self.salt)
7285+
7286+        # Put the encprivkey
7287+        sdmfr.put_encprivkey(self.encprivkey)
7288+
7289+        # Put the block and share hash chains
7290+        sdmfr.put_blockhashes(self.block_hash_tree)
7291+        sdmfr.put_sharehashes(self.share_hash_chain)
7292+        sdmfr.put_root_hash(self.root_hash)
7293+
7294+        # Put the signature
7295+        sdmfr.put_signature(self.signature)
7296+
7297+        # Put the verification key
7298+        sdmfr.put_verification_key(self.verification_key)
7299+
7300+        # Now check to make sure that nothing has been written yet.
7301+        self.failUnlessEqual(self.rref.write_count, 0)
7302+
7303+        # Now finish publishing
7304+        d = sdmfr.finish_publishing()
7305+        def _then(ignored):
7306+            self.failUnlessEqual(self.rref.write_count, 1)
7307+            read = self.ss.remote_slot_readv
7308+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
7309+                                 {0: [data]})
7310+        d.addCallback(_then)
7311+        return d
7312+
7313+
7314+    def test_sdmf_writer_preexisting_share(self):
7315+        data = self.build_test_sdmf_share()
7316+        self.write_sdmf_share_to_server("si1")
7317+
7318+        # Now there is a share on the storage server. To successfully
7319+        # write, we need to set the checkstring correctly. When we
7320+        # don't, no write should occur.
7321+        sdmfw = SDMFSlotWriteProxy(0,
7322+                                   self.rref,
7323+                                   "si1",
7324+                                   self.secrets,
7325+                                   1, 3, 10, 36, 36)
7326+        sdmfw.put_block(self.blockdata, 0, self.salt)
7327+
7328+        # Put the encprivkey
7329+        sdmfw.put_encprivkey(self.encprivkey)
7330+
7331+        # Put the block and share hash chains
7332+        sdmfw.put_blockhashes(self.block_hash_tree)
7333+        sdmfw.put_sharehashes(self.share_hash_chain)
7334+
7335+        # Put the root hash
7336+        sdmfw.put_root_hash(self.root_hash)
7337+
7338+        # Put the signature
7339+        sdmfw.put_signature(self.signature)
7340+
7341+        # Put the verification key
7342+        sdmfw.put_verification_key(self.verification_key)
7343+
7344+        # We shouldn't have a checkstring yet
7345+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
7346+
7347+        d = sdmfw.finish_publishing()
7348+        def _then(results):
7349+            self.failIf(results[0])
7350+            # this is the correct checkstring
7351+            self._expected_checkstring = results[1][0][0]
7352+            return self._expected_checkstring
7353+
7354+        d.addCallback(_then)
7355+        d.addCallback(sdmfw.set_checkstring)
7356+        d.addCallback(lambda ignored:
7357+            sdmfw.get_checkstring())
7358+        d.addCallback(lambda checkstring:
7359+            self.failUnlessEqual(checkstring, self._expected_checkstring))
7360+        d.addCallback(lambda ignored:
7361+            sdmfw.finish_publishing())
7362+        def _then_again(results):
7363+            self.failUnless(results[0])
7364+            read = self.ss.remote_slot_readv
7365+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7366+                                 {0: [struct.pack(">Q", 1)]})
7367+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
7368+                                 {0: [data[9:]]})
7369+        d.addCallback(_then_again)
7370+        return d
7371+
7372+
7373 class Stats(unittest.TestCase):
7374 
7375     def setUp(self):
7376}
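
The test_sdmf_writer_preexisting_share test above exercises the test-and-set
behaviour of SDMFSlotWriteProxy: a publish against an existing share fails
until the proxy's checkstring matches what the server holds. A minimal sketch
of that retry pattern, assuming an SDMFSlotWriteProxy `sdmfw` that has already
had its block, hashes, signature, and verification key put on it, and that the
pre-existing share is share 0 (as in the test); the publish_with_retry helper
and its retry policy are illustrative, not part of the patch:

    def publish_with_retry(sdmfw):
        d = sdmfw.finish_publishing()
        def _maybe_retry(results):
            wrote, read_data = results
            if wrote:
                # the test vector matched, so the share was written
                return results
            # The server vetoed the write. Adopt the checkstring it
            # reported for share 0, then publish again.
            sdmfw.set_checkstring(read_data[0][0])
            return sdmfw.finish_publishing()
        d.addCallback(_maybe_retry)
        return d
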
7377[mutable/publish.py: cleanup + simplification
7378Kevan Carstensen <kevan@isnotajoke.com>**20100702225554
7379 Ignore-this: 36a58424ceceffb1ddc55cc5934399e2
7380] {
7381hunk ./src/allmydata/mutable/publish.py 19
7382      UncoordinatedWriteError, NotEnoughServersError
7383 from allmydata.mutable.servermap import ServerMap
7384 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
7385-     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
7386+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
7387+     SDMFSlotWriteProxy
7388 
7389 KiB = 1024
7390 DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
7391hunk ./src/allmydata/mutable/publish.py 24
7392+PUSHING_BLOCKS_STATE = 0
7393+PUSHING_EVERYTHING_ELSE_STATE = 1
7394+DONE_STATE = 2
7395 
7396 class PublishStatus:
7397     implements(IPublishStatus)
7398hunk ./src/allmydata/mutable/publish.py 229
7399 
7400         self.bad_share_checkstrings = {}
7401 
7402+        # This is set at the last step of the publishing process.
7403+        self.versioninfo = ""
7404+
7405         # we use the servermap to populate the initial goal: this way we will
7406         # try to update each existing share in place.
7407         for (peerid, shnum) in self._servermap.servermap:
7408hunk ./src/allmydata/mutable/publish.py 245
7409             self.bad_share_checkstrings[key] = old_checkstring
7410             self.connections[peerid] = self._servermap.connections[peerid]
7411 
7412-        # Now, the process dovetails -- if this is an SDMF file, we need
7413-        # to write an SDMF file. Otherwise, we need to write an MDMF
7414-        # file.
7415-        if self._version == MDMF_VERSION:
7416-            return self._publish_mdmf()
7417-        else:
7418-            return self._publish_sdmf()
7419-        #return self.done_deferred
7420-
7421-    def _publish_mdmf(self):
7422-        # Next, we find homes for all of the shares that we don't have
7423-        # homes for yet.
7424         # TODO: Make this part do peer selection.
7425         self.update_goal()
7426         self.writers = {}
7427hunk ./src/allmydata/mutable/publish.py 248
7428-        # For each (peerid, shnum) in self.goal, we make an
7429-        # MDMFSlotWriteProxy for that peer. We'll use this to write
7430+        if self._version == MDMF_VERSION:
7431+            writer_class = MDMFSlotWriteProxy
7432+        else:
7433+            writer_class = SDMFSlotWriteProxy
7434+
7435+        # For each (peerid, shnum) in self.goal, we make a
7436+        # write proxy for that peer. We'll use this to write
7437         # shares to the peer.
7438         for key in self.goal:
7439             peerid, shnum = key
7440hunk ./src/allmydata/mutable/publish.py 263
7441             cancel_secret = self._node.get_cancel_secret(peerid)
7442             secrets = (write_enabler, renew_secret, cancel_secret)
7443 
7444-            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
7445-                                                      self.connections[peerid],
7446-                                                      self._storage_index,
7447-                                                      secrets,
7448-                                                      self._new_seqnum,
7449-                                                      self.required_shares,
7450-                                                      self.total_shares,
7451-                                                      self.segment_size,
7452-                                                      len(self.newdata))
7453+            self.writers[shnum] =  writer_class(shnum,
7454+                                                self.connections[peerid],
7455+                                                self._storage_index,
7456+                                                secrets,
7457+                                                self._new_seqnum,
7458+                                                self.required_shares,
7459+                                                self.total_shares,
7460+                                                self.segment_size,
7461+                                                len(self.newdata))
7462+            self.writers[shnum].peerid = peerid
7463             if (peerid, shnum) in self._servermap.servermap:
7464                 old_versionid, old_timestamp = self._servermap.servermap[key]
7465                 (old_seqnum, old_root_hash, old_salt, old_segsize,
7466hunk ./src/allmydata/mutable/publish.py 278
7467                  old_datalength, old_k, old_N, old_prefix,
7468                  old_offsets_tuple) = old_versionid
7469-                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
7470+                self.writers[shnum].set_checkstring(old_seqnum,
7471+                                                    old_root_hash,
7472+                                                    old_salt)
7473+            elif (peerid, shnum) in self.bad_share_checkstrings:
7474+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
7475+                self.writers[shnum].set_checkstring(old_checkstring)
7476+
7477+        # Our remote shares will not have a complete checkstring until
7478+        # after we are done writing block data and have pushed the root
7479+        # hash. In the meantime, we need to know what to look for when
7480+        # writing, so that we can detect UncoordinatedWriteErrors.
7481+        self._checkstring = self.writers.values()[0].get_checkstring()
7482 
7483         # Now, we start pushing shares.
7484         self._status.timings["setup"] = time.time() - self._started
7485hunk ./src/allmydata/mutable/publish.py 293
7486-        def _start_pushing(res):
7487-            self._started_pushing = time.time()
7488-            return res
7489-
7490         # First, we encrypt, encode, and publish the shares that we need
7491         # to encrypt, encode, and publish.
7492 
7493hunk ./src/allmydata/mutable/publish.py 306
7494 
7495         d = defer.succeed(None)
7496         self.log("Starting push")
7497-        for i in xrange(self.num_segments - 1):
7498-            d.addCallback(lambda ignored, i=i:
7499-                self.push_segment(i))
7500-            d.addCallback(self._turn_barrier)
7501-        # We have at least one segment, so we will have a tail segment
7502-        if self.num_segments > 0:
7503-            d.addCallback(lambda ignored:
7504-                self.push_tail_segment())
7505-
7506-        d.addCallback(lambda ignored:
7507-            self.push_encprivkey())
7508-        d.addCallback(lambda ignored:
7509-            self.push_blockhashes())
7510-        d.addCallback(lambda ignored:
7511-            self.push_sharehashes())
7512-        d.addCallback(lambda ignored:
7513-            self.push_toplevel_hashes_and_signature())
7514-        d.addCallback(lambda ignored:
7515-            self.finish_publishing())
7516-        return d
7517-
7518-
7519-    def _publish_sdmf(self):
7520-        self._status.timings["setup"] = time.time() - self._started
7521-        self.salt = os.urandom(16)
7522 
7523hunk ./src/allmydata/mutable/publish.py 307
7524-        d = self._encrypt_and_encode()
7525-        d.addCallback(self._generate_shares)
7526-        def _start_pushing(res):
7527-            self._started_pushing = time.time()
7528-            return res
7529-        d.addCallback(_start_pushing)
7530-        d.addCallback(self.loop) # trigger delivery
7531-        d.addErrback(self._fatal_error)
7532+        self._state = PUSHING_BLOCKS_STATE
7533+        self._push()
7534 
7535         return self.done_deferred
7536 
7537hunk ./src/allmydata/mutable/publish.py 327
7538                                                   segment_size)
7539         else:
7540             self.num_segments = 0
7541+
7542+        self.log("building encoding parameters for file")
7543+        self.log("got segsize %d" % self.segment_size)
7544+        self.log("got %d segments" % self.num_segments)
7545+
7546         if self._version == SDMF_VERSION:
7547             assert self.num_segments in (0, 1) # SDMF
7548hunk ./src/allmydata/mutable/publish.py 334
7549-            return
7550         # calculate the tail segment size.
7551hunk ./src/allmydata/mutable/publish.py 335
7552-        self.tail_segment_size = len(self.newdata) % segment_size
7553 
7554hunk ./src/allmydata/mutable/publish.py 336
7555-        if self.tail_segment_size == 0:
7556+        if segment_size and self.newdata:
7557+            self.tail_segment_size = len(self.newdata) % segment_size
7558+        else:
7559+            self.tail_segment_size = 0
7560+
7561+        if self.tail_segment_size == 0 and segment_size:
7562             # The tail segment is the same size as the other segments.
7563             self.tail_segment_size = segment_size
7564 
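
As a worked example of the tail-segment rules above (illustrative figures,
not taken from the patch): with the default 128 KiB segment size, a 300 KiB
file is split into three segments with a 44 KiB tail, while a 256 KiB file
has a remainder of zero and therefore gets a full-sized 128 KiB tail segment:

    KiB = 1024
    segment_size = 128 * KiB
    for datalen in (300 * KiB, 256 * KiB):
        tail = datalen % segment_size      # 44 KiB, then 0
        if tail == 0:
            tail = segment_size            # zero remainder means the tail
                                           # is a full-sized segment
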
7565hunk ./src/allmydata/mutable/publish.py 345
7566-        # We'll make an encoder ahead-of-time for the normal-sized
7567-        # segments (defined as any segment of segment_size size.
7568-        # (the part of the code that puts the tail segment will make its
7569-        #  own encoder for that part)
7570+        # Make FEC encoders
7571         fec = codec.CRSEncoder()
7572         fec.set_params(self.segment_size,
7573                        self.required_shares, self.total_shares)
7574hunk ./src/allmydata/mutable/publish.py 352
7575         self.piece_size = fec.get_block_size()
7576         self.fec = fec
7577 
7578+        if self.tail_segment_size == self.segment_size:
7579+            self.tail_fec = self.fec
7580+        else:
7581+            tail_fec = codec.CRSEncoder()
7582+            tail_fec.set_params(self.tail_segment_size,
7583+                                self.required_shares,
7584+                                self.total_shares)
7585+            self.tail_fec = tail_fec
7586+
7587+        self._current_segment = 0
7588+
7589+
7590+    def _push(self, ignored=None):
7591+        """
7592+        I manage state transitions. In particular, I check that we
7593+        still have enough writers left to complete the upload
7594+        successfully.
7595+        """
7596+        # Can we still successfully publish this file?
7597+        # TODO: Keep track of outstanding queries before aborting the
7598+        #       process.
7599+        if len(self.writers) <= self.required_shares or self.surprised:
7600+            return self._failure()
7601+
7602+        # Figure out what we need to do next. Each of these needs to
7603+        # return a deferred so that we don't block execution when this
7604+        # is first called in the upload method.
7605+        if self._state == PUSHING_BLOCKS_STATE:
7606+            return self.push_segment(self._current_segment)
7607+
7608+        # XXX: Do we want more granularity in states? Is that useful at
7609+        #      all?
7610+        #      Yes -- quicker reaction to UCW.
7611+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
7612+            return self.push_everything_else()
7613+
7614+        # If we make it to this point, we were successful in placing the
7615+        # file.
7616+        return self._done(None)
7617+
7618 
7619     def push_segment(self, segnum):
7620hunk ./src/allmydata/mutable/publish.py 394
7621+        if self.num_segments == 0 and self._version == SDMF_VERSION:
7622+            self._add_dummy_salts()
7623+
7624+        if segnum == self.num_segments:
7625+            # We don't have any more segments to push.
7626+            self._state = PUSHING_EVERYTHING_ELSE_STATE
7627+            return self._push()
7628+
7629+        d = self._encode_segment(segnum)
7630+        d.addCallback(self._push_segment, segnum)
7631+        def _increment_segnum(ign):
7632+            self._current_segment += 1
7633+        # XXX: I don't think we need to do addBoth here -- any errBacks
7634+        # should be handled within push_segment.
7635+        d.addBoth(_increment_segnum)
7636+        d.addBoth(self._push)
7637+
7638+
7639+    def _add_dummy_salts(self):
7640+        """
7641+        SDMF files need a salt even if they're empty, or the signature
7642+        won't make sense. This method adds a dummy salt to each of our
7643+        SDMF writers so that they can write the signature later.
7644+        """
7645+        salt = os.urandom(16)
7646+        assert self._version == SDMF_VERSION
7647+
7648+        for writer in self.writers.itervalues():
7649+            writer.put_salt(salt)
7650+
7651+
7652+    def _encode_segment(self, segnum):
7653+        """
7654+        I encrypt and encode the segment segnum.
7655+        """
7656         started = time.time()
7657hunk ./src/allmydata/mutable/publish.py 430
7658-        segsize = self.segment_size
7659+
7660+        if segnum + 1 == self.num_segments:
7661+            segsize = self.tail_segment_size
7662+        else:
7663+            segsize = self.segment_size
7664+
7665+
7666+        offset = self.segment_size * segnum
7667+        length = segsize + offset
7668         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
7669hunk ./src/allmydata/mutable/publish.py 440
7670-        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
7671+        data = self.newdata[offset:length]
7672         assert len(data) == segsize
7673 
7674         salt = os.urandom(16)
7675hunk ./src/allmydata/mutable/publish.py 455
7676         started = now
7677 
7678         # now apply FEC
7679+        if segnum + 1 == self.num_segments:
7680+            fec = self.tail_fec
7681+        else:
7682+            fec = self.fec
7683 
7684         self._status.set_status("Encoding")
7685         crypttext_pieces = [None] * self.required_shares
7686hunk ./src/allmydata/mutable/publish.py 462
7687-        piece_size = self.piece_size
7688+        piece_size = fec.get_block_size()
7689         for i in range(len(crypttext_pieces)):
7690             offset = i * piece_size
7691             piece = crypttext[offset:offset+piece_size]
7692hunk ./src/allmydata/mutable/publish.py 469
7693             piece = piece + "\x00"*(piece_size - len(piece)) # padding
7694             crypttext_pieces[i] = piece
7695             assert len(piece) == piece_size
7696-        d = self.fec.encode(crypttext_pieces)
7697+        d = fec.encode(crypttext_pieces)
7698         def _done_encoding(res):
7699             elapsed = time.time() - started
7700             self._status.timings["encode"] = elapsed
7701hunk ./src/allmydata/mutable/publish.py 473
7702-            return res
7703+            return (res, salt)
7704         d.addCallback(_done_encoding)
7705hunk ./src/allmydata/mutable/publish.py 475
7706-
7707-        def _push_shares_and_salt(results):
7708-            shares, shareids = results
7709-            dl = []
7710-            for i in xrange(len(shares)):
7711-                sharedata = shares[i]
7712-                shareid = shareids[i]
7713-                block_hash = hashutil.block_hash(salt + sharedata)
7714-                self.blockhashes[shareid].append(block_hash)
7715-
7716-                # find the writer for this share
7717-                d = self.writers[shareid].put_block(sharedata, segnum, salt)
7718-                dl.append(d)
7719-            # TODO: Naturally, we need to check on the results of these.
7720-            return defer.DeferredList(dl)
7721-        d.addCallback(_push_shares_and_salt)
7722         return d
7723 
7724 
7725hunk ./src/allmydata/mutable/publish.py 478
7726-    def push_tail_segment(self):
7727-        # This is essentially the same as push_segment, except that we
7728-        # don't use the cached encoder that we use elsewhere.
7729-        self.log("Pushing tail segment")
7730+    def _push_segment(self, encoded_and_salt, segnum):
7731+        """
7732+        I push (data, salt) as segment number segnum.
7733+        """
7734+        results, salt = encoded_and_salt
7735+        shares, shareids = results
7736         started = time.time()
7737hunk ./src/allmydata/mutable/publish.py 485
7738-        segsize = self.segment_size
7739-        data = self.newdata[segsize * (self.num_segments-1):]
7740-        assert len(data) == self.tail_segment_size
7741-        salt = os.urandom(16)
7742-
7743-        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
7744-        enc = AES(key)
7745-        crypttext = enc.process(data)
7746-        assert len(crypttext) == len(data)
7747+        dl = []
7748+        for i in xrange(len(shares)):
7749+            sharedata = shares[i]
7750+            shareid = shareids[i]
7751+            if self._version == MDMF_VERSION:
7752+                hashed = salt + sharedata
7753+            else:
7754+                hashed = sharedata
7755+            block_hash = hashutil.block_hash(hashed)
7756+            self.blockhashes[shareid].append(block_hash)
7757 
7758hunk ./src/allmydata/mutable/publish.py 496
7759-        now = time.time()
7760-        self._status.timings['encrypt'] = now - started
7761-        started = now
7762+            # find the writer for this share
7763+            writer = self.writers[shareid]
7764+            d = writer.put_block(sharedata, segnum, salt)
7765+            d.addCallback(self._got_write_answer, writer, started)
7766+            d.addErrback(self._connection_problem, writer)
7767+            dl.append(d)
7768+            # TODO: Naturally, we need to check on the results of these.
7769+        return defer.DeferredList(dl)
7770 
7771hunk ./src/allmydata/mutable/publish.py 505
7772-        self._status.set_status("Encoding")
7773-        tail_fec = codec.CRSEncoder()
7774-        tail_fec.set_params(self.tail_segment_size,
7775-                            self.required_shares,
7776-                            self.total_shares)
7777 
7778hunk ./src/allmydata/mutable/publish.py 506
7779-        crypttext_pieces = [None] * self.required_shares
7780-        piece_size = tail_fec.get_block_size()
7781-        for i in range(len(crypttext_pieces)):
7782-            offset = i * piece_size
7783-            piece = crypttext[offset:offset+piece_size]
7784-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
7785-            crypttext_pieces[i] = piece
7786-            assert len(piece) == piece_size
7787-        d = tail_fec.encode(crypttext_pieces)
7788-        def _push_shares_and_salt(results):
7789-            shares, shareids = results
7790-            dl = []
7791-            for i in xrange(len(shares)):
7792-                sharedata = shares[i]
7793-                shareid = shareids[i]
7794-                block_hash = hashutil.block_hash(salt + sharedata)
7795-                self.blockhashes[shareid].append(block_hash)
7796-                # find the writer for this share
7797-                d = self.writers[shareid].put_block(sharedata,
7798-                                                    self.num_segments - 1,
7799-                                                    salt)
7800-                dl.append(d)
7801-            # TODO: Naturally, we need to check on the results of these.
7802-            return defer.DeferredList(dl)
7803-        d.addCallback(_push_shares_and_salt)
7804+    def push_everything_else(self):
7805+        """
7806+        I put everything else associated with a share.
7807+        """
7808+        encprivkey = self._encprivkey
7809+        d = self.push_encprivkey()
7810+        d.addCallback(self.push_blockhashes)
7811+        d.addCallback(self.push_sharehashes)
7812+        d.addCallback(self.push_toplevel_hashes_and_signature)
7813+        d.addCallback(self.finish_publishing)
7814+        def _change_state(ignored):
7815+            self._state = DONE_STATE
7816+        d.addCallback(_change_state)
7817+        d.addCallback(self._push)
7818         return d
7819 
7820 
7821hunk ./src/allmydata/mutable/publish.py 527
7822         started = time.time()
7823         encprivkey = self._encprivkey
7824         dl = []
7825-        def _spy_on_writer(results):
7826-            print results
7827-            return results
7828-        for shnum, writer in self.writers.iteritems():
7829+        for writer in self.writers.itervalues():
7830             d = writer.put_encprivkey(encprivkey)
7831hunk ./src/allmydata/mutable/publish.py 529
7832+            d.addCallback(self._got_write_answer, writer, started)
7833+            d.addErrback(self._connection_problem, writer)
7834             dl.append(d)
7835         d = defer.DeferredList(dl)
7836         return d
7837hunk ./src/allmydata/mutable/publish.py 536
7838 
7839 
7840-    def push_blockhashes(self):
7841+    def push_blockhashes(self, ignored):
7842         started = time.time()
7843         dl = []
7844hunk ./src/allmydata/mutable/publish.py 539
7845-        def _spy_on_results(results):
7846-            print results
7847-            return results
7848         self.sharehash_leaves = [None] * len(self.blockhashes)
7849         for shnum, blockhashes in self.blockhashes.iteritems():
7850             t = hashtree.HashTree(blockhashes)
7851hunk ./src/allmydata/mutable/publish.py 545
7852             self.blockhashes[shnum] = list(t)
7853             # set the leaf for future use.
7854             self.sharehash_leaves[shnum] = t[0]
7855-            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
7856+            writer = self.writers[shnum]
7857+            d = writer.put_blockhashes(self.blockhashes[shnum])
7858+            d.addCallback(self._got_write_answer, writer, started)
7859+            d.addErrback(self._connection_problem, self.writers[shnum])
7860             dl.append(d)
7861         d = defer.DeferredList(dl)
7862         return d
7863hunk ./src/allmydata/mutable/publish.py 554
7864 
7865 
7866-    def push_sharehashes(self):
7867+    def push_sharehashes(self, ignored):
7868+        started = time.time()
7869         share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
7870         share_hash_chain = {}
7871         ds = []
7872hunk ./src/allmydata/mutable/publish.py 559
7873-        def _spy_on_results(results):
7874-            print results
7875-            return results
7876         for shnum in xrange(len(self.sharehash_leaves)):
7877             needed_indices = share_hash_tree.needed_hashes(shnum)
7878             self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
7879hunk ./src/allmydata/mutable/publish.py 563
7880                                              for i in needed_indices] )
7881-            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
7882+            writer = self.writers[shnum]
7883+            d = writer.put_sharehashes(self.sharehashes[shnum])
7884+            d.addCallback(self._got_write_answer, writer, started)
7885+            d.addErrback(self._connection_problem, writer)
7886             ds.append(d)
7887         self.root_hash = share_hash_tree[0]
7888         d = defer.DeferredList(ds)
7889hunk ./src/allmydata/mutable/publish.py 573
7890         return d
7891 
7892 
7893-    def push_toplevel_hashes_and_signature(self):
7894+    def push_toplevel_hashes_and_signature(self, ignored):
7895         # We need to do three things here:
7896         #   - Push the root hash and salt hash
7897         #   - Get the checkstring of the resulting layout; sign that.
7898hunk ./src/allmydata/mutable/publish.py 578
7899         #   - Push the signature
7900+        started = time.time()
7901         ds = []
7902hunk ./src/allmydata/mutable/publish.py 580
7903-        def _spy_on_results(results):
7904-            print results
7905-            return results
7906         for shnum in xrange(self.total_shares):
7907hunk ./src/allmydata/mutable/publish.py 581
7908-            d = self.writers[shnum].put_root_hash(self.root_hash)
7909+            writer = self.writers[shnum]
7910+            d = writer.put_root_hash(self.root_hash)
7911+            d.addCallback(self._got_write_answer, writer, started)
7912             ds.append(d)
7913         d = defer.DeferredList(ds)
7914hunk ./src/allmydata/mutable/publish.py 586
7915-        def _make_and_place_signature(ignored):
7916-            signable = self.writers[0].get_signable()
7917-            self.signature = self._privkey.sign(signable)
7918-
7919-            ds = []
7920-            for (shnum, writer) in self.writers.iteritems():
7921-                d = writer.put_signature(self.signature)
7922-                ds.append(d)
7923-            return defer.DeferredList(ds)
7924-        d.addCallback(_make_and_place_signature)
7925+        d.addCallback(self._update_checkstring)
7926+        d.addCallback(self._make_and_place_signature)
7927         return d
7928 
7929 
7930hunk ./src/allmydata/mutable/publish.py 591
7931-    def finish_publishing(self):
7932+    def _update_checkstring(self, ignored):
7933+        """
7934+        After putting the root hash, MDMF files will have the
7935+        checkstring written to the storage server. This means that we
7936+        can update our copy of the checkstring so we can detect
7937+        uncoordinated writes. SDMF files will have the same checkstring
7938+        as before, so updating it again is a harmless no-op.
7939+        """
7940+        self._checkstring = self.writers.values()[0].get_checkstring()
7941+
7942+
7943+    def _make_and_place_signature(self, ignored):
7944+        """
7945+        I create and place the signature.
7946+        """
7947+        started = time.time()
7948+        signable = self.writers[0].get_signable()
7949+        self.signature = self._privkey.sign(signable)
7950+
7951+        ds = []
7952+        for (shnum, writer) in self.writers.iteritems():
7953+            d = writer.put_signature(self.signature)
7954+            d.addCallback(self._got_write_answer, writer, started)
7955+            d.addErrback(self._connection_problem, writer)
7956+            ds.append(d)
7957+        return defer.DeferredList(ds)
7958+
7959+
7960+    def finish_publishing(self, ignored):
7961         # We're almost done -- we just need to put the verification key
7962         # and the offsets
7963hunk ./src/allmydata/mutable/publish.py 622
7964+        started = time.time()
7965         ds = []
7966         verification_key = self._pubkey.serialize()
7967 
7968hunk ./src/allmydata/mutable/publish.py 626
7969-        def _spy_on_results(results):
7970-            print results
7971-            return results
7972+
7973+        # TODO: Bad: errbacks may remove entries from self.writers while
7974+        # we are iterating over it. We should iterate over a copy instead.
7975         for (shnum, writer) in self.writers.iteritems():
7976             d = writer.put_verification_key(verification_key)
7977hunk ./src/allmydata/mutable/publish.py 631
7978+            d.addCallback(self._got_write_answer, writer, started)
7979+            d.addCallback(self._record_verinfo)
7980             d.addCallback(lambda ignored, writer=writer:
7981                 writer.finish_publishing())
7982hunk ./src/allmydata/mutable/publish.py 635
7983+            d.addCallback(self._got_write_answer, writer, started)
7984+            d.addErrback(self._connection_problem, writer)
7985             ds.append(d)
7986         return defer.DeferredList(ds)
7987 
7988hunk ./src/allmydata/mutable/publish.py 641
7989 
7990-    def _turn_barrier(self, res):
7991-        # putting this method in a Deferred chain imposes a guaranteed
7992-        # reactor turn between the pre- and post- portions of that chain.
7993-        # This can be useful to limit memory consumption: since Deferreds do
7994-        # not do tail recursion, code which uses defer.succeed(result) for
7995-        # consistency will cause objects to live for longer than you might
7996-        # normally expect.
7997-        return fireEventually(res)
7998+    def _record_verinfo(self, ignored):
7999+        self.versioninfo = self.writers.values()[0].get_verinfo()
8000 
8001 
8002hunk ./src/allmydata/mutable/publish.py 645
8003-    def _fatal_error(self, f):
8004-        self.log("error during loop", failure=f, level=log.UNUSUAL)
8005-        self._done(f)
8006+    def _connection_problem(self, f, writer):
8007+        """
8008+        We ran into a connection problem while working with writer, and
8009+        need to deal with that.
8010+        """
8011+        self.log("found problem: %s" % str(f))
8012+        self._last_failure = f
8013+        del(self.writers[writer.shnum])
8014 
8015hunk ./src/allmydata/mutable/publish.py 654
8016-    def _update_status(self):
8017-        self._status.set_status("Sending Shares: %d placed out of %d, "
8018-                                "%d messages outstanding" %
8019-                                (len(self.placed),
8020-                                 len(self.goal),
8021-                                 len(self.outstanding)))
8022-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
8023 
8024     def loop(self, ignored=None):
8025         self.log("entering loop", level=log.NOISY)
8026hunk ./src/allmydata/mutable/publish.py 778
8027             self.log_goal(self.goal, "after update: ")
8028 
8029 
8030-    def _encrypt_and_encode(self):
8031-        # this returns a Deferred that fires with a list of (sharedata,
8032-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
8033-        # shares that we care about.
8034-        self.log("_encrypt_and_encode")
8035-
8036-        self._status.set_status("Encrypting")
8037-        started = time.time()
8038+    def _got_write_answer(self, answer, writer, started):
8039+        if not answer:
8040+            # SDMF writers only pretend to write when callers set their
8041+            # blocks, salts, and so on -- they actually just write once,
8042+            # at the end of the upload process. In fake writes, they
8043+            # return defer.succeed(None). If we see that, we shouldn't
8044+            # bother checking it.
8045+            return
8046 
8047hunk ./src/allmydata/mutable/publish.py 787
8048-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
8049-        enc = AES(key)
8050-        crypttext = enc.process(self.newdata)
8051-        assert len(crypttext) == len(self.newdata)
8052+        peerid = writer.peerid
8053+        lp = self.log("_got_write_answer from %s, share %d" %
8054+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
8055 
8056         now = time.time()
8057hunk ./src/allmydata/mutable/publish.py 792
8058-        self._status.timings["encrypt"] = now - started
8059-        started = now
8060-
8061-        # now apply FEC
8062-
8063-        self._status.set_status("Encoding")
8064-        fec = codec.CRSEncoder()
8065-        fec.set_params(self.segment_size,
8066-                       self.required_shares, self.total_shares)
8067-        piece_size = fec.get_block_size()
8068-        crypttext_pieces = [None] * self.required_shares
8069-        for i in range(len(crypttext_pieces)):
8070-            offset = i * piece_size
8071-            piece = crypttext[offset:offset+piece_size]
8072-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
8073-            crypttext_pieces[i] = piece
8074-            assert len(piece) == piece_size
8075-
8076-        d = fec.encode(crypttext_pieces)
8077-        def _done_encoding(res):
8078-            elapsed = time.time() - started
8079-            self._status.timings["encode"] = elapsed
8080-            return res
8081-        d.addCallback(_done_encoding)
8082-        return d
8083-
8084-
8085-    def _generate_shares(self, shares_and_shareids):
8086-        # this sets self.shares and self.root_hash
8087-        self.log("_generate_shares")
8088-        self._status.set_status("Generating Shares")
8089-        started = time.time()
8090-
8091-        # we should know these by now
8092-        privkey = self._privkey
8093-        encprivkey = self._encprivkey
8094-        pubkey = self._pubkey
8095-
8096-        (shares, share_ids) = shares_and_shareids
8097-
8098-        assert len(shares) == len(share_ids)
8099-        assert len(shares) == self.total_shares
8100-        all_shares = {}
8101-        block_hash_trees = {}
8102-        share_hash_leaves = [None] * len(shares)
8103-        for i in range(len(shares)):
8104-            share_data = shares[i]
8105-            shnum = share_ids[i]
8106-            all_shares[shnum] = share_data
8107-
8108-            # build the block hash tree. SDMF has only one leaf.
8109-            leaves = [hashutil.block_hash(share_data)]
8110-            t = hashtree.HashTree(leaves)
8111-            block_hash_trees[shnum] = list(t)
8112-            share_hash_leaves[shnum] = t[0]
8113-        for leaf in share_hash_leaves:
8114-            assert leaf is not None
8115-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
8116-        share_hash_chain = {}
8117-        for shnum in range(self.total_shares):
8118-            needed_hashes = share_hash_tree.needed_hashes(shnum)
8119-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
8120-                                              for i in needed_hashes ] )
8121-        root_hash = share_hash_tree[0]
8122-        assert len(root_hash) == 32
8123-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
8124-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
8125-
8126-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
8127-                             self.required_shares, self.total_shares,
8128-                             self.segment_size, len(self.newdata))
8129-
8130-        # now pack the beginning of the share. All shares are the same up
8131-        # to the signature, then they have divergent share hash chains,
8132-        # then completely different block hash trees + salt + share data,
8133-        # then they all share the same encprivkey at the end. The sizes
8134-        # of everything are the same for all shares.
8135-
8136-        sign_started = time.time()
8137-        signature = privkey.sign(prefix)
8138-        self._status.timings["sign"] = time.time() - sign_started
8139-
8140-        verification_key = pubkey.serialize()
8141-
8142-        final_shares = {}
8143-        for shnum in range(self.total_shares):
8144-            final_share = pack_share(prefix,
8145-                                     verification_key,
8146-                                     signature,
8147-                                     share_hash_chain[shnum],
8148-                                     block_hash_trees[shnum],
8149-                                     all_shares[shnum],
8150-                                     encprivkey)
8151-            final_shares[shnum] = final_share
8152-        elapsed = time.time() - started
8153-        self._status.timings["pack"] = elapsed
8154-        self.shares = final_shares
8155-        self.root_hash = root_hash
8156-
8157-        # we also need to build up the version identifier for what we're
8158-        # pushing. Extract the offsets from one of our shares.
8159-        assert final_shares
8160-        offsets = unpack_header(final_shares.values()[0])[-1]
8161-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
8162-        verinfo = (self._new_seqnum, root_hash, self.salt,
8163-                   self.segment_size, len(self.newdata),
8164-                   self.required_shares, self.total_shares,
8165-                   prefix, offsets_tuple)
8166-        self.versioninfo = verinfo
8167-
8168-
8169-
8170-    def _send_shares(self, needed):
8171-        self.log("_send_shares")
8172-
8173-        # we're finally ready to send out our shares. If we encounter any
8174-        # surprises here, it's because somebody else is writing at the same
8175-        # time. (Note: in the future, when we remove the _query_peers() step
8176-        # and instead speculate about [or remember] which shares are where,
8177-        # surprises here are *not* indications of UncoordinatedWriteError,
8178-        # and we'll need to respond to them more gracefully.)
8179-
8180-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
8181-        # organize it by peerid.
8182-
8183-        peermap = DictOfSets()
8184-        for (peerid, shnum) in needed:
8185-            peermap.add(peerid, shnum)
8186-
8187-        # the next thing is to build up a bunch of test vectors. The
8188-        # semantics of Publish are that we perform the operation if the world
8189-        # hasn't changed since the ServerMap was constructed (more or less).
8190-        # For every share we're trying to place, we create a test vector that
8191-        # tests to see if the server*share still corresponds to the
8192-        # map.
8193-
8194-        all_tw_vectors = {} # maps peerid to tw_vectors
8195-        sm = self._servermap.servermap
8196-
8197-        for key in needed:
8198-            (peerid, shnum) = key
8199-
8200-            if key in sm:
8201-                # an old version of that share already exists on the
8202-                # server, according to our servermap. We will create a
8203-                # request that attempts to replace it.
8204-                old_versionid, old_timestamp = sm[key]
8205-                (old_seqnum, old_root_hash, old_salt, old_segsize,
8206-                 old_datalength, old_k, old_N, old_prefix,
8207-                 old_offsets_tuple) = old_versionid
8208-                old_checkstring = pack_checkstring(old_seqnum,
8209-                                                   old_root_hash,
8210-                                                   old_salt)
8211-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8212-
8213-            elif key in self.bad_share_checkstrings:
8214-                old_checkstring = self.bad_share_checkstrings[key]
8215-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8216-
8217-            else:
8218-                # add a testv that requires the share not exist
8219-
8220-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
8221-                # constraints are handled. If the same object is referenced
8222-                # multiple times inside the arguments, foolscap emits a
8223-                # 'reference' token instead of a distinct copy of the
8224-                # argument. The bug is that these 'reference' tokens are not
8225-                # accepted by the inbound constraint code. To work around
8226-                # this, we need to prevent python from interning the
8227-                # (constant) tuple, by creating a new copy of this vector
8228-                # each time.
8229-
8230-                # This bug is fixed in foolscap-0.2.6, and even though this
8231-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
8232-                # supposed to be able to interoperate with older versions of
8233-                # Tahoe which are allowed to use older versions of foolscap,
8234-                # including foolscap-0.2.5 . In addition, I've seen other
8235-                # foolscap problems triggered by 'reference' tokens (see #541
8236-                # for details). So we must keep this workaround in place.
8237-
8238-                #testv = (0, 1, 'eq', "")
8239-                testv = tuple([0, 1, 'eq', ""])
8240-
8241-            testvs = [testv]
8242-            # the write vector is simply the share
8243-            writev = [(0, self.shares[shnum])]
8244-
8245-            if peerid not in all_tw_vectors:
8246-                all_tw_vectors[peerid] = {}
8247-                # maps shnum to (testvs, writevs, new_length)
8248-            assert shnum not in all_tw_vectors[peerid]
8249-
8250-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
8251-
8252-        # we read the checkstring back from each share, however we only use
8253-        # it to detect whether there was a new share that we didn't know
8254-        # about. The success or failure of the write will tell us whether
8255-        # there was a collision or not. If there is a collision, the first
8256-        # thing we'll do is update the servermap, which will find out what
8257-        # happened. We could conceivably reduce a roundtrip by using the
8258-        # readv checkstring to populate the servermap, but really we'd have
8259-        # to read enough data to validate the signatures too, so it wouldn't
8260-        # be an overall win.
8261-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
8262-
8263-        # ok, send the messages!
8264-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
8265-        started = time.time()
8266-        for (peerid, tw_vectors) in all_tw_vectors.items():
8267-
8268-            write_enabler = self._node.get_write_enabler(peerid)
8269-            renew_secret = self._node.get_renewal_secret(peerid)
8270-            cancel_secret = self._node.get_cancel_secret(peerid)
8271-            secrets = (write_enabler, renew_secret, cancel_secret)
8272-            shnums = tw_vectors.keys()
8273-
8274-            for shnum in shnums:
8275-                self.outstanding.add( (peerid, shnum) )
8276-
8277-            d = self._do_testreadwrite(peerid, secrets,
8278-                                       tw_vectors, read_vector)
8279-            d.addCallbacks(self._got_write_answer, self._got_write_error,
8280-                           callbackArgs=(peerid, shnums, started),
8281-                           errbackArgs=(peerid, shnums, started))
8282-            # tolerate immediate errback, like with DeadReferenceError
8283-            d.addBoth(fireEventually)
8284-            d.addCallback(self.loop)
8285-            d.addErrback(self._fatal_error)
8286-
8287-        self._update_status()
8288-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
8289+        elapsed = now - started
8290 
8291hunk ./src/allmydata/mutable/publish.py 794
8292-    def _do_testreadwrite(self, peerid, secrets,
8293-                          tw_vectors, read_vector):
8294-        storage_index = self._storage_index
8295-        ss = self.connections[peerid]
8296+        self._status.add_per_server_time(peerid, elapsed)
8297 
8298hunk ./src/allmydata/mutable/publish.py 796
8299-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
8300-        d = ss.callRemote("slot_testv_and_readv_and_writev",
8301-                          storage_index,
8302-                          secrets,
8303-                          tw_vectors,
8304-                          read_vector)
8305-        return d
8306+        wrote, read_data = answer
8307 
8308hunk ./src/allmydata/mutable/publish.py 798
8309-    def _got_write_answer(self, answer, peerid, shnums, started):
8310-        lp = self.log("_got_write_answer from %s" %
8311-                      idlib.shortnodeid_b2a(peerid))
8312-        for shnum in shnums:
8313-            self.outstanding.discard( (peerid, shnum) )
8314+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
8315 
8316hunk ./src/allmydata/mutable/publish.py 800
8317-        now = time.time()
8318-        elapsed = now - started
8319-        self._status.add_per_server_time(peerid, elapsed)
8320+        # We need to remove from surprise_shares any shares that we are
8321+        # knowingly also writing to that peer through our other writers.
8322 
8323hunk ./src/allmydata/mutable/publish.py 803
8324-        wrote, read_data = answer
8325+        # TODO: Precompute this.
8326+        known_shnums = [x.shnum for x in self.writers.values()
8327+                        if x.peerid == peerid]
8328+        surprise_shares -= set(known_shnums)
8329+        self.log("found the following surprise shares: %s" %
8330+                 str(surprise_shares))
8331 
8332hunk ./src/allmydata/mutable/publish.py 810
8333-        surprise_shares = set(read_data.keys()) - set(shnums)
8334+        # Now surprise shares contains all of the shares that we did not
8335+        # expect to be there.
8336 
8337         surprised = False
8338         for shnum in surprise_shares:
8339hunk ./src/allmydata/mutable/publish.py 817
8340             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
8341             checkstring = read_data[shnum][0]
8342-            their_version_info = unpack_checkstring(checkstring)
8343-            if their_version_info == self._new_version_info:
8344+            # What we want to do here is to see if their (seqnum,
8345+            # roothash, salt) is the same as our (seqnum, roothash,
8346+            # salt), or the equivalent for MDMF. The best way to do this
8347+            # is to keep a packed copy of our own checkstring and compare
8348+            # it directly, rather than unpacking the checkstring that we
8349+            # read back from the server.
8350+            if checkstring == self._checkstring:
8351                 # they have the right share, somehow
8352 
8353                 if (peerid,shnum) in self.goal:
8354hunk ./src/allmydata/mutable/publish.py 902
8355             self.log("our testv failed, so the write did not happen",
8356                      parent=lp, level=log.WEIRD, umid="8sc26g")
8357             self.surprised = True
8358-            self.bad_peers.add(peerid) # don't ask them again
8359+            # TODO: This needs to
8360+            self.bad_peers.add(writer) # don't ask them again
8361             # use the checkstring to add information to the log message
8362             for (shnum,readv) in read_data.items():
8363                 checkstring = readv[0]
8364hunk ./src/allmydata/mutable/publish.py 928
8365             # self.loop() will take care of finding new homes
8366             return
8367 
8368-        for shnum in shnums:
8369-            self.placed.add( (peerid, shnum) )
8370-            # and update the servermap
8371-            self._servermap.add_new_share(peerid, shnum,
8372+        # and update the servermap
8373+        # self.versioninfo is set during the last phase of publishing.
8374+        # If we get there, we know that responses correspond to placed
8375+        # shares, and can safely execute these statements.
8376+        if self.versioninfo:
8377+            self.log("wrote successfully: adding new share to servermap")
8378+            self._servermap.add_new_share(peerid, writer.shnum,
8379                                           self.versioninfo, started)
8380hunk ./src/allmydata/mutable/publish.py 936
8381-
8382-        # self.loop() will take care of checking to see if we're done
8383-        return
8384+            self.placed.add( (peerid, writer.shnum) )
8385 
8386hunk ./src/allmydata/mutable/publish.py 938
8387-    def _got_write_error(self, f, peerid, shnums, started):
8388-        for shnum in shnums:
8389-            self.outstanding.discard( (peerid, shnum) )
8390-        self.bad_peers.add(peerid)
8391-        if self._first_write_error is None:
8392-            self._first_write_error = f
8393-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
8394-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
8395-                 failure=f,
8396-                 level=log.UNUSUAL)
8397         # self.loop() will take care of checking to see if we're done
8398         return
8399 
8400hunk ./src/allmydata/mutable/publish.py 949
8401         now = time.time()
8402         self._status.timings["total"] = now - self._started
8403         self._status.set_active(False)
8404-        if isinstance(res, failure.Failure):
8405-            self.log("Publish done, with failure", failure=res,
8406-                     level=log.WEIRD, umid="nRsR9Q")
8407-            self._status.set_status("Failed")
8408-        elif self.surprised:
8409-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
8410-            self._status.set_status("UncoordinatedWriteError")
8411-            # deliver a failure
8412-            res = failure.Failure(UncoordinatedWriteError())
8413-            # TODO: recovery
8414-        else:
8415-            self.log("Publish done, success")
8416-            self._status.set_status("Finished")
8417-            self._status.set_progress(1.0)
8418+        self.log("Publish done, success")
8419+        self._status.set_status("Finished")
8420+        self._status.set_progress(1.0)
8421         eventually(self.done_deferred.callback, res)
8422 
8423hunk ./src/allmydata/mutable/publish.py 954
8424+    def _failure(self):
8425+
8426+        if not self.surprised:
8427+            # We ran out of servers
8428+            self.log("Publish ran out of good servers, "
8429+                     "last failure was: %s" % str(self._last_failure))
8430+            e = NotEnoughServersError("Ran out of non-bad servers, "
8431+                                      "last failure was %s" %
8432+                                      str(self._last_failure))
8433+        else:
8434+            # We ran into shares that we didn't recognize, which means
8435+            # that we need to return an UncoordinatedWriteError.
8436+            self.log("Publish failed with UncoordinatedWriteError")
8437+            e = UncoordinatedWriteError()
8438+        f = failure.Failure(e)
8439+        eventually(self.done_deferred.callback, f)
8440}
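
The publish rewrite above replaces the separate SDMF and MDMF code paths with
one state machine: _push dispatches on PUSHING_BLOCKS_STATE,
PUSHING_EVERYTHING_ELSE_STATE, and DONE_STATE, and each step re-enters _push
when its deferred fires. A minimal standalone sketch of that control flow,
with stand-in bodies where the real Publish encrypts, encodes, and writes to
the storage servers; the PublishSketch class and its trivial deferreds are
illustrative only:

    from twisted.internet import defer

    PUSHING_BLOCKS_STATE = 0
    PUSHING_EVERYTHING_ELSE_STATE = 1
    DONE_STATE = 2

    class PublishSketch:
        def __init__(self, num_segments):
            self.num_segments = num_segments
            self._current_segment = 0
            self._state = PUSHING_BLOCKS_STATE
            self.done_deferred = defer.Deferred()

        def publish(self):
            self._push()
            return self.done_deferred

        def _push(self, ignored=None):
            # Dispatch on the current state; each step calls _push again
            # when it finishes, so control keeps returning here.
            if self._state == PUSHING_BLOCKS_STATE:
                return self.push_segment(self._current_segment)
            elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
                return self.push_everything_else()
            # DONE_STATE: all shares are placed.
            self.done_deferred.callback(None)

        def push_segment(self, segnum):
            if segnum == self.num_segments:
                self._state = PUSHING_EVERYTHING_ELSE_STATE
                return self._push()
            d = defer.succeed(None)   # stand-in for encrypt/encode/put_block
            def _advance(ign):
                self._current_segment += 1
            d.addBoth(_advance)
            d.addBoth(self._push)

        def push_everything_else(self):
            # stand-in for encprivkey, hash trees, signature, and
            # verification key
            d = defer.succeed(None)
            def _finished(ign):
                self._state = DONE_STATE
            d.addCallback(_finished)
            d.addCallback(self._push)
            return d

Compared to the old _publish_sdmf/_publish_mdmf split, the only per-format
differences left in the real code are the writer proxy class that is chosen
(MDMFSlotWriteProxy or SDMFSlotWriteProxy) and the block-hash input (salt plus
share data for MDMF, share data alone for SDMF).
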
8441[test/test_mutable.py: remove tests that are no longer relevant
8442Kevan Carstensen <kevan@isnotajoke.com>**20100702225710
8443 Ignore-this: 90a26b4cc4b2e190a635474ba7097e21
8444] hunk ./src/allmydata/test/test_mutable.py 627
8445         return d
8446 
8447 
8448-class MakeShares(unittest.TestCase):
8449-    def test_encrypt(self):
8450-        nm = make_nodemaker()
8451-        CONTENTS = "some initial contents"
8452-        d = nm.create_mutable_file(CONTENTS)
8453-        def _created(fn):
8454-            p = Publish(fn, nm.storage_broker, None)
8455-            p.salt = "SALT" * 4
8456-            p.readkey = "\x00" * 16
8457-            p.newdata = CONTENTS
8458-            p.required_shares = 3
8459-            p.total_shares = 10
8460-            p.setup_encoding_parameters()
8461-            return p._encrypt_and_encode()
8462-        d.addCallback(_created)
8463-        def _done(shares_and_shareids):
8464-            (shares, share_ids) = shares_and_shareids
8465-            self.failUnlessEqual(len(shares), 10)
8466-            for sh in shares:
8467-                self.failUnless(isinstance(sh, str))
8468-                self.failUnlessEqual(len(sh), 7)
8469-            self.failUnlessEqual(len(share_ids), 10)
8470-        d.addCallback(_done)
8471-        return d
8472-    test_encrypt.todo = "Write an equivalent of this for the new uploader"
8473-
8474-    def test_generate(self):
8475-        nm = make_nodemaker()
8476-        CONTENTS = "some initial contents"
8477-        d = nm.create_mutable_file(CONTENTS)
8478-        def _created(fn):
8479-            self._fn = fn
8480-            p = Publish(fn, nm.storage_broker, None)
8481-            self._p = p
8482-            p.newdata = CONTENTS
8483-            p.required_shares = 3
8484-            p.total_shares = 10
8485-            p.setup_encoding_parameters()
8486-            p._new_seqnum = 3
8487-            p.salt = "SALT" * 4
8488-            # make some fake shares
8489-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
8490-            p._privkey = fn.get_privkey()
8491-            p._encprivkey = fn.get_encprivkey()
8492-            p._pubkey = fn.get_pubkey()
8493-            return p._generate_shares(shares_and_ids)
8494-        d.addCallback(_created)
8495-        def _generated(res):
8496-            p = self._p
8497-            final_shares = p.shares
8498-            root_hash = p.root_hash
8499-            self.failUnlessEqual(len(root_hash), 32)
8500-            self.failUnless(isinstance(final_shares, dict))
8501-            self.failUnlessEqual(len(final_shares), 10)
8502-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
8503-            for i,sh in final_shares.items():
8504-                self.failUnless(isinstance(sh, str))
8505-                # feed the share through the unpacker as a sanity-check
8506-                pieces = unpack_share(sh)
8507-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
8508-                 pubkey, signature, share_hash_chain, block_hash_tree,
8509-                 share_data, enc_privkey) = pieces
8510-                self.failUnlessEqual(u_seqnum, 3)
8511-                self.failUnlessEqual(u_root_hash, root_hash)
8512-                self.failUnlessEqual(k, 3)
8513-                self.failUnlessEqual(N, 10)
8514-                self.failUnlessEqual(segsize, 21)
8515-                self.failUnlessEqual(datalen, len(CONTENTS))
8516-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
8517-                sig_material = struct.pack(">BQ32s16s BBQQ",
8518-                                           0, p._new_seqnum, root_hash, IV,
8519-                                           k, N, segsize, datalen)
8520-                self.failUnless(p._pubkey.verify(sig_material, signature))
8521-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
8522-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
8523-                for shnum,share_hash in share_hash_chain.items():
8524-                    self.failUnless(isinstance(shnum, int))
8525-                    self.failUnless(isinstance(share_hash, str))
8526-                    self.failUnlessEqual(len(share_hash), 32)
8527-                self.failUnless(isinstance(block_hash_tree, list))
8528-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
8529-                self.failUnlessEqual(IV, "SALT"*4)
8530-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
8531-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
8532-        d.addCallback(_generated)
8533-        return d
8534-    test_generate.todo = "Write an equivalent of this for the new uploader"
8535-
8536-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
8537-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
8538-    # when we publish to zero peers, we should get a NotEnoughSharesError
8539-
8540 class PublishMixin:
8541     def publish_one(self):
8542         # publish a file and create shares, which can then be manipulated
8543[interfaces.py: create IMutableUploadable
8544Kevan Carstensen <kevan@isnotajoke.com>**20100706215217
8545 Ignore-this: bee202ec2bfbd8e41f2d4019cce176c7
8546] hunk ./src/allmydata/interfaces.py 1693
8547         """The upload is finished, and whatever filehandle was in use may be
8548         closed."""
8549 
8550+
8551+class IMutableUploadable(Interface):
8552+    """
8553+    I represent content that is due to be uploaded to a mutable filecap.
8554+    """
8555+    # This is somewhat simpler than the IUploadable interface above
8556+    # because mutable files do not need to be concerned with possibly
8557+    # generating a CHK, nor with per-file keys. It is a subset of the
8558+    # methods in IUploadable, though, so we could just as well implement
8559+    # the mutable uploadables as IUploadables that don't happen to use
8560+    # those methods (with the understanding that the unused methods will
8561+    # never be called on such objects)
8562+    def get_size():
8563+        """
8564+        Returns the size of the content held by the uploadable, in
8565+        bytes.
8566+        """
8567+
8568+    def read(length):
8569+        """
8570+        Returns a list of strings which, when concatenated, are the next
8571+        length bytes of the file, or fewer if there are fewer bytes
8572+        between the current location and the end of the file.
8573+        """
8574+
8575+    def close():
8576+        """
8577+        The process that used the Uploadable is finished using it, so
8578+        the uploadable may be closed.
8579+        """
8580+
8581 class IUploadResults(Interface):
8582     """I am returned by upload() methods. I contain a number of public
8583     attributes which can be read to determine the results of the upload. Some
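
To make the IMutableUploadable contract concrete, here is a minimal,
hypothetical consumer that drains an uploadable in fixed-size chunks and
reassembles the bytes; the function and chunk size are illustrative only
(note that read() hands back a list of strings, and get_size() is a plain
byte count):

    def drain_uploadable(uploadable, chunk_size=1000):
        remaining = uploadable.get_size()
        pieces = []
        while remaining > 0:
            # read() returns a list of strings; fewer than chunk_size
            # bytes come back only at the end of the content.
            data = "".join(uploadable.read(min(chunk_size, remaining)))
            pieces.append(data)
            remaining -= len(data)
        uploadable.close()
        return "".join(pieces)
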
8584[mutable/publish.py: add MutableDataHandle and MutableFileHandle
8585Kevan Carstensen <kevan@isnotajoke.com>**20100706215257
8586 Ignore-this: 295ea3bc2a962fd14fb7877fc76c011c
8587] {
8588hunk ./src/allmydata/mutable/publish.py 8
8589 from zope.interface import implements
8590 from twisted.internet import defer
8591 from twisted.python import failure
8592-from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
8593+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
8594+                                 IMutableUploadable
8595 from allmydata.util import base32, hashutil, mathutil, idlib, log
8596 from allmydata import hashtree, codec
8597 from allmydata.storage.server import si_b2a
8598hunk ./src/allmydata/mutable/publish.py 971
8599             e = UncoordinatedWriteError()
8600         f = failure.Failure(e)
8601         eventually(self.done_deferred.callback, f)
8602+
8603+
8604+class MutableFileHandle:
8605+    """
8606+    I am a mutable uploadable built around a filehandle-like object,
8607+    usually either a StringIO instance or a handle to an actual file.
8608+    """
8609+    implements(IMutableUploadable)
8610+
8611+    def __init__(self, filehandle):
8612+        # The filehandle must at least provide read and close; get_size
8613+        # additionally relies on its seek and tell methods.
8614+        assert hasattr(filehandle, "read")
8615+        assert hasattr(filehandle, "close")
8616+
8617+        self._filehandle = filehandle
8618+
8619+
8620+    def get_size(self):
8621+        """
8622+        I return the amount of data in my filehandle.
8623+        """
8624+        if not hasattr(self, "_size"):
8625+            old_position = self._filehandle.tell()
8626+            # Seek to the end of the file by seeking 0 bytes from the
8627+            # file's end
8628+            self._filehandle.seek(0, os.SEEK_END)
8629+            self._size = self._filehandle.tell()
8630+            # Restore the previous position, in case this was called
8631+            # after a read.
8632+            self._filehandle.seek(old_position)
8633+            assert self._filehandle.tell() == old_position
8634+
8635+        assert hasattr(self, "_size")
8636+        return self._size
8637+
8638+
8639+    def read(self, length):
8640+        """
8641+        I return some data (up to length bytes) from my filehandle.
8642+
8643+        In most cases, I return length bytes. If I don't, it is because
8644+        length is longer than the distance between my current position
8645+        in the file that I represent and its end. In that case, I return
8646+        as many bytes as I can before going over the EOF.
8647+        """
8648+        return [self._filehandle.read(length)]
8649+
8650+
8651+    def close(self):
8652+        """
8653+        I close the underlying filehandle. Any further operations on the
8654+        filehandle fail at this point.
8655+        """
8656+        self._filehandle.close()
8657+
8658+
8659+class MutableDataHandle(MutableFileHandle):
8660+    """
8661+    I am a mutable uploadable built around a string, which I wrap in a
8662+    StringIO and treat as a filehandle.
8663+    """
8664+
8665+    def __init__(self, s):
8666+        # Take a string and return a file-like uploadable.
8667+        assert isinstance(s, str)
8668+
8669+        MutableFileHandle.__init__(self, StringIO(s))
8670}
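
A short usage sketch for the two uploadables added above (the sample data
is illustrative; both handles return read() results as lists of strings,
and MutableFileHandle expects the underlying object to support seek and
tell so that get_size can work):

    from StringIO import StringIO
    from allmydata.mutable.publish import MutableFileHandle, MutableDataHandle

    data = "test data" * 100

    # Wrap a plain string.
    dh = MutableDataHandle(data)
    assert dh.get_size() == len(data)
    assert "".join(dh.read(9)) == "test data"

    # Wrap a file-like object (a StringIO here, but a real file works too).
    fh = MutableFileHandle(StringIO(data))
    assert fh.get_size() == len(data)
    first = "".join(fh.read(4))
    rest = "".join(fh.read(len(data) - 4))
    assert first + rest == data
    fh.close()
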
8671[mutable/publish.py: reorganize in preparation for file-like uploadables
8672Kevan Carstensen <kevan@isnotajoke.com>**20100706215541
8673 Ignore-this: 5346c9f919ee5b73807c8f287c64e8ce
8674] {
8675hunk ./src/allmydata/mutable/publish.py 4
8676 
8677 
8678 import os, struct, time
8679+from StringIO import StringIO
8680 from itertools import count
8681 from zope.interface import implements
8682 from twisted.internet import defer
8683hunk ./src/allmydata/mutable/publish.py 118
8684         self._status.set_helper(False)
8685         self._status.set_progress(0.0)
8686         self._status.set_active(True)
8687-        # We use this to control how the file is written.
8688-        version = self._node.get_version()
8689-        assert version in (SDMF_VERSION, MDMF_VERSION)
8690-        self._version = version
8691+        self._version = self._node.get_version()
8692+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
8693+
8694 
8695     def get_status(self):
8696         return self._status
8697hunk ./src/allmydata/mutable/publish.py 141
8698 
8699         # 0. Setup encoding parameters, encoder, and other such things.
8700         # 1. Encrypt, encode, and publish segments.
8701+        self.data = StringIO(newdata)
8702+        self.datalength = len(newdata)
8703 
8704hunk ./src/allmydata/mutable/publish.py 144
8705-        self.log("starting publish, datalen is %s" % len(newdata))
8706-        self._status.set_size(len(newdata))
8707+        self.log("starting publish, datalen is %s" % self.datalength)
8708+        self._status.set_size(self.datalength)
8709         self._status.set_status("Started")
8710         self._started = time.time()
8711 
8712hunk ./src/allmydata/mutable/publish.py 193
8713         self.full_peerlist = full_peerlist # for use later, immutable
8714         self.bad_peers = set() # peerids who have errbacked/refused requests
8715 
8716-        self.newdata = newdata
8717-
8718         # This will set self.segment_size, self.num_segments, and
8719         # self.fec.
8720         self.setup_encoding_parameters()
8721hunk ./src/allmydata/mutable/publish.py 272
8722                                                 self.required_shares,
8723                                                 self.total_shares,
8724                                                 self.segment_size,
8725-                                                len(self.newdata))
8726+                                                self.datalength)
8727             self.writers[shnum].peerid = peerid
8728             if (peerid, shnum) in self._servermap.servermap:
8729                 old_versionid, old_timestamp = self._servermap.servermap[key]
8730hunk ./src/allmydata/mutable/publish.py 318
8731         if self._version == MDMF_VERSION:
8732             segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
8733         else:
8734-            segment_size = len(self.newdata) # SDMF is only one segment
8735+            segment_size = self.datalength # SDMF is only one segment
8736         # this must be a multiple of self.required_shares
8737         segment_size = mathutil.next_multiple(segment_size,
8738                                               self.required_shares)
8739hunk ./src/allmydata/mutable/publish.py 324
8740         self.segment_size = segment_size
8741         if segment_size:
8742-            self.num_segments = mathutil.div_ceil(len(self.newdata),
8743+            self.num_segments = mathutil.div_ceil(self.datalength,
8744                                                   segment_size)
8745         else:
8746             self.num_segments = 0
8747hunk ./src/allmydata/mutable/publish.py 337
8748             assert self.num_segments in (0, 1) # SDMF
8749         # calculate the tail segment size.
8750 
8751-        if segment_size and self.newdata:
8752-            self.tail_segment_size = len(self.newdata) % segment_size
8753+        if segment_size and self.datalength:
8754+            self.tail_segment_size = self.datalength % segment_size
8755         else:
8756             self.tail_segment_size = 0
8757 
8758hunk ./src/allmydata/mutable/publish.py 438
8759             segsize = self.segment_size
8760 
8761 
8762-        offset = self.segment_size * segnum
8763-        length = segsize + offset
8764         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
8765hunk ./src/allmydata/mutable/publish.py 439
8766-        data = self.newdata[offset:length]
8767+        data = self.data.read(segsize)
8768+
8769         assert len(data) == segsize
8770 
8771         salt = os.urandom(16)
8772hunk ./src/allmydata/mutable/publish.py 502
8773             d.addCallback(self._got_write_answer, writer, started)
8774             d.addErrback(self._connection_problem, writer)
8775             dl.append(d)
8776-            # TODO: Naturally, we need to check on the results of these.
8777         return defer.DeferredList(dl)
8778 
8779 
8780}
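
For reference, the segmentation arithmetic that the reorganized
setup_encoding_parameters performs on self.datalength can be paraphrased
as a small standalone sketch (the 128 KiB MDMF segment size and the
k = 3 required-shares default mirror the values mentioned in the patch;
everything else is illustrative):

    def div_ceil(n, d):
        return (n + d - 1) // d

    def next_multiple(n, k):
        return div_ceil(n, k) * k

    def segmentation(datalength, required_shares=3, mdmf=True):
        # MDMF caps segments at 128 KiB; SDMF always uses one segment.
        segment_size = 128 * 1024 if mdmf else datalength
        # must be a multiple of required_shares, as in the patch
        segment_size = next_multiple(segment_size, required_shares)
        num_segments = div_ceil(datalength, segment_size) if segment_size else 0
        tail = datalength % segment_size if (segment_size and datalength) else 0
        return segment_size, num_segments, tail

    # e.g. a 900 KiB MDMF file: segment_size becomes 131073 (128 KiB
    # rounded up to a multiple of 3), giving 8 segments with a
    # 4089-byte tail segment.
    print segmentation(900 * 1024)
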
8781[test/test_mutable.py: write tests for MutableFileHandle and MutableDataHandle
8782Kevan Carstensen <kevan@isnotajoke.com>**20100706215649
8783 Ignore-this: df719a0c52b4bbe9be4fae206c7ab3e7
8784] {
8785hunk ./src/allmydata/test/test_mutable.py 2
8786 
8787-import struct
8788+import struct, os
8789 from cStringIO import StringIO
8790 from twisted.trial import unittest
8791 from twisted.internet import defer, reactor
8792hunk ./src/allmydata/test/test_mutable.py 26
8793      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
8794      NotEnoughServersError, CorruptShareError
8795 from allmydata.mutable.retrieve import Retrieve
8796-from allmydata.mutable.publish import Publish
8797+from allmydata.mutable.publish import Publish, MutableFileHandle, \
8798+                                      MutableDataHandle
8799 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
8800 from allmydata.mutable.layout import unpack_header, unpack_share, \
8801                                      MDMFSlotReadProxy
8802hunk ./src/allmydata/test/test_mutable.py 2465
8803         d.addCallback(lambda data:
8804             self.failUnlessEqual(data, CONTENTS))
8805         return d
8806+
8807+
8808+class FileHandle(unittest.TestCase):
8809+    def setUp(self):
8810+        self.test_data = "Test Data" * 50000
8811+        self.sio = StringIO(self.test_data)
8812+        self.uploadable = MutableFileHandle(self.sio)
8813+
8814+
8815+    def test_filehandle_read(self):
8816+        self.basedir = "mutable/FileHandle/test_filehandle_read"
8817+        chunk_size = 10
8818+        for i in xrange(0, len(self.test_data), chunk_size):
8819+            data = self.uploadable.read(chunk_size)
8820+            data = "".join(data)
8821+            start = i
8822+            end = i + chunk_size
8823+            self.failUnlessEqual(data, self.test_data[start:end])
8824+
8825+
8826+    def test_filehandle_get_size(self):
8827+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
8828+        actual_size = len(self.test_data)
8829+        size = self.uploadable.get_size()
8830+        self.failUnlessEqual(size, actual_size)
8831+
8832+
8833+    def test_filehandle_get_size_out_of_order(self):
8834+        # We should be able to call get_size whenever we want without
8835+        # disturbing the location of the seek pointer.
8836+        chunk_size = 100
8837+        data = self.uploadable.read(chunk_size)
8838+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
8839+
8840+        # Now get the size.
8841+        size = self.uploadable.get_size()
8842+        self.failUnlessEqual(size, len(self.test_data))
8843+
8844+        # Now get more data. We should be right where we left off.
8845+        more_data = self.uploadable.read(chunk_size)
8846+        start = chunk_size
8847+        end = chunk_size * 2
8848+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
8849+
8850+
8851+    def test_filehandle_file(self):
8852+        # Make sure that the MutableFileHandle works on a file as well
8853+        # as a StringIO object, since in some cases it will be asked to
8854+        # deal with files.
8855+        self.basedir = self.mktemp()
8856+        # self.mktemp() only returns a path; it does not create the directory.
8857+        os.mkdir(self.basedir)
8858+        f_path = os.path.join(self.basedir, "test_file")
8859+        f = open(f_path, "w")
8860+        f.write(self.test_data)
8861+        f.close()
8862+        f = open(f_path, "r")
8863+
8864+        uploadable = MutableFileHandle(f)
8865+
8866+        data = uploadable.read(len(self.test_data))
8867+        self.failUnlessEqual("".join(data), self.test_data)
8868+        size = uploadable.get_size()
8869+        self.failUnlessEqual(size, len(self.test_data))
8870+
8871+
8872+    def test_close(self):
8873+        # Make sure that the MutableFileHandle closes its handle when
8874+        # told to do so.
8875+        self.uploadable.close()
8876+        self.failUnless(self.sio.closed)
8877+
8878+
8879+class DataHandle(unittest.TestCase):
8880+    def setUp(self):
8881+        self.test_data = "Test Data" * 50000
8882+        self.uploadable = MutableDataHandle(self.test_data)
8883+
8884+
8885+    def test_datahandle_read(self):
8886+        chunk_size = 10
8887+        for i in xrange(0, len(self.test_data), chunk_size):
8888+            data = self.uploadable.read(chunk_size)
8889+            data = "".join(data)
8890+            start = i
8891+            end = i + chunk_size
8892+            self.failUnlessEqual(data, self.test_data[start:end])
8893+
8894+
8895+    def test_datahandle_get_size(self):
8896+        actual_size = len(self.test_data)
8897+        size = self.uploadable.get_size()
8898+        self.failUnlessEqual(size, actual_size)
8899+
8900+
8901+    def test_datahandle_get_size_out_of_order(self):
8902+        # We should be able to call get_size whenever we want without
8903+        # disturbing the location of the seek pointer.
8904+        chunk_size = 100
8905+        data = self.uploadable.read(chunk_size)
8906+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
8907+
8908+        # Now get the size.
8909+        size = self.uploadable.get_size()
8910+        self.failUnlessEqual(size, len(self.test_data))
8911+
8912+        # Now get more data. We should be right where we left off.
8913+        more_data = self.uploadable.read(chunk_size)
8914+        start = chunk_size
8915+        end = chunk_size * 2
8916+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
8917}
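
Assuming the usual Twisted trial runner for this tree, the two new test
classes can be exercised on their own while iterating on the handles:

    trial allmydata.test.test_mutable.FileHandle allmydata.test.test_mutable.DataHandle
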
8918[Alter tests to work with the new APIs
8919Kevan Carstensen <kevan@isnotajoke.com>**20100708000031
8920 Ignore-this: 1f377904ac61ce40e9a04716fbd2ad95
8921] {
8922hunk ./src/allmydata/test/common.py 12
8923 from allmydata import uri, dirnode, client
8924 from allmydata.introducer.server import IntroducerNode
8925 from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
8926-     FileTooLargeError, NotEnoughSharesError, ICheckable
8927+     FileTooLargeError, NotEnoughSharesError, ICheckable, \
8928+     IMutableUploadable
8929 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
8930      DeepCheckResults, DeepCheckAndRepairResults
8931 from allmydata.mutable.common import CorruptShareError
8932hunk ./src/allmydata/test/common.py 18
8933 from allmydata.mutable.layout import unpack_header
8934+from allmydata.mutable.publish import MutableDataHandle
8935 from allmydata.storage.server import storage_index_to_dir
8936 from allmydata.storage.mutable import MutableShareFile
8937 from allmydata.util import hashutil, log, fileutil, pollmixin
8938hunk ./src/allmydata/test/common.py 182
8939         self.init_from_cap(make_mutable_file_cap())
8940     def create(self, contents, key_generator=None, keysize=None):
8941         initial_contents = self._get_initial_contents(contents)
8942-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
8943+        if initial_contents.get_size() > self.MUTABLE_SIZELIMIT:
8944             raise FileTooLargeError("SDMF is limited to one segment, and "
8945hunk ./src/allmydata/test/common.py 184
8946-                                    "%d > %d" % (len(initial_contents),
8947+                                    "%d > %d" % (initial_contents.get_size(),
8948                                                  self.MUTABLE_SIZELIMIT))
8949hunk ./src/allmydata/test/common.py 186
8950-        self.all_contents[self.storage_index] = initial_contents
8951+        data = initial_contents.read(initial_contents.get_size())
8952+        data = "".join(data)
8953+        self.all_contents[self.storage_index] = data
8954         return defer.succeed(self)
8955     def _get_initial_contents(self, contents):
8956hunk ./src/allmydata/test/common.py 191
8957-        if isinstance(contents, str):
8958-            return contents
8959         if contents is None:
8960hunk ./src/allmydata/test/common.py 192
8961-            return ""
8962+            return MutableDataHandle("")
8963+
8964+        if IMutableUploadable.providedBy(contents):
8965+            return contents
8966+
8967         assert callable(contents), "%s should be callable, not %s" % \
8968                (contents, type(contents))
8969         return contents(self)
8970hunk ./src/allmydata/test/common.py 309
8971         return defer.succeed(self.all_contents[self.storage_index])
8972 
8973     def overwrite(self, new_contents):
8974-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
8975+        if new_contents.get_size() > self.MUTABLE_SIZELIMIT:
8976             raise FileTooLargeError("SDMF is limited to one segment, and "
8977hunk ./src/allmydata/test/common.py 311
8978-                                    "%d > %d" % (len(new_contents),
8979+                                    "%d > %d" % (new_contents.get_size(),
8980                                                  self.MUTABLE_SIZELIMIT))
8981         assert not self.is_readonly()
8982hunk ./src/allmydata/test/common.py 314
8983-        self.all_contents[self.storage_index] = new_contents
8984+        new_data = new_contents.read(new_contents.get_size())
8985+        new_data = "".join(new_data)
8986+        self.all_contents[self.storage_index] = new_data
8987         return defer.succeed(None)
8988     def modify(self, modifier):
8989         # this does not implement FileTooLargeError, but the real one does
8990hunk ./src/allmydata/test/common.py 324
8991     def _modify(self, modifier):
8992         assert not self.is_readonly()
8993         old_contents = self.all_contents[self.storage_index]
8994-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
8995+        new_data = modifier(old_contents, None, True)
8996+        if new_data is not None:
8997+            new_data = new_data.read(new_data.get_size())
8998+            new_data = "".join(new_data)
8999+        self.all_contents[self.storage_index] = new_data
9000         return None
9001 
9002 def make_mutable_file_cap():
9003hunk ./src/allmydata/test/test_checker.py 11
9004 from allmydata.test.no_network import GridTestMixin
9005 from allmydata.immutable.upload import Data
9006 from allmydata.test.common_web import WebRenderingMixin
9007+from allmydata.mutable.publish import MutableDataHandle
9008 
9009 class FakeClient:
9010     def get_storage_broker(self):
9011hunk ./src/allmydata/test/test_checker.py 291
9012         def _stash_immutable(ur):
9013             self.imm = c0.create_node_from_uri(ur.uri)
9014         d.addCallback(_stash_immutable)
9015-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
9016+        d.addCallback(lambda ign:
9017+            c0.create_mutable_file(MutableDataHandle("contents")))
9018         def _stash_mutable(node):
9019             self.mut = node
9020         d.addCallback(_stash_mutable)
9021hunk ./src/allmydata/test/test_cli.py 12
9022 from allmydata.util import fileutil, hashutil, base32
9023 from allmydata import uri
9024 from allmydata.immutable import upload
9025+from allmydata.mutable.publish import MutableDataHandle
9026 from allmydata.dirnode import normalize
9027 
9028 # Test that the scripts can be imported -- although the actual tests of their
9029hunk ./src/allmydata/test/test_cli.py 1983
9030         self.set_up_grid()
9031         c0 = self.g.clients[0]
9032         DATA = "data" * 100
9033-        d = c0.create_mutable_file(DATA)
9034+        DATA_uploadable = MutableDataHandle(DATA)
9035+        d = c0.create_mutable_file(DATA_uploadable)
9036         def _stash_uri(n):
9037             self.uri = n.get_uri()
9038         d.addCallback(_stash_uri)
9039hunk ./src/allmydata/test/test_cli.py 2085
9040                                            upload.Data("literal",
9041                                                         convergence="")))
9042         d.addCallback(_stash_uri, "small")
9043-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
9044+        d.addCallback(lambda ign:
9045+            c0.create_mutable_file(MutableDataHandle(DATA+"1")))
9046         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9047         d.addCallback(_stash_uri, "mutable")
9048 
9049hunk ./src/allmydata/test/test_cli.py 2104
9050         # root/small
9051         # root/mutable
9052 
9053+        # We haven't broken anything yet, so this should all be healthy.
9054         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
9055                                               self.rooturi))
9056         def _check2((rc, out, err)):
9057hunk ./src/allmydata/test/test_cli.py 2119
9058                             in lines, out)
9059         d.addCallback(_check2)
9060 
9061+        # Similarly, all of these results should be as we expect them to
9062+        # be for a healthy file layout.
9063         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
9064         def _check_stats((rc, out, err)):
9065             self.failUnlessReallyEqual(err, "")
9066hunk ./src/allmydata/test/test_cli.py 2136
9067             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
9068         d.addCallback(_check_stats)
9069 
9070+        # Now we break things.
9071         def _clobber_shares(ignored):
9072             shares = self.find_shares(self.uris[u"gööd"])
9073             self.failUnlessReallyEqual(len(shares), 10)
9074hunk ./src/allmydata/test/test_cli.py 2155
9075         d.addCallback(_clobber_shares)
9076 
9077         # root
9078-        # root/gööd  [9 shares]
9079+        # root/gööd  [1 missing share]
9080         # root/small
9081         # root/mutable [1 corrupt share]
9082 
9083hunk ./src/allmydata/test/test_cli.py 2161
9084         d.addCallback(lambda ign:
9085                       self.do_cli("deep-check", "--verbose", self.rooturi))
9086+        # This should reveal the missing share, but not the corrupt
9087+        # share, since we didn't tell the deep check operation to also
9088+        # verify.
9089         def _check3((rc, out, err)):
9090             self.failUnlessReallyEqual(err, "")
9091             self.failUnlessReallyEqual(rc, 0)
9092hunk ./src/allmydata/test/test_cli.py 2212
9093                                   "--verbose", "--verify", "--repair",
9094                                   self.rooturi))
9095         def _check6((rc, out, err)):
9096+            # We've just repaired the directory. There is no reason for
9097+            # that repair to be unsuccessful.
9098             self.failUnlessReallyEqual(err, "")
9099             self.failUnlessReallyEqual(rc, 0)
9100             lines = out.splitlines()
9101hunk ./src/allmydata/test/test_deepcheck.py 9
9102 from twisted.internet import threads # CLI tests use deferToThread
9103 from allmydata.immutable import upload
9104 from allmydata.mutable.common import UnrecoverableFileError
9105+from allmydata.mutable.publish import MutableDataHandle
9106 from allmydata.util import idlib
9107 from allmydata.util import base32
9108 from allmydata.scripts import runner
9109hunk ./src/allmydata/test/test_deepcheck.py 38
9110         self.basedir = "deepcheck/MutableChecker/good"
9111         self.set_up_grid()
9112         CONTENTS = "a little bit of data"
9113-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9114+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9115+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9116         def _created(node):
9117             self.node = node
9118             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9119hunk ./src/allmydata/test/test_deepcheck.py 61
9120         self.basedir = "deepcheck/MutableChecker/corrupt"
9121         self.set_up_grid()
9122         CONTENTS = "a little bit of data"
9123-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9124+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9125+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9126         def _stash_and_corrupt(node):
9127             self.node = node
9128             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9129hunk ./src/allmydata/test/test_deepcheck.py 99
9130         self.basedir = "deepcheck/MutableChecker/delete_share"
9131         self.set_up_grid()
9132         CONTENTS = "a little bit of data"
9133-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9134+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9135+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9136         def _stash_and_delete(node):
9137             self.node = node
9138             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9139hunk ./src/allmydata/test/test_deepcheck.py 223
9140             self.root = n
9141             self.root_uri = n.get_uri()
9142         d.addCallback(_created_root)
9143-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
9144+        d.addCallback(lambda ign:
9145+            c0.create_mutable_file(MutableDataHandle("mutable file contents")))
9146         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
9147         def _created_mutable(n):
9148             self.mutable = n
9149hunk ./src/allmydata/test/test_deepcheck.py 965
9150     def create_mangled(self, ignored, name):
9151         nodetype, mangletype = name.split("-", 1)
9152         if nodetype == "mutable":
9153-            d = self.g.clients[0].create_mutable_file("mutable file contents")
9154+            mutable_uploadable = MutableDataHandle("mutable file contents")
9155+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
9156             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
9157         elif nodetype == "large":
9158             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
9159hunk ./src/allmydata/test/test_dirnode.py 1281
9160     implements(IMutableFileNode)
9161     counter = 0
9162     def __init__(self, initial_contents=""):
9163-        self.data = self._get_initial_contents(initial_contents)
9164+        data = self._get_initial_contents(initial_contents)
9165+        self.data = data.read(data.get_size())
9166+        self.data = "".join(self.data)
9167+
9168         counter = FakeMutableFile.counter
9169         FakeMutableFile.counter += 1
9170         writekey = hashutil.ssk_writekey_hash(str(counter))
9171hunk ./src/allmydata/test/test_dirnode.py 1331
9172         pass
9173 
9174     def modify(self, modifier):
9175-        self.data = modifier(self.data, None, True)
9176+        data = modifier(self.data, None, True)
9177+        self.data = data.read(data.get_size())
9178+        self.data = "".join(self.data)
9179         return defer.succeed(None)
9180 
9181 class FakeNodeMaker(NodeMaker):
9182hunk ./src/allmydata/test/test_hung_server.py 10
9183 from allmydata.util.consumer import download_to_data
9184 from allmydata.immutable import upload
9185 from allmydata.mutable.common import UnrecoverableFileError
9186+from allmydata.mutable.publish import MutableDataHandle
9187 from allmydata.storage.common import storage_index_to_dir
9188 from allmydata.test.no_network import GridTestMixin
9189 from allmydata.test.common import ShouldFailMixin, _corrupt_share_data
9190hunk ./src/allmydata/test/test_hung_server.py 96
9191         self.servers = [(id, ss) for (id, ss) in nm.storage_broker.get_all_servers()]
9192 
9193         if mutable:
9194-            d = nm.create_mutable_file(mutable_plaintext)
9195+            uploadable = MutableDataHandle(mutable_plaintext)
9196+            d = nm.create_mutable_file(uploadable)
9197             def _uploaded_mutable(node):
9198                 self.uri = node.get_uri()
9199                 self.shares = self.find_shares(self.uri)
9200hunk ./src/allmydata/test/test_mutable.py 297
9201             d.addCallback(lambda smap: smap.dump(StringIO()))
9202             d.addCallback(lambda sio:
9203                           self.failUnless("3-of-10" in sio.getvalue()))
9204-            d.addCallback(lambda res: n.overwrite("contents 1"))
9205+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 1")))
9206             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9207             d.addCallback(lambda res: n.download_best_version())
9208             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9209hunk ./src/allmydata/test/test_mutable.py 304
9210             d.addCallback(lambda res: n.get_size_of_best_version())
9211             d.addCallback(lambda size:
9212                           self.failUnlessEqual(size, len("contents 1")))
9213-            d.addCallback(lambda res: n.overwrite("contents 2"))
9214+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9215             d.addCallback(lambda res: n.download_best_version())
9216             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9217             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9218hunk ./src/allmydata/test/test_mutable.py 308
9219-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9220+            d.addCallback(lambda smap: n.upload(MutableDataHandle("contents 3"), smap))
9221             d.addCallback(lambda res: n.download_best_version())
9222             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9223             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9224hunk ./src/allmydata/test/test_mutable.py 320
9225             # mapupdate-to-retrieve data caching (i.e. make the shares larger
9226             # than the default readsize, which is 2000 bytes). A 15kB file
9227             # will have 5kB shares.
9228-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
9229+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("large size file" * 1000)))
9230             d.addCallback(lambda res: n.download_best_version())
9231             d.addCallback(lambda res:
9232                           self.failUnlessEqual(res, "large size file" * 1000))
9233hunk ./src/allmydata/test/test_mutable.py 343
9234             # to make them big enough to force the file to be uploaded
9235             # in more than one segment.
9236             big_contents = "contents1" * 100000 # about 900 KiB
9237+            big_contents_uploadable = MutableDataHandle(big_contents)
9238             d.addCallback(lambda ignored:
9239hunk ./src/allmydata/test/test_mutable.py 345
9240-                n.overwrite(big_contents))
9241+                n.overwrite(big_contents_uploadable))
9242             d.addCallback(lambda ignored:
9243                 n.download_best_version())
9244             d.addCallback(lambda data:
9245hunk ./src/allmydata/test/test_mutable.py 355
9246             # segments, so that we make the downloader deal with
9247             # multiple segments.
9248             bigger_contents = "contents2" * 1000000 # about 9MiB
9249+            bigger_contents_uploadable = MutableDataHandle(bigger_contents)
9250             d.addCallback(lambda ignored:
9251hunk ./src/allmydata/test/test_mutable.py 357
9252-                n.overwrite(bigger_contents))
9253+                n.overwrite(bigger_contents_uploadable))
9254             d.addCallback(lambda ignored:
9255                 n.download_best_version())
9256             d.addCallback(lambda data:
9257hunk ./src/allmydata/test/test_mutable.py 368
9258 
9259 
9260     def test_create_with_initial_contents(self):
9261-        d = self.nodemaker.create_mutable_file("contents 1")
9262+        upload1 = MutableDataHandle("contents 1")
9263+        d = self.nodemaker.create_mutable_file(upload1)
9264         def _created(n):
9265             d = n.download_best_version()
9266             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9267hunk ./src/allmydata/test/test_mutable.py 373
9268-            d.addCallback(lambda res: n.overwrite("contents 2"))
9269+            upload2 = MutableDataHandle("contents 2")
9270+            d.addCallback(lambda res: n.overwrite(upload2))
9271             d.addCallback(lambda res: n.download_best_version())
9272             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9273             return d
9274hunk ./src/allmydata/test/test_mutable.py 380
9275         d.addCallback(_created)
9276         return d
9277+    test_create_with_initial_contents.timeout = 15
9278 
9279 
9280     def test_create_mdmf_with_initial_contents(self):
9281hunk ./src/allmydata/test/test_mutable.py 385
9282         initial_contents = "foobarbaz" * 131072 # 900KiB
9283-        d = self.nodemaker.create_mutable_file(initial_contents,
9284+        initial_contents_uploadable = MutableDataHandle(initial_contents)
9285+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
9286                                                version=MDMF_VERSION)
9287         def _created(n):
9288             d = n.download_best_version()
9289hunk ./src/allmydata/test/test_mutable.py 392
9290             d.addCallback(lambda data:
9291                 self.failUnlessEqual(data, initial_contents))
9292+            uploadable2 = MutableDataHandle(initial_contents + "foobarbaz")
9293             d.addCallback(lambda ignored:
9294hunk ./src/allmydata/test/test_mutable.py 394
9295-                n.overwrite(initial_contents + "foobarbaz"))
9296+                n.overwrite(uploadable2))
9297             d.addCallback(lambda ignored:
9298                 n.download_best_version())
9299             d.addCallback(lambda data:
9300hunk ./src/allmydata/test/test_mutable.py 413
9301             key = n.get_writekey()
9302             self.failUnless(isinstance(key, str), key)
9303             self.failUnlessEqual(len(key), 16) # AES key size
9304-            return data
9305+            return MutableDataHandle(data)
9306         d = self.nodemaker.create_mutable_file(_make_contents)
9307         def _created(n):
9308             return n.download_best_version()
9309hunk ./src/allmydata/test/test_mutable.py 429
9310             key = n.get_writekey()
9311             self.failUnless(isinstance(key, str), key)
9312             self.failUnlessEqual(len(key), 16)
9313-            return data
9314+            return MutableDataHandle(data)
9315         d = self.nodemaker.create_mutable_file(_make_contents,
9316                                                version=MDMF_VERSION)
9317         d.addCallback(lambda n:
9318hunk ./src/allmydata/test/test_mutable.py 441
9319 
9320     def test_create_with_too_large_contents(self):
9321         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9322-        d = self.nodemaker.create_mutable_file(BIG)
9323+        BIG_uploadable = MutableDataHandle(BIG)
9324+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
9325         def _created(n):
9326hunk ./src/allmydata/test/test_mutable.py 444
9327-            d = n.overwrite(BIG)
9328+            other_BIG_uploadable = MutableDataHandle(BIG)
9329+            d = n.overwrite(other_BIG_uploadable)
9330             return d
9331         d.addCallback(_created)
9332         return d
9333hunk ./src/allmydata/test/test_mutable.py 459
9334 
9335     def test_modify(self):
9336         def _modifier(old_contents, servermap, first_time):
9337-            return old_contents + "line2"
9338+            new_contents = old_contents + "line2"
9339+            return MutableDataHandle(new_contents)
9340         def _non_modifier(old_contents, servermap, first_time):
9341hunk ./src/allmydata/test/test_mutable.py 462
9342-            return old_contents
9343+            return MutableDataHandle(old_contents)
9344         def _none_modifier(old_contents, servermap, first_time):
9345             return None
9346         def _error_modifier(old_contents, servermap, first_time):
9347hunk ./src/allmydata/test/test_mutable.py 468
9348             raise ValueError("oops")
9349         def _toobig_modifier(old_contents, servermap, first_time):
9350-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
9351+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9352+            return MutableDataHandle(new_content)
9353         calls = []
9354         def _ucw_error_modifier(old_contents, servermap, first_time):
9355             # simulate an UncoordinatedWriteError once
9356hunk ./src/allmydata/test/test_mutable.py 476
9357             calls.append(1)
9358             if len(calls) <= 1:
9359                 raise UncoordinatedWriteError("simulated")
9360-            return old_contents + "line3"
9361+            new_contents = old_contents + "line3"
9362+            return MutableDataHandle(new_contents)
9363         def _ucw_error_non_modifier(old_contents, servermap, first_time):
9364             # simulate an UncoordinatedWriteError once, and don't actually
9365             # modify the contents on subsequent invocations
9366hunk ./src/allmydata/test/test_mutable.py 484
9367             calls.append(1)
9368             if len(calls) <= 1:
9369                 raise UncoordinatedWriteError("simulated")
9370-            return old_contents
9371+            return MutableDataHandle(old_contents)
9372 
9373hunk ./src/allmydata/test/test_mutable.py 486
9374-        d = self.nodemaker.create_mutable_file("line1")
9375+        initial_contents = "line1"
9376+        d = self.nodemaker.create_mutable_file(MutableDataHandle(initial_contents))
9377         def _created(n):
9378             d = n.modify(_modifier)
9379             d.addCallback(lambda res: n.download_best_version())
9380hunk ./src/allmydata/test/test_mutable.py 548
9381 
9382     def test_modify_backoffer(self):
9383         def _modifier(old_contents, servermap, first_time):
9384-            return old_contents + "line2"
9385+            return MutableDataHandle(old_contents + "line2")
9386         calls = []
9387         def _ucw_error_modifier(old_contents, servermap, first_time):
9388             # simulate an UncoordinatedWriteError once
9389hunk ./src/allmydata/test/test_mutable.py 555
9390             calls.append(1)
9391             if len(calls) <= 1:
9392                 raise UncoordinatedWriteError("simulated")
9393-            return old_contents + "line3"
9394+            return MutableDataHandle(old_contents + "line3")
9395         def _always_ucw_error_modifier(old_contents, servermap, first_time):
9396             raise UncoordinatedWriteError("simulated")
9397         def _backoff_stopper(node, f):
9398hunk ./src/allmydata/test/test_mutable.py 570
9399         giveuper._delay = 0.1
9400         giveuper.factor = 1
9401 
9402-        d = self.nodemaker.create_mutable_file("line1")
9403+        d = self.nodemaker.create_mutable_file(MutableDataHandle("line1"))
9404         def _created(n):
9405             d = n.modify(_modifier)
9406             d.addCallback(lambda res: n.download_best_version())
9407hunk ./src/allmydata/test/test_mutable.py 620
9408             d.addCallback(lambda smap: smap.dump(StringIO()))
9409             d.addCallback(lambda sio:
9410                           self.failUnless("3-of-10" in sio.getvalue()))
9411-            d.addCallback(lambda res: n.overwrite("contents 1"))
9412+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 1")))
9413             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9414             d.addCallback(lambda res: n.download_best_version())
9415             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9416hunk ./src/allmydata/test/test_mutable.py 624
9417-            d.addCallback(lambda res: n.overwrite("contents 2"))
9418+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9419             d.addCallback(lambda res: n.download_best_version())
9420             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9421             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9422hunk ./src/allmydata/test/test_mutable.py 628
9423-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9424+            d.addCallback(lambda smap: n.upload(MutableDataHandle("contents 3"), smap))
9425             d.addCallback(lambda res: n.download_best_version())
9426             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9427             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9428hunk ./src/allmydata/test/test_mutable.py 646
9429         # publish a file and create shares, which can then be manipulated
9430         # later.
9431         self.CONTENTS = "New contents go here" * 1000
9432+        self.uploadable = MutableDataHandle(self.CONTENTS)
9433         self._storage = FakeStorage()
9434         self._nodemaker = make_nodemaker(self._storage)
9435         self._storage_broker = self._nodemaker.storage_broker
9436hunk ./src/allmydata/test/test_mutable.py 650
9437-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
9438+        d = self._nodemaker.create_mutable_file(self.uploadable)
9439         def _created(node):
9440             self._fn = node
9441             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9442hunk ./src/allmydata/test/test_mutable.py 662
9443         # an MDMF file.
9444         # self.CONTENTS should have more than one segment.
9445         self.CONTENTS = "This is an MDMF file" * 100000
9446+        self.uploadable = MutableDataHandle(self.CONTENTS)
9447         self._storage = FakeStorage()
9448         self._nodemaker = make_nodemaker(self._storage)
9449         self._storage_broker = self._nodemaker.storage_broker
9450hunk ./src/allmydata/test/test_mutable.py 666
9451-        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
9452+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
9453         def _created(node):
9454             self._fn = node
9455             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9456hunk ./src/allmydata/test/test_mutable.py 678
9457         # like publish_one, except that the result is guaranteed to be
9458         # an SDMF file
9459         self.CONTENTS = "This is an SDMF file" * 1000
9460+        self.uploadable = MutableDataHandle(self.CONTENTS)
9461         self._storage = FakeStorage()
9462         self._nodemaker = make_nodemaker(self._storage)
9463         self._storage_broker = self._nodemaker.storage_broker
9464hunk ./src/allmydata/test/test_mutable.py 682
9465-        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
9466+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
9467         def _created(node):
9468             self._fn = node
9469             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9470hunk ./src/allmydata/test/test_mutable.py 696
9471                          "Contents 2",
9472                          "Contents 3a",
9473                          "Contents 3b"]
9474+        self.uploadables = [MutableDataHandle(d) for d in self.CONTENTS]
9475         self._copied_shares = {}
9476         self._storage = FakeStorage()
9477         self._nodemaker = make_nodemaker(self._storage)
9478hunk ./src/allmydata/test/test_mutable.py 700
9479-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
9480+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
9481         def _created(node):
9482             self._fn = node
9483             # now create multiple versions of the same file, and accumulate
9484hunk ./src/allmydata/test/test_mutable.py 707
9485             # their shares, so we can mix and match them later.
9486             d = defer.succeed(None)
9487             d.addCallback(self._copy_shares, 0)
9488-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
9489+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
9490             d.addCallback(self._copy_shares, 1)
9491hunk ./src/allmydata/test/test_mutable.py 709
9492-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
9493+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
9494             d.addCallback(self._copy_shares, 2)
9495hunk ./src/allmydata/test/test_mutable.py 711
9496-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
9497+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
9498             d.addCallback(self._copy_shares, 3)
9499             # now we replace all the shares with version s3, and upload a new
9500             # version to get s4b.
9501hunk ./src/allmydata/test/test_mutable.py 717
9502             rollback = dict([(i,2) for i in range(10)])
9503             d.addCallback(lambda res: self._set_versions(rollback))
9504-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
9505+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
9506             d.addCallback(self._copy_shares, 4)
9507             # we leave the storage in state 4
9508             return d
9509hunk ./src/allmydata/test/test_mutable.py 826
9510         # create a new file, which is large enough to knock the privkey out
9511         # of the early part of the file
9512         LARGE = "These are Larger contents" * 200 # about 5KB
9513-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
9514+        LARGE_uploadable = MutableDataHandle(LARGE)
9515+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
9516         def _created(large_fn):
9517             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
9518             return self.make_servermap(MODE_WRITE, large_fn2)
9519hunk ./src/allmydata/test/test_mutable.py 1842
9520 class MultipleEncodings(unittest.TestCase):
9521     def setUp(self):
9522         self.CONTENTS = "New contents go here"
9523+        self.uploadable = MutableDataHandle(self.CONTENTS)
9524         self._storage = FakeStorage()
9525         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
9526         self._storage_broker = self._nodemaker.storage_broker
9527hunk ./src/allmydata/test/test_mutable.py 1846
9528-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
9529+        d = self._nodemaker.create_mutable_file(self.uploadable)
9530         def _created(node):
9531             self._fn = node
9532         d.addCallback(_created)
9533hunk ./src/allmydata/test/test_mutable.py 1872
9534         s = self._storage
9535         s._peers = {} # clear existing storage
9536         p2 = Publish(fn2, self._storage_broker, None)
9537-        d = p2.publish(data)
9538+        uploadable = MutableDataHandle(data)
9539+        d = p2.publish(uploadable)
9540         def _published(res):
9541             shares = s._peers
9542             s._peers = {}
9543hunk ./src/allmydata/test/test_mutable.py 2049
9544         self._set_versions(target)
9545 
9546         def _modify(oldversion, servermap, first_time):
9547-            return oldversion + " modified"
9548+            return MutableDataHandle(oldversion + " modified")
9549         d = self._fn.modify(_modify)
9550         d.addCallback(lambda res: self._fn.download_best_version())
9551         expected = self.CONTENTS[2] + " modified"
9552hunk ./src/allmydata/test/test_mutable.py 2175
9553         self.basedir = "mutable/Problems/test_publish_surprise"
9554         self.set_up_grid()
9555         nm = self.g.clients[0].nodemaker
9556-        d = nm.create_mutable_file("contents 1")
9557+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9558         def _created(n):
9559             d = defer.succeed(None)
9560             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9561hunk ./src/allmydata/test/test_mutable.py 2185
9562             d.addCallback(_got_smap1)
9563             # then modify the file, leaving the old map untouched
9564             d.addCallback(lambda res: log.msg("starting winning write"))
9565-            d.addCallback(lambda res: n.overwrite("contents 2"))
9566+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9567             # now attempt to modify the file with the old servermap. This
9568             # will look just like an uncoordinated write, in which every
9569             # single share got updated between our mapupdate and our publish
9570hunk ./src/allmydata/test/test_mutable.py 2194
9571                           self.shouldFail(UncoordinatedWriteError,
9572                                           "test_publish_surprise", None,
9573                                           n.upload,
9574-                                          "contents 2a", self.old_map))
9575+                                          MutableDataHandle("contents 2a"), self.old_map))
9576             return d
9577         d.addCallback(_created)
9578         return d
9579hunk ./src/allmydata/test/test_mutable.py 2203
9580         self.basedir = "mutable/Problems/test_retrieve_surprise"
9581         self.set_up_grid()
9582         nm = self.g.clients[0].nodemaker
9583-        d = nm.create_mutable_file("contents 1")
9584+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9585         def _created(n):
9586             d = defer.succeed(None)
9587             d.addCallback(lambda res: n.get_servermap(MODE_READ))
9588hunk ./src/allmydata/test/test_mutable.py 2213
9589             d.addCallback(_got_smap1)
9590             # then modify the file, leaving the old map untouched
9591             d.addCallback(lambda res: log.msg("starting winning write"))
9592-            d.addCallback(lambda res: n.overwrite("contents 2"))
9593+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9594             # now attempt to retrieve the old version with the old servermap.
9595             # This will look like someone has changed the file since we
9596             # updated the servermap.
9597hunk ./src/allmydata/test/test_mutable.py 2241
9598         self.basedir = "mutable/Problems/test_unexpected_shares"
9599         self.set_up_grid()
9600         nm = self.g.clients[0].nodemaker
9601-        d = nm.create_mutable_file("contents 1")
9602+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9603         def _created(n):
9604             d = defer.succeed(None)
9605             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9606hunk ./src/allmydata/test/test_mutable.py 2253
9607                 self.g.remove_server(peer0)
9608                 # then modify the file, leaving the old map untouched
9609                 log.msg("starting winning write")
9610-                return n.overwrite("contents 2")
9611+                return n.overwrite(MutableDataHandle("contents 2"))
9612             d.addCallback(_got_smap1)
9613             # now attempt to modify the file with the old servermap. This
9614             # will look just like an uncoordinated write, in which every
9615hunk ./src/allmydata/test/test_mutable.py 2263
9616                           self.shouldFail(UncoordinatedWriteError,
9617                                           "test_surprise", None,
9618                                           n.upload,
9619-                                          "contents 2a", self.old_map))
9620+                                          MutableDataHandle("contents 2a"), self.old_map))
9621             return d
9622         d.addCallback(_created)
9623         return d
9624hunk ./src/allmydata/test/test_mutable.py 2267
9625+    test_unexpected_shares.timeout = 15
9626 
9627     def test_bad_server(self):
9628         # Break one server, then create the file: the initial publish should
9629hunk ./src/allmydata/test/test_mutable.py 2303
9630         d.addCallback(_break_peer0)
9631         # now "create" the file, using the pre-established key, and let the
9632         # initial publish finally happen
9633-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
9634+        d.addCallback(lambda res: nm.create_mutable_file(MutableDataHandle("contents 1")))
9635         # that ought to work
9636         def _got_node(n):
9637             d = n.download_best_version()
9638hunk ./src/allmydata/test/test_mutable.py 2312
9639             def _break_peer1(res):
9640                 self.connection1.broken = True
9641             d.addCallback(_break_peer1)
9642-            d.addCallback(lambda res: n.overwrite("contents 2"))
9643+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9644             # that ought to work too
9645             d.addCallback(lambda res: n.download_best_version())
9646             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9647hunk ./src/allmydata/test/test_mutable.py 2344
9648         peerids = [serverid for (serverid,ss) in sb.get_all_servers()]
9649         self.g.break_server(peerids[0])
9650 
9651-        d = nm.create_mutable_file("contents 1")
9652+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9653         def _created(n):
9654             d = n.download_best_version()
9655             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9656hunk ./src/allmydata/test/test_mutable.py 2352
9657             def _break_second_server(res):
9658                 self.g.break_server(peerids[1])
9659             d.addCallback(_break_second_server)
9660-            d.addCallback(lambda res: n.overwrite("contents 2"))
9661+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9662             # that ought to work too
9663             d.addCallback(lambda res: n.download_best_version())
9664             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9665hunk ./src/allmydata/test/test_mutable.py 2371
9666         d = self.shouldFail(NotEnoughServersError,
9667                             "test_publish_all_servers_bad",
9668                             "Ran out of non-bad servers",
9669-                            nm.create_mutable_file, "contents")
9670+                            nm.create_mutable_file, MutableDataHandle("contents"))
9671         return d
9672 
9673     def test_publish_no_servers(self):
9674hunk ./src/allmydata/test/test_mutable.py 2383
9675         d = self.shouldFail(NotEnoughServersError,
9676                             "test_publish_no_servers",
9677                             "Ran out of non-bad servers",
9678-                            nm.create_mutable_file, "contents")
9679+                            nm.create_mutable_file, MutableDataHandle("contents"))
9680         return d
9681     test_publish_no_servers.timeout = 30
9682 
9683hunk ./src/allmydata/test/test_mutable.py 2401
9684         # we need some contents that are large enough to push the privkey out
9685         # of the early part of the file
9686         LARGE = "These are Larger contents" * 2000 # about 50KB
9687-        d = nm.create_mutable_file(LARGE)
9688+        LARGE_uploadable = MutableDataHandle(LARGE)
9689+        d = nm.create_mutable_file(LARGE_uploadable)
9690         def _created(n):
9691             self.uri = n.get_uri()
9692             self.n2 = nm.create_from_cap(self.uri)
9693hunk ./src/allmydata/test/test_mutable.py 2438
9694         self.set_up_grid(num_servers=20)
9695         nm = self.g.clients[0].nodemaker
9696         LARGE = "These are Larger contents" * 2000 # about 50KiB
9697+        LARGE_uploadable = MutableDataHandle(LARGE)
9698         nm._node_cache = DevNullDictionary() # disable the nodecache
9699 
9700hunk ./src/allmydata/test/test_mutable.py 2441
9701-        d = nm.create_mutable_file(LARGE)
9702+        d = nm.create_mutable_file(LARGE_uploadable)
9703         def _created(n):
9704             self.uri = n.get_uri()
9705             self.n2 = nm.create_from_cap(self.uri)
9706hunk ./src/allmydata/test/test_mutable.py 2464
9707         self.set_up_grid(num_servers=20)
9708         nm = self.g.clients[0].nodemaker
9709         CONTENTS = "contents" * 2000
9710-        d = nm.create_mutable_file(CONTENTS)
9711+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9712+        d = nm.create_mutable_file(CONTENTS_uploadable)
9713         def _created(node):
9714             self._node = node
9715         d.addCallback(_created)
9716hunk ./src/allmydata/test/test_system.py 22
9717 from allmydata.monitor import Monitor
9718 from allmydata.mutable.common import NotWriteableError
9719 from allmydata.mutable import layout as mutable_layout
9720+from allmydata.mutable.publish import MutableDataHandle
9721 from foolscap.api import DeadReferenceError
9722 from twisted.python.failure import Failure
9723 from twisted.web.client import getPage
9724hunk ./src/allmydata/test/test_system.py 460
9725     def test_mutable(self):
9726         self.basedir = "system/SystemTest/test_mutable"
9727         DATA = "initial contents go here."  # 25 bytes % 3 != 0
9728+        DATA_uploadable = MutableDataHandle(DATA)
9729         NEWDATA = "new contents yay"
9730hunk ./src/allmydata/test/test_system.py 462
9731+        NEWDATA_uploadable = MutableDataHandle(NEWDATA)
9732         NEWERDATA = "this is getting old"
9733hunk ./src/allmydata/test/test_system.py 464
9734+        NEWERDATA_uploadable = MutableDataHandle(NEWERDATA)
9735 
9736         d = self.set_up_nodes(use_key_generator=True)
9737 
9738hunk ./src/allmydata/test/test_system.py 471
9739         def _create_mutable(res):
9740             c = self.clients[0]
9741             log.msg("starting create_mutable_file")
9742-            d1 = c.create_mutable_file(DATA)
9743+            d1 = c.create_mutable_file(DATA_uploadable)
9744             def _done(res):
9745                 log.msg("DONE: %s" % (res,))
9746                 self._mutable_node_1 = res
9747hunk ./src/allmydata/test/test_system.py 558
9748             self.failUnlessEqual(res, DATA)
9749             # replace the data
9750             log.msg("starting replace1")
9751-            d1 = newnode.overwrite(NEWDATA)
9752+            d1 = newnode.overwrite(NEWDATA_uploadable)
9753             d1.addCallback(lambda res: newnode.download_best_version())
9754             return d1
9755         d.addCallback(_check_download_3)
9756hunk ./src/allmydata/test/test_system.py 572
9757             newnode2 = self.clients[3].create_node_from_uri(uri)
9758             self._newnode3 = self.clients[3].create_node_from_uri(uri)
9759             log.msg("starting replace2")
9760-            d1 = newnode1.overwrite(NEWERDATA)
9761+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
9762             d1.addCallback(lambda res: newnode2.download_best_version())
9763             return d1
9764         d.addCallback(_check_download_4)
9765hunk ./src/allmydata/test/test_system.py 642
9766         def _check_empty_file(res):
9767             # make sure we can create empty files, this usually screws up the
9768             # segsize math
9769-            d1 = self.clients[2].create_mutable_file("")
9770+            d1 = self.clients[2].create_mutable_file(MutableDataHandle(""))
9771             d1.addCallback(lambda newnode: newnode.download_best_version())
9772             d1.addCallback(lambda res: self.failUnlessEqual("", res))
9773             return d1
9774hunk ./src/allmydata/test/test_system.py 673
9775                                  self.key_generator_svc.key_generator.pool_size + size_delta)
9776 
9777         d.addCallback(check_kg_poolsize, 0)
9778-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
9779+        d.addCallback(lambda junk:
9780+            self.clients[3].create_mutable_file(MutableDataHandle('hello, world')))
9781         d.addCallback(check_kg_poolsize, -1)
9782         d.addCallback(lambda junk: self.clients[3].create_dirnode())
9783         d.addCallback(check_kg_poolsize, -2)
9784hunk ./src/allmydata/test/test_web.py 3166
9785         def _stash_mutable_uri(n, which):
9786             self.uris[which] = n.get_uri()
9787             assert isinstance(self.uris[which], str)
9788-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
9789+        d.addCallback(lambda ign:
9790+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"3")))
9791         d.addCallback(_stash_mutable_uri, "corrupt")
9792         d.addCallback(lambda ign:
9793                       c0.upload(upload.Data("literal", convergence="")))
9794hunk ./src/allmydata/test/test_web.py 3313
9795         def _stash_mutable_uri(n, which):
9796             self.uris[which] = n.get_uri()
9797             assert isinstance(self.uris[which], str)
9798-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
9799+        d.addCallback(lambda ign:
9800+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"3")))
9801         d.addCallback(_stash_mutable_uri, "corrupt")
9802 
9803         def _compute_fileurls(ignored):
9804hunk ./src/allmydata/test/test_web.py 3976
9805         def _stash_mutable_uri(n, which):
9806             self.uris[which] = n.get_uri()
9807             assert isinstance(self.uris[which], str)
9808-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
9809+        d.addCallback(lambda ign:
9810+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"2")))
9811         d.addCallback(_stash_mutable_uri, "mutable")
9812 
9813         def _compute_fileurls(ignored):
9814hunk ./src/allmydata/test/test_web.py 4076
9815                                                         convergence="")))
9816         d.addCallback(_stash_uri, "small")
9817 
9818-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
9819+        d.addCallback(lambda ign:
9820+            c0.create_mutable_file(publish.MutableDataHandle("mutable")))
9821         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9822         d.addCallback(_stash_uri, "mutable")
9823 
9824}
9825[Alter mutable files to use file-like objects for publishing instead of strings.
9826Kevan Carstensen <kevan@isnotajoke.com>**20100708000732
9827 Ignore-this: 8dd07d95386b6d540bc21289f981ebd0
9828] {
9829hunk ./src/allmydata/dirnode.py 11
9830 from allmydata.mutable.common import NotWriteableError
9831 from allmydata.mutable.filenode import MutableFileNode
9832 from allmydata.unknown import UnknownNode, strip_prefix_for_ro
9833+from allmydata.mutable.publish import MutableDataHandle
9834 from allmydata.interfaces import IFilesystemNode, IDirectoryNode, IFileNode, \
9835      IImmutableFileNode, IMutableFileNode, \
9836      ExistingChildError, NoSuchChildError, ICheckable, IDeepCheckable, \
9837hunk ./src/allmydata/dirnode.py 104
9838 
9839         del children[self.name]
9840         new_contents = self.node._pack_contents(children)
9841-        return new_contents
9842+        uploadable = MutableDataHandle(new_contents)
9843+        return uploadable
9844 
9845 
9846 class MetadataSetter:
9847hunk ./src/allmydata/dirnode.py 130
9848 
9849         children[name] = (child, metadata)
9850         new_contents = self.node._pack_contents(children)
9851-        return new_contents
9852+        uploadable = MutableDataHandle(new_contents)
9853+        return uploadable
9854 
9855 
9856 class Adder:
9857hunk ./src/allmydata/dirnode.py 175
9858 
9859             children[name] = (child, metadata)
9860         new_contents = self.node._pack_contents(children)
9861-        return new_contents
9862+        uploadable = MutableDataHandle(new_contents)
9863+        return uploadable
9864 
9865 
9866 def _encrypt_rw_uri(filenode, rw_uri):
9867hunk ./src/allmydata/mutable/filenode.py 7
9868 from zope.interface import implements
9869 from twisted.internet import defer, reactor
9870 from foolscap.api import eventually
9871-from allmydata.interfaces import IMutableFileNode, \
9872-     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
9873+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
9874+                                 NotEnoughSharesError, \
9875+                                 MDMF_VERSION, SDMF_VERSION, IMutableUploadable
9876 from allmydata.util import hashutil, log
9877 from allmydata.util.assertutil import precondition
9878 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
9879hunk ./src/allmydata/mutable/filenode.py 16
9880 from allmydata.monitor import Monitor
9881 from pycryptopp.cipher.aes import AES
9882 
9883-from allmydata.mutable.publish import Publish
9884+from allmydata.mutable.publish import Publish, MutableFileHandle, \
9885+                                      MutableDataHandle
9886 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
9887      ResponseCache, UncoordinatedWriteError
9888 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
9889hunk ./src/allmydata/mutable/filenode.py 133
9890         return self._upload(initial_contents, None)
9891 
9892     def _get_initial_contents(self, contents):
9893-        if isinstance(contents, str):
9894-            return contents
9895         if contents is None:
9896hunk ./src/allmydata/mutable/filenode.py 134
9897-            return ""
9898+            return MutableDataHandle("")
9899+
9900+        if IMutableUploadable.providedBy(contents):
9901+            return contents
9902+
9903         assert callable(contents), "%s should be callable, not %s" % \
9904                (contents, type(contents))
9905         return contents(self)
9906hunk ./src/allmydata/mutable/filenode.py 353
9907     def overwrite(self, new_contents):
9908         return self._do_serialized(self._overwrite, new_contents)
9909     def _overwrite(self, new_contents):
9910+        assert IMutableUploadable.providedBy(new_contents)
9911+
9912         servermap = ServerMap()
9913         d = self._update_servermap(servermap, mode=MODE_WRITE)
9914         d.addCallback(lambda ignored: self._upload(new_contents, servermap))
9915hunk ./src/allmydata/mutable/filenode.py 431
9916                 # recovery when it observes UCWE, we need to do a second
9917                 # publish. See #551 for details. We'll basically loop until
9918                 # we managed an uncontested publish.
9919-                new_contents = old_contents
9920-            precondition(isinstance(new_contents, str),
9921-                         "Modifier function must return a string or None")
9922+                old_uploadable = MutableDataHandle(old_contents)
9923+                new_contents = old_uploadable
9924+            precondition((IMutableUploadable.providedBy(new_contents) or
9925+                          new_contents is None),
9926+                         "Modifier function must return an IMutableUploadable "
9927+                         "or None")
9928             return self._upload(new_contents, servermap)
9929         d.addCallback(_apply)
9930         return d
9931hunk ./src/allmydata/mutable/filenode.py 472
9932         return self._do_serialized(self._upload, new_contents, servermap)
9933     def _upload(self, new_contents, servermap):
9934         assert self._pubkey, "update_servermap must be called before publish"
9935+        assert IMutableUploadable.providedBy(new_contents)
9936+
9937         p = Publish(self, self._storage_broker, servermap)
9938         if self._history:
9939hunk ./src/allmydata/mutable/filenode.py 476
9940-            self._history.notify_publish(p.get_status(), len(new_contents))
9941+            self._history.notify_publish(p.get_status(), new_contents.get_size())
9942         d = p.publish(new_contents)
9943hunk ./src/allmydata/mutable/filenode.py 478
9944-        d.addCallback(self._did_upload, len(new_contents))
9945+        d.addCallback(self._did_upload, new_contents.get_size())
9946         return d
9947     def _did_upload(self, res, size):
9948         self._most_recent_size = size
9949hunk ./src/allmydata/mutable/publish.py 141
9950 
9951         # 0. Setup encoding parameters, encoder, and other such things.
9952         # 1. Encrypt, encode, and publish segments.
9953-        self.data = StringIO(newdata)
9954-        self.datalength = len(newdata)
9955+        assert IMutableUploadable.providedBy(newdata)
9956+
9957+        self.data = newdata
9958+        self.datalength = newdata.get_size()
9959 
9960         self.log("starting publish, datalen is %s" % self.datalength)
9961         self._status.set_size(self.datalength)
9962hunk ./src/allmydata/mutable/publish.py 442
9963 
9964         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
9965         data = self.data.read(segsize)
9966+        # XXX: This is dumb. Why return a list?
9967+        data = "".join(data)
9968 
9969         assert len(data) == segsize
9970 
9971hunk ./src/allmydata/mutable/repairer.py 5
9972 from zope.interface import implements
9973 from twisted.internet import defer
9974 from allmydata.interfaces import IRepairResults, ICheckResults
9975+from allmydata.mutable.publish import MutableDataHandle
9976 
9977 class RepairResults:
9978     implements(IRepairResults)
9979hunk ./src/allmydata/mutable/repairer.py 108
9980             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
9981 
9982         d = self.node.download_version(smap, best_version, fetch_privkey=True)
9983+        d.addCallback(lambda data:
9984+            MutableDataHandle(data))
9985         d.addCallback(self.node.upload, smap)
9986         d.addCallback(self.get_results, smap)
9987         return d
9988hunk ./src/allmydata/nodemaker.py 9
9989 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
9990 from allmydata.immutable.upload import Data
9991 from allmydata.mutable.filenode import MutableFileNode
9992+from allmydata.mutable.publish import MutableDataHandle
9993 from allmydata.dirnode import DirectoryNode, pack_children
9994 from allmydata.unknown import UnknownNode
9995 from allmydata import uri
9996hunk ./src/allmydata/nodemaker.py 111
9997                          "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
9998             node.raise_error()
9999         d = self.create_mutable_file(lambda n:
10000-                                     pack_children(n, initial_children),
10001+                                     MutableDataHandle(
10002+                                        pack_children(n, initial_children)),
10003                                      version)
10004         d.addCallback(self._create_dirnode)
10005         return d
10006hunk ./src/allmydata/web/filenode.py 12
10007 from allmydata.interfaces import ExistingChildError
10008 from allmydata.monitor import Monitor
10009 from allmydata.immutable.upload import FileHandle
10010+from allmydata.mutable.publish import MutableFileHandle
10011 from allmydata.util import log, base32
10012 
10013 from allmydata.web.common import text_plain, WebError, RenderMixin, \
10014hunk ./src/allmydata/web/filenode.py 27
10015         # a new file is being uploaded in our place.
10016         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
10017         if mutable:
10018-            req.content.seek(0)
10019-            data = req.content.read()
10020+            data = MutableFileHandle(req.content)
10021             d = client.create_mutable_file(data)
10022             def _uploaded(newnode):
10023                 d2 = self.parentnode.set_node(self.name, newnode,
10024hunk ./src/allmydata/web/filenode.py 61
10025         d.addCallback(lambda res: childnode.get_uri())
10026         return d
10027 
10028-    def _read_data_from_formpost(self, req):
10029-        # SDMF: files are small, and we can only upload data, so we read
10030-        # the whole file into memory before uploading.
10031-        contents = req.fields["file"]
10032-        contents.file.seek(0)
10033-        data = contents.file.read()
10034-        return data
10035 
10036     def replace_me_with_a_formpost(self, req, client, replace):
10037         # create a new file, maybe mutable, maybe immutable
10038hunk ./src/allmydata/web/filenode.py 66
10039         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
10040 
10041+        # create an immutable file
10042+        contents = req.fields["file"]
10043         if mutable:
10044hunk ./src/allmydata/web/filenode.py 69
10045-            data = self._read_data_from_formpost(req)
10046-            d = client.create_mutable_file(data)
10047+            uploadable = MutableFileHandle(contents.file)
10048+            d = client.create_mutable_file(uploadable)
10049             def _uploaded(newnode):
10050                 d2 = self.parentnode.set_node(self.name, newnode,
10051                                               overwrite=replace)
10052hunk ./src/allmydata/web/filenode.py 78
10053                 return d2
10054             d.addCallback(_uploaded)
10055             return d
10056-        # create an immutable file
10057-        contents = req.fields["file"]
10058+
10059         uploadable = FileHandle(contents.file, convergence=client.convergence)
10060         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
10061         d.addCallback(lambda newnode: newnode.get_uri())
10062hunk ./src/allmydata/web/filenode.py 84
10063         return d
10064 
10065+
10066 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
10067     def __init__(self, client, parentnode, name):
10068         rend.Page.__init__(self)
10069hunk ./src/allmydata/web/filenode.py 278
10070 
10071     def replace_my_contents(self, req):
10072         req.content.seek(0)
10073-        new_contents = req.content.read()
10074+        new_contents = MutableFileHandle(req.content)
10075         d = self.node.overwrite(new_contents)
10076         d.addCallback(lambda res: self.node.get_uri())
10077         return d
10078hunk ./src/allmydata/web/filenode.py 286
10079     def replace_my_contents_with_a_formpost(self, req):
10080         # we have a mutable file. Get the data from the formpost, and replace
10081         # the mutable file's contents with it.
10082-        new_contents = self._read_data_from_formpost(req)
10083+        new_contents = req.fields['file']
10084+        new_contents = MutableFileHandle(new_contents.file)
10085+
10086         d = self.node.overwrite(new_contents)
10087         d.addCallback(lambda res: self.node.get_uri())
10088         return d
10089hunk ./src/allmydata/web/unlinked.py 7
10090 from twisted.internet import defer
10091 from nevow import rend, url, tags as T
10092 from allmydata.immutable.upload import FileHandle
10093+from allmydata.mutable.publish import MutableFileHandle
10094 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
10095      convert_children_json, WebError
10096 from allmydata.web import status
10097hunk ./src/allmydata/web/unlinked.py 23
10098 def PUTUnlinkedSSK(req, client):
10099     # SDMF: files are small, and we can only upload data
10100     req.content.seek(0)
10101-    data = req.content.read()
10102+    data = MutableFileHandle(req.content)
10103     d = client.create_mutable_file(data)
10104     d.addCallback(lambda n: n.get_uri())
10105     return d
10106hunk ./src/allmydata/web/unlinked.py 87
10107     # "POST /uri", to create an unlinked file.
10108     # SDMF: files are small, and we can only upload data
10109     contents = req.fields["file"]
10110-    contents.file.seek(0)
10111-    data = contents.file.read()
10112+    data = MutableFileHandle(contents.file)
10113     d = client.create_mutable_file(data)
10114     d.addCallback(lambda n: n.get_uri())
10115     return d
10116}
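The patch above moves the mutable-file publish path from raw strings to IMutableUploadable objects. A minimal sketch of the resulting calling convention follows, assuming only the MutableDataHandle/MutableFileHandle names and the node methods visible in the hunks above (the client/nodemaker setup is not shown, and this sketch is illustration only, not part of the bundle):

    from allmydata.mutable.publish import MutableDataHandle, MutableFileHandle

    def publish_string(nodemaker, data):
        # In-memory contents are wrapped in MutableDataHandle, which exposes
        # get_size() and read() instead of being passed as a plain string.
        uploadable = MutableDataHandle(data)
        d = nodemaker.create_mutable_file(uploadable)
        return d  # Deferred that fires with the new mutable file node

    def overwrite_from_file(node, open_file):
        # File-like sources (e.g. a web request body) are wrapped in
        # MutableFileHandle rather than being read fully into memory first.
        uploadable = MutableFileHandle(open_file)
        return node.overwrite(uploadable)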
10117
10118Context:
10119
10120[SFTP: don't call .stopProducing on the producer registered with OverwriteableFileConsumer (which breaks with warner's new downloader).
10121david-sarah@jacaranda.org**20100628231926
10122 Ignore-this: 131b7a5787bc85a9a356b5740d9d996f
10123] 
10124[docs/how_to_make_a_tahoe-lafs_release.txt: trivial correction, install.html should now be quickstart.html.
10125david-sarah@jacaranda.org**20100625223929
10126 Ignore-this: 99a5459cac51bd867cc11ad06927ff30
10127] 
10128[setup: in the Makefile, refuse to upload tarballs unless someone has passed the environment variable "BB_BRANCH" with value "trunk"
10129zooko@zooko.com**20100619034928
10130 Ignore-this: 276ddf9b6ad7ec79e27474862e0f7d6
10131] 
10132[trivial: tiny update to in-line comment
10133zooko@zooko.com**20100614045715
10134 Ignore-this: 10851b0ed2abfed542c97749e5d280bc
10135 (I'm actually committing this patch as a test of the new eager-annotation-computation of trac-darcs.)
10136] 
10137[docs: about.html link to home page early on, and be decentralized storage instead of cloud storage this time around
10138zooko@zooko.com**20100619065318
10139 Ignore-this: dc6db03f696e5b6d2848699e754d8053
10140] 
10141[docs: update about.html, especially to have a non-broken link to quickstart.html, and also to comment out the broken links to "for Paranoids" and "for Corporates"
10142zooko@zooko.com**20100619065124
10143 Ignore-this: e292c7f51c337a84ebfeb366fbd24d6c
10144] 
10145[TAG allmydata-tahoe-1.7.0
10146zooko@zooko.com**20100619052631
10147 Ignore-this: d21e27afe6d85e2e3ba6a3292ba2be1
10148] 
10149Patch bundle hash:
10150e4cbef6fb4c558ccd34bb8199aee89a03cc77c4b