#117 closed enhancement (fixed)

webapi for metadata in vdrive

Reported by: warner
Owned by:
Priority: minor
Milestone: eventually
Component: code-frontend-web
Version: 0.5.0
Keywords: metadata
Cc:
Launchpad Bug:

Description

For a backup application (in which the vdrive is used to mirror the filesystem that the user wants backed up), it would be useful to know quickly that a given file was already present in the vdrive at a given location. The DHT layer is responsible for efficiently handling files that are already present in the grid, but that still requires a good bit of encoding work (and convergence must be enabled). The goal here is to avoid getting to the encoding phase at all.

I'm thinking that the dirnode edges could contain some metadata that serves to roughly identify the file (like a quick hash of the file contents, or maybe just timestamp+length), and the backup process could refrain from re-pushing anything that looks like it hasn't changed. The exact criteria used would depend upon how the user wants to make the tradeoff between performance and robustness. If they're feeling casual and don't want to spend a lot of CPU or disk IO, then they set it to compare just timestamp+length. If they're paranoid that they've replaced one file on their disk with a different file of the exact same length (and also set the timestamp back to match), then they can set their node to use a full secure hash for comparison purposes. An intermediate position is possible too: in some circumstances, we could use a fast non-cryptographic hash here (perhaps just a CRC).
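
A rough sketch of the three comparison levels described above, to make the tradeoff concrete. The function names, the mode names, and the metadata keys ("mtime", "size", "crc32", "sha256") are hypothetical; nothing here is an existing Tahoe API.

```python
import hashlib
import os
import zlib


def quick_fingerprint(path, mode="stat"):
    """Return a JSON-friendly fingerprint of the file at `path`.

    mode="stat":   mtime + length only (cheap, never reads the file)
    mode="crc":    adds a zlib CRC-32 of the contents (fast, not secure)
    mode="sha256": adds a SHA-256 hex digest (slowest, secure)
    """
    if mode not in ("stat", "crc", "sha256"):
        raise ValueError("unknown mode: %r" % (mode,))
    st = os.stat(path)
    fp = {"mtime": st.st_mtime, "size": st.st_size}
    if mode == "stat":
        return fp
    crc = 0
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            if mode == "crc":
                crc = zlib.crc32(chunk, crc)
            else:
                sha.update(chunk)
    if mode == "crc":
        fp["crc32"] = crc & 0xFFFFFFFF
    else:
        fp["sha256"] = sha.hexdigest()
    return fp


def looks_unchanged(path, stored, mode="stat"):
    """True if the fingerprint stored on the edge matches the file on disk."""
    return stored == quick_fingerprint(path, mode)
```

A backup tool could call looks_unchanged() with whatever it previously stored on the dirnode edge and skip the upload when it returns True.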

Note that the vdrive as it stands today has no concept of metadata: it is purely a graph of named edges connecting non-terminal dirnodes with terminal filenodes and other dirnodes. All of the usual disk-based filesystem metadata like ctimes and mtimes is left out, as well as even-less-appropriate metadata like "owner" and "permissions". A backup application, serving as a bridge between a disk-based filesystem and the vdrive, will want some of this metadata around (so that the app can restore the metadata as well as the data itself). This metadata doesn't necessarily have to go on the vdrive edges, though; it depends upon how we want to build the backup app.

Also note that a backup application is not obligated to conform to the vdrive layer (although it would be handy for browsing if it did). The app could, for example, upload all files to the mesh, get their URIs, then write out a single big file with one pathname/metadata/URI tuple per line, and upload *that*, and remember its URI. (i.e. bypass the vdrive completely). It could also store the files in the vdrive normally, but stash just the metadata in a separately-uploaded file.
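
The "single big file" approach could look something like the sketch below. The one-JSON-list-per-line format and the function names are assumptions made for illustration, and the URI in the usage note is a placeholder, not a real one.

```python
import json


def write_manifest(entries, out):
    """Write one [pathname, metadata, uri] record per line to `out`."""
    for path, metadata, uri in entries:
        # json.dumps escapes embedded newlines, so each record stays on one line.
        out.write(json.dumps([path, metadata, uri], sort_keys=True) + "\n")


def read_manifest(inp):
    """Read the records back as (pathname, metadata, uri) tuples."""
    return [tuple(json.loads(line)) for line in inp if line.strip()]
```

For example, write_manifest([("docs/a.txt", {"mtime": 1192728899}, "URI:CHK:placeholder")], f) writes a single line; the app would then upload the manifest file itself and remember only its URI.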

Attachments (3)

sha256sum.py (311 bytes) - added by zooko at 2007-08-20T21:11:04Z.
nullsum.py (256 bytes) - added by zooko at 2007-08-20T21:11:21Z.
adler32sum.py (306 bytes) - added by zooko at 2007-08-20T21:11:34Z.


Change History (16)

comment:1 Changed at 2007-08-20T21:10:06Z by zooko

On my MacBook Pro, adler32 (from Python's zlib module) hashes a 120-million-byte file in 0.475 seconds, and sha256 (from Python's hashlib module) hashes it in 1.717 seconds.

adler32 does a 734 million byte file in 2.646 seconds and sha256 does it in 10.166 seconds.

This kind of thing makes me wish we used the Tiger hash. The new implementation of Tiger hash in the crypto++ library is about 2.5 times as fast as sha-256.

http://cryptopp.com/benchmarks.html

That C++/assembly implementation of Tiger hashes 217 MiB/second on a modern desktop, which means that the limiting factors would really be disk speed and Python interpreter. For comparison, the Crypto++ implementation of SHA-256 does 81 MiB/second.

This makes me wonder how much overhead we have just to read in a file in Python. Here, the nullsum test just reads in the data. It processes the 734 million byte file in 2.224 seconds.
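
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of the three tests. It is not the attached scripts (their contents are not reproduced in this ticket); the function names and the 1 MiB chunk size are assumptions.

```python
import hashlib
import time
import zlib

CHUNK = 1 << 20  # read 1 MiB at a time


def _chunks(path):
    """Yield the file's contents in CHUNK-sized pieces."""
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            yield chunk


def null_sum(path):
    """Read the file and discard the data; returns the byte count."""
    return sum(len(chunk) for chunk in _chunks(path))


def adler32_sum(path):
    value = 1  # Adler-32 checksums start from 1
    for chunk in _chunks(path):
        value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF


def sha256_sum(path):
    h = hashlib.sha256()
    for chunk in _chunks(path):
        h.update(chunk)
    return h.hexdigest()


def timed(fn, path):
    """Return (result, elapsed_seconds) for fn(path)."""
    start = time.perf_counter()
    result = fn(path)
    return result, time.perf_counter() - start
```

Comparing timed(null_sum, path) against the two hashing runs separates the cost of reading the file from the cost of the hash itself. Results will depend heavily on whether the file is already in the OS cache.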

Changed at 2007-08-20T21:11:04Z by zooko

sha256sum.py

Changed at 2007-08-20T21:11:21Z by zooko

nullsum.py

Changed at 2007-08-20T21:11:34Z by zooko

adler32sum.py

comment:2 Changed at 2007-10-11T10:53:05Z by warner

I think that leveraging existing backup software will work pretty well if we can just preserve ctime/mtime/length and the usual unix/windows permission fields.

I'm currently thinking that the dirnode structure should be enhanced to record a dictionary of metadata next to the child URI.

comment:3 Changed at 2008-01-05T03:55:26Z by warner

  • Milestone changed from 1.0 to 0.9.0

comment:4 Changed at 2008-01-11T10:26:17Z by warner

We have a proposal for the metadata API floating around somewhere; we should gather some more feedback from Mike and the others and then actually go and implement it.

The dirnode storage layout has a place for metadata already, as a JSON-encoded dictionary. We're just lacking the dirnode methods to modify it and the HTTP interfaces to drive them.

comment:5 Changed at 2008-01-23T02:23:45Z by zooko

  • Milestone changed from 0.9.0 (Allmydata 3.0 final) to 0.8.0 (Allmydata 3.0 Beta)

comment:6 Changed at 2008-01-31T01:47:51Z by warner

Here are the proposals that we threw together back in October.

d. metadata

The set of directories and files in a tahoe virtual drive forms a graph: each
directory is a node, each file is a leaf node, and each name that you use to
look at a child file or directory is an edge.

In Tahoe, all file metadata is kept on those edges. The filenode itself is
just an unadorned sequence of bytes. Everything else that is traditionally
associated with a "file" (filename, type, timestamps, notions like "owners"
and "permissions", etc) is recorded in the parent directory in the record
that points to the file's URI.

Note that this implies that two separate directories which link to the same
filenode will have completely independent metadata for that file.

Application programs can add arbitrary metadata to these edges. We are
establishing conventions for well-known metadata types, but apps are free to
add whatever other data they like. Backup applications which copy data from a
normal disk-based filesystem can use this metadata to record enough
information to recreate the original filesystem (using attributes like owner,
permissions, and timestamps).

This metadata is expressed as a dictionary, which maps ASCII strings to any
data type that can be serialized as JSON, which includes booleans, numbers
(ints and floats), strings (ascii or unicode), lists, and string-keyed
dictionaries. This does not include sets, so the convention is that sets are
expressed as lists with manual duplicate-pruning.

Metadata is retrieved by reading the dirnode using the "GET t=json" method
described below. Metadata is set by providing a dictionary of commands to the
"POST t=change-metadata" method also described below.


e. examining files or directories

  GET $URL?t=json

  This returns machine-parseable information about the indicated file or
  directory in the HTTP response body. The JSON always contains a list of two
  elements, and the first element of the list is always a flag that indicates
  whether the referenced object is a file or a directory. The second element
  is a dictionary of data about the object.

  If it is a file, then the information includes file size and URI, like this:

   [ 'filenode', { 'ro_uri': file_uri,
                   'size': bytes,
                 } ]

  If it is a directory, then it includes information about the children of
  this directory, as a mapping from child name to a set of data about the
  child (the same data that would appear in a corresponding GET?t=json of the
  child itself, plus metadata). Like this:

   [ 'dirnode', { 'rw_uri': read_write_uri,
                  'ro_uri': read_only_uri,
                  'children': children } ]

  In the above example, 'children' is a dictionary in which the keys are
  child names and the values depend upon whether the child is a file or a
  directory:

   'foo.txt': [ 'filenode', { 'ro_uri': uri, 'size': bytes,
                              'ctime': 1192728870.6729021,
                              'tags': ['personal', 'pictures'], } ]
   'subdir':  [ 'dirnode', { 'rw_uri': rwuri, 'ro_uri': rouri } ]

  Note that the value is the same as the JSON representation of the child
  object, except that directories do not recurse (the "children" entry of the
  child is omitted), and the value's second element contains any metadata for
  the child in addition to the URIs.

  The rw_uri field will be present in the information about a directory if
  and only if you have read-write access to that directory.

  So a complete example of the return value for 'GET $URL?t=json' for a
  directory could be:

[
 "dirnode", 
 {
  "rw_uri": "URI:DIR:pb:\/\/xextf3eap44o3wi27mf7ehiur6wvhzr6@207.7.153.180:56677,127.0.0.1:56677\/vdrive:gqu1fub33exw9cu63718yzx6gr", 
  "ro_uri": "URI:DIR-RO:pb:\/\/xextf3eap44o3wi27mf7ehiur6wvhzr6@207.7.153.180:56677,127.0.0.1:56677\/vdrive:3sjp7fwkrsojm9xsymtgkk5khh", 
  "children": {
   "subdir": [
    "dirnode", 
    {
     "rw_uri": "URI:DIR:pb:\/\/xextf3eap44o3wi27mf7ehiur6wvhzr6@207.7.153.180:56677,127.0.0.1:56677\/vdrive:wibsf933cn1y1zowume4uy1ajr", 
     "ro_uri": "URI:DIR-RO:pb:\/\/xextf3eap44o3wi27mf7ehiur6wvhzr6@207.7.153.180:56677,127.0.0.1:56677\/vdrive:97srnzip6muckorjopfe9gk8ae",
     "tags": ["personal"]
    }
   ], 
   "webapi.txt": [
    "filenode", 
    {
     "ro_uri": "URI:CHK:35p7umig6cc3omtn1go4j5nbgw:9ab9ek8b6nagrrk1g4nzepa4kgta9eh434o6shddxx9hfbrrydky:3:10:19653", 
     "size": 19653,
     "tags": ["work", "tahoe"],
     "ctime": 1192728870.6729021,
     "mtime": 1192728899.8800000
    }
   ], 
   "architecture.txt": [
    "filenode", 
    {
     "ro_uri": "URI:CHK:iah546ukk6eqntehth1s3ndeoh:it9o1k5db7mjwjbj3tdd7qjyghf3qjkq7ntjd6ad5e3ufii3uwwo:3:10:35769", 
     "size": 35769,
     "tags": ["work", "tahoe"],
     "ctime": 1192728870.6729021,
     "mtime": 1192728899.8800000
    }
   ]
  }
 }
]
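
A client consuming a response shaped like the example above might do something like the following sketch. It only parses a response body that has already been fetched; the function name is hypothetical, and it assumes the two-element list layout described in this section.

```python
import json


def summarize_children(body):
    """Map each child name to (node type, mtime or None) for a dirnode."""
    nodetype, info = json.loads(body)
    if nodetype != "dirnode":
        raise ValueError("expected a dirnode, got %r" % (nodetype,))
    summary = {}
    for name, (child_type, child_info) in info["children"].items():
        # Metadata sits alongside the URIs in the child's second element.
        summary[name] = (child_type, child_info.get("mtime"))
    return summary
```

Since directories do not recurse in this output, a client walking a whole tree would need to issue a separate GET for each child dirnode.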



k. modifying metadata

  POST $URL?t=change-metadata

  This operation is used to modify the metadata attached to a directory. The
  body of the POST method is a specially-formatted JSON data structure with
  commands to set and remove metadata keys. Because the body is not an
  encoded HTML form, this method is for use by programmatic clients, rather
  than a web browser. At present there is no provision for a web browser to
  modify metadata.

 API PROPOSAL ONE:

  The POST body is a JSON-encoded dictionary, with one member for each edge
  (file or subdirectory) that you want to modify. The key of this member is
  the child name (filename or subdir name), and the value associated with
  that key is a "command dictionary", which describes how you want to modify
  that edge's metadata.

  There are currently two commands defined in the command dictionary: "set"
  and "delete". The value associated with the "set" command is a dictionary
  that maps metadata-key to metadata-value. For each metadata key that
  appears in this dictionary, any existing metadata is replaced with the
  given value.

  The value associated with the "delete" key is a list of metadata key names.
  For each such name, the associated metadata is removed.

  For example, if the client wishes to set ctime and mtime, and remove an
  associated icon, it could use the following:

   {
     "webapi.txt": {"set": {"ctime": 1192728870,
                            "mtime": 1192728899},
                    "delete": ["icon"]
                   }
   }
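
  To make the intended semantics concrete, here is a sketch of how a server
  might apply such a body to its per-edge metadata. The function name is
  hypothetical, and processing "delete" before "set" is an assumption: the
  proposal does not say which wins if a key appears in both.

```python
def apply_command_dict(edges, commands):
    """Apply a PROPOSAL ONE body to `edges` (child name -> metadata dict).

    Raises KeyError if a named child does not exist.
    """
    for child, command in commands.items():
        metadata = edges[child]
        # Assumption: deletes run first, so a key in both ends up set.
        for key in command.get("delete", []):
            metadata.pop(key, None)
        metadata.update(command.get("set", {}))
    return edges
```

  Deleting a key that is not present is treated as a no-op here, which is
  also an assumption rather than something the proposal specifies.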

 API PROPOSAL TWO:

  The POST body is a JSON-encoded list, with one element for each
  modification you wish to make to the metadata. Each element is a "command",
  encoded as a list of length 3 (for "delete" commands) or 4 (for "set"
  commands).

  The "set" command provides the name of the child to be modified, the name
  of the metadata key to be modified, and the new value for that metadata
  key. The new value completely replaces the old one. For example:

    ["set", "webapi.txt", "tags", ["work", "tahoe"]]

  The "delete" command provides the child name and the name of the metadata
  key to be deleted:

    ["delete", "webapi.txt", "icon"]

  If the client wishes to set ctime and mtime, and remove any icon, it could
  use the following as the body of the POST request:

    [
     ["delete", "webapi.txt", "icon"],
     ["set", "webapi.txt", "ctime", 1192728870],
     ["set", "webapi.txt", "mtime", 1192728899]
    ]
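
  A sketch of the corresponding server-side logic for this list form,
  applying the commands in the order given. The function name and the
  error handling are hypothetical.

```python
def apply_command_list(edges, commands):
    """Apply a PROPOSAL TWO body to `edges` (child name -> metadata dict)."""
    for command in commands:
        op = command[0]
        if op == "set":
            _, child, key, value = command  # length-4 command
            edges[child][key] = value
        elif op == "delete":
            _, child, key = command  # length-3 command
            edges[child].pop(key, None)
        else:
            raise ValueError("unknown command: %r" % (op,))
    return edges
```

  Because the list is ordered, a client can express "delete then set" or
  "set then delete" for the same key unambiguously, which the dictionary
  form in proposal one cannot.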

The feedback we've received so far is from Mike:

The metadata seems to be very flexible.  I think that working with proposal
two is slightly easier, but I personally prefer "PROPOSAL ONE" because it's
less verbose, and I like how the metadata is grouped together logically by
command, according to the file node or directory node that it describes.
Plus, "PROPOSAL ONE" more closely resembles the format of the JSON returned
when you examine the file or directory with the GET $URL?t=JSON HTTP
request. 

It might be nice to have a way to set metadata for a file or directory node
in the same request that the node is created in too.

-Mike

comment:7 Changed at 2008-02-04T23:11:24Z by warner

  • Milestone changed from 0.8.0 (Allmydata 3.0 Beta) to 0.9.0 (Allmydata 3.0 final)
  • Summary changed from metadata in vdrive to improve backup-application performance to metadata in vdrive to record timestamps

We have three conceivable uses for this metadata. The first is to record enough information to allow a roundtrip (from a normal unix-ish filesystem, through tahoe, and back again) to restore most of the original file. This means a timestamp, at least, and maybe some mode information (the same sort of information that 'tar' records).

The second is a set of hashes, something application-specific, to allow a custom backup tool to be more efficient.

The third is arbitrary application data, perhaps tags or something.

The most important need right now is for the first use case (timestamps), so I'm retitling this ticket. We'll probably implement the framework for all three at the same time, so I'm not creating a separate ticket for the second two use cases.

comment:8 Changed at 2008-02-08T01:43:56Z by warner

It occurred to me today that a good starting point would just be to automatically add timestamps every time we do DirectoryNode.add_child. Rob pointed out that the algorithm could be:

 metadata['mtime'] = now
 if 'ctime' not in metadata:
  metadata['ctime'] = now

That would probably give us enough information for the FUSE plugins to show meaningful timestamps.
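
Spelled out as a standalone function, the rule above might look like this. The function name and the optional `now` parameter are hypothetical; this is not the actual DirectoryNode.add_child code.

```python
import time


def stamp_metadata(metadata, now=None):
    """Update timestamps on an edge's metadata dict, in place.

    mtime is refreshed on every call; ctime is set only the first time.
    """
    if now is None:
        now = time.time()
    metadata["mtime"] = now
    if "ctime" not in metadata:
        metadata["ctime"] = now
    return metadata
```

Calling it twice on the same dictionary leaves ctime at the first call's time and moves mtime to the second.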

To allow 'cp -r' to look right, we'd still need to provide a way to set this metadata, of course. The tasks are:

  • present metadata in the JSON output
  • implement the webapi set-metadata call

comment:9 Changed at 2008-02-12T02:13:58Z by warner

  • Priority changed from major to minor

I've modified the dirnode methods to add metadata setters on add_node/add_uri, and arranged the defaults to automatically add ctime/mtime keys. I've also added metadata to the ?t=json output (i.e. zooko's "WAPI"), and to the human-oriented HTML web page (zooko's "WUI").

This is enough for front-ends to display timestamps, and might be sufficient for 1.0. In the longer run, we still need:

  • web API setters for metadata, to allow things like 'cp -p' to preserve timestamps from some other filesystem
  • web API setters/modifiers for generalized metadata

comment:10 Changed at 2008-03-08T04:15:46Z by zooko

  • Milestone changed from 0.9.0 (Allmydata 3.0 final) to undecided

comment:11 Changed at 2008-06-01T20:55:11Z by warner

  • Summary changed from metadata in vdrive to record timestamps to webapi for metadata in vdrive

comment:12 Changed at 2009-03-08T22:09:53Z by warner

  • Component changed from code to code-frontend-web
  • Owner somebody deleted

comment:13 Changed at 2010-04-13T00:59:03Z by davidsarah

  • Keywords metadata added
  • Resolution set to fixed
  • Status changed from new to closed

This was fixed long ago.
