#597 new enhancement

add 'tahoe mirror' command, use backupdb

Reported by: warner Owned by:
Priority: major Milestone: undecided
Component: code-frontend-cli Version: 1.2.0
Keywords: usability backup Cc:
Launchpad Bug:

Description (last modified by warner)

It would be nice to have a CLI tool which does a minimum-effort copy, from local disk into a Tahoe directory. This tool should behave like "rsync -a --delete": when done, the target directory should look exactly like the local disk's directory. By running the tool on a periodic basis, the Tahoe directory will contain a single most-recent-version backup of the local disk.

tahoe mirror ~/music tahoe:my-music

To make this as fast as possible, the default mode will use two assumptions:

  • the target Tahoe directory has not been changed by other parties since the last invocation of 'tahoe mirror'
  • any file changes will modify either their timestamp or filesize

Both assumptions can be disabled with argv flags, at the expense of doing more work. If both assumptions are accepted, then a null backup should require no network traffic and no file reads (only directory reads).
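
The second assumption reduces change detection to a stat comparison. Here is a minimal sketch, assuming a hypothetical `FileRecord` that holds what was observed at the previous run. The names are illustrative and not taken from Tahoe's code.

```python
import os
from typing import NamedTuple, Optional


class FileRecord(NamedTuple):
    """Hypothetical record of what was seen at the last upload."""
    mtime_ns: int
    size: int


def needs_upload(path: str, previous: Optional[FileRecord]) -> bool:
    """Return True if the file must be read and uploaded again.

    Relies on the assumption that any change to a file alters its
    timestamp or its size. Only os.stat() is called; the file
    contents are never read.
    """
    if previous is None:
        return True  # never seen before, so it has to be uploaded
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size) != (previous.mtime_ns, previous.size)
```

Disabling the assumption would mean hashing or re-reading each file instead of trusting the stat result, which is where the extra work comes from.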

This command would use the 'backupdb', stored in ~/.tahoe/private/backupdb, as discussed in this thread: http://allmydata.org/pipermail/tahoe-dev/2008-May/000620.html . The backupdb would allow the client to quickly determine which files and directories have already been copied.

The proposed "tahoe backup" command (#598) would use the same backupdb.
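
As a rough illustration of what such a backupdb could look like, the sketch below assumes an SQLite file with a single hypothetical `local_files` table keyed by path. The ticket does not specify the storage format, table layout, or function names, so all of them are assumptions.

```python
import sqlite3
from typing import Optional

# Hypothetical schema: one row per local file that has been uploaded.
SCHEMA = """
CREATE TABLE IF NOT EXISTS local_files (
    path  TEXT PRIMARY KEY,
    mtime INTEGER NOT NULL,
    size  INTEGER NOT NULL,
    cap   TEXT NOT NULL
)
"""


def open_backupdb(dbfile: str) -> sqlite3.Connection:
    """Open (and if necessary create) the database file."""
    conn = sqlite3.connect(dbfile)
    conn.execute(SCHEMA)
    return conn


def lookup_cap(conn: sqlite3.Connection, path: str,
               mtime: int, size: int) -> Optional[str]:
    """Return the stored cap if path is unchanged since its last upload."""
    row = conn.execute(
        "SELECT cap FROM local_files WHERE path=? AND mtime=? AND size=?",
        (path, mtime, size),
    ).fetchone()
    return row[0] if row else None


def record_upload(conn: sqlite3.Connection, path: str,
                  mtime: int, size: int, cap: str) -> None:
    """Remember the cap produced by uploading this version of path."""
    conn.execute(
        "INSERT OR REPLACE INTO local_files (path, mtime, size, cap) "
        "VALUES (?, ?, ?, ?)",
        (path, mtime, size, cap),
    )
    conn.commit()
```

A hit in `lookup_cap` lets the client skip both the file read and the upload; a miss means the file is new or has changed and must be uploaded, then recorded.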

Attachments (1)

597-tahoesync-stub.diff (2.9 KB) - added by warner at 2009-01-29T20:07:19Z.
starting point, just a stub of the CLI command


Change History (7)

comment:1 Changed at 2009-01-29T20:05:59Z by warner

so, it looks like #598 ("tahoe backup": versioned shared backups) is more interesting right now, so I'll be working on it instead of "tahoe sync". To checkpoint my work so far: here's my pseudocode, and the stub of the CLI code.

create target directory

loop(localdir, target tahoe dir):
 fetch targetdir
 delete any:
  children that don't exist locally
  files that should be dirs
  dirs that should be files
 create missing dirs
 for each file in localdir:
  chk = upload(file) # uses backupdb to short-circuit
  if (chk, metadata) == targetdir[child]:
   continue
  else:
   set_child(child, chk+metadata)
 for each subdir in localdir:
  loop(localdir/subdir, targetdir/subdir)

assuming upload() uses a backupdb successfully, a null backup with this
algorithm will read all targetdirs but will not upload or modify anything.
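
For illustration only, the pseudocode above can be sketched in Python against an in-memory stand-in for the Tahoe directory. The nested-dict model, the `mirror` name, and the `upload` callable are all assumptions made for this sketch, not Tahoe APIs. Unlike the pseudocode, it handles files and subdirectories in a single pass.

```python
import os
from typing import Callable, Dict, Union

# Stand-in for a Tahoe directory: each name maps to either a nested dict
# (a subdirectory) or a string (a file cap). Not Tahoe's real data model.
Node = Union[str, Dict[str, "Node"]]


def mirror(localdir: str, target: Dict[str, Node],
           upload: Callable[[str], str]) -> int:
    """Make `target` match `localdir`; return the number of changes made."""
    changes = 0
    names = set(os.listdir(localdir))

    # Delete children that no longer exist locally or have the wrong type.
    for name in list(target):
        path = os.path.join(localdir, name)
        remote_is_dir = isinstance(target[name], dict)
        if name not in names or os.path.isdir(path) != remote_is_dir:
            del target[name]
            changes += 1

    for name in sorted(names):
        path = os.path.join(localdir, name)
        if os.path.isdir(path):
            if name not in target:
                target[name] = {}  # create missing dir
                changes += 1
            changes += mirror(path, target[name], upload)
        else:
            # In the real tool, upload() would consult the backupdb
            # and skip the transfer for unchanged files.
            cap = upload(path)
            if target.get(name) != cap:
                target[name] = cap
                changes += 1
    return changes
```

Running `mirror` a second time on an unchanged tree should return 0, which corresponds to the null backup described above: every target directory is read, but nothing is modified.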

Changed at 2009-01-29T20:07:19Z by warner

starting point, just a stub of the CLI command

comment:2 follow-up: Changed at 2009-01-31T02:04:58Z by warner

after some discussion, we decided that "tahoe mirror" was a better name for this than "tahoe sync". "sync" implies bidirectionality, whereas "mirror" has a definite real-world side and looking-glass-world side.

So this ticket is about "tahoe mirror", which does whatever is necessary to make a target directory look like a source directory, without modifying the source directory. The "tahoe sync" idea (which makes the directories look the same, but is allowed to modify *both* directories) has been moved to #601.

comment:3 Changed at 2009-01-31T02:24:48Z by warner

  • Description modified (diff)
  • Summary changed from add 'tahoe sync' command, use backupdb to add 'tahoe mirror' command, use backupdb

oops, forgot to modify the ticket description

comment:4 in reply to: ↑ 2 Changed at 2009-03-11T02:14:08Z by stockrt

Warner, isn't this ticket about the functionality already provided by 'tahoe backup' #598?

Wouldn't it be good to close this one?

comment:5 Changed at 2009-03-12T21:10:29Z by warner

stockrt: nope, "tahoe backup" is defined to create successive timestamped snapshots, whereas "tahoe mirror" is defined to create/modify a single snapshot.

After you've used "tahoe backup ... alias:Backups" daily for a few days, you'll have:

After you've used "tahoe mirror ... alias:Backups" daily for a few days (or a month, or just once), you'll have:

  • Backups/...

If you used "tahoe backup ... alias:Backups" and then ignored the Backups/Archives/? directory, you'd get the same thing as you'd get with "tahoe mirror ... alias:Backups/Latest". But someone who wants just the latest copy would be annoyed 1) by the old archives piling up and 2) by the extra "Latest/" subdirectory that they didn't ask for. That's why it seems like a separate command would be useful (though not as useful as "tahoe backup", which is why "tahoe mirror" got de-prioritized).

comment:6 Changed at 2010-03-25T02:02:58Z by davidsarah

  • Keywords usability backup added