#729 closed defect (fixed)

Tahoe backup should WARN and go on when finding errors like: links to deleted files or access/read permission denied in local files/directories

Reported by: stockrt Owned by: francois
Priority: major Milestone: 1.6.0
Component: code-frontend-cli Version: 1.4.1
Keywords: reviewed tahoe-backup symlink permissions reliability news-done Cc: stockrt@…
Launchpad Bug:

Description (last modified by warner)

I am facing this problem when backing up my files:

stockrt@host ~/ $ tahoe backup -v stockrt tahoe:backups/stockrt

Traceback (most recent call last):
  File "/usr/bin/tahoe", line 8, in <module>
    load_entry_point('allmydata-tahoe==1.4.1', 'console_scripts', 'tahoe')()
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/runner.py", line 91, in run
    rc = runner(sys.argv[1:])
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/runner.py", line 78, in runner
    rc = cli.dispatch[command](so)
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/cli.py", line 424, in backup
    rc = tahoe_backup.backup(options)
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 374, in backup
    return bu.run()
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 216, in run
    new_backup_dircap = self.process(options.from_dir, latest_backup_dircap)
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 266, in process
    newchilddircap = self.process(childpath, oldchildcap)
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 266, in process
    newchilddircap = self.process(childpath, oldchildcap)
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 266, in process
    newchilddircap = self.process(childpath, oldchildcap)
  File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 272, in process
    raise BackupProcessingError("Cannot backup this file %r" % childpath)
allmydata.scripts.tahoe_backup.BackupProcessingError: Cannot backup this file 'stockrt/Roger/etc/namedb'

stockrt@host ~/ $ ll stockrt/Roger/etc/namedb
lrwxrwxrwx 1 stockrt stockrt 21 2009-01-29 23:00 stockrt/Roger/etc/namedb -> /var/named/etc/namedb

This destination does not exists in my local filesystem: /var/named/etc/namedb but I have this (broken) link in my local "to backup" directory: stockrt/Roger/etc

I think Tahoe should warn it but do not break just because of it.

The same should occur for read/access permission denied for files and directories.

Regards,

Rogério Schneider

Attachments (3)

tahoe-backup-tolerate-errors.dpatch (38.4 KB) - added by francois at 2010-01-10T01:57:39Z.
tahoe-backup-tolerate-errors-v2.dpatch (46.3 KB) - added by francois at 2010-01-19T22:58:46Z.
tahoe-backup-tolerate-errors-v3.dpatch (50.7 KB) - added by francois at 2010-01-20T09:49:17Z.

Download all attachments as: .zip

Change History (21)

comment:1 Changed at 2009-07-11T11:32:25Z by warner

  • Component changed from unknown to code-frontend-cli
  • Description modified (diff)

reformatted traceback

comment:2 Changed at 2009-12-07T02:40:02Z by davidsarah

  • Keywords tahoe-backup added; backup file directory removed
  • Priority changed from trivial to major

comment:3 Changed at 2009-12-07T02:40:56Z by davidsarah

  • Keywords reliability added

comment:4 Changed at 2009-12-21T00:28:45Z by davidsarah

  • Keywords symlink permissions added; link permission denied not found removed

The failure for dangling symlinks is #641. (This ticket is not quite a duplicate because it also asks to tolerate other failures.)

comment:5 Changed at 2010-01-10T01:35:31Z by francois

  • Owner changed from nobody to francois
  • Status changed from new to assigned

comment:6 Changed at 2010-01-10T02:03:14Z by francois

  • Keywords review-needed added

This patch renders tahoe backup more tolerant to special or unreadable files. A warning is displayed on stderr for each file which was skipped.

I think it's a good first step which allows tahoe backup to work out of the box for most users before it can store all those Unix specialties (see ticket #641).

comment:7 Changed at 2010-01-10T02:09:25Z by francois

  • Milestone changed from undecided to 1.6.0

comment:8 Changed at 2010-01-10T06:51:57Z by davidsarah

+    def test_ignore_symlinks(self):
+        if not hasattr(os, 'symlink'):
+            raise unittest.SkipTest("There is no symlink on this platform.")

Vista has symbolic links (NTFS junction points), but Windows Python probably doesn't have os.symlink yet. There is a mklink command that could be used to test behaviour with junction points. I don't think that the lack of a test on Windows should hold up this fix, though.

comment:9 Changed at 2010-01-10T06:58:04Z by davidsarah

Wouldn't it be better to catch the exception caused by an access error, rather than testing using os.access etc?

comment:10 Changed at 2010-01-10T08:13:02Z by warner

I agree, although I'd also like to avoid even trying to open some things. It would be pretty unfortunate to discover a copy of /dev/zero or /dev/urandom lying in your backup source tree.

Our discussions on #833 (dealing with illegal mutable children inside an immutable directory) led us to want "tahoe cp" to warn-and-skip "bad" things found on the tahoe side, and we figured that the same policy would be appropriate for "bad" things found on the local disk side (specifically symlinks and device files). I think it'd be reasonable to have "tahoe backup" do the same.

If so, the way I'd do it would be to have both "tahoe backup" and "tahoe cp" test the potential local source filesystem object with os.path.isfile and os.path.isdir before opening or entering it. These tests should rule out special files and symlinks (whether dangling or not). Skipping symlinks to directories is more important than skipping symlinks to files, but I think it'd be ok to skip both unless/until we come up with some reasonable way to record these sorts of things in tahoe directly.

If these tests pass, then we try to open the thing, and if that raises EnvironmentError (as would be caused by permission problems), we should emit a warning and skip over it. It might be nice to count how many such warnings we emit and dump a summary at the end, so that folks using --verbose never get to see that something went wrong. Also, we should consider having the "tahoe backup" exit code be non-zero if anything was skipped due to errors or non-normal-fileness.

comment:11 Changed at 2010-01-19T22:58:26Z by francois

Thanks Brian and David-Sarah for the comments, here is a newer version of the patch which takes your comments into account. What do you think now?

comment:12 follow-up: Changed at 2010-01-20T00:50:54Z by davidsarah

Looks good, just some nitpicks:

  • why except OSError: when listing a directory, but except IOError: when processing a file?
  • typo in the comment for upload: "when call" should be "when called"

comment:13 in reply to: ↑ 12 Changed at 2010-01-20T08:43:41Z by francois

Replying to davidsarah:

  • why except OSError: when listing a directory, but except IOError: when processing a file?

It seems to be the way the Python stdlib works. But as Brian mentioned, I should probably catch the parent exception EnvironmentError instead in both cases.

  • typo in the comment for upload: "when call" should be "when called"

Thanks.

I'll send an updated patch.

comment:14 Changed at 2010-01-24T21:23:18Z by francois

Brian, Zooko, do you feel comfortable commiting this patch before 1.6?

comment:15 Changed at 2010-01-24T21:37:53Z by davidsarah

  • Keywords reviewed added; review-needed removed

Reviewed, can't see any problems. Tests look sufficient.

comment:16 Changed at 2010-01-26T15:32:50Z by zooko

  • Resolution set to fixed
  • Status changed from assigned to closed

applied in b03406af9d216278. Thank you very much!

comment:17 Changed at 2010-01-27T20:13:28Z by warner

I kinda think we should also be skipping symlinks in general (not just dangling ones). The os.path.isdir and os.path.isfile tests will return True for symlinks that point to directories and files. So if we want to skip symlinks, those tests need to turn into if os.path.isdir() and not os.path.islink(), etc.

comment:18 Changed at 2010-02-02T06:03:51Z by davidsarah

  • Keywords news-done added
Note: See TracTickets for help on using tickets.