#637 closed defect (fixed)

support "keep this much disk space free" on Windows as well as other platforms

Reported by: zooko Owned by: zooko
Priority: major Milestone: 1.6.0
Component: code-storage Version: 1.3.0
Keywords: win32 easy reviewed news-done Cc: tahoe-dev@…, kevan@…
Launchpad Bug:

Description (last modified by zooko)

This patch 2e45619844cb7c36 makes it so that if we don't know how to find out the free disk space on a platform, we fall back gracefully by display something like "I dunno" instead of the free disk space. We also already have code (before that patch) which falls back gracefully in the sense of emitting a warning at startup if the reserved_space feature has been configured (see [src:doc/configuration.txt]) and the current platform can't figure out how much free disk space it has.

However, rather than having such fallbacks, I would really rather just ensure that Tahoe figures out how much free disk space it has on all supported platforms. For Windows (the only platform which currently does not work using statvfs), the way to do that is to require pywin32, import win32api, and call win32api.GetDiskFreeSpaceEx()

http://docs.activestate.com/activepython/2.5/pywin32/win32api__GetDiskFreeSpaceEx_meth.html http://msdn.microsoft.com/en-us/library/aa364937(VS.85).aspx

Careful, don't call win32api.GetDiskFreeSpace() -- that's an old function that predates disks with more than 232 bytes: http://msdn.microsoft.com/en-us/library/aa364935.aspx

Attachments (3)

ticket637-untested.darcspatch.txt (34.5 KB) - added by davidsarah at 2009-10-26T02:08:21Z.
improved patch for ticket 637
ticket637-tested.darcspatch.txt (13.0 KB) - added by davidsarah at 2009-11-06T07:27:45Z.
Tested fix for #637 and #814
ticket637.darcspatch.txt (34.4 KB) - added by davidsarah at 2009-11-21T06:02:13Z.
storage server: detect disk space usage on Windows too (fixes #637)

Download all attachments as: .zip

Change History (43)

comment:1 Changed at 2009-04-01T14:21:41Z by zooko

  • Milestone changed from 1.4.0 to 1.4.1
  • Owner set to zooko
  • Status changed from new to assigned

comment:2 Changed at 2009-10-25T17:39:07Z by zooko

  • Description modified (diff)
  • Keywords easy added
  • Milestone changed from 1.6.0 to eventually

Any Windows hackers want to fix this? It really bothers me to see Windows getting second-class-citizen treatment, but I don't have convenient access to a Windows box right now to work on, and I'm prioritizing testing, packaging, and release-management for the imminent Tahoe-LAFS v1.6 release.

It really should be pretty easy. In fact, I'm marking it with the "easy" tag.

comment:3 Changed at 2009-10-25T17:40:41Z by zooko

  • Description modified (diff)

comment:4 Changed at 2009-10-25T22:28:23Z by davidsarah

  • Milestone changed from eventually to 1.6.0
  • Owner changed from zooko to davidsarah
  • Status changed from assigned to new

I'll do this.

comment:5 Changed at 2009-10-25T22:37:36Z by davidsarah

  • Status changed from new to assigned

Hmm. It is possible that the call to get disk stats might fail, for some other reason than the API not being available. What should happen in that case?

Is it OK to delete redundant or now-misnamed helper methods like do_statvfs and stat_disk in storage/server.py? (I'll keep get_available_space.)

comment:6 follow-ups: Changed at 2009-10-25T23:09:54Z by zooko

Hm, if the call to get disk stats were to fail, what should we do? I think I would suggest logging an error message and returning that there is zero disk space free.

It is okay (indeed encouraged) to refactor code to be cleaner whenever you make a change. (Part of the reason that this is okay is that each change is expected to have full test coverage.)

comment:7 in reply to: ↑ 6 Changed at 2009-10-26T00:54:03Z by davidsarah

In _auto_deps.py:

## The following block is commented-out because there is not currently a pywin32 package which
## can be easy_install'ed and also which actually makes "import win32api" succeed.  Users have
## to manually install pywin32 on Windows before installing Tahoe.
##import platform
##if platform.system() == "Windows":
##    # Twisted requires pywin32 if it is going to offer process management functionality, or if
##    # it is going to offer iocp reactor.  We currently require process management.  It would be
##    # better if Twisted would declare that it requires pywin32 if it is going to offer process
##    # management.  Then the specification and the evolution of Twisted's reliance on pywin32 can
##    # be confined to the Twisted setup data, and Tahoe can remain blissfully ignorant about such
##    # things as if a future version of Twisted requires a different version of pywin32, or if a
##    # future version of Twisted implements process management without using pywin32 at all,
##    # etc..  That is twisted ticket #3238 -- http://twistedmatrix.com/trac/ticket/3238 .  But
##    # until Twisted does that, Tahoe needs to be non-ignorant of the following requirement:
##    install_requires.append('pywin32')

Is this still a problem? Will things break if it is uncommented?

comment:8 Changed at 2009-10-26T01:02:23Z by davidsarah

The easy_install bug against pywin32: http://sourceforge.net/tracker/index.php?func=detail&aid=1799934&group_id=78018&atid=551954 isn't fixed. So I'll leave this as it is; the Windows build instructions at http://allmydata.org/trac/tahoe/wiki/InstallDetails already say to manually install pywin32.

comment:9 Changed at 2009-10-26T01:06:34Z by davidsarah

Gack, I'm suffering from random Vista brokenness (I think it is this issue: http://cygwin.com/ml/cygwin/2009-04/msg00252.html ) that's stopping me from testing this patch. Might be a couple of days before I can get round to it.

comment:10 Changed at 2009-10-26T01:28:09Z by zooko

Sigh. Waitaminute, why do you need cygwin to test your patch?

You are right to leave the pywin32 requirement commented-out for now.

comment:11 in reply to: ↑ 6 Changed at 2009-10-26T02:07:38Z by davidsarah

Replying to zooko:

Hm, if the call to get disk stats were to fail, what should we do? I think I would suggest logging an error message and returning that there is zero disk space free.

Done, and added a unit test for this.

Waitaminute, why do you need cygwin to test your patch?

Python depends on cygwin. Actually I'm not getting the same error any more, even though I all I did was build in a different bash prompt. (Doesn't it worry you when computers work nondeterministically?)

Changed at 2009-10-26T02:08:21Z by davidsarah

improved patch for ticket 637

comment:12 Changed at 2009-10-26T02:15:10Z by davidsarah

Ah, I have two different Python installations (cygwin Python 2.5 and native-Windows Python 2.6). That explains it.

comment:13 Changed at 2009-10-26T20:07:01Z by zooko

Okay so the current status is that we're waiting for unit tests, right?

comment:14 Changed at 2009-10-27T02:33:35Z by zooko

If you like this ticket you may also like #814 (v1.4.1 storage servers sending a negative number for maximum-immutable-share-size?).

comment:15 Changed at 2009-10-29T18:19:46Z by zooko

David-Sarah, are you planning to write a unit test for this?

comment:16 Changed at 2009-10-29T22:00:20Z by davidsarah

Yes. It would just be a simple sanity check that get_disk_stats doesn't fail and returns values greater than zero on all platforms.

I've been a bit swamped and haven't had time to debug the current patch yet; currently it is causing other storage server tests to fail. Should get round to it later today.

Changed at 2009-11-06T07:27:45Z by davidsarah

Tested fix for #637 and #814

comment:17 Changed at 2009-11-06T07:29:38Z by davidsarah

Sorry that took so long. The new patch is only tested on Windows (native); it needs review and testing on Unix and cygwin.

comment:18 Changed at 2009-11-07T04:06:33Z by zooko

  • Keywords review added

comment:19 Changed at 2009-11-08T03:34:37Z by davidsarah

  • Owner davidsarah deleted
  • Status changed from assigned to new

comment:20 Changed at 2009-11-18T06:19:54Z by kevan

I'll take a stab at reviewing this (I can at least test it on OS X). ticket637-tested.darcspatch.txt looks like the output of darcs diff, though -- darcs apply barfs on it, at least. Would you mind updating that to be something from darcs send?

comment:21 Changed at 2009-11-18T16:46:08Z by zooko

Here are instructions for how to generate patches: http://allmydata.org/trac/tahoe/wiki/Patches . Also if the patch submitter doesn't provide a darcs send-style patch but just a diff-style patch, you can always just apply it with patch. That loses some metadata such as the original patch author's name, and it doesn't work well for patches which mv files, rename tokens or otherwise do something fancy, but if it works it works.

Kevan: if you're planning to review this maybe "accept" the ticket so that it is assigned to you.

comment:22 Changed at 2009-11-18T17:28:03Z by kevan

  • Owner set to kevan
  • Status changed from new to assigned

comment:23 Changed at 2009-11-20T21:28:40Z by kevan

The tests work okay on my Mac.

I'm not sure how closely we adhere to PEP-8 (the CodingStandards page mentions it, but lots of code in tahoe-lafs ignores it), but a number of places in the patch violate it. In particular,

  • The stuff in _auto_deps.py
  • Line 45 of src/allmydata/storage/server.py
  • The docstring starting on line 160 of src/allmydata/storage/server.py.
  • The docstring starting on line 243 of src/allmydata/storage/server.py.

On lines 42 and 46 of src/allmydata/storage/server.py, you have Windows line endings instead of unix line endings. I'm not sure how big a deal this is, but most of the other code seems to use unix line endings.

I don't understand why we treat a failure in the win32api system call the same way as not having an API to determine disk usage. In the former case, we make the server read-only and report that it has no free space. In the latter case, we make the server writable and report that it has an arbitrary amount of free space. To me, these errors seem like two different expressions of the same problem -- that the server can't figure out how much free space there is. Why not treat them the same?

Or, put another way: If the user specifies a reserved space requirement, and we can't honor it (for whatever reason), we should make the server read only (or at least that seems more right to me).

(this isn't so much a critique of the patch -- indeed, this is existing behavior -- as a question that the patch makes me want to ask. Perhaps I should open another ticket for it?)

In any case, expressions of this issue are on lines 225 and 249 of src/allmydata/storage/server.py.

I like the tests -- I don't see any problems with them.

comment:24 Changed at 2009-11-20T21:29:20Z by kevan

  • Cc kevan@… added
  • Owner kevan deleted
  • Status changed from assigned to new

comment:25 Changed at 2009-11-21T00:18:53Z by davidsarah

For _auto_deps.py and the docstrings in server.py, I just followed the conventions already used in those files.

The line endings should indeed be fixed.

import win32api, win32con also violates PEP 8, although I can't actually see the reason for that style constraint.

Re: error handling -- I assume you mean, why don't we treat a failure in the system call in the same way as not having an API? The reason is that not having an API is not an indication that there is anything wrong; just that the platform has a limitation that means we can't implement space checking.

If the platform does have the API, OTOH, then we're supposed to be enforcing space checking, and an admin may be expecting that we do. So a system call failure should be treated as an error in that case, as suggested by Zooko in comment:6.

comment:26 Changed at 2009-11-21T02:44:50Z by kevan

That's right -- why aren't they treated the same. Sorry for the confusion.

I think I have a reply for your comment on error handling, but it is off topic -- your patch is consistent with the existing behavior of not enforcing space reservations when an API isn't available, and attempting to strongly enforce them on platforms that have an API. If I can articulate that into a good argument, I'll post it in another ticket.

In summary: with fixed windows line endings, and if the PEP8 leeway is okay, then this patch looks okay.

comment:27 Changed at 2009-11-21T04:30:39Z by zooko

  • Owner set to kevan

Thanks for the code review, Kevan! What about _auto_deps.py violates PEP-8?

I updated CodingStandards to explicitly reject PEP-8's rule about importing multiple modules on a single import line and PEP-257's rule about closing multi-line docstrings with \n\n"""\n.

David-Sarah: The patch description currently says:


q
ticket637

I.e., the first line is a blank line, the second line is 'q', and the third line is 'ticket637'. Please change it (by running darcs amend) to something like:

storage server: detect disk space usage on Windows too (fixes #637)

Then re-attach the patch with the new description. Actually if you prefer I can make that change when I commit your patch.

comment:28 Changed at 2009-11-21T04:38:34Z by kevan

From http://www.python.org/dev/peps/pep-0008/:

Maximum Line Length

Limit all lines to a maximum of 79 characters.

and

For flowing long blocks of text (docstrings or comments), limiting the length to 72 characters is recommended.

comment:29 Changed at 2009-11-21T05:05:49Z by zooko

Okay I've updated CodingStandards to add this pearl of wisdom:

  • Ignore the part of PEP-8 which specifes 79- or 72- char line widths. Use whatever line-widths look best to you in your editor on your screen. Please don't change other people's line-widths unnecessarily, making it look better on your display and worse on theirs. That's not nice. (Note: unless you are Brian. Brian can do anything he wants!)

Therefore David-Sarah's original patch is now in compliance with our coding standards. I'll amend it to have a nice patch description and I'll commit it.

comment:30 Changed at 2009-11-21T05:06:04Z by zooko

  • Keywords reviewed added; review removed
  • Owner changed from kevan to zooko

comment:31 Changed at 2009-11-21T05:13:19Z by davidsarah

We're already way over 79 characters for many lines in the existing source.

Actually I think (although this is probably the wrong place to have the discussion) that the 79-character limit is completely obsolete for modern screens and programming languages. Consider lines such as

stats['storage_server.disk_free_for_nonroot'] = disk['free_for_nonroot']

or

storagedir = os.path.join(self.sharedir, storage_index_to_dir(storage_index))

Wrapping them would just make them less readable, not more.

OTOH, 100 characters is a reasonable limit, so I'll try to reformat the patch to satisfy that.

Apologies for the darcs patch description; darcs dumped me into an editor that was using the wrong code page, so I got a screenful of mojibake and had to type the description blind.

comment:32 Changed at 2009-11-21T05:18:39Z by kevan

Okay, cool. I don't really care either way about the line widths; it was intended as an informational item more than anything else. :)

Changed at 2009-11-21T06:02:13Z by davidsarah

storage server: detect disk space usage on Windows too (fixes #637)

comment:33 Changed at 2009-11-21T06:03:01Z by davidsarah

Line widths fixed, like it or not :-) Also the line endings and patch description.

comment:34 follow-up: Changed at 2009-11-28T21:55:28Z by warner

It looks like the clause at line 229 (in get_stats) will put a negative number into storage_server.disk_avail, while I think it should be clamped at zero. Likewise, get_available_space and thus remote_get_version will see negative numbers. If #814 is supposed to be subsumed by this ticket, we'll need to fix that.

Maybe get_disk_stats should clamp the avail value at zero before returning anything?

comment:35 in reply to: ↑ 34 ; follow-up: Changed at 2009-11-29T00:47:39Z by davidsarah

Replying to warner:

Maybe get_disk_stats should clamp the avail value at zero before returning anything?

It already does; see server.py line 205.

Also see test_disk_stats in test_storage.py, line 481, although that obviously only tests the result of one call.

comment:36 in reply to: ↑ 35 Changed at 2009-11-29T00:58:03Z by davidsarah

Replying to davidsarah:

Also see test_disk_stats in test_storage.py, line 481, although that obviously only tests the result of one call.

Actually this doesn't test the clamp on avail because it uses reserved_space=0. But test_disk_stats_avail_nonnegative just after it does test that avail is clamped to zero.

comment:37 follow-up: Changed at 2009-11-30T20:05:01Z by warner

  • Resolution set to fixed
  • Status changed from new to closed

Doh, I don't know how I missed that.. must have been sleepy.

Excellent patch! Committed, in ef002c935a15eb76. The only change I made (in a second patch) was to the comment about readonly_storage causing avail<=0. The original was correct: it always causes avail=0.

thanks!

comment:38 in reply to: ↑ 37 Changed at 2009-11-30T23:45:15Z by davidsarah

Replying to warner:

The only change I made (in a second patch) was to the comment about readonly_storage causing avail<=0. The original was correct: it always causes avail=0.

You mean server.py line 326? It was remaining_space <= 0, and my change was correct. remaining_space is >= 0 or None at line 309, but notice line 315, remaining_space -= self.allocated_size().

comment:39 Changed at 2009-12-01T02:53:32Z by warner

Gahh.. you're absolutely right. I've reverted to your version. And I'm going to stop trying to fix other people's fixes now.

(actually, I recently learned a term for these inevitable bugs in correcting-someone-else patches: Muphry's Law)

comment:40 Changed at 2010-02-02T05:51:37Z by davidsarah

  • Keywords news-done added
Note: See TracTickets for help on using tickets.