#771 closed defect (fixed)

tahoe ls doesn't work on files

Reported by: kevan Owned by: nobody
Priority: major Milestone: 1.6.0
Component: code-frontend-cli Version: 1.4.1
Keywords: tahoe-ls Cc:
Launchpad Bug:

Description

If I have an immutable file file2 at testdir/file2 on my remote filesystem, and do tahoe ls testdir/file2, I get

Traceback (most recent call last):
  File "/Users/kacarstensen/Documents/code/tahoe/support/bin/tahoe", line 8, in <module>
    load_entry_point('allmydata-tahoe==1.4.1-r4026', 'console_scripts', 'tahoe')()
  File "/Users/kacarstensen/Documents/code/tahoe/src/allmydata/scripts/runner.py", line 91, in run
    rc = runner(sys.argv[1:])
  File "/Users/kacarstensen/Documents/code/tahoe/src/allmydata/scripts/runner.py", line 78, in runner
    rc = cli.dispatch[command](so)
  File "/Users/kacarstensen/Documents/code/tahoe/src/allmydata/scripts/cli.py", line 412, in list
    rc = tahoe_ls.list(options)
  File "/Users/kacarstensen/Documents/code/tahoe/src/allmydata/scripts/tahoe_ls.py", line 70, in list
    childtype = child[0]
KeyError: 0

If I do tahoe ls --json testdir/file2, I get

[
 "filenode", 
 {
  "mutable": false, 
  "ro_uri": "URI:LIT:a_ro_cap", 
  "size": 13
 }
]

If I do tahoe ls --json testdir/, I get

[
 "dirnode", 
 {
  "rw_uri": "URI:DIR2:rw_cap", 
  "verify_uri": "URI:DIR2-Verifier:verify_cap", 
  "ro_uri": "URI:DIR2-RO:ro_cap", 
  "children": {
   "dir2": [
    "dirnode", 
    {
     "mutable": true, 
     "verify_uri": "URI:DIR2-Verifier:verify_cap", 
     "rw_uri": "URI:DIR2:rw_cap", 
     "ro_uri": "URI:DIR2-RO:ro_cap", 
     "metadata": {
      "ctime": 1248055314.2892411, 
      "tahoe": {
       "linkmotime": 1248055314.2892411, 
       "linkcrtime": 1248055314.2892411
      }, 
      "mtime": 1248055314.2892411
     }
    }
   ], 
   "test.pdf": [
    "filenode", 
    {
     "mutable": false, 
     "verify_uri": "URI:CHK-Verifier:verify_cap", 
     "metadata": {
      "ctime": 1246685054.166369, 
      "tahoe": {
       "linkmotime": 1246685054.166369, 
       "linkcrtime": 1246685054.166369
      }, 
      "mtime": 1246685054.166369
     }, 
     "ro_uri": "URI:CHK:ro_cap", 
     "size": 700145
    }
   ], 
   "file2": [
    "filenode", 
    {
     "mutable": false, 
     "metadata": {
      "ctime": 1247192345.969193, 
      "tahoe": {
       "linkmotime": 1247192345.969193, 
       "linkcrtime": 1247192345.969193
      }, 
      "mtime": 1247192345.969193
     }, 
     "ro_uri": "URI:LIT:ro_cap", 
     "size": 13
    }
   ]
  }, 
  "mutable": true
 }
]

If I do tahoe ls testdir/, it works as normal.

I think the problem is that we haven't correctly implemented parsing for GETing a filecap. Instead of doing, as in source:/src/allmydata/scripts/tahoe_ls.py#L55,

    children = {childname: d}

thus losing the child[0] item that the loop is looking for, we should do something like

    children = {childname: [nodetype, d]}

Unfortunately, this too fails because the loop wants to find metadata that isn't there.

I think the best solution here would be (one way or another) to follow source:/docs/frontends/webapi.txt#L402 -- that is, to GET the file's information using the containing directory's dircap, rather than the filecap of the file. That way, we have the metadata that we want for, e.g., tahoe ls -l. I'm just kind of stuck on the best way to do that.

The most workable idea I've come up with is to check the results from the first GET, and, if they're for a filenode without metadata, attempt to re-fetch them by using get_alias on the parent directory (which we could infer by parsing the where option. This seems kind of roundabout, though -- it seems like it'd be easier to determine, before the initial GET, whether the rootcap returned from get_alias corresponded to a directory or a file, and do that check there. I'm just not sure if there's an easy way to do that, hence this long-winded explanation. If there's a more obvious way, I'm open to that, too.

I'm attaching a patch with a test that demonstrates this bug.

Attachments (3)

ls_test.txt (31.4 KB) - added by kevan at 2009-07-24T05:49:25Z.
fixes.txt (3.8 KB) - added by kevan at 2009-08-08T02:00:24Z.
tests.txt (3.4 KB) - added by kevan at 2009-08-12T01:56:13Z.

Download all attachments as: .zip

Change History (16)

Changed at 2009-07-24T05:49:25Z by kevan

comment:1 Changed at 2009-08-02T20:33:25Z by kevan

Hm. If I look at the cap for the test case above, it is actually a dircap, i.e., if I do tahoe ls testdir/file2, the GET request sent to the server looks like

http://127.0.0.1:3456/uri/URI%3ADIR2%3Adircap/testdir/file2?t=json

which contains a read-write dircap. By the webapi docs (http://allmydata.org/trac/tahoe/browser/docs/frontends/webapi.txt#L398), the results of this GET should include metadata, but they don't. This seems like a bug -- is there any place in particular that I might look to find it?

comment:2 Changed at 2009-08-07T23:51:41Z by zooko

  • Keywords review added

Adding keyword "review" because Kevan needs somebody to look at this. Kevan: is ls_test.txt ready for inclusion, you think? (by marking failure as TODO)

comment:3 Changed at 2009-08-08T01:59:51Z by kevan

I think I've solved the issue.

When you make a GET request with a dircap and a file-terminated path after it, the childFactory method (first of URIHandler in source:src/allmydata/web/root.py, then of DirectoryNodeHandler in source:src/allmydata/web/directory.py) will traverse directories in the path until it finds the file that you're looking for: it then instantiates a FileNodeHandler with the found node, parentnode, name, and some other stuff. FileNodeHandler doesn't seem to make a distinction between being instantiated in this way (with a parent directory node from which it can get metadata) and being instantiated without one (which I assume means that it is the result of a GET request on the specific filecap), and just returns JSON without the metadata.

To fix this, I altered FileJSONMetadata to take parentnode and name as optional parameters. If they're set, FileJSONMetadata will get the file's metadata from the parent node, and then return it along with the other metadata. This makes the webapi behave as expected (and as specified). I also modified source:src/allmydata/scripts/tahoe_ls.py to provide a consistent children dictionary, regardless of whether or not the listed object is a filenode or a dirnode; since it uses the rootcap to make requests, we don't need to make any more modifications to it. I've also written (revised) tests for the list and webapi behavior that I've modified.

zooko: No, if you're including any tests, you should include the one from tests.txt.

Changed at 2009-08-08T02:00:24Z by kevan

comment:4 Changed at 2009-08-08T14:29:05Z by zooko

  • Owner changed from nobody to warner

Brian: could you review this?

comment:5 Changed at 2009-08-08T23:55:10Z by kevan

The one part of the tests that I think could use some improvements is the check for metadata. Rather than check to see if the metadata returned it correct for bar.txt, I simply check to make sure that it exists. If there's an easy way of figuring out what the metadata for bar.txt should be (either by setting it to something, or something else), and then checking for that, we might want to revise that predicate to actually check the metadata.

Changed at 2009-08-12T01:56:13Z by kevan

comment:6 Changed at 2009-08-12T01:57:00Z by kevan

I'm attaching reworked tests that I'm happy with. I chose not to check mtime because, from reading the description of metadata in webapi.txt, I get the idea that the other tests in test_web.py may cause mtime to differ from the recorded value, while ctime should probably stay the same. Let me know if this is off-base.

comment:7 Changed at 2009-08-22T19:43:22Z by warner

  • Component changed from unknown to code-frontend-cli
  • Status changed from new to assigned

comment:8 Changed at 2009-10-25T18:11:42Z by zooko

  • Owner changed from warner to nobody
  • Status changed from assigned to new

comment:9 Changed at 2009-11-20T19:23:03Z by warner

#457 was a duplicate of this one, and has been closed.

comment:10 Changed at 2009-12-06T17:11:47Z by davidsarah

  • Keywords tahoe-ls added

comment:11 Changed at 2009-12-21T02:53:12Z by davidsarah

  • Keywords review-needed added; review removed

comment:12 Changed at 2009-12-27T23:28:40Z by warner

  • Milestone changed from undecided to 1.6.0
  • Resolution set to fixed
  • Status changed from new to closed

Good work! I've applied your patch (with some modifications) in a8a768ef9dbd2ffa. I changed the FileNodeHandler interface to accept metadata rather than parentnode+childname, in the spirit of POLA. The tests were great! I added a few more in 00d0ca3902b87cbe, in particular to make sure that 'tahoe ls FILECAP' works.

I'm still thinking about changing it further, because the current form will cause two separate directory reads (one to find the child file object, a second to fetch its edge-metadata), and really we should be doing just one. But that will take some larger changes to the whole DirectoryNodeHandler architecture, and probably a new dirnode API (to grab both child object and edge metadata at the same time).

But for now, this ticket is closed.

comment:13 Changed at 2010-01-29T20:51:53Z by davidsarah

  • Keywords review-needed removed
Note: See TracTickets for help on using tickets.