#239 closed defect (fixed)

document new logging system

Reported by: zooko Owned by: warner
Priority: major Milestone: 1.3.0
Component: code-nodeadmin Version: 0.7.0
Keywords: Cc:
Launchpad Bug:

Description (last modified by warner)

There is a nice new logging tool in foolscap 0.2.2 which we use in tahoe 0.7.0.

Unfortunately the way to use it currently requires a couple of manual steps:

  1. pass FLOGFILE=flog.pickle.bz2 and FLOGTWISTED=1 environment variables to the tahoe process
  1. find the resulting flog.pickle.bz2 file (in your ~/.tahoe directory)
  1. run flogtool dump flog.pickle.bz2 on it to get a text output

3.b. alternately, run flogtool web-viewer flog.pickle.bz2 on it to get a nice HTML view with indentation and coloration

We should probably spend some time trying to automate or at least document this process so that people who use tahoe v0.7.0 will have good logging.

Change History (17)

comment:1 Changed at 2008-01-01T22:44:14Z by zooko

Actually I think the current logging system already works. I need to double-check this.

comment:2 Changed at 2008-01-05T05:59:51Z by warner

  • Description modified (diff)

updated to use FLOGFILE= and to use .bz2-compressed event pickles.

The intention for production nodes is to use the "log gatherer" feature to automatically send all log events to the central gatherer.

Eventually, my plan is to have one logfile to which major events are written, and then have minor events only get written (to the pickle) when things go wrong. But we're only halfway there, and I fear that we've got the wrong half done so far, resulting in fewer log messages recorded rather than too many.

comment:3 Changed at 2008-01-05T06:02:04Z by warner

  • Description modified (diff)

oops, update description again.

Also note that the FLOGFILE= and FLOGTWISTED= environment variables are mainly intended to get foolscap log events recorded during unit tests. But they're still useful for actual application runs until I make the same functionality available to the tahoe startup code. At that point, the node will initialize foolscap logging itself, rather than requiring this environment-variable end-run.

comment:4 Changed at 2008-01-05T20:33:48Z by zooko

  • Owner changed from zooko to warner

So, what's the plan for logging for v0.7.0? I'm not sure, but I think that what we should do is change bin/tahoe to set those environment variables on by default.

comment:5 Changed at 2008-01-06T07:25:04Z by zooko

Running the full unit tests with FLOGFILE=flog.pickle.bz2 and FLOGTWISTED=1 takes 2m25s and seems to run my Macbook Pro's fan a lot, where running the tests without takes only 1m25s. The resulting flog.pickle.bz2 is 2 MB. So I can see why you don't want to enable this by default.

I will document something about this somewhere -- I think in the soon-to-be-created doc/using.html.

comment:6 Changed at 2008-01-08T17:06:10Z by zooko

  • Milestone changed from 0.7.0 to 0.7.1

comment:7 Changed at 2008-01-23T02:40:48Z by zooko

  • Milestone changed from 0.7.1 to 0.8.0 (Allmydata 3.0 Beta)

comment:8 Changed at 2008-02-28T12:07:42Z by zooko

  • Summary changed from automate new logging system to document new logging system

Write a document explaining how log gathering works and how to interpret the resulting logs. My current open question is what, exactly, determines the indentation in the web-view mode. I think it is that indented things are "caused by" their outdented parent, and that for all things caused by the same parent they are listed in time order. I'm not entirely sure where the time stamps come from -- the node whose nodeid appears at the beginning of the line, or the log gatherer, or other.

And of course the meaning of "caused by" could use some explanation. :-)

comment:9 Changed at 2008-02-28T14:01:43Z by zooko

My next question is: How does one find the "nodeid" field when using the web view? I found these lines in the web view, and I would like to know what node said that:

  • 17:52:16 0.3160 [26284]: AssistedUploader starting
    • 17:52:16 0.3160 [26286]: contacting helper..
    • 17:52:16 0.4360 [26287]: helper says we need to upload
    • 17:52:17 0.0570 [26293]: upload finished, building readcap

comment:10 Changed at 2008-03-08T02:35:00Z by zooko

  • Milestone changed from 0.8.0 (Allmydata 3.0 Beta) to 0.9.0 (Allmydata 3.0 final)

comment:11 Changed at 2008-03-08T07:28:33Z by warner

The answer to that last question is that the nodeid wasn't displayed because of an oversighithe log call. It has since been fixed.

The answer to the first question (which of course needs to go into the foolscap docs) is that there is a rough parent/child relationship defined between log events that is controlled by the parent= argument to the log.msg() call. The intention is to group related messages together, so that the steps of some process do not obscure the start of the process. Some day, I want the viewing tools to have little expander triangles, and for the subordinate messages to be hidden until you want to drill down.

Within a group, I think we're sorting by origin-tub timestamp, but I could be wrong: the code records both an origin timestamp and a receipt-by-gatherer timestamp. Since groups are always coming from the same tub (at least until we start doing Causeway-style remote ordering), it never actually makes a difference.

comment:12 Changed at 2008-03-12T20:14:11Z by zooko

  • Milestone changed from 0.9.0 (Allmydata 3.0 final) to 0.10.0

Oh well -- this isn't really a blocker for v0.9.0 release. It would just be nice to have sooner or later.

comment:13 Changed at 2008-05-29T22:19:24Z by warner

  • Milestone changed from 1.1.0 to 1.2.0

comment:14 Changed at 2008-08-30T12:51:09Z by zooko

I recently tried to use the foolscap logging system to figure out why one of allmydata.com's servers had failed while Brian was on vacation. It was extremely frustrating, and although I was more or less able to do it, I rather feel that I "got lucky" and I do not feel confident that I could do it again if Brian went on vacation again.

Brian has made a few improvements to the foolscap logging system after that incident, but as far as I know there is still no introductory document: "How To Use Foolscap Logging", or "How To Use Tahoe Logging".

It could be that such a document would mostly just be a pointer into other existing documentation of which I am currently unaware.

To close this ticket, write such a document for Tahoe users.

comment:15 Changed at 2008-09-03T12:17:54Z by zooko

  • Milestone changed from 1.4.0 to 1.3.0

Maybe this should go into 1.3.0.

comment:16 Changed at 2008-09-04T00:29:27Z by warner

af07b9cad6c41255 adds this document, in docs/logging.txt . Some further improvements to be made:

  • move some of the text into foolscap itself
    • add a detailed 'flogtool' reference to foolscap
  • either here or in foolscap, address the specific "what does this output mean" questions that Zooko raised above

I'm not sure if these improvements are important enough to make this 1.3.0 material, or if we should rename this ticket (to make it more specifically about these remaining improvements) and push it out past 1.3.0 .

Zooko, please read over the new document and let me know if it addresses your goals.

comment:17 Changed at 2009-02-07T19:43:38Z by zooko

  • Resolution set to fixed
  • Status changed from new to closed

It is good!

Note: See TracTickets for help on using tickets.