Opened at 2010-11-24T00:00:11Z
#1269 new enhancement
add tcpdump data to viz tool
Reported by: | warner | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | code-encoding | Version: | 1.8.0 |
Keywords: | performance | Cc: | |
Launchpad Bug: | | | |
Description
As mentioned in comment:92:ticket:1170, in the context of the not-yet-landed #1200 viz tool:
These visualization tools are a lot of fun. One direction to explore is to record some packet timings (with tcpdump) and add it as an extra row: that would show us how much latency/load Foolscap is spending before it delivers a message response to the application.
The idea would be to start a tcpdump process just before starting a download, then run a tool over the output to extract just the relevant packets. (Actually, you'd want a tool that starts by asking the tahoe client for a list of its connections, to get the port numbers, then runs tcpdump itself with the right filter arguments.) You'd store some condensed form of the output (maybe a pickled list of timestamps) in a directory where web/status.py could find it. Then status.py would serve the packet timestamps in the same JSON bundle as the other download events (in particular the tx/rx of data-block requests), and they would be shown on the same chart as the application-level requests.
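A minimal sketch of what that wrapper might look like, not the eventual tool: it assumes tcpdump is on $PATH (and that we have the privileges to run it), and that we've already asked the tahoe client for the relevant port numbers. The function name, port list, and output path are invented for illustration.

```python
import pickle
import subprocess

def capture_packet_timestamps(ports, outfile, max_packets=10000):
    # Build a filter that matches only the storage-server connections.
    filt = " or ".join("tcp port %d" % p for p in ports)
    # -tt: print epoch timestamps, -l: line-buffer stdout, -n: skip DNS
    # lookups, -c: stop after max_packets. Usually needs root/CAP_NET_RAW.
    proc = subprocess.Popen(
        ["tcpdump", "-tt", "-l", "-n", "-c", str(max_packets), filt],
        stdout=subprocess.PIPE, universal_newlines=True)
    timestamps = []
    for line in proc.stdout:
        # Each line starts with e.g. "1290556811.123456 IP ...": keep the float.
        try:
            timestamps.append(float(line.split()[0]))
        except (IndexError, ValueError):
            continue
    proc.wait()
    # The "condensed form": a pickled list of floats, dropped somewhere
    # web/status.py can find it.
    with open(outfile, "wb") as f:
        pickle.dump(timestamps, f)
```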
(Another thought is to have the tcpdump process publish its data over HTTP, and put a box on the viz page to paste in the URL of that process, so the page can fetch the data itself. This requires a browser that allows CORS, but that dates back to Firefox 3.5, and IE8 supports it via the nonstandard XDomainRequest.)
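For that variant, the capture process only needs to attach one extra response header when serving the data. A minimal sketch using the Python 3 stdlib (the port number is arbitrary, and the module names are illustrative; the ticket itself predates Python 3):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class TimestampHandler(BaseHTTPRequestHandler):
    timestamps = []  # filled in by the capture loop

    def do_GET(self):
        body = json.dumps(self.timestamps).encode("utf-8")
        self.send_response(200)
        # This header is what lets the viz page fetch us cross-origin.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("", 8765), TimestampHandler).serve_forever()
```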
The goal would be to eyeball how much overhead is coming from Foolscap and the network layer. Even though the data inside the SSL connections would be opaque to tcpdump, all we really care about is the timing. It should also be possible to see how multiple small messages get combined into a single packet (Nagle's algorithm), and maybe how a small message gets stalled behind other large messages (head-of-line blocking). Contention between parallel requests to multiple servers might also show up here.
It would be great to be able to do this on the server side as well, and get a sense of how the delay is divided between the outbound network trip, the server's internal processing, and the return network trip. Of course, this assumes synchronized clocks, but perhaps the tcpdump-running tool could exchange a couple of packets with timestamps before the download starts (a sort of cheap stripped-down NTP) and apply the offset to the resulting packet trace.
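The exchange could be as simple as one NTP-style round trip. A sketch of the offset arithmetic, with exchange() standing in for whatever hypothetical send/receive the tool would actually do against the server:

```python
import time

def estimate_offset(exchange):
    """exchange(t0) sends our timestamp to the peer and returns the peer's
    (receive_time, transmit_time) pair, like a single NTP round trip."""
    t0 = time.time()              # client transmit time
    t1, t2 = exchange(t0)         # server receive / server transmit times
    t3 = time.time()              # client receive time
    # Standard NTP offset estimate: positive means the server clock is ahead.
    # Subtract this from the server-side packet trace before merging.
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    rtt = (t3 - t0) - (t2 - t1)   # round-trip delay, for sanity-checking
    return offset, rtt
```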