[tahoe-dev] memory usage experiment

zooko zooko at zooko.com
Wed Jan 16 16:05:35 PST 2008


Folks:

Over the last couple of days I performed a memory usage experiment on  
my Toshiba Linux laptop ("zaula"), which has only 507 MB of RAM.

I turned off swap, set vm.overcommit_memory to "2", and set  
vm.overcommit_ratio to "90".  (Later I raised the ratio to "99".)
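
(For reference, here is a rough sketch of how that configuration can
be applied on a Linux box.  It has to run as root, and it is just the
equivalent of "swapoff -a" plus writing the vm.* sysctls under
/proc/sys -- nothing Tahoe-specific.)

  # Sketch: reproduce the experiment's memory configuration (Linux, root).
  # Equivalent to "swapoff -a" plus
  # "sysctl -w vm.overcommit_memory=2 vm.overcommit_ratio=90".
  import subprocess

  def write_sysctl(name, value):
      # sysctl keys map to files under /proc/sys/; dots become slashes
      path = "/proc/sys/" + name.replace(".", "/")
      f = open(path, "w")
      try:
          f.write("%s\n" % value)
      finally:
          f.close()

  subprocess.check_call(["swapoff", "-a"])   # turn off all swap devices
  write_sysctl("vm.overcommit_memory", 2)    # strict overcommit accounting
  write_sysctl("vm.overcommit_ratio", 90)    # commit limit: 90% of RAM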

Then I ran a small tahoe grid, with the introducer and one node on a  
different machine, and with a number of storage servers on zaula  
(between 3 and 8).  I used top to examine the total memory free as  
well as the virtual and resident memory usage reported by each process.
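
(The same numbers top reports can be read straight out of /proc; here
is a rough sketch, assuming the usual Linux /proc layout, of pulling
out MemFree along with a node process's virtual and resident size:)

  # Sketch: read the figures top reports directly from /proc (Linux only).
  # MemFree comes from /proc/meminfo; VmSize and VmRSS (virtual and
  # resident size) come from /proc/<pid>/status.  All values are in kB.

  def read_kv_file(path):
      d = {}
      for line in open(path):
          if ":" in line:
              key, rest = line.split(":", 1)
              d[key.strip()] = rest.strip()
      return d

  def mem_free_kb():
      return int(read_kv_file("/proc/meminfo")["MemFree"].split()[0])

  def process_mem_kb(pid):
      status = read_kv_file("/proc/%d/status" % pid)
      vmsize = int(status["VmSize"].split()[0])
      vmrss = int(status["VmRSS"].split()[0])
      return vmsize, vmrss

  # e.g. record mem_free_kb() before and after starting a node, and
  # compare the drop against that node's VmRSS.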

I determined that the "total memory free" would drop by between 17  
and 24 MB when I start up a node, which correlates exactly with the  
resident size reported by top for that process, but doesn't correlate  
well with the virtual size.

Then I determined that if I have about 247 MB of memory free, I can  
upload a file (400 MB in size) to this grid (from the client on the  
other machine), but that if I have only 235 MB of memory free, I get  
MemoryErrors when attempting that upload.  This implies that it takes  
about 30 MB per node (in addition to the memory required to start the  
node) to use it as a storage server for this kind of upload.  This  
also correlates exactly with the increased resident sizes of the  
python processes during the upload.

Then I learned that the "total memory free" would increase by between  
39 and 52 MB when I stop each node, which correlates exactly with the  
resident size but not with the virtual size.  This shows that each  
node had settled at roughly 22 to 28 MB above its start-up weight  
after the upload was finished.

This correlation of resident size with the amount used and the amount  
freed is what we expect when there is no swap -- resident size is  
exactly the amount of RAM actually used by that process.  Presumably  
memory shared between processes could cause the reported resident  
size to be greater than the change in total free memory when starting  
or stopping the process, but apparently we don't do that kind of  
sharing.

The bottom line is that resident size, not vmsize, is a good measure  
of what we care about (which is mostly "How many active storage  
server nodes can we run on a server with X MB of RAM?") but only if  
there is no swap and perhaps also if overcommit_memory=2.

We should adjust our automated memory usage measurement and graphing  
tools to run with no swap and no overcommit, and to measure resident  
size instead of virtual size.
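
(Something along these lines -- just a rough sketch, not our actual
measurement tool -- would sample a node process's resident size over
time so that it can be graphed; the pid and output filename are
placeholders:)

  # Sketch: periodically sample a process's resident size (VmRSS, kB)
  # and append timestamped samples to a file for graphing.  Linux-only.
  import time

  def vmrss_kb(pid):
      # Return the resident size in kB, or None once the process is gone.
      try:
          for line in open("/proc/%d/status" % pid):
              if line.startswith("VmRSS:"):
                  return int(line.split()[1])
      except IOError:
          pass
      return None

  def sample(pid, outfilename, interval=10):
      out = open(outfilename, "a")
      while True:
          rss = vmrss_kb(pid)
          if rss is None:
              break                    # the process has exited
          out.write("%d %d\n" % (int(time.time()), rss))
          out.flush()
          time.sleep(interval)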

My experiments suggest that it takes at most 24 MB to start up a  
node, a peak of at most 54 MB for it to serve as one of 9 upload  
servers for a single large (400 MB) file upload, and that it settles  
at no more than 52 MB after the upload is over.

Regards,

Zooko


