Tags: fridayqna memory performance
Welcome back to another Friday Q&A. Now that WWDC is behind us, I'm back on track to bring you more juicy highly-technical goodness. Maybe I can even get back to doing one a week... This week I'm going to take André Pang's suggestion of discussing process memory statistics (the stuff you see in Activity Monitor or top) in Mac OS X.
Before I can discuss what the stats mean, I first have to discuss just how memory actually works on a modern operating system. If you already know the difference between physical memory and virtual address space, understand how file mapping works, etc., then feel free to skip ahead.
At the hardware level, memory is physical chips accessed over a bus. Each byte of memory in those chips has a discrete physical address (although technically modern systems aren't usually byte-addressable, requiring larger chunks to be accessed).
Mediating access to the physical chips is the CPU's MMU (Memory Management Unit). The MMU is what allows for virtual memory. It maps between logical addresses coming from the CPU and physical addresses sitting out in physical RAM.
This gives the CPU a large virtual address space that doesn't necessarily correspond to the physical memory. (This space is 4GB in 32-bit, and a really big number in 64-bit.) Any given section of that address space can either be mapped to an arbitrary section of physical memory, or it can be left unmapped.
What happens when a program tries to access memory that's unmapped? A hardware exception results, and the OS gets to take over.
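The translation step and the fault on an unmapped address can be sketched with a toy model. This is only an illustration: the 4096-byte page size, the `ToyMMU` class, and the `PageFault` exception are assumptions made up for the example, and a real MMU does this in hardware with multi-level page tables.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class PageFault(Exception):
    """Stands in for the hardware exception raised on an unmapped access."""


class ToyMMU:
    """Hypothetical single-level page table: virtual page -> physical frame."""

    def __init__(self):
        self.page_table = {}

    def map_page(self, virtual_page, physical_frame):
        self.page_table[virtual_page] = physical_frame

    def translate(self, virtual_address):
        # Split the address into a page number and an offset within the page.
        virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
        if virtual_page not in self.page_table:
            # Unmapped: in real life the OS would take over at this point.
            raise PageFault(f"no mapping for virtual address {virtual_address:#x}")
        return self.page_table[virtual_page] * PAGE_SIZE + offset


mmu = ToyMMU()
mmu.map_page(virtual_page=2, physical_frame=7)
print(hex(mmu.translate(2 * PAGE_SIZE + 0x10)))  # 0x7010
```

Any address whose page was never passed to `map_page` raises `PageFault`, which is the hook the OS uses for the tricks described next.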
A cleverly programmed OS (like, say, any halfway recent UNIX, or even Windows) can use this fact to do some interesting things. It could, say, maintain its own, more complicated mapping behind the scenes which says that a section of memory that's unmapped in hardware is actually mapped to a file on disk. Then when a hardware exception is raised for trying to access that section, the OS can read a chunk of the file into that spot and then let program execution continue. Now you have file mapping and (if you automatically unmap little-used sections of memory and write their contents out to disk) swap.
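From a program's point of view, file mapping looks roughly like the following sketch, which uses Python's standard `mmap` module. The helper name `read_through_mapping` and the demo file contents are made up for illustration; the lazy loading itself happens inside the OS and is not visible in the code.

```python
import mmap
import os
import tempfile


def read_through_mapping(path, offset, length):
    """Map a file read-only and return `length` bytes starting at `offset`."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapping:
            # Slicing touches the mapped pages; the OS reads them from disk
            # on demand rather than loading the whole file up front.
            return bytes(mapping[offset:offset + length])


# Build a small demo file (hypothetical contents, just over two 4096-byte pages).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * 8192 + b"hello from disk")
    demo_path = tmp.name

data = read_through_mapping(demo_path, 8192, 5)
print(data)  # b'hello'
os.remove(demo_path)
```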
Another clever thing is to map sections of two different processes' address spaces to the same chunk of physical memory. Now you have shared memory!
These techniques can be combined. For example, shared frameworks are typically loaded by mapping them into memory (allowing the OS to load them off of disk lazily). And they're then mapped into multiple processes at once, allowing them to use the same physical RAM for all processes instead of having a bunch of copies.
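Mapping one chunk of physical memory into two processes can be sketched as follows. This is a minimal Unix-only example, assuming `os.fork` is available: it relies on the fact that an anonymous `mmap.mmap` region is created as a shared mapping by default on Unix, so it stays shared across `fork`. The function name `share_with_child` is made up for the example.

```python
import mmap
import os


def share_with_child(message: bytes) -> bytes:
    """Let a child process write into memory that the parent then reads."""
    # Anonymous mapping, one page long. On Unix this defaults to MAP_SHARED,
    # so parent and child see the same physical page after fork().
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child: write into the shared page and exit immediately.
        shared[:len(message)] = message
        os._exit(0)
    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    result = bytes(shared[:len(message)])
    shared.close()
    return result


print(share_with_child(b"written by the child"))  # b'written by the child'
```

Ordinary Python objects would not behave this way: after `fork`, each process gets its own private view of them, which is why the example goes through an explicitly shared mapping.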
Now that we know roughly how the stuff works, let's define some memory-related terms:
- Resident: memory which is located in physical RAM.
- Private: memory which is only mapped into one process.
- Shared: memory which is mapped into multiple processes.
- Address space size: the quantity of address space occupied by a particular section of virtual memory.
- Memory size: the amount of actual physical memory occupied.
Here is what the memory columns in top mean, from looking at the man page and using these definitions:
- RPRVT: The amount of address space, local to this process, which corresponds to items currently present in physical RAM.
- RSHRD: The amount of address space, shared between this process and at least one other, which corresponds to items currently present in physical RAM.
- RSIZE: The total amount of physical RAM used by this process. (This is not equal to RPRVT + RSHRD because they measure address space, but this measures actual memory.)
- VPRVT: The amount of address space in the process mapped to items which are not shared with other processes.
- VSIZE: The total amount of address space in the process that's mapped to anything.
It should also be noted that these numbers are derived from an accounting system which does not always completely correspond to the true numbers, especially when distinguishing between shared and private memory. They're generally close enough to be useful, at least.
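To make the difference between the address-space columns and RSIZE concrete, here is a toy model that applies the definitions above to a handful of pages. It is purely illustrative: the `Page` record, the 4096-byte page size, and the frame numbers are assumptions for the example, and the real kernel accounting is messier than this.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # assumed page size for this toy model


@dataclass
class Page:
    shared: bool           # mapped into at least one other process?
    frame: Optional[int]   # physical frame backing it, or None if not resident


def top_columns(pages):
    """Compute toy versions of top's memory columns, in bytes."""
    resident = [p for p in pages if p.frame is not None]
    return {
        # Address space that is private and currently backed by RAM.
        "RPRVT": PAGE_SIZE * sum(1 for p in resident if not p.shared),
        # Address space that is shared and currently backed by RAM.
        "RSHRD": PAGE_SIZE * sum(1 for p in resident if p.shared),
        # Physical memory: each distinct frame counts once.
        "RSIZE": PAGE_SIZE * len({p.frame for p in resident}),
        # Private address space, resident or not.
        "VPRVT": PAGE_SIZE * sum(1 for p in pages if not p.shared),
        # All mapped address space.
        "VSIZE": PAGE_SIZE * len(pages),
    }


pages = [
    Page(shared=False, frame=10),
    Page(shared=False, frame=None),  # private, but not in RAM right now
    Page(shared=True, frame=20),
    Page(shared=True, frame=20),     # same physical frame mapped a second time
    Page(shared=True, frame=None),
]
stats = top_columns(pages)
print(stats)
```

In this made-up process, RPRVT + RSHRD is three pages of address space, but RSIZE is only two pages, because one physical frame is mapped twice.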
By this point you're probably scratching your head and wondering which number you should look at to see how much memory your program is using. Trouble is, there isn't one!
As you've seen, memory usage is highly complicated, and none of these numbers answers that question. In fact, with things like file mapping and shared memory, it's not even a question that really makes sense.
That's not to say that these numbers are useless, though. Even though nothing directly corresponds to what you'd really like to know, there are still some interesting facts you can obtain.
For 32-bit programs, VSIZE can be very important. This is because 32-bit programs have a hard 4GB limit on virtual address space, and in this modern world it's not all that hard to hit that limit. Once you do, memory allocations will begin to fail and your program will probably crash shortly afterwards. If your VSIZE is near the 4GB limit, you're chewing up too much address space on something.
(For 64-bit programs, the virtual address space is virtually unlimited, and so this column is of little use. For example, garbage collected apps in 64-bit immediately allocate a 64GB chunk of virtual address space just to make the accounting easier. This has no bearing on your actual memory usage and is completely harmless, although it tends to freak out users who go groveling around Activity Monitor.)
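The gap between claiming address space and actually using memory can be seen in a small sketch. It assumes a typical Unix-like system where anonymous mappings are not backed by physical RAM until they are touched; the 64 MiB figure is just a small stand-in for the 64GB reservation mentioned above.

```python
import mmap

RESERVED = 64 * 1024 * 1024  # 64 MiB of address space (illustrative size)

# Anonymous mapping: the address space is claimed right away, which is what
# VSIZE counts. On typical systems no physical pages are assigned yet.
region = mmap.mmap(-1, RESERVED)

# Touch only the first page; this is the only part that needs real memory.
region[0:5] = b"hello"

total_pages = len(region) // mmap.PAGESIZE
print(f"reserved {total_pages} pages of address space, touched 1")
```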
RPRVT can be useful as a rough indicator for watching if the total amount of memory your program has allocated is going up or down. This is dangerous to rely on, however. Because this only tracks resident memory, if your program has started to swap then your RPRVT will no longer increase, even though you're still allocating more and more memory. (To detect this, you can watch to see if VPRVT is going up, and the number of pageouts listed at the top of the screen is going up.) Conversely, the memory allocator doesn't always give memory back to the system right away, so this number may not go down if your program is freeing memory.
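The swapping check described in the parenthetical can be written down as a rough heuristic. This is only a sketch: the function name `looks_like_swapping`, the sample format, and the numbers are invented for illustration, and real readings would have to be collected from top by hand or by some other means.

```python
def looks_like_swapping(samples):
    """Guess whether a process has started to swap.

    samples: list of (rprvt, vprvt, pageouts) tuples, oldest first.
    Returns True when RPRVT has stopped growing while VPRVT and the
    system-wide pageout count are both still going up.
    """
    first, last = samples[0], samples[-1]
    rprvt_flat = last[0] <= first[0]
    vprvt_growing = last[1] > first[1]
    pageouts_growing = last[2] > first[2]
    return rprvt_flat and vprvt_growing and pageouts_growing


# Made-up readings in megabytes (RPRVT, VPRVT) plus a pageout counter.
swapping = [(300, 400, 1000), (300, 520, 4200)]
healthy = [(300, 400, 1000), (350, 450, 1000)]

print(looks_like_swapping(swapping))  # True
print(looks_like_swapping(healthy))   # False
```

Comparing only the first and last sample keeps the example short; it says nothing about the opposite problem, where freed memory is not returned to the system right away.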
Overall, be careful not to rely too much on these statistics. For more precise information to track down leaks and excessive memory allocation, tools like the leaks command and the ObjectAlloc instrument are much better.
That brings us to the end of this edition of Friday Q&A. Now you should understand what all those weird numbers mean in top (except, potentially, for all of the ones that aren't related to memory) and how best to use and not use them.
Come back next week (I hope) for another exciting edition. Be sure to send along your ideas for topics to discuss. Without your contributions, Friday Q&A could not exist. Post them in the comments or e-mail them directly to me.
Friday Q&A would like to acknowledge Ed Wynne's important role in providing technical advice for this week's post.
the answer is "Firefox"
Thanks for the information!
Yet - and I have been wondering about this for years - my main question "which number corresponds to the number displayed by Mac OS 9" still remained unanswered, or rather answered as 'none'.
Which is a shame because the OS 9 numbers were actually useful in the sense that they told me which application would be worth quitting when running into swapping hell. In OS X I have to intuitively quit notorious memory hogs like iPhoto or VMWare but it seems to be impossible to collect reasonable information about which application is most worth quitting from these numbers.
"Nice" in the CPU tab is also a mystery,
but I suppose that is another subject.
"nice" is scheduling priority, try 'man nice' in a terminal or http://en.wikipedia.org/wiki/Nice_(Unix)
RSIZE measures actual memory, but you said RPRVT is the amount of address space, local to the process, which corresponds to items currently present in physical RAM.
I'm sure I'm just missing something
I've searched all over and not come up with much info other than what I can summarize as: it doesn't necessarily mean something bad, it just means a program is waiting for something.
It would be nice to know which program and what it's waiting for...
Shared libraries are composed of multiple sections: one with the code (read-only), one with the process-specific state such as global variables (read-write), etc.
And only the "read-only" sections are shared.
In the list of useful tools, MallocDebug is worth mentioning too.
Chuck: Any given chunk of physical RAM could be mapped 0 or more times into the process.
For questions about CPU usage and such, I think you're in the wrong place. This is a programming blog.
It's nice to pine for the "good old days", but the Darwin memory manager is nothing like the 9 memory manager, and Mac OS X apps tend to have more complex behaviours than their Classic ancestors, so the situation overall just isn't as straightforward anymore.
As Jean-Daniel explains, shared libraries are internally subdivided based on (amongst other things) their writability. It's worth noting that the parts that might be writable but which have not yet been written remain shared using a technique called copy-on-write (often abbreviated COW).
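Copy-on-write can be observed from user space with a private file mapping. The sketch below uses `mmap.ACCESS_COPY` from Python's standard library, which gives a mapping whose writes affect memory but not the underlying file. The temporary file and its contents are made up for the example, and this only illustrates the idea, not how the dynamic loader actually maps library data sections.

```python
import mmap
import os
import tempfile

# Create a small file to stand in for the on-disk original (hypothetical data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"original")
    cow_path = tmp.name

with open(cow_path, "r+b") as f:
    # ACCESS_COPY: pages stay backed by the file until written; a write goes
    # to a private copy of the page, so the file itself is left alone.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as private_view:
        private_view[0:8] = b"modified"
        seen_in_mapping = bytes(private_view[0:8])

with open(cow_path, "rb") as f:
    seen_on_disk = f.read()

os.remove(cow_path)
print(seen_in_mapping, seen_on_disk)  # b'modified' b'original'
```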
The accounting used to maintain those numbers is complicated, and they are not directly derived one from the others. It may help to know that most shared libraries are kept in what is known as the "shared segment", and the system uses a trick to share portions of the pmap and VM pagetable (data structures used by the MMU) that correspond to this segment between processes. This is a healthy performance optimisation, but it means that updates made to these data structures cannot be trivially accounted for across all processes.
In particular, RSIZE is very difficult to account precisely. If one task causes a page to be made present in a shared library, it's reasonable to charge that task for the page. If another task then uses that page, it might be reasonable to charge it for it as well, but there is no event that tells the system the second task has used the page. Further, when the page is later evicted and recycled, there is no list of all the tasks that used the page. Indeed, once it's made present, it's present in all tasks. Do you account a resident page in the shared segment against every task in the system? That would make the number useless, as one task running lots of code in a framework would make it look like every task in the system was blowing out its resident set.
Instead, RSIZE is managed using several tricks that try to make it 'relevant' at the cost of being 'precise'. You're encouraged to read the source code for the details.
The act of "inspecting" a process causes a large chunk of its address space to be shared with the tool inspecting it, and so it goes from being accounted as VSIZE to VSHARED. You'll note that if you quit and restart Activity Monitor, the VSIZE accounting pops back up to where it was.
This matches my experience, since the 3.0 release. FF has cleaned up its act.
e.g., Firefox running for 13 days, presently about 80 tabs open:
PID  COMMAND     %CPU   TIME      #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
460  firefox-bi  10.7%  31:29:37  26   277    6391    338M   50M    490M   939M
I also find it useful to keep separate profiles (run /Applications/Firefox.app/Contents/MacOS/firefox-bin --profilemanager
from Terminal), one profile for regular browsing with just a few plugins enabled, and another for web dev work with firebug, html validators, etc.
You run it like this:
~>sudo iotop -P
And it displays a % of disk IO each process is using :)
CPU usage is easy to see in activity monitor or by using top. But disk I/O is usually hard to track down.
I keep wondering why a way to monitor per-process disk activity is not integrated into Activity Monitor (it shows just the totals, which is not very helpful).
From my experience, the causes of major system slowdowns are apps like Time Machine (local) or Backblaze (online) and other backup apps, or Spotlight and other search/indexing utilities... Those reaaaally bog down my machine and are really hard to track down/stop.
This article is such an enlightening read, but it comes down to: it's very hard to track *real* memory usage, because... there's no such thing! ;) haha It's soo complicated and "abstract"...
Thanks for the tip on iotop, juancn! :)