Using the Process Memory Matrix script to understand Oracle process memory usage
This tool currently works only on Solaris. I will write support for (newer) Linux versions soon and possibly also for HP-UX.
When working on a problem I wrote a script which presents the output of Solaris pmap in a better way. If you don't know what pmap is, it's a tool available on Solaris, Linux, HP-UX (and AIX, where it's called procmap) which shows a breakdown of each process's address space, that is, its virtual memory mappings. This is way better than just relying on the SIZE and RSS columns of the ps or top commands.
My script gives a better overview of how much memory an Oracle process is really using. Historically this has been problematic due to various differences in how shared memory segments (the Oracle SGA) are accounted for, and due to the large amount of data returned by the pmap command.
My script procmm.sh (Process Memory Matrix) will simply run pmap on the processes specified and aggregate the output into a matrix.
Example output is here:
oracle@solaris02:~/research/memory$ ./procmm.sh 12755
-- procmm.sh: Process Memory Matrix v1.01 by Tanel Poder ( http://tech.e2sn.com )
-- All numbers are shown in kilobytes
   PID SEGMENT_TYPE              VIRTUAL          RSS         ANON       LOCKED    SWAP_RSVD
------ -------------------- ------------ ------------ ------------ ------------ ------------
 12755 lib                         22932        22872          900            0         1048
 12755 oracle                      95824        95820          232            0         2468
 12755 ism_shmid=0x1d             409608       409608            0       409608            0
 12755 anon                         5728         5512         5508            0         5724
 12755 stack                         156          156          156            0          156
 12755 heap                         1924          868          868            0         1924
------ -------------------- ------------ ------------ ------------ ------------ ------------
 12755 TOTAL(kB)                  536172       534836         7664       409608        11320
On the horizontal axis you see the various memory sizes pmap reports (such as VIRTUAL size and SWAP_RSVD, the swap space reservation). On the vertical SEGMENT_TYPE axis you see which mapping in the process address space each figure belongs to: for example the Oracle binary, heap memory (think malloc()), libraries, and process-private memory allocations, which show up as "anon".
I don't do any computation magic of my own here. I just show the output of a couple of pmap commands in a better aggregated and more understandable form. So if you want to read the official documentation about these figures, just run man pmap.
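To make the aggregation idea concrete, here is a minimal Python sketch of how per-mapping lines in the style of Solaris pmap -x could be rolled up into one row per segment type. It is not the procmm.sh source: the sample lines and their numbers are made up, and the segment_type() classification rule is my own assumption. SWAP_RSVD is left out because it comes from a separate pmap invocation.

```python
from collections import defaultdict

# Hypothetical sample in the style of Solaris "pmap -x" output.
# The numbers are made up; header and total lines are omitted.
SAMPLE = """\
00010000      96      96       -       - r-x--  oracle
00036000       8       8       4       - rwx--  oracle
00040000     100      60      60       - rwx--    [ heap ]
80000000    4096    4096       -    4096 rwxsR    [ ism shmid=0x1d ]
FE800000     640     600       -       - r-x--  libc.so.1
FE8B0000      24      24      16       - rwx--  libc.so.1
FFBF0000      16      16      16       - rw---    [ stack ]
"""

COLUMNS = ("VIRTUAL", "RSS", "ANON", "LOCKED")


def segment_type(mapped):
    """Assumed classification rule, not necessarily what procmm.sh does."""
    mapped = mapped.strip()
    if mapped.startswith("["):
        # "[ ism shmid=0x1d ]" -> "ism_shmid=0x1d", "[ heap ]" -> "heap"
        return "_".join(mapped.strip("[] ").split())
    if ".so" in mapped:
        return "lib"
    return mapped


def to_kb(field):
    # pmap prints "-" where a figure does not apply; count it as zero.
    return 0 if field == "-" else int(field)


def aggregate(pmap_text):
    """Sum the size columns of all mappings that share a segment type."""
    matrix = defaultdict(lambda: dict.fromkeys(COLUMNS, 0))
    for line in pmap_text.splitlines():
        fields = line.split(None, 6)
        # Skip anything that is not a mapping line (headers, totals, blanks).
        if len(fields) < 7 or not fields[1].isdigit():
            continue
        _addr, kbytes, rss, anon, locked, _mode, mapped = fields
        row = matrix[segment_type(mapped)]
        for col, val in zip(COLUMNS, (kbytes, rss, anon, locked)):
            row[col] += to_kb(val)
    return dict(matrix)


matrix = aggregate(SAMPLE)
for seg, row in matrix.items():
    print(f"{seg:<20}" + "".join(f"{row[c]:>13}" for c in COLUMNS))
```

The two oracle lines and the two libc.so.1 lines each collapse into a single row, which is the whole point of the matrix: far fewer lines to read than raw pmap output.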
I will briefly explain the columns here too:
In the above example I examined only a single process. Below I pass all processes of an instance as parameters and procmm walks through them. This is not a cheap or fast operation, so you shouldn't run it frequently.
The -t option below means "Total": instead of showing a memory breakdown for each individual PID, the script shows the sum over all PIDs passed to it.
oracle@solaris02:~/research/memory$ ./procmm.sh -t `pgrep -f ora_.*SOL102`
-- procmm.sh: Process Memory Matrix v1.01 by Tanel Poder ( http://tech.e2sn.com )
-- All numbers are shown in kilobytes
Total PIDs 17, working: .................
   PID SEGMENT_TYPE              VIRTUAL          RSS         ANON       LOCKED    SWAP_RSVD
------ -------------------- ------------ ------------ ------------ ------------ ------------
     0 lib                        389844       388796        13180            0        17816
     0 oracle                    1629064      1628908         3336            0        42012
     0 ism_shmid=0x1d            6963336      6963336            0      6963336            0
     0 hc_SOL102.dat                  48           48            0            0            0
     0 anon                        32936        15936        15452            0        32868
     0 stack                        1660         1628         1592            0         1660
     0 heap                        37004        18016        16844            0        37004
------ -------------------- ------------ ------------ ------------ ------------ ------------
     0 TOTAL(kB)                 9053892      9016668        50404      6963336       131360
-- Note that in Total (-t) calculation mode it makes sense to look into ANON and SWAP_RSVD
-- totals only as other numbers may be heavily "doublecounted" due overlaps of shared mappings
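The double counting mentioned in that note is easy to verify from the two runs above. The short Python check below only reuses figures printed by procmm.sh. The one assumption is that each of the 17 processes maps the whole ISM segment, as PID 12755 does.

```python
# Figures copied from the two procmm.sh runs shown above (all in kB).
single_pid_ism = 409608   # ism_shmid=0x1d RSS for PID 12755
total_ism = 6963336       # ism_shmid=0x1d RSS in -t mode
pid_count = 17            # "Total PIDs 17"

# The shared segment is simply added once per process in the -t totals.
assert single_pid_ism * pid_count == total_ism

total_rss = 9016668       # TOTAL(kB) RSS in -t mode
total_anon = 50404        # TOTAL(kB) ANON in -t mode

# Counting the ISM segment once instead of 17 times already removes most
# of the RSS total. The oracle binary and library mappings are still
# counted once per process in this figure.
rss_with_ism_once = total_rss - total_ism + single_pid_ism

print("RSS with ISM counted once (kB):", rss_with_ism_once)
print("ANON total (kB):", total_anon)
```

So about 6.9 GB of the 9 GB RSS total is the same 400 MB shared segment counted 17 times, which is why only the ANON and SWAP_RSVD totals are meaningful in -t mode.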
The ANON total of 50404 kB is roughly the actual private memory usage of the Oracle instance's processes (to be more precise, I should say memory allocation rather than usage).
However, the 50404 kB total also includes 13180 kB of ANON memory allocated by various libraries (Oracle libraries as well as several OS libraries which are not under Oracle's control). In addition, a total of 3336 kB of private (writable) ANON memory has been allocated "in" the oracle binary. This comes from the writable data and BSS sections of the binary, which hold global and static variables (ordinary function-local variables live on the stack, but static variables and global variables shared across object modules live in these sections).
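As a worked breakdown of the ANON column from the -t run above, the following Python snippet splits the total by segment type. The figures are copied from the procmm.sh output; only the percentage calculation is added.

```python
# ANON column of the -t run above, in kB, keyed by SEGMENT_TYPE.
anon_kb = {
    "lib": 13180,
    "oracle": 3336,
    "ism_shmid=0x1d": 0,
    "hc_SOL102.dat": 0,
    "anon": 15452,
    "stack": 1592,
    "heap": 16844,
}

anon_total = sum(anon_kb.values())

# Private pages created inside file-backed mappings (libraries and binary).
in_file_mappings = anon_kb["lib"] + anon_kb["oracle"]
# Private pages in the anon, stack and heap segments.
in_private_segments = anon_kb["anon"] + anon_kb["stack"] + anon_kb["heap"]

for seg, kb in sorted(anon_kb.items(), key=lambda item: -item[1]):
    print(f"{seg:<20}{kb:>10} kB{100 * kb / anon_total:>7.1f} %")
print(f"{'TOTAL':<20}{anon_total:>10} kB")
```

Roughly a third of the ANON total (16516 kB) sits in library and binary mappings, and the remaining 33888 kB is in the anon, stack and heap segments.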
Every new Oracle process reuses the shared Oracle binary pages (only one copy of the Oracle binary is in memory). But when a process tries to write to the variables section, the OS copies the shared page into a new physical page and maps that new page into the process address space as a private, writable page. This is called copy-on-write.
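The same mechanism can be observed on a small scale with a private file mapping. The Python sketch below is only an analogy, not how Oracle or the Solaris loader is implemented: it maps a temporary file (a hypothetical stand-in for the binary's variables section) with mmap.ACCESS_COPY, which requests a copy-on-write mapping, writes to it, and then checks that the file itself was not changed.

```python
import mmap
import os
import tempfile

# Hypothetical stand-in for a file-backed section holding initialised variables.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"initial value of a global variable".ljust(mmap.PAGESIZE, b"\0"))
    path = f.name

try:
    with open(path, "rb") as f:
        # ACCESS_COPY gives a private copy-on-write mapping: writes go to
        # pages private to this process and never reach the underlying file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as private_map:
            private_map[:7] = b"CHANGED"
            seen_in_process = bytes(private_map[:7])

    # Re-read the file from disk to see what other processes would map.
    with open(path, "rb") as f:
        seen_in_file = f.read(7)
finally:
    os.unlink(path)

print("this process sees:", seen_in_process)
print("the file still has:", seen_in_file)
```

The page that was written to is what a tool like pmap would account for as private (anonymous) memory of the process, while the untouched pages stay shared with the file.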
To Be Continued...