Using Process Memory Matrix script for understanding Oracle process memory usage

This tool currently works only on Solaris. I will write support for (newer) Linux versions soon and possibly also for HP-UX.

While working on a problem, I wrote a script that presents the output of Solaris pmap in a more readable way. If you don't know pmap, it's a tool available on Solaris, Linux, HP-UX (and AIX, where it's called procmap) that displays a breakdown of a process's address space, that is, its virtual memory mappings. This is much better than relying only on the SIZE and RSS columns of ps or top.

My script gives a better overview of how much memory an Oracle process is really using. Historically this has been problematic due to differences in how shared memory segments (the Oracle SGA) are accounted for, and due to the large amount of data the pmap command returns.

My script procmm.sh (Process Memory Matrix) will simply run pmap on the processes specified and aggregate the output into a matrix.

Example output is here:

oracle@solaris02:~/research/memory$ ./procmm.sh 12755

-- procmm.sh: Process Memory Matrix v1.01 by Tanel Poder ( http://tech.e2sn.com )
-- All numbers are shown in kilobytes

PID            SEGMENT_TYPE      VIRTUAL          RSS         ANON       LOCKED    SWAP_RSVD
------ -------------------- ------------ ------------ ------------ ------------ ------------
12755                   lib        22932        22872          900            0         1048
12755                oracle        95824        95820          232            0         2468
12755        ism_shmid=0x1d       409608       409608            0       409608            0
12755                  anon         5728         5512         5508            0         5724
12755                 stack          156          156          156            0          156
12755                  heap         1924          868          868            0         1924
------ -------------------- ------------ ------------ ------------ ------------ ------------
12755             TOTAL(kB)       536172       534836         7664       409608        11320

On the horizontal axis you see the memory sizes that pmap reports (such as VIRTUAL size and SWAP_RSVD, the swap space reservation). On the vertical SEGMENT_TYPE axis you see which mapping in the process address space those figures belong to: for example the Oracle binary, heap memory (think malloc()), libraries, and process-private memory allocations, which show up as "anon".

I don't do any computation magic of my own here; I just show the output of a couple of pmap commands in a better aggregated and more understandable form. So if you want to read the official documentation about these figures, just run man pmap.
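The aggregation idea can be sketched in a few lines of Python. This is not the procmm.sh source: it assumes input rows shaped like Solaris `pmap -x` output (address, Kbytes, RSS, Anon, Locked, mode, mapped file), the `segment_type` rules are guesses made for illustration, and the sample rows are invented. SWAP_RSVD is left out; since the script uses a couple of pmap commands, that column presumably comes from a different invocation.

```python
import re
from collections import defaultdict

COLUMNS = ("VIRTUAL", "RSS", "ANON", "LOCKED")

def segment_type(mapped_file):
    """Map a 'Mapped File' field to a coarse segment type (illustrative rules only)."""
    name = mapped_file.strip()
    if name.startswith("["):
        # "[ heap ]" -> "heap", "[ ism shmid=0x1d ]" -> "ism_shmid=0x1d"
        return "_".join(name.strip("[] ").split())
    if ".so" in name:
        return "lib"
    return name

def aggregate(lines):
    """Sum the kB columns of pmap -x style rows per segment type."""
    matrix = defaultdict(lambda: dict.fromkeys(COLUMNS, 0))
    for line in lines:
        fields = line.split(None, 6)
        # Data rows start with a hex address and have seven fields; this
        # skips the header, the "total Kb" line and mappings without a name.
        if len(fields) < 7 or not re.fullmatch(r"[0-9A-Fa-f]+", fields[0]):
            continue
        row = matrix[segment_type(fields[6])]
        for column, value in zip(COLUMNS, fields[1:5]):
            row[column] += 0 if value == "-" else int(value)
    return dict(matrix)

# Invented sample rows, not taken from a real process.
SAMPLE = [
    " Address  Kbytes     RSS    Anon  Locked Mode   Mapped File",
    "00010000     100      96       -       - r-x--  oracle",
    "00120000      20      20       4       - rwx--  oracle",
    "00130000      16       8       8       - rwx--    [ heap ]",
    "FE000000     400     400       -     400 rwxsR    [ ism shmid=0x1d ]",
    "FF200000      64      60       -       - r-x--  libc.so.1",
    "total Kb     600     584      12     400",
]

for seg, row in aggregate(SAMPLE).items():
    print(f"{seg:>20} " + " ".join(f"{row[c]:>12}" for c in COLUMNS))
```

The two oracle rows collapse into a single line of the matrix, which is the whole point: many mappings per process, one row per segment type.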

I will briefly explain the columns here too:

In the above example I examined only a single process. Below I pass all processes of an instance as parameters and procmm walks through them. This is not a cheap or fast operation, so you shouldn't run it frequently.

The -t option below means "Total": the script doesn't show the memory breakdown of each individual PID, but the sum over all PIDs passed to it.
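Conceptually, a total mode like this only has to add up the per-PID matrices cell by cell. A minimal Python sketch, assuming each PID's matrix is held as a nested dictionary (the data layout, the function name and all figures here are hypothetical, not taken from procmm.sh):

```python
from collections import Counter, defaultdict

def total_matrix(per_pid):
    """Collapse {pid: {segment_type: {column: kB}}} into one summed matrix."""
    totals = defaultdict(Counter)
    for matrix in per_pid.values():
        for seg, columns in matrix.items():
            totals[seg].update(columns)  # Counter.update adds values key by key
    return {seg: dict(columns) for seg, columns in totals.items()}

# Invented figures for two hypothetical PIDs.
per_pid = {
    101: {"heap": {"VIRTUAL": 1000, "ANON": 400},
          "stack": {"VIRTUAL": 100, "ANON": 100}},
    102: {"heap": {"VIRTUAL": 2000, "ANON": 900}},
}

print(total_matrix(per_pid))
```

A plain sum like this counts a mapping shared by several processes once per process, so only the columns describing private memory stay meaningful after summing.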

oracle@solaris02:~/research/memory$ ./procmm.sh -t `pgrep -f ora_.*SOL102`

-- procmm.sh: Process Memory Matrix v1.01 by Tanel Poder ( http://tech.e2sn.com )
-- All numbers are shown in kilobytes

Total PIDs 17, working: .................

PID            SEGMENT_TYPE      VIRTUAL          RSS         ANON       LOCKED    SWAP_RSVD
------ -------------------- ------------ ------------ ------------ ------------ ------------
0                       lib       389844       388796        13180            0        17816
0                    oracle      1629064      1628908         3336            0        42012
0            ism_shmid=0x1d      6963336      6963336            0      6963336            0
0             hc_SOL102.dat           48           48            0            0            0
0                      anon        32936        15936        15452            0        32868
0                     stack         1660         1628         1592            0         1660
0                      heap        37004        18016        16844            0        37004
------ -------------------- ------------ ------------ ------------ ------------ ------------
0                 TOTAL(kB)      9053892      9016668        50404      6963336       131360

-- Note that in Total (-t) calculation mode it makes sense to look into ANON and SWAP_RSVD
-- totals only as other numbers may be heavily "doublecounted" due overlaps of shared mappings
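The double counting is easy to see in the ism_shmid=0x1d row. The short Python check below uses only figures from the two outputs above and assumes that every one of the 17 processes attaches the same shared segment at the size seen for PID 12755 (an inference from the numbers, not something the script reports):

```python
PROCESSES = 17                # "Total PIDs 17" in the -t output
ISM_PER_PROCESS_KB = 409608   # ism_shmid=0x1d row for PID 12755
ISM_TOTAL_KB = 6963336        # ism_shmid=0x1d row in the -t output

def double_counted_kb(per_process_kb, processes):
    """kB counted more than once when one shared segment is summed per process."""
    return per_process_kb * (processes - 1)

# The -t figure is exactly the single segment multiplied by the process count.
assert ISM_PER_PROCESS_KB * PROCESSES == ISM_TOTAL_KB

print(double_counted_kb(ISM_PER_PROCESS_KB, PROCESSES))
```

Under that assumption the VIRTUAL, RSS and LOCKED totals each overstate the shared segment by 16 extra copies of roughly 400 MB, which is why only ANON and SWAP_RSVD are worth reading in -t mode.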

The ANON total of 50404 kB is roughly the actual memory usage of the Oracle instance's processes (to be more precise, I should say memory allocation).

However, the 50404 kB total also includes 13180 kB of ANON memory allocated by various libraries (in addition to Oracle libraries, also multiple OS libraries which are not under Oracle's control). Also, a total of 3336 kB of private (writable) ANON memory has been allocated "in" the oracle binary. This comes from the writable data sections of the binary, such as BSS, which hold global and static variables (ordinary function-local variables live on the stack, but static variables and global variables used across object modules go to the data or BSS section).
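The breakdown of the ANON total can be checked with a little arithmetic. The Python snippet below simply re-adds the ANON column of the -t output shown earlier; the variable names are made up for this illustration:

```python
# ANON column (kB) per segment type, copied from the -t output.
ANON_KB = {
    "lib": 13180,
    "oracle": 3336,
    "ism_shmid=0x1d": 0,
    "hc_SOL102.dat": 0,
    "anon": 15452,
    "stack": 1592,
    "heap": 16844,
}

total_kb = sum(ANON_KB.values())
# What is left after removing the library and oracle binary contributions.
anon_stack_heap_kb = total_kb - ANON_KB["lib"] - ANON_KB["oracle"]

print("total ANON kB:", total_kb)
print("anon + stack + heap kB:", anon_stack_heap_kb)
for seg, kb in ANON_KB.items():
    print(f"{seg:>20} {kb:>8} {100 * kb / total_kb:5.1f}%")
```

The sum reproduces the 50404 kB in the TOTAL row, and 33888 kB of it sits in the anon, stack and heap segments rather than in library or binary mappings.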

Every new Oracle process reuses the shared Oracle binary pages (only one copy of the Oracle binary is in memory). But when a process tries to write to the variables section, the OS copies the shared page into a new physical page and maps the new page into the process address space as a writable page. This is called copy-on-write.
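Copy-on-write can be observed from user space with a private file mapping. The sketch below is a generic Python illustration for Unix-like systems, not something procmm.sh or Oracle does: it maps a temporary file with MAP_PRIVATE, writes through the mapping, and shows that the file itself is unchanged because the write landed in a private copy of the page. The function name and file contents are invented for the example.

```python
import mmap
import tempfile

def write_through_private_mapping():
    """Return (bytes seen via the mapping, bytes still in the file) after a private write."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"shared page contents")
        f.flush()
        # MAP_PRIVATE: the first write to a page gives this process its own copy.
        with mmap.mmap(f.fileno(), 0, flags=mmap.MAP_PRIVATE,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as m:
            m[0:6] = b"PRIVAT"
            in_mapping = bytes(m[:])
        f.seek(0)
        in_file = f.read()
    return in_mapping, in_file

in_mapping, in_file = write_through_private_mapping()
print(in_mapping)
print(in_file)
```

The writable sections of a binary are mapped privately in the same spirit, which is why such pages start out shared and turn into per-process ANON memory only once they are written to.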

To Be Continued...