Memory leaks are very dangerous especially when our applications run for an extended period of time. They tend to eat up your memory (depending on the  amount of leak) and might result in a hang of your application or worse your entire machine.

I have worked with different memory debugging tools and one of them I found very interesting was “libumem“.

The problem: One of my JNI application was leaking badly on Solaris OS. Valgrind was always my first choice, but as valgrind doesn’t support Solaris, I had no other choice but to look for other solutions. On googling, I stumbled upon libumem and found it very promising for my problem. Surprisingly, I found it very easy to use than any other tool.

A brief description of the library: libumem is a memory management library provided by Solaris. If you have Solaris 9(update 3)/10. You should be already having  the library.

libumem is a user space slab allocation library which performs object caching so that the memory that is frequently allocated and freed is cached. This reduces the overhead of creating and releasing the memory. Whenever the application requests memory from the system, the system will allocate the memory from this umem cache. And the buffer, which is the building block of the cache contains the redzone section and debug metadata section which holds the that data helps in analyzingthe memory leaks and memory corruption. And one can view all this information using MDB(Modular Debugger)

Enough of theory… Roll up your sleeves now… Letz get in to some action…

Before starting your rouge application, we need to set a few things.

bash-3.00#LD_PRELOAD=libumem.so

bash-3.00#export LD_PRELOAD

bash-3.00#UMEM_DEBUG=default

bash-3.00#export UMEM_DEBUG

bash-3.00#UMEM_LOGGING=transaction

bash-3.00#export UMEM_LOGGING

The above settings ensures that libumem is loaded on to the application and the information like thread-ID, timestamp and stack trace are recorded for each memory transaction initiated by the application.

Now start your application

bash-3.00#./Myapp.bin

Get the  process Id of the application and attach it to the mdb…

bash-3.00# mdb -p 24814
Loading modules: [ ld.so.1 libumem.so.1 libc.so.1 libuutil.so.1 ]
>

If your application got cored, you can even run mdb with core. But we must ensure that libumem is preloaded in to the application. Now moving to the point, letz get the leaks using findleaks

>::findleaks
BYTES             LEAKED VMEM_SEG CALLER
40960                  8 d0fc0000 MMAP
8192                   1 fe308000 MMAP
8192                   1 fe302000 MMAP
8192                   1 fbfb8000 MMAP
8192                   1 fbfb6000 MMAP
8192                   1 f9e72000 MMAP
16384                  1 d15d0000 MMAP
24576                  1 d0fca000 MMAP
————————————————————————
Total       8 oversized leaks, 122880 bytes

CACHE     LEAKED   BUFCTL CALLER
000dae08       1 0012a258 libc.so.1`strdup+0xc
000db188    8902 005674a0 libmysocket_jni.so`Java_com_my_tempsock_mysocket_DSA_1wrap_Mylan_1socketConnect+0x2c
000e3c08       3 005e4708 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000e3888       1 001930e0 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000e3c08       7 007cd158 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000dae08       2 0012d0e0 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000db188       2 00791680 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000e3c08       6 0032e690 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000db188       3 005661e0 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000e3c08       2 007ccd20 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000db188       3 00269680 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000e3c08       3 005e4960 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000db188       6 00269f68 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000e3c08       5 005e2ff0 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000db188       7 0078a168 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000de388       1 00168ff0 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000de708       1 000efd10 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000db188       4 0015be00 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000db188       5 009599c8 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000de708       1 000eff68 libjvm.so`__1cCosGmalloc6FI_pv_+0x24
000db888    8943 00243428 libsocket.so.1`get_addr+0x3e0
000db188    8718 00567338 libsocket.so.1`get_addr+0x410
000df508       1 004f9860 pkcs11_kernel.so.1`get_bigint_attr_from_template+0x28
000df508       1 004f9950 pkcs11_kernel.so.1`get_bigint_attr_from_template+0x28
000dae08       1 003e3ba8 pkcs11_kernel.so.1`get_bigint_attr_from_template+0x28
000db508       1 005547f8 pkcs11_kernel.so.1`get_bigint_attr_from_template+0x28
———————————————————————-
Total   26630 buffers, 812808 bytes

It shows many leaks, but most of them are just false alerts. So just ignore them. Look at the leaks that were bolded. They are more serious. To get more information about the leaks, we need to print the stack trace. The command to get the stack is “<CACHE_mem_add>$<bufctl_audit”.

> 00243428$<bufctl_audit
ADDR          BUFADDR        TIMESTAMP           THREAD
CACHE          LASTLOG         CONTENTS
243428           241120 219a283c058e25               27
db888                0                0
libumem.so.1`umem_cache_alloc+0x210
libumem.so.1`umem_alloc+0x60
libumem.so.1`malloc+0x28
libsocket.so.1`get_addr+0x3e0
libsocket.so.1`_getaddrinfo+0x710
libmysocket.so`Mygetaddrinfo+0x78
libmysocket_jni.so`Java_com_my_tempsock_mysocket_DSA_1wrap_Mylan_1socketConnect+0x100
0xfc00cd40
0xfc00ccec
0xfc0059b8
0xfc005850
0xfc005850
0xfc005850
0xfc005970
0xfc005970

The above stack trace clearly tells me that there has been some memory allocated by getaddrinfo, which hasnt been freed up by freeaddrinfo.  I went ahead and fixed up that bug. But not every memory leak / buffer overflowis so simple. We might need some more information about the buffer. The above bolded string is the buffer address of the leak. Printing information will tell us more details like size of the allocation, if there is any corruption etc…

> 241120/10X

will give us the whole information( my buffer is too huge her to copy here). To find if there is any corruption in the memory, get the last two values of the buffer and XOR them. It should return 0xa110c8ed. If not, then our memory is definitely corrupted.

Thats all about how we can use libumem to debug a memory leak in Solaris. So simple right? :). Hope this helps

Advertisements

BonJour…

Debugging a native C/C++ application is a very tedious process unless you know the right tools to use. This blog is an attempt to give information about the right tools that help us in different situations.

Hope this helps in Sorting Out UR “SOUR” problems.