title: "Finding leaks on Solaris (w/o Valgrind)"
date: 2010-10-09 20:04:18
categories:
- Damaged Bits
tags:
- solaris
- illumos

Premise: I write a lot of C code. I run a lot of Solaris.

Sadness: One of my favorite tools ever made is Valgrind. Valgrind does not run on Solaris.

A lot of the C code I write is event-driven, which makes it more complicated to write, and leaked memory is a common byproduct of that complexity. Memory leaks suck. Most of the code I write is systems-level code that is exercised heavily and needs to run for months or years without a restart. So, leaks more than suck.

Valgrind will tell you exactly where you’re leaking (and it feels like it tells you why)… BFM.

On Solaris, we have a different option: libumem.

I compile almost all of my apps against libumem, which replaces malloc, free, and friends with a slab-allocator implementation that scales better across multiple processors than the default one in libc. A very useful feature of libumem is its debugging and leak detection… how?

First, let’s assume you’ve compiled your app with debugging symbols (-g in the compiler flags). Next, I’ll assume you didn’t have the foresight to link against libumem (-lumem on the linker line). That means we need to pull libumem in at run time (LD_PRELOAD) and turn on its debugging (UMEM_DEBUG). In this example, our app is the Apache web server, httpd.

LD_PRELOAD=libumem.so.1 UMEM_DEBUG=default /opt/apache22/bin/httpd

Now, just find the process ID of httpd (perhaps as easy as a pgrep httpd). It is leaking, so we’ll use the process ID 666. To find the leaks (without killing or restarting anything), we can simply run the mdb umem helper ::findleaks against the running process.

echo "::findleaks -d " | mdb -p 666 > leak-profile.txt

Note you can also do this from a core… very nice.
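Both routes (live process and core) can be folded into one small shell helper. This is only a sketch: the function name findleaks and the MDB override variable are made up for illustration and are not part of mdb or libumem. It assumes gcore(1) was used to snapshot the process if you go the core route, and that the process was started with libumem and UMEM_DEBUG as above so that allocation stacks were recorded.

```shell
# findleaks: hypothetical helper, not part of mdb or libumem.
# Usage: findleaks <pid | core-file>
# MDB may be set to override the debugger binary; it defaults to mdb.
findleaks() {
    target=$1
    if [ -z "$target" ]; then
        echo "usage: findleaks <pid | core-file>" >&2
        return 2
    fi
    case $target in
        *[!0-9]*)
            # Anything containing a non-digit is treated as a core file path.
            echo "::findleaks -d" | "${MDB:-mdb}" "$target"
            ;;
        *)
            # All digits: attach to the live process by PID.
            echo "::findleaks -d" | "${MDB:-mdb}" -p "$target"
            ;;
    esac
}
```

Typical use would be findleaks 666 > leak-profile.txt against the live process, or gcore 666 (which writes core.666 and leaves the process running) followed by findleaks core.666 > leak-profile.txt.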

The output has a summary and then individual profile records grouped by stack trace of allocation, like:

umem_alloc_16384 leak: 203 buffers, 16384 bytes each, 3325952 bytes total
             ADDR          BUFADDR        TIMESTAMP           THREAD
                             CACHE          LASTLOG         CONTENTS
          1d36b60          1d37000   70235c07811a50              197


mod_ssl… Why are you so mean to me?
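When leak-profile.txt gets long, it helps to rank the caches by how much they have leaked before reading individual stack traces. The sketch below assumes every per-stack header line has the same shape as the sample above ("CACHE leak: N buffers, M bytes each, T bytes total"); the function name leak_totals is made up, and lines in any other shape are simply ignored.

```shell
# leak_totals: hypothetical helper that sums ::findleaks -d header lines.
# Reads the named files (or stdin) and prints "total_bytes buffers cache",
# largest leak first.
leak_totals() {
    awk '
        # Match only lines like:
        #   umem_alloc_16384 leak: 203 buffers, 16384 bytes each, 3325952 bytes total
        $2 == "leak:" && $NF == "total" {
            bytes[$1] += $(NF - 2)   # the number just before "bytes total"
            bufs[$1]  += $3          # the buffer count
        }
        END {
            for (c in bytes)
                printf "%d %d %s\n", bytes[c], bufs[c], c
        }
    ' "$@" | sort -rn
}
```

Run it as leak_totals leak-profile.txt; the cache at the top of the list is usually the first place to look.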