# mdb custom dmods

We're picking up right where we left off in the previous exercises. We had a core due to an error. We fixed the error by removing line 31 from myprog.c and rebuilding. The program runs now... it prints out some text and pauses... to simulate a long-running program that we need to debug without disrupting it too much.

Let's get a core!

Let's see if it's leaking:

An astute coder might quickly identify the problem just by reviewing the code. In larger systems, however, this can be far more daunting, so we'll take a leap of faith and consider this leakage puzzling.

One problem is that we have all of our data up in this hash table and it is quite challenging to iterate over all of it to see what's inside.

Before we dive in and build our own mdb dmod to iterate over a ck_ht, let's take a look at the actual leaks. Presumably, those leaks via strdup() should have strings in them. So let's print some out:

That clearly didn't work. This is some deep carnal knowledge about libumem, but umem allocations have a redzone and, to make it worse, the redzones are of variable size. Basically, if the allocation is over 16 bytes, the redzone is 16 bytes; otherwise it is 8.

In mdb, the only way I know to walk a set of addresses and print strings out at an offset is to hack a silly struct.

We can see that this came from the umem_alloc_16 slab, so it is 16 bytes or less and thus has an 8 byte redzone. We'll create two types to help us and then leverage one to print out all 35 leaked words in this bufctl. You'll notice that the redzone16 type has its redzone allocated with 10 bytes. Everything in mdb is in hex... 0x10 = 16t.
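For readers without the original listing handy, here is a minimal sketch of what such helper types might look like in C. The struct names mirror the ones mentioned above, but the field names, the `str` array sizes, and the two helper functions are my own assumptions for illustration, not the types actually used in the session.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t, offsetof */

/*
 * Hypothetical helper types: a fixed-size redzone followed by the
 * start of the user data. The str sizes are arbitrary for the sketch.
 */
struct redzone8 {
    char redzone[0x8];    /* 8-byte redzone: allocations of 16 bytes or less */
    char str[8];          /* the leaked string starts here */
};

struct redzone16 {
    char redzone[0x10];   /* 0x10 = 16 bytes: allocations over 16 bytes */
    char str[16];
};

/* The whole point of the types: str sits right after the redzone. */
static_assert(offsetof(struct redzone8, str) == 0x8, "redzone8 offset");
static_assert(offsetof(struct redzone16, str) == 0x10, "redzone16 offset");

/* Redzone size for an allocation, per the rule described in the text. */
size_t redzone_size(size_t alloc_size)
{
    return alloc_size > 16 ? 16 : 8;
}

/* Skip the redzone of a raw buffer and return the string behind it. */
const char *buf_string(const char *buf, size_t alloc_size)
{
    return buf + redzone_size(alloc_size);
}
```

Overlaying `struct redzone8` on a raw buffer address and reading `str` is the same trick as computing the offset by hand in `buf_string`; the struct just gives a debugger a type to print.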

Very interesting... what's so special about those words? I didn't just add lesbian into my output to get more search engine traffic... I was actually a bit surprised by it.

Now we have more clues and you can likely guess what's so screwed up here. But, for the sake of more complex debugging, we'll assume we're still completely stumped and decide we now need to go look in the hash table to see what is actually there. This requires some heavy lifting and is more the point of this post.

### Writing an mdb dmod

All those fancy commands ::findleaks and ::walk leakbuf were provided by a dmod... libumem's. Libumem provides some (quite sophisticated) utilities to mdb via a dmod. We will use the same approach to teach mdb how to walk over all the entries in a ck_ht.

This will seem a bit complicated, but remember that traversing a data structure only seems easy when the memory is yours and the pointers point to valid locations in your memory space. Here we're in mdb, in its memory space, and have a core file representing something completely foreign. We need to go through all the effort of reading memory out of a foreign virtual memory space.

I'm not going to explain this line by line, but the basic idea is that mdb_vread lets us read bytes of the debugged virtual memory space into our own.
To make things more complicated (or perhaps realistic) there are bits of the ck_ht structures that we need to look into that aren't exposed by the public headers... so evil hack time.
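To make the "foreign memory" idea concrete without needing mdb's headers, here is a small stand-alone simulation. Everything in it (`fake_target`, `fake_vread`, `fake_read_via_ptr`) is hypothetical and invented for this sketch; the argument order of `fake_vread` is only meant to resemble mdb_vread (buffer, byte count, target address), so check the mdb module API documentation before relying on that.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/*
 * Hypothetical stand-in for a core file: a blob of bytes that the
 * debugged program had mapped at address `base`.
 */
struct fake_target {
    uintptr_t base;
    const unsigned char *bytes;
    size_t len;
};

/*
 * Copy nbytes from target address addr into our own buffer.
 * Returns nbytes on success, -1 if the range isn't in the target.
 */
ssize_t fake_vread(const struct fake_target *t, void *buf,
                   size_t nbytes, uintptr_t addr)
{
    if (addr < t->base || addr - t->base > t->len ||
        nbytes > t->len - (addr - t->base))
        return -1;
    memcpy(buf, t->bytes + (addr - t->base), nbytes);
    return (ssize_t)nbytes;
}

/*
 * Following a pointer takes two reads: first fetch the pointer value
 * stored at ptr_addr, then fetch the bytes that value points at.
 * Dereferencing the target's pointer directly would read OUR memory.
 */
ssize_t fake_read_via_ptr(const struct fake_target *t, uintptr_t ptr_addr,
                          void *buf, size_t nbytes)
{
    uintptr_t target_ptr;

    if (fake_vread(t, &target_ptr, sizeof target_ptr, ptr_addr) < 0)
        return -1;
    return fake_vread(t, buf, nbytes, target_ptr);
}
```

The two-step read in `fake_read_via_ptr` is the pattern that repeats all over a dmod: every pointer hop in the target's data structure costs one more read into a local copy.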

A walker is made up of three parts: an initializer, a stepper, and a finalizer.
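As a rough illustration of that three-part shape, here is a stand-alone sketch that walks an array of hash slots and skips the empty ones. It does not use mdb's real walker API; the names (`slot_walk_init`, `slot_walk_step`, `slot_walk_fini`, the `SLOT_WALK_*` codes) are made up, and treating a key of 0 as "empty" is an assumption for the sketch rather than a statement about ck_ht internals.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical status codes, in the spirit of "next / done / error". */
enum { SLOT_WALK_NEXT = 0, SLOT_WALK_DONE = 1, SLOT_WALK_ERR = -1 };

/* Assumption for this sketch: key == 0 marks an empty slot. */
struct slot { uintptr_t key; uintptr_t value; };

/* Walker state carried between steps. */
struct slot_walk {
    const struct slot *slots;
    size_t capacity;
    size_t next;          /* index of the next slot to examine */
};

typedef int (*slot_cb)(const struct slot *s, void *cbdata);

/* Initializer: allocate the state and point it at the first slot. */
struct slot_walk *slot_walk_init(const struct slot *slots, size_t capacity)
{
    struct slot_walk *w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->slots = slots;
    w->capacity = capacity;
    w->next = 0;
    return w;
}

/* Stepper: find the next occupied slot and hand it to the callback. */
int slot_walk_step(struct slot_walk *w, slot_cb cb, void *cbdata)
{
    while (w->next < w->capacity && w->slots[w->next].key == 0)
        w->next++;
    if (w->next >= w->capacity)
        return SLOT_WALK_DONE;
    return cb(&w->slots[w->next++], cbdata);
}

/* Finalizer: release whatever the initializer allocated. */
void slot_walk_fini(struct slot_walk *w)
{
    free(w);
}

/* Driver, playing the debugger's role: init, step until done, fini. */
int slot_walk_all(const struct slot *slots, size_t capacity,
                  slot_cb cb, void *cbdata)
{
    struct slot_walk *w = slot_walk_init(slots, capacity);
    int status;

    if (w == NULL)
        return SLOT_WALK_ERR;
    while ((status = slot_walk_step(w, cb, cbdata)) == SLOT_WALK_NEXT)
        ;
    slot_walk_fini(w);
    return status;
}

/* Example callback: count the occupied slots. */
int count_cb(const struct slot *s, void *cbdata)
{
    (void)s;
    ++*(size_t *)cbdata;
    return SLOT_WALK_NEXT;
}
```

In a real dmod the debugger owns the driver loop and the slots live in the target, so each step would read the slot out of the core before invoking the callback.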

Referring back to the makefile from the previous post, we've got a one-line build for this, which on my system turns out to be:

gcc -I/opt/circonus/include/amd64 -m64 -fPIC -shared -o libck.so libck.c

Now, in mdb, we can load this dmod. Note that if you actually placed this file in /usr/lib/mdb/proc/amd64, then it would autoload when mdb realized that the core linked libck (super nice and convenient). For now, we'll assume it isn't installed and we need to load it from the local directory.

Cool... but actually that just comes from CTF. I did that to emphasize that the map element of that structure actually holds all the goodies and because libck itself was built without CTF, we can't see shit. There are a lot of words in this hash, so I'll just walk it and count them and then print out the first few. Again we need to pull a stupid trick to print out C strings via ::print.

Okay, we've got words in there. While not so useful for debugging, let's print out the value in the hash table to see if we can keep our sanity. The first word there is "sangaree", so let's see if the metadata about its length, etc. is correct.

Interesting... caps is zero (obviously), but I've not yet seen any capitalized words. Let's print all the caps...

No caps. Anywhere. Looking at the code, maybe that's the issue! We're lowercasing all the words.

I wonder if any of our leaked words are still in the hash table. One of our other interesting leaked words was "fortran".

Hot damn! So we have a leaked "fortran" and a not leaked "fortran." We have two "fortran"s. As any C programmer will tell you, the only acceptable number of fortrans is zero, but as a small hat tip to the mathematicians in the crowd, we'll allow one. Two? Bad mojo.

If we put a duplicate value into the hash table, we'll replace the old one without freeing it... thus our leaks.
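Here is a tiny stand-alone sketch of that failure mode and one way to avoid it. This is not myprog.c and it does not use ck_ht; `word_set`, `lower_dup`, and `word_set_put` are hypothetical names, and the linear-scan "table" is just the smallest thing that shows the idea: when an insert displaces an equal key, hand the old pointer back so the caller can free it.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 8

/* Hypothetical stand-in for the hash table: a small array of owned strings. */
struct word_set {
    char *words[MAX_WORDS];
    size_t count;
};

/* Return a malloc'd, lowercased copy of s (NULL on allocation failure). */
char *lower_dup(const char *s)
{
    size_t n = strlen(s);
    char *copy = malloc(n + 1);

    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i <= n; i++)   /* i == n copies the terminator */
        copy[i] = (char)tolower((unsigned char)s[i]);
    return copy;
}

/*
 * Store word in the set. If an equal key is already present, replace
 * it and report the old pointer through *displaced so the caller can
 * free it; dropping that pointer on the floor is the leak.
 * Returns 0 on success, -1 if the set is full.
 */
int word_set_put(struct word_set *set, char *word, char **displaced)
{
    *displaced = NULL;
    for (size_t i = 0; i < set->count; i++) {
        if (strcmp(set->words[i], word) == 0) {
            *displaced = set->words[i];
            set->words[i] = word;
            return 0;
        }
    }
    if (set->count == MAX_WORDS)
        return -1;
    set->words[set->count++] = word;
    return 0;
}
```

Putting `lower_dup("Fortran")` and then `lower_dup("fortran")` into the set leaves one entry and one displaced pointer; free the displaced pointer and the leak is gone.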

If we go look in /usr/dict/words, we'll find both "Fortran" and "fortran" there. Personally, I'm all for removing both of them. While I'm puzzled as to why there are two lesbians in my dictionary, I do understand that both "Max" (the proper noun) and "max" (the short form of maximum) would legitimately be there.