So a panic is a panic, right? Well, yes, in that it stops your server dead in its tracks, but how you deal with it is the important thing. I've been frustrated for a long time with Linux and FreeBSD (Linux more so) regarding the lack of good post-mortem crash analysis facilities.
Some background might be helpful here. One of the clusters that my company administers has 10 FreeBSD 5.3, 1 FreeBSD 5.2.1, 1 FreeBSD 5.5, 2 Linux CentOS 4.1 and 2 Solaris 10 machines running (all dual Xeons). And we get between 1 and 3 kernel panics per week. You might say bad hardware, or bad OSes or odd gravitational flux in the area -- you'd be wrong. There is nothing wrong with any of the 16 machines, and every one of them takes turns kernel panicking. There are a variety of vendors (the boxes aren't all identical). So, why so unstable? I believe our app is just pushing on these boxes in new ways. We're looking at sustained 500+ disk ops/second and 20-30 thousand open TCP sessions pretty full use of the 2GB of RAM in each box.
Where am I going? Well, FreeBSD has a feature called savecore. These machines are 400 miles away from our admin team and I can't (even with the help of several core FreeBSD developers) get savecore to function reliably. So, when one of these panics occurs, more often than not, I have nothing to work with upon reboot. Sigh.
FreeBSD sucks, right? No, take a look at Linux to see bad. The box panics or hangs. I can't leave it hung to do online debugging through the kernel debugger. Besides, I might not think of the crucial information I want to see until a day later. I need a core dump and, for the life of me, there is no way to get a good post-mortem system image and vmcore. WTF is going on here? How can people run this stuff in heavy production environments when there is no really sound way to troubleshoot kernel faults without attempting to repeat them?
Let's demonstrate this. If I have a kernel panic and get some sort of complicated register dump and stack trace, should I (a) cut and paste it and post it to a mailing list or (b) take a picture of my monitor with my digital camera. Well, if you answer (b) you are retarded (take offense, please). The fact that digital cameras are common linux crash dump reporting tools is truly sad. Now take a look at how many Solaris users use option (b).
So... I covered Linux and FreeBSD. What about Solaris, you ask? (If you didn't ask, you should.) Upon panic (which happens more rarely than on FreeBSD and Linux in our cluster) I get a reboot and all services are auto-restored before I can blink (thanks smf), but lo and behold, a present! Just for me! In /var/crash/nodename/: a complete vmcore.0 and unix.0 allowing for thorough offline (or online) post-mortem analysis. It's like I am using a real UNIX again.
Our Linux and FreeBSD installs outnumber our Solaris installs 20 to 1 at least. This rant is to tell people that just because Linux is popular doesn't mean it can't learn a few things from Solaris. The code is out there: grok it, adopt the methodologies and good engineering practices. I like Linux, but I need savecore, I need it now, and I need it to unquestioningly, reliably work out-of-the-box.
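As a rough illustration of how one might check for that present after a reboot, here is a minimal Python sketch. It assumes the dumps land as numbered unix.N / vmcore.N pairs in a single directory, as described above; the function names are mine and are not part of any Solaris tool.

```python
import re
from pathlib import Path

# Matches the numbered file names mentioned above, e.g. unix.0 and vmcore.0.
DUMP_NAME = re.compile(r"^(unix|vmcore)\.(\d+)$")


def find_crash_dumps(crash_dir):
    """Map each dump number to the set of kinds ('unix', 'vmcore') found for it."""
    dumps = {}
    for entry in Path(crash_dir).iterdir():
        match = DUMP_NAME.match(entry.name)
        if match and entry.is_file():
            kind, number = match.group(1), int(match.group(2))
            dumps.setdefault(number, set()).add(kind)
    return dumps


def complete_dumps(crash_dir):
    """Return the sorted dump numbers that have both a unix.N and a vmcore.N."""
    found = find_crash_dumps(crash_dir)
    return sorted(n for n, kinds in found.items() if kinds == {"unix", "vmcore"})
```

Calling complete_dumps("/var/crash/nodename") would list the dump numbers that are ready for analysis; a number with only one of the two files is left out, since a lone vmcore without its matching kernel image is much less useful.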
Monday, September 19. 2005 at 03:52
Gosh, Linux actually has a kernel debugger now?
Monday, September 19. 2005 at 07:29
What's wrong with adding "(c) have it logged by your serial console" to your crash possibilities? As for digital camera photographs of oopsen, they're so prevalent that I've considered writing a web CGI to run OCR on photographs of 80x25-style text. Didn't get good results from standard OCR packages, though; they're meant for more cursive writing.
Monday, September 19. 2005 at 07:48
The point is twofold. One, I need more than just a register dump and stack trace; I need interactive exploration to see the state of various kernel data structures. Two, it's a production box and I need it up again fast; post-mortem analysis is the "right way" as it provides the most information at the least downtime cost.
Monday, September 19. 2005 at 08:13
Have you tried netdump (http://www.redhat.com/support/wpapers/redhat/netdump/)? It comes with RHEL though. I haven't used it personally but got good feedback and will give it a try in the future.
Monday, September 19. 2005 at 09:04
I think I have a machine suitable for that at that location and the NICs in my boxes seem supported.
I'll hand this off to our SA team for a trial.
Monday, September 19. 2005 at 11:53
Actually there are two solutions in RHEL (CentOS). Netdump will send the dump across the network as someone stated previously. However, diskdump is the solution that is most like Solaris' savecore. Just install the diskdumputils RPM if you do not have it already. It requires that you create a dump partition however. And of course, it only works with certain SCSI controllers so your mileage may vary.
Wednesday, November 29. 2006 at 10:13
kdb (http://oss.sgi.com/projects/kdb/) is also pretty useful, when you segfault you get thrown into a debugger which is also accessible over a serial line (which you can then send over the network); it generally works in all but the most dead oops's
Wednesday, November 29. 2006 at 11:07
Perhaps the point is that Linux doesn't fail nearly as often as FreeBSD and Solaris.
Take offense, please ;)
Wednesday, November 29. 2006 at 12:19
If you're running "sustained 500+ disk ops/second and 20-30 thousand open TCP sessions pretty full use of the 2GB of RAM in each box" then your applications are outpacing the limits of any possible software.
To put it bluntly, you need to stop whatever horseshit you're doing. Scale it back. Or end it.
No software in the universe will tolerate that kind of overhead. No OS. No machine. Nothing. Ever. Anywhere. At any time. At any point in the universe.
There's a simpler, smarter way to fix your broken systems and avoid your kernel panics. Cut out the crap queries that produce those 20 to 30 thousand open TCP sessions. Whatever's causing 'em, redesign the front end. Cut it out. End it. Shut it down.
If you're running a database, limit the number of queries by changing the front end. If you're running a web server, limit the number of connections by reducing the number of buttons for web surfers to click on. If you're running a forum, make it harder for people to post. This can all be done. Easily.
The solution isn't the insane and frankly demented extremes of coredump fetishism you complain you need, but a competently designed front end that prevents such absurd overuse of your machines' resources.
Like all programmers, you will of course deny this and instead falsely assert that the solution is "simple" and "easy" and "quick" but that the [fill in the blank: OS / apps / API / protocols] are "garbage" and "designed by idiots" and "worthless" and "unusable."
And, like all programmers, you'll be lying and badly deluded.
Programmers are the world's most overpaid typists. If programmers had been put in charge of developing fire in the Neolithic era, we'd all still be living in caves in the dark.
Friday, December 1. 2006 at 01:27
That's just funny. I'll let your "no OS" not do that while my Solaris and Linux happily go on operating under those conditions.
We have clients running Linux with 70k concurrent TCP sessions without issues. And some others that sustain well over 500 I/O ops/second on commodity boxes running Solaris.
They aren't "queries", they are user sessions. Carrier class software has to support millions of users with tens of thousands (or more) concurrent connections. If you can do a thousand concurrent connections on one box, you can go buy 100. Me? I'll do 50k concurrent sessions and buy two.
Wednesday, November 29. 2006 at 13:06
Well, Solaris' license is not compatible with the GPL, so having the code is as useless as not having it when porting features to Linux. Worse, if you read the sources you might be doing something illegal.
Hopefully Sun will release Solaris as GPL; then either Linux will be better or we'll just happily use Solaris (I would, at least).
Maybe Debian GNU/Solaris ;)
Wednesday, November 29. 2006 at 13:55
I have used netdump/netconsole for a group of 20 colo'd Sun V20zs. It has caught a couple of dumps so far but I don't have them around anymore.
--
syslog-ng to a centralized log server might help pick up any patterns of behavior just prior to the crashes.
--
You might try supermon on a couple of the machines that are esp. crashy - it is low resource usage/high resolution monitoring
http://supermon.sourceforge.net/
Again, you may see something prior to the crashes that gives hints to the mechanism precipitating the crashes.
--
DTrace on Solaris to instrument your app?
--
Also, I read someone's blog who said they were getting better performance for their app on DFBSD than on FreeBSD - they didn't say what the app was but it seems doubtful that there are shills creating elaborate blogs to astroturf for DFBSD.
--
Ack! CR/LFs are not preserved by your comment system.
--
Ack! I hate when year old stuff is posted to reddit, etc.
:(
Friday, December 1. 2006 at 01:30
It might be a year old, but sadly the situation simply hasn't improved all that much. FreeBSD (6) still panics under our app and the post-mortem tools on both are unreliable at best.
Wednesday, November 29. 2006 at 15:26
In the true Linux spirit, if you miss the feature -- why not contribute it yourself?
Wednesday, November 29. 2006 at 15:42
Never used it, but there's also Linux Kernel Crash Dump - http://lkcd.sourceforge.net
Wednesday, November 29. 2006 at 17:25
Hmm, has anyone tried the Linux Kernel Core Dump patch *LKCD* and related utilities? It has a site: http://lkcd.sourceforge.net/
I don't know, in Debian, for example, there are kernel core debugger support packages ( *crash* , *lcrash* - netdump and diskdump are supported too), a package for core dump configuration ( *dumputils* ) and the kernel patch itself as a package.
Anyone care to report their findings and availability on other packaging platforms?
-Kvorg
Thursday, November 15. 2007 at 12:13
LKCD works great, and is compatible with multiple core dump formats. crash and lcrash are part of it, if I remember right.
netdump is nice, but it does (or can) cause "panic loops" when it tries to activate the network interface to send out the dump.
LKCD has SGI behind it - and their stuff (XFS, PCP, FAM, ...) seems to be very well tested. On top of that, it seems that LKCD has gotten a life of its own separate from SGI now.
netdump comes from and is developed by Red Hat.
http://administratosphere.wordpress.com
Wednesday, January 31. 2007 at 16:02
Hmm.. I really wonder if this is a troll's blog since I don't recall seeing this on the FreeBSD questions mailing list. Anyway, sysctl -a is your friend. Run it out of cron and compare the output (you could use diff crudely, or perl + mrtg slickly); you're going to find something getting lower and lower and lower until you're out, and then the system panics. Possibilities are kern.ipc.nmbclusters and anything with mbuf in it in the kern.malloc variable.
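A crude version of that comparison might look like the Python sketch below. It assumes FreeBSD-style "name: value" lines from two saved sysctl -a snapshots; the function names and the example.free_buffers counter are made up for illustration, and the sample numbers are not real measurements.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines, keeping only entries whose value is an integer."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        try:
            values[name.strip()] = int(value.strip())
        except ValueError:
            continue  # strings and multi-field values are skipped in this sketch
    return values


def changed_counters(before_text, after_text):
    """Return {name: delta} for integer sysctls whose value differs between snapshots."""
    before = parse_sysctl(before_text)
    after = parse_sysctl(after_text)
    return {
        name: after[name] - before[name]
        for name in before.keys() & after.keys()
        if after[name] != before[name]
    }


# Two invented snapshots; example.free_buffers is a hypothetical counter.
before = "kern.ostype: FreeBSD\nkern.ipc.nmbclusters: 25600\nexample.free_buffers: 900\n"
after = "kern.ostype: FreeBSD\nkern.ipc.nmbclusters: 25600\nexample.free_buffers: 100\n"
print(changed_counters(before, after))  # {'example.free_buffers': -800}
```

Anything with a steadily negative delta across many snapshots is a candidate for the resource that runs out just before the panic.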
I can't understand why savecore isn't reliable for you. Did you define an additional swap partition and set dumpdev to it in /etc/rc.conf?
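For what it's worth, a quick sanity check of that setting could be sketched in Python like this. It assumes rc.conf holds simple name="value" lines; the function names are mine, and the device path in the usage note is hypothetical.

```python
import shlex


def rc_conf_settings(text):
    """Parse simple name="value" lines from rc.conf-style text into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, raw = line.partition("=")
        # shlex strips the quotes and any trailing '# comment'.
        parts = shlex.split(raw, comments=True)
        settings[name.strip()] = parts[0] if parts else ""
    return settings


def dump_device(text):
    """Return the configured dumpdev, or None if it is missing, empty, or "NO"."""
    value = rc_conf_settings(text).get("dumpdev", "")
    return None if value.upper() in ("", "NO") else value
```

Feeding it the contents of /etc/rc.conf, a line such as dumpdev="/dev/da0s1b" would come back as "/dev/da0s1b", while a missing or "NO" entry returns None. It only confirms that the variable is set, not that the partition exists or is large enough to hold a dump.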