What is mod_log_spread?
mod_log_spread is a patch to Apache's mod_log_config that provides
an interface for multicasting access logs over Spread. It utilizes the group
communication toolkit Spread, developed
at Johns Hopkins University's Center for Networking and Distributed Systems.
mod_log_spread was developed to solve the problem of collecting consolidated
access logs for large web farms. In particular, the solution needed to
be scalable to hundreds of machines, utilize a reliable network transport,
allow machines to be added or dropped on the fly, and impose minimal performance
impact on the webservers. The current version is 1.0.3p3. It fixes a stupid vhost logging bug and provides a complete and flexible log-writing solution.

What is wrong with the way things were...?
I wrote mod_log_spread because a popular commercial log writing application my company purchased was hard to support, non-scalable, and broke frequently. The scalability concerns
stemmed from its basic design. The particular product I was saddled with was a (Java-based) packet sniffer. It sniffs for HTTP transactions and recreates them from TCP
sessions. This presents immediate scalability concerns. How do you sniff a network pushing 70Mb of traffic with a single non-clustering packet sniffer? You don't.
mod_log_spread backs up this assertion by demonstrably recording 10-15% more traffic. Sniffers drop logs. Spread, the underlying protocol behind mod_log_spread, is
designed not to drop messages. This particular commercial sniffer is also a single point of failure. mod_log_spread can run two (or any number of) logging hosts
simultaneously with no extra network overhead. Further, it is not a black-box product: mod_log_spread is an open-source project.

So why not just write logs locally?
There is a 20-30% performance hit, and you have never known pain until you have tried to manage local logging across 60 machines. Trust me.

What are other Spread logging projects?
In addition to the Apache module, I also have a patch for the lightweight
web server thttpd to
allow similar logging capabilities. If you're running Spread on a highly
utilized Linux system, you may benefit from setting your Spread daemon to
real-time scheduling. Here's
a Perl module to make that easy to accomplish.
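To illustrate what "real-time scheduling" means here, the following is a minimal sketch in Python, not the Perl module mentioned above. It assumes a Linux host, a caller with root (or CAP_SYS_NICE) privileges, and that you already know the PID of the running Spread daemon. The function names, the SCHED_FIFO policy, and the default priority of 10 are illustrative choices, not anything shipped with mod_log_spread.

```python
import os


def clamp_priority(policy, requested):
    """Clamp a requested priority into the range the kernel allows for `policy`."""
    lowest = os.sched_get_priority_min(policy)
    highest = os.sched_get_priority_max(policy)
    return max(lowest, min(highest, requested))


def make_realtime(pid, priority=10, policy=None):
    """Try to move process `pid` to a real-time scheduling policy.

    Returns True on success and False if the caller lacks the privilege.
    A nonexistent PID raises ProcessLookupError, which is left to the caller.
    """
    if policy is None:
        policy = os.SCHED_FIFO  # assumption: FIFO; os.SCHED_RR is the other real-time option
    param = os.sched_param(clamp_priority(policy, priority))
    try:
        os.sched_setscheduler(pid, policy, param)
    except PermissionError:
        return False
    return True
```

A call such as make_realtime(spread_pid), where spread_pid is a placeholder for the daemon's PID, would request real-time scheduling for the daemon. A low priority is used deliberately: a runaway real-time process at a high priority can starve the rest of the machine.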

Availability
mod_log_spread is available under an Apache-style license. Basically,
redistribution is permitted as long as the copyright is preserved and included
intact. In addition, the authors of Spread have provided an unlimited
use license for using Spread to log web requests. Thanks CNDS!
Note: other uses of Spread may not be covered under this license.

ChangeLog
- 2000-05-27 Fixed brown-paper-bag parse error.
- 2000-06-04 Added Perl scheduler interface for Linux to help with running Spread
  in real time.
- 2000-06-07 Added Spread tuning docs. Fixed potential issue of indefinitely
  blocking on SP_multicast if Spread's queue is full.
- 2000-07-14 Added Theo Schlossnagle's wonderful spreadlogd to the dist as a replacement for log_writer.
- 2000-09-24 mod_log_spread becomes part of the Backhand Project!
- 2000-10-14 Added vhost support. Fixed potential hang.
- 2000-10-18 Completed vhost logging support.
- 2000-10-19 Fixed bug in $#vhost logging which caused log corruption under
  certain circumstances.
- 2000-10-21 Added sample configuration/tutorial to distribution tarball.
- 2000-11-03 Fixed Solaris support for spreadlogd. Fixed symbol conflict between mod_log_spread and mod_php which broke $#vhost logging when the two were used in conjunction.