I’ve wanted to write a book for a while – mainly because people seemed genuinely interested in the presentations I have given over the past several years at the ApacheCon conferences.

While my academic background and most of my project background are in the area of high-availability and resource allocation in clusters (i.e. load-balancing), it seemed that a book on it would be either too high level or too applied. The problem with high-level conversation is that you end up boring practical people to death and not providing enough hands-on information to the engineers. Too low-level is usually not a problem in the field of technology, but HA/LB systems are traditionally commercial, and a hands-on book starts to look a lot like a product’s user manual.

Where to go from here? Well, I am not a big fan of a book on one piece of software, specifically when it is a low-level systems tool. Books on Oracle or MySQL or Apache are quite useful if you want to learn about the specific software products. The problem with HA/LB is that it is a small layer on a larger architectural issue that is deep and all-encompassing. That issue is scalability. When concentrating on one technology in particular, it is easy to deviate from the purpose – the purpose being “the methodology of scaling systems.”

Scalable Internet Architectures will be my stab at tackling various aspects of scalability in today’s web architectures. The focus is on scalable system design methodology and “thinking scalable.” Topics include:

  • Managing large systems
  • High availability
  • Load balancing
  • Highly-distributed static content delivery
  • Dynamic caching technologies
  • Databases and Database Replication
  • Clustered Logging
  • Building highly customized tools to tackle acute performance problems

Additionally, I will talk (at various depths) about the following technologies (short list):

  • Apache, thttpd, Squid
  • Linux, FreeBSD
  • MySQL, Oracle, Postgres
  • Wackamole, CARP, VRRP
  • DNS, Anycast (shared-IP)
  • Spread
  • perl, PHP, mod_perl, Apache::ASP
  • RHT, Splash!, memcached, NBD
  • … loads more.

The goal of the book is to look at the typical stresses in a production architecture as the demands on that architecture increase, to walk through solutions, and to understand how they alleviate those stresses.

To appease myself, there will be a sufficient amount of theoretical talk (a.k.a. idealism without regard to practicality). By throwing idealist perspectives in where appropriate, the hot burning flames of product and technology propaganda can be kept under control.

I hope that readers are able to gain two vital skills from reading the book:

  1. the ability to look at systems, understand what the bottlenecks are or will be, and apply basic techniques to increase horizontal scalability,
  2. the ability to take every new technology with a grain of salt and thoroughly understand its limitations, and therefore better understand how it can (or can’t) be placed to benefit an architecture.

… wish me luck.