So, I do a lot on the scalability front. I spend a lot of time reviewing people's architectures and helping them understand how things can change to make sure that their data infrastructure can survive substantial growth. Scalability isn't a new concept, but up until 10 years ago, there was so much concentration on performance that people who specialized in scalability had to make a point. What was that point?
Scalability has little to do with performance; moreover, a scalable solution is one that does not need to change when the problem size increases. Vertical scaling (which is buying bigger machines to crunch the problem faster) is inherently limited by the hardware available on today's market; and while it is always getting faster, it is likely that your cool new idea will manage to exceed the capacity of one of these monsters. Horizontal scaling (adding more machines, not faster ones) is what "scalability" is all about. And clearly this has nothing to do with performance.
Point made. However, somewhere along the way, a sense of relativity was lost. The sentiment became that horizontal scalability is so important that performance is irrelevant. This is a scary outlook, but one I see more and more often.
If you have an architecture that requires five or six machines, improving performance by a factor of two will save you two or three machines. What's the big deal, right? The linear extrapolation of this should be clear, but I'll spell it out. If traffic increases by a factor of 100, you now need 600 boxes instead of 300... but what's 300 boxes between friends, right?
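To make that extrapolation concrete, here is a minimal sketch of the arithmetic. It assumes capacity scales perfectly linearly with server count, which is the best case for a horizontally scalable system; the function name `servers_needed` and its parameters are made up for illustration.

```python
import math


def servers_needed(baseline_servers, traffic_multiplier, speedup=1.0):
    """Estimate the server count after traffic grows.

    Assumes perfectly linear horizontal scaling: each server handles a
    fixed share of the load, and a per-server speedup divides the count.
    """
    return math.ceil(baseline_servers * traffic_multiplier / speedup)


# Six machines today, traffic grows by a factor of 100.
unoptimized = servers_needed(6, 100)
# Same growth, but each box does twice the work.
optimized = servers_needed(6, 100, speedup=2.0)

print(unoptimized, optimized, unoptimized - optimized)
```

The gap between the two numbers grows in direct proportion to traffic, so the same two-fold improvement that saves three machines today saves 300 at 100 times the load.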
Now, you may argue that going in and optimizing a system is never likely to result in a two-fold speedup. Several years ago, I'd have led with the assumption that you were right. However, this incessant focus on scalability over performance has caused a shift in reality. The concept that horizontal scalability is the only key to an architecture's success has caused an unhealthy mental attitude in the industry.
There is a computer science mantra (one of many) stated by Donald Knuth as: "Premature optimization is the root of all evil." I've argued for a long time that defining "premature" for a particular situation is what separates a senior engineer from a junior one. Many times I hear the argument for inefficiencies as, "we didn't want to prematurely optimize it and now that we see it, it is the right time to fix it." That's generally a bad position to be in. The awful thing now is that I witness optimization delinquency on a regular basis. When I point out a specific candidate for optimization, the response is: "buying more servers is cheaper than fixing that problem." Because of this delinquency, achieving a two-fold performance boost is increasingly common.
This mentality that "I can just buy more servers because my infrastructure scales horizontally" is being applied far too liberally. Reducing a server farm from 600 to 300 has a significant impact on the capital investment and on the ongoing maintenance and operational costs. Those costs show up as additional space, additional power, additional labor, and unnecessary software engineering to solve problems that would otherwise not exist.
If your site doesn't scale horizontally, you will be in a world of hurt when faced with serving ever-growing needs. If you can optimize your applications, every step on the path of executing your scaling plan will be more reliable, more manageable, and dramatically CHEAPER.
Scalability experts everywhere: don't forget what we should be expecting from a single box.
Wednesday, May 23. 2007 at 22:30
Theo, what you are forgetting is that hardware and data center costs roll up into a different department's cost center, and since they force me to use ancient Red Hat... what motivation is there to use fewer servers? :-)
Saturday, June 2. 2007 at 16:54
Nice article - performance per server shouldn't be completely ignored.
Saturday, March 14. 2009 at 15:25
Too many machines beget more machines; it's like they reproduce! With a huge setup come huge dev, QA, staging, and launch machine requirements.
Not all features are created equal. Attack your feature set first, then right-size your solution to the business ROI.
Vertical scaling has its place.