Esoteric Curio
Oren Hurvitz has a great post about LinkedIn's architecture. It's well-written and well thought out. Their architecture has evolved on what appears to be a steady and safe path of improvement. It is well worth a read.
I would like to comment on something I see repeated again and again that is likely to be misinterpreted by young scalability architects: the statement of what you should expect to lose when you scale up/out. Oren writes:
The presentation ends with some tips about scaling. These are oldies but goodies:
- Can’t use just one database. Use many databases, partitioned horizontally and vertically.
- Because of partitioning, forget about referential integrity or cross-domain JOINs.
- Forget about 100% data integrity.
- At large scale, cost is a problem: hardware, databases, licenses, storage, power.
- Once you’re large, spammers and data-scrapers come a-knocking.
- Cache!
- Use asynchronous flows.
- Reporting and analytics are challenging; consider them up-front when designing the system.
- Expect the system to fail.
- Don’t underestimate your growth trajectory.
Now, I agree with much of that. The spammers comment should be revised to "Fraud happens and the bigger you are, the bigger the bullseye." Be aware and protect your assets. Everything from Cache! on down: hard and fast rules. The cost argument is odd. While it is completely correct, it's also rather obvious. If your business model ties audience size and site use to revenue (which it should), then the cost should simply scale sub-linearly w.r.t. revenues (i.e. no big deal). However, there are a few that remain on that list that should be cherished and the loss of them should pain you.
"[You] Can't use just one database" -- this is a conclusion you should arrive at after analysis. We have one client that supports 10 million users on a cluster of partitioned databases. We have another that supports 35 million users on one database without issue and with room for growth.
"Because of partitioning, forget about referential integrity or cross-domain JOINs." Think. Think hard. Think harder. Sometimes it is possible to partition in a fashion that allows for integrity. While I'm sure (or at least hope) that the LinkedIn guys had some sleepless nights making the decision to break foreign constraints, it isn't conveyed. You should absolutely have some sleepless nights over a decision like that. My bank supports many more users and transactions than LinkedIn -- and it damn well better have FKs and 100% integrity. So, while you still may partition in such a fashion that requires a loss of enforced integrity, the decision should be a heavy one.
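As a minimal sketch of what partitioning "in a fashion that allows for integrity" can look like: if every child row is routed to the same partition as its parent, each partition can still enforce its own foreign keys. The example below uses Python with in-memory SQLite databases standing in for real partitions. The table names, the shard count, and the modulo routing are all assumptions made up for illustration, not anything LinkedIn or our clients actually run.

```python
import sqlite3

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example

# Hypothetical schema: messages belong to users via a foreign key.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE messages (
    message_id INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    body       TEXT NOT NULL
);
"""


def make_shards(n=NUM_SHARDS):
    """Create n independent databases, each enforcing its own FKs."""
    shards = []
    for _ in range(n):
        conn = sqlite3.connect(":memory:")
        # SQLite only enforces foreign keys when asked to, per connection.
        conn.execute("PRAGMA foreign_keys = ON")
        conn.executescript(SCHEMA)
        shards.append(conn)
    return shards


def shard_for(shards, user_id):
    """Route by the parent key so parent and child rows always co-locate."""
    return shards[user_id % len(shards)]


def add_user(shards, user_id, name):
    conn = shard_for(shards, user_id)
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO users (user_id, name) VALUES (?, ?)",
            (user_id, name),
        )


def add_message(shards, user_id, body):
    # Same routing key as the parent row, so the local FK can be checked.
    conn = shard_for(shards, user_id)
    with conn:
        conn.execute(
            "INSERT INTO messages (user_id, body) VALUES (?, ?)",
            (user_id, body),
        )


if __name__ == "__main__":
    shards = make_shards()
    add_user(shards, 7, "alice")
    add_message(shards, 7, "hello")
    try:
        add_message(shards, 8, "orphan")  # user 8 was never created
    except sqlite3.IntegrityError as exc:
        print("rejected by the shard's own constraint:", exc)
```

References that cross partitions (a message that points at a user living on another shard, say) are exactly where enforcement is lost, which is why the choice of partition key deserves those sleepless nights.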
"Forget about 100% data integrity." WTF? While I'm sure it was the end of the post and he was being smart, someone somewhere might actually take the advice to forget about data integrity. You never, ever, ever forget about it. We have some "one big database" architectures where data integrity has been an issue due to memory bit-flips (corrupt data on disk) -- it's a BFP (big f@#$ing problem) and we treat it that way. Sometimes you make an architectural decision that will make the loss of integrity much more probable (partitioning and losing FK constraints is a prime example). It's still something that should be handled with great attention and diligence. You should never forget about data integrity, and you should always put forth the effort required to get as close to 100% as possible. When you lose data integrity you end up with a big pile of shit in your database. I'll leave you with a rather crass metaphor:
There's an expectation that there is no shit on your living room floor. Don't shit in your living room. Don't let your dog shit in your living room. If you're a dog owner, you know your dog could have an accident. You bought the dog. You chose to increase the probability of finding shit in your living room. Don't ignore it or forget it. Clean up the shit when it happens. If you get suddenly ill while playing your Wii naked and shit on your living room floor (be it probable or improbable)... respect yourself -- clean it up. Never forget the goal: a 100% shit-free living room.



We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Understanding what is and isn't "premature" is what separates senior engineers from junior engineers.

Hi all!
Just a friendly reminder that we'll be having our first meetup tomorrow as planned. I thought as a good kick-off we could all collaboratively share what we do with PostgreSQL. We'll start off with a whirlwind tour of how OmniTI uses PostgreSQL, taking a brief look at ZFS, DTrace and large datasets. After that I think it would be good to get to know each other -- maybe we'll hit a local pub afterwards!
I look forward to seeing you there!
Meetup starts at 6:30pm
7070 Samuel Morse Dr. Ste 150
Columbia, MD 21046
If you have issues getting in the building, ring me on my cell -- it will be posted on the doors.
Best regards,
Theo
I just registered for OSCON. They say I should advertise that I am a speaker. Here goes.
For the last several years, I've presented multiple talks at the O'Reilly Open Source Conference. My Scalable Internet Architectures talk has been quite popular and drawn large crowds. It is an interesting talk as it doesn't really change with time. As I say, "if principles of good engineering changed frequently, I'd never drive on bridges." The talk is about sound engineering approaches to building really large consumer-facing websites. Almost all of it is open-source centric, which is why it fits so well at OSCON. While my Scalable talk was not accepted this year, I've got another talk lined up that will rock your world.
I am quite excited that my other proposal was accepted. This year I will be giving a session about using DTrace to perform "full-stack" introspection.
Using DTrace we will deep dive into the amazingly cool questions one can ask. Is my application really hitting disk? If so, what line of code is causing it? My process is being descheduled by the kernel, why? I have 100 Apache processes and some randomly segfault, how do I get a stack trace when that happens? The app I am running doesn’t have the right debugging output, I need to know more!
DTrace is an oracle. The value of the answers depends on the quality of the questions. Learn to ask good questions and prepare to be amazed at the possibilities.
I've given a variation on this presentation at a few places now (both internal to OmniTI and external) and had really positive feedback. I'll be taking these prior presentations and polishing them up for a 45-minute escapade that will open your eyes to new possibilities. DTrace is an amazing tool and once you get used to it, you can really take it for granted. I do. When people watch the presentation and say "by the power of Grayskull," I know I've made my point.
Come to OSCON. Immerse yourself in technology.

