THE RKGBLOG

Database Scaling: War Stories From Leading Sites

I’ve noticed that smaller and larger e-tailers tend to run custom in-house e-commerce software, while mid-sized firms often depend on third-party e-commerce platforms.

For folks who’ve built their own infrastructure, database scaling is a key strategic concern. A slow database means a slow site, and a slow site suffers reduced conversion. Indeed, speed is perhaps the most important, but oft-overlooked, component of usability. (See, for example, the Nov ’06 Web 2.0 comments from Google’s Marissa Mayer on the importance of site speed.)

Often, your database is your bottleneck. Speed up database reads and writes and your site will zoom. Better hardware helps, but fast growing sites soon reach the point where it makes sense to scale out rather than up.
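To make “scaling out” concrete, here’s a minimal sketch of the read/write splitting many scaled-out shops use: writes go to a single master, while reads are spread across replicas. The connection names and the crude SQL inspection are illustrative assumptions, not a real driver API:

```python
import itertools

class QueryRouter:
    """Toy read/write splitter: writes hit the master, reads are
    round-robined across replicas. Names are placeholders."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        # Crude read detection; real middleware inspects queries
        # (and replication lag) far more carefully.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.master

router = QueryRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM orders"))      # a replica
print(router.route("UPDATE orders SET paid = 1"))  # the master
```

The appeal of this approach is that read capacity grows by adding commodity replica boxes, rather than by buying an ever-bigger single machine.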

If your in-house IT folks are responsible for your database strategy, I highly recommend this series of fascinating snippets on database approaches put up by O’Reilly last spring. Tim O’Reilly talked to prominent web sites about the nuts and bolts of their database strategies here:

* Second Life
* Bloglines and Memeorandum
* Flickr
* NASA World Wind
* Craigslist
* O’Reilly Research
* Google File System and BigTable
* Findory and Amazon
* Brian Aker of MySQL Responds

Of the nine, the O’Reilly Research post is likely the most relevant to mid-sized retailers: Roger Magoulas discusses how savvy DBA skills cut the run time for an important query from “query never finished” to a zippy “query runs under two minutes.” His tips: clean up your data (Magoulas describes the performance hit from having to deal with orphaned rows), partition your data sensibly, and use automation to keep your partitioning appropriate.
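As a rough illustration of the partitioning and automation tips (the table name, monthly naming scheme, and 90-day retention window below are my assumptions for the sketch, not Magoulas’s actual setup), a nightly job might route rows to date-based partitions and flag stale ones for archiving:

```python
from datetime import date, timedelta

def partition_for(d):
    # Monthly range partitions, named by year and month.
    # The "orders_" prefix is illustrative.
    return f"orders_{d:%Y_%m}"

def stale_partitions(existing, today, keep_days=90):
    """Return partition names whose month falls entirely before the
    retention window -- the check a nightly automation job could run
    before archiving or dropping old data."""
    cutoff = today - timedelta(days=keep_days)
    keep_from = partition_for(cutoff)
    # Zero-padded names sort chronologically, so string comparison works.
    return sorted(p for p in existing if p < keep_from)

print(partition_for(date(2007, 4, 15)))  # orders_2007_04
print(stale_partitions(
    ["orders_2007_01", "orders_2007_02", "orders_2007_03"],
    today=date(2007, 6, 15)))
```

Keeping the hot 90 days in small, current partitions means most queries scan a fraction of the table, which is exactly the kind of change that turns “query never finished” into “query runs under two minutes.”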

But even the war stories from the larger sites are instructive.

Last spring, Craigslist’s active data (the most recent 90 days) was 114 GB and 56 million rows, with another 238 GB and 96 million rows in their less-active archive (older than 90 days). (One suspects all these numbers are significantly bigger now, a year later.) And compared to most shops, Craigslist has practically no IT staff: the entire firm is 24 people. Last spring, at this scale, Craigslist was still running a single master database, but was in the process of moving to a cluster.

And here’s Second Life’s Ian Wilkes on SL’s preference for scaling-through-architecture vs. scaling-through-hardware:

I think the biggest lesson we learned is that databases need to be treated as a commodity. Standardized, interchangeable parts are far better in the long run than highly-optimized, special-purpose gear. Web 2.0 applications will require more horsepower with less money than “One Database” or his big brother “One Cluster All Hail The Central Cluster” will offer.

Great series.


  • Alan Rimm-Kaufman
    Alan Rimm-Kaufman founded the Rimm-Kaufman Group...
  • Comments
    2 Responses to “Database Scaling: War Stories From Leading Sites”
    1. Jet Fraklin says:

      I wanted to comment on this post since it is the top result on Google for “database scaling” right now. This is great information on why keeping scaling in mind is so important. I too have my war stories and have finally landed on database sharding as the best solution. You keep your existing design and DBMS in place and simply add a driver to support sharding. dbshards.com is one good example.

      Thanks,
      Jet

    Trackbacks
    Check out what others are saying...
    1. [...] The moment you deviate from the norm, things start becoming more complicated and less user friendly. The assumption that goes with this is that if a customer’s solution requires to do such work, he is also prepared to take the extra cost of employing a special administrator like a DBA or can pay for a special installation by the solution experts. Often this is also necessitated by the sheer amount of configuration required. Any DBA will tell you that no database of a fairly large size can run properly without custom tweaks ( de-normalize table A + put table B on a different server + use Indexing on table C and remove the one on Table D). Creating a self handling maintainable big solution is therefore really a big challenge, [...]