Joe Gregorio | BitWorking | ETech ’07 Summary – Part 2 – MegaData

Common Themes

Some common themes are emerging. If you want to scale to the
petabyte level, or to a billion requests a day, you
need to be:

Distributed
The data has to be distributed across multiple machines.

No joins, and no referential integrity, at least at the data store level.

No one said this explicitly, but I presume there is a lot of de-normalization going on if you are
avoiding joins.

No transactions

Those constraints represent something fundamentally different
from a relational database.
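To make those constraints concrete, here is a minimal sketch in Python of what such a store might look like. Everything in it is an assumption for illustration: the `ShardedStore` class, the key names, and the sample records are hypothetical and do not come from any of the talks. Plain dicts stand in for machines, keys are routed by hash, the author's name is copied into the post record so that no join is needed, and the two writes are independent because there is no transaction to tie them together.

```python
import hashlib


class ShardedStore:
    """Toy key-value store spread across several in-memory 'machines'."""

    def __init__(self, num_shards):
        # Each dict stands in for a separate machine (an assumption of this sketch).
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # Use a stable hash so a given key always maps to the same shard.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key):
        return self._shard_for(key).get(key)


store = ShardedStore(num_shards=4)

# Two independent writes: there is no transaction spanning them, so the
# application has to tolerate one succeeding without the other.
store.put("user:7", {"name": "alice"})

# De-normalized: the author's name is copied into the post record, so
# reading a post never needs a join against the user records.
store.put("post:1", {"title": "Example post",
                     "author_id": "user:7",
                     "author_name": "alice"})

post = store.get("post:1")
print(post["title"], "by", post["author_name"])
```

The price of the copied `author_name` is that renaming a user means rewriting every post that carries the old name, and keeping the copies consistent is the application's job, since the store offers no referential integrity.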

The only difference between today and two years ago, when Adam Bosworth gave his talk “Database Requirements in the Age of Scalable Services”, is that there’s a lot more public knowledge about
what the likes of Google and eBay are doing.


