About the time the NoSQL movement started to get legs, I started to get interested in relational databases. They’re awesome beasts, and horizontal scalability issues aside, they handle a lot of the very hard problems in the web programming space (e.g. durability and concurrency). But when data gets large and you can’t find the 2 million quarters in your sofa for that Oracle license (or the 2 million brain cells for that oh-so-perfect sharding scheme), the NoSQL databases start to show their appeal.
But maybe there’s another way. This article is mostly focused on relaxing the durability constraints of Postgres, but it also mentions a possible mixing of relational and non-relational systems. What a cool idea! Use the somewhat batch-oriented, distributed, non-relational system as a large backing store, and populate a traditional relational database (almost) on demand by transforming and moving (ETLing) the data there. With the right transforms, we get all the ad-hoc queries of our beloved star schema data warehouses without the expense of a big-honkin’ database server. Certainly these ‘transforms’ are the tricky part, and slicing the data out of the non-relational store along a particular dimension (probably time) will almost always be necessary, but the gains seem pretty great. Maybe we’ll call this approach a “pattern” someday.
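To make the idea a bit more concrete, here is a minimal sketch of what one of those on-demand loads might look like. Everything in it is an assumption for illustration: the `RAW_EVENTS` list stands in for the non-relational backing store, an in-memory SQLite database stands in for the relational side, and the table names (`dim_user`, `dim_action`, `fact_events`) and helper functions are hypothetical. A real setup would read from the actual store and load into something like Postgres.

```python
import sqlite3

# Hypothetical stand-in for records sitting in the non-relational store.
RAW_EVENTS = [
    {"day": "2010-03-01", "user": "alice", "action": "view", "hits": 3},
    {"day": "2010-03-01", "user": "bob", "action": "click", "hits": 1},
    {"day": "2010-03-02", "user": "alice", "action": "click", "hits": 2},
    {"day": "2010-03-03", "user": "bob", "action": "view", "hits": 5},
]


def extract(events, start, end):
    """Slice the store along the time dimension: start <= day < end.

    ISO-formatted date strings compare correctly as plain strings.
    """
    return [e for e in events if start <= e["day"] < end]


def load(conn, events):
    """Transform flat records into a tiny star schema and load them."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS dim_user (
            id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS dim_action (
            id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS fact_events (
            day TEXT,
            user_id INTEGER REFERENCES dim_user(id),
            action_id INTEGER REFERENCES dim_action(id),
            hits INTEGER);
    """)
    for e in events:
        # Add dimension rows only if they are not already present.
        conn.execute("INSERT OR IGNORE INTO dim_user (name) VALUES (?)",
                     (e["user"],))
        conn.execute("INSERT OR IGNORE INTO dim_action (name) VALUES (?)",
                     (e["action"],))
        # Resolve the dimension keys while inserting the fact row.
        conn.execute(
            """INSERT INTO fact_events (day, user_id, action_id, hits)
               SELECT ?, u.id, a.id, ?
               FROM dim_user u, dim_action a
               WHERE u.name = ? AND a.name = ?""",
            (e["day"], e["hits"], e["user"], e["action"]))
    conn.commit()


def hits_per_user(conn):
    """An example ad-hoc query against the freshly loaded slice."""
    return conn.execute(
        """SELECT u.name, SUM(f.hits)
           FROM fact_events f JOIN dim_user u ON u.id = f.user_id
           GROUP BY u.name ORDER BY u.name""").fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(conn, extract(RAW_EVENTS, "2010-03-01", "2010-03-03"))
    print(hits_per_user(conn))
```

The relational database here is disposable: it only ever holds the slice someone asked for, so it can be thrown away and rebuilt from the backing store. That is presumably also why relaxed durability on the relational side is tolerable in this kind of setup.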
Isn’t big data exciting?