Tag Archives: graph databases

The next big thing, part 3: Taking the relational out of relational databases

Part of an ongoing series.

The relational database is an extremely powerful tool. But sometimes data isn’t very relational, and sometimes transactional, relational, integrity is not as important as it is for, say, a bank. This is one reason why so many sites can get away with mySQL backed by myISAM tables — they’re fine if you’re read-heavy and data integrity is not mission-critical.

Some new projects have sprung up which provide key-value stores or simpler kinds of databases without all the overhead and inflexibility of a relational database.

On the other hand, sometimes data is way more interrelated than a traditional relational database is prepared to handle. Sometimes different kinds of items (i.e. rows) in a database can be related to many other kinds of items in that database, and sometimes end users can create not just new items or new relationships, but new kinds of relationships between items. This type of database is called a graph database, and there are also projects pushing the boundaries of relational in this completely opposite direction.

Pretty much everywhere I interviewed back in February 2008 was either building their own graph database, working on an existing one, or repurposing a relational database (or, in one case, a search backend), to kinda, sorta behave like one. The w3c, not one to be left behind when there’s a specification to be written, is even working on a SQL-inspired query language intended to search them1.

Most applications have some combination of totally un-relational data that can go in a key-value store, some strictly relational data that belongs in a SQL database, and some flexible, highly relational data that belongs in a graph database.

What will happen when these alternative databases start giving traditional relational databases a run for their money? Well, sharding, caching, and normalization all start to sound a lot more complex when the data is in a few different kinds of databases — but then again, maybe optimization won’t be as necessary if a single SQL database isn’t doing all the heavy lifting. Object-relational mappers (and the web frameworks that use them) might need to talk to, and abstract away from, different kinds of databases2.

And the different types of data won’t always be easily separated along table boundaries. Maybe these different types of databases will talk to each other, or maybe they will mature into über-databases that understand lots of different types of data relationships.

But the monolithic, strictly relational, master SQL database is eventually going to go the way of Cobol3.

  1. Of course, if it’s anything like other technologies designed by the w3c, it’s a steaming pile. []
  2. Some can already handle talking to multiple SQL databases, and of course there’s two-phase commit. []
  3. Or Kobol. []