Note: This is the first part of a three-part blog series on NoSQL database technology.
Introduction
Many companies that end up as Volt Active Data customers started out by replacing a legacy RDBMS platform with a NoSQL database. Given the poor performance, lack of cloud-friendliness, and exorbitant costs traditionally associated with legacy database technology, the decision to move away from it is easy to understand. Unfortunately, NoSQL typically just replaces the legacy problems with a different set of problems.
While legacy RDBMS technology was inconvenient, it did almost anything you could want, albeit badly. A lot of NoSQL platforms started out as key-value stores or caches and fail (sometimes catastrophically) when you ask them to step outside their ‘comfort zones’.
Vendors have been adding features we associate with legacy relational database systems to NoSQL systems, but unfortunately there’s no ‘magic software fairy’ that can wave a wand and make complex new features work on a NoSQL platform without negative side effects.
Specifically, what we’ve seen is that companies come to us when they run into serious challenges with one or more of the following:
1. Transactions at scale
ACID transactions are one of the hardest things to get right at scale. The issue is not how the system handles a single transaction that modifies three things for a single user. It’s how it handles 50,000 such transactions per second, with different attributes being changed in different sequences.
Things really get ugly when multiple people can change the same data items at the same time. Techniques like optimistic locking can fall apart in such scenarios, because every conflicting write has to be detected and retried, and under heavy contention much of the work is thrown away. In the real world, we see this even in use cases where, in theory, simultaneous access to the same data should never happen.
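To make the optimistic locking point concrete, here is a minimal sketch in Python. It is not how any particular NoSQL product works: `VersionedStore`, `add_with_retry`, and `run_contended` are hypothetical names invented for this illustration, and the store is a toy in-memory dictionary. Each writer reads a value and its version, then writes back only if the version is unchanged, retrying on conflict.

```python
import threading


class VersionedStore:
    """Toy in-memory store (hypothetical): each key maps to (value, version)."""

    def __init__(self):
        self._data = {}
        # The lock only stands in for the atomic compare-and-set a real
        # database would provide; it is never held across a whole update.
        self._lock = threading.Lock()

    def read(self, key):
        with self._lock:
            return self._data.get(key, (0, 0))

    def compare_and_set(self, key, new_value, expected_version):
        """Write only if nobody else has written since we read."""
        with self._lock:
            _, current_version = self._data.get(key, (0, 0))
            if current_version != expected_version:
                return False  # conflict: our read is stale
            self._data[key] = (new_value, current_version + 1)
            return True


def add_with_retry(store, key, amount, max_attempts=100_000):
    """Optimistic read-modify-write; returns how many attempts it took."""
    for attempt in range(1, max_attempts + 1):
        value, version = store.read(key)
        if store.compare_and_set(key, value + amount, version):
            return attempt
    raise RuntimeError("gave up after too many conflicts")


def run_contended(workers=8, updates_per_worker=500):
    """Have several threads increment the same key; count total attempts."""
    store = VersionedStore()
    attempts_per_worker = []

    def work():
        attempts = 0
        for _ in range(updates_per_worker):
            attempts += add_with_retry(store, "balance", 1)
        attempts_per_worker.append(attempts)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    final_value, _ = store.read("balance")
    return final_value, sum(attempts_per_worker)


if __name__ == "__main__":
    final_value, total_attempts = run_contended()
    print(f"final value: {final_value}, attempts needed: {total_attempts}")
```

The final value always equals the number of successful updates, so correctness is preserved. The attempt count is what to watch: every attempt beyond one per update is a round trip that was wasted on a conflict, and in a real distributed store each of those is a network call rather than a dictionary lookup.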
Another thing to watch out for is what happens to in-flight transactions when a single node in a cluster fails while it’s busy. Can you end up with half a transaction committed?
2. Complex data structures
Most NoSQL systems assume that your data can be mapped into keys and values. Developers absolutely love this, as they no longer have pesky DBAs looking over their shoulder. But as new use cases are added to the system, the data structures get more complicated and become much harder to share between different teams of developers. This is one reason SQL became successful: it allows you to navigate data structures without having to worry about the minutiae of how the data is actually stored. A lot of vendors are now tacking SQL layers on top of their products, but you need to evaluate them very carefully, as these layers make assumptions about how organized your developers were when it came to storing the data.
3. Aggregation
Traditional databases excelled at questions like ‘Show me customers that we’ve shipped more than $10,000 worth of products to, but who have paid for only $8,000 or less’. These aggregate questions can be very hard to answer in products that lack built-in support for aggregation. The challenge is greatest when the answer has to be perfectly accurate.
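For reference, this is roughly what that question looks like when the database does the aggregation for you. The sketch below uses Python’s built-in `sqlite3` module purely as a stand-in for a relational engine. The `customers`, `shipments`, and `payments` tables, their columns, and the sample rows are all assumptions made up for this example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE shipments (customer_id INTEGER, amount INTEGER);
        CREATE TABLE payments  (customer_id INTEGER, amount INTEGER);
    """)
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex"), (3, "Initech"), (4, "Umbrella")],
    )
    conn.executemany(
        "INSERT INTO shipments VALUES (?, ?)",
        [(1, 7000), (1, 5000), (2, 12000), (3, 4000), (4, 15000)],
    )
    # Umbrella (id 4) has no payments at all.
    conn.executemany(
        "INSERT INTO payments VALUES (?, ?)",
        [(1, 8000), (2, 12000), (3, 4000)],
    )
    return conn


def find_underpaid_customers(conn, shipped_over=10_000, paid_at_most=8_000):
    """Customers shipped more than `shipped_over` who paid `paid_at_most` or less."""
    query = """
        SELECT c.name, s.total_shipped, COALESCE(p.total_paid, 0) AS total_paid
        FROM customers c
        JOIN (SELECT customer_id, SUM(amount) AS total_shipped
              FROM shipments GROUP BY customer_id) s
          ON s.customer_id = c.id
        LEFT JOIN (SELECT customer_id, SUM(amount) AS total_paid
                   FROM payments GROUP BY customer_id) p
          ON p.customer_id = c.id
        WHERE s.total_shipped > ?
          AND COALESCE(p.total_paid, 0) <= ?
        ORDER BY c.name
    """
    return conn.execute(query, (shipped_over, paid_at_most)).fetchall()


if __name__ == "__main__":
    for row in find_underpaid_customers(build_demo_db()):
        print(row)
```

In a store without aggregation support, the equivalent is usually application code that reads every shipment and payment record, sums them per customer, and hopes nothing changed while it was reading.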
4. Geo-replication
Whether it’s for business continuity or consistent low latency, you may find yourself facing a requirement to keep live copies of the database in multiple locations at once. This is a very complex thing for any database product to do, and is notoriously hard to retrofit onto either an application or an underlying data platform.
5. Foreign key access
As data models become more complex, the likelihood that you will need to access a value without knowing its ‘official’ key becomes much, much higher. Not all vendors have a good story or answer here.
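Here is a small sketch of what that looks like in practice when the platform only understands primary keys. Everything in it is hypothetical: `CustomerStore` is a toy in-memory class, and it assumes each customer record has an `email` field that is unique. Looking up a customer by email means either scanning every record or maintaining a second, hand-built index that the application must keep in sync on every write.

```python
class CustomerStore:
    """Toy key-value store (hypothetical) with a hand-maintained secondary index."""

    def __init__(self):
        self._by_id = {}        # primary: customer_id -> record
        self._id_by_email = {}  # secondary index: email -> customer_id

    def put(self, customer_id, record):
        """Write a record and keep the email index in step with it."""
        old = self._by_id.get(customer_id)
        if old is not None and old["email"] != record["email"]:
            # The email changed, so the stale index entry has to go.
            self._id_by_email.pop(old["email"], None)
        self._by_id[customer_id] = dict(record)
        # Assumes emails are unique; a duplicate would silently overwrite.
        self._id_by_email[record["email"]] = customer_id

    def get(self, customer_id):
        return self._by_id.get(customer_id)

    def get_by_email_scan(self, email):
        """No index: look at every record until one matches."""
        for record in self._by_id.values():
            if record["email"] == email:
                return record
        return None

    def get_by_email(self, email):
        """With the index: one lookup for the id, one for the record."""
        customer_id = self._id_by_email.get(email)
        if customer_id is None:
            return None
        return self._by_id[customer_id]


if __name__ == "__main__":
    store = CustomerStore()
    store.put("c1", {"name": "Ada", "email": "ada@example.com"})
    store.put("c1", {"name": "Ada", "email": "ada@example.org"})
    print(store.get_by_email("ada@example.com"))
    print(store.get_by_email("ada@example.org"))
```

In this single-process toy, the two writes inside `put` cannot get out of step. In a distributed store they are two separate writes, which brings you straight back to the transaction questions in point 1.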
Conclusion
The goal of this blog post is not to question the utility or effectiveness of NoSQL platforms. They are excellent when used for what they were designed for. But when you try to go beyond that, you tend to end up facing problems.
While many NoSQL vendors are working hard to address the issues we highlight above, one shouldn’t assume that their fixes will work any better than a legacy RDBMS, or that their fixes will be without negative consequences of their own.
To truly tackle all of the above while maintaining data consistency and accuracy, you need a data platform that can manage complex fast data at scale without the consequences of downtime, lost data, compromised security, or lost revenue. It’s not easy to find such a data platform, although it may be easier than you think, because you are on the Volt Active Data website. Have a look around or contact us today to get started.