Public and private clouds empower businesses to move away from traditional error-prone architectures and run applications with five- and six-nines availability. Business applications can be spun up on demand, instantly and cost-effectively. Database applications have always been a major component of enterprise infrastructure, but these applications—and, specifically, relational databases—still have a long way to go when it comes to using the power of the cloud. They were designed as large monolithic applications and present a considerable challenge when you’re trying to run them reliably in a scalable manner.
Traditional databases for distributed environments are generally deployed as multiple isolated database instances that can be queried in a unified manner. For test/dev environments, numerous physical copies of the production database are created in the background, leading to data sprawl. Cloud integration with these solutions is limited: the focus is on the locality of data instead of using the power of distributed systems. Let’s consider different options for solving these problems for the cases of highly available and test/dev databases.
Highly Available Databases
Highly available databases in the cloud era are database instances that are scalable, fault tolerant and compatible with any private or public cloud. They’re built to deliver business continuity without user-experience glitches owing to any kind of hardware or network failure. The core design principles are to remove any single points of failure and to deliver a smooth failover experience.
Active/Passive Database Replica Pair
The first option is to deploy a database in a master/replica architecture so that a single master server, at any given time, serves database requests. A replication policy synchronously replicates data from the master server by using either the database vendor’s replication feature or an external third-party replication tool. As soon as a master server fails, the replica server takes over and uses replicated data to restart the database where it left off before the failure.
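The failover logic described above can be sketched as follows. This is a toy in-memory model (the class and method names are illustrative, not any vendor's replication tool): writes go to the master and are synchronously copied to the replica, and a master failure promotes the replica so it restarts from the replicated data.

```python
class DatabaseServer:
    """Toy stand-in for a database instance holding replicated state."""
    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.data = {}

    def write(self, key, value):
        self.data[key] = value

class ReplicaPair:
    """Active/passive pair: a single master serves requests at any
    given time, and writes are synchronously replicated."""
    def __init__(self, master, replica):
        self.master, self.replica = master, replica

    def write(self, key, value):
        if not self.master.healthy:
            self.failover()
        self.master.write(key, value)
        if self.replica.healthy:          # skip a failed node
            self.replica.write(key, value)  # synchronous replication step

    def failover(self):
        # Promote the replica; it picks up where the master left off,
        # using the data replicated before the failure.
        self.master, self.replica = self.replica, self.master

pair = ReplicaPair(DatabaseServer("site-a"), DatabaseServer("site-b"))
pair.write("balance", 100)
pair.master.healthy = False   # simulate a master failure
pair.write("balance", 90)     # triggers failover; no acknowledged data is lost
print(pair.master.name, pair.master.data["balance"])  # site-b 90
```

Note that the synchronous replication step in `write` is exactly where the performance concern from the next paragraph appears: the write cannot be acknowledged faster than the network round trip to the replica.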
This approach may suffer from database performance and reliability issues. Database operations will be only as fast as the network link between the two locations. A third-party replication tool may fail to provide steady data replication and may land the database in an inconsistent state after the failover.
Database With Built-In High Availability
The second option is to look for a database solution that provides built-in high availability. This capability is more common in NoSQL databases such as Cassandra and MongoDB. They replicate data consistently at the database layer itself, making the database fault tolerant. This approach may not work for many enterprises, however, because traditional relational database management systems lack the capability. Also, hybrid- and multicloud replication are generally not an option with these solutions.
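Built-in high availability in systems of this kind typically rests on quorum replication. The following toy sketch illustrates that idea only (it is not Cassandra's or MongoDB's actual implementation): with N replicas, a write needs W acknowledgments and a read consults R replicas, and choosing R + W > N guarantees the read set overlaps the write set.

```python
class QuorumStore:
    """Toy quorum replication: a write succeeds once W of N replicas
    acknowledge it; a read consults R replicas and takes the newest
    version. R + W > N guarantees a read overlaps the latest write."""
    def __init__(self, n=3, w=2, r=2):
        assert r + w > n
        self.replicas = [dict() for _ in range(n)]
        self.n, self.w, self.r = n, w, r
        self.version = 0

    def write(self, key, value, down=()):
        self.version += 1
        acks = 0
        for i, rep in enumerate(self.replicas):
            if i in down:
                continue              # replica unreachable
            rep[key] = (self.version, value)
            acks += 1
        if acks < self.w:
            raise RuntimeError("write failed: quorum not reached")

    def read(self, key):
        # Newest version among the R replicas consulted wins.
        answers = [rep[key] for rep in self.replicas[:self.r] if key in rep]
        return max(answers)[1]

store = QuorumStore()
store.write("user:1", "alice", down={2})  # tolerates one failed replica
print(store.read("user:1"))               # alice
```

The point of the sketch is the fault-tolerance trade-off: the write above succeeds despite a dead replica, but if too many replicas are unreachable the quorum fails and the write is rejected rather than silently lost.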
Test/Dev Databases

Cloning a database is a common requirement for test/dev and analytics. A production database always runs in an isolated infrastructure, and one or more copies of this database are created for backups, big-data analytics and quality assurance.
Snapshot and Cloning
To make a database copy to another site or cloud, the operator can take a snapshot and physically clone it to a different location. Doing so involves setting up a policy that periodically takes a snapshot of the database and clones it to a predefined location. Users need a solution like Oracle RMAN, which can track the changes between snapshots, take consistent backups and recover when needed. But such a solution doesn’t exist for all databases, and using remote data centers and clouds for creating such clones remains a complex, if feasible, process.
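Conceptually, the change tracking that such tools perform can be sketched as follows. This is a toy model (the block-device structure is hypothetical, not RMAN's on-disk format): a bitmap of dirtied blocks means each incremental clone ships only what changed since the last snapshot, rather than the full database.

```python
class BlockDevice:
    """Toy volume: fixed-size 'blocks' plus a change-tracking set,
    mimicking the bookkeeping behind incremental backups."""
    def __init__(self, nblocks):
        self.blocks = [""] * nblocks
        self.changed = set()   # blocks dirtied since the last snapshot

    def write(self, i, data):
        self.blocks[i] = data
        self.changed.add(i)

    def snapshot(self):
        # Only changed blocks go into the delta; tracking resets afterward.
        delta = {i: self.blocks[i] for i in self.changed}
        self.changed.clear()
        return delta

def clone(target, delta):
    # Apply an incremental snapshot at the remote site or cloud.
    for i, data in delta.items():
        target.blocks[i] = data

prod = BlockDevice(4)
copy = BlockDevice(4)
prod.write(0, "users"); prod.write(1, "orders")
clone(copy, prod.snapshot())      # initial sync ships two blocks
prod.write(1, "orders-v2")
delta = prod.snapshot()           # incremental: only block 1
clone(copy, delta)
print(len(delta), copy.blocks[1])  # 1 orders-v2
```

A consistent clone additionally requires quiescing or journaling the database around the snapshot, which is the part that the sketch omits and that makes cross-data-center cloning genuinely hard.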
Highly Available Storage for Private, Hybrid and Multicloud
A better solution to iron out these problems is to completely abstract the database layer from the storage layer and let the storage solution handle high availability for the application. Software-defined storage (SDS) can furnish this capability by providing data protection from different kinds of hardware and software failures. SDS can grant the flexibility of using any kind of storage hardware at the back end, including physical on-premises servers and virtual cloud instances.
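One way an SDS layer delivers that protection is through placement policy. The following toy sketch (node and domain names are illustrative, not any vendor's algorithm) spreads replicas across distinct failure domains, whether those are on-premises racks or cloud regions, so that no single hardware failure loses all copies:

```python
def place_replicas(nodes, count):
    """Toy SDS placement: prefer one replica per distinct failure
    domain (rack, data center or cloud), falling back to reuse
    only when domains run out."""
    chosen, used_domains = [], set()
    # First pass: one replica per unused failure domain.
    for node, domain in nodes:
        if len(chosen) == count:
            break
        if domain not in used_domains:
            chosen.append(node)
            used_domains.add(domain)
    # Second pass: fill remaining slots if domains ran out.
    for node, domain in nodes:
        if len(chosen) == count:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen

# Back-end mix of physical on-premises servers and cloud instances.
nodes = [("srv1", "onprem-rack1"), ("srv2", "onprem-rack1"),
         ("vm1", "cloud-a"), ("vm2", "cloud-b")]
print(place_replicas(nodes, 3))  # ['srv1', 'vm1', 'vm2']
```

Because the policy treats a cloud instance and a physical server as interchangeable back ends, the same mechanism works across private, hybrid and multicloud deployments.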
The main challenge of this approach is finding the right solution: one that integrates easily with the database application of a customer’s choice and that’s also compatible with other databases to enable transitions. Such a solution should also be able to run on any private or public cloud and should blur the boundaries between on-premises and public-cloud locations.
Database as a Service
Applications such as social media, investment and fantasy gaming that need five- or six-nines availability along with worldwide accessibility are best suited to running their entire database systems in the cloud. Database as a service (DBaaS) provides an easy way to spin up databases in the cloud, eliminating the time spent buying servers, architecting the infrastructure and building a large team to manage the databases. Vendors provide multiple ways to consume DBaaS offerings, such as virtual machines with databases installed, database schema as a service and databases with specialized dedicated hardware, for RDBMS as well as NoSQL databases.
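The consumption model can be sketched minimally as follows. The client API here is hypothetical (no real vendor exposes these exact names); the point is that provisioning reduces to a single call rather than buying and architecting hardware:

```python
import uuid

class ToyDBaaS:
    """Hypothetical DBaaS control plane; names are illustrative,
    not a real provider's API."""
    # The three consumption models mentioned above.
    OFFERINGS = {"vm-with-db", "schema-as-a-service", "dedicated-hardware"}

    def __init__(self):
        self.instances = {}

    def create_database(self, engine, offering="vm-with-db"):
        assert offering in self.OFFERINGS
        db_id = uuid.uuid4().hex[:8]
        # Provisioning replaces server purchase and infrastructure design.
        self.instances[db_id] = {
            "engine": engine,
            "offering": offering,
            "endpoint": f"{engine}-{db_id}.example.com:5432",
        }
        return db_id

svc = ToyDBaaS()
db = svc.create_database("postgres")
print(svc.instances[db]["endpoint"])
```

The same one-call model is also what creates the vendor lock-in discussed next: each provider's control plane has its own, incompatible version of `create_database`.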
The following problems must be solved when considering any cloud-based database installation:
- Vendor lock-in: Each provider has its own orchestration framework, which makes it harder for consumers to move from one provider to another.
- Data synchronization: There’s always a need for an external tool to copy data from one location to another in a consistent manner. These tools are generally disruptive, expensive and complex, and they act as a huge bottleneck in building an environment spanning multiple locations.
- Cloud cost analysis: A careful budget analysis should be done before deciding which applications and data belong in the cloud versus on premises. Budgets can easily spiral out of control if the hybrid cloud isn’t designed with the understanding that simplicity and flexibility come at a cost.
Databases have been around for more than 50 years, running successfully in traditional on-premises environments. The time has come for businesses to grab a competitive edge by employing cloud solutions for running modern databases.
About the Author
Gaurav Yadav is founding engineer and product manager at Hedvig. He has more than 10 years of experience working in storage, databases, distributed systems and virtualization. His previous experience includes working with a search-engine startup, Google and Oracle.