High availability with CrateDB

One of the key benefits of a distributed database like CrateDB is high availability for always-on applications, which follows from its shared-nothing architecture. In contrast to a primary-secondary architecture, every node can perform every operation and all nodes are configured in the same way. This design is intended to deliver strong performance without downtime and with minimal operational effort.

CrateDB goes beyond multi-node setups: nodes can be distributed across multiple availability zones or data centers to further enhance availability.
Rolling software updates keep data accessible during maintenance operations.
CrateDB natively replicates data automatically to a configurable number of nodes in the cluster. Clusters are also self-healing: a node that rejoins the cluster after a failure automatically synchronizes with the latest data.

Achieving high availability with CrateDB requires a minimum of three nodes, so that a quorum can be maintained for electing the master node, which holds the cluster state. With three nodes the quorum is two, so the cluster can lose one node and still elect a master. The number of nodes is determined by the availability Service Level Agreement (SLA), which specifies how many nodes can fail before the cluster can no longer accept reads and writes. At least one replica is recommended; depending on the availability SLA, two or more replicas significantly increase failure tolerance.
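The quorum reasoning above can be sketched in a few lines of Python. This is a minimal illustration that assumes a simple majority rule over master-eligible nodes; the function names are hypothetical and are not part of any CrateDB API.

```python
def quorum_size(master_eligible_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return master_eligible_nodes // 2 + 1


def tolerated_node_failures(master_eligible_nodes: int) -> int:
    """How many nodes can fail while a majority can still elect a master."""
    return master_eligible_nodes - quorum_size(master_eligible_nodes)


def copies_per_shard(replicas: int) -> int:
    """Each shard exists as one primary plus its replicas."""
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    return replicas + 1


if __name__ == "__main__":
    # Assumption: every node in the cluster is master-eligible.
    for nodes in (1, 2, 3, 5):
        print(
            f"{nodes} node(s): quorum={quorum_size(nodes)}, "
            f"tolerated failures={tolerated_node_failures(nodes)}"
        )
```

Under this rule a two-node cluster tolerates no failures at all, which is why three nodes is the practical minimum. The replica count matters separately: with one replica each shard has two copies, so the data survives the loss of one node that holds a copy.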

CrateDB lets users decide, on a per-table level, how many replicas of the data are created. This choice determines how many nodes hold copies of each table's shards, providing fine-grained control over data redundancy.
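As a rough sketch of what per-table replication can look like in practice, the Python helper below builds the SQL statements that set the replica count through CrateDB's `number_of_replicas` table setting. The table name `sensor_readings`, its columns, and the helper names are assumptions made for illustration; the exact syntax should be checked against the CrateDB reference for the version in use.

```python
import re

# Conservative check for unquoted identifiers (an assumption of this sketch,
# stricter than what the database itself accepts).
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def _check(table: str, replicas: int) -> None:
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unsupported table name: {table!r}")
    if replicas < 0:
        raise ValueError("replicas must not be negative")


def create_table_sql(table: str, replicas: int) -> str:
    """Build a CREATE TABLE statement with a fixed replica count."""
    _check(table, replicas)
    # The columns are hypothetical and only make the statement complete.
    return (
        f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, payload TEXT) "
        f"WITH (number_of_replicas = {replicas})"
    )


def set_replicas_sql(table: str, replicas: int) -> str:
    """Build an ALTER TABLE statement that changes the replica count."""
    _check(table, replicas)
    return f"ALTER TABLE {table} SET (number_of_replicas = {replicas})"


if __name__ == "__main__":
    print(create_table_sql("sensor_readings", 1))
    print(set_replicas_sql("sensor_readings", 2))
```

The generated statements can be run through any CrateDB SQL client. Keeping the replica count as a table-level setting means a critical table can carry two replicas while a table that is easy to rebuild carries fewer.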

High Availability

Securing high availability in a shared-nothing architecture

CrateDB at Big Data Conference Europe 2022

CrateDB Architecture Guide

Additional resources

Documentation: Clustering

Documentation: Resiliency
