CrateDB

The distributed SQL database for machine data

CrateDB Technical Overview White Paper

What is CrateDB?

CrateDB is a distributed SQL database built on top of a NoSQL foundation. It combines the familiarity of SQL with the scalability and data flexibility of NoSQL, enabling developers to:

  • Use SQL to process any type of data, structured or unstructured
  • Perform SQL queries at real-time speed, even JOINs and aggregates
  • Scale simply

Customers often use CrateDB to store and query machine data, because CrateDB makes it easy and economical to handle the velocity, volume, and diversity of machine and log data. Customers have reported CrateDB ingesting millions of data points per second while also querying terabytes of data in real time, 20x faster than their previous database and on 75% less database hardware.

Core Features

Simply scalable

Growing a database should be easy, and it is with CrateDB. Automatic data rebalancing and a shared-nothing architecture enable you to scale simply. Just add new machines to create and grow a CrateDB cluster. There’s no need to know how to redistribute data on the cluster because CrateDB does it for you.

Distributed SQL queries, aggregations, and search

CrateDB’s distributed SQL query engine features columnar field caches and a modern query planner. These give CrateDB the ability to perform aggregations, JOINs, sub-selects, and ad-hoc queries at in-memory speed. CrateDB also integrates native full-text search features, which enable you to store and query structured and unstructured data together. As a result, you no longer have to use separate SQL and search databases to manage tabular and non-tabular data.

Highly available

Even if things go wrong in your data center, CrateDB keeps running. Automatic replication of data across your cluster and rolling software updates help ensure that hardware failures and scheduled maintenance do not interrupt access to data. In addition, CrateDB clusters are self-healing: when nodes are added to the cluster, CrateDB automatically loads them with data.

Real-time data ingestion

Analytic data is often loaded in batches, with transactional locks and other overhead. By contrast, CrateDB eliminates locking overhead to enable massive write performance (e.g. 40,000+ inserts per second per node on commodity hardware). Furthermore, CrateDB can deliver millisecond-speed query performance, even while writes are in progress.
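One common way to reach high write throughput is to send many rows in a single request. The sketch below builds a JSON request body in the shape accepted by CrateDB's HTTP `_sql` endpoint (a parameterized `stmt` plus `bulk_args`). The table `sensor_readings`, its columns, and the helper `build_bulk_insert` are hypothetical names chosen for illustration, and nothing is sent over the network.

```python
import json


def build_bulk_insert(table, columns, rows):
    """Build a JSON body that inserts many rows with one statement.

    Assumption: the body targets CrateDB's HTTP `_sql` endpoint, which
    takes a parameterized `stmt` and a list of `bulk_args` rows.
    `table` and `columns` are illustrative and are not validated here.
    """
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row needs exactly one value per column")
    placeholders = ", ".join("?" for _ in columns)
    stmt = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return json.dumps({"stmt": stmt, "bulk_args": [list(row) for row in rows]})


# Hypothetical readings: (sensor id, epoch milliseconds, temperature)
body = build_bulk_insert(
    "sensor_readings",
    ["sensor_id", "ts", "temperature"],
    [("s-1", 1500000000000, 21.5), ("s-2", 1500000000000, 19.8)],
)
```

Batching amortizes per-request overhead across many rows. The batch size that works best depends on row size and cluster setup, so it is left to the caller here.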

Any data and BLOBs

CrateDB supports both relational data and nested JSON documents. All nested JSON attributes can be included in any SQL command. In addition, CrateDB provides BLOB storage, so you can store and retrieve BLOBs such as pictures, videos, or large unstructured files in a fully distributed cluster.
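As an illustration of addressing nested attributes from SQL, the sketch below renders the bracket-style subscript expressions that CrateDB uses for object columns. The table `machines`, the column `payload`, and the helper names are hypothetical, and the code only builds statement text.

```python
def subscript(column, *path):
    """Render a subscript expression such as payload['engine']['temperature'].

    Keys containing single quotes are rejected rather than escaped,
    to keep this sketch small.
    """
    for key in path:
        if "'" in key:
            raise ValueError("keys containing quotes are not handled in this sketch")
    return column + "".join(f"['{key}']" for key in path)


def select_nested(table, column, *path):
    """Build a SELECT statement that reads a single nested attribute."""
    return f"SELECT {subscript(column, *path)} FROM {table}"


# Hypothetical table and attribute path
query = select_nested("machines", "payload", "engine", "temperature")
```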

Time series analysis

Time series data is important for identifying trends and anomalies. CrateDB makes time series analysis fast and easy with automatic table partitions, which are like virtual tables that can be queried, moved or deleted. Partitioning data by time intervals delivers very fast time series query performance.
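The idea behind time-based partitioning can be sketched in a few lines of Python. This is a conceptual in-memory model, not CrateDB's implementation: the class `PartitionedTable`, the one-day interval, and the UTC bucketing are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def day_bucket(ts):
    """Truncate a timezone-aware timestamp to the start of its UTC day."""
    ts = ts.astimezone(timezone.utc)
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


class PartitionedTable:
    """Toy table that stores rows in one partition per day."""

    def __init__(self):
        self.partitions = defaultdict(list)  # day -> list of (ts, value)

    def insert(self, ts, value):
        # The partition is chosen automatically from the row's timestamp.
        self.partitions[day_bucket(ts)].append((ts, value))

    def query_day(self, day):
        # Only the matching partition is read; other days are never scanned.
        return list(self.partitions.get(day_bucket(day), []))

    def drop_partition(self, day):
        # Removing a whole day is a single dictionary operation.
        return self.partitions.pop(day_bucket(day), None) is not None


table = PartitionedTable()
table.insert(datetime(2017, 6, 1, 8, 30, tzinfo=timezone.utc), 21.5)
table.insert(datetime(2017, 6, 1, 17, 0, tzinfo=timezone.utc), 23.1)
table.insert(datetime(2017, 6, 2, 9, 15, tzinfo=timezone.utc), 20.4)
```

Because a query restricted to one interval touches only that interval's partition, its cost tracks the size of the interval rather than the size of the whole table. Expiring old data becomes a drop of entire partitions instead of a row-by-row delete.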

Geospatial queries

Location is important for many machine data analyses. For this reason, CrateDB can store and query geographical information using the geo_point and geo_shape types. You can control geographic index precision and resolution for faster query results, and also run exact queries with scalar functions like intersects, within, and distance.
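To make the idea of a distance query concrete, the sketch below computes the great-circle distance between two points with the haversine formula and uses it for a radius filter. It illustrates the kind of calculation involved and is not a description of how CrateDB implements `distance`. The Earth radius constant, the (longitude, latitude) argument order, and the function names are assumptions of this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed for this sketch)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(center, points, radius_m):
    """Return the points that lie at most radius_m meters from center.

    Each point is a (longitude, latitude) tuple in degrees.
    """
    return [p for p in points if haversine_m(center[0], center[1], p[0], p[1]) <= radius_m]


# One degree of longitude along the equator
one_degree = haversine_m(0.0, 0.0, 1.0, 0.0)
```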

Dynamic schemas

Unlike many other SQL databases, CrateDB schemas are totally flexible. You can add columns at any time without degrading performance or incurring downtime. This is great for agile development and fast deployment.

Transactional

CrateDB is eventually consistent, but offers transactional semantics. CrateDB is consistent at the row level, so each row is either fully written or not written at all. By offering read-after-write consistency, we allow synchronous real-time access to single records immediately after they are written.

Even though CrateDB does not support ACID transactions with rollbacks, it offers optimistic concurrency control through internal versioning, which allows write conflicts to be detected and resolved.
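The general pattern of version-based optimistic concurrency control can be sketched as follows. This is a minimal in-memory model under stated assumptions: `VersionedStore`, `VersionConflict`, and `update_with_retry` are hypothetical names, and the sketch does not show how CrateDB exposes its internal versions to clients.

```python
class VersionConflict(Exception):
    """Raised when a row changed since the caller last read it."""


class VersionedStore:
    """Toy key-value store that keeps a version number for each row."""

    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def insert(self, key, value):
        if key in self._rows:
            raise KeyError(f"{key!r} already exists")
        self._rows[key] = (1, value)

    def read(self, key):
        return self._rows[key]  # (version, value)

    def update(self, key, new_value, expected_version):
        version, _ = self._rows[key]
        if version != expected_version:
            # Another writer got there first: detect the conflict, do not overwrite.
            raise VersionConflict(f"expected v{expected_version}, found v{version}")
        self._rows[key] = (version + 1, new_value)
        return version + 1


def update_with_retry(store, key, change, attempts=3):
    """Resolve conflicts by re-reading the row and re-applying the change."""
    for _ in range(attempts):
        version, value = store.read(key)
        try:
            return store.update(key, change(value), version)
        except VersionConflict:
            continue
    raise VersionConflict(f"gave up after {attempts} attempts")


store = VersionedStore()
store.insert("row-1", 10)
version, value = store.read("row-1")
new_version = store.update("row-1", value + 1, expected_version=version)
```

No lock is held between the read and the write. A writer pays only when a conflict actually occurs, which suits workloads where concurrent updates to the same row are rare.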

Backups

CrateDB can save incremental snapshots of your database to storage. Snapshots contain the state of the tables in a CrateDB cluster at the time the snapshot was created, and can be restored into the cluster at any time.

Openness and flexibility

• Run CrateDB anywhere, in your data center or in the cloud

• Connect to CrateDB from almost any language, SQL application, or SQL BI tool

• Extend CrateDB functionality by writing your own plugins

• Deploy CrateDB as a container on Docker, Kubernetes, or others

• Use CrateDB Community Edition for free under the Apache 2.0 open source license

When is CrateDB a good choice for you?

Enterprises and startups often use CrateDB to power real-time machine data monitoring and analytics dashboards. However, CrateDB is a good choice for any application if you require:

• A horizontally scalable, relational SQL database with integrated search

• The economics and ease of an open source SQL database

• Fast search, aggregations, or ad-hoc queries

• The ability to query data in real time while writing data simultaneously

• Capacity for huge amounts of data (hundreds of terabytes)

• A highly available database with zero downtime

White Paper

CrateDB vs. Time Series DBs

In a benchmark querying 314 million rows of sensor readings, CrateDB executed 10x more queries per second under load than InfluxDB.

When is CrateDB not a good choice for you?

On the other hand, CrateDB may not be a good choice if you require:

• Strong (ACID) transactional consistency

• Highly normalized schemas with many tables and many joins
