What is CrateDB?

CrateDB is an open source, distributed SQL database with integrated search that makes it simple to store and analyze massive amounts of structured and unstructured data in real-time.

CrateDB is ideal for machine data. It's able to ingest millions of sensor readings or log entries per second and query them at in-memory speed using the same SQL syntax that already exists in your applications and BI tools.

After switching to CrateDB, customers have reported being able to query terabytes of data 20x faster, while reducing database hardware costs by 75%.

Do you love containers? CrateDB runs perfectly as a stateful container.

Core Features

Simply scalable

CrateDB is horizontally scalable with automatic data sharding and rebalancing. That means you can resize your CrateDB cluster simply by adding machines, without having to know how or where to re-shuffle or re-index data on the cluster.
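As a rough illustration of what automatic sharding means, the following Python sketch routes rows to a fixed number of shards by hashing a routing key. It is a simplified model, not CrateDB's actual routing code: the function name `shard_for`, the CRC32 hash, and the shard count are all assumptions chosen for the example.

```python
import zlib
from collections import Counter

NUM_SHARDS = 6  # assumed shard count, chosen only for this illustration


def shard_for(routing_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a routing key to a shard number using a stable hash."""
    # zlib.crc32 is deterministic across runs, unlike the built-in hash().
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


if __name__ == "__main__":
    # Route 10,000 hypothetical sensor ids and show how evenly they spread.
    keys = [f"sensor-{i}" for i in range(10_000)]
    distribution = Counter(shard_for(key) for key in keys)
    for shard, count in sorted(distribution.items()):
        print(f"shard {shard}: {count} rows")
```

In this model the row-to-shard mapping never changes when machines are added; only whole shards would move between nodes, which is one way a cluster can grow without re-indexing individual rows.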

Distributed SQL queries, aggregations, and search

CrateDB's distributed SQL query engine features columnar field caches and an advanced query planner, which give CrateDB the unique ability to perform aggregations, sub-selects, and ad-hoc queries at in-memory speed. CrateDB integrates native full-text search features, which enable you to query structured or unstructured data with SQL. No more having to use separate SQL and search databases to manage tabular and non-tabular data.

Highly available

Automatic replication of data across your cluster and rolling software updates help ensure that hardware failures and scheduled maintenance do not interrupt database availability. CrateDB clusters are self-healing: when nodes join or rejoin the cluster, CrateDB automatically populates them with data and balances the cluster.

Real-time ingestion

Most analytic workloads get ingested in batch loads, often with transactional locks and other overhead. CrateDB allows lock-free ingestion with massive write performance (e.g. 40,000+ inserts per second per node on commodity hardware). CrateDB can deliver millisecond-speed query performance, even while writes are in progress.
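On the client side, high-volume ingestion is usually done by sending rows in batches rather than one at a time. The sketch below shows that pattern with the standard Python DB-API `executemany` call. To stay self-contained it runs against an in-memory `sqlite3` database, not CrateDB; the `readings` table, its columns, the batch size, and the `?` placeholder style are assumptions made for the example.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, float]  # (sensor_id, value), a hypothetical row shape


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def ingest(conn, rows: Iterable[Row], batch_size: int = 1000) -> int:
    """Insert rows in batches via DB-API executemany; return rows sent."""
    cursor = conn.cursor()
    total = 0
    for batch in batched(rows, batch_size):
        cursor.executemany(
            "INSERT INTO readings (sensor_id, value) VALUES (?, ?)", batch
        )
        total += len(batch)
    conn.commit()
    return total


def demo() -> int:
    """Ingest 2,500 synthetic readings and return the stored row count."""
    conn = sqlite3.connect(":memory:")  # stand-in database for the sketch
    conn.execute("CREATE TABLE readings (sensor_id TEXT, value REAL)")
    rows = ((f"sensor-{i % 10}", float(i)) for i in range(2500))
    ingest(conn, rows, batch_size=1000)
    return conn.execute("SELECT count(*) FROM readings").fetchone()[0]
```

Because `ingest` only relies on `cursor()`, `executemany()`, and `commit()`, the same function should work with any DB-API connection that accepts `?` placeholders.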

Any data and BLOBs

CrateDB's columnar store supports both relational data and nested JSON documents. All nested JSON attributes can be included in any SQL command. In addition, CrateDB provides BLOB storage so you can persistently store and retrieve BLOBs (typically pictures, videos, or large unstructured files), providing a fully distributed cluster solution for BLOB storage.

Time series analysis

CrateDB allows the automatic partitioning of any table. Partitions are virtual tables that can be queried, moved or deleted like any other table. Partitioning data by time period enables very fast time series query performance.
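To see why partitioning by time period speeds up time series queries, consider this toy Python model. It keeps one bucket of rows per calendar day, scans only the buckets that overlap the requested time range, and drops a whole day in one step. The `PartitionedTable` class and its methods are hypothetical and exist only for this illustration; in CrateDB itself, partitioning is declared with a `PARTITIONED BY` clause when the table is created.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, List, Tuple

Reading = Tuple[datetime, float]  # (timestamp, value)


class PartitionedTable:
    """Toy table that stores rows in one bucket per calendar day."""

    def __init__(self) -> None:
        self.partitions: Dict[date, List[Reading]] = defaultdict(list)

    def insert(self, ts: datetime, value: float) -> None:
        # The partition is derived from the timestamp automatically.
        self.partitions[ts.date()].append((ts, value))

    def query(self, start: datetime, end: datetime) -> List[Reading]:
        """Return rows with start <= ts < end, scanning only matching days."""
        result: List[Reading] = []
        for day, rows in self.partitions.items():
            if start.date() <= day <= end.date():  # skip unrelated days
                result.extend(r for r in rows if start <= r[0] < end)
        return sorted(result)

    def drop_partition(self, day: date) -> int:
        """Delete a whole day at once; return the number of rows removed."""
        return len(self.partitions.pop(day, []))


if __name__ == "__main__":
    table = PartitionedTable()
    table.insert(datetime(2024, 1, 1, 8), 1.0)
    table.insert(datetime(2024, 1, 2, 9), 3.0)
    print(table.query(datetime(2024, 1, 1), datetime(2024, 1, 2)))
    print(table.drop_partition(date(2024, 1, 1)))
```

A query over one day touches one bucket regardless of how many days are stored, and removing old data is a single bucket deletion rather than a row-by-row scan.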

Geospatial queries

Store and query geographical information of any kind using the geo_point and geo_shape types. For fast results, use geographic indices with given precision and resolution, or run exact queries with scalar functions like intersects, within, and distance.
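For intuition about what an exact distance query has to compute, here is a small Python sketch of a great-circle distance filter based on the haversine formula. It is an illustration only: the source does not say which formula or Earth radius CrateDB's `distance` function uses, and the helper names `haversine_m` and `within_radius`, as well as the (longitude, latitude) tuple order, are assumptions of this example.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees, by assumption

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (assumed constant)


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance between two points in metres."""
    lon1, lat1 = a
    lon2, lat2 = b
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, h)))


def within_radius(points: Iterable[Point], center: Point, radius_m: float) -> List[Point]:
    """Return the points whose distance to `center` is at most `radius_m`."""
    return [p for p in points if haversine_m(p, center) <= radius_m]
```

An exact filter like this has to evaluate every candidate point, which is why an index with a chosen precision can answer the same question faster at the cost of some accuracy.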

Dynamic schemas

Contrary to many other scale-out solutions, CrateDB's schemas are totally flexible. You can add columns at any time without any penalty or re-indexing requirements. This is great for agile development and fast deployments.


CrateDB is eventually consistent, but offers transactional semantics. CrateDB is consistent at the row level, so each row is either fully written or not written at all. By offering read-after-write consistency, CrateDB allows synchronous, real-time access to single records immediately after they are written.

Even though CrateDB does not support ACID transactions with rollbacks, it offers Optimistic Concurrency Control through internal versioning, which allows write conflicts to be detected and resolved.
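The general idea of optimistic concurrency control can be sketched in a few lines of Python. Each row carries a version number, and an update succeeds only if the caller presents the version it last read; otherwise the caller re-reads and retries. This is a generic in-memory model under stated assumptions, not CrateDB's implementation: `VersionedStore`, `VersionConflict`, and `increment_with_retry` are hypothetical names invented for the example.

```python
from typing import Any, Dict, Tuple


class VersionConflict(Exception):
    """Raised when a row changed since the caller last read it."""


class VersionedStore:
    """In-memory rows, each stored as (version, value)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[int, Any]] = {}

    def insert(self, key: str, value: Any) -> int:
        if key in self._rows:
            raise KeyError(f"row {key!r} already exists")
        self._rows[key] = (1, value)
        return 1

    def read(self, key: str) -> Tuple[int, Any]:
        return self._rows[key]

    def update(self, key: str, value: Any, expected_version: int) -> int:
        """Write only if the stored version matches; return the new version."""
        current_version, _ = self._rows[key]
        if current_version != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._rows[key] = (new_version, value)
        return new_version


def increment_with_retry(store: VersionedStore, key: str, max_attempts: int = 3) -> int:
    """Resolve conflicts by re-reading and retrying the increment."""
    for _ in range(max_attempts):
        version, value = store.read(key)
        try:
            store.update(key, value + 1, expected_version=version)
            return value + 1
        except VersionConflict:
            continue  # another writer got there first; read again
    raise VersionConflict(f"gave up on {key!r} after {max_attempts} attempts")
```

No locks are held between the read and the write, so readers and writers never block each other; the cost is that a writer with a stale version has to retry.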


Incremental snapshots can be created at any time and saved to storage (e.g. file system, HDFS, or AWS S3). Snapshots contain the state of the tables in a CrateDB cluster at the time the snapshot was created, and can be restored into the cluster at any time.

Open and flexible

Plug-in architecture - If you require special application-specific functionality, you can extend CrateDB by writing your own plug-ins.

Microservices - CrateDB's shared-nothing architecture allows it to run perfectly in ephemeral environments such as Docker, Kubernetes, CoreOS, and Mesosphere. CrateDB is a scalable, containerized persistence layer that scales along with your app containers.

Use any language - Between drivers provided with CrateDB (JDBC, Ruby, Python, PHP, ODBC, etc.) and drivers from the community (Ado, Erlang, etc.), you can use almost any language with CrateDB.

Open Source - CrateDB is written in Java and licensed under the Apache 2.0 license. Enterprise licensing (SLA, indemnification, bug-fix escalation, etc.) is also available.

What can I use CrateDB for?

Enterprises and startups have deployed CrateDB clusters to power real-time analytics, real-time dashboards (network traffic, security events), IoT backends (sensor data, telemetry data), ad-tech (web traffic), telecom applications (call logs, CDRs), and user-facing web and mobile apps (large tables with fast-growing and dynamic data).

Generally speaking, CrateDB fits well if:

  • You require a horizontally scalable, relational SQL database with integrated search
  • Your applications and dashboards require fast search, aggregations, and ad-hoc queries
  • You need to query data in real time while writing data simultaneously
  • You have huge amounts of data (trillions of records in hundreds of TBs)
  • Your database must be highly available and never go down
  • You want to be able to scale out horizontally as you grow
  • You want to be faster, more agile and save money on licenses and hardware

CrateDB isn’t a good choice if you have strong consistency requirements (ACID) and very complex relational schemas (e.g. highly normalized with many tables and many joins).  

