CrateDB uses Elasticsearch as a library for cluster management, node discovery, communication, and so on. However, CrateDB is a fully featured distributed SQL database with its own feature set, and it has completely replaced the Elasticsearch query engine with its own distributed SQL query engine.
This post covers some of the differences between CrateDB and Elasticsearch on its own.
This post was updated with new information on 21.12.2015.
CrateDB supports cross joins, inner joins, left outer joins, right outer joins, and full outer joins.
Elasticsearch does not support joins.
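To make the join types concrete, here is a minimal sketch of an inner join and a left outer join. It uses Python's built-in `sqlite3` module purely as a stand-in SQL engine, not CrateDB itself, and the `authors` and `books` tables are hypothetical names invented for illustration.

```python
import sqlite3


def join_demo():
    # In-memory SQLite database, used only as a stand-in for a SQL engine.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO books VALUES (10, 1, 'Notes');
    """)
    # Inner join: only authors that have at least one matching book.
    inner = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "INNER JOIN books b ON b.author_id = a.id ORDER BY a.id"
    ).fetchall()
    # Left outer join: every author, with NULL (None) where no book matches.
    left = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "LEFT OUTER JOIN books b ON b.author_id = a.id ORDER BY a.id"
    ).fetchall()
    conn.close()
    return inner, left
```

The inner join drops the author without a book, while the left outer join keeps that author and fills the missing title with `None`.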
CrateDB is more than just a database, and offers a complete solution for distributed blob storage that comes with replication and rebalancing. This offers you the opportunity to replace expensive network or cloud storage solutions with cheap commodity hardware.
Elasticsearch does not support blob storage. Typically, Elasticsearch is used together with GridFS or HDFS for blob storage.
CrateDB supports accurate aggregation functions.
Elasticsearch offers HyperLogLog aggregations, which can only approximate values.
CrateDB distributes aggregation calculations across the whole cluster using a simple modulo-based hashing algorithm. As a result, aggregation calculations use the complete memory and processing power of the cluster.
Elasticsearch scatters queries to all the nodes, and then gathers the responses on the node that is handling the client request, resulting in high memory usage, and under-utilisation of your cluster’s resources.
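The idea of spreading an aggregation across nodes by hashing can be sketched in a few lines. This is a simplified, single-process simulation and not CrateDB's actual implementation: the choice of `zlib.crc32` as the hash function and the helper names `node_for` and `distributed_sum` are assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def node_for(key: str, num_nodes: int) -> int:
    # Stable hash (crc32, an illustrative choice) modulo the node count,
    # so the same group key always lands on the same node.
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def distributed_sum(rows, num_nodes):
    # Phase 1: route each (key, value) row to the node that owns its key
    # and let that node accumulate the partial sum.
    buckets = [defaultdict(int) for _ in range(num_nodes)]
    for key, value in rows:
        buckets[node_for(key, num_nodes)][key] += value
    # Phase 2: every key lives on exactly one node, so the final result
    # is simply the union of the per-node results.
    result = {}
    for bucket in buckets:
        result.update(bucket)
    return result, buckets
```

Because each group key is owned by exactly one node, no single node has to hold the intermediate state for every group, which is the contrast with the scatter/gather approach described above.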
Post-aggregation filtering (i.e. GROUP BY ... HAVING) is fully implemented in CrateDB.
Elasticsearch has a number of limitations on this type of query.
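As a brief illustration of post-aggregation filtering, the sketch below groups rows and then filters the groups with `HAVING`. It again uses Python's `sqlite3` as a stand-in SQL engine rather than CrateDB, and the `orders` table and `busy_customers` function are hypothetical.

```python
import sqlite3


def busy_customers(min_orders: int):
    # In-memory SQLite database, used only as a stand-in for a SQL engine.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (customer TEXT, amount REAL);
        INSERT INTO orders VALUES ('alice', 10.0), ('alice', 5.0), ('bob', 7.0);
    """)
    # HAVING filters on the aggregated value, after GROUP BY has run.
    rows = conn.execute(
        "SELECT customer, COUNT(*) AS n, SUM(amount) AS total "
        "FROM orders GROUP BY customer "
        "HAVING COUNT(*) >= ? ORDER BY customer",
        (min_orders,),
    ).fetchall()
    conn.close()
    return rows
```

Calling `busy_customers(2)` keeps only the customers with at least two orders; a `WHERE` clause could not express this, because the count does not exist until the rows have been grouped.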
CrateDB supports the creation of partitioned tables, which transparently partition your data based on the value of a particular column.
Elasticsearch supports index aliases. These can be used to achieve the same result, but you have to implement the partitioning logic yourself in your application.
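To show what "partitioning logic" means in practice, here is a minimal sketch of routing rows into partitions by the value of one column and then reading only the partition a query needs. It is a conceptual illustration in plain Python, not CrateDB or Elasticsearch code, and the function names `partition_rows` and `query_partition` are hypothetical.

```python
from collections import defaultdict


def partition_rows(rows, column):
    # Group rows into one partition per distinct value of the chosen column.
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[column]].append(row)
    return dict(partitions)


def query_partition(partitions, value):
    # Only the matching partition is read; all others are skipped entirely.
    return partitions.get(value, [])
```

For example, partitioning log rows on a `day` column means a query for a single day touches only that day's partition. With a partitioned table the database handles this routing for you; with aliases, code along these lines has to live in the application.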
CrateDB supports the COPY FROM and COPY TO SQL statements for importing or exporting data in JSON format.
Elasticsearch does not offer an import or export feature.
Since version 0.46, CrateDB fully supports array types.
Elasticsearch does not strictly distinguish between arrays and core types (a string type can be a string or a string array, depending on how you insert it).
CrateDB allows you to update one or multiple documents with a
Elasticsearch only allows you to update a single document at a time, and you must reference the document by its
CrateDB allows you to insert data using the results of a query instead of manually passing in the data values.
This feature can be used to dynamically create new records as well as to restructure a table by renaming a field, changing a field’s data type, or by converting a normal table into a partitioned one.
Elasticsearch does not support insertion via query.
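The restructuring use case can be sketched with a standard `INSERT INTO ... SELECT` statement. The example uses Python's `sqlite3` as a stand-in SQL engine rather than CrateDB, and the `users_old` and `users_new` tables are hypothetical; it renames a field and changes a field's data type in a single statement.

```python
import sqlite3


def restructure():
    # In-memory SQLite database, used only as a stand-in for a SQL engine.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users_old (id INTEGER, fullname TEXT, age TEXT);
        INSERT INTO users_old VALUES (1, 'Ada', '36'), (2, 'Grace', '45');

        -- New layout: 'fullname' becomes 'name', 'age' becomes an integer.
        CREATE TABLE users_new (id INTEGER, name TEXT, age INTEGER);

        -- Populate the new table directly from a query on the old one.
        INSERT INTO users_new (id, name, age)
            SELECT id, fullname, CAST(age AS INTEGER) FROM users_old;
    """)
    rows = conn.execute(
        "SELECT id, name, age FROM users_new ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

No row data passes through the client here: the database reads from one table and writes to the other, which is what makes this pattern practical for migrating large tables.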
CrateDB ships with an administration user interface. This admin UI shows cluster, node, and table information. It also includes an interactive SQL console, notifications of CrateDB news, and a “Getting Started” section.