Variety
Query structured and semi-structured data. Benefit from full-text and geo-spatial indexing
Real-time
Everything is indexed, so you can run ad hoc queries in real time and explore your data in the moment
Run anywhere & everywhere
Run on-premises on anything from a single node to hundreds of nodes, or deploy to production on our managed cloud
Fast Aggregation
Instantly aggregate across billions of records, then adjust and iterate to get the results you need
Simple Scalability
Shared-nothing architecture that’s simple to scale. Run worry-free with built-in high availability
SQL & NoSQL
Index high data volumes with a distributed SQL DBMS built atop Apache Lucene
CrateDB is data-shape agnostic
Ready to handle structured and unstructured data
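For illustration, a single table can hold fixed columns next to a dynamically typed object, a geo point and a full-text indexed text column. The table and column names in this sketch are hypothetical, not taken from this page:
/*
 * Hypothetical example: structured columns side by side with a dynamic
 * object for semi-structured payloads, a geo_point and a full-text index.
 */
CREATE TABLE IF NOT EXISTS doc.sensor_readings (
   ts TIMESTAMP WITH TIME ZONE,
   device_id TEXT,
   location GEO_POINT,
   payload OBJECT(DYNAMIC),
   log_line TEXT INDEX USING FULLTEXT WITH (analyzer = 'english')
);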
/*
* Based on sensor data, this query calculates:
* - time-buckets of 10 seconds
* - different aggregations per time-bucket and host group
*/
SELECT DATE_BIN('10 seconds'::INTERVAL, m.timestamp, 0) AS period,
       h.host_group,
       MIN(m.fields['usage_user']) AS "min",
       AVG(m.fields['usage_user']) AS "avg",
       MAX(m.fields['usage_user']) AS "max"
FROM telegraf.metrics m
LEFT JOIN telegraf.hosts h ON h.host_name = m.tags['host']
WHERE m.tags['cpu'] = 'cpu-total'
  AND m.timestamp > NOW() - '5 minutes'::INTERVAL
GROUP BY 1, 2
ORDER BY 1 DESC;
+------------+------------+--------+--------+--------+
| period     | host_group |    min |    avg |    max |
+------------+------------+--------+--------+--------+
| 1621515050 | Demo Hosts |  0.125 |  0.125 |  0.125 |
| 1621515040 | Demo Hosts | 0.1753 | 0.1753 | 0.1753 |
| 1621515030 | Demo Hosts | 0.2251 | 0.2251 | 0.2251 |
| 1621515020 | Demo Hosts | 0.2251 | 0.2251 | 0.2251 |
| 1621515010 | Demo Hosts | 0.1251 | 0.1251 | 0.1251 |
| 1621515000 | Demo Hosts | 0.2501 | 0.2501 | 0.2501 |
| 1621514990 | Demo Hosts | 0.1501 | 0.1501 | 0.1501 |
| 1621514980 | Demo Hosts |  0.075 |  0.075 |  0.075 |
| 1621514970 | Demo Hosts | 0.2503 | 0.2503 | 0.2503 |
| 1621514960 | Demo Hosts |    0.2 |    0.2 |    0.2 |
| 1621514950 | Demo Hosts | 0.1001 | 0.1001 | 0.1001 |
| 1621514940 | Demo Hosts |  0.175 |  0.175 |  0.175 |
| 1621514930 | Demo Hosts |   0.25 |   0.25 |   0.25 |
| 1621514920 | Demo Hosts |    0.1 |    0.1 |    0.1 |
+------------+------------+--------+--------+--------+
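The subscripts m.tags['host'] and m.fields['usage_user'] in the query above resolve against dynamically typed object columns. A schema roughly like the following hypothetical sketch would support it; the real telegraf.metrics definition is not shown on this page:
/*
 * Hypothetical sketch of telegraf.metrics: tags and fields are dynamic
 * objects, so new Telegraf keys can be ingested without schema changes.
 */
CREATE TABLE IF NOT EXISTS telegraf.metrics (
   timestamp TIMESTAMP WITH TIME ZONE,
   name TEXT,
   tags OBJECT(DYNAMIC),
   fields OBJECT(DYNAMIC)
);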
/*
* Based on a date dimension, this query calculates:
* - the observed value, carrying forward the last known value
* - the observed value's ratio compared to the previous value
* - the average over a moving window of the last 100 rows
* - the maximum value during the same hour of observation
*/
SELECT s.time,
       COALESCE(
         distance,
         LAG(distance) IGNORE NULLS OVER (ORDER BY s.time ASC)
       ) AS distance,
       1 - (
         LAG(distance) OVER (ORDER BY s.time ASC) / distance
       ) AS ratio,
       AVG(distance) OVER (
         ORDER BY s.time ASC ROWS BETWEEN 100 PRECEDING AND CURRENT ROW
       ) AS moving_avg,
       MAX(distance) OVER (
         PARTITION BY DATE_TRUNC('hour', s.time)
       ) AS hourly_max
FROM GENERATE_SERIES(
       DATE_TRUNC('minute', NOW() - '10 minutes'::INTERVAL),
       DATE_TRUNC('minute', NOW()),
       '1 minute'::INTERVAL
     ) s(time)
LEFT JOIN doc.raw r ON r.time >= s.time
                   AND r.time < s.time + '1 minute'::INTERVAL
ORDER BY 1 ASC;
+---------------+----------+---------+------------+------------+
| time          | distance | ratio   | moving_avg | hourly_max |
+---------------+----------+---------+------------+------------+
| 1619424000000 |  108.886 |    NULL |    108.886 |    124.749 |
| 1619424060000 |   109.88 |    0.99 |    109.383 |    124.749 |
| 1619424120000 |  112.171 |    0.99 |    110.312 |    124.749 |
| 1619424180000 |  112.815 |    0.99 |    110.938 |    124.749 |
| 1619424240000 |   115.87 |    0.99 |    111.924 |    124.749 |
| 1619424300000 |  116.737 |    0.99 |    112.726 |    124.749 |
| 1619424360000 |  118.429 |    0.99 |    113.541 |    124.749 |
| 1619424420000 |   119.76 |    0.99 |    114.319 |    124.749 |
| 1619424480000 |  121.739 |    0.99 |    115.143 |    124.749 |
| 1619424540000 |  122.213 |    0.99 |     115.85 |    124.749 |
| 1619424600000 |  124.163 |    0.99 |    116.606 |    124.749 |
| 1619424660000 |  124.749 |    0.99 |    117.284 |    124.749 |
| 1619424720000 |  124.749 |    NULL |    117.284 |    124.749 |
| 1619424780000 |  124.749 |    NULL |    117.284 |    124.749 |
| 1619424840000 |  124.749 |    NULL |    117.284 |    124.749 |
| 1619424900000 |  124.749 |    NULL |    117.284 |    124.749 |
+---------------+----------+---------+------------+------------+
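The GENERATE_SERIES on the left side of the join produces one row per minute, so minutes without raw readings still appear in the result and are filled by carrying the last value forward. A minimal sketch of what the joined table might look like; doc.raw's real definition is not shown here:
/*
 * Hypothetical sketch of the raw readings table joined above; readings
 * arrive at irregular intervals, which is why missing minutes are
 * gap-filled from the generated series.
 */
CREATE TABLE IF NOT EXISTS doc.raw (
   time TIMESTAMP WITH TIME ZONE,
   distance DOUBLE PRECISION
);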
/*
* Based on ride-hailing data, this query calculates:
* - the distance of a given dropoff location to a fixed point of interest,
* partitioned in buckets of 50 meters each
* - the average trip distance in kilometers within that partition
* - the number of trips within that partition
*/
SELECT FLOOR(
         DISTANCE(t.dropoff_location, 'POINT(-73.984 40.758)') / 50
       ) * 50 AS point_distance,
       AVG(
         DISTANCE(t.pickup_location, t.dropoff_location) / 1000.0
       ) AS trip_distance,
       COUNT(*) AS trips
FROM nyc_taxi.trips t
WHERE t.pickup_month = DATE_TRUNC('month', CURDATE())
GROUP BY 1
ORDER BY 1 ASC;
+----------------+--------------------+--------+
| point_distance | trip_distance      | trips  |
+----------------+--------------------+--------+
|              0 |  6.022131503971144 |   5026 |
|             50 | 7.9007773871769436 |  25851 |
|            100 |   8.67141343399684 |  40359 |
|            150 |  5.951902622791926 |  58186 |
|            200 |  6.803184429768031 |  76231 |
|            250 |  5.458829825224027 |  77545 |
|            300 | 6.3820270018484155 |  83378 |
|            350 |  6.477387557677408 | 114297 |
|            400 |  6.214864318547081 | 105659 |
|            450 |   6.41286222320987 | 107163 |
+----------------+--------------------+--------+
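DISTANCE() operates directly on geo_point columns, and the pickup_month filter hints at a month-partitioned table. A hypothetical definition along those lines; only the columns used in the query are taken from this page, the rest is assumed:
/*
 * Hypothetical sketch: geo_point columns for pickup and dropoff plus a
 * generated month column used to partition the table.
 */
CREATE TABLE IF NOT EXISTS nyc_taxi.trips (
   pickup_datetime TIMESTAMP WITH TIME ZONE,
   pickup_location GEO_POINT,
   dropoff_location GEO_POINT,
   pickup_month TIMESTAMP WITH TIME ZONE
      GENERATED ALWAYS AS DATE_TRUNC('month', pickup_datetime)
) PARTITIONED BY (pickup_month);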
/*
* Based on system event logs, this query calculates:
* - a filter for specific messages using a full-text index
* - the number of entries per minute
* - the average scoring ratio for each matched row
*/
SELECT DATE_TRUNC('minute', receivedat) AS event_time,
       COUNT(*) AS entries,
       AVG(_score) AS avg_score
FROM "syslog"."systemevents"
WHERE MATCH(message, 'authentication failure')
        USING most_fields WITH (analyzer = 'whitespace')
  AND MATCH(syslogtag, 'sshd')
GROUP BY 1
ORDER BY 1 DESC
LIMIT 10;
+---------------+---------+--------------------+
| event_time    | entries | avg_score          |
+---------------+---------+--------------------+
| 1620220260000 |       4 | 1.5798743814229965 |
| 1620220200000 |       8 | 1.7750384211540222 |
| 1620220140000 |      10 | 1.6113891124725341 |
| 1620220080000 |       9 |  1.676726798216502 |
| 1620220020000 |       8 | 1.6908064410090446 |
| 1620219960000 |       8 |  1.690401442348957 |
| 1620219900000 |       7 | 1.7646006005150932 |
| 1620219840000 |       7 | 1.7795820917401994 |
| 1620219780000 |      10 | 1.5844267368316651 |
| 1620219720000 |      13 | 1.5637413492569556 |
+---------------+---------+--------------------+
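MATCH() runs against indexed columns; for an analyzed full-text search like the message filter above, the column needs a full-text index. A hypothetical set of column definitions that would support the filters above; the real systemevents schema is not shown on this page and the index analyzers are assumptions:
/*
 * Hypothetical sketch: message and syslogtag carry full-text indexes so
 * they can be queried with MATCH().
 */
CREATE TABLE IF NOT EXISTS "syslog"."systemevents" (
   receivedat TIMESTAMP WITH TIME ZONE,
   syslogtag TEXT INDEX USING FULLTEXT,
   message TEXT INDEX USING FULLTEXT WITH (analyzer = 'whitespace')
);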
We bring value across an organization
From architects and engineers to business leaders, we work to solve the data management challenges companies face today
Developers
Develop modern applications and services
- Work across any data shape
- Fully managed database
- Familiar SQL

"The technical discussion with Crate.io engineers paid off, as it helped us to verify the technical and business requirements. CrateDB is an integral part of our big data streaming architecture, and it is delivering as promised."

Kristoffer Axelsson
Principal Solution Architect, TCG
Senior Data Scientists
Select, configure, and implement analytics solutions
- Cost-effective scaling
- No upskilling required
- Make real-time decisions

"CrateDB is a highly scalable database for time series and event data with a very fast query engine using standard SQL."

Alexander Mann
TGW Logistics Group
Business Leaders
Optimize the performance of digital assets
- Prevent operational disruptions
- Increase equipment efficiency
- Decrease total cost of DB ownership

"Tens of thousands of sensors generate data along our production lines, and CrateDB for Azure IoT allows us to analyze it to make real-time changes to factory efficiency."

Philipp Lehner
CEO, ALPLA Group
GET STARTED
Start your CrateDB experience now
Get started with CrateDB Cloud and launch a 30-day trial cluster.
Designed to handle the complexity of high-end time-series workloads in real time, CrateDB Cloud is a fully managed database-as-a-service. Secured, scaled, and operated by the engineers who built CrateDB.
Download CrateDB
CrateDB is the leading open-source, distributed SQL database for relational and time-series data. It combines the familiarity of SQL with the scalability and data flexibility of NoSQL.