Enabling Data Insights at Scale

CrateDB is the leading open-source, distributed SQL database empowering customers to turn data into business value

A Single Data Hub for all Operational Data

CrateDB makes it simple to ingest and analyze massive amounts of data. Companies can extract the full value of the data – whether for new innovative digital solutions or operational improvements across the business functions.

Icon Cursor Move

Variety

Readily handle structured and unstructured data, regardless of data shape

Icon Speed Gauge

Real-time

Run ad hoc queries in real-time - no waiting necessary

Icon Network Refresh

Run anywhere & everywhere

Run on-premise or in the cloud, any cloud

Icon Shrink All

Aggregation

Condense and distill to get what you need and when you need it

Icon Business Hierarchy

Simple scalability

Unique architecture that’s simple to scale and keeps running 24/7 with no bottlenecks

Icon Database Flash Lightning

SQL & NoSQL

Index high data volumes with a distributed SQL DBMS built atop NoSQL storage

PostgreSQL compatible

PostgreSQL

Use standard SQL without the need to learn new skills

CrateDB is data-shape agnostic

Ready to handle structured and unstructured data

/*
 * Based on sensor data, this query calculates:
 * - time-buckets of 10 seconds
 * - different aggregations per time-bucket and host group
 */
SELECT FLOOR(EXTRACT(epoch FROM m.timestamp) / 10) * 10 AS period,
       h.host_group,
       MIN(m.fields['usage_user']) AS "min",
       AVG(m.fields['usage_user']) AS "avg",
       MAX(m.fields['usage_user']) AS "max"
FROM telegraf.metrics m
LEFT JOIN telegraf.hosts h ON h.host_name = m.tags['host']
WHERE tags['cpu'] = 'cpu-total'
  AND m.timestamp > NOW() - '150 seconds'::INTERVAL
GROUP BY 1, 2
ORDER BY 1 DESC;








+------------+------------+--------+--------+--------+
|     period | host_group |    min |    avg |    max |
+------------+------------+--------+--------+--------+
| 1621515050 | Demo Hosts | 0.125  | 0.125  | 0.125  |
| 1621515040 | Demo Hosts | 0.1753 | 0.1753 | 0.1753 |
| 1621515030 | Demo Hosts | 0.2251 | 0.2251 | 0.2251 |
| 1621515020 | Demo Hosts | 0.2251 | 0.2251 | 0.2251 |
| 1621515010 | Demo Hosts | 0.1251 | 0.1251 | 0.1251 |
| 1621515000 | Demo Hosts | 0.2501 | 0.2501 | 0.2501 |
| 1621514990 | Demo Hosts | 0.1501 | 0.1501 | 0.1501 |
| 1621514980 | Demo Hosts | 0.075  | 0.075  | 0.075  |
| 1621514970 | Demo Hosts | 0.2503 | 0.2503 | 0.2503 |
| 1621514960 | Demo Hosts | 0.2    | 0.2    | 0.2    |
| 1621514950 | Demo Hosts | 0.1001 | 0.1001 | 0.1001 |
| 1621514940 | Demo Hosts | 0.175  | 0.175  | 0.175  |
| 1621514930 | Demo Hosts | 0.25   | 0.25   | 0.25   |
| 1621514920 | Demo Hosts | 0.1    | 0.1    | 0.1    |
+------------+------------+--------+--------+--------+






/*
 * Based on a given observation timeframe, this query calculates:
 * - the observed value's ratio compared to the previous value
 * - the average over a moving window of the last 100 rows
 * - the maximum value during the same hour of observation
 */
SELECT s.time,
       distance,
       1 - (
         LAG(distance) OVER (ORDER BY s.time ASC) / distance
       ) AS ratio,
       AVG(distance) OVER (
         ORDER BY s.time ASC ROWS BETWEEN 100 PRECEDING AND CURRENT ROW
       ) AS moving_avg,
       MAX(distance) OVER (
         PARTITION BY DATE_TRUNC('hour', s.time)
       ) AS hourly_max
FROM GENERATE_SERIES(
  DATE_TRUNC('minute', NOW() - '10 minutes'::INTERVAL),
  DATE_TRUNC('minute', NOW()), '1 minute'::INTERVAL
) s(time)
LEFT JOIN doc.raw r USING (time)
ORDER BY 1 ASC;





+---------------+----------+---------+------------+------------+
|          time | distance |   ratio | moving_avg | hourly_max |
+---------------+----------+---------+------------+------------+
| 1619424000000 |  108.886 | NULL    |    108.886 |    124.749 |
| 1619424060000 |  109.88  |    0.99 |    109.383 |    124.749 |
| 1619424120000 |  112.171 |    0.99 |    110.312 |    124.749 |
| 1619424180000 |  112.815 |    0.99 |    110.938 |    124.749 |
| 1619424240000 |  115.87  |    0.99 |    111.924 |    124.749 |
| 1619424300000 |  116.737 |    0.99 |    112.726 |    124.749 |
| 1619424360000 |  118.429 |    0.99 |    113.541 |    124.749 |
| 1619424420000 |  119.76  |    0.99 |    114.319 |    124.749 |
| 1619424480000 |  121.739 |    0.99 |    115.143 |    124.749 |
| 1619424540000 |  122.213 |    0.99 |    115.85  |    124.749 |
| 1619424600000 |  124.163 |    0.99 |    116.606 |    124.749 |
| 1619424660000 |  124.749 |    0.99 |    117.284 |    124.749 |
| 1619424720000 | NULL     | NULL    |    117.284 |    124.749 |
| 1619424780000 | NULL     | NULL    |    117.284 |    124.749 |
| 1619424840000 | NULL     | NULL    |    117.284 |    124.749 |
| 1619424900000 | NULL     | NULL    |    117.284 |    124.749 |
+---------------+----------+---------+------------+------------+








/*
 * Based on ride-hailing data, this query calculates:
 * - the distance of a given dropoff location to a fixed point of interest,
 *   partitioned in buckets of 50 meters each
 * - the average trip distance in kilometers within that partition
 * - the number of trips within that partition
 */
SELECT FLOOR(
         DISTANCE(t.dropoff_location, 'POINT(-73.984 40.758)') / 50
       ) * 50 AS point_distance,
       AVG(
         DISTANCE(t.pickup_location, t.dropoff_location) / 1000.0
       ) AS trip_distance,
       COUNT(*) AS trips
FROM nyc_taxi.trips t
WHERE t.pickup_month = DATE_TRUNC('month', CURDATE())
GROUP BY 1
ORDER BY 1 ASC;






+----------------+--------------------+--------+
| point_distance |      trip_distance |  trips |
+----------------+--------------------+--------+
|              0 | 6.022131503971144  |   5026 |
|             50 | 7.9007773871769436 |  25851 |
|            100 | 8.67141343399684   |  40359 |
|            150 | 5.951902622791926  |  58186 |
|            200 | 6.803184429768031  |  76231 |
|            250 | 5.458829825224027  |  77545 |
|            300 | 6.3820270018484155 |  83378 |
|            350 | 6.477387557677408  | 114297 |
|            400 | 6.214864318547081  | 105659 |
|            450 | 6.41286222320987   | 107163 |
+----------------+--------------------+--------+










/*
 * Based on system event logs, this query calculates:
 * - a filter for specific messages using a full-text index
 * - the number of entries per minute
 * - the average scoring ratio for each matched row
 */
SELECT DATE_TRUNC('minute', receivedat) AS event_time,
       COUNT(*) AS entries,
       AVG(_score) AS avg_score
FROM "syslog"."systemevents"
WHERE MATCH(message, 'authentication failure') 
USING most_fields WITH (analyzer = 'whitespace')
   AND MATCH(syslogtag, 'sshd')
GROUP BY 1
ORDER BY 1 DESC
LIMIT 10;








+---------------+---------+--------------------+
|    event_time | entries |          avg_score |
+---------------+---------+--------------------+
| 1620220260000 |       4 | 1.5798743814229965 |
| 1620220200000 |       8 | 1.7750384211540222 |
| 1620220140000 |      10 | 1.6113891124725341 |
| 1620220080000 |       9 | 1.676726798216502  |
| 1620220020000 |       8 | 1.6908064410090446 |
| 1620219960000 |       8 | 1.690401442348957  |
| 1620219900000 |       7 | 1.7646006005150932 |
| 1620219840000 |       7 | 1.7795820917401994 |
| 1620219780000 |      10 | 1.5844267368316651 |
| 1620219720000 |      13 | 1.5637413492569556 |
+---------------+---------+--------------------+










We bring value across an organization

From architects and engineers to business leaders, we work to solve data management challenges companies face today

Developers

Develop modern IoT applications and services

  • Work across any data shape
  • Fully managed database
  • Familiar SQL
Thomas Concrete Group

"The technical discussion with Crate.io engineers paid off, as it helped us to verify the technical and business requirements. CrateDB is an integral part of our big data streaming architecture, and it is delivering as promised."

Portrait Thomas Concrete Group

Kristoffer Axelsson
Principal Solution Architect, TCG

Senior Data Scientists

Selecting, configuring and implementing analytics solutions

  • Cost effective scaling
  • No upskilling required
  • Make real-time decisions
McAfee

"CrateDB is an important part of our data stack giving us the performance and horizontal scalability to meet our rapidly growing business needs."

Portrait McAfee

Sekhar Sarukkai
Chief Scientist,
McAfee

Business Leaders

Optimize the performance of IoT assets

  • Prevent operational disruptions
  • Increase equipment efficiency
  • Decrease total cost of DB ownership
Alpla

"Tens of thousands of sensors generate data along our production lines, and CrateDB for Azure IoT allows us to analyze it to make real-time changes to factory efficiency."

Portrait Alpla

Philipp Lehner
CEO
ALPLA Group

Use cases

CrateDB's value spans across several industries and segments

Get started

Start your CrateDB experience now