Cloud Monitoring

This tutorial demonstrates how you can monitor your CrateDB Cloud cluster using the exposed Prometheus metrics endpoint.

Prometheus scrapes the API endpoint that exposes the metrics, and the visualization tool Grafana displays them. The endpoint returns metrics for all clusters in the specified organization.

Prerequisites

  • Both Prometheus and Grafana are run as Docker containers in this tutorial, so Docker must be installed on your system.

Cluster Deployment

The first step is to sign up in the Cloud Console if you haven’t done so yet. After that, you can deploy your cluster.

Prometheus

Prometheus is used to scrape the CrateDB Cloud API endpoints for available metrics and serve as a data source for Grafana.

First, save the following configuration as a YAML file on your system, for example prometheus.yml:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# the CrateDB Cloud metrics API of your organization.
scrape_configs:
  - job_name: "cratedb"
    metrics_path: '/api/v2/organizations/{{ORGID}}/metrics/prometheus/'
    basic_auth:
      username: '{{APIKEY}}'
      password: '{{SECRET}}'
    static_configs:
      - targets: ["console.cratedb.cloud"]

Replace {{ORGID}} with the ID of your organization. You can find it on the Settings page in the CrateDB Cloud Console:

Organization ID

Replace {{APIKEY}} and {{SECRET}} with an API key and secret, which you can generate on your account page in the CrateDB Cloud Console.
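The placeholder substitution can also be scripted. The following is a minimal sketch, not part of any CrateDB Cloud tooling: the function name render_config and the example values in the usage note are hypothetical, and it assumes the configuration file still contains the {{ORGID}}, {{APIKEY}}, and {{SECRET}} placeholders exactly as shown above.

```python
def render_config(template: str, org_id: str, api_key: str, secret: str) -> str:
    """Fill the three placeholders of the Prometheus configuration template.

    Hypothetical helper for illustration. Values are inserted verbatim, so a
    value containing a single quote would break the YAML quoting used in the
    template and would need escaping first.
    """
    replacements = {
        "{{ORGID}}": org_id,
        "{{APIKEY}}": api_key,
        "{{SECRET}}": secret,
    }
    rendered = template
    for placeholder, value in replacements.items():
        # Fail early if the template does not contain an expected placeholder.
        if placeholder not in template:
            raise ValueError(f"placeholder {placeholder} not found in template")
        rendered = rendered.replace(placeholder, value)
    return rendered
```

For example, reading the template with pathlib.Path("prometheus.yml").read_text(), passing it to render_config together with your own organization ID and credentials, and writing the result back to a file would produce a configuration that is ready to mount into the container.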

Once you have added your Organization ID and API credentials, execute the following command to create a Prometheus instance, replacing /path/to/prometheus.yml with the absolute path to your configuration file:

docker run -d --name prometheus -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml -p 9090:9090 prom/prometheus

This will start the Prometheus instance exposed on port 9090. You can verify it’s running correctly by visiting http://localhost:9090/. On the Status -> Targets page in the top menu, you should see the following:

Prometheus targets

There should be an endpoint with your Organization ID with state UP. This means that Prometheus is able to connect to the API and is scraping the available metrics.
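The same check can be done without the web UI by querying the Prometheus HTTP API, which lists scrape targets under /api/v1/targets. The sketch below assumes Prometheus is reachable at http://localhost:9090 and that the job is named "cratedb" as in the configuration above; the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url: str = "http://localhost:9090") -> dict:
    """Download the target list from the Prometheus HTTP API."""
    with urlopen(f"{base_url}/api/v1/targets") as response:
        return json.load(response)


def target_health(payload: dict, job: str = "cratedb") -> list:
    """Return (scrape URL, health, last error) for each active target of a job."""
    return [
        (target["scrapeUrl"], target["health"], target.get("lastError", ""))
        for target in payload["data"]["activeTargets"]
        if target["labels"].get("job") == job
    ]
```

Calling target_health(fetch_targets()) should return one entry whose health is "up" when scraping works. If the health is "down", the last error string usually indicates whether the Organization ID or the credentials are wrong.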

Available Metrics

Most metric semantics are self-explanatory. This list is not exhaustive, and new metrics can be added at any point in the future. All metrics are per node.

Metric                                  Type     Description
--------------------------------------  -------  ------------------------------------------------------------
container_cpu_usage_seconds_total       Counter  CrateDB CPU usage, in seconds
container_fs_reads_bytes_total          Counter  Number of bytes read, per disk
container_fs_writes_bytes_total         Counter  Number of bytes written, per disk
container_memory_usage_bytes            Gauge    Memory usage
container_network_receive_bytes_total   Counter  Network ingress traffic
container_network_transmit_bytes_total  Counter  Network egress traffic
crate_circuitbreakers                   Gauge    Circuit breaker statistics, per breaker
crate_cluster_state_version             Gauge    Info about the cluster’s state
crate_connections                       Gauge    Number of connections, per protocol
crate_node                              Gauge    Shard statistics
crate_query_failed_count                Counter  Number of failed queries per type (e.g. Insert/Select/Update/…)
crate_query_sum_of_durations_millis     Counter  Sum of the durations of all queries, per query type
crate_query_total_count                 Counter  Total number of queries, per type
crate_ready                             Gauge    Indicates whether this CrateDB node is up and running
crate_threadpools                       Gauge    Thread pool statistics, per pool
jvm_*                                   Gauge    Various JVM statistics
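Counters only ever increase (apart from resets, for example after a restart), so they are usually turned into rates or ratios before being displayed. The sketch below illustrates the arithmetic on two consecutive samples, for example crate_query_total_count and crate_query_sum_of_durations_millis taken 15 seconds apart. It is a simplified illustration with hypothetical function names, not the exact algorithm used by the Prometheus rate() function.

```python
def per_second_rate(earlier: float, later: float, interval_seconds: float) -> float:
    """Average per-second increase of a counter between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = later - earlier
    if delta < 0:
        # The counter went backwards, so assume it was reset to zero
        # and count only the increase observed since the reset.
        delta = later
    return delta / interval_seconds


def average_query_duration_ms(
    durations_earlier: float,
    durations_later: float,
    count_earlier: float,
    count_later: float,
) -> float:
    """Mean duration in milliseconds of the queries run between two samples."""
    queries = count_later - count_earlier
    if queries <= 0:
        # No queries ran in the interval (or the counter was reset).
        return 0.0
    return (durations_later - durations_earlier) / queries
```

For instance, if the query count grows from 100 to 400 within 15 seconds, per_second_rate(100, 400, 15) gives 20 queries per second.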

Grafana

Grafana doesn’t need any special configuration. You can run it either in a Docker container or as a local installation; both work for this use case. Follow the Grafana documentation and use your preferred method.

By default, Grafana is exposed on port 3000. Go to http://localhost:3000/ to access it.

Data source

Now you can add Prometheus as a data source in Grafana under Configuration -> Data sources. Choose Prometheus, then use http://localhost:9090/ as the URL, and leave the rest as default:

Prometheus data source

Note

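If you run both Prometheus and Grafana as Docker containers, localhost inside the Grafana container does not refer to the Prometheus container. In that case, you might need to create a new Docker network, add both containers to it, and use the Prometheus container name in the data source URL (for example, http://prometheus:9090/).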
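As an alternative to the UI steps, the data source could also be registered through the Grafana HTTP API with a POST request to /api/datasources. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the URLs and credentials you pass in depend on your own setup.

```python
import base64
import json
from urllib.request import Request, urlopen


def build_datasource_payload(prometheus_url: str, name: str = "Prometheus") -> dict:
    """Build the JSON body describing a Prometheus data source."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        # "proxy" means the Grafana server, not the browser, queries Prometheus.
        "access": "proxy",
        "isDefault": True,
    }


def create_datasource(grafana_url: str, user: str, password: str, payload: dict) -> dict:
    """Send the payload to Grafana using HTTP basic authentication."""
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)
```

With Grafana on port 3000, a call such as create_datasource("http://localhost:3000", user, password, build_datasource_payload("http://localhost:9090")) would create the data source, where user and password are your Grafana admin credentials. When both tools run in containers on a shared Docker network, the Prometheus URL would use the container name instead of localhost.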

Dashboard

All that’s left is to create a dashboard or import the one that we prepared for you. Save this snippet as a .json file and import it under Dashboards -> New -> Import: click “Upload JSON file” and choose the file. The dashboard is called “CrateDB Cluster Monitoring”.

Sample Grafana dashboard

The dashboard displays the following metrics. The values are aggregated from all the running clusters in your organization:

  • Global stats:
    • Number of nodes

  • Clusters stats:
    • Type and number of open connections to your clusters

    • SELECT queries per second

    • INSERT queries per second

    • CPU usage (Cores)

    • Memory usage

    • File system writes

    • File system reads

  • Query stats:
    • Error rate along with the type of failed query

    • Average query duration along with the type of query

    • Queries per second along with the type of query
