Use CrateDB With Telegraf, an Agent for Collecting & Reporting Metrics

2018-05-15, by Naomi Slater

Telegraf is a plugin-driven agent written in Go for collecting, processing, aggregating, and writing metrics.

Telegraf can source metrics directly from the system it’s running on, pull metrics from third-party APIs, collect sensor data from Internet of Things (IoT) devices, or even listen for metrics via StatsD and Kafka consumer services.

You can then transform, annotate, and filter the metrics you collect. You can even create aggregate metrics, such as mean, min, max, quantiles, and so on.

Finally, you can write your data to a variety of other datastores, services, and message queues, including CrateDB.

In this post, I am going to show you how to set up Telegraf and have it send metrics data to CrateDB.

With this setup, you can collect metrics with Telegraf and then take advantage of CrateDB's capacity for ingesting, storing, and analyzing huge amounts of data in real time.

I will show you how to do this on macOS, but these instructions should be trivially adaptable for Linux or Windows.

Install CrateDB

If you don't already have CrateDB running locally, it's very easy to get set up.

Run this command:

$ bash -c "$(curl -L try.crate.io)"

This command downloads CrateDB and runs it from the tarball. If you'd like to install CrateDB more permanently, or you are using Windows, check out our collection of super easy one-step install guides.

If you're using the command above, it should pop open the CrateDB admin UI for you automatically once it has finished. Otherwise, head over to http://localhost:4200/#/help in your browser.

You should see something like this:

From here, you can import some tweets as test data if you fancy getting to grips with the basics of CrateDB before continuing.

For the purposes of this demo, we don't care about tweets. We're going to set up Telegraf and import some metrics.

Install Telegraf

If you're using macOS, you can install Telegraf with Homebrew:

$ brew update
$ brew install telegraf

If you're not on macOS, or you fancy trying a different installation method, head on over to the Telegraf downloads page.

You will be presented with the option to start Telegraf as a system service, or use the system-wide configuration file.

For the purposes of this demo, however, I will show you how to set things up in a temporary fashion. You can easily adapt this process if you wish.

Configure Telegraf

First of all, generate the default configuration file, like so:

$ telegraf \
    --input-filter cpu \
    --output-filter cratedb \
    config > telegraf.conf

Okay, what's going on here? Let's break it down.

  1. Telegraf is a plugin-driven tool and has plugins to collect lots, and lots, and lots of different types of metrics. But we only want to get a taste of things for now, so, with --input-filter cpu we're limiting the input plugins such that we're only collecting metrics about CPU utilization on our local machine.
  2. For the purposes of this demo, we only care about sending the data to CrateDB, so we can use --output-filter cratedb to limit our output plugins.
  3. Finally, we tell Telegraf to generate a configuration file, and we redirect that output to a file named telegraf.conf.
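If you want to confirm that the two filters did what we asked, you can list the plugin sections that ended up in the generated file. Here is a minimal Python sketch, assuming telegraf.conf sits in the current directory. The enabled_plugins helper is a hypothetical name of mine, not part of Telegraf, and it only looks for uncommented [[inputs.*]] and [[outputs.*]] section headers rather than parsing the full configuration.

```python
import re
from pathlib import Path

# Matches uncommented section headers such as [[inputs.cpu]] at the start of a line.
PLUGIN_HEADER = re.compile(
    r"^[ \t]*\[\[(inputs|outputs)\.([A-Za-z0-9_]+)\]\]", re.MULTILINE
)


def enabled_plugins(config_text):
    """Return (kind, name) pairs for each uncommented plugin section."""
    return PLUGIN_HEADER.findall(config_text)


if __name__ == "__main__":
    path = Path("telegraf.conf")  # assumed location of the generated file
    if path.exists():
        for kind, name in enabled_plugins(path.read_text()):
            print(f"{kind}: {name}")
```

With the filters used in this post, I would expect this to report one outputs entry for cratedb and one inputs entry for cpu.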

Now, you can open telegraf.conf in your favourite text editor.

Feel free to browse this file and explore some of the configuration options. Check out the Telegraf configuration documentation if you want to understand this file in more detail.

At the very end of the file, you will find an INPUT PLUGINS section that corresponds to our one configured input plugin:

# Read metrics about cpu usage
[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## If true, collect raw CPU time metrics.
  collect_cpu_time = false
  ## If true, compute and report the sum of all non-idle CPU states.
  report_active = false

There is no need to alter this for the purposes of the demo.

What you should be looking for is the OUTPUT PLUGINS section:

# Configuration for CrateDB to send metrics to.
[[outputs.cratedb]]
  # A github.com/jackc/pgx connection string.
  # See https://godoc.org/github.com/jackc/pgx#ParseDSN
  url = "postgres://user:password@localhost/schema?sslmode=disable"
  # Timeout for all CrateDB queries.
  timeout = "5s"
  # Name of the table to store metrics in.
  table = "metrics"
  # If true, and the metrics table does not exist, create it automatically.
  table_create = true

This is the section that will allow you to configure the one output plugin we have specified: CrateDB.

For our purposes, modify this so you have:

# Configuration for CrateDB to send metrics to.
[[outputs.cratedb]]
  # A github.com/jackc/pgx connection string.
  # See https://godoc.org/github.com/jackc/pgx#ParseDSN
  url = "postgres://crate@localhost/doc?sslmode=disable"
  # Timeout for all CrateDB queries.
  timeout = "5s"
  # Name of the table to store metrics in.
  table = "metrics"
  # If true, and the metrics table does not exist, create it automatically.
  table_create = true

What we've done here is configure the connection string so that we connect to CrateDB on localhost as the crate user, using the default doc schema.
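If the pieces of that connection string are not obvious at a glance, you can pull them apart with Python's standard library. This is only an illustrative sketch: describe_dsn is a hypothetical helper of mine, and it is not how Telegraf or pgx parse the URL internally.

```python
from urllib.parse import parse_qs, urlsplit


def describe_dsn(url):
    """Split a postgres:// style URL into the parts discussed in this post."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL does not specify a port
        "schema": parts.path.lstrip("/"),
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    print(describe_dsn("postgres://crate@localhost/doc?sslmode=disable"))
```

The URL carries no password and no explicit port, so the client falls back to its defaults for both.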

Note that we're using the PostgreSQL wire protocol, as this is one of CrateDB's client interfaces.

Notice also that we've left table_create = true, and this means that when we start up Telegraf, it will create the necessary table for us in CrateDB.

Run Telegraf

All that's left to do now is actually start up Telegraf.

You can do that with this command:

$ telegraf --config telegraf.conf

You should see a few status messages printed to the terminal.
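If you would like to check that rows are actually arriving without opening the admin UI, one option is to ask CrateDB's HTTP endpoint for a row count. The sketch below assumes CrateDB is listening on http://localhost:4200 (the same address as the admin UI) and that the table is named metrics, as configured above. The function names are my own, chosen for illustration.

```python
import json
from urllib.request import Request, urlopen


def build_sql_request(stmt, base_url="http://localhost:4200"):
    """Build a POST request for CrateDB's /_sql HTTP endpoint."""
    body = json.dumps({"stmt": stmt}).encode("utf-8")
    return Request(
        f"{base_url}/_sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def count_metrics(table="metrics"):
    """Return the number of rows currently stored in the given table."""
    request = build_sql_request(f'SELECT count(*) FROM "{table}"')
    with urlopen(request, timeout=5) as response:
        payload = json.load(response)
    # The response body contains a "rows" list; the count is its only value.
    return payload["rows"][0][0]
```

Calling count_metrics() a couple of times, a few seconds apart, should return a growing number while Telegraf is running.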

Start Playing With Your Data in CrateDB

Bring up the CrateDB admin UI, and navigate to the tables browser by selecting the tables browser icon from the left-hand navigation menu.

You should see your new metrics table:

And if you select QUERY TABLE and then EXECUTE QUERY on the resulting screen, you should have a table of rows that looks like this:

Visualize Your Data

From here, you can start to slice and dice your data however you wish. And of course, visualize it.

The CrateDB admin UI is not designed for visualization, so I am going to use SQLPad (which I showed you how to set up and run in a previous post).

Here's a simple query:

select
  date_trunc('minute', "timestamp") as "time",
  avg(fields['usage_user']) as "user"
from
  metrics
group by
  "time"
order by
  "time";

What we're doing here is:

  • Chunking user CPU utilization into minute-long buckets
  • Outputting the average value across the whole minute

And when you plug the right graph configuration into SQLPad, you should see something that looks like this:

Wrap Up

Telegraf can gather data from machines, third-party APIs, sensor data, and even systems like StatsD and Kafka. From there, you can collect, process, and aggregate that data, before sending it to CrateDB.

By pairing Telegraf with CrateDB, a powerful distributed SQL database, you can store and analyze massive amounts of that collected machine data in real time.