Telegraf is a plugin-driven agent written in Go for collecting, processing, aggregating, and writing metrics.
Telegraf can source metrics directly from the system it's running on, pull metrics from third-party APIs, collect sensor data from Internet of Things (IoT) devices, or even listen for metrics via StatsD and Kafka consumer services.
You can then transform, annotate, and filter the metrics you collect. You can even create aggregate metrics, such as mean, min, max, quantiles, and so on.
Finally, you can write your data to a variety of other datastores, services, and message queues, including CrateDB.
In this post, I am going to show you how to set up Telegraf and have it send metrics data to CrateDB.
With this setup, you can collect metrics with Telegraf and then take advantage of CrateDB's capacity for ingesting, storing, and analyzing huge amounts of data in real time.
I will show you how to do this on macOS, but these instructions should be trivially adaptable for Linux or Windows.
If you don't already have CrateDB running locally, it's very easy to get set up.
Run this command:
$ bash -c "$(curl -L try.crate.io)"
This command downloads CrateDB and runs it from the tarball. If you'd like to install CrateDB more permanently, or you are using Windows, check out our collection of super easy one-step install guides.
You should see something like this:
From here, you can import some tweets as test data if you fancy getting to grips with the basics of CrateDB before continuing.
For the purposes of this demo, we don't care about tweets. We're going to set up Telegraf and import some metrics.
If you're using macOS, you can install Telegraf with Homebrew:
$ brew update
$ brew install telegraf
If you're not on macOS, or you fancy trying a different installation method, head on over to the Telegraf downloads page.
You will be presented with the option to start Telegraf as a system service, or use the system-wide configuration file.
For the purposes of this demo, however, I will show you how to set things up in a temporary fashion. You can easily adapt this process if you wish.
First of all, generate the default configuration file, like so:
$ telegraf \
    --input-filter cpu \
    --output-filter cratedb \
    config > telegraf.conf
Okay, what's going on here? Let's break it down.
- Telegraf is a plugin-driven tool and has plugins to collect lots, and lots, and lots of different types of metrics. But we only want to get a taste of things for now, so, with --input-filter cpu, we're limiting the input plugins such that we're only collecting metrics about CPU utilization on our local machine.
- For the purposes of this demo, we only care about sending the data to CrateDB, so we can use --output-filter cratedb to limit our output plugins.
- Finally, we tell telegraf to generate a configuration file, and we redirect that output to a file named telegraf.conf.
Now, you can open telegraf.conf in your favourite text editor.
Feel free to browse this file and explore some of the configuration options. Check out the Telegraf configuration documentation if you want to understand this file in more detail.
At the very end of the file, you will find an INPUT PLUGINS section that corresponds to our configured input plugin:
# Read metrics about cpu usage
[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## If true, collect raw CPU time metrics.
  collect_cpu_time = false
  ## If true, compute and report the sum of all non-idle CPU states.
  report_active = false
There is no need to alter this, for the purposes of the demo.
What you should be looking for is the OUTPUT PLUGINS section:
# Configuration for CrateDB to send metrics to.
[[outputs.cratedb]]
  # A github.com/jackc/pgx connection string.
  # See https://godoc.org/github.com/jackc/pgx#ParseDSN
  url = "postgres://user:password@localhost/schema?sslmode=disable"
  # Timeout for all CrateDB queries.
  timeout = "5s"
  # Name of the table to store metrics in.
  table = "metrics"
  # If true, and the metrics table does not exist, create it automatically.
  table_create = true
This is the section that will allow you to configure the one output plugin we have specified: CrateDB.
For our purposes, modify this so you have:
# Configuration for CrateDB to send metrics to.
[[outputs.cratedb]]
  # A github.com/jackc/pgx connection string.
  # See https://godoc.org/github.com/jackc/pgx#ParseDSN
  url = "postgres://crate@localhost/doc?sslmode=disable"
  # Timeout for all CrateDB queries.
  timeout = "5s"
  # Name of the table to store metrics in.
  table = "metrics"
  # If true, and the metrics table does not exist, create it automatically.
  table_create = true
What we've done here is configured the connection string so we're connecting to CrateDB on localhost as the crate user, and we're using the default doc schema.
Note that we're using the PostgreSQL wire protocol, as this is one of CrateDB's client interfaces.
Notice also that we've left table_create = true, and this means that when we start up Telegraf, it will create the necessary table for us in CrateDB.
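For reference, the table Telegraf creates looks roughly like the following. This is a sketch based on the CrateDB output plugin's documented behavior; the exact DDL may vary between Telegraf versions, so check the plugin README for your version if the details matter:

```sql
-- Approximate shape of the table created when table_create = true.
-- Metric tags and fields are stored as dynamic object columns.
CREATE TABLE IF NOT EXISTS "metrics" (
    "hash_id" LONG INDEX OFF,
    "timestamp" TIMESTAMP,
    "name" STRING,
    "tags" OBJECT(DYNAMIC),
    "fields" OBJECT(DYNAMIC),
    PRIMARY KEY ("timestamp", "hash_id")
);
```

You never need to run this yourself for the demo; Telegraf issues it on startup when the table is missing.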
All that's left to do now is actually start up Telegraf.
You can do that with this command:
$ telegraf --config telegraf.conf
You should see a few status messages printed to the terminal.
Start Playing With Your Data in CrateDB
Bring up the CrateDB admin UI, and navigate to the tables browser by selecting the tables browser icon from the left-hand navigation menu.
You should see your new metrics table.
And if you select QUERY TABLE and then EXECUTE QUERY on the resulting screen, you should have a table of rows that looks like this:
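If you prefer typing queries yourself, something like this in the admin UI console will show the most recent rows. (The tags and fields column names follow the table layout the plugin creates; adjust them if your table differs.)

```sql
-- Peek at the five most recent measurements.
select "timestamp", name, tags['cpu'], fields['usage_user']
from metrics
order by "timestamp" desc
limit 5;
```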
Visualize Your Data
From here, you can start to slice and dice your data however you wish. And of course, visualize it.
The CrateDB admin UI is not designed for visualization, so I am going to use SQLPad (which I show you how to set up and run in a previous post).
Here's a simple query:
select date_trunc('minute', "timestamp") as "time",
       avg(fields['usage_user']) as "user"
from metrics
group by "time"
order by "time";
What we're doing here is:
- Chunking user CPU utilization into minute-long buckets
- Outputting the average value across the whole minute
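You can vary this pattern easily. For example, with the default percpu = true setting, each core reports its own series, distinguished by the tags['cpu'] value, alongside an aggregate series. A sketch of a query that restricts the aggregation to the combined figure (assuming the aggregate series is tagged cpu-total, as is typical for the Telegraf cpu plugin):

```sql
-- Minute-by-minute average of system CPU time, using only the
-- aggregate series rather than per-core measurements.
select date_trunc('minute', "timestamp") as "time",
       avg(fields['usage_system']) as "system"
from metrics
where tags['cpu'] = 'cpu-total'
group by "time"
order by "time";
```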
And when you plug the right graph configuration into SQLPad, you should see something that looks like this:
Telegraf can gather data from machines, third-party APIs, and IoT sensors, and can even consume from systems like StatsD and Kafka. From there, you can process and aggregate that data before sending it to CrateDB.
By pairing Telegraf with CrateDB, a powerful distributed SQL database, you can store and analyze massive amounts of that collected machine data in real-time.