Real-time edge-cloud data warehouse for industrial IoT
Senseforce.io was founded in 2016 to provide industrial machinery manufacturers with a predictive maintenance platform for the connected machines they produce.
The real-time insights generated by Senseforce help the machinery manufacturers provide preventative maintenance to customers, identify upsell opportunities and improve product planning.
The founding team of developers and data scientists had prior experience developing solutions like Senseforce on a bespoke basis. The challenges they faced developing Senseforce included:
- Flexibility to handle any kind of machinery data, without requiring custom development
- Scalability to handle data from thousands of connected machines
They developed the Senseforce industrial edge cloud platform working closely with a range of pilot customers, including Künz (industrial cranes) and IMA Schelling (industrial saws), which gave them exposure to different machine data sets and analytic requirements. The Senseforce platform includes the following components:
- Senseforce Luna – runs on devices at customer sites to collect data from sensors, PLCs, and other edge systems. It standardizes, compresses, and encrypts the data before sending it via MQTT (HiveMQ) to a central, cloud-based repository.
- Senseforce Galaxy – built on CrateDB, Galaxy is a distributed data warehouse capable of scaling elastically, running 24×7, and supporting a wide variety of analytics, including time series, predictive, and ad-hoc. Galaxy includes a range of pre-defined analyses based on the Senseforce team’s many years of experience working with machine-generated data.
- Senseforce Rocket – the data visualization component. Customizable dashboards allow different stakeholders at the machine manufacturers to create their own specific view on the company’s data universe.
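The Luna steps listed above (standardize, compress, then send) can be sketched as follows. This is a minimal illustration under stated assumptions, not Senseforce's implementation: the message schema, the `standardize`, `pack`, and `unpack` helpers, and the sample reading are all hypothetical. Encryption and the MQTT publish are left out because they depend on third-party libraries; in a real deployment the compressed bytes would be encrypted or sent over a TLS-secured MQTT connection.

```python
import json
import zlib
from datetime import datetime, timezone


def standardize(raw: dict, machine_id: str) -> dict:
    """Map a vendor-specific reading onto one common envelope (hypothetical schema)."""
    return {
        "machine_id": machine_id,
        "ts": datetime.fromtimestamp(raw["time"], tz=timezone.utc).isoformat(),
        "event_type": raw.get("type", "unknown"),
        # Everything that is not part of the envelope travels as free-form attributes.
        "attributes": {k: v for k, v in raw.items() if k not in ("time", "type")},
    }


def pack(message: dict) -> bytes:
    """Serialize to compact JSON and compress; encryption would be applied to these bytes."""
    body = json.dumps(message, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return zlib.compress(body, level=9)


def unpack(payload: bytes) -> dict:
    """Inverse of pack(), as the cloud side would run it after decryption."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


# Hypothetical reading from an edge device (Unix timestamp plus sensor values).
raw_reading = {"time": 1500000000, "type": "spindle", "rpm": 2950, "temp_c": 61.5}
message = standardize(raw_reading, machine_id="saw-001")
payload = pack(message)
```

Keeping a fixed envelope (`machine_id`, `ts`, `event_type`) with open-ended attributes is one way to accept any kind of machinery data without custom development per machine type.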
Senseforce is developed in C# on Windows using the .NET Framework. Predictive models are written in R. The platform is deployed to bare-metal CentOS Linux systems because bare metal gives the team 2x better performance than running on virtual machines. Docker Swarm operates and scales the system and automatically load-balances connections.
“CrateDB, the best fit” for Senseforce
Senseforce chose CrateDB because it was the best fit for their real-time requirements.
Senseforce required a database capable of processing a wide variety of machine data structures at large volume. A single saw generates 50,000 messages per day (1.2 GB/day), which customers can retain for as long as they like (usually many months’ worth).
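To put those figures in perspective, the short calculation below derives the average message size and the raw volume that accumulates over a retention period. Only the 50,000 messages and 1.2 GB per day come from the text. The fleet size of 1,000 machines and the 180-day retention are illustrative assumptions, gigabytes are taken as decimal (10^9 bytes), and replication and compression are ignored.

```python
MESSAGES_PER_DAY = 50_000   # per saw, from the text
BYTES_PER_DAY = 1.2e9       # 1.2 GB/day per saw, decimal gigabytes assumed

# Average payload per message: 1.2e9 / 50,000 = 24,000 bytes (about 24 KB).
avg_message_bytes = BYTES_PER_DAY / MESSAGES_PER_DAY

# Average message rate for one machine, in messages per second.
messages_per_second = MESSAGES_PER_DAY / 86_400


def raw_volume_gb(machines: int, retention_days: int) -> float:
    """Raw data retained, in GB, before any replication or compression."""
    return machines * retention_days * BYTES_PER_DAY / 1e9


# Assumed scenarios: one saw, and a hypothetical fleet of 1,000 similar machines,
# each keeping 180 days of data.
one_saw_gb = raw_volume_gb(machines=1, retention_days=180)      # 216 GB
fleet_gb = raw_volume_gb(machines=1_000, retention_days=180)    # 216,000 GB = 216 TB
```

Under these assumptions a single machine already retains a few hundred gigabytes, and a fleet of a thousand reaches hundreds of terabytes, which is why horizontal scaling matters here.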
Queries needed to perform fast groupings and aggregations on the fly, as data is ingested. Scaling horizontally on Docker was also important to Senseforce in order to react quickly to data spikes and provide high availability.
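The idea of grouping and aggregating while data is still arriving can be illustrated with running statistics per group. The sketch below is conceptual only: CrateDB does this work in SQL at query time, and the `RunningStats` class, the `ingest` function, and the grouping key (machine and hour) are hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Aggregates that can be updated one reading at a time."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


# One RunningStats per (machine_id, hour) group, created on first use.
stats = defaultdict(RunningStats)


def ingest(machine_id: str, ts: str, value: float) -> None:
    """Update the group for this reading; ts is an ISO 8601 string."""
    hour = ts[:13]  # "YYYY-MM-DDTHH" identifies the hourly bucket
    stats[(machine_id, hour)].add(value)


# Hypothetical readings arriving in order.
ingest("saw-001", "2017-07-14T02:40:00", 60.0)
ingest("saw-001", "2017-07-14T02:41:00", 62.0)
ingest("crane-001", "2017-07-14T02:41:30", 10.0)
```

Because each update touches only one group, results are available immediately after every ingested reading rather than after a batch job.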
“We did some database evaluations before we started the project. Apache Cassandra struggled to support datasets with a large number of columns. RavenDB, MongoDB, and InfluxDB also ran into scaling issues because of the large cardinalities within our data. CrateDB was the only database that met our performance and scaling needs.”
Handling Reference Data
One challenge Senseforce faced was reference data. In a data warehouse, transaction data (sensor readings) is analyzed in the context of reference data such as customer, factory location, machine type, and operator name. Reference data is often managed in multiple tables that are joined to transactions via a star or snowflake schema. The downside is that every query then needs several SQL joins, which slows down query performance.
Senseforce solved this problem by maintaining reference data in a separate SQL Server database. Changes in SQL Server trigger a process in Senseforce that exports the reference data changes to CrateDB. In the process, the data is transformed and stored in a single, flattened reference data table that is inner-joined to the raw machine readings (transactions).
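The flattening approach described above can be sketched with Python's built-in `sqlite3` module standing in for both SQL Server and CrateDB. All table names, column names, and sample rows are hypothetical, and `refresh_flat_reference` is an illustrative helper rather than part of Senseforce. The point is only that the multi-table joins run once, when reference data changes, so analytic queries need a single inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized reference data, as it might live in the source database.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE factory  (factory_id INTEGER PRIMARY KEY, customer_id INTEGER, location TEXT);
    CREATE TABLE machine  (machine_id TEXT PRIMARY KEY, factory_id INTEGER, machine_type TEXT);

    -- Raw machine readings (transactions).
    CREATE TABLE reading (machine_id TEXT, ts TEXT, metric TEXT, value REAL);

    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Borealis');
    INSERT INTO factory  VALUES (10, 1, 'Graz'), (20, 2, 'Linz');
    INSERT INTO machine  VALUES ('saw-001', 10, 'saw'), ('crane-001', 20, 'crane');
    INSERT INTO reading  VALUES
        ('saw-001',   '2017-07-14T02:40:00', 'temp_c', 60.0),
        ('saw-001',   '2017-07-14T02:41:00', 'temp_c', 62.0),
        ('crane-001', '2017-07-14T02:41:30', 'load_t', 10.0);
""")


def refresh_flat_reference(conn: sqlite3.Connection) -> None:
    """Rebuild the flattened table; would run whenever reference data changes."""
    with conn:
        conn.execute("DROP TABLE IF EXISTS reference_flat")
        conn.execute("""
            CREATE TABLE reference_flat AS
            SELECT m.machine_id, m.machine_type, f.location, c.customer_name
            FROM machine m
            JOIN factory f  ON f.factory_id = m.factory_id
            JOIN customer c ON c.customer_id = f.customer_id
        """)


refresh_flat_reference(conn)

# Analytic query: one inner join against the flat table instead of three joins.
QUERY = """
    SELECT ref.customer_name, ref.machine_type,
           COUNT(*) AS readings, AVG(r.value) AS avg_value
    FROM reading r
    INNER JOIN reference_flat ref ON ref.machine_id = r.machine_id
    GROUP BY ref.customer_name, ref.machine_type
    ORDER BY ref.customer_name, ref.machine_type
"""
rows = conn.execute(QUERY).fetchall()
```

The trade-off is that the flat table duplicates reference values and must be refreshed on every change, which is acceptable when reference data changes far less often than readings arrive.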
The first version of Senseforce runs on a 5-node CrateDB cluster and was developed and rolled out in less than 12 months. The system is extremely flexible and customer-friendly:
- Any user can browse the data without DBMS or SQL expertise
- Aggregations, groupings, and sorting all happen quickly
- An easy drag-and-drop UI lets users build new reports and dashboards by selecting event types and attributes such as time ranges and KPIs
Enabled in part by the speed, scalability, and adaptability of CrateDB, Senseforce is growing fast. Its pilot customers were based in Europe, and the company is already starting to expand into North America.
“We’re solving a hard problem for machine manufacturers. The bigger their customer bases grow, the more important, and the more difficult, it becomes to connect with and support them. Senseforce eliminates that burden so they can gain clearer insight into product usage and identify new service and upsell opportunities.”