Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications that builds on 15 years of experience running production workloads at Google.

Thanks to our Docker image, managing your CrateDB cluster with Kubernetes requires only a few steps.

This guide assumes that you already have kubectl and a Kubernetes cluster ready to go. All of the commands, templates, and so on provided here should be customized to meet your own requirements!

The CrateDB Template

At CrateDB we are proud of our community, and our user ‘cedboss’ provided us with a template for deploying CrateDB on Kubernetes. Building on that draft, we developed a more mature version:

apiVersion: v1
kind: Service
metadata:
  name: crate-discovery
  labels:
    app: crate
spec:
  ports:
  - port: 4200
    name: crate-web
  - port: 4300
    name: cluster
  - port: 5432
    name: postgres
  type: LoadBalancer
  selector:
    app: crate
---
apiVersion: "apps/v1beta1"
kind: StatefulSet
metadata:
  name: crate
spec:
  serviceName: "crate-db"
  replicas: 3
  template:
    metadata:
      labels:
        app: crate
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      initContainers:
      - name: init-sysctl
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: crate
        image: crate:latest
        command:
          - /docker-entrypoint.sh
          - -Ccluster.name=${CLUSTER_NAME}
          - -Cdiscovery.zen.hosts_provider=srv
          - -Cdiscovery.zen.minimum_master_nodes=2
          - -Cdiscovery.srv.query=_cluster._tcp.crate-discovery.default.svc.cluster.local
          - -Cgateway.recover_after_nodes=2
          - -Cgateway.expected_nodes=${EXPECTED_NODES}
        volumeMounts:
          - mountPath: /data
            name: data
        resources:
          limits:
            memory: 2Gi
        ports:
        - containerPort: 4200
          name: db
        - containerPort: 4300
          name: cluster
        - containerPort: 5432
          name: postgres
        env:
        # Half the available memory.
        - name: CRATE_HEAP_SIZE
          value: "1g"
        - name: EXPECTED_NODES
          value: "3"
        - name: CLUSTER_NAME
          value: "my-cratedb-cluster"
      volumes:
        - name: data
          emptyDir:
            medium: "Memory"

Init Containers

Init containers are containers that run before the main container of the pod, and are intended to accomplish some prerequisite task before the pod can be fully started. In this template’s case, the init container sets a value for vm.max_map_count, which is required for the bootstrap checks to pass.
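To double-check that the init container took effect, you can read the setting back from inside a running pod. This assumes the first pod is named crate-0, which is the default naming scheme for a StatefulSet named crate:

```shell
# Read vm.max_map_count from inside the first CrateDB pod.
# The value should match what the init container set: 262144.
kubectl exec crate-0 -- sysctl -n vm.max_map_count
```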

Start CrateDB

This template utilizes the Kubernetes StatefulSet type, which allows you to add SRV records for node DNS discovery, as well as a graceful shutdown when scaling. While CrateDB would also work as a stateless pod (cattle), a StatefulSet facilitates things like persistent storage and shard migration during zero-downtime upgrades. To create a new cluster based on this template, use:

kubectl create -f crate-kubernetes.yml

Find out the public IP and the services that have been created with:

kubectl get services
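If you only need the load balancer's external IP, jsonpath output can extract it directly. The service name crate-discovery below comes from the template above; adjust it if you renamed the service:

```shell
# Print just the external IP of the crate-discovery load balancer.
kubectl get service crate-discovery \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```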

Scale CrateDB

Next, you can scale your cluster by scaling the number of replicas of the StatefulSet to the desired number of CrateDB nodes:

kubectl scale statefulsets crate --replicas=4

After increasing or decreasing the number of pod replicas, CrateDB’s configuration should be adjusted accordingly, especially when scaling down, or your cluster might disappear! There are three important settings in this context:


The settings gateway.recover_after_nodes and gateway.expected_nodes tell the cluster its intended size and whether it should recover or rather accept the provided cluster state on startup. Changing these values therefore requires a cluster restart, and a misconfiguration will cause a cluster check to fail. However, the cluster will still continue to function properly.

discovery.zen.minimum_master_nodes, on the other hand, influences how the master nodes are elected. A value greater than the number of nodes will disband the cluster, whereas a value that is too small might lead to a split-brain scenario. This is why it is necessary to adjust this setting when scaling, by issuing:

SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes = 5

… where 5 is one more than half the actual number of nodes in the cluster, i.e. a quorum. To monitor the available pods, you can use kubectl proxy, which starts a web server with log outputs and a list of the available pods.
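The quorum arithmetic is easy to get wrong when scaling, so it can help to make the rule explicit. This is a plain shell sketch of the "half the nodes, plus one" formula; the node count is an assumption you substitute with your own cluster size:

```shell
#!/bin/sh
# Compute minimum_master_nodes as a quorum: floor(nodes / 2) + 1.
nodes=4                  # assumed: the StatefulSet was scaled to 4 replicas
quorum=$(( nodes / 2 + 1 ))
echo "$quorum"           # prints 3 for a 4-node cluster
```

The resulting value is what you would then set via SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes.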

Customize CrateDB

In this template, CrateDB is configured using the -C parameters of the executable. Every setting from CrateDB’s YAML configuration file is available there, which allows for greater flexibility than passing around config files.


Many container hosting services offer spinning disks as the default storage medium, which is often too slow for CrateDB. In addition, replication only makes sense if the data resides on separate physical media, both to achieve high availability and to avoid potential bottlenecks when several clusters attach to the same storage. Since Persistent Volume Claims depend on the Kubernetes cluster (and on whether it runs in the cloud), a volume template is currently not included in our template. You can familiarize yourself with them here. For now, the volumes are stated explicitly; to customize them accordingly, the volume guide is very helpful.
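As a starting point, a StatefulSet can request persistent storage through a volumeClaimTemplates section instead of the in-memory emptyDir volume used above. The following is a sketch, not part of the original template; the storage class name standard and the 10Gi size are assumptions you should adapt to your provider:

volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard   # assumed; use your cluster's storage class
      resources:
        requests:
          storage: 10Gi            # assumed size

With this in place, Kubernetes provisions one PersistentVolumeClaim per pod (data-crate-0, data-crate-1, ...), and the claims survive pod restarts.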

Manual Approach

Create a Kubernetes ‘pod’: a group of containers tied together for administration and networking. In this case, use the CrateDB Docker image and expose the ports that CrateDB uses:

kubectl run crate-cluster --image=crate:latest --port=4200 --port=4300 --port=5432

Expose the pod to the outside world with a dedicated Kubernetes service:

kubectl expose deployment crate-cluster --type="LoadBalancer"

And use the following command to get the status of the service you just created:

kubectl get services crate-cluster

Scale a CrateDB pod with:

kubectl scale deployment crate-cluster --replicas=3