Kubernetes

Kubernetes is an open-source system for automating the deployment, scaling, and management of containerized applications. It builds on 15 years of experience running production workloads at Google.

Thanks to our Docker image, managing your CrateDB cluster with Kubernetes requires only a few steps.

This guide will assume that you already have kubectl and a Kubernetes cluster ready to go. All of the provided commands, templates, and so on should be customized to meet your requirements!

The CrateDB Template

We are proud of our community here at CrateDB. Our user cedboss provided us with a draft template for deploying CrateDB with Kubernetes, and building on that draft we developed a more mature version:

apiVersion: v1
kind: Service
metadata:
  name: crate-discovery
  labels:
    app: crate
spec:
  ports:
  - port: 4200
    name: crate-web
  - port: 4300
    name: cluster
  - port: 5432
    name: postgres
  type: LoadBalancer
  selector:
    app: crate
---
apiVersion: "apps/v1beta1"
kind: StatefulSet
metadata:
  name: crate
spec:
  serviceName: "crate-db"
  replicas: 3
  template:
    metadata:
      labels:
        app: crate
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      initContainers:
      - name: init-sysctl
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: crate
        image: crate:latest
        command:
          - /docker-entrypoint.sh
          # Kubernetes expands $(VAR) references using the env section below.
          - -Ccluster.name=$(CLUSTER_NAME)
          - -Cdiscovery.zen.hosts_provider=srv
          - -Cdiscovery.zen.minimum_master_nodes=2
          - -Cdiscovery.srv.query=_cluster._tcp.crate-discovery.default.svc.cluster.local
          - -Cgateway.recover_after_nodes=2
          - -Cgateway.expected_nodes=$(EXPECTED_NODES)
        volumeMounts:
        - mountPath: /data
          name: data
        resources:
          limits:
            memory: 2Gi
        ports:
        - containerPort: 4200
          name: db
        - containerPort: 4300
          name: cluster
        - containerPort: 5432
          name: postgres
        env:
        # Half the available memory.
        - name: CRATE_HEAP_SIZE
          value: "1g"
        - name: EXPECTED_NODES
          value: "3"
        - name: CLUSTER_NAME
          value: "my-cratedb-cluster"
      volumes:
        - name: data
          emptyDir:
            medium: "Memory"

Init Containers

Init containers are containers that run before the main container of the pod, and are intended to accomplish some prerequisite task before the pod can be fully started. In this template’s case, the init container sets a value for vm.max_map_count, which is required for the bootstrap checks to pass.
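Once a pod is running, you can verify that the setting took effect by reading it back from inside the container. A quick sanity check, assuming the default StatefulSet pod naming (crate-0):

kubectl exec crate-0 -- cat /proc/sys/vm/max_map_count

This should print 262144, the value set by the init container.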

Start CrateDB

This template utilizes the Kubernetes StatefulSet type, which allows you to add SRV records for node DNS discovery, as well as graceful shutdowns when scaling. While CrateDB would also work as a stateless pod (cattle), a StatefulSet facilitates things like persistent storage and shard migration during zero-downtime upgrades.

To create a new cluster based on this template (saved here as crate-kubernetes.yml), use:

kubectl create -f crate-kubernetes.yml
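The pods are created one at a time, in order. You can watch them come up using the app=crate label from the template:

kubectl get pods -l app=crate -w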

Find out the public IP and the services that have been created with:

kubectl get services

If it is okay for this service to be exposed externally, specifying the service type as LoadBalancer is sufficient. An external load balancer then routes to the automatically created NodePort and ClusterIP services.

However, if it is not acceptable to expose the database on an external (public) IP, we recommend specifying the type as NodePort. This means that the service is exposed on each node’s IP as a static port.
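A minimal sketch of this, reusing the service definition from the template above with only the type changed:

apiVersion: v1
kind: Service
metadata:
  name: crate-discovery
  labels:
    app: crate
spec:
  ports:
  - port: 4200
    name: crate-web
  - port: 4300
    name: cluster
  - port: 5432
    name: postgres
  type: NodePort
  selector:
    app: crate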

See also

The Kubernetes documentation has more information about services.

Scale CrateDB

Next, you can scale your cluster by setting the number of replicas of the StatefulSet to the desired number of CrateDB nodes:

kubectl scale statefulsets crate --replicas=4

After increasing or decreasing the number of pod replicas, CrateDB's configuration should be adjusted to match, especially when scaling down, or your cluster might disappear! There are three important settings in this context:

discovery.zen.minimum_master_nodes=2
gateway.recover_after_nodes=2
gateway.expected_nodes=3

The recover_after_nodes and expected_nodes settings tell the cluster its intended size, so that on startup it knows whether it should recover the cluster state or keep waiting for more nodes to join.

Changing these values therefore requires a cluster restart, and a misconfiguration will cause a cluster check to fail. However, the cluster will still continue to function properly.

The discovery.zen.minimum_master_nodes setting, on the other hand, influences how master nodes are elected. A value greater than the number of nodes in the cluster will disband the cluster, whereas a value that is too small might lead to a split-brain scenario.

For these reasons, it is necessary to adjust this number when scaling.
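For example, when scaling from three to four nodes, a reasonable set of values would be (the choice of recover_after_nodes is a judgment call; here it is set to the quorum):

discovery.zen.minimum_master_nodes=3
gateway.recover_after_nodes=3
gateway.expected_nodes=4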

The minimum_master_nodes setting can be adjusted at runtime, like so:

SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes = 5

The value should always be one more than half the actual number of nodes in the cluster; here, 5 would be correct for a cluster of eight nodes.
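You can run this statement from any client connected to the cluster, for example with the crash shell that ships with the CrateDB Docker image. For the four-node example above (pod name crate-0 assumed):

kubectl exec -it crate-0 -- crash -c "SET GLOBAL PERSISTENT discovery.zen.minimum_master_nodes = 3"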

To monitor the available pods, you can use kubectl proxy, which starts a local proxy to the Kubernetes API server, through which you can fetch log output and a list of the available pods.
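For example, with the proxy running on its default port, the pod list is available on localhost:

kubectl proxy &
curl http://localhost:8001/api/v1/namespaces/default/pods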

Customize CrateDB

In this template, CrateDB is configured using -C parameters passed to the executable. Every setting from CrateDB's configuration file is available this way, which allows for more flexibility than passing config files around.
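Configuration file settings map one-to-one onto -C flags. For example, the cluster name from the template above:

# In crate.yml this would be:
#   cluster.name: my-cratedb-cluster
# As a command-line argument it becomes:
-Ccluster.name=my-cratedb-cluster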

Storage

The above example makes use of in-memory storage. This is only suggested for testing CrateDB with Kubernetes.

Warning

We do not recommend in-memory storage for a production database. All data from this directory will be lost when a pod is deleted.

See also

The Kubernetes documentation has more information about in-memory storage.

For durable storage, persistent volumes (PV) should be used.

A volume is bound and mounted to each CrateDB pod using volumeClaimTemplates within the StatefulSet configuration.

volumeClaimTemplates:
- metadata:
    name: crate-persistent-storage
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
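When using volumeClaimTemplates, the container's volumeMounts should reference the claim by name, replacing the emptyDir-backed data volume from the template above (which can then be removed):

volumeMounts:
- mountPath: /data
  name: crate-persistent-storage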

The lifecycle of each PV is independent of any pod that makes use of it. When a pod (or group of pods) is destroyed, the data persists on these volumes and is reattached to the pods when they re-spawn.
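You can inspect the claims the StatefulSet creates (one per pod, named after the claim template and the pod) with:

kubectl get persistentvolumeclaims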

Many container hosting services offer spinning disks as the default storage medium, and the performance of spinning disks can vary between cloud providers. More importantly, replication only achieves high availability if the data is kept on separate physical media; attaching several nodes to the same storage also creates a potential bottleneck.

See also

The Kubernetes examples documentation has more information about provisioning persistent volumes with a cloud provider.

The Kubernetes docs have more information on Volumes and StatefulSets.

Volumes can be customized as required. Consult the Kubernetes volume guide for more help.

Manual Approach

Create a Kubernetes deployment, which manages a pod: a group of containers tied together for administration and networking. In this case, use the CrateDB Docker image and expose the ports that CrateDB uses.

kubectl run crate-cluster --image=crate:latest --port=4200 --port=4300 --port=5432

Expose the deployment to the outside world with a Kubernetes service:

kubectl expose deployment crate-cluster --type="LoadBalancer"

And use the following command to get the status of the service you just created:

kubectl get services crate-cluster

Scale a CrateDB pod with:

kubectl scale deployment crate-cluster --replicas=3
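To verify the new replicas, you can use the run=crate-cluster label that kubectl run applies to the pods it creates:

kubectl get pods -l run=crate-cluster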