Cassandra is one of my favorite databases when I need something fast, scalable, and distributed. When it came time to deploy Cassandra for our microservices, I almost went the old route of creating virtual machines or provisioning bare-metal servers. But our microservices were already deployed in IBM Cloud Kubernetes clusters, so ideally the Cassandra clusters would be too. It turns out deploying Cassandra was surprisingly easy, especially with dynamically provisioned block storage (more on that later).

This article assumes you already have a Kubernetes cluster created in IBM Cloud. If you haven't already installed and configured kubectl, go here to get that sorted out. You'll want to be able to run kubectl commands against the cluster.

Create a Headless Service

A normal Service in Kubernetes lets you load balance between the pods behind a single service IP, and creates a DNS entry for you. A headless service in Kubernetes is a service definition that doesn't have a service IP. This is useful when you don't need load balancing for a service but would rather have a set of A records, one pointing to each individual pod. Since Cassandra doesn't need load balancing behind a single IP (your Cassandra client connects to the nodes directly), there's no need to create a regular service. With a headless service, you can point your Cassandra client at cassandra.data.svc.cluster.local.

apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
  namespace: data
spec:
  clusterIP: None
  ports:
    - port: 9042
  selector:
    app: cassandra

This effectively creates a DNS entry for cassandra.data.svc.cluster.local, with an A record for each Kubernetes pod carrying the app: cassandra label. This makes configuring the Cassandra client easy: just point it at cassandra.data.svc.cluster.local.
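Clients that want to address individual nodes can also rely on the stable per-pod DNS names that StatefulSet pods get behind a headless service. A minimal sketch of how those names are composed (the helper function here is illustrative, not from any client library):

```python
# Per-pod DNS names behind a headless service follow the pattern:
#   <pod>.<service>.<namespace>.svc.cluster.local
def pod_dns_names(statefulset, service, namespace, replicas):
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.cluster.local"
        for i in range(replicas)
    ]

# For the 3-node cluster deployed in this article:
contact_points = pod_dns_names("cassandra", "cassandra", "data", 3)
# contact_points[0] == "cassandra-0.cassandra.data.svc.cluster.local"
```

This is also why the seed node can be referenced by a fixed name later in the StatefulSet definition.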

Create the StatefulSet

A StatefulSet is like a Deployment in Kubernetes, but instead of giving each Pod a randomly generated name, it gives each Pod a stable, sticky identity: Kubernetes assigns an ordinal index (starting at zero) to each Pod and uses that index in the Pod name, and a replacement Pod keeps the same name and storage as the one it replaces. This is one of the reasons we're using a StatefulSet instead of a Deployment. We want to tell our Cassandra clients that the nodes are located at cassandra-0, cassandra-1, and cassandra-2, and having predictable Pod names makes this much easier. Here's the StatefulSet definition as a whole.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
        - name: cassandra
          image: cassandra:3.11
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          lifecycle:
            preStop:
              exec:
                command: 
                - /bin/sh
                - -c
                - nodetool drain
          env:
            - name: CASSANDRA_SEEDS
              value: cassandra-0.cassandra.data.svc.cluster.local
            - name: MAX_HEAP_SIZE
              value: 1024M
            - name: HEAP_NEWSIZE
              value: 100M
            - name: CASSANDRA_CLUSTER_NAME
              value: "Cassandra"
            - name: CASSANDRA_DC
              value: "DAL10"
            - name: CASSANDRA_RACK
              value: "Rack-1"
            - name: CASSANDRA_ENDPOINT_SNITCH
              value: GossipingPropertyFileSnitch
          volumeMounts:
            - name: cassandra-data
              mountPath: /var/lib/cassandra
  volumeClaimTemplates:
    - metadata:
        name: cassandra-data
        labels:
          billingType: "hourly"
      spec:
        storageClassName: "ibmc-block-silver"
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi

Most of this will look like a Deployment definition, and you can figure out what settings you need for your setup. The more interesting part is the volumeClaimTemplates section, which specifies how volumes are created for each Pod. Normally you'd have to allocate storage from somewhere, make that storage available as a Persistent Volume (PV) in Kubernetes, claim part of that PV by creating a Persistent Volume Claim (PVC), and then reference that PVC as a volume in your Deployment. With a StatefulSet, you just specify what kind of storage you need and how much of it for each Pod. The StatefulSet then creates a PVC for each Pod, and that claim is made available as a volume to be referenced in the container spec. So in the above definition, 3 PVCs will be created: PVC cassandra-data-cassandra-0 will be claimed by Pod cassandra-0, PVC cassandra-data-cassandra-1 by Pod cassandra-1, and so on.
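The PVC names follow a fixed convention, which is handy to know when you need to find or clean up the claims later. A quick sketch of the pattern (the helper function is illustrative):

```python
# StatefulSet PVC names follow the pattern:
#   <volumeClaimTemplate name>-<statefulset name>-<ordinal>
def pvc_names(template, statefulset, replicas):
    return [f"{template}-{statefulset}-{i}" for i in range(replicas)]

pvc_names("cassandra-data", "cassandra", 3)
# ["cassandra-data-cassandra-0",
#  "cassandra-data-cassandra-1",
#  "cassandra-data-cassandra-2"]
```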

So where does the storage for the PVC come from? Normally, you would need to allocate some storage and register it as a Persistent Volume within Kubernetes. With IBM Cloud, you can enable a plugin that allocates the block storage for you. Notice in the definition above we specified storageClassName: "ibmc-block-silver". This tells Kubernetes to invoke the plugin, which provisions block storage (with the IOPS of the chosen class) on the fly and adds it to Kubernetes as a PV. So for the above to work, we need to enable that plugin.

Installing the IBM Cloud Block Storage Plug-in

If you haven't used Helm yet on your IBM Cloud Kubernetes cluster, follow these instructions before proceeding.

Okay, so you have Helm installed? Good, now run the following commands:

$ helm install ibm/ibmcloud-block-storage-plugin
$ kubectl get storageclasses | grep block

This installs the plugin and shows you the block storage classes now available. You can find more detailed information about each of the storage classes here.

Putting it All Together

apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
  namespace: data
spec:
  clusterIP: None
  ports:
    - port: 9042
  selector:
    app: cassandra
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
        - name: cassandra
          image: cassandra:3.11
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          lifecycle:
            preStop:
              exec:
                command: 
                - /bin/sh
                - -c
                - nodetool drain
          env:
            - name: CASSANDRA_SEEDS
              value: cassandra-0.cassandra.data.svc.cluster.local
            - name: MAX_HEAP_SIZE
              value: 1024M
            - name: HEAP_NEWSIZE
              value: 100M
            - name: CASSANDRA_CLUSTER_NAME
              value: "Cassandra"
            - name: CASSANDRA_DC
              value: "DAL10"
            - name: CASSANDRA_RACK
              value: "Rack-1"
            - name: CASSANDRA_ENDPOINT_SNITCH
              value: GossipingPropertyFileSnitch
          volumeMounts:
            - name: cassandra-data
              mountPath: /var/lib/cassandra
  volumeClaimTemplates:
    - metadata:
        name: cassandra-data
        labels:
          billingType: "hourly"
      spec:
        storageClassName: "ibmc-block-retain-silver"
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi

Important Note

Be careful about which storage class you use. If you list the storage classes, you'll notice there are two variants of each:
$ kubectl get storageclasses | grep block
ibmc-block-bronze          ibm.io/ibmc-block   26d
ibmc-block-custom          ibm.io/ibmc-block   26d
ibmc-block-gold            ibm.io/ibmc-block   26d
ibmc-block-retain-bronze   ibm.io/ibmc-block   26d
ibmc-block-retain-custom   ibm.io/ibmc-block   26d
ibmc-block-retain-gold     ibm.io/ibmc-block   26d
ibmc-block-retain-silver   ibm.io/ibmc-block   26d
ibmc-block-silver          ibm.io/ibmc-block   26d

Each flavor (bronze, silver, gold, etc.) has a 'retain' version. With the 'retain' classes, when the PVC is deleted, the PV and the physical storage device provisioned in your IBM Cloud Infrastructure account stick around for you to mount and reuse. This is important because if you don't use the 'retain' classes and the PVC is deleted, the underlying PV and physical storage device are automatically deleted, along with your data!
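Under the hood, this difference shows up in the storage class's reclaim policy. A simplified sketch of the relevant field (abbreviated; the real class definitions carry additional IBM-specific parameters):

```yaml
# Non-retain classes delete the PV and underlying device along with the PVC:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ibmc-block-silver
provisioner: ibm.io/ibmc-block
reclaimPolicy: Delete
---
# Retain classes keep the PV and device around after the PVC is deleted:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ibmc-block-retain-silver
provisioner: ibm.io/ibmc-block
reclaimPolicy: Retain
```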
