TKGS – Quick Tip – How to enable Change Block Tracking (CBT) in vSphere with Tanzu + NSX-ALB

Helloooo!

Lately I was working for a customer who wanted to take Incremental Backups of the Persistent Volumes in their Tanzu Kubernetes Clusters. They were using Kasten K10 to back up the Tanzu Kubernetes environment. Kasten uses the vSphere API for Incremental Backups and relies on the Change Block Tracking feature being enabled on the Virtual Machine.

When you are running vSphere with Tanzu + NSX-T, you can enable the Velero vSphere Operator as a Supervisor Service, which means it will run as vSphere Pods. The official instructions can be found in Kasten’s Online Documentation.

However, when you have vSphere with Tanzu + NSX-ALB, it is not yet possible to enable the Velero vSphere Operator, as mentioned in the GitHub documentation.

So how do we enable it? Let’s dive in!

  1. General Information
  2. Warning!
  3. So how do we enable CBT on vSphere with Tanzu + NSX-ALB then?
  4. Potential Problems?
  5. Potential Workarounds?

General Information

What does Kasten Want?

  • Kasten K10 uses the vSphere Change Block Tracking (CBT) API to take Incremental Backups of Persistent Volumes. CBT therefore has to be enabled on a VM before Kasten can take incremental backups of it.
  • CBT can be enabled on a per VM basis in vCenter.

How do we enable CBT on non-Tanzu VMs (so regular VMs)?

  • On a regular VM, CBT is enabled by adding the advanced configuration parameter ‘ctkEnabled’ (set to TRUE) to the VM in vCenter.

Problem:

  • The VMs of vSphere with Tanzu (Tanzu Kubernetes Cluster Nodes) cannot be managed by our regular user accounts in vSphere, not even by the administrator account. They are managed by vCenter itself, so we cannot add the CBT advanced parameter to Tanzu Kubernetes Cluster VMs.

Why does the Kasten Documentation state ‘enable the Velero vSphere Operator’?

  • For the same Access Credentials reason: we cannot enable ‘CBT’ on Tanzu Kubernetes VMs with our User Accounts. Installing the Velero vSphere Operator as a Supervisor Service (as vSphere Pods, which requires NSX-T) allows it to enable CBT on all the VMs for us.
  • Kasten itself does not need the Velero vSphere Operator; the operator is only needed to enable CBT.

Warning

The following steps are currently unsupported, proceed at your own risk.

So how do we enable CBT on vSphere with Tanzu + NSX-ALB then?

We need to access our vSphere with Tanzu environment using ‘kubectl’ with an account that has enough privileges to enable the ‘CBT’ feature. We’re going to log in as the built-in ‘kubernetes-admin’ on our vSphere with Tanzu environment, and afterwards take basically the same approach as the Velero vSphere Operator Service would:

1. Browse to your vCenter UI using a Web Browser, log in and leave your window open. This window can be used to check if CBT has been enabled successfully afterwards.

2. SSH to your vCenter Server as root:

ssh root@<vCenter-IP>

3. Start the shell in your vCenter Session by typing ‘shell’.

4. Decrypt the ‘root’ password from the vSphere with Tanzu environment:

/usr/lib/vmware-wcp/decryptK8Pwd.py

5. SSH to a Supervisor Control Plane VM (from your vCenter Server, for example) using the credentials and IP address returned above.

6. Check the current context of ‘kubectl’ on the Supervisor Control Plane VM:

kubectl config get-contexts

You should automatically be ‘kubernetes-admin’.
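
If you want to script this check instead of reading the context table by eye, a minimal sketch could look like the following. The helper name ‘require_admin_context’ is hypothetical, and matching on the substring ‘kubernetes-admin’ is an assumption about how the context is named on your Supervisor Control Plane VM; adjust it to whatever ‘kubectl config get-contexts’ shows in your environment.

```shell
# Hypothetical helper: succeed only if the active kubectl context
# looks like the built-in admin. The "kubernetes-admin" substring
# match is an assumption about how the context is named.
require_admin_context() {
    ctx=$(kubectl config current-context) || return 1
    case "$ctx" in
        *kubernetes-admin*) echo "OK: using context $ctx" ;;
        *) echo "Unexpected context: $ctx" >&2; return 1 ;;
    esac
}
```

Calling ‘require_admin_context’ before any of the following steps lets a script stop early when it is running under a different account.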

7. Virtual Machines in vSphere with Tanzu are just Kubernetes Objects; so we can modify the Kubernetes Object ‘virtualmachine’ to enable CBT.
In order to enable CBT for only ONE Tanzu Kubernetes Cluster Node (VM), perform the following, replacing the placeholders with your VM name and your vSphere Namespace (e.g.: shared-services):

kubectl edit virtualmachine <VM-Name> -n <Your-vSphere-Namespace>

Add the following lines under ‘spec’:

advancedOptions:
    changeBlockTracking: true
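
If you prefer not to open an editor, the same change for a single VM can probably be made with a merge patch. This is a sketch and, like the rest of this procedure, unsupported; the function names ‘enable_cbt’ and ‘show_cbt’ are hypothetical, and it assumes the VirtualMachine object accepts a merge patch on the same ‘spec.advancedOptions.changeBlockTracking’ field that is edited by hand above.

```shell
# Enable CBT on a single VirtualMachine object without opening an editor.
# Arguments: $1 = VM name, $2 = vSphere Namespace.
enable_cbt() {
    kubectl patch virtualmachine "$1" -n "$2" --type=merge \
        -p '{"spec":{"advancedOptions":{"changeBlockTracking":true}}}'
}

# Print the current value of spec.advancedOptions.changeBlockTracking
# (prints nothing if the field is not set).
show_cbt() {
    kubectl get virtualmachine "$1" -n "$2" \
        -o jsonpath='{.spec.advancedOptions.changeBlockTracking}'
}
```

For example, ‘enable_cbt <VM-Name> shared-services’ followed by ‘show_cbt <VM-Name> shared-services’. A merge patch only touches the keys it names, so other keys under ‘advancedOptions’, if there are any, are left in place.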

8. All the VirtualMachine Objects are ‘Namespaced’. So to enable CBT for all the Tanzu Kubernetes Cluster VMs in a certain vSphere Namespace, perform the following, replacing <Your-vSphere-Namespace> (e.g.: shared-services) in both places:

kubectl get virtualmachines -n <Your-vSphere-Namespace> -o name | sed -e 's/.*\///g' | xargs -I {} kubectl patch virtualmachine {} -n <Your-vSphere-Namespace> --type=json -p '[{ "op": "add", "path": "/spec/advancedOptions", "value": {"changeBlockTracking": true}}]'
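
To check the result from the command line as well, a small sketch like this one lists every VM in a vSphere Namespace together with its CBT setting. The function name ‘list_cbt_status’ is hypothetical; the field path is the one patched in this step.

```shell
# List all VirtualMachine objects in namespace $1 with their CBT setting.
# VMs where the field is not set are expected to show up as <none>.
list_cbt_status() {
    kubectl get virtualmachines -n "$1" \
        -o custom-columns='NAME:.metadata.name,CBT:.spec.advancedOptions.changeBlockTracking'
}
```

For example: ‘list_cbt_status shared-services’.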

9. You should see the changes take place in your vCenter UI too.

10. These actions set the Advanced Parameter ‘ctkEnabled’ on the Tanzu Kubernetes Cluster Nodes / VMs. You can verify this in your vCenter Server.

11. Test in Kasten

Potential Problems?

Manually enabling Change Block Tracking to support Incremental Backups of Persistent Volumes is not foolproof; here is why:

  1. Whenever a new Tanzu Kubernetes Cluster is deployed, new VM Objects will be deployed without the AdvancedOptions parameters set for CBT.
  2. Whenever you expand an existing Tanzu Kubernetes Cluster you end up with new VM Objects that will not have the AdvancedOptions parameters set for CBT.
  3. Whenever you upgrade an existing Tanzu Kubernetes Cluster, you end up with new VM Objects that will not have the AdvancedOptions parameters set for CBT.

Potential Workarounds?

  1. Use a Pipelining Solution to deploy, scale & upgrade your Tanzu Kubernetes Clusters and let it run the ‘kubectl patch’ command from step 8 each time to make sure the AdvancedOptions parameters for CBT are set.
  2. Maybe a Mutating Webhook on the Supervisor Cluster?
  3. Manual intervention each time 😉
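
As a rough sketch of what the pipeline step from the first workaround could run, the function below only patches the VMs that do not yet report CBT as enabled, so it is safe to run after every deploy, scale or upgrade. The name ‘reconcile_cbt’ is hypothetical, it uses a merge patch rather than the JSON patch from step 8, and it is just as unsupported as the manual procedure.

```shell
# Patch only the VMs in namespace $1 that do not report CBT as enabled.
reconcile_cbt() {
    ns=$1
    kubectl get virtualmachines -n "$ns" -o name | sed -e 's/.*\///' |
    while read -r vm; do
        # Read the current setting; empty output means the field is not set.
        state=$(kubectl get virtualmachine "$vm" -n "$ns" \
            -o jsonpath='{.spec.advancedOptions.changeBlockTracking}' </dev/null)
        if [ "$state" = "true" ]; then
            echo "skip $vm (CBT already enabled)"
        else
            kubectl patch virtualmachine "$vm" -n "$ns" --type=merge \
                -p '{"spec":{"advancedOptions":{"changeBlockTracking":true}}}' </dev/null
        fi
    done
}
```

For example: ‘reconcile_cbt shared-services’, run from a session that is logged in as described in the steps above.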

Feel free to let us know if you’ve encountered this issue too. I hope this blog post helps.

Have a nice day!
