Enabling Transparent Checkpointing

The MemVerge Transparent Checkpoint Operator is activated by adding specific labels to your Kubernetes Pod specifications. These labels can be applied directly in your YAML manifests or dynamically using kubectl.

Applying Labels in Pod Specifications

To enable checkpointing for a specific Pod, add the memverge.ai/checkpoint-mode: 'true' label to the metadata.labels section of your Pod specification, or to spec.template.metadata.labels of your workload controller (e.g., Deployment, StatefulSet, Job). The value must be quoted: Kubernetes label values are strings, and an unquoted true would be parsed by YAML as a boolean.
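
For a standalone Pod, the label goes directly under metadata.labels. The following is a minimal sketch; the Pod name, container name, and image are placeholders chosen for illustration, not values required by the operator.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-checkpointed-pod              # placeholder name
  labels:
    memverge.ai/checkpoint-mode: 'true'  # quoted so YAML treats it as a string
spec:
  containers:
  - name: my-container                   # placeholder container
    image: my-image:latest               # placeholder image
```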

Example: Enabling checkpointing for a Job

apiVersion: batch/v1
kind: Job
metadata:
  name: my-checkpointed-job
spec:
  template:
    metadata:
      labels:
        memverge.ai/checkpoint-mode: 'true'  # Enable checkpointing for pods created by this Job
        memverge.ai/checkpoint-volume-size: 2Gi # Optional: Specify checkpoint volume size
    spec:
      containers:
      - name: my-container
        image: my-image:latest
      restartPolicy: Never

When the Job creates a Pod, the memverge.ai/checkpoint-mode: 'true' label instructs the operator to checkpoint the Pod's state automatically when the Pod is deleted (e.g., upon successful completion or failure). If the Pod is recreated, the operator automatically restores its state from the latest checkpoint.

Important Note for Workload Controllers: For controllers like Deployments, StatefulSets, and Jobs, apply the MemVerge labels to the spec.template.metadata.labels section. This ensures that every Pod created by the controller carries these labels. Labels set on the controller's own metadata.labels are not propagated to its Pods, so adding the MemVerge labels there has no effect on existing or future Pods.
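
The difference between the two label locations can be illustrated with a Deployment. This is a sketch only: the names, the app label, and the image are placeholders, and the only MemVerge-specific line is the checkpoint-mode label taken from this guide.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-checkpointed-deployment       # placeholder name
  labels:
    app: my-app                          # describes the Deployment only; not copied to Pods
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app                        # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: my-app
        memverge.ai/checkpoint-mode: 'true'  # Pods created from this template get the label
    spec:
      containers:
      - name: my-container               # placeholder container
        image: my-image:latest           # placeholder image
```

Placing memverge.ai/checkpoint-mode under the top-level metadata.labels instead would label the Deployment object itself and leave its Pods unlabeled.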

Setting Default Labels for Future Pods in a Namespace (using Mutating Admission Webhooks)

There is no single kubectl command for this, but you can configure a Mutating Admission Webhook (if your Kubernetes cluster has them enabled) to add the MemVerge labels automatically to every newly created Pod in a specific namespace. All future Pods in that namespace then have checkpointing enabled by default; Pods that already exist are not affected. Configuring such a webhook is beyond the scope of this basic user guide, but it is a powerful way to enforce checkpointing policies.
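
For orientation only, registering such a webhook might look roughly like the sketch below. It assumes you have written and deployed your own webhook server that adds the label; the configuration name, Service name, namespaces, path, and CA bundle are all hypothetical placeholders, and none of this is shipped by the MemVerge operator as far as this guide states. The namespace selector relies on the kubernetes.io/metadata.name label that recent Kubernetes versions set on every namespace.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: checkpoint-label-injector        # hypothetical name
webhooks:
- name: checkpoint-label.example.com     # hypothetical; must be a fully qualified name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore                  # do not block Pod creation if the webhook is down
  clientConfig:
    service:
      name: checkpoint-label-webhook     # hypothetical Service fronting your webhook server
      namespace: webhook-system          # hypothetical namespace
      path: /mutate                      # hypothetical endpoint
    caBundle: "<base64-encoded-CA-bundle>"   # placeholder; replace with your CA certificate
  rules:
  - operations: ["CREATE"]               # only newly created Pods are mutated
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  namespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: my-namespace   # hypothetical target namespace
```

The webhook server itself would respond with a JSON Patch that adds the label. Because the label key contains a slash, the patch path has to escape it as ~1, for example /metadata/labels/memverge.ai~1checkpoint-mode.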