Uninstall GPU Cluster Manager and K3s¶
Uninstall GPU Cluster Manager¶
The following `helm uninstall` command deletes the GPU Cluster Manager deployment, but leaves Custom Resource Definitions (CRDs) and user-created Custom Resources (CRs) in the cluster:
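```bash
helm uninstall mmai -n cattle-system
```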
Note
The command may take several minutes to execute without displaying any output in the terminal. Please be patient.
Upon successful uninstallation, you will see a message similar to the following:
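```
release "mmai" uninstalled
```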
To completely clean up all GPU Cluster Manager resources (CRDs, CRs, and so on), run the `cleanup.sh` script:
```bash
wget https://raw.githubusercontent.com/MemVerge/mvai-public/refs/tags/v0.3.0/cleanup.sh
chmod +x cleanup.sh
sudo ./cleanup.sh
```
Note
The `cleanup.sh` script may take a considerable amount of time, depending on the size of the cluster and the volume of data that needs to be deleted.
Verify All Pods Are Removed¶
Run the following command and check that no pods are listed:
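```bash
kubectl get pods -n cattle-system
```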
If you do not see any pods, no further action is required.
If you see one or more pods, as shown in the example below, continue cleaning up the environment.
```
NAMESPACE       NAME                          READY   STATUS      RESTARTS   AGE
cattle-system   mmai-mvai-pre-install-dl6gn   0/1     Completed   0          48s
cattle-system   mmcloud-engine-cg64g          1/1     Running     0          18h
```
Why Pods Remain After `helm uninstall`¶
When you run `helm uninstall mmai -n cattle-system`, Helm attempts to remove all Kubernetes resources associated with the `mmai` release. However, it is common for some pods or other resources to remain in the namespace after the uninstall completes. This can happen for several reasons:
- Pods Not Managed by Helm: If certain pods (such as `mmcloud-engine-cg64g`) were created manually, by another Helm release, or by resources not tracked in the `mmai` release, Helm will not delete them during uninstallation. Helm only deletes resources it deployed and manages (you can verify this with the label check shown after this list).
- Job or Hook Pods: Pods with names like `mmai-mvai-pre-install-dl6gn` are often created by Helm hooks (e.g., pre-install or pre-delete jobs). These pods may remain in a `Completed` state after the associated Job finishes, and Helm may not always clean up these pods, especially if the Job resource itself is not deleted or if the pod outlives the Job.
- Finalizers or Orphaned Resources: Sometimes, resources may have Kubernetes finalizers or become orphaned if their parent resources (like Deployments or StatefulSets) are deleted before the pods themselves. This can prevent automatic cleanup.
- Custom Resource Definitions (CRDs): If the Helm chart installed CRDs and associated custom resources, uninstalling the release may not remove those resources or any pods they manage.
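If you are unsure whether a leftover pod was ever part of the release, a minimal check is to look for the `app.kubernetes.io/managed-by: Helm` label and the `meta.helm.sh/release-name` annotation that Helm 3 sets on resources it owns (shown here with the example pod name from the output above):

```bash
# Prints "Helm" if the pod is Helm-managed, otherwise nothing
kubectl get pod mmcloud-engine-cg64g -n cattle-system \
  -o jsonpath='{.metadata.labels.app\.kubernetes\.io/managed-by}{"\n"}'

# Prints the owning release name (e.g., "mmai") if set
kubectl get pod mmcloud-engine-cg64g -n cattle-system \
  -o jsonpath='{.metadata.annotations.meta\.helm\.sh/release-name}{"\n"}'
```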
How to Remove Leftover Pods¶
To clean up these pods, you can:
- Delete Pods Manually: Use `kubectl delete pod <pod-name> -n cattle-system` to remove specific pods that remain after uninstall.
- Delete Orphaned Resources: If Deployments, StatefulSets, or Jobs remain, delete them with `kubectl delete deployment <name> -n cattle-system`, `kubectl delete statefulset <name> -n cattle-system`, or `kubectl delete job <name> -n cattle-system`. This also removes the pods they manage.
- Check for Finalizers: If resources are stuck in a terminating state, inspect and remove finalizers using `kubectl edit <resource> <name> -n cattle-system` and delete the `finalizers` field (or patch them away, as shown after this list).
- Clean Up CRDs and Custom Resources: If the chart installed CRDs, delete any custom resources and the CRDs themselves if they are no longer needed.
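As an alternative to editing the resource interactively, you can patch the finalizers away; `<resource>`, `<name>`, and `<crd-name>` below are placeholders. Note that removing a finalizer skips whatever cleanup it guarded, so do this only for resources that are genuinely stuck:

```bash
# Force-remove all finalizers from a stuck resource
kubectl patch <resource> <name> -n cattle-system --type=merge \
  -p '{"metadata":{"finalizers":[]}}'

# Once its custom resources are gone, delete a leftover CRD
kubectl delete crd <crd-name>
```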
Example Cleanup Commands¶
```bash
# List all pods in the namespace
kubectl get pods -n cattle-system

# Delete a specific pod
kubectl delete pod mmcloud-engine-cg64g -n cattle-system

# Delete all completed pods (e.g., from Jobs)
kubectl delete pod -n cattle-system --field-selector=status.phase=Succeeded

# Delete leftover Jobs
kubectl get jobs -n cattle-system
kubectl delete job <job-name> -n cattle-system
```
Helm's uninstall process is limited to resources it manages directly. Manual cleanup using `kubectl` is safe and common practice for any leftover pods or resources not under Helm's management.
Uninstall K3s¶
To completely remove K3s from your system, run the following commands on the worker nodes:
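K3s installations normally include a generated uninstall script; on worker (agent) nodes it is typically located at `/usr/local/bin/k3s-agent-uninstall.sh`. If the script is present, run it first to stop the agent and remove most components:

```bash
# Run the K3s agent uninstall script, if present
/usr/local/bin/k3s-agent-uninstall.sh
```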
Remove remaining files and directories:
```bash
rm -rf /etc/rancher/k3s
rm -rf /var/lib/rancher/k3s
rm -rf /usr/local/bin/k3s
rm -rf /usr/local/bin/kubectl
rm -rf /usr/local/bin/crictl
rm -rf /usr/local/bin/ctr
```
To uninstall K3s on the management node(s), run the following:
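On the management (server) nodes, the generated script is typically `/usr/local/bin/k3s-uninstall.sh`; if it is present, run it first:

```bash
# Run the K3s server uninstall script, if present
/usr/local/bin/k3s-uninstall.sh
```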
Remove remaining files and directories: