How to Resolve "Failed to watch" Errors when Installing mvai¶
Problem Summary¶
- Issue: "Failed to watch" errors may occur during installation
- Affected: All versions
- Cause: The errors you see during the Helm install of the MemVerge product are Kubernetes watch/list errors and are usually not fatal.
- Error Message:
```bash
$ helm install --namespace cattle-system mvai oci://ghcr.io/memverge/charts/mvai \
    --wait --timeout 30m \
    --version 0.5.0 \
    --set hostname=demo.memvergelab.com \
    --set bootstrapPassword=admin \
    --set ingress.tls.source=letsEncrypt \
    --set letsEncrypt.email=test.user@gmail.com \
    --set letsEncrypt.ingress.class=traefik
Pulled: ghcr.io/memverge/charts/mvai:0.5.0
Digest: sha256:f66f7ff5f05ab8b5cc2ecbbbce039b7abbf9a02ba0d9c701b6b292e511b15b21
E0804 02:57:29.142651 83562 reflector.go:200] "Failed to watch" err="the server is currently unable to handle the request (get jobs.batch)" logger="UnhandledError" reflector="k8s.io/client-go@v0.33.0/tools/cache/reflector.go:285" type="*unstructured.Unstructured"
E0804 02:57:31.522699 83562 reflector.go:200] "Failed to watch" err="failed to list *unstructured.Unstructured: the server is currently unable to handle the request (get jobs.batch)" logger="UnhandledError" reflector="k8s.io/client-go@v0.33.0/tools/cache/reflector.go:285" type="*unstructured.Unstructured"
...
```
Investigation Steps¶
- No further investigation steps are required, as the error message provides enough information to pinpoint the issue.
Resolution Steps¶
Root Cause Explanation:
- The "Failed to watch" errors typically mean that the Kubernetes API server was temporarily unable to handle a client request -- in your error, it's related to "jobs.batch", meaning batch jobs in your cluster.
- This often happens because the API server is overloaded, restarting, or in the process of reapplying permissions/CRDs. It's common during large Helm deployments or on clusters with heavy activity or resource constraints.
- These errors can also appear if a required Custom Resource or RBAC permission is briefly unavailable while resources are being created; as Helm and controllers retry, this usually resolves automatically.
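If you want to confirm the errors are only transient, one quick check (a minimal sketch, assuming you have kubectl access to the cluster you are installing into) is to verify that the API server and the API group serving `jobs.batch` are responding:

```bash
# Overall API server readiness; each individual check reports ok/failed
kubectl get --raw='/readyz?verbose'

# Confirm the API group that serves jobs.batch is registered and Available
kubectl get apiservices | grep batch

# List batch Jobs cluster-wide -- the same resource the failing watch was requesting
kubectl get jobs.batch -A
```

If these commands succeed while the install keeps logging watch errors, the failures are almost certainly transient and will be retried.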
Installation Status Dependent: Most of the time these errors are not fatal; Helm, along with the controllers, will retry the operation. If the installation or upgrade completes successfully or continues to make progress (see the sketch below for one way to confirm this), you can safely ignore these error logs.
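One way to confirm the install is still making progress (assuming the release name `mvai` and namespace `cattle-system` from the command above) is to watch the pods and the release status from a second terminal:

```bash
# Watch pods in the release namespace come up while `helm install --wait` is running
kubectl get pods -n cattle-system -w

# Check the release status; an in-flight install typically shows STATUS: pending-install
helm status mvai -n cattle-system
```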
Contact MemVerge Support: Only if the Helm deployment ultimately times out, hangs, or rolls back would these errors indicate a real problem with cluster health or permissions.
_If you do experience persistent failures_, please collect the output of the following commands and contact [MemVerge Support](mailto:support@memverge.com):
```bash
# Recent events across all namespaces (look for scheduling, RBAC, or webhook failures)
$ kubectl get events -A
# Pod status in the release namespace
$ kubectl get pods -n cattle-system
# Current state of the mvai Helm release
$ helm status mvai -n cattle-system
```
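If it is easier to attach files to your support request, the same output can be redirected to files (the file names below are only suggestions):

```bash
kubectl get events -A > mvai-events.txt
kubectl get pods -n cattle-system -o wide > mvai-pods.txt
helm status mvai -n cattle-system > mvai-helm-status.txt
```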