Confirm the Installation was Successful
After installing MemVerge.ai, it's important to verify that all components are running correctly. Follow these steps to confirm a successful installation:
- Check MMAI services:
Ensure all MMAI-related services are present and have a type of ClusterIP, including:
- mmai
- mmai-billing
- mmai-billing-mysql
- mmai-ctrl-controller-manager-metrics-service
- mmai-ctrl-metrics-aggregator-service
- mmai-ctrl-webhook-service
Example:
$ kubectl get services -n cattle-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
gpu-operator ClusterIP 10.43.84.16 <none> 8080/TCP 67m
kueue-controller-manager-metrics-service ClusterIP 10.43.48.39 <none> 8443/TCP 67m
kueue-visibility-server ClusterIP 10.43.241.212 <none> 443/TCP 67m
kueue-webhook-service ClusterIP 10.43.89.126 <none> 443/TCP 67m
mmai ClusterIP 10.43.23.252 <none> 80/TCP,443/TCP 72m
mmai-billing ClusterIP 10.43.63.29 <none> 8080/TCP 72m
mmai-billing-mysql ClusterIP 10.43.41.163 <none> 3306/TCP 72m
mmai-ctrl-controller-manager-metrics-service ClusterIP 10.43.197.99 <none> 8443/TCP 67m
mmai-ctrl-metrics-aggregator-service ClusterIP 10.43.185.32 <none> 9191/TCP 67m
mmai-ctrl-webhook-service ClusterIP 10.43.239.169 <none> 443/TCP 67m
mmcloud-operator-controller-manager-metrics-service ClusterIP 10.43.234.77 <none> 8443/TCP 67m
mmcloud-operator-webhook-service ClusterIP 10.43.99.35 <none> 443/TCP 67m
nvidia-dcgm-exporter ClusterIP 10.43.29.94 <none> 9400/TCP 67m
rancher-webhook ClusterIP 10.43.193.230 <none> 443/TCP 70m
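Scanning the listing by eye is error-prone, so the check can be scripted. The following is a minimal sketch, not part of the MemVerge.ai tooling: the function name `missing_mmai_services` and the variable `EXPECTED_MMAI_SERVICES` are hypothetical, and the service names are copied from the list above. It reads `kubectl get services` output on standard input and prints every expected service that is absent or not of type ClusterIP.

```shell
# Expected MMAI service names, copied from the list in this step.
EXPECTED_MMAI_SERVICES="mmai mmai-billing mmai-billing-mysql mmai-ctrl-controller-manager-metrics-service mmai-ctrl-metrics-aggregator-service mmai-ctrl-webhook-service"

# Hypothetical helper: read `kubectl get services` output on stdin and print
# each expected service that is absent or whose TYPE column is not ClusterIP.
missing_mmai_services() {
  listing="$(cat)"
  for svc in $EXPECTED_MMAI_SERVICES; do
    # Column 1 is NAME, column 2 is TYPE; awk exits 0 only on an exact match.
    if ! printf '%s\n' "$listing" |
      awk -v s="$svc" '$1 == s && $2 == "ClusterIP" { found = 1 } END { exit !found }'; then
      echo "$svc"
    fi
  done
}

# Usage (requires access to the cluster):
#   kubectl get services -n cattle-system | missing_mmai_services
```

Empty output means all six services were found with the expected type.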
- Verify pod status:
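All MMAI-related pods should be in the Running state, with every container ready (for example, 1/1 in the READY column). One-off validation pods such as nvidia-cuda-validator show Completed once they finish, as in the example below.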
Example:
$ kubectl get pods -n cattle-system
NAME READY STATUS RESTARTS AGE
engine-q8fjl 1/1 Running 0 67m
engine-r9cw6 1/1 Running 0 67m
gpu-feature-discovery-2q454 1/1 Running 0 67m
gpu-operator-58dcc865fd-bzr5n 1/1 Running 0 68m
gpu-operator-node-feature-discovery-gc-7f546fd4bc-q67nl 1/1 Running 0 68m
gpu-operator-node-feature-discovery-master-8448c8896c-65w4w 1/1 Running 0 68m
gpu-operator-node-feature-discovery-worker-72m4v 1/1 Running 0 68m
gpu-operator-node-feature-discovery-worker-zzkxf 1/1 Running 0 68m
kueue-controller-manager-7f55cc5474-gf9xh 2/2 Running 0 67m
mmai-565bd9f48b-77hls 1/1 Running 0 73m
mmai-565bd9f48b-n4bqg 1/1 Running 0 72m
mmai-billing-6fb99c585d-2b2mt 1/1 Running 6 (70m ago) 73m
mmai-billing-mysql-6464ff86fb-2bbt4 1/1 Running 0 73m
mmai-ctrl-controller-manager-9cbd47d9-h47pm 1/1 Running 0 67m
mmai-ctrl-metrics-aggregator-5bd4d565d5-ksmwz 1/1 Running 0 67m
mmcloud-operator-controller-manager-974899777-vrbcf 1/1 Running 0 67m
nvidia-container-toolkit-daemonset-42dfn 1/1 Running 0 67m
nvidia-cuda-validator-z7drc 0/1 Completed 0 64m
nvidia-dcgm-exporter-hjtjp 1/1 Running 0 67m
nvidia-device-plugin-daemonset-6mx5b 1/1 Running 0 67m
nvidia-driver-daemonset-6gkq2 1/1 Running 0 68m
nvidia-operator-validator-ldf8k 1/1 Running 0 67m
rancher-webhook-5d7c7b486c-qk4ph 1/1 Running 0 71m
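With this many pods, a small filter can surface only the ones that need attention. This is a sketch under assumptions: the function name `unhealthy_pods` is hypothetical, and it relies on the default column layout of `kubectl get pods` (NAME, READY, STATUS) shown above. It treats Completed as healthy and flags any pod that is not Running or whose READY count is incomplete.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# NAME, READY, and STATUS of every pod that looks unhealthy.
unhealthy_pods() {
  awk 'NR > 1 {                      # skip the header row
    if ($3 == "Completed") next      # finished one-off pods are fine
    split($2, ready, "/")            # READY looks like "1/1" or "1/2"
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}

# Usage (requires access to the cluster):
#   kubectl get pods -n cattle-system | unhealthy_pods
```

Empty output means every pod is either Running with all containers ready or Completed.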
- Check ingress configuration:
Confirm that the mmai ingress lists the hostname you configured under HOSTS and exposes port 443 for TLS.
Example:
$ kubectl get ingress -n cattle-system
NAME CLASS HOSTS ADDRESS PORTS AGE
mmai traefik mvai-mgmt 172.31.25.216,172.31.25.25 80, 443 74m
- Validate MMAI version:
Ensure the image tag of the mmai deployment matches the version you intended to install.
Example:
$ kubectl describe deployment mmai -n cattle-system | grep Image
Image: ghcr.io/memverge/mmai:v0.3.0
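The comparison can also be automated. The sketch below assumes the version is carried in the image tag, as in the example output above. The function name `image_matches_version` is hypothetical, and it assumes the image reference always includes a tag.

```shell
# Hypothetical helper: succeed when the tag of an image reference (the text
# after the last ":") equals the expected version.
image_matches_version() {
  image="$1"
  expected="$2"
  [ "${image##*:}" = "$expected" ]
}

# Usage (requires access to the cluster); jsonpath prints the image of the
# first container in the mmai deployment:
#   image="$(kubectl get deployment mmai -n cattle-system \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')"
#   image_matches_version "$image" v0.3.0 && echo "version OK"
```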
- Test MMAI web interface accessibility:
Use a web browser to access the MMAI dashboard using the configured hostname. Verify that you can log in successfully.
- Check MMAI logs for any errors:
Review the logs of the MMAI pods for error messages or warnings that might indicate configuration issues.
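One way to do this is to pull recent log lines with `kubectl logs` and filter them. The sketch below is illustrative only: the function name `log_problems` is hypothetical, and the keywords it matches are an assumption, since the MMAI log format is not described here. Adjust the pattern to what the logs actually contain.

```shell
# Hypothetical helper: keep only log lines that mention common problem
# keywords (case-insensitive). The keyword list is an assumption.
log_problems() {
  grep -iE 'error|warn|fatal|panic' || true   # no match is not a failure
}

# Usage (requires access to the cluster); deployment names are taken from
# the pod listing earlier in this section:
#   kubectl logs deployment/mmai -n cattle-system --tail=500 | log_problems
#   kubectl logs deployment/mmai-billing -n cattle-system --tail=500 | log_problems
```

Empty output means none of the keywords appeared in the sampled lines, which does not rule out problems logged in a different form.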
If all these checks pass without errors, your MMAI installation is likely successful and ready for use.