Confirm the Installation was Successful

After installing MemVerge.ai, it's important to verify that all components are running correctly. Follow these steps to confirm a successful installation:

  1. Check MMAI services:
kubectl get services -n cattle-system

Ensure all MMAI-related services are present and show ClusterIP in the TYPE column, including:

  • mmai
  • mmai-billing
  • mmai-billing-mysql
  • mmai-ctrl-controller-manager-metrics-service
  • mmai-ctrl-metrics-aggregator-service
  • mmai-ctrl-webhook-service

Example:

$ kubectl get services -n cattle-system
NAME                                                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
gpu-operator                                          ClusterIP   10.43.84.16     <none>        8080/TCP         67m
kueue-controller-manager-metrics-service              ClusterIP   10.43.48.39     <none>        8443/TCP         67m
kueue-visibility-server                               ClusterIP   10.43.241.212   <none>        443/TCP          67m
kueue-webhook-service                                 ClusterIP   10.43.89.126    <none>        443/TCP          67m
mmai                                                  ClusterIP   10.43.23.252    <none>        80/TCP,443/TCP   72m
mmai-billing                                          ClusterIP   10.43.63.29     <none>        8080/TCP         72m
mmai-billing-mysql                                    ClusterIP   10.43.41.163    <none>        3306/TCP         72m
mmai-ctrl-controller-manager-metrics-service          ClusterIP   10.43.197.99    <none>        8443/TCP         67m
mmai-ctrl-metrics-aggregator-service                  ClusterIP   10.43.185.32    <none>        9191/TCP         67m
mmai-ctrl-webhook-service                             ClusterIP   10.43.239.169   <none>        443/TCP          67m
mmcloud-operator-controller-manager-metrics-service   ClusterIP   10.43.234.77    <none>        8443/TCP         67m
mmcloud-operator-webhook-service                      ClusterIP   10.43.99.35     <none>        443/TCP          67m
nvidia-dcgm-exporter                                  ClusterIP   10.43.29.94     <none>        9400/TCP         67m
rancher-webhook                                       ClusterIP   10.43.193.230   <none>        443/TCP          70m
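
The service check above can also be scripted. The following is a minimal sketch, not part of the MMAI tooling: `check_mmai_services` is a hypothetical helper name, and it assumes the default table output of `kubectl get services` (NAME in column 1, TYPE in column 2). It reports any of the six services listed above that is missing or is not of type ClusterIP.

```shell
# Hypothetical helper (not shipped with MMAI): reads `kubectl get services`
# table output on stdin and reports required services that are missing or
# whose TYPE column is not ClusterIP. Exits non-zero if anything is reported.
check_mmai_services() {
  awk '
    BEGIN {
      # The six service names come from the list in this section.
      n = split("mmai mmai-billing mmai-billing-mysql " \
                "mmai-ctrl-controller-manager-metrics-service " \
                "mmai-ctrl-metrics-aggregator-service " \
                "mmai-ctrl-webhook-service", required, " ")
    }
    $1 != "NAME" { type[$1] = $2 }     # skip the header row, remember TYPE
    END {
      bad = 0
      for (i = 1; i <= n; i++) {
        name = required[i]
        if (!(name in type)) {
          print "missing: " name; bad = 1
        } else if (type[name] != "ClusterIP") {
          print "unexpected type for " name ": " type[name]; bad = 1
        }
      }
      exit bad
    }
  '
}

# Usage (needs cluster access):
#   kubectl get services -n cattle-system | check_mmai_services && echo "services OK"
```
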
  2. Verify pod status:
kubectl get pods -n cattle-system

All MMAI-related pods should be in the Running state. One-shot pods, such as nvidia-cuda-validator in the example below, show Completed instead.

Example:

$ kubectl get pods -n cattle-system
NAME                                                          READY   STATUS      RESTARTS      AGE
engine-q8fjl                                                  1/1     Running     0             67m
engine-r9cw6                                                  1/1     Running     0             67m
gpu-feature-discovery-2q454                                   1/1     Running     0             67m
gpu-operator-58dcc865fd-bzr5n                                 1/1     Running     0             68m
gpu-operator-node-feature-discovery-gc-7f546fd4bc-q67nl       1/1     Running     0             68m
gpu-operator-node-feature-discovery-master-8448c8896c-65w4w   1/1     Running     0             68m
gpu-operator-node-feature-discovery-worker-72m4v              1/1     Running     0             68m
gpu-operator-node-feature-discovery-worker-zzkxf              1/1     Running     0             68m
kueue-controller-manager-7f55cc5474-gf9xh                     2/2     Running     0             67m
mmai-565bd9f48b-77hls                                         1/1     Running     0             73m
mmai-565bd9f48b-n4bqg                                         1/1     Running     0             72m
mmai-billing-6fb99c585d-2b2mt                                 1/1     Running     6 (70m ago)   73m
mmai-billing-mysql-6464ff86fb-2bbt4                           1/1     Running     0             73m
mmai-ctrl-controller-manager-9cbd47d9-h47pm                   1/1     Running     0             67m
mmai-ctrl-metrics-aggregator-5bd4d565d5-ksmwz                 1/1     Running     0             67m
mmcloud-operator-controller-manager-974899777-vrbcf           1/1     Running     0             67m
nvidia-container-toolkit-daemonset-42dfn                      1/1     Running     0             67m
nvidia-cuda-validator-z7drc                                   0/1     Completed   0             64m
nvidia-dcgm-exporter-hjtjp                                    1/1     Running     0             67m
nvidia-device-plugin-daemonset-6mx5b                          1/1     Running     0             67m
nvidia-driver-daemonset-6gkq2                                 1/1     Running     0             68m
nvidia-operator-validator-ldf8k                               1/1     Running     0             67m
rancher-webhook-5d7c7b486c-qk4ph                              1/1     Running     0             71m
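
To avoid scanning the pod list by eye, the output can be filtered. This is a sketch under stated assumptions: `find_unhealthy_pods` is a hypothetical helper name, it checks every pod in the namespace rather than only MMAI pods, and it treats Completed as healthy because the example above shows a validator pod in that state.

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin and
# prints NAME, READY and STATUS for every pod that is not Running with all
# containers ready. Completed pods are skipped. Exits non-zero if any pod
# is printed.
find_unhealthy_pods() {
  awk '
    $1 == "NAME"      { next }          # skip the header row
    $3 == "Completed" { next }          # finished one-shot pods are fine
    {
      split($2, ready, "/")             # READY looks like "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) {
        print $1, $2, $3
        bad = 1
      }
    }
    END { exit bad + 0 }
  '
}

# Usage (needs cluster access):
#   kubectl get pods -n cattle-system | find_unhealthy_pods && echo "pods OK"
```
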
  3. Check ingress configuration:
kubectl get ingress -n cattle-system

Confirm that the mmai ingress shows the hostname you configured in the HOSTS column and that port 443 appears in the PORTS column, which indicates that TLS is configured.

Example:

$ kubectl get ingress -n cattle-system
NAME   CLASS     HOSTS       ADDRESS                      PORTS     AGE
mmai   traefik   mvai-mgmt   172.31.25.216,172.31.25.25   80, 443   74m
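
The same comparison can be scripted. The sketch below assumes the default table layout shown above, a single host on the ingress, and that the ingress is named `mmai`; `check_mmai_ingress` is a hypothetical helper name, and the expected hostname is passed as its argument.

```shell
# Hypothetical helper: reads `kubectl get ingress` table output on stdin and
# checks that the "mmai" row lists the expected host and port 443.
# Argument 1: the hostname you configured (mvai-mgmt in the example above).
check_mmai_ingress() {
  awk -v host="$1" '
    $1 == "mmai" {
      found = 1
      if ($3 != host) { print "host mismatch: " $3; bad = 1 }
      tls = 0
      # PORTS is printed as "80, 443", so it spans the fields before AGE.
      for (i = 4; i < NF; i++) {
        p = $i
        gsub(/,/, "", p)
        if (p == "443") tls = 1
      }
      if (!tls) { print "port 443 not listed"; bad = 1 }
    }
    END {
      if (!found) { print "ingress mmai not found"; bad = 1 }
      exit bad + 0
    }
  '
}

# Usage (needs cluster access):
#   kubectl get ingress -n cattle-system | check_mmai_ingress mvai-mgmt
```
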
  4. Validate MMAI version:
kubectl describe deployment mmai -n cattle-system | grep Image

Ensure the deployed version matches the expected version.

Example:

$ kubectl describe deployment mmai -n cattle-system | grep Image
    Image:      ghcr.io/memverge/mmai:v0.3.0
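
Comparing the tag against the version you intended to install can be automated as well. This is a sketch only: `check_image_tag` is a hypothetical helper name, and it assumes the image reference ends in `:<tag>` as in the example above (digest-pinned or untagged images are not handled).

```shell
# Hypothetical helper: reads the "Image:" lines produced by
# `kubectl describe deployment ... | grep Image` on stdin and compares the
# image tag with the expected version given as argument 1.
check_image_tag() {
  awk -v want="$1" '
    $1 == "Image:" {
      seen = 1
      n = split($2, parts, ":")        # the tag is the text after the last ":"
      if (parts[n] != want) {
        print "version mismatch: " $2
        bad = 1
      }
    }
    END {
      if (!seen) { print "no Image line found"; bad = 1 }
      exit bad + 0
    }
  '
}

# Usage (needs cluster access):
#   kubectl describe deployment mmai -n cattle-system | grep Image | check_image_tag v0.3.0
```
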
  5. Test MMAI web interface accessibility:

Use a web browser to access the MMAI dashboard using the configured hostname. Verify that you can log in successfully.
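
Before opening a browser, a quick reachability probe from the command line can rule out DNS or ingress problems. The sketch below is an assumption-laden convenience, not a substitute for logging in: `probe_mmai_ui` and `http_code_ok` are hypothetical helper names, the dashboard is assumed to be served over HTTPS at the root path of the configured hostname, and `-k` skips certificate verification on the assumption that a self-signed certificate may be in use.

```shell
# Hypothetical helper: succeeds for HTTP status codes in the 2xx or 3xx range.
http_code_ok() {
  case "$1" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

# Hypothetical helper: requests https://<hostname>/ and reports the status code.
# Argument 1: the configured hostname (mvai-mgmt in the ingress example).
probe_mmai_ui() {
  # -k skips TLS verification (assumes a self-signed certificate may be used);
  # curl prints 000 as the status code when the connection itself fails.
  code=$(curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 "https://$1/") || code=000
  echo "HTTP $code"
  http_code_ok "$code"
}

# Usage (needs network access to the cluster):
#   probe_mmai_ui mvai-mgmt && echo "dashboard reachable"
```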

  6. Check MMAI logs for any errors:
kubectl logs deployment/mmai -n cattle-system

Review the logs for any error messages or warnings that might indicate configuration issues.
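
For long logs, a keyword filter helps surface suspicious lines. This is a rough sketch: `scan_logs` is a hypothetical helper name, and the keywords are an assumption because the MMAI log format is not described here, so a clean result does not replace reading the log.

```shell
# Hypothetical helper: reads log text on stdin and prints, with line numbers,
# every line containing a common problem keyword (case-insensitive).
# Returns non-zero if at least one such line was found.
scan_logs() {
  if grep -inE 'error|warn|fatal|panic'; then
    return 1
  fi
  return 0
}

# Usage (needs cluster access):
#   kubectl logs deployment/mmai -n cattle-system --tail=500 | scan_logs && echo "no problems found"
```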

If all these checks pass without errors, your MMAI installation is likely successful and ready for use.