Viewing Detailed Workspace Information¶
Once you have one or more Workspaces running, you can drill down into each Workspace’s details to monitor resource usage, health, and configuration. This section describes the various tabs and panels that provide insights into your Workspace’s status, including GPU utilization, node allocation, and telemetry data.
Accessing the Workspace Details¶
- Open the Workspaces Dashboard
- From the left navigation menu, click Workspaces.
- A list or card view of all your Workspaces appears.
- Select a Workspace
- Locate the Workspace you want to inspect.
- Click the Workspace Name or the Details action (depending on your UI) to open its detailed view.
Overview Tab¶
- Summary Information & Pods
- Displays the Workspace Name, Status (e.g., Running, Stopped), and Owner (the user who created it).
- Shows which Project the Workspace belongs to, along with high-level resources allocated (CPU cores, memory, GPUs).
- Connect opens the Workspace UI in another browser tab.
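If you want to verify the same allocation details outside the UI, the following sketch uses the Kubernetes Python client to list a Workspace’s pods and their requested resources. The namespace and label selector shown are placeholders, not documented platform values; substitute whatever your platform actually applies to Workspace pods.

```python
# Sketch only: namespace and label selector below are assumptions.
from kubernetes import client, config

config.load_kube_config()          # or config.load_incluster_config() inside the cluster
v1 = client.CoreV1Api()

NAMESPACE = "my-project"            # placeholder: Project-to-namespace mapping may differ
SELECTOR = "workspace=my-workspace" # placeholder: platform-specific pod label

pods = v1.list_namespaced_pod(NAMESPACE, label_selector=SELECTOR)
for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
    for container in pod.spec.containers:
        requests = container.resources.requests or {}
        print("  ", container.name,
              "cpu:", requests.get("cpu"),
              "memory:", requests.get("memory"),
              "gpu:", requests.get("nvidia.com/gpu"))
```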
GPUs Tab¶
- GPU Allocation
- Lists the GPUs assigned to this Workspace, including their Model and GPU Memory.
- Utilization Metrics
- Real-time or near-real-time graphs showing GPU Usage (percent busy) and GPU Memory consumption.
- Helps identify whether your AI model is efficiently using GPU resources or if there is unused capacity.
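For a quick spot check from a terminal inside the Workspace itself, a minimal sketch using the nvidia-ml-py (`pynvml`) package is shown below. It assumes the package is installed in the container and that the allocated GPUs are visible to it.

```python
# Sketch only: requires the nvidia-ml-py package and visible NVIDIA GPUs.
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)               # GPU model
    if isinstance(name, bytes):                           # older versions return bytes
        name = name.decode()
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)          # bytes used / total
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)   # percent busy
    print(f"GPU {i}: {name} | busy {util.gpu}% | "
          f"memory {mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB")
pynvml.nvmlShutdown()
```

Running this periodically (or watching the GPUs tab) makes it easy to tell whether your workload is actually keeping the devices busy.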
Nodes Tab¶
- Node Association
- Shows the node(s) where this Workspace’s pods are running, including Node Name and Roles (e.g., worker, control-plane).
- Node Metrics
- May display CPU usage, memory usage, and uptime data specific to the node(s) hosting your Workspace.
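The same association can be derived from the Kubernetes API, as in the sketch below; again, the namespace and label selector are placeholders for whatever your platform uses.

```python
# Sketch only: namespace and label selector are assumptions.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod("my-project", label_selector="workspace=my-workspace")
node_names = {p.spec.node_name for p in pods.items if p.spec.node_name}

for node_name in node_names:
    node = v1.read_node(node_name)
    roles = [label.split("/", 1)[1] for label in node.metadata.labels
             if label.startswith("node-role.kubernetes.io/")]
    allocatable = node.status.allocatable
    print(node_name,
          "| roles:", roles or ["<none>"],
          "| cpu:", allocatable.get("cpu"),
          "| memory:", allocatable.get("memory"),
          "| gpu:", allocatable.get("nvidia.com/gpu", "0"))
```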
Node Groups Tab¶
- Group Membership
- If multiple Node Groups are available, this tab indicates which group currently serves the Workspace.
- Resource Overview
- Summarizes how many GPUs, CPU cores, and memory resources each Node Group provides.
- Helpful for understanding how workloads are distributed among different Node Groups in your cluster.
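As a rough equivalent outside the UI, the sketch below tallies nodes and allocatable GPUs per group. The `node-group` label key is hypothetical; substitute whatever label your platform uses to mark Node Group membership, and extend the tally for CPU and memory if you need them.

```python
# Sketch only: the "node-group" label key is a placeholder.
from collections import defaultdict
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

GROUP_LABEL = "node-group"    # placeholder label key
summary = defaultdict(lambda: {"nodes": 0, "gpus": 0})

for node in v1.list_node().items:
    group = node.metadata.labels.get(GROUP_LABEL, "<unlabeled>")
    summary[group]["nodes"] += 1
    summary[group]["gpus"] += int(node.status.allocatable.get("nvidia.com/gpu", "0"))

for group, totals in summary.items():
    print(f"{group}: {totals['nodes']} node(s), {totals['gpus']} allocatable GPU(s)")
```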
Metrics / Telemetry Tab¶
- GPU, CPU, and Memory Graphs
- View usage trends over time, identifying performance bottlenecks or resource spikes.
- Enables you to see if usage patterns changed after a model update or new dataset processing.
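If the cluster runs metrics-server (or a compatible metrics adapter), the Kubernetes Metrics API exposes point-in-time CPU and memory readings that you can poll yourself to build a simple trend; note that this API covers CPU and memory only, not GPU telemetry. A minimal sketch with the Python client follows; the namespace and label selector are placeholders.

```python
# Sketch only: requires metrics-server (metrics.k8s.io) in the cluster;
# namespace and label selector are assumptions.
from kubernetes import client, config

config.load_kube_config()
metrics_api = client.CustomObjectsApi()

readings = metrics_api.list_namespaced_custom_object(
    group="metrics.k8s.io", version="v1beta1",
    namespace="my-project", plural="pods",
    label_selector="workspace=my-workspace")

for pod in readings["items"]:
    for container in pod["containers"]:
        usage = container["usage"]
        print(pod["metadata"]["name"], container["name"],
              "cpu:", usage["cpu"], "memory:", usage["memory"])
```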
Conditions & Events¶
- Health Checks
- Displays any current warnings or errors reported by Kubernetes or the workspace environment.
- Examples include node pressure conditions, scheduling failures, or pod restart loops.
- System Events
- Shows a timeline of events like when the Workspace started, stopped, or encountered any scheduling issues.
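The same conditions and events can be read through the Kubernetes API, as in the sketch below; the namespace and label selector are placeholders.

```python
# Sketch only: namespace and label selector are assumptions.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()
NAMESPACE = "my-project"

# Pod conditions (PodScheduled, Ready, ...) for the Workspace's pods.
pods = v1.list_namespaced_pod(NAMESPACE, label_selector="workspace=my-workspace")
for pod in pods.items:
    for cond in pod.status.conditions or []:
        print(pod.metadata.name, cond.type, cond.status, cond.reason or "")

# Recent namespace events (scheduling failures, restarts, node pressure, ...).
for event in v1.list_namespaced_event(NAMESPACE).items:
    print(event.last_timestamp, event.type, event.reason,
          event.involved_object.name, "-", event.message)
```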
Storage & Volumes¶
- Mount Paths and Volume Details
- Shows exactly how volumes are mounted within the container’s filesystem.
- Resize or Reconfigure (where applicable)
- Depending on platform support, you may see options to edit or update volume size or mount options.
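To confirm mount paths from the API side, the sketch below lists each Workspace pod’s volumes and where the containers mount them; the namespace and label selector are placeholders.

```python
# Sketch only: namespace and label selector are assumptions.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod("my-project", label_selector="workspace=my-workspace")
for pod in pods.items:
    print("Pod:", pod.metadata.name)
    for volume in pod.spec.volumes or []:
        source = (f"PVC {volume.persistent_volume_claim.claim_name}"
                  if volume.persistent_volume_claim else "other source")
        print("  volume:", volume.name, f"({source})")
    for container in pod.spec.containers:
        for mount in container.volume_mounts or []:
            print("  mount:", mount.name, "->", mount.mount_path, "in", container.name)
```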
Best Practices¶
- Regular Monitoring: Check GPU, CPU, and memory usage periodically to confirm that the resources you requested match what the workload actually uses.
- Investigate Conditions: Address warnings or errors promptly to maintain a healthy Workspace environment.
- Manage Storage: Attach or resize volumes as your dataset grows.
- Optimize Node Selection: If performance is lagging, consider adjusting Node Group assignments or upgrading hardware resources.