
Quality of Service (QoS) Memory Engine

The Quality of Service (QoS) Memory Engine provides Latency and Bandwidth policies. Memory tiering is a memory management technique that optimizes application performance on systems with heterogeneous memory architectures, such as those combining DRAM and Compute Express Link (CXL) Type 3 memory expanders. DRAM offers higher bandwidth and lower latency than CXL memory.

Understanding the Latency Policy

Latency tiering intelligently manages data placement across heterogeneous memory devices to optimize performance based on the "temperature" of memory pages. A page's "temperature" refers to how frequently it is accessed:

  • Hot Pages: These are frequently accessed pages. Since they are used often, minimizing the latency of accessing these pages is important to ensure high performance. Therefore, the MemVerge QoS engine moves hot pages to DRAM, where they can be accessed quickly.
  • Cold Pages: These are infrequently accessed pages. Because they are used less often, the slightly longer access times from being placed in CXL memory are less impactful on overall system performance. Moving cold pages to CXL memory frees valuable DRAM for more critical data.

By ensuring that frequently accessed data is stored in DRAM, the system can reduce the average latency of memory accesses, leading to faster application performance.
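The hot/cold placement logic above can be sketched as follows. This is an illustrative sketch only, not the MemVerge implementation: the threshold, page access counts, and function names are all hypothetical.

```python
# Hypothetical sketch of temperature-based tier selection: pages accessed
# more often than a threshold are treated as hot and placed in DRAM;
# the rest are treated as cold and placed in CXL memory.

HOT_THRESHOLD = 100  # accesses per sampling window (hypothetical value)

def choose_tier(access_count: int) -> str:
    """Hot pages go to low-latency DRAM; cold pages go to CXL memory."""
    return "DRAM" if access_count >= HOT_THRESHOLD else "CXL"

# Example: per-page access counts gathered during one sampling window
pages = {0x1000: 250, 0x2000: 3, 0x3000: 120}
placement = {addr: choose_tier(count) for addr, count in pages.items()}
print(placement)
```

In this sketch, pages at 0x1000 and 0x3000 exceed the threshold and land in DRAM, while the rarely touched page at 0x2000 is demoted to CXL, freeing DRAM for hotter data.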

Understanding the Bandwidth Policy

Bandwidth-optimized memory placement and movement is a strategy designed to enhance application performance by utilizing the combined bandwidth of the different memory types in a system, such as DRAM and CXL Type 3 memory expanders. This approach is particularly relevant where both memory types are present and applications can benefit from the bandwidth advantages each offers. High bandwidth is crucial for applications that process large volumes of data quickly, such as high-performance computing, data analytics, and video processing.

The goal of bandwidth-optimized memory placement and movement is to maximize the overall system bandwidth by strategically placing and moving data between DRAM and CXL memory based on the application's bandwidth requirements.

The Bandwidth policy engine utilizes the available bandwidth from all DRAM and CXL memory devices. The policy uses a user-selectable DRAM-to-CXL ratio to maintain a balance between bandwidth and latency.
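A ratio-driven placement scheme can be sketched as a weighted round-robin over the two tiers. This is a minimal illustration under assumed behavior: the round-robin scheme and the 3:1 example ratio are assumptions, not documented MemVerge internals.

```python
# Hypothetical sketch: spread page allocations across DRAM and CXL in
# proportion to a user-selected ratio, so both devices' bandwidth is used.

def interleave(num_pages: int, dram_weight: int, cxl_weight: int) -> list[str]:
    """Assign pages round-robin in proportion to the DRAM:CXL ratio."""
    cycle = ["DRAM"] * dram_weight + ["CXL"] * cxl_weight
    return [cycle[i % len(cycle)] for i in range(num_pages)]

# Example: an assumed 3:1 DRAM:CXL ratio over 8 pages
print(interleave(8, 3, 1))
# ['DRAM', 'DRAM', 'DRAM', 'CXL', 'DRAM', 'DRAM', 'DRAM', 'CXL']
```

A higher DRAM weight biases toward lower latency; a more even split engages more of the CXL device's bandwidth, which is the trade-off the ratio setting exposes.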

Enabling the QoS Memory Engine

When the QoS Memory Engine is disabled, the following screen is displayed.

qos-dashboard-disabled

Quality of Service is disabled.

To enable the QoS Memory Engine, click either the 'Turn on QoS' button or the 'Settings' button, then click the 'Disabled' toggle.

qos-enable-policy-engine

When the QoS Memory Engine is enabled, additional settings are available.

Select the desired policy - Latency or Bandwidth, change the settings if required, then click 'Save' to apply the changes. A confirmation popup will appear. Click 'Start' to enable the policy engine. Only one policy can be enabled at a time.

qos-enable-policy-engine-confirmation-message

The active policy is shown in the upper-right of the page:

qos-bandwidth-policy-is-enabled

The Bandwidth Policy is Active and Enabled

qos-latency-policy-is-enabled

The Latency Policy is Active and Enabled

Latency Policy Settings

qos-latency-policy-settings

Bandwidth Policy Settings

qos-bandwidth-policy-settings

Changing Policies

To switch between the Latency and Bandwidth policies:

  1. Click the 'Settings' button.
  2. Select the desired policy.
  3. Click 'Save' in the lower-right of the settings window.
  4. Click 'Start' in the confirmation message.

Disabling the QoS Memory Engine

To disable the QoS Memory Engine, use the toggle button in the Settings section:

  1. Click the 'Settings' icon, then click the toggle button.

    qos-disable-policy-engine

  2. Click 'Stop' in the confirmation popup.

    qos-disable-policy-engine-confirmation-message