CXL Access Coordinates Computation

Latency and Bandwidth Calculation

A memory region’s performance coordinates (latency and bandwidth) are typically provided via the ACPI SRAT and HMAT tables. However, the platform firmware (BIOS) cannot annotate them for hot-plugged CXL devices, since those devices do not exist during platform firmware initialization. The CXL driver can instead compute the performance coordinates by retrieving data from several components.

The SRAT provides a Generic Port Affinity subtable that ties a proximity domain to a device handle, which in this case would be the CXL hostbridge. Using this association, the performance coordinates for the Generic Port can be retrieved from the HMAT subtable. This piece represents the performance coordinates between a CPU and a Generic Port (CXL hostbridge).

The CDAT provides the performance coordinates for the CXL device itself, that is, the bandwidth and latency to access that device’s memory region. The DSMAS subtable provides a DSMADHandle that is tied to a Device Physical Address (DPA) range. The DSLBIS subtable provides the performance coordinates tied to a DSMADHandle. Matching the handle across the two subtables yields the performance coordinates for each DPA region. For example, if a device exports a DRAM region and a PMEM region, then there would be different performance characteristics for each of those regions.
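The handle matching can be pictured as a simple join on DSMADHandle. The following is a minimal sketch, not the driver's implementation: the `Dsmas` and `Dslbis` classes, their field names, and the idea that the DSLBIS data has already been decoded into one record per handle are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dsmas:
    """Hypothetical parsed DSMAS entry: a handle tied to a DPA range."""
    handle: int
    dpa_base: int
    dpa_length: int


@dataclass
class Dslbis:
    """Hypothetical decoded DSLBIS data for one handle."""
    handle: int
    read_latency_ps: int
    write_latency_ps: int
    read_bandwidth_mb_s: int
    write_bandwidth_mb_s: int


def perf_by_dpa_range(dsmas_entries, dslbis_entries):
    """Map each (dpa_base, dpa_length) range to the DSLBIS data sharing its handle."""
    perf = {entry.handle: entry for entry in dslbis_entries}
    result = {}
    for dsmas in dsmas_entries:
        # A DSMAS entry without matching DSLBIS data is skipped in this sketch.
        if dsmas.handle in perf:
            result[(dsmas.dpa_base, dsmas.dpa_length)] = perf[dsmas.handle]
    return result
```

With one DRAM range and one PMEM range, the result holds two entries, each carrying its own latency and bandwidth figures.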

If there’s a CXL switch in the topology, then the performance coordinates for the switch are provided by the SSLBIS subtable. This provides the bandwidth and latency for traversing the switch between the switch upstream port and the switch downstream port that points to the endpoint device.

Simple topology example:

GP0/HB0/ACPI0016-0
       RP0
        |
        | L0
        |
    SW 0 / USP0
    SW 0 / DSP0
        |
        | L1
        |
       EP0

In this example, there is a CXL switch between an endpoint and a root port. Latency in this example is calculated as follows:

L(EP0) - Latency from EP0 CDAT DSMAS+DSLBIS
L(L1) - Link latency between EP0 and SW0DSP0
L(SW0) - Latency for the switch from SW0 CDAT SSLBIS
L(L0) - Link latency between SW0 and RP0
L(RP0) - Latency from root port to CPU via SRAT and HMAT (Generic Port)

Total read and write latencies are the sum of all these parts.
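As a small illustration of the summation, the sketch below adds up per-hop read latencies for the example topology. The function name and every numeric value are made up for the example; real values come from the CDAT, the negotiated link state, and the HMAT.

```python
def total_latency(parts):
    """Sum per-hop latencies; all values must share one unit (picoseconds here)."""
    return sum(parts.values())


# Illustrative read latencies only, in picoseconds.
read_latency_ps = {
    "EP0": 150_000,  # EP0 CDAT DSMAS+DSLBIS
    "L1": 1_000,     # link between EP0 and SW0DSP0
    "SW0": 30_000,   # SW0 CDAT SSLBIS
    "L0": 1_000,     # link between SW0 and RP0
    "RP0": 80_000,   # root port to CPU (Generic Port)
}

total_read_latency_ps = total_latency(read_latency_ps)
```

Write latency would be computed the same way from the write values of each part.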

Bandwidth in this example is calculated as follows:

B(EP0) - Bandwidth from EP0 CDAT DSMAS+DSLBIS
B(L1) - Link bandwidth between EP0 and SW0DSP0
B(SW0) - Bandwidth for the switch from SW0 CDAT SSLBIS
B(L0) - Link bandwidth between SW0 and RP0
B(RP0) - Bandwidth from root port to CPU via SRAT and HMAT (Generic Port)

The total read and write bandwidth is the min() of all these parts.
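The bandwidth rule says the slowest hop bounds the whole path. A minimal sketch follows, with a hypothetical function name and invented numbers used purely for illustration:

```python
def total_bandwidth(parts):
    """The path bandwidth is limited by its slowest part."""
    return min(parts.values())


# Illustrative read bandwidths only, in MB/s.
read_bandwidth_mb_s = {
    "EP0": 30_000,   # EP0 CDAT DSMAS+DSLBIS
    "L1": 64_000,    # link between EP0 and SW0DSP0
    "SW0": 50_000,   # SW0 CDAT SSLBIS
    "L0": 64_000,    # link between SW0 and RP0
    "RP0": 100_000,  # root port to CPU (Generic Port)
}

total_read_bandwidth_mb_s = total_bandwidth(read_bandwidth_mb_s)
```

In this made-up case the endpoint itself is the bottleneck, so faster links or a faster switch would not raise the total.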

To calculate the link bandwidth:

LinkOperatingFrequency (GT/s) is the current negotiated link speed.
DataRatePerLink (MB/s) = LinkOperatingFrequency / 8
Bandwidth (MB/s) = PCIeCurrentLinkWidth * DataRatePerLink

Where PCIeCurrentLinkWidth is the number of lanes in the link.
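A worked sketch of the link bandwidth formula is given here. It assumes the GT/s figure is first scaled to MT/s so that dividing by 8 bits per byte yields MB/s; that scaling, the function name, and the parameter names are assumptions of this example. Like the formula, it ignores line-encoding overhead.

```python
def link_bandwidth_mb_s(link_speed_gt_s, link_width):
    """Return link bandwidth in MB/s from negotiated speed (GT/s) and lane count."""
    # Assumption: 1 GT/s is treated as 1000 MT/s, one bit per transfer per lane.
    data_rate_per_lane_mb_s = link_speed_gt_s * 1000 / 8
    return link_width * data_rate_per_lane_mb_s


# Example: a link negotiated at 32 GT/s with 8 lanes.
example_bandwidth = link_bandwidth_mb_s(32, 8)
```

Under these assumptions, each 32 GT/s lane contributes 4000 MB/s, and the lanes add up linearly.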

To calculate the link latency:

LinkLatency (picoseconds) = FlitSize / LinkBandwidth (MB/s)
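The unit handling in the latency formula is easy to get wrong, so a small sketch may help. It assumes FlitSize is given in bytes: bytes divided by MB/s gives microseconds, so the result is scaled by 10^6 to reach picoseconds. The function name and the scaling step are assumptions of this example, not taken from the text above.

```python
def link_latency_ps(flit_size_bytes, link_bandwidth_mb_s):
    """Return the time to move one flit across the link, in picoseconds."""
    if link_bandwidth_mb_s <= 0:
        raise ValueError("link bandwidth must be positive")
    # bytes / (MB/s) = microseconds; 1 microsecond = 1_000_000 picoseconds.
    return flit_size_bytes * 1_000_000 / link_bandwidth_mb_s


# Example: a 68-byte flit over a 32000 MB/s link.
example_latency_ps = link_latency_ps(68, 32_000)
```

A larger flit or a slower link raises the per-flit latency proportionally.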

See CXL Memory Device SW Guide r1.0, sections 2.11.3 and 2.11.4 for details.

In the end, the access coordinates for a constructed memory region are calculated from one or more memory partitions from each of the CXL device(s).

QTG ID

Each CFMWS entry in the CEDT has a QTG ID field. This field provides the ID of the QoS Throttling Group (QTG) associated with the CFMWS window. Once the access coordinates are calculated, an ACPI Device Specific Method (_DSM) can be issued to the ACPI0016 device to retrieve the QTG ID that corresponds to the access coordinates provided. The QTG ID for the device can then be used as guidance to match the device to a CFMWS and set up the best Linux root decoder for the device performance.