drm/amdgpu AMDgpu driver

The drm/amdgpu driver supports all AMD Radeon GPUs based on the Graphics Core Next (GCN) architecture.

Module Parameters

The amdgpu driver supports the following module parameters:

vramlimit (int)

Restrict the total amount of VRAM in MiB for testing. The default is 0 (Use full VRAM).

vis_vramlimit (int)

Restrict the amount of CPU visible VRAM in MiB for testing. The default is 0 (Use full CPU visible VRAM).

gartsize (uint)

Restrict the size of GART in MiB (32, 64, etc.) for testing. The default is -1 (The size depends on asic).

gttsize (int)

Restrict the size of the GTT domain in MiB for testing. The default is -1 (the GTT size equals the VRAM size if 3GB < VRAM < 3/4 of RAM, otherwise 3/4 of the RAM size).

moverate (int)

Set maximum buffer migration rate in MB/s. The default is -1 (8 MB/s).

benchmark (int)

Run benchmarks. The default is 0 (Skip benchmarks).

test (int)

Test BO GTT->VRAM and VRAM->GTT GPU copies. The default is 0 (skip test); set 1 to run the test.

audio (int)

Set HDMI/DP audio. Only affects non-DC display handling. The default is -1 (enabled); set 0 to disable it.

disp_priority (int)

Set display Priority (1 = normal, 2 = high). Only affects non-DC display handling. The default is 0 (auto).

hw_i2c (int)

Enable the hardware i2c engine. Only affects non-DC display handling. The default is 0 (Disabled).

pcie_gen2 (int)

Enable/disable PCIe Gen2/3 mode (0 = disable, 1 = enable). The default is -1 (auto, enabled).

msi (int)

Enable/disable Message Signaled Interrupts (MSI) functionality (0 = disable, 1 = enable). The default is -1 (auto, enabled).

lockup_timeout (int)

Set the GPU scheduler timeout value in ms. A value of 0 is invalid and will be adjusted to 10000. Negative values mean ‘infinite timeout’ (MAX_JIFFY_OFFSET). The default is 10000.

dpm (int)

Override for dynamic power management setting (1 = enable, 0 = disable). The default is -1 (auto).

fw_load_type (int)

Set different firmware loading type for debugging (0 = direct, 1 = SMU, 2 = PSP). The default is -1 (auto).

aspm (int)

Enable/disable ASPM (0 = disable, 1 = enable). The default is -1 (auto, enabled).

runpm (int)

Override for runtime power management control for dGPUs in PX/HG laptops. The amdgpu driver can dynamically power down the dGPU on PX/HG laptops when it is idle. The default is -1 (auto enable). Setting the value to 0 disables this functionality.

ip_block_mask (uint)

Override which IP blocks are enabled on the GPU. Each GPU is a collection of IP blocks (gfx, display, video, etc.). Use this parameter to disable specific blocks. Note that the IP blocks do not have a fixed index. Some asics may not have some IPs or may include multiple instances of an IP, so the ordering varies from asic to asic. See the driver output in the kernel log for the list of IPs on the asic. The default is 0xffffffff (enable all blocks on a device).

bapm (int)

Bidirectional Application Power Management (BAPM) is used to dynamically share TDP between the CPU and GPU. Set to 0 to disable it. The default is -1 (auto, enabled).

deep_color (int)

Set 1 to enable Deep Color support. Only affects non-DC display handling. The default is 0 (disabled).

vm_size (int)

Override the size of the GPU’s per client virtual address space in GiB. The default is -1 (automatic for each asic).

vm_fragment_size (int)

Override the VM fragment size in bits (4, 5, etc.; 4 = 64K, 9 = 2M). The default is -1 (automatic for each asic).

vm_block_size (int)

Override VM page table size in bits (default depending on vm_size and hw setup). The default is -1 (automatic for each asic).

vm_fault_stop (int)

Stop on VM fault for debugging (0 = never, 1 = print first, 2 = always). The default is 0 (No stop).

vm_debug (int)

Debug VM handling (0 = disabled, 1 = enabled). The default is 0 (Disabled).

vm_update_mode (int)

Override the VM update mode; the VM is updated using the CPU (0 = never, 1 = graphics only, 2 = compute only, 3 = both). The default is -1 (only on large BAR (LB) systems are compute VM tables updated by the CPU, otherwise 0, never).

vram_page_split (int)

Override the number of pages after which VRAM allocations are split (-1 = disable splitting). The default is 512.

exp_hw_support (int)

Enable experimental hw support (1 = enable). The default is 0 (disabled).

dc (int)

Disable/Enable Display Core driver for debugging (1 = enable, 0 = disable). The default is -1 (automatic for each asic).

sched_jobs (int)

Override the max number of jobs supported in the sw queue. The default is 32.

sched_hw_submission (int)

Override the max number of HW submissions. The default is 2.

ppfeaturemask (uint)

Override power features enabled. See enum PP_FEATURE_MASK in drivers/gpu/drm/amd/include/amd_shared.h. The default is the current set of stable power features.

pcie_gen_cap (uint)

Override PCIE gen speed capabilities. See the CAIL flags in drivers/gpu/drm/amd/include/amd_pcie.h. The default is 0 (automatic for each asic).

pcie_lane_cap (uint)

Override PCIE lanes capabilities. See the CAIL flags in drivers/gpu/drm/amd/include/amd_pcie.h. The default is 0 (automatic for each asic).

cg_mask (uint)

Override Clockgating features enabled on GPU (0 = disable clock gating). See the AMD_CG_SUPPORT flags in drivers/gpu/drm/amd/include/amd_shared.h. The default is 0xffffffff (all enabled).

pg_mask (uint)

Override Powergating features enabled on GPU (0 = disable power gating). See the AMD_PG_SUPPORT flags in drivers/gpu/drm/amd/include/amd_shared.h. The default is 0xffffffff (all enabled).

sdma_phase_quantum (uint)

Override SDMA context switch phase quantum (x 1K GPU clock cycles, 0 = no change). The default is 32.

disable_cu (charp)

Set to disable CUs. The value is a comma-separated list of se.sh.cu triples. The default is NULL.

virtual_display (charp)

Set to enable the virtual display feature. This feature provides virtual display hardware on headless boards or in virtualized environments. The value is set like xxxx:xx:xx.x,x;xxxx:xx:xx.x,x: the pci address of the device, plus the number of crtcs to expose. E.g., 0000:26:00.0,4 would enable 4 virtual crtcs on the pci device at 26:00.0. The default is NULL.

ngg (int)

Set to enable Next Generation Graphics (1 = enable). The default is 0 (disabled).

prim_buf_per_se (int)

Override the size of the Primitive Buffer per Shader Engine in bytes. The default is 0 (depending on gfx).

pos_buf_per_se (int)

Override the size of the Position Buffer per Shader Engine in bytes. The default is 0 (depending on gfx).

cntl_sb_buf_per_se (int)

Override the size of the Control Sideband per Shader Engine in bytes. The default is 0 (depending on gfx).

param_buf_per_se (int)

Override the size of the Off-Chip Parameter Cache per Shader Engine in bytes. The default is 0 (depending on gfx).

job_hang_limit (int)

Set how long a job is allowed to hang without being dropped. The default is 0.

lbpw (int)

Override Load Balancing Per Watt (LBPW) support (1 = enable, 0 = disable). The default is -1 (auto, enabled).

gpu_recovery (int)

Set to enable GPU recovery mechanism (1 = enable, 0 = disable). The default is -1 (auto, disabled except SRIOV).

emu_mode (int)

Set value 1 to enable emulation mode. This is only needed when running on an emulator. The default is 0 (disabled).

si_support (int)

Set SI support driver. This parameter only takes effect when the kernel is built with CONFIG_DRM_AMDGPU_SI. For SI asics, when the radeon driver is enabled, set 0 to use the radeon driver or 1 to use the amdgpu driver. The default is to use the radeon driver when it is available, otherwise the amdgpu driver.

cik_support (int)

Set CIK support driver. This parameter only takes effect when the kernel is built with CONFIG_DRM_AMDGPU_CIK. For CIK asics, when the radeon driver is enabled, set 0 to use the radeon driver or 1 to use the amdgpu driver. The default is to use the radeon driver when it is available, otherwise the amdgpu driver.

smu_memory_pool_size (uint)

Reserve GTT memory for SMU debug usage; set 0 to disable it. The actual size is value * 256 MiB. E.g. 0x1 = 256 MiB, 0x2 = 512 MiB, 0x4 = 1 GiB, 0x8 = 2 GiB. The default is 0 (disabled).

sched_policy (int)

Set the scheduling policy. The default is HWS (hardware scheduling) with over-subscription. Setting 1 disables over-subscription. Setting 2 disables HWS and statically assigns queues to HQDs.

hws_max_conc_proc (int)

Maximum number of processes that HWS can schedule concurrently. The maximum is the number of VMIDs assigned to the HWS, which is also the default.

cwsr_enable (int)

CWSR (compute wave store and resume) allows the GPU to preempt shader execution in the middle of a compute wave. The default is 1 to enable this feature. Setting 0 disables it.

max_num_of_queues_per_device (int)

Maximum number of queues per device. Valid setting is between 1 and 4096. Default is 4096.

send_sigterm (int)

Send SIGTERM to an HSA process on unhandled exceptions. The default is not to send SIGTERM but only to print errors to dmesg. Setting 1 enables sending SIGTERM.

debug_largebar (int)

Set debug_largebar to 1 to simulate large-BAR capability on a non-large-BAR system. This limits the VRAM size reported to ROCm applications to the visible size, usually 256MB. The default is 0 (disabled).

ignore_crat (int)

Ignore CRAT table during KFD initialization. By default, KFD uses the ACPI CRAT table to get information about AMD APUs. This option can serve as a workaround on systems with a broken CRAT table.

noretry (int)

This parameter sets sh_mem_config.retry_disable. The default value, 0, enables retry; setting 1 disables retry. Retry is needed for recoverable page faults.

halt_if_hws_hang (int)

Halt if HWS hang is detected. Default value, 0, disables the halt on hang. Setting 1 enables halt on hang.

dcfeaturemask (uint)

Override display features enabled. See enum DC_FEATURE_MASK in drivers/gpu/drm/amd/include/amd_shared.h. The default is the current set of stable display features.

Core Driver Infrastructure

This section covers core driver infrastructure.

Memory Domains

AMDGPU_GEM_DOMAIN_CPU System memory that is not GPU accessible. Memory in this pool could be swapped out to disk if there is pressure.

AMDGPU_GEM_DOMAIN_GTT GPU accessible system memory, mapped into the GPU’s virtual address space via gart. Gart memory linearizes non-contiguous pages of system memory, allowing the GPU to access system memory in a linearized fashion.

AMDGPU_GEM_DOMAIN_VRAM Local video memory. For APUs, it is memory carved out by the BIOS.

AMDGPU_GEM_DOMAIN_GDS Global on-chip data storage used to share data across shader threads.

AMDGPU_GEM_DOMAIN_GWS Global wave sync, used to synchronize the execution of all the waves on a device.

AMDGPU_GEM_DOMAIN_OA Ordered append, used by 3D or Compute engines for appending data.

Buffer Objects

This defines the interfaces to operate on an amdgpu_bo buffer object, which represents memory used by the driver (VRAM, system memory, etc.). The driver provides DRM/GEM APIs to userspace. The DRM/GEM APIs then use these interfaces to create/destroy/set buffer objects, which are then managed by the kernel TTM memory manager. The interfaces are also used internally by kernel clients, including gfx, uvd, etc., for kernel managed allocations used by the GPU.

void amdgpu_bo_subtract_pin_size(struct amdgpu_bo * bo)

Remove BO from pin_size accounting

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object

Description

This function is called when a BO stops being pinned, and updates the amdgpu_device pin_size values accordingly.

bool amdgpu_bo_is_amdgpu_bo(struct ttm_buffer_object * bo)

check if the buffer object is an amdgpu_bo

Parameters

struct ttm_buffer_object * bo
buffer object to be checked

Description

Uses destroy function associated with the object to determine if this is an amdgpu_bo.

Return

true if the object belongs to amdgpu_bo, false if not.

void amdgpu_bo_placement_from_domain(struct amdgpu_bo * abo, u32 domain)

set buffer’s placement

Parameters

struct amdgpu_bo * abo
amdgpu_bo buffer object whose placement is to be set
u32 domain
requested domain

Description

Sets buffer’s placement according to requested domain and the buffer’s flags.

int amdgpu_bo_create_reserved(struct amdgpu_device * adev, unsigned long size, int align, u32 domain, struct amdgpu_bo ** bo_ptr, u64 * gpu_addr, void ** cpu_addr)

create reserved BO for kernel use

Parameters

struct amdgpu_device * adev
amdgpu device object
unsigned long size
size for the new BO
int align
alignment for the new BO
u32 domain
where to place it
struct amdgpu_bo ** bo_ptr
used to initialize BOs in structures
u64 * gpu_addr
GPU addr of the pinned BO
void ** cpu_addr
optional CPU address mapping

Description

Allocates and pins a BO for kernel internal use, and returns it still reserved.

Note

For bo_ptr, a new BO is only created if bo_ptr points to NULL.

Return

0 on success, negative error code otherwise.

int amdgpu_bo_create_kernel(struct amdgpu_device * adev, unsigned long size, int align, u32 domain, struct amdgpu_bo ** bo_ptr, u64 * gpu_addr, void ** cpu_addr)

create BO for kernel use

Parameters

struct amdgpu_device * adev
amdgpu device object
unsigned long size
size for the new BO
int align
alignment for the new BO
u32 domain
where to place it
struct amdgpu_bo ** bo_ptr
used to initialize BOs in structures
u64 * gpu_addr
GPU addr of the pinned BO
void ** cpu_addr
optional CPU address mapping

Description

Allocates and pins a BO for kernel internal use.

Note

For bo_ptr, a new BO is only created if bo_ptr points to NULL.

Return

0 on success, negative error code otherwise.

void amdgpu_bo_free_kernel(struct amdgpu_bo ** bo, u64 * gpu_addr, void ** cpu_addr)

free BO for kernel use

Parameters

struct amdgpu_bo ** bo
amdgpu BO to free
u64 * gpu_addr
pointer to where the BO’s GPU memory space address was stored
void ** cpu_addr
pointer to where the BO’s CPU memory space address was stored

Description

Unmaps and unpins a BO for kernel internal use.

int amdgpu_bo_create(struct amdgpu_device * adev, struct amdgpu_bo_param * bp, struct amdgpu_bo ** bo_ptr)

create an amdgpu_bo buffer object

Parameters

struct amdgpu_device * adev
amdgpu device object
struct amdgpu_bo_param * bp
parameters to be used for the buffer object
struct amdgpu_bo ** bo_ptr
pointer to the buffer object pointer

Description

Creates an amdgpu_bo buffer object and, if requested, also creates a shadow object. The shadow object is used to back up the original buffer object, and is always placed in GTT.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_backup_to_shadow(struct amdgpu_device * adev, struct amdgpu_ring * ring, struct amdgpu_bo * bo, struct reservation_object * resv, struct dma_fence ** fence, bool direct)

Backs up an amdgpu_bo buffer object

Parameters

struct amdgpu_device * adev
amdgpu device object
struct amdgpu_ring * ring
amdgpu_ring for the engine handling the buffer operations
struct amdgpu_bo * bo
amdgpu_bo buffer to be backed up
struct reservation_object * resv
reservation object with embedded fence
struct dma_fence ** fence
dma_fence associated with the operation
bool direct
whether to submit the job directly

Description

Copies an amdgpu_bo buffer object to its shadow object. Not used for now.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_validate(struct amdgpu_bo * bo)

validate an amdgpu_bo buffer object

Parameters

struct amdgpu_bo * bo
pointer to the buffer object

Description

Sets placement according to domain; and changes placement and caching policy of the buffer object according to the placement. This is used for validating shadow bos. It calls ttm_bo_validate() to make sure the buffer is resident where it needs to be.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_restore_shadow(struct amdgpu_bo * shadow, struct dma_fence ** fence)

restore an amdgpu_bo shadow

Parameters

struct amdgpu_bo * shadow
amdgpu_bo shadow to be restored
struct dma_fence ** fence
dma_fence associated with the operation

Description

Copies a buffer object’s shadow content back to the object. This is used for recovering a buffer from its shadow in case of a gpu reset where vram context may be lost.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_kmap(struct amdgpu_bo * bo, void ** ptr)

map an amdgpu_bo buffer object

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object to be mapped
void ** ptr
kernel virtual address to be returned

Description

Calls ttm_bo_kmap() to set up the kernel virtual mapping; calls amdgpu_bo_kptr() to get the kernel virtual address.

Return

0 for success or a negative error code on failure.

void * amdgpu_bo_kptr(struct amdgpu_bo * bo)

returns a kernel virtual address of the buffer object

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object

Description

Calls ttm_kmap_obj_virtual() to get the kernel virtual address.

Return

the virtual address of a buffer object area.

void amdgpu_bo_kunmap(struct amdgpu_bo * bo)

unmap an amdgpu_bo buffer object

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object to be unmapped

Description

Unmaps a kernel map set up by amdgpu_bo_kmap().

struct amdgpu_bo * amdgpu_bo_ref(struct amdgpu_bo * bo)

reference an amdgpu_bo buffer object

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object

Description

References the contained ttm_buffer_object.

Return

a refcounted pointer to the amdgpu_bo buffer object.

void amdgpu_bo_unref(struct amdgpu_bo ** bo)

unreference an amdgpu_bo buffer object

Parameters

struct amdgpu_bo ** bo
amdgpu_bo buffer object

Description

Unreferences the contained ttm_buffer_object and clears the pointer.

int amdgpu_bo_pin_restricted(struct amdgpu_bo * bo, u32 domain, u64 min_offset, u64 max_offset)

pin an amdgpu_bo buffer object

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object to be pinned
u32 domain
domain to be pinned to
u64 min_offset
the start of requested address range
u64 max_offset
the end of requested address range

Description

Pins the buffer object according to requested domain and address range. If the memory is unbound gart memory, binds the pages into gart table. Adjusts pin_count and pin_size accordingly.

Pinning means to lock pages in memory along with keeping them at a fixed offset. It is required when a buffer can not be moved, for example, when a display buffer is being scanned out.

Compared with amdgpu_bo_pin(), this function gives more flexibility on where to pin a buffer if there are specific restrictions on where a buffer must be located.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_pin(struct amdgpu_bo * bo, u32 domain)

pin an amdgpu_bo buffer object

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object to be pinned
u32 domain
domain to be pinned to

Description

A simple wrapper to amdgpu_bo_pin_restricted(). Provides a simpler API for buffers that do not have any strict restrictions on where a buffer must be located.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_unpin(struct amdgpu_bo * bo)

unpin an amdgpu_bo buffer object

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object to be unpinned

Description

Decreases the pin_count, and clears the flags if pin_count reaches 0. Changes placement and pin size accordingly.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_evict_vram(struct amdgpu_device * adev)

evict VRAM buffers

Parameters

struct amdgpu_device * adev
amdgpu device object

Description

Evicts all VRAM buffers on the lru list of the memory type. Mainly used for evicting vram at suspend time.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_init(struct amdgpu_device * adev)

initialize memory manager

Parameters

struct amdgpu_device * adev
amdgpu device object

Description

Calls amdgpu_ttm_init() to initialize amdgpu memory manager.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_late_init(struct amdgpu_device * adev)

late init

Parameters

struct amdgpu_device * adev
amdgpu device object

Description

Calls amdgpu_ttm_late_init() to free resources used earlier during initialization.

Return

0 for success or a negative error code on failure.

void amdgpu_bo_fini(struct amdgpu_device * adev)

tear down memory manager

Parameters

struct amdgpu_device * adev
amdgpu device object

Description

Reverses amdgpu_bo_init() to tear down memory manager.

int amdgpu_bo_fbdev_mmap(struct amdgpu_bo * bo, struct vm_area_struct * vma)

mmap fbdev memory

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object
struct vm_area_struct * vma
vma as input from the fbdev mmap method

Description

Calls ttm_fbdev_mmap() to mmap fbdev memory if it is backed by a bo.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_set_tiling_flags(struct amdgpu_bo * bo, u64 tiling_flags)

set tiling flags

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object
u64 tiling_flags
new flags

Description

Sets buffer object’s tiling flags with the new one. Used by GEM ioctl or kernel driver to set the tiling flags on a buffer.

Return

0 for success or a negative error code on failure.

void amdgpu_bo_get_tiling_flags(struct amdgpu_bo * bo, u64 * tiling_flags)

get tiling flags

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object
u64 * tiling_flags
returned flags

Description

Gets buffer object’s tiling flags. Used by GEM ioctl or kernel driver to query the tiling flags on a buffer.

int amdgpu_bo_set_metadata(struct amdgpu_bo * bo, void * metadata, uint32_t metadata_size, uint64_t flags)

set metadata

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object
void * metadata
new metadata
uint32_t metadata_size
size of the new metadata
uint64_t flags
flags of the new metadata

Description

Sets buffer object’s metadata, its size and flags. Used via GEM ioctl.

Return

0 for success or a negative error code on failure.

int amdgpu_bo_get_metadata(struct amdgpu_bo * bo, void * buffer, size_t buffer_size, uint32_t * metadata_size, uint64_t * flags)

get metadata

Parameters

struct amdgpu_bo * bo
amdgpu_bo buffer object
void * buffer
returned metadata
size_t buffer_size
size of the buffer
uint32_t * metadata_size
size of the returned metadata
uint64_t * flags
flags of the returned metadata

Description

Gets buffer object’s metadata, its size and flags. buffer_size shall not be less than metadata_size. Used via GEM ioctl.

Return

0 for success or a negative error code on failure.

void amdgpu_bo_move_notify(struct ttm_buffer_object * bo, bool evict, struct ttm_mem_reg * new_mem)

notification about a memory move

Parameters

struct ttm_buffer_object * bo
pointer to a buffer object
bool evict
if this move is evicting the buffer from the graphics address space
struct ttm_mem_reg * new_mem
new information of the buffer object

Description

Marks the corresponding amdgpu_bo buffer object as invalid, also performs bookkeeping. TTM driver callback which is called when ttm moves a buffer.

int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object * bo)

notification about a memory fault

Parameters

struct ttm_buffer_object * bo
pointer to a buffer object

Description

Notifies the driver we are taking a fault on this BO and have reserved it, also performs bookkeeping. TTM driver callback for dealing with vm faults.

Return

0 for success or a negative error code on failure.

void amdgpu_bo_fence(struct amdgpu_bo * bo, struct dma_fence * fence, bool shared)

add fence to buffer object

Parameters

struct amdgpu_bo * bo
buffer object in question
struct dma_fence * fence
fence to add
bool shared
true if fence should be added shared

u64 amdgpu_bo_gpu_offset(struct amdgpu_bo * bo)

return GPU offset of bo

Parameters

struct amdgpu_bo * bo
amdgpu object for which we query the offset

Note

The object should either be pinned or reserved when calling this function; it might be useful to add a check for this for debugging.

Return

current GPU offset of the object.

uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device * adev, uint32_t domain)

get preferred domain for scanout

Parameters

struct amdgpu_device * adev
amdgpu device object
uint32_t domain
allowed memory domains

Return

Which of the allowed domains is preferred for pinning the BO for scanout.

PRIME Buffer Sharing

The following callback implementations are used for sharing GEM buffer objects between different devices via PRIME.

struct sg_table * amdgpu_gem_prime_get_sg_table(struct drm_gem_object * obj)

drm_driver.gem_prime_get_sg_table implementation

Parameters

struct drm_gem_object * obj
GEM buffer object (BO)

Return

A scatter/gather table for the pinned pages of the BO’s memory.

void * amdgpu_gem_prime_vmap(struct drm_gem_object * obj)

dma_buf_ops.vmap implementation

Parameters

struct drm_gem_object * obj
GEM BO

Description

Sets up an in-kernel virtual mapping of the BO’s memory.

Return

The virtual address of the mapping or an error pointer.

void amdgpu_gem_prime_vunmap(struct drm_gem_object * obj, void * vaddr)

dma_buf_ops.vunmap implementation

Parameters

struct drm_gem_object * obj
GEM BO
void * vaddr
Virtual address (unused)

Description

Tears down the in-kernel virtual mapping of the BO’s memory.

int amdgpu_gem_prime_mmap(struct drm_gem_object * obj, struct vm_area_struct * vma)

drm_driver.gem_prime_mmap implementation

Parameters

struct drm_gem_object * obj
GEM BO
struct vm_area_struct * vma
Virtual memory area

Description

Sets up a userspace mapping of the BO’s memory in the given virtual memory area.

Return

0 on success or a negative error code on failure.

struct drm_gem_object * amdgpu_gem_prime_import_sg_table(struct drm_device * dev, struct dma_buf_attachment * attach, struct sg_table * sg)

drm_driver.gem_prime_import_sg_table implementation

Parameters

struct drm_device * dev
DRM device
struct dma_buf_attachment * attach
DMA-buf attachment
struct sg_table * sg
Scatter/gather table

Description

Imports shared DMA buffer memory exported by another device.

Return

A new GEM BO of the given DRM device, representing the memory described by the given DMA-buf attachment and scatter/gather table.

int amdgpu_gem_map_attach(struct dma_buf * dma_buf, struct dma_buf_attachment * attach)

dma_buf_ops.attach implementation

Parameters

struct dma_buf * dma_buf
Shared DMA buffer
struct dma_buf_attachment * attach
DMA-buf attachment

Description

Makes sure that the shared DMA buffer can be accessed by the target device. For now, simply pins it to the GTT domain, where it should be accessible by all DMA devices.

Return

0 on success or a negative error code on failure.

void amdgpu_gem_map_detach(struct dma_buf * dma_buf, struct dma_buf_attachment * attach)

dma_buf_ops.detach implementation

Parameters

struct dma_buf * dma_buf
Shared DMA buffer
struct dma_buf_attachment * attach
DMA-buf attachment

Description

This is called when a shared DMA buffer no longer needs to be accessible by another device. For now, simply unpins the buffer from GTT.

struct reservation_object * amdgpu_gem_prime_res_obj(struct drm_gem_object * obj)

drm_driver.gem_prime_res_obj implementation

Parameters

struct drm_gem_object * obj
GEM BO

Return

The BO’s reservation object.

int amdgpu_gem_begin_cpu_access(struct dma_buf * dma_buf, enum dma_data_direction direction)

dma_buf_ops.begin_cpu_access implementation

Parameters

struct dma_buf * dma_buf
Shared DMA buffer
enum dma_data_direction direction
Direction of DMA transfer

Description

This is called before CPU access to the shared DMA buffer’s memory. If it’s a read access, the buffer is moved to the GTT domain if possible, for optimal CPU read performance.

Return

0 on success or a negative error code on failure.

struct dma_buf * amdgpu_gem_prime_export(struct drm_device * dev, struct drm_gem_object * gobj, int flags)

drm_driver.gem_prime_export implementation

Parameters

struct drm_device * dev
DRM device
struct drm_gem_object * gobj
GEM BO
int flags
Flags such as DRM_CLOEXEC and DRM_RDWR.

Description

The main work is done by the drm_gem_prime_export helper, which in turn uses amdgpu_gem_prime_res_obj.

Return

Shared DMA buffer representing the GEM BO from the given device.

struct drm_gem_object * amdgpu_gem_prime_import(struct drm_device * dev, struct dma_buf * dma_buf)

drm_driver.gem_prime_import implementation

Parameters

struct drm_device * dev
DRM device
struct dma_buf * dma_buf
Shared DMA buffer

Description

The main work is done by the drm_gem_prime_import helper, which in turn uses amdgpu_gem_prime_import_sg_table.

Return

GEM BO representing the shared DMA buffer for the given device.

MMU Notifier

For coherent userptr handling, the driver registers an MMU notifier to get informed about updates on the page tables of a process.

When somebody tries to invalidate the page tables we block the update until all operations on the pages in question are completed, then those pages are marked as accessed and also dirty if it wasn’t a read only access.

New command submissions using the userptrs in question are delayed until all page table invalidations are completed and we once more see a coherent process address space.

struct amdgpu_mn

Definition

struct amdgpu_mn {
  struct amdgpu_device    *adev;
  struct mm_struct        *mm;
  struct mmu_notifier     mn;
  enum amdgpu_mn_type     type;
  struct work_struct      work;
  struct hlist_node       node;
  struct rw_semaphore     lock;
  struct rb_root_cached   objects;
  struct mutex            read_lock;
  atomic_t                recursion;
};

Members

adev
amdgpu device pointer
mm
process address space
mn
MMU notifier structure
type
type of MMU notifier
work
destruction work item
node
hash table node to find structure by adev and mn
lock
rw semaphore protecting the notifier nodes
objects
interval tree containing amdgpu_mn_nodes
read_lock
mutex for recursive locking of lock
recursion
depth of recursion

Description

Data for each amdgpu device and process address space.

struct amdgpu_mn_node

Definition

struct amdgpu_mn_node {
  struct interval_tree_node       it;
  struct list_head                bos;
};

Members

it
interval node defining start-last of the affected address range
bos
list of all BOs in the affected address range

Description

Manages all BOs which are affected of a certain range of address space.

void amdgpu_mn_destroy(struct work_struct * work)

destroy the MMU notifier

Parameters

struct work_struct * work
previously scheduled work item

Description

Lazily destroys the notifier from a work item.

void amdgpu_mn_release(struct mmu_notifier * mn, struct mm_struct * mm)

callback to notify about mm destruction

Parameters

struct mmu_notifier * mn
our notifier
struct mm_struct * mm
the mm this callback is about

Description

Schedule a work item to lazily destroy our notifier.

void amdgpu_mn_lock(struct amdgpu_mn * mn)

take the write side lock for this notifier

Parameters

struct amdgpu_mn * mn
our notifier

void amdgpu_mn_unlock(struct amdgpu_mn * mn)

drop the write side lock for this notifier

Parameters

struct amdgpu_mn * mn
our notifier

int amdgpu_mn_read_lock(struct amdgpu_mn * amn, bool blockable)

take the read side lock for this notifier

Parameters

struct amdgpu_mn * amn
our notifier
bool blockable
whether we can block while taking the lock

void amdgpu_mn_read_unlock(struct amdgpu_mn * amn)

drop the read side lock for this notifier

Parameters

struct amdgpu_mn * amn
our notifier
void amdgpu_mn_invalidate_node(struct amdgpu_mn_node * node, unsigned long start, unsigned long end)

unmap all BOs of a node

Parameters

struct amdgpu_mn_node * node
the node with the BOs to unmap
unsigned long start
start of address range affected
unsigned long end
end of address range affected

Description

Block for operations on BOs to finish and mark pages as accessed and potentially dirty.

int amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier * mn, struct mm_struct * mm, unsigned long start, unsigned long end, bool blockable)

callback to notify about mm change

Parameters

struct mmu_notifier * mn
our notifier
struct mm_struct * mm
the mm this callback is about
unsigned long start
start of updated range
unsigned long end
end of updated range
bool blockable
whether we can block in the notifier callback

Description

Block for operations on BOs to finish and mark pages as accessed and potentially dirty.

int amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier * mn, struct mm_struct * mm, unsigned long start, unsigned long end, bool blockable)

callback to notify about mm change

Parameters

struct mmu_notifier * mn
our notifier
struct mm_struct * mm
the mm this callback is about
unsigned long start
start of updated range
unsigned long end
end of updated range
bool blockable
whether we can block in the notifier callback

Description

We temporarily evict all BOs between start and end. This necessitates evicting all user-mode queues of the process. The BOs are restored in amdgpu_mn_invalidate_range_end_hsa.

void amdgpu_mn_invalidate_range_end(struct mmu_notifier * mn, struct mm_struct * mm, unsigned long start, unsigned long end)

callback to notify about mm change

Parameters

struct mmu_notifier * mn
our notifier
struct mm_struct * mm
the mm this callback is about
unsigned long start
start of updated range
unsigned long end
end of updated range

Description

Release the lock again to allow new command submissions.

struct amdgpu_mn * amdgpu_mn_get(struct amdgpu_device * adev, enum amdgpu_mn_type type)

create notifier context

Parameters

struct amdgpu_device * adev
amdgpu device pointer
enum amdgpu_mn_type type
type of MMU notifier context

Description

Creates a notifier context for current->mm.

int amdgpu_mn_register(struct amdgpu_bo * bo, unsigned long addr)

register a BO for notifier updates

Parameters

struct amdgpu_bo * bo
amdgpu buffer object
unsigned long addr
userptr addr we should monitor

Description

Registers an MMU notifier for the given BO at the specified address. Returns 0 on success, -ERRNO if anything goes wrong.

void amdgpu_mn_unregister(struct amdgpu_bo * bo)

unregister a BO for notifier updates

Parameters

struct amdgpu_bo * bo
amdgpu buffer object

Description

Remove any registration of MMU notifier updates from the buffer object.

AMDGPU Virtual Memory

GPUVM is similar to the legacy GART on older ASICs; however, rather than there being a single global GART table for the entire GPU, there are multiple VM page tables active at any given time. The VM page tables can contain a mix of VRAM pages and system memory pages, and system memory pages can be mapped as snooped (cached system pages) or unsnooped (uncached system pages). Each VM has an ID associated with it and there is a page table associated with each VMID. When executing a command buffer, the kernel tells the ring which VMID to use for that command buffer. VMIDs are allocated dynamically as commands are submitted. The userspace drivers maintain their own address space and the kernel sets up their page tables accordingly when they submit their command buffers and a VMID is assigned. Cayman/Trinity support up to 8 active VMs at any given time; SI supports 16.
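
As a rough illustration of how a GPU virtual address selects an entry in such a multi-level page table, here is a self-contained sketch. The constants (4 levels, 9 bits per level, 4 KiB pages) are illustrative assumptions for this example only, not the exact layout of any particular ASIC.

```c
#include <stdint.h>

/* Hypothetical layout: 4 levels with 9 bits each and 4 KiB pages,
 * similar in spirit to the amdgpu VMPT hierarchy. */
#define VM_PAGE_SHIFT   12u
#define VM_BITS_PER_LVL 9u
#define VM_NUM_LEVELS   4u

/* Number of bits the pfn is right-shifted to index a given level
 * (level 0 = root PD, VM_NUM_LEVELS - 1 = leaf PT). */
static unsigned vm_level_shift(unsigned level)
{
	return (VM_NUM_LEVELS - 1 - level) * VM_BITS_PER_LVL;
}

/* Entry index inside the PD/PT at @level for a GPU virtual address. */
static unsigned vm_entry_index(uint64_t addr, unsigned level)
{
	uint64_t pfn = addr >> VM_PAGE_SHIFT;

	return (unsigned)((pfn >> vm_level_shift(level)) &
			  ((1u << VM_BITS_PER_LVL) - 1));
}
```

This mirrors what amdgpu_vm_level_shift and amdgpu_vm_entries_mask compute for the real hierarchy.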

struct amdgpu_pte_update_params

Local structure

Definition

struct amdgpu_pte_update_params {
  struct amdgpu_device *adev;
  struct amdgpu_vm *vm;
  uint64_t src;
  struct amdgpu_ib *ib;
  void (*func)(struct amdgpu_pte_update_params *params,struct amdgpu_bo *bo, uint64_t pe,uint64_t addr, unsigned count, uint32_t incr, uint64_t flags);
  dma_addr_t *pages_addr;
  void *kptr;
};

Members

adev
amdgpu device we do this update for
vm
optional amdgpu_vm we do this update for
src
address where to copy page table entries from
ib
indirect buffer to fill with commands
func
Function which actually does the update
pages_addr
DMA addresses to use for mapping, used during VM update by CPU
kptr
Kernel pointer of PD/PT BO that needs to be updated, used during VM update by CPU

Description

Encapsulate some VM table update parameters to reduce the number of function parameters

struct amdgpu_prt_cb

Helper to disable partial resident texture feature from a fence callback

Definition

struct amdgpu_prt_cb {
  struct amdgpu_device *adev;
  struct dma_fence_cb cb;
};

Members

adev
amdgpu device
cb
callback
unsigned amdgpu_vm_level_shift(struct amdgpu_device * adev, unsigned level)

return the addr shift for each level

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
unsigned level
VMPT level

Return

The number of bits the pfn needs to be right shifted for a level.

unsigned amdgpu_vm_num_entries(struct amdgpu_device * adev, unsigned level)

return the number of entries in a PD/PT

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
unsigned level
VMPT level

Return

The number of entries in a page directory or page table.

uint32_t amdgpu_vm_entries_mask(struct amdgpu_device * adev, unsigned int level)

the mask to get the entry number of a PD/PT

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
unsigned int level
VMPT level

Return

The mask to extract the entry number of a PD/PT from an address.

unsigned amdgpu_vm_bo_size(struct amdgpu_device * adev, unsigned level)

returns the size of the BOs in bytes

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
unsigned level
VMPT level

Return

The size of the BO for a page directory or page table in bytes.

void amdgpu_vm_bo_evicted(struct amdgpu_vm_bo_base * vm_bo)

vm_bo is evicted

Parameters

struct amdgpu_vm_bo_base * vm_bo
vm_bo which is evicted

Description

State for PDs/PTs and per VM BOs which are not at the location they should be.

void amdgpu_vm_bo_relocated(struct amdgpu_vm_bo_base * vm_bo)

vm_bo is relocated

Parameters

struct amdgpu_vm_bo_base * vm_bo
vm_bo which is relocated

Description

State for PDs/PTs which need to update their parent PD.

void amdgpu_vm_bo_moved(struct amdgpu_vm_bo_base * vm_bo)

vm_bo is moved

Parameters

struct amdgpu_vm_bo_base * vm_bo
vm_bo which is moved

Description

State for per VM BOs which are moved, but that change is not yet reflected in the page tables.

void amdgpu_vm_bo_idle(struct amdgpu_vm_bo_base * vm_bo)

vm_bo is idle

Parameters

struct amdgpu_vm_bo_base * vm_bo
vm_bo which is now idle

Description

State for PDs/PTs and per VM BOs which have gone through the state machine and are now idle.

void amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base * vm_bo)

vm_bo is invalidated

Parameters

struct amdgpu_vm_bo_base * vm_bo
vm_bo which is now invalidated

Description

State for normal BOs which are invalidated and whose change is not yet reflected in the PTs.

void amdgpu_vm_bo_done(struct amdgpu_vm_bo_base * vm_bo)

vm_bo is done

Parameters

struct amdgpu_vm_bo_base * vm_bo
vm_bo which is now done

Description

State for normal BOs which were invalidated and whose change has been updated in the PTs.

void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base * base, struct amdgpu_vm * vm, struct amdgpu_bo * bo)

Adds bo to the list of bos associated with the vm

Parameters

struct amdgpu_vm_bo_base * base
base structure for tracking BO usage in a VM
struct amdgpu_vm * vm
vm to which bo is to be added
struct amdgpu_bo * bo
amdgpu buffer object

Description

Initialize a bo_va_base structure and add it to the appropriate lists

struct amdgpu_vm_pt * amdgpu_vm_pt_parent(struct amdgpu_vm_pt * pt)

get the parent page directory

Parameters

struct amdgpu_vm_pt * pt
child page table

Description

Helper to get the parent entry for the child page table. NULL if we are at the root page directory.

void amdgpu_vm_pt_start(struct amdgpu_device * adev, struct amdgpu_vm * vm, uint64_t start, struct amdgpu_vm_pt_cursor * cursor)

start PD/PT walk

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
amdgpu_vm structure
uint64_t start
start address of the walk
struct amdgpu_vm_pt_cursor * cursor
state to initialize

Description

Initialize a amdgpu_vm_pt_cursor to start a walk.

bool amdgpu_vm_pt_descendant(struct amdgpu_device * adev, struct amdgpu_vm_pt_cursor * cursor)

go to child node

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm_pt_cursor * cursor
current state

Description

Walk to the child node of the current node.

Return

True if the walk was possible, false otherwise.

bool amdgpu_vm_pt_sibling(struct amdgpu_device * adev, struct amdgpu_vm_pt_cursor * cursor)

go to sibling node

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm_pt_cursor * cursor
current state

Description

Walk to the sibling node of the current node.

Return

True if the walk was possible, false otherwise.

bool amdgpu_vm_pt_ancestor(struct amdgpu_vm_pt_cursor * cursor)

go to parent node

Parameters

struct amdgpu_vm_pt_cursor * cursor
current state

Description

Walk to the parent node of the current node.

Return

True if the walk was possible, false otherwise.

void amdgpu_vm_pt_next(struct amdgpu_device * adev, struct amdgpu_vm_pt_cursor * cursor)

get next PD/PT in hierarchy

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm_pt_cursor * cursor
current state

Description

Walk the PD/PT tree to the next node.
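
The walk performed by amdgpu_vm_pt_next can be sketched as a generic pre-order cursor over a tree: try the first child, then a sibling, then climb until a sibling exists. The structures and names below are illustrative stand-ins, not the driver's actual types.

```c
#include <stddef.h>

/* Minimal stand-in for the PD/PT hierarchy: each node knows its
 * parent and an array of children. */
struct pt_node {
	struct pt_node *parent;
	struct pt_node *child[4];
	unsigned nchild;
	int id;
};

struct pt_cursor {
	struct pt_node *node;	/* current PD/PT */
	unsigned idx;		/* index of @node within its parent */
};

static int pt_descendant(struct pt_cursor *c)
{
	if (!c->node->nchild)
		return 0;
	c->node = c->node->child[0];
	c->idx = 0;
	return 1;
}

static int pt_sibling(struct pt_cursor *c)
{
	struct pt_node *p = c->node->parent;

	if (!p || c->idx + 1 >= p->nchild)
		return 0;
	c->idx++;
	c->node = p->child[c->idx];
	return 1;
}

static int pt_ancestor(struct pt_cursor *c)
{
	if (!c->node->parent)
		return 0;
	c->node = c->node->parent;
	c->idx = 0;
	if (c->node->parent) {
		/* recompute our index within the new parent */
		struct pt_node *p = c->node->parent;
		for (unsigned i = 0; i < p->nchild; i++)
			if (p->child[i] == c->node)
				c->idx = i;
	}
	return 1;
}

/* Pre-order step: first child, else a sibling, else climb until a
 * sibling exists; NULL when the walk is finished. */
static void pt_next(struct pt_cursor *c)
{
	if (pt_descendant(c))
		return;
	while (!pt_sibling(c)) {
		if (!pt_ancestor(c)) {
			c->node = NULL;
			return;
		}
	}
}
```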

void amdgpu_vm_pt_first_leaf(struct amdgpu_device * adev, struct amdgpu_vm * vm, uint64_t start, struct amdgpu_vm_pt_cursor * cursor)

get first leaf PD/PT

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
amdgpu_vm structure
uint64_t start
start addr of the walk
struct amdgpu_vm_pt_cursor * cursor
state to initialize

Description

Start a walk and go directly to the leaf node.

void amdgpu_vm_pt_next_leaf(struct amdgpu_device * adev, struct amdgpu_vm_pt_cursor * cursor)

get next leaf PD/PT

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm_pt_cursor * cursor
current state

Description

Walk the PD/PT tree to the next leaf node.

for_each_amdgpu_vm_pt_leaf(adev, vm, start, end, cursor)

walk over all leaf PDs/PTs in the hierarchy

Parameters

adev
amdgpu_device pointer
vm
amdgpu_vm structure
start
start address of the walk
end
end address of the walk
cursor
cursor state to use for the walk
void amdgpu_vm_pt_first_dfs(struct amdgpu_device * adev, struct amdgpu_vm * vm, struct amdgpu_vm_pt_cursor * cursor)

start a depth-first search

Parameters

struct amdgpu_device * adev
amdgpu_device structure
struct amdgpu_vm * vm
amdgpu_vm structure
struct amdgpu_vm_pt_cursor * cursor
state to initialize

Description

Starts a depth-first traversal of the PD/PT tree.

void amdgpu_vm_pt_next_dfs(struct amdgpu_device * adev, struct amdgpu_vm_pt_cursor * cursor)

get the next node for a depth-first search

Parameters

struct amdgpu_device * adev
amdgpu_device structure
struct amdgpu_vm_pt_cursor * cursor
current state

Description

Move the cursor to the next node in a depth-first search.

for_each_amdgpu_vm_pt_dfs_safe(adev, vm, cursor, entry)

safe depth-first search of all PDs/PTs

Parameters

adev
amdgpu_device pointer
vm
amdgpu_vm structure
cursor
current state of the walk
entry
current entry in the walk
void amdgpu_vm_get_pd_bo(struct amdgpu_vm * vm, struct list_head * validated, struct amdgpu_bo_list_entry * entry)

add the VM PD to a validation list

Parameters

struct amdgpu_vm * vm
vm providing the BOs
struct list_head * validated
head of validation list
struct amdgpu_bo_list_entry * entry
entry to add

Description

Add the page directory to the list of BOs to validate for command submission.

void amdgpu_vm_move_to_lru_tail(struct amdgpu_device * adev, struct amdgpu_vm * vm)

move all BOs to the end of LRU

Parameters

struct amdgpu_device * adev
amdgpu device pointer
struct amdgpu_vm * vm
vm providing the BOs

Description

Move all BOs to the end of LRU and remember their positions to put them together.

int amdgpu_vm_validate_pt_bos(struct amdgpu_device * adev, struct amdgpu_vm * vm, int (*validate)(void *p, struct amdgpu_bo *bo), void * param)

validate the page table BOs

Parameters

struct amdgpu_device * adev
amdgpu device pointer
struct amdgpu_vm * vm
vm providing the BOs
int (*)(void *p, struct amdgpu_bo *bo) validate
callback to do the validation
void * param
parameter for the validation callback

Description

Validate the page table BOs on command submission if necessary.

Return

Validation result.

bool amdgpu_vm_ready(struct amdgpu_vm * vm)

check VM is ready for updates

Parameters

struct amdgpu_vm * vm
VM to check

Description

Check if all VM PDs/PTs are ready for updates

Return

True if eviction list is empty.

int amdgpu_vm_clear_bo(struct amdgpu_device * adev, struct amdgpu_vm * vm, struct amdgpu_bo * bo, unsigned level, bool pte_support_ats)

initially clear the PDs/PTs

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
VM to clear BO from
struct amdgpu_bo * bo
BO to clear
unsigned level
level this BO is at
bool pte_support_ats
indicate ATS support from PTE

Description

Root PD needs to be reserved when calling this.

Return

0 on success, errno otherwise.

void amdgpu_vm_bo_param(struct amdgpu_device * adev, struct amdgpu_vm * vm, int level, struct amdgpu_bo_param * bp)

fill in parameters for PD/PT allocation

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requesting vm
int level
VMPT level of the PD/PT to allocate
struct amdgpu_bo_param * bp
resulting BO allocation parameters
int amdgpu_vm_alloc_pts(struct amdgpu_device * adev, struct amdgpu_vm * vm, uint64_t saddr, uint64_t size)

Allocate page tables.

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
VM to allocate page tables for
uint64_t saddr
Start address which needs to be allocated
uint64_t size
Size from start address we need.

Description

Make sure the page directories and page tables are allocated

Return

0 on success, errno otherwise.

void amdgpu_vm_free_pts(struct amdgpu_device * adev, struct amdgpu_vm * vm)

free PD/PT levels

Parameters

struct amdgpu_device * adev
amdgpu device structure
struct amdgpu_vm * vm
amdgpu vm structure

Description

Free the page directory or page table level and all sub levels.

void amdgpu_vm_check_compute_bug(struct amdgpu_device * adev)

check whether asic has compute vm bug

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring * ring, struct amdgpu_job * job)

Check if pipe sync is needed for job.

Parameters

struct amdgpu_ring * ring
ring on which the job will be submitted
struct amdgpu_job * job
job to submit

Return

True if sync is needed.

int amdgpu_vm_flush(struct amdgpu_ring * ring, struct amdgpu_job * job, bool need_pipe_sync)

hardware flush the vm

Parameters

struct amdgpu_ring * ring
ring to use for flush
struct amdgpu_job * job
related job
bool need_pipe_sync
is pipe sync needed

Description

Emit a VM flush when it is necessary.

Return

0 on success, errno otherwise.

struct amdgpu_bo_va * amdgpu_vm_bo_find(struct amdgpu_vm * vm, struct amdgpu_bo * bo)

find the bo_va for a specific vm & bo

Parameters

struct amdgpu_vm * vm
requested vm
struct amdgpu_bo * bo
requested buffer object

Description

Find bo inside the requested vm. Search inside the bos vm list for the requested vm. Returns the found bo_va or NULL if none is found.

Object has to be reserved!

Return

Found bo_va or NULL.

void amdgpu_vm_do_set_ptes(struct amdgpu_pte_update_params * params, struct amdgpu_bo * bo, uint64_t pe, uint64_t addr, unsigned count, uint32_t incr, uint64_t flags)

helper to call the right asic function

Parameters

struct amdgpu_pte_update_params * params
see amdgpu_pte_update_params definition
struct amdgpu_bo * bo
PD/PT to update
uint64_t pe
addr of the page entry
uint64_t addr
dst addr to write into pe
unsigned count
number of page entries to update
uint32_t incr
increase next addr by incr bytes
uint64_t flags
hw access flags

Description

Traces the parameters and calls the right asic functions to setup the page table using the DMA.

void amdgpu_vm_do_copy_ptes(struct amdgpu_pte_update_params * params, struct amdgpu_bo * bo, uint64_t pe, uint64_t addr, unsigned count, uint32_t incr, uint64_t flags)

copy the PTEs from the GART

Parameters

struct amdgpu_pte_update_params * params
see amdgpu_pte_update_params definition
struct amdgpu_bo * bo
PD/PT to update
uint64_t pe
addr of the page entry
uint64_t addr
dst addr to write into pe
unsigned count
number of page entries to update
uint32_t incr
increase next addr by incr bytes
uint64_t flags
hw access flags

Description

Traces the parameters and calls the DMA function to copy the PTEs.

uint64_t amdgpu_vm_map_gart(const dma_addr_t * pages_addr, uint64_t addr)

Resolve gart mapping of addr

Parameters

const dma_addr_t * pages_addr
optional DMA address to use for lookup
uint64_t addr
the unmapped addr

Description

Look up the physical address of the page that the pte resolves to.

Return

The pointer for the page table entry.
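
A minimal sketch of this lookup, with a plain array standing in for the optional DMA address table and a hypothetical 4 KiB GART page size: the page index selects the DMA address and the in-page offset is carried over.

```c
#include <stdint.h>
#include <stddef.h>

#define GART_PAGE_SHIFT 12u
#define GART_PAGE_MASK  ((1ull << GART_PAGE_SHIFT) - 1)

/* Resolve a GART offset to a system DMA address. pages_addr may be
 * NULL, in which case addr is used as-is. */
static uint64_t map_gart(const uint64_t *pages_addr, uint64_t addr)
{
	uint64_t result;

	if (!pages_addr)
		return addr;

	result = pages_addr[addr >> GART_PAGE_SHIFT];
	result |= addr & GART_PAGE_MASK;
	return result;
}
```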

void amdgpu_vm_cpu_set_ptes(struct amdgpu_pte_update_params * params, struct amdgpu_bo * bo, uint64_t pe, uint64_t addr, unsigned count, uint32_t incr, uint64_t flags)

helper to update page tables via CPU

Parameters

struct amdgpu_pte_update_params * params
see amdgpu_pte_update_params definition
struct amdgpu_bo * bo
PD/PT to update
uint64_t pe
kmap addr of the page entry
uint64_t addr
dst addr to write into pe
unsigned count
number of page entries to update
uint32_t incr
increase next addr by incr bytes
uint64_t flags
hw access flags

Description

Write count number of PT/PD entries directly.
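
A sketch of such a direct CPU update loop: write count entries at a stride of incr bytes per destination page. The PTE layout here (page-aligned address ORed with flag bits) is a hypothetical simplification.

```c
#include <stdint.h>

/* Write @count PTEs into @pe: each entry maps @addr with @flags, and
 * @addr advances by @incr bytes per entry. Illustrative only. */
static void cpu_set_ptes(uint64_t *pe, uint64_t addr, unsigned count,
			 uint32_t incr, uint64_t flags)
{
	for (unsigned i = 0; i < count; i++) {
		pe[i] = (addr & ~0xFFFull) | flags;
		addr += incr;
	}
}
```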

int amdgpu_vm_wait_pd(struct amdgpu_device * adev, struct amdgpu_vm * vm, void * owner)

Wait for PT BOs to be free.

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
related vm
void * owner
fence owner

Return

0 on success, errno otherwise.

void amdgpu_vm_update_func(struct amdgpu_pte_update_params * params, struct amdgpu_bo * bo, uint64_t pe, uint64_t addr, unsigned count, uint32_t incr, uint64_t flags)

helper to call update function

Parameters

struct amdgpu_pte_update_params * params
see amdgpu_pte_update_params definition
struct amdgpu_bo * bo
PD/PT to update
uint64_t pe
addr of the page entry
uint64_t addr
dst addr to write into pe
unsigned count
number of page entries to update
uint32_t incr
increase next addr by incr bytes
uint64_t flags
hw access flags

Description

Calls the update function for both the given BO as well as its shadow.

void amdgpu_vm_update_huge(struct amdgpu_pte_update_params * params, struct amdgpu_bo * bo, unsigned level, uint64_t pe, uint64_t addr, unsigned count, uint32_t incr, uint64_t flags)

figure out parameters for PTE updates

Parameters

struct amdgpu_pte_update_params * params
see amdgpu_pte_update_params definition
struct amdgpu_bo * bo
PD/PT to update
unsigned level
VMPT level of the PD/PT
uint64_t pe
addr of the page entry
uint64_t addr
dst addr to write into pe
unsigned count
number of page entries to update
uint32_t incr
increase next addr by incr bytes
uint64_t flags
hw access flags

Description

Make sure to set the right flags for the PTEs at the desired level.

void amdgpu_vm_fragment(struct amdgpu_pte_update_params * params, uint64_t start, uint64_t end, uint64_t flags, unsigned int * frag, uint64_t * frag_end)

get fragment for PTEs

Parameters

struct amdgpu_pte_update_params * params
see amdgpu_pte_update_params definition
uint64_t start
first PTE to handle
uint64_t end
last PTE to handle
uint64_t flags
hw mapping flags
unsigned int * frag
resulting fragment size
uint64_t * frag_end
end of this fragment

Description

Returns the first possible fragment for the start and end address.
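
The fragment selection can be illustrated as finding the largest power-of-two run of pages that both starts aligned to its own size and fits in the remaining range. This is a simplified sketch of that policy, not the driver's exact implementation.

```c
#include <stdint.h>

/* Largest fragment, as log2 of the number of pages, usable for PTEs
 * starting at @start (in pages) up to @end: the start must be aligned
 * to the fragment size and the fragment must fit in the range. */
static unsigned pte_fragment(uint64_t start, uint64_t end,
			     unsigned max_frag)
{
	unsigned frag = 0;

	while (frag < max_frag &&
	       !(start & ((1ull << (frag + 1)) - 1)) &&
	       start + (1ull << (frag + 1)) <= end)
		frag++;
	return frag;
}
```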

int amdgpu_vm_update_ptes(struct amdgpu_pte_update_params * params, uint64_t start, uint64_t end, uint64_t dst, uint64_t flags)

make sure that page tables are valid

Parameters

struct amdgpu_pte_update_params * params
see amdgpu_pte_update_params definition
uint64_t start
start of GPU address range
uint64_t end
end of GPU address range
uint64_t dst
destination address to map to, advanced to the next dst inside the function
uint64_t flags
mapping flags

Description

Update the page tables in the range start - end.

Return

0 for success, -EINVAL for failure.

int amdgpu_vm_bo_update_mapping(struct amdgpu_device * adev, struct dma_fence * exclusive, dma_addr_t * pages_addr, struct amdgpu_vm * vm, uint64_t start, uint64_t last, uint64_t flags, uint64_t addr, struct dma_fence ** fence)

update a mapping in the vm page table

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct dma_fence * exclusive
fence we need to sync to
dma_addr_t * pages_addr
DMA addresses to use for mapping
struct amdgpu_vm * vm
requested vm
uint64_t start
start of mapped range
uint64_t last
last mapped entry
uint64_t flags
flags for the entries
uint64_t addr
addr to set the area to
struct dma_fence ** fence
optional resulting fence

Description

Fill in the page table entries between start and last.

Return

0 for success, -EINVAL for failure.

int amdgpu_vm_bo_split_mapping(struct amdgpu_device * adev, struct dma_fence * exclusive, dma_addr_t * pages_addr, struct amdgpu_vm * vm, struct amdgpu_bo_va_mapping * mapping, uint64_t flags, struct drm_mm_node * nodes, struct dma_fence ** fence)

split a mapping into smaller chunks

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct dma_fence * exclusive
fence we need to sync to
dma_addr_t * pages_addr
DMA addresses to use for mapping
struct amdgpu_vm * vm
requested vm
struct amdgpu_bo_va_mapping * mapping
mapped range and flags to use for the update
uint64_t flags
HW flags for the mapping
struct drm_mm_node * nodes
array of drm_mm_nodes with the MC addresses
struct dma_fence ** fence
optional resulting fence

Description

Split the mapping into smaller chunks so that each update fits into a SDMA IB.

Return

0 for success, -EINVAL for failure.

int amdgpu_vm_bo_update(struct amdgpu_device * adev, struct amdgpu_bo_va * bo_va, bool clear)

update all BO mappings in the vm page table

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_bo_va * bo_va
requested BO and VM object
bool clear
if true clear the entries

Description

Fill in the page table entries for bo_va.

Return

0 for success, -EINVAL for failure.

void amdgpu_vm_update_prt_state(struct amdgpu_device * adev)

update the global PRT state

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
void amdgpu_vm_prt_get(struct amdgpu_device * adev)

add a PRT user

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
void amdgpu_vm_prt_put(struct amdgpu_device * adev)

drop a PRT user

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
void amdgpu_vm_prt_cb(struct dma_fence * fence, struct dma_fence_cb * _cb)

callback for updating the PRT status

Parameters

struct dma_fence * fence
fence for the callback
struct dma_fence_cb * _cb
the callback function
void amdgpu_vm_add_prt_cb(struct amdgpu_device * adev, struct dma_fence * fence)

add callback for updating the PRT status

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct dma_fence * fence
fence for the callback
void amdgpu_vm_free_mapping(struct amdgpu_device * adev, struct amdgpu_vm * vm, struct amdgpu_bo_va_mapping * mapping, struct dma_fence * fence)

free a mapping

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requested vm
struct amdgpu_bo_va_mapping * mapping
mapping to be freed
struct dma_fence * fence
fence of the unmap operation

Description

Free a mapping and make sure we decrease the PRT usage count if applicable.

void amdgpu_vm_prt_fini(struct amdgpu_device * adev, struct amdgpu_vm * vm)

finish all prt mappings

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requested vm

Description

Register a cleanup callback to disable PRT support after VM dies.

int amdgpu_vm_clear_freed(struct amdgpu_device * adev, struct amdgpu_vm * vm, struct dma_fence ** fence)

clear freed BOs in the PT

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requested vm
struct dma_fence ** fence
optional resulting fence (unchanged if no work needed to be done or if an error occurred)

Description

Make sure all freed BOs are cleared in the PT. PTs have to be reserved and mutex must be locked!

Return

0 for success.

int amdgpu_vm_handle_moved(struct amdgpu_device * adev, struct amdgpu_vm * vm)

handle moved BOs in the PT

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requested vm

Description

Make sure all BOs which are moved are updated in the PTs.

Return

0 for success.

PTs have to be reserved!

struct amdgpu_bo_va * amdgpu_vm_bo_add(struct amdgpu_device * adev, struct amdgpu_vm * vm, struct amdgpu_bo * bo)

add a bo to a specific vm

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requested vm
struct amdgpu_bo * bo
amdgpu buffer object

Description

Add bo into the requested vm. Add bo to the list of bos associated with the vm

Return

Newly added bo_va or NULL for failure

Object has to be reserved!

void amdgpu_vm_bo_insert_map(struct amdgpu_device * adev, struct amdgpu_bo_va * bo_va, struct amdgpu_bo_va_mapping * mapping)

insert a new mapping

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_bo_va * bo_va
bo_va to store the address
struct amdgpu_bo_va_mapping * mapping
the mapping to insert

Description

Insert a new mapping into all structures.

int amdgpu_vm_bo_map(struct amdgpu_device * adev, struct amdgpu_bo_va * bo_va, uint64_t saddr, uint64_t offset, uint64_t size, uint64_t flags)

map bo inside a vm

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_bo_va * bo_va
bo_va to store the address
uint64_t saddr
where to map the BO
uint64_t offset
requested offset in the BO
uint64_t size
BO size in bytes
uint64_t flags
attributes of pages (read/write/valid/etc.)

Description

Add a mapping of the BO at the specified addr into the VM.

Return

0 for success, error for failure.

Object has to be reserved and unreserved outside!

int amdgpu_vm_bo_replace_map(struct amdgpu_device * adev, struct amdgpu_bo_va * bo_va, uint64_t saddr, uint64_t offset, uint64_t size, uint64_t flags)

map bo inside a vm, replacing existing mappings

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_bo_va * bo_va
bo_va to store the address
uint64_t saddr
where to map the BO
uint64_t offset
requested offset in the BO
uint64_t size
BO size in bytes
uint64_t flags
attributes of pages (read/write/valid/etc.)

Description

Add a mapping of the BO at the specified addr into the VM. Replace existing mappings as we do so.

Return

0 for success, error for failure.

Object has to be reserved and unreserved outside!

int amdgpu_vm_bo_unmap(struct amdgpu_device * adev, struct amdgpu_bo_va * bo_va, uint64_t saddr)

remove bo mapping from vm

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_bo_va * bo_va
bo_va to remove the address from
uint64_t saddr
where the BO is mapped

Description

Remove a mapping of the BO at the specified addr from the VM.

Return

0 for success, error for failure.

Object has to be reserved and unreserved outside!

int amdgpu_vm_bo_clear_mappings(struct amdgpu_device * adev, struct amdgpu_vm * vm, uint64_t saddr, uint64_t size)

remove all mappings in a specific range

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
VM structure to use
uint64_t saddr
start of the range
uint64_t size
size of the range

Description

Remove all mappings in a range, split them as appropriate.

Return

0 for success, error for failure.

struct amdgpu_bo_va_mapping * amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm * vm, uint64_t addr)

find mapping by address

Parameters

struct amdgpu_vm * vm
the requested VM
uint64_t addr
the address

Description

Find a mapping by its address.

Return

The amdgpu_bo_va_mapping matching for addr or NULL

void amdgpu_vm_bo_trace_cs(struct amdgpu_vm * vm, struct ww_acquire_ctx * ticket)

trace all reserved mappings

Parameters

struct amdgpu_vm * vm
the requested vm
struct ww_acquire_ctx * ticket
CS ticket

Description

Trace all mappings of BOs reserved during a command submission.

void amdgpu_vm_bo_rmv(struct amdgpu_device * adev, struct amdgpu_bo_va * bo_va)

remove a bo from a specific vm

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_bo_va * bo_va
requested bo_va

Description

Remove bo_va->bo from the requested vm.

Object has to be reserved!

void amdgpu_vm_bo_invalidate(struct amdgpu_device * adev, struct amdgpu_bo * bo, bool evicted)

mark the bo as invalid

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_bo * bo
amdgpu buffer object
bool evicted
is the BO evicted

Description

Mark bo as invalid.

uint32_t amdgpu_vm_get_block_size(uint64_t vm_size)

calculate VM page table size as power of two

Parameters

uint64_t vm_size
VM size

Return

VM page table as power of two

void amdgpu_vm_adjust_size(struct amdgpu_device * adev, uint32_t min_vm_size, uint32_t fragment_size_default, unsigned max_level, unsigned max_bits)

adjust vm size, block size and fragment size

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
uint32_t min_vm_size
the minimum vm size in GB if it is set to auto
uint32_t fragment_size_default
Default PTE fragment size
unsigned max_level
max VMPT level
unsigned max_bits
max address space size in bits
int amdgpu_vm_init(struct amdgpu_device * adev, struct amdgpu_vm * vm, int vm_context, unsigned int pasid)

initialize a vm instance

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requested vm
int vm_context
Indicates whether it is a GFX or Compute context
unsigned int pasid
Process address space identifier

Description

Init vm fields.

Return

0 for success, error for failure.

int amdgpu_vm_make_compute(struct amdgpu_device * adev, struct amdgpu_vm * vm, unsigned int pasid)

Turn a GFX VM into a compute VM

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requested vm
unsigned int pasid
Process address space identifier

Description

This only works on GFX VMs that don’t have any BOs added and no page tables allocated yet.

Changes the following VM parameters:
- use_cpu_for_update
- pte_supports_ats
- pasid (old PASID is released, because compute manages its own PASIDs)

Reinitializes the page directory to reflect the changed ATS setting.

Return

0 for success, -errno for errors.

void amdgpu_vm_release_compute(struct amdgpu_device * adev, struct amdgpu_vm * vm)

release a compute vm

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
a vm turned into compute vm by calling amdgpu_vm_make_compute

Description

This is the counterpart of amdgpu_vm_make_compute. It decouples the compute PASID from the VM. Compute should stop using the VM after this call.

void amdgpu_vm_fini(struct amdgpu_device * adev, struct amdgpu_vm * vm)

tear down a vm instance

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
struct amdgpu_vm * vm
requested vm

Description

Tear down vm. Unbind the VM and remove all bos from the vm bo list

bool amdgpu_vm_pasid_fault_credit(struct amdgpu_device * adev, unsigned int pasid)

Check fault credit for given PASID

Parameters

struct amdgpu_device * adev
amdgpu_device pointer
unsigned int pasid
PASID to identify the VM

Description

This function is expected to be called in interrupt context.

Return

True if there was fault credit, false otherwise

void amdgpu_vm_manager_init(struct amdgpu_device * adev)

init the VM manager

Parameters

struct amdgpu_device * adev
amdgpu_device pointer

Description

Initialize the VM manager structures

void amdgpu_vm_manager_fini(struct amdgpu_device * adev)

cleanup VM manager

Parameters

struct amdgpu_device * adev
amdgpu_device pointer

Description

Cleanup the VM manager and free resources.

int amdgpu_vm_ioctl(struct drm_device * dev, void * data, struct drm_file * filp)

Manages VMID reservation for vm hubs.

Parameters

struct drm_device * dev
drm device pointer
void * data
drm_amdgpu_vm
struct drm_file * filp
drm file pointer

Return

0 for success, -errno for errors.

void amdgpu_vm_get_task_info(struct amdgpu_device * adev, unsigned int pasid, struct amdgpu_task_info * task_info)

Extracts task info for a PASID.

Parameters

struct amdgpu_device * adev
drm device pointer
unsigned int pasid
PASID identifier for VM
struct amdgpu_task_info * task_info
task_info to fill.

void amdgpu_vm_set_task_info(struct amdgpu_vm * vm)

Sets VMs task info.

Parameters

struct amdgpu_vm * vm
vm for which to set the info

int amdgpu_vm_add_fault(struct amdgpu_retryfault_hashtable * fault_hash, u64 key)

Add a page fault record to fault hash table

Parameters

struct amdgpu_retryfault_hashtable * fault_hash
fault hash table
u64 key
64-bit encoding of PASID and address

Description

This should be called when a retry page fault interrupt is received. If this is a new page fault, it will be added to a hash table. The return value indicates whether this is a new fault, or a fault that was already known and is already being handled.

If there are too many pending page faults, this will fail. Retry interrupts should be ignored in this case until there is enough free space.

Returns 0 if the fault was added, 1 if the fault was already known, -ENOSPC if there are too many pending faults.

void amdgpu_vm_clear_fault(struct amdgpu_retryfault_hashtable * fault_hash, u64 key)

Remove a page fault record

Parameters

struct amdgpu_retryfault_hashtable * fault_hash
fault hash table
u64 key
64-bit encoding of PASID and address

Description

This should be called when a page fault has been handled. Any future interrupt with this key will be processed as a new page fault.

Interrupt Handling

Interrupts generated within GPU hardware raise interrupt requests that are passed to the amdgpu IRQ handler, which is responsible for detecting the source and type of the interrupt and dispatching matching handlers. If handling an interrupt requires calling kernel functions that may sleep, processing is dispatched to work handlers.

If MSI functionality is not disabled by module parameter then MSI support will be enabled.

For GPU interrupt sources that may be driven by another driver, IRQ domain support is used (with mapping between virtual and hardware IRQs).

void amdgpu_hotplug_work_func(struct work_struct * work)

work handler for display hotplug event

Parameters

struct work_struct * work
work struct pointer

Description

This is the hotplug event work handler (all ASICs). The work gets scheduled from the IRQ handler if there was a hotplug interrupt. It walks through the connector table and calls hotplug handler for each connector. After this, it sends a DRM hotplug event to alert userspace.

This design approach is required to defer hotplug event handling from the IRQ handler to a work handler, because the hotplug handler has to take mutexes, which cannot be locked in an IRQ handler (since mutex_lock may sleep).

void amdgpu_irq_reset_work_func(struct work_struct * work)

execute GPU reset

Parameters

struct work_struct * work
work struct pointer

Description

Execute scheduled GPU reset (Cayman+). This function is called when the IRQ handler thinks we need a GPU reset.

void amdgpu_irq_disable_all(struct amdgpu_device * adev)

disable all interrupts

Parameters

struct amdgpu_device * adev
amdgpu device pointer

Description

Disable all types of interrupts from all sources.

void amdgpu_irq_callback(struct amdgpu_device * adev, struct amdgpu_ih_ring * ih)

callback from the IH ring

Parameters

struct amdgpu_device * adev
amdgpu device pointer
struct amdgpu_ih_ring * ih
amdgpu ih ring

Description

Callback from IH ring processing to handle the entry at the current position and advance the read pointer.

irqreturn_t amdgpu_irq_handler(int irq, void * arg)

IRQ handler

Parameters

int irq
IRQ number (unused)
void * arg
pointer to DRM device

Description

IRQ handler for amdgpu driver (all ASICs).

Return

result of handling the IRQ, as defined by irqreturn_t

bool amdgpu_msi_ok(struct amdgpu_device * adev)

check whether MSI functionality is enabled

Parameters

struct amdgpu_device * adev
amdgpu device pointer (unused)

Description

Checks whether MSI functionality has been disabled via module parameter (all ASICs).

Return

true if MSIs are allowed to be enabled or false otherwise

int amdgpu_irq_init(struct amdgpu_device * adev)

initialize interrupt handling

Parameters

struct amdgpu_device * adev
amdgpu device pointer

Description

Sets up work functions for hotplug and reset interrupts, enables MSI functionality, initializes vblank, hotplug and reset interrupt handling.

Return

0 on success or error code on failure

void amdgpu_irq_fini(struct amdgpu_device * adev)

shut down interrupt handling

Parameters

struct amdgpu_device * adev
amdgpu device pointer

Description

Tears down work functions for hotplug and reset interrupts, disables MSI functionality, shuts down vblank, hotplug and reset interrupt handling, turns off interrupts from all sources (all ASICs).

int amdgpu_irq_add_id(struct amdgpu_device * adev, unsigned client_id, unsigned src_id, struct amdgpu_irq_src * source)

register IRQ source

Parameters

struct amdgpu_device * adev
amdgpu device pointer
unsigned client_id
client id
unsigned src_id
source id
struct amdgpu_irq_src * source
IRQ source pointer

Description

Registers IRQ source on a client.

Return

0 on success or error code otherwise

void amdgpu_irq_dispatch(struct amdgpu_device * adev, struct amdgpu_iv_entry * entry)

dispatch IRQ to IP blocks

Parameters

struct amdgpu_device * adev
amdgpu device pointer
struct amdgpu_iv_entry * entry
interrupt vector pointer

Description

Dispatches IRQ to IP blocks.

int amdgpu_irq_update(struct amdgpu_device * adev, struct amdgpu_irq_src * src, unsigned type)

update hardware interrupt state

Parameters

struct amdgpu_device * adev
amdgpu device pointer
struct amdgpu_irq_src * src
interrupt source pointer
unsigned type
type of interrupt

Description

Updates interrupt state for the specific source (all ASICs).

void amdgpu_irq_gpu_reset_resume_helper(struct amdgpu_device * adev)

update interrupt states on all sources

Parameters

struct amdgpu_device * adev
amdgpu device pointer

Description

Updates state of all types of interrupts on all sources on resume after reset.

int amdgpu_irq_get(struct amdgpu_device * adev, struct amdgpu_irq_src * src, unsigned type)

enable interrupt

Parameters

struct amdgpu_device * adev
amdgpu device pointer
struct amdgpu_irq_src * src
interrupt source pointer
unsigned type
type of interrupt

Description

Enables specified type of interrupt on the specified source (all ASICs).

Return

0 on success or error code otherwise

int amdgpu_irq_put(struct amdgpu_device * adev, struct amdgpu_irq_src * src, unsigned type)

disable interrupt

Parameters

struct amdgpu_device * adev
amdgpu device pointer
struct amdgpu_irq_src * src
interrupt source pointer
unsigned type
type of interrupt

Description

Disables the specified type of interrupt on the specified source (all ASICs).

Return

0 on success or error code otherwise

bool amdgpu_irq_enabled(struct amdgpu_device * adev, struct amdgpu_irq_src * src, unsigned type)

check whether interrupt is enabled or not

Parameters

struct amdgpu_device * adev
amdgpu device pointer
struct amdgpu_irq_src * src
interrupt source pointer
unsigned type
type of interrupt

Description

Checks whether the given type of interrupt is enabled on the given source.

Return

true if interrupt is enabled, false if interrupt is disabled or on invalid parameters

int amdgpu_irqdomain_map(struct irq_domain * d, unsigned int irq, irq_hw_number_t hwirq)

create mapping between virtual and hardware IRQ numbers

Parameters

struct irq_domain * d
amdgpu IRQ domain pointer (unused)
unsigned int irq
virtual IRQ number
irq_hw_number_t hwirq
hardware IRQ number

Description

Current implementation assigns simple interrupt handler to the given virtual IRQ.

Return

0 on success or error code otherwise

int amdgpu_irq_add_domain(struct amdgpu_device * adev)

create a linear IRQ domain

Parameters

struct amdgpu_device * adev
amdgpu device pointer

Description

Creates an IRQ domain for GPU interrupt sources that may be driven by another driver (e.g., ACP).

Return

0 on success or error code otherwise

void amdgpu_irq_remove_domain(struct amdgpu_device * adev)

remove the IRQ domain

Parameters

struct amdgpu_device * adev
amdgpu device pointer

Description

Removes the IRQ domain for GPU interrupt sources that may be driven by another driver (e.g., ACP).

unsigned amdgpu_irq_create_mapping(struct amdgpu_device * adev, unsigned src_id)

create a mapping between a domain IRQ and a Linux IRQ

Parameters

struct amdgpu_device * adev
amdgpu device pointer
unsigned src_id
IH source id

Description

Creates a mapping between a domain IRQ (GPU IH src id) and a Linux IRQ. Use this for components that generate a GPU interrupt, but are driven by a different driver (e.g., ACP).

Return

Linux IRQ

GPU Power/Thermal Controls and Monitoring

This section covers hwmon and power/thermal controls.

HWMON Interfaces

The amdgpu driver exposes the following sensor interfaces:

  • GPU temperature (via the on-die sensor)
  • GPU voltage
  • Northbridge voltage (APUs only)
  • GPU power
  • GPU fan

hwmon interfaces for GPU temperature:

  • temp1_input: the on die GPU temperature in millidegrees Celsius
  • temp1_crit: temperature critical max value in millidegrees Celsius
  • temp1_crit_hyst: temperature hysteresis for critical limit in millidegrees Celsius

hwmon interfaces for GPU voltage:

  • in0_input: the voltage on the GPU in millivolts
  • in1_input: the voltage on the Northbridge in millivolts

hwmon interfaces for GPU power:

  • power1_average: average power used by the GPU in microWatts
  • power1_cap_min: minimum cap supported in microWatts
  • power1_cap_max: maximum cap supported in microWatts
  • power1_cap: selected power cap in microWatts

hwmon interfaces for GPU fan:

  • pwm1: pulse width modulation fan level (0-255)
  • pwm1_enable: pulse width modulation fan control method (0: no fan speed control, 1: manual fan speed control using pwm interface, 2: automatic fan speed control)
  • pwm1_min: pulse width modulation fan control minimum level (0)
  • pwm1_max: pulse width modulation fan control maximum level (255)
  • fan1_min: minimum fan speed in revolutions/min (RPM)
  • fan1_max: maximum fan speed in revolutions/min (RPM)
  • fan1_input: fan speed in RPM
  • fan[1-*]_target: desired fan speed in revolutions/min (RPM)
  • fan[1-*]_enable: enable or disable the sensor (1: enable, 0: disable)

You can use hwmon tools like sensors to view this information on your system.
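As a quick sketch of how the raw values map to conventional units (the hwmon index used below is an assumption; it varies per system, and tools like sensors do this conversion for you):

```shell
# Convert raw hwmon readings to conventional units.
# The hwmon node path is hypothetical; the index varies per system.
HWMON=/sys/class/hwmon/hwmon0

millideg_to_c()  { awk -v v="$1" 'BEGIN { printf "%.1f", v / 1000 }'; }
microwatt_to_w() { awk -v v="$1" 'BEGIN { printf "%.1f", v / 1000000 }'; }

# Typical usage (uncomment on a system with an amdgpu hwmon node):
# echo "GPU temp:  $(millideg_to_c "$(cat "$HWMON/temp1_input")") C"
# echo "GPU power: $(microwatt_to_w "$(cat "$HWMON/power1_average")") W"
```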

GPU sysfs Power State Interfaces

GPU power controls are exposed via sysfs files.

power_dpm_state

The power_dpm_state file is a legacy interface and is only provided for backwards compatibility. The amdgpu driver provides a sysfs API for adjusting certain power related parameters. The file power_dpm_state is used for this. It accepts the following arguments:

  • battery
  • balanced
  • performance

battery

On older GPUs, the vbios provided a special power state for battery operation. Selecting battery switched to this state. This is no longer provided on newer GPUs so the option does nothing in that case.

balanced

On older GPUs, the vbios provided a special power state for balanced operation. Selecting balanced switched to this state. This is no longer provided on newer GPUs so the option does nothing in that case.

performance

On older GPUs, the vbios provided a special power state for performance operation. Selecting performance switched to this state. This is no longer provided on newer GPUs so the option does nothing in that case.

power_dpm_force_performance_level

The amdgpu driver provides a sysfs API for adjusting certain power related parameters. The file power_dpm_force_performance_level is used for this. It accepts the following arguments:

  • auto
  • low
  • high
  • manual
  • profile_standard
  • profile_min_sclk
  • profile_min_mclk
  • profile_peak

auto

When auto is selected, the driver will attempt to dynamically select the optimal power profile for current conditions in the driver.

low

When low is selected, the clocks are forced to the lowest power state.

high

When high is selected, the clocks are forced to the highest power state.

manual

When manual is selected, the user can manually adjust which power states are enabled for each clock domain via the sysfs pp_dpm_mclk, pp_dpm_sclk, and pp_dpm_pcie files and adjust the power state transition heuristics via the pp_power_profile_mode sysfs file.

profile_standard profile_min_sclk profile_min_mclk profile_peak

When one of the profiling modes is selected, clock and power gating are disabled and the clocks are set to fixed levels for different profiling cases. These modes are recommended for profiling specific workloads where you do not want clock gating, power gating, or clock fluctuations to interfere with your results. profile_standard sets the clocks to a fixed level which varies from ASIC to ASIC. profile_min_sclk forces the sclk to the lowest level. profile_min_mclk forces the mclk to the lowest level. profile_peak sets all clocks (mclk, sclk, pcie) to their highest levels.
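A hedged sketch of driving this file from a script (the card0 path is an assumption; the card index varies per system, and writes require root):

```shell
# Hypothetical sysfs path; the card index varies per system.
LEVEL=/sys/class/drm/card0/device/power_dpm_force_performance_level

# Check an argument against the accepted values before writing it.
valid_level() {
  case "$1" in
    auto|low|high|manual|profile_standard|profile_min_sclk|profile_min_mclk|profile_peak)
      return 0 ;;
    *) return 1 ;;
  esac
}

set_level() {
  valid_level "$1" || { echo "unknown level: $1" >&2; return 1; }
  # echo "$1" > "$LEVEL"   # uncomment on a real system (requires root)
}
```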

pp_table

The amdgpu driver provides a sysfs API for uploading new powerplay tables. The file pp_table is used for this. Reading the file will dump the current power play table. Writing to the file will attempt to upload a new powerplay table and re-initialize powerplay using that new table.

pp_od_clk_voltage

The amdgpu driver provides a sysfs API for adjusting the clocks and voltages in each power level within a power state. The pp_od_clk_voltage is used for this.

< For Vega10 and previous ASICs >

Reading the file will display:

  • a list of engine clock levels and voltages labeled OD_SCLK
  • a list of memory clock levels and voltages labeled OD_MCLK
  • a list of valid ranges for sclk, mclk, and voltage labeled OD_RANGE

To manually adjust these settings, first select manual using power_dpm_force_performance_level. Enter a new value for each level by writing a string that contains “s/m level clock voltage” to the file. E.g., “s 1 500 820” will update sclk level 1 to be 500 MHz at 820 mV; “m 0 350 810” will update mclk level 0 to be 350 MHz at 810 mV. When you have edited all of the states as needed, write “c” (commit) to the file to commit your changes. If you want to reset to the default power levels, write “r” (reset) to the file to reset them.
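The write sequence above can be sketched as follows (the card0 path is an assumption, and the clock/voltage values are illustrative examples from the text, not recommendations):

```shell
# Hypothetical sysfs path; the card index varies per system.
PP=/sys/class/drm/card0/device/pp_od_clk_voltage

# Build an "s/m level clock voltage" command string.
od_cmd() { printf '%s %s %s %s' "$1" "$2" "$3" "$4"; }

# Example sequence (uncomment on a real system; requires root and
# power_dpm_force_performance_level set to manual):
# echo "$(od_cmd s 1 500 820)" > "$PP"   # sclk level 1 -> 500 MHz @ 820 mV
# echo "$(od_cmd m 0 350 810)" > "$PP"   # mclk level 0 -> 350 MHz @ 810 mV
# echo c > "$PP"                         # commit the edited states
# echo r > "$PP"                         # or reset to the defaults
```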

< For Vega20 >

Reading the file will display:

  • minimum and maximum engine clock labeled OD_SCLK
  • maximum memory clock labeled OD_MCLK
  • three <frequency, voltage> points labeled OD_VDDC_CURVE. They can be used to calibrate the sclk voltage curve.
  • a list of valid ranges for sclk, mclk, and voltage curve points labeled OD_RANGE

To manually adjust these settings:

  • First select manual using power_dpm_force_performance_level

  • For clock frequency settings, enter a new value by writing a string that contains “s/m index clock” to the file. Use index 0 to set the minimum clock and index 1 to set the maximum clock. E.g., “s 0 500” will update the minimum sclk to 500 MHz, and “m 1 800” will update the maximum mclk to 800 MHz.

    For the sclk voltage curve, enter the new values by writing a string that contains “vc point clock voltage” to the file. The points are indexed 0, 1 and 2. E.g., “vc 0 300 600” will update the first point to 300 MHz at 600 mV, and “vc 2 1000 1000” will update the third point to 1000 MHz at 1000 mV.

  • When you have edited all of the states as needed, write “c” (commit) to the file to commit your changes

  • If you want to reset to the default power levels, write “r” (reset) to the file to reset them
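The Vega20 variant of the sequence can be sketched the same way (path and values are assumptions for illustration):

```shell
# Hypothetical sysfs path; the card index varies per system.
PP=/sys/class/drm/card0/device/pp_od_clk_voltage

# Build "s/m index clock" and "vc point clock voltage" command strings.
clk_cmd() { printf '%s %s %s' "$1" "$2" "$3"; }
vc_cmd()  { printf 'vc %s %s %s' "$1" "$2" "$3"; }

# Example sequence (uncomment on a real system; requires root and the
# manual performance level):
# echo "$(clk_cmd s 0 500)" > "$PP"     # minimum sclk -> 500 MHz
# echo "$(clk_cmd m 1 800)" > "$PP"     # maximum mclk -> 800 MHz
# echo "$(vc_cmd 0 300 600)" > "$PP"    # curve point 0 -> 300 MHz @ 600 mV
# echo c > "$PP"                        # commit
```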

pp_dpm_sclk pp_dpm_mclk pp_dpm_pcie

The amdgpu driver provides a sysfs API for adjusting what power levels are enabled for a given power state. The files pp_dpm_sclk, pp_dpm_mclk, and pp_dpm_pcie are used for this.

Reading back the files will show you the available power levels within the power state and the clock information for those levels.

To manually adjust these states, first select manual using power_dpm_force_performance_level. Then enter a new set of enabled levels by writing a space-separated list of level indices to the file. E.g., echo "4 5 6" > pp_dpm_sclk will enable sclk levels 4, 5, and 6.
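A minimal sketch of enabling a set of levels (the card0 path is an assumption):

```shell
# Hypothetical sysfs path; the card index varies per system.
SCLK=/sys/class/drm/card0/device/pp_dpm_sclk

# Join level indices into the space-separated string the file expects.
levels_str() { printf '%s' "$*"; }

# cat "$SCLK"                            # list available levels
# echo "$(levels_str 4 5 6)" > "$SCLK"   # enable sclk levels 4, 5 and 6
```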

pp_power_profile_mode

The amdgpu driver provides a sysfs API for adjusting the heuristics related to switching between power levels in a power state. The file pp_power_profile_mode is used for this.

Reading this file outputs a list of all of the predefined power profiles and the relevant heuristics settings for that profile.

To select a profile or create a custom profile, first select manual using power_dpm_force_performance_level. Writing the number of a predefined profile to pp_power_profile_mode will enable those heuristics. To create a custom set of heuristics, write a string of numbers to the file starting with the number of the custom profile along with a setting for each heuristic parameter. Due to differences across asic families the heuristic parameters vary from family to family.
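A sketch of selecting a predefined profile (the card0 path is an assumption; valid profile numbers come from reading the file first):

```shell
# Hypothetical sysfs path; the card index varies per system.
PROFILE=/sys/class/drm/card0/device/pp_power_profile_mode

# Accept only a non-negative integer as a profile index.
is_profile_index() { case "$1" in ''|*[!0-9]*) return 1 ;; *) return 0 ;; esac; }

# cat "$PROFILE"                              # list profiles and heuristics
# is_profile_index 1 && echo 1 > "$PROFILE"   # select profile 1 (manual level required)
```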

busy_percent

The amdgpu driver provides a sysfs API for reading how busy the GPU is as a percentage. The file gpu_busy_percent is used for this. The SMU firmware computes a percentage of load based on the aggregate activity level in the IP cores.