Core Driver Infrastructure
GPU Hardware Structure
Each ASIC is a collection of hardware blocks. We refer to them as “IPs” (Intellectual Property blocks). Each IP encapsulates certain functionality. IPs are versioned and can also be mixed and matched. E.g., you might have two different ASICs that both have System DMA (SDMA) 5.x IPs. The driver is arranged by IPs. There are driver components to handle the initialization and operation of each IP. There are also a number of smaller IPs that need little if any driver interaction. Those end up lumped into the common code in the soc files. The soc files (e.g., vi.c, soc15.c, nv.c) contain code for aspects of the SoC itself rather than specific IPs. E.g., things like GPU resets and register access functions are SoC dependent.
An APU contains more than just a CPU and GPU; it also contains all of the platform components (audio, USB, GPIO, etc.). Also, a lot of components are shared between the CPU, platform, and the GPU (e.g., SMU, PSP, etc.). Specific components (CPU, GPU, etc.) usually have their own interface to interact with those common components. For things like S0i3, a lot of coordination is required across all the components, but that is beyond the scope of this section.
With respect to the GPU, we have the following major IPs:
- GMC (Graphics Memory Controller)
This was a dedicated IP on older pre-Vega chips, but has since become somewhat decentralized on Vega and newer chips. They now have dedicated memory hubs for specific IPs or groups of IPs. We still treat it as a single component in the driver, however, since the programming model is still pretty similar. This is how the different IPs on the GPU access memory (VRAM or system memory). It also provides the support for per-process GPU virtual address spaces.
- IH (Interrupt Handler)
This is the interrupt controller on the GPU. All of the IPs feed their interrupts into this IP and it aggregates them into a set of ring buffers that the driver can parse to handle interrupts from different IPs.
- PSP (Platform Security Processor)
This handles security policy for the SoC and executes trusted applications, and validates and loads firmwares for other blocks.
- SMU (System Management Unit)
This is the power management microcontroller. It manages the entire SoC. The driver interacts with it to control power management features like clocks, voltages, power rails, etc.
- DCN (Display Controller Next)
This is the display controller. It handles the display hardware. It is described in more detail in Display Core.
- SDMA (System DMA)
This is a multi-purpose DMA engine. The kernel driver uses it for various things including paging and GPU page table updates. It’s also exposed to userspace for use by user mode drivers (OpenGL, Vulkan, etc.)
- GC (Graphics and Compute)
This is the graphics and compute engine, i.e., the block that encompasses the 3D pipeline and shader blocks. This is by far the largest block on the GPU. The 3D pipeline has many sub-blocks. In addition to that, it also contains the CP microcontrollers (ME, PFP, CE, MEC) and the RLC microcontroller. It’s exposed to userspace for user mode drivers (OpenGL, Vulkan, OpenCL, etc.)
- VCN (Video Core Next)
This is the multi-media engine. It handles video and image encode and decode. It’s exposed to userspace for user mode drivers (VA-API, OpenMAX, etc.)
Graphics and Compute Microcontrollers
- CP (Command Processor)
The name for the hardware block that encompasses the front end of the GFX/Compute pipeline. Consists mainly of a bunch of microcontrollers (PFP, ME, CE, MEC). The firmware that runs on these microcontrollers provides the driver interface to interact with the GFX/Compute engine.
- MEC (MicroEngine Compute)
This is the microcontroller that controls the compute queues on the GFX/compute engine.
- MES (MicroEngine Scheduler)
This is a new engine for managing queues. This is currently unused.
- RLC (RunList Controller)
This is another microcontroller in the GFX/Compute engine. It handles power management related functionality within the GFX/Compute engine. The name is a vestige of old hardware where it was originally added and doesn’t really have much relation to what the engine does now.
Driver Structure
In general, the driver has a list of all of the IPs on a particular SoC and for things like init/fini/suspend/resume, more or less just walks the list and handles each IP.
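The list walk described above can be sketched in plain user-space C. This is only a model under stated assumptions, not the driver's code: struct ip_block, soc_init(), soc_fini() and the trace buffer are hypothetical names, and tearing the blocks down in reverse order is a design choice of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char trace[64];  /* records callback order, for illustration only */

/* Hypothetical, simplified stand-in for a per-IP driver component. */
struct ip_block {
    const char *tag;
    int (*init)(struct ip_block *ip);   /* 0 on success, negative on error */
    void (*fini)(struct ip_block *ip);
};

int ip_init_ok(struct ip_block *ip)
{
    strcat(trace, "+");
    strcat(trace, ip->tag);
    return 0;
}

void ip_fini(struct ip_block *ip)
{
    strcat(trace, "-");
    strcat(trace, ip->tag);
}

/* Initialize every IP in list order; on failure, undo the ones already up. */
int soc_init(struct ip_block *blocks, size_t count)
{
    assert(blocks != NULL);
    for (size_t i = 0; i < count; i++) {
        int r = blocks[i].init(&blocks[i]);
        if (r) {
            while (i-- > 0)
                blocks[i].fini(&blocks[i]);
            return r;
        }
    }
    return 0;
}

/* Walk the same list backwards for teardown (assumption of this sketch). */
void soc_fini(struct ip_block *blocks, size_t count)
{
    assert(blocks != NULL);
    while (count-- > 0)
        blocks[count].fini(&blocks[count]);
}

/* Runs one init/fini cycle over three example IPs and returns the trace. */
const char *demo_ip_walk(void)
{
    struct ip_block blocks[] = {
        { "GMC", ip_init_ok, ip_fini },
        { "IH", ip_init_ok, ip_fini },
        { "SDMA", ip_init_ok, ip_fini },
    };
    size_t n = sizeof(blocks) / sizeof(blocks[0]);

    trace[0] = '\0';
    if (soc_init(blocks, n) == 0)
        soc_fini(blocks, n);
    return trace;
}
```

Suspend and resume would follow the same pattern with different callbacks.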
Some useful constructs:
- KIQ (Kernel Interface Queue)
This is a control queue used by the kernel driver to manage other gfx and compute queues on the GFX/compute engine. You can use it to map/unmap additional queues, etc.
- IB (Indirect Buffer)
A command buffer for a particular engine. Rather than writing commands directly to the queue, you can write the commands into a piece of memory and then put a pointer to the memory into the queue. The hardware will then follow the pointer, execute the commands in the memory, and then return to the rest of the commands in the ring.
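The IB mechanism can be illustrated with a toy interpreter. The packet format below (PKT_EMIT, PKT_IB, struct packet) is invented for this sketch and bears no relation to the real hardware packet encoding; it only shows the control flow of following a pointer out of the ring and coming back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pkt_type { PKT_EMIT, PKT_IB };   /* hypothetical packet kinds */

struct packet {
    enum pkt_type type;
    uint32_t value;              /* PKT_EMIT: digit appended to the result */
    const struct packet *ib;     /* PKT_IB: start of the indirect buffer */
    size_t ib_len;               /* PKT_IB: number of packets in it */
};

/* Execute packets in order. A PKT_IB entry makes the engine run the
 * referenced memory first and then continue with the next ring entry. */
uint32_t engine_run(const struct packet *ring, size_t len, uint32_t acc)
{
    assert(ring != NULL);
    for (size_t i = 0; i < len; i++) {
        if (ring[i].type == PKT_IB)
            acc = engine_run(ring[i].ib, ring[i].ib_len, acc);
        else
            acc = acc * 10u + ring[i].value;
    }
    return acc;
}

/* Ring holds: emit 1, pointer to an IB that emits 2 and 3, emit 4. */
uint32_t demo_ib(void)
{
    const struct packet ib[] = {
        { PKT_EMIT, 2, NULL, 0 },
        { PKT_EMIT, 3, NULL, 0 },
    };
    const struct packet ring[] = {
        { PKT_EMIT, 1, NULL, 0 },
        { PKT_IB, 0, ib, 2 },
        { PKT_EMIT, 4, NULL, 0 },
    };

    return engine_run(ring, 3, 0);
}
```

The digits come out in execution order, so the IB contents land between the ring commands that surround the pointer.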
Memory Domains
AMDGPU_GEM_DOMAIN_CPU: System memory that is not GPU accessible. Memory in this pool could be swapped out to disk if there is pressure.
AMDGPU_GEM_DOMAIN_GTT: GPU accessible system memory, mapped into the GPU’s virtual address space via GART. GART memory linearizes non-contiguous pages of system memory, allowing the GPU to access system memory in a linearized fashion.
AMDGPU_GEM_DOMAIN_VRAM: Local video memory. For APUs, it is memory carved out by the BIOS.
AMDGPU_GEM_DOMAIN_GDS: Global on-chip data storage used to share data across shader threads.
AMDGPU_GEM_DOMAIN_GWS: Global wave sync, used to synchronize the execution of all the waves on a device.
AMDGPU_GEM_DOMAIN_OA: Ordered append, used by 3D or Compute engines for appending data.
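To make the GTT description concrete, here is a small model of how a GART-style table presents scattered system pages as one linear range. The 4 KiB page size, struct gart and gart_translate() are assumptions made for the sketch, not the driver's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                      /* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/* Hypothetical table: entry i holds the system address of linear page i. */
struct gart {
    const uint64_t *page_addr;
    size_t num_pages;
};

/* Translate an offset in the linear range to the backing system address.
 * Returns UINT64_MAX when the offset lies outside the table. */
uint64_t gart_translate(const struct gart *g, uint64_t offset)
{
    uint64_t idx = offset >> SKETCH_PAGE_SHIFT;

    assert(g != NULL);
    if (idx >= g->num_pages)
        return UINT64_MAX;
    return g->page_addr[idx] + (offset & (SKETCH_PAGE_SIZE - 1u));
}

/* Three non-contiguous pages appear as offsets 0x0000..0x2fff. */
uint64_t demo_gart(uint64_t offset)
{
    static const uint64_t pages[] = { 0x7000, 0x2000, 0x9000 };
    const struct gart g = { pages, 3 };

    return gart_translate(&g, offset);
}
```

Consecutive offsets cross from one backing page to an unrelated one without the accessor noticing, which is the linearization the GTT entry refers to.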
Buffer Objects
This defines the interfaces to operate on an amdgpu_bo buffer object, which represents memory used by the driver (VRAM, system memory, etc.). The driver provides DRM/GEM APIs to userspace. The DRM/GEM APIs then use these interfaces to create, destroy, and set up buffer objects, which are then managed by the kernel TTM memory manager.
The interfaces are also used internally by kernel clients, including gfx, uvd, etc., for kernel managed allocations used by the GPU.
-
bool
amdgpu_bo_is_amdgpu_bo(struct ttm_buffer_object *bo)
check if the buffer object is an amdgpu_bo
Parameters
struct ttm_buffer_object *bo: buffer object to be checked
Description
Uses destroy function associated with the object to determine if this is
an amdgpu_bo.
Return
true if the object belongs to amdgpu_bo, false if not.
-
void
amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain)
set buffer’s placement
Parameters
struct amdgpu_bo *abo: amdgpu_bo buffer object whose placement is to be set
u32 domain: requested domain
Description
Sets buffer’s placement according to requested domain and the buffer’s flags.
-
int
amdgpu_bo_create_reserved(struct amdgpu_device *adev, unsigned long size, int align, u32 domain, struct amdgpu_bo **bo_ptr, u64 *gpu_addr, void **cpu_addr)
create reserved BO for kernel use
Parameters
struct amdgpu_device *adev: amdgpu device object
unsigned long size: size for the new BO
int align: alignment for the new BO
u32 domain: where to place it
struct amdgpu_bo **bo_ptr: used to initialize BOs in structures
u64 *gpu_addr: GPU addr of the pinned BO
void **cpu_addr: optional CPU address mapping
Description
Allocates and pins a BO for kernel internal use, and returns it still reserved.
Note
For bo_ptr new BO is only created if bo_ptr points to NULL.
Return
0 on success, negative error code otherwise.
-
int
amdgpu_bo_create_kernel(struct amdgpu_device *adev, unsigned long size, int align, u32 domain, struct amdgpu_bo **bo_ptr, u64 *gpu_addr, void **cpu_addr)
create BO for kernel use
Parameters
struct amdgpu_device *adev: amdgpu device object
unsigned long size: size for the new BO
int align: alignment for the new BO
u32 domain: where to place it
struct amdgpu_bo **bo_ptr: used to initialize BOs in structures
u64 *gpu_addr: GPU addr of the pinned BO
void **cpu_addr: optional CPU address mapping
Description
Allocates and pins a BO for kernel internal use.
Note
For bo_ptr new BO is only created if bo_ptr points to NULL.
Return
0 on success, negative error code otherwise.
-
int
amdgpu_bo_create_kernel_at(struct amdgpu_device *adev, uint64_t offset, uint64_t size, uint32_t domain, struct amdgpu_bo **bo_ptr, void **cpu_addr)
create BO for kernel use at specific location
Parameters
struct amdgpu_device *adev: amdgpu device object
uint64_t offset: offset of the BO
uint64_t size: size of the BO
uint32_t domain: where to place it
struct amdgpu_bo **bo_ptr: used to initialize BOs in structures
void **cpu_addr: optional CPU address mapping
Description
Creates a kernel BO at a specific offset in the address space of the domain.
Return
0 on success, negative error code otherwise.
-
void
amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr, void **cpu_addr)
free BO for kernel use
Parameters
struct amdgpu_bo **bo: amdgpu BO to free
u64 *gpu_addr: pointer to where the BO’s GPU memory space address was stored
void **cpu_addr: pointer to where the BO’s CPU memory space address was stored
Description
Unmaps and unpins a BO for kernel internal use.
-
int
amdgpu_bo_create(struct amdgpu_device *adev, struct amdgpu_bo_param *bp, struct amdgpu_bo **bo_ptr)
create an amdgpu_bo buffer object
Parameters
struct amdgpu_device *adev: amdgpu device object
struct amdgpu_bo_param *bp: parameters to be used for the buffer object
struct amdgpu_bo **bo_ptr: pointer to the buffer object pointer
Description
Creates an amdgpu_bo buffer object.
Return
0 for success or a negative error code on failure.
-
int
amdgpu_bo_create_user(struct amdgpu_device *adev, struct amdgpu_bo_param *bp, struct amdgpu_bo_user **ubo_ptr)
create an amdgpu_bo_user buffer object
Parameters
struct amdgpu_device *adev: amdgpu device object
struct amdgpu_bo_param *bp: parameters to be used for the buffer object
struct amdgpu_bo_user **ubo_ptr: pointer to the buffer object pointer
Description
Creates a BO to be used by a user application.
Return
0 for success or a negative error code on failure.
-
int
amdgpu_bo_create_vm(struct amdgpu_device *adev, struct amdgpu_bo_param *bp, struct amdgpu_bo_vm **vmbo_ptr)
create an amdgpu_bo_vm buffer object
Parameters
struct amdgpu_device *adev: amdgpu device object
struct amdgpu_bo_param *bp: parameters to be used for the buffer object
struct amdgpu_bo_vm **vmbo_ptr: pointer to the buffer object pointer
Description
Creates a BO to be used for GPUVM.
Return
0 for success or a negative error code on failure.
-
void
amdgpu_bo_add_to_shadow_list(struct amdgpu_bo_vm *vmbo)
add a BO to the shadow list
Parameters
struct amdgpu_bo_vm *vmbo: BO that will be inserted into the shadow list
Description
Inserts a BO into the shadow list.
-
int
amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct dma_fence **fence)
restore an amdgpu_bo shadow
Parameters
struct amdgpu_bo *shadow: amdgpu_bo shadow to be restored
struct dma_fence **fence: dma_fence associated with the operation
Description
Copies a buffer object’s shadow content back to the object. This is used for recovering a buffer from its shadow in case of a gpu reset where vram context may be lost.
Return
0 for success or a negative error code on failure.
-
int
amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
map an amdgpu_bo buffer object
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object to be mapped
void **ptr: kernel virtual address to be returned
Description
Calls ttm_bo_kmap() to set up the kernel virtual mapping; calls
amdgpu_bo_kptr() to get the kernel virtual address.
Return
0 for success or a negative error code on failure.
-
void *
amdgpu_bo_kptr(struct amdgpu_bo *bo)
returns a kernel virtual address of the buffer object
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object
Description
Calls ttm_kmap_obj_virtual() to get the kernel virtual address.
Return
the virtual address of a buffer object area.
-
void
amdgpu_bo_kunmap(struct amdgpu_bo *bo)
unmap an amdgpu_bo buffer object
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object to be unmapped
Description
Unmaps a kernel map set up by amdgpu_bo_kmap().
-
struct amdgpu_bo *
amdgpu_bo_ref(struct amdgpu_bo *bo)
reference an amdgpu_bo buffer object
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object
Description
References the contained ttm_buffer_object.
Return
a refcounted pointer to the amdgpu_bo buffer object.
-
void
amdgpu_bo_unref(struct amdgpu_bo **bo)
unreference an amdgpu_bo buffer object
Parameters
struct amdgpu_bo **bo: amdgpu_bo buffer object
Description
Unreferences the contained ttm_buffer_object and clears the pointer.
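The reference/unreference pairing can be modelled as follows. This is a generic refcounting sketch, not the amdgpu implementation: struct buffer, buffer_ref() and buffer_unref() are hypothetical, and a flag stands in for actually freeing the object.

```c
#include <assert.h>
#include <stddef.h>

struct buffer {
    int refcount;
    int destroyed;   /* set when the last reference goes away */
};

/* Take an extra reference and hand the same pointer back. */
struct buffer *buffer_ref(struct buffer *b)
{
    if (b)
        b->refcount++;
    return b;
}

/* Drop one reference and clear the caller's pointer so it cannot be reused. */
void buffer_unref(struct buffer **b)
{
    assert(b != NULL);
    if (*b == NULL)
        return;
    if (--(*b)->refcount == 0)
        (*b)->destroyed = 1;
    *b = NULL;
}

/* Returns 0 when the object survives the first unref and dies on the second. */
int demo_refcount(void)
{
    struct buffer buf = { 1, 0 };
    struct buffer *a = &buf;
    struct buffer *b = buffer_ref(a);   /* refcount is now 2 */

    buffer_unref(&a);                   /* refcount 1, a cleared */
    if (a != NULL || buf.destroyed)
        return -1;
    buffer_unref(&b);                   /* refcount 0, object destroyed */
    return (b == NULL && buf.destroyed) ? 0 : -1;
}
```

Taking a pointer to the pointer lets the unref side clear the caller's handle, which is why the two functions have asymmetric signatures.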
-
int
amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain, u64 min_offset, u64 max_offset)
pin an amdgpu_bo buffer object
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object to be pinned
u32 domain: domain to be pinned to
u64 min_offset: the start of requested address range
u64 max_offset: the end of requested address range
Description
Pins the buffer object according to requested domain and address range. If the memory is unbound GART memory, binds the pages into the GART table. Adjusts pin_count and pin_size accordingly.
Pinning means to lock pages in memory along with keeping them at a fixed offset. It is required when a buffer cannot be moved, for example, when a display buffer is being scanned out.
Compared with amdgpu_bo_pin(), this function gives more flexibility on
where to pin a buffer if there are specific restrictions on where a buffer
must be located.
Return
0 for success or a negative error code on failure.
-
int
amdgpu_bo_pin(struct amdgpu_bo *bo, u32 domain)
pin an amdgpu_bo buffer object
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object to be pinned
u32 domain: domain to be pinned to
Description
A simple wrapper to amdgpu_bo_pin_restricted().
Provides a simpler API for buffers that do not have any strict restrictions
on where a buffer must be located.
Return
0 for success or a negative error code on failure.
-
void
amdgpu_bo_unpin(struct amdgpu_bo *bo)
unpin an amdgpu_bo buffer object
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object to be unpinned
Description
Decreases the pin_count, and clears the flags if pin_count reaches 0. Changes placement and pin size accordingly. The function is declared void and returns no value.
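The pin_count bookkeeping described for the pin and unpin functions can be sketched like this. It is a simplified model: struct bo here is a hypothetical three-field structure, the domain values are arbitrary, and refusing a second pin with a different domain is an assumption of the sketch rather than documented behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct bo {
    unsigned int pin_count;  /* number of outstanding pins */
    unsigned int domain;     /* domain the buffer is pinned to */
    int movable;             /* nonzero while the buffer may be moved */
};

/* First pin fixes the buffer in place; later pins only bump the count. */
int bo_pin(struct bo *bo, unsigned int domain)
{
    assert(bo != NULL);
    if (bo->pin_count > 0) {
        if (bo->domain != domain)
            return -EINVAL;   /* assumed policy for conflicting requests */
        bo->pin_count++;
        return 0;
    }
    bo->domain = domain;
    bo->pin_count = 1;
    bo->movable = 0;
    return 0;
}

/* The buffer becomes movable again only when the last pin is dropped. */
void bo_unpin(struct bo *bo)
{
    assert(bo != NULL && bo->pin_count > 0);
    if (--bo->pin_count == 0)
        bo->movable = 1;
}

/* Returns 0 if the model behaves as described, a negative step id otherwise. */
int demo_pin(void)
{
    struct bo bo = { 0, 0, 1 };

    if (bo_pin(&bo, 4) || bo_pin(&bo, 4))   /* pin_count == 2 */
        return -1;
    if (bo_pin(&bo, 2) != -EINVAL)          /* conflicting domain refused */
        return -2;
    bo_unpin(&bo);                          /* one pin still outstanding */
    if (bo.movable)
        return -3;
    bo_unpin(&bo);                          /* last pin dropped */
    return bo.movable ? 0 : -4;
}
```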
-
int
amdgpu_bo_init(struct amdgpu_device *adev)
initialize memory manager
Parameters
struct amdgpu_device *adev: amdgpu device object
Description
Calls amdgpu_ttm_init() to initialize amdgpu memory manager.
Return
0 for success or a negative error code on failure.
-
void
amdgpu_bo_fini(struct amdgpu_device *adev)
tear down memory manager
Parameters
struct amdgpu_device *adev: amdgpu device object
Description
Reverses amdgpu_bo_init() to tear down memory manager.
-
int
amdgpu_bo_set_tiling_flags(struct amdgpu_bo *bo, u64 tiling_flags)
set tiling flags
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object
u64 tiling_flags: new flags
Description
Sets buffer object’s tiling flags with the new one. Used by GEM ioctl or kernel driver to set the tiling flags on a buffer.
Return
0 for success or a negative error code on failure.
-
void
amdgpu_bo_get_tiling_flags(struct amdgpu_bo *bo, u64 *tiling_flags)
get tiling flags
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object
u64 *tiling_flags: returned flags
Description
Gets buffer object’s tiling flags. Used by GEM ioctl or kernel driver to get the tiling flags of a buffer.
-
int
amdgpu_bo_set_metadata(struct amdgpu_bo *bo, void *metadata, uint32_t metadata_size, uint64_t flags)
set metadata
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object
void *metadata: new metadata
uint32_t metadata_size: size of the new metadata
uint64_t flags: flags of the new metadata
Description
Sets buffer object’s metadata, its size and flags. Used via GEM ioctl.
Return
0 for success or a negative error code on failure.
-
int
amdgpu_bo_get_metadata(struct amdgpu_bo *bo, void *buffer, size_t buffer_size, uint32_t *metadata_size, uint64_t *flags)
get metadata
Parameters
struct amdgpu_bo *bo: amdgpu_bo buffer object
void *buffer: returned metadata
size_t buffer_size: size of the buffer
uint32_t *metadata_size: size of the returned metadata
uint64_t *flags: flags of the returned metadata
Description
Gets buffer object’s metadata, its size and flags. buffer_size shall not be less than metadata_size. Used via GEM ioctl.
Return
0 for success or a negative error code on failure.
-
void
amdgpu_bo_move_notify(struct ttm_buffer_object *bo, bool evict, struct ttm_resource *new_mem)
notification about a memory move
Parameters
struct ttm_buffer_object *bo: pointer to a buffer object
bool evict: if this move is evicting the buffer from the graphics address space
struct ttm_resource *new_mem: new information of the buffer object
Description
Marks the corresponding amdgpu_bo buffer object as invalid, also performs bookkeeping.
TTM driver callback which is called when TTM moves a buffer.
-
void
amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
notification about a BO being released
Parameters
struct ttm_buffer_object *bo: pointer to a buffer object
Description
Wipes VRAM buffers whose contents should not be leaked before the memory is released.
-
vm_fault_t
amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
notification about a memory fault
Parameters
struct ttm_buffer_object *bo: pointer to a buffer object
Description
Notifies the driver we are taking a fault on this BO and have reserved it, also performs bookkeeping. TTM driver callback for dealing with vm faults.
Return
0 for success or a negative error code on failure.
-
void
amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence, bool shared)
add fence to buffer object
Parameters
struct amdgpu_bo *bo: buffer object in question
struct dma_fence *fence: fence to add
bool shared: true if fence should be added shared
-
int
amdgpu_bo_sync_wait_resv(struct amdgpu_device *adev, struct dma_resv *resv, enum amdgpu_sync_mode sync_mode, void *owner, bool intr)
Wait for BO reservation fences
Parameters
struct amdgpu_device *adev: amdgpu device pointer
struct dma_resv *resv: reservation object to sync to
enum amdgpu_sync_mode sync_mode: synchronization mode
void *owner: fence owner
bool intr: Whether the wait is interruptible
Description
Extracts the fences from the reservation object and waits for them to finish.
Return
0 on success, errno otherwise.
-
int
amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr)
Wrapper for amdgpu_bo_sync_wait_resv
Parameters
struct amdgpu_bo *bo: buffer object to wait for
void *owner: fence owner
bool intr: Whether the wait is interruptible
Description
Wrapper to wait for fences in a BO.
Return
0 on success, errno otherwise.
-
u64
amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
return GPU offset of bo
Parameters
struct amdgpu_bo *bo: amdgpu object for which we query the offset
Note
The object should be either pinned or reserved when calling this function. It might be useful to add a check for this for debugging.
Return
current GPU offset of the object.
-
u64
amdgpu_bo_gpu_offset_no_check(struct amdgpu_bo *bo)
return GPU offset of bo
Parameters
struct amdgpu_bo *bo: amdgpu object for which we query the offset
Return
current GPU offset of the object without raising warnings.
-
uint32_t
amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev, uint32_t domain)
get preferred domain
Parameters
struct amdgpu_device *adev: amdgpu device object
uint32_t domain: allowed memory domains
Return
Which of the allowed domains is preferred for allocating the BO.
-
u64
amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
print BO info in debugfs file
Parameters
int id: Index or Id of the BO
struct amdgpu_bo *bo: Requested BO for printing info
struct seq_file *m: debugfs file
Description
Prints BO information in the debugfs file.
Return
Size of the BO in bytes.
PRIME Buffer Sharing
The following callback implementations are used for sharing GEM buffer objects between different devices via PRIME.
-
int
amdgpu_dma_buf_attach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
dma_buf_ops.attach implementation
Parameters
struct dma_buf *dmabuf: DMA-buf where we attach to
struct dma_buf_attachment *attach: attachment to add
Description
Add the attachment as user to the exported DMA-buf.
-
void
amdgpu_dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
dma_buf_ops.detach implementation
Parameters
struct dma_buf *dmabuf: DMA-buf where we remove the attachment from
struct dma_buf_attachment *attach: the attachment to remove
Description
Called when an attachment is removed from the DMA-buf.
-
int
amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
dma_buf_ops.pin implementation
Parameters
struct dma_buf_attachment *attach: attachment to pin down
Description
Pin the BO which is backing the DMA-buf so that it can’t move any more.
-
void
amdgpu_dma_buf_unpin(struct dma_buf_attachment *attach)
dma_buf_ops.unpin implementation
Parameters
struct dma_buf_attachment *attach: attachment to unpin
Description
Unpin a previously pinned BO to make it movable again.
-
struct sg_table *
amdgpu_dma_buf_map(struct dma_buf_attachment *attach, enum dma_data_direction dir)
dma_buf_ops.map_dma_buf implementation
Parameters
struct dma_buf_attachment *attach: DMA-buf attachment
enum dma_data_direction dir: DMA direction
Description
Makes sure that the shared DMA buffer can be accessed by the target device. For now, simply pins it to the GTT domain, where it should be accessible by all DMA devices.
Return
sg_table filled with the DMA addresses to use or ERR_PTR with negative error code.
-
void
amdgpu_dma_buf_unmap(struct dma_buf_attachment *attach, struct sg_table *sgt, enum dma_data_direction dir)
dma_buf_ops.unmap_dma_buf implementation
Parameters
struct dma_buf_attachment *attach: DMA-buf attachment
struct sg_table *sgt: sg_table to unmap
enum dma_data_direction dir: DMA direction
Description
This is called when a shared DMA buffer no longer needs to be accessible by another device. For now, simply unpins the buffer from GTT.
-
int
amdgpu_dma_buf_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
dma_buf_ops.begin_cpu_access implementation
Parameters
struct dma_buf *dma_buf: Shared DMA buffer
enum dma_data_direction direction: Direction of DMA transfer
Description
This is called before CPU access to the shared DMA buffer’s memory. If it’s a read access, the buffer is moved to the GTT domain if possible, for optimal CPU read performance.
Return
0 on success or a negative error code on failure.
-
struct dma_buf *
amdgpu_gem_prime_export(struct drm_gem_object *gobj, int flags)
drm_driver.gem_prime_export implementation
Parameters
struct drm_gem_object *gobj: GEM BO
int flags: Flags such as DRM_CLOEXEC and DRM_RDWR.
Description
The main work is done by the drm_gem_prime_export helper.
Return
Shared DMA buffer representing the GEM BO from the given device.
-
struct drm_gem_object *
amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
create BO for DMA-buf import
Parameters
struct drm_device *dev: DRM device
struct dma_buf *dma_buf: DMA-buf
Description
Creates an empty SG BO for DMA-buf import.
Return
A new GEM BO of the given DRM device, representing the memory described by the given DMA-buf attachment and scatter/gather table.
-
void
amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
attach.move_notify implementation
Parameters
struct dma_buf_attachment *attach: the DMA-buf attachment
Description
Invalidates the DMA-buf attachment, making sure that we re-create the mapping before the next use.
-
struct drm_gem_object *
amdgpu_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf)
drm_driver.gem_prime_import implementation
Parameters
struct drm_device *dev: DRM device
struct dma_buf *dma_buf: Shared DMA buffer
Description
Imports a dma_buf into the driver and potentially creates a new GEM object.
Return
GEM BO representing the shared DMA buffer for the given device.
-
bool
amdgpu_dmabuf_is_xgmi_accessible(struct amdgpu_device *adev, struct amdgpu_bo *bo)
Check if xgmi available for P2P transfer
Parameters
struct amdgpu_device *adev: amdgpu_device pointer of the importer
struct amdgpu_bo *bo: amdgpu buffer object
Return
True if dmabuf accessible over xgmi, false otherwise.
MMU Notifier
For coherent userptr handling, the driver registers an MMU notifier that informs it about updates to the page tables of a process.
When somebody tries to invalidate the page tables, we block the update until all operations on the pages in question are completed. Those pages are then marked as accessed, and also as dirty if it wasn’t a read-only access.
New command submissions using the userptrs in question are delayed until all page table invalidations are completed and we once more see a coherent process address space.
-
bool
amdgpu_mn_invalidate_gfx(struct mmu_interval_notifier *mni, const struct mmu_notifier_range *range, unsigned long cur_seq)
callback to notify about mm change
Parameters
struct mmu_interval_notifier *mni: the range (mm) is about to update
const struct mmu_notifier_range *range: details on the invalidation
unsigned long cur_seq: Value to pass to mmu_interval_set_seq()
Description
Block for operations on BOs to finish and mark pages as accessed and potentially dirty.
-
bool
amdgpu_mn_invalidate_hsa(struct mmu_interval_notifier *mni, const struct mmu_notifier_range *range, unsigned long cur_seq)
callback to notify about mm change
Parameters
struct mmu_interval_notifier *mni: the range (mm) is about to update
const struct mmu_notifier_range *range: details on the invalidation
unsigned long cur_seq: Value to pass to mmu_interval_set_seq()
Description
We temporarily evict the BO attached to this range. This necessitates evicting all user-mode queues of the process.
-
int
amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
register a BO for notifier updates
Parameters
struct amdgpu_bo *bo: amdgpu buffer object
unsigned long addr: userptr addr we should monitor
Description
Registers an mmu_notifier for the given BO at the specified address. Returns 0 on success, -ERRNO if anything goes wrong.
-
void
amdgpu_mn_unregister(struct amdgpu_bo *bo)
unregister a BO for notifier updates
Parameters
struct amdgpu_bo *bo: amdgpu buffer object
Description
Remove any registration of mmu notifier updates from the buffer object.
AMDGPU Virtual Memory
GPUVM is similar to the legacy GART on older ASICs. However, rather than there being a single global GART table for the entire GPU, there are multiple VM page tables active at any given time. The VM page tables can contain a mix of VRAM pages and system memory pages, and system memory pages can be mapped as snooped (cached system pages) or unsnooped (uncached system pages). Each VM has an ID associated with it and there is a page table associated with each VMID. When executing a command buffer, the kernel tells the ring what VMID to use for that command buffer. VMIDs are allocated dynamically as commands are submitted. The userspace drivers maintain their own address space and the kernel sets up their page tables accordingly when they submit their command buffers and a VMID is assigned. Cayman/Trinity support up to 8 active VMs at any given time; SI supports 16.
-
struct
amdgpu_prt_cb
Helper to disable partial resident texture feature from a fence callback
Definition
struct amdgpu_prt_cb {
    struct amdgpu_device *adev;
    struct dma_fence_cb cb;
};
Members
adev: amdgpu device
cb: callback
-
int
amdgpu_vm_set_pasid(struct amdgpu_device *adev, struct amdgpu_vm *vm, u32 pasid)
manage pasid and vm ptr mapping
Parameters
struct amdgpu_device *adev: amdgpu_device pointer
struct amdgpu_vm *vm: amdgpu_vm pointer
u32 pasid: the pasid the VM is using on this GPU
Description
Sets the pasid this VM is using on this GPU. It can also be used to remove the pasid by passing in zero.
-
unsigned
amdgpu_vm_level_shift(struct amdgpu_device *adev, unsigned level)
return the addr shift for each level
Parameters
struct amdgpu_device *adev: amdgpu_device pointer
unsigned level: VMPT level
Return
The number of bits the pfn needs to be right shifted for a level.
-
unsigned
amdgpu_vm_num_entries(struct amdgpu_device *adev, unsigned level)
return the number of entries in a PD/PT
Parameters
struct amdgpu_device *adev: amdgpu_device pointer
unsigned level: VMPT level
Return
The number of entries in a page directory or page table.
-
unsigned
amdgpu_vm_num_ats_entries(struct amdgpu_device *adev)
return the number of ATS entries in the root PD
Parameters
struct amdgpu_device *adev: amdgpu_device pointer
Return
The number of entries in the root page directory which needs the ATS setting.
-
uint32_t
amdgpu_vm_entries_mask(struct amdgpu_device *adev, unsigned int level)
the mask to get the entry number of a PD/PT
Parameters
struct amdgpu_device *adev: amdgpu_device pointer
unsigned int level: VMPT level
Return
The mask to extract the entry number of a PD/PT from an address.
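How a per-level shift and mask combine to pick an entry can be shown with a small model. The layout is assumed purely for illustration: four levels with 9 address bits (512 entries) each, level 0 being the root. The real driver derives these values from the device configuration, and its root directory may be sized differently.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_LEVELS 4u   /* assumed depth, level 0 is the root */
#define SKETCH_BLOCK_BITS 9u   /* assumed 512 entries per PD/PT */

/* Number of bits the page frame number is shifted right for a level. */
unsigned int level_shift(unsigned int level)
{
    assert(level < SKETCH_NUM_LEVELS);
    return (SKETCH_NUM_LEVELS - 1u - level) * SKETCH_BLOCK_BITS;
}

/* Entries per directory or table; uniform in this sketch. */
unsigned int num_entries(unsigned int level)
{
    assert(level < SKETCH_NUM_LEVELS);
    return 1u << SKETCH_BLOCK_BITS;
}

/* Mask that isolates the entry number once the shift is applied. */
uint32_t entries_mask(unsigned int level)
{
    return num_entries(level) - 1u;
}

/* Index of the entry that covers pfn at the given level. */
uint32_t entry_index(uint64_t pfn, unsigned int level)
{
    return (uint32_t)(pfn >> level_shift(level)) & entries_mask(level);
}
```

For a pfn built as (1 << 27) | (2 << 18) | (3 << 9) | 4, the indices at levels 0 through 3 are 1, 2, 3 and 4 under these assumptions.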
-
unsigned
amdgpu_vm_bo_size(struct amdgpu_device *adev, unsigned level)
returns the size of the BOs in bytes
Parameters
struct amdgpu_device *adev: amdgpu_device pointer
unsigned level: VMPT level
Return
The size of the BO for a page directory or page table in bytes.
-
void
amdgpu_vm_bo_evicted(struct amdgpu_vm_bo_base *vm_bo)
vm_bo is evicted
Parameters
struct amdgpu_vm_bo_base *vm_bo: vm_bo which is evicted
Description
State for PDs/PTs and per VM BOs which are not at the location they should be.
-
void
amdgpu_vm_bo_moved(struct amdgpu_vm_bo_base *vm_bo)
vm_bo is moved
Parameters
struct amdgpu_vm_bo_base *vm_bo: vm_bo which is moved
Description
State for per VM BOs which are moved, but that change is not yet reflected in the page tables.
-
void
amdgpu_vm_bo_idle(struct amdgpu_vm_bo_base *vm_bo)
vm_bo is idle
Parameters
struct amdgpu_vm_bo_base *vm_bo: vm_bo which is now idle
Description
State for PDs/PTs and per VM BOs which have gone through the state machine and are now idle.
-
void
amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base *vm_bo)
vm_bo is invalidated
Parameters
struct amdgpu_vm_bo_base *vm_bo: vm_bo which is now invalidated
Description
State for normal BOs which are invalidated and that change is not yet reflected in the PTs.
-
void
amdgpu_vm_bo_relocated(struct amdgpu_vm_bo_base *vm_bo)
vm_bo is relocated
Parameters
struct amdgpu_vm_bo_base *vm_bo: vm_bo which is relocated
Description
State for PDs/PTs which need to update their parent PD. For the root PD, just move to idle state.
-
void
amdgpu_vm_bo_done(struct amdgpu_vm_bo_base *vm_bo)
vm_bo is done
Parameters
struct amdgpu_vm_bo_base *vm_bo: vm_bo which is now done
Description
State for normal BOs which are invalidated and that change has been updated in the PTs.
-
void
amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base, struct amdgpu_vm *vm, struct amdgpu_bo *bo)
Adds bo to the list of bos associated with the vm
Parameters
struct amdgpu_vm_bo_base *base: base structure for tracking BO usage in a VM
struct amdgpu_vm *vm: vm to which bo is to be added
struct amdgpu_bo *bo: amdgpu buffer object
Description
Initializes a bo_va_base structure and adds it to the appropriate lists.
-
struct amdgpu_vm_bo_base *
amdgpu_vm_pt_parent(struct amdgpu_vm_bo_base *pt)
get the parent page directory
Parameters
struct amdgpu_vm_bo_base *pt: child page table
Description
Helper to get the parent entry for the child page table. NULL if we are at the root page directory.
-
void
amdgpu_vm_pt_start(struct amdgpu_device *adev, struct amdgpu_vm *vm, uint64_t start, struct amdgpu_vm_pt_cursor *cursor)
start PD/PT walk
Parameters
struct amdgpu_device *adev: amdgpu_device pointer
struct amdgpu_vm *vm: amdgpu_vm structure
uint64_t start: start address of the walk
struct amdgpu_vm_pt_cursor *cursor: state to initialize
Description
Initializes an amdgpu_vm_pt_cursor to start a walk.
-
bool
amdgpu_vm_pt_descendant(struct amdgpu_device *adev, struct amdgpu_vm_pt_cursor *cursor)
go to child node
Parameters
struct amdgpu_device *adev: amdgpu_device pointer
struct amdgpu_vm_pt_cursor *cursor: current state
Description
Walk to the child node of the current node.
Return
True if the walk was possible, false otherwise.
-
bool
amdgpu_vm_pt_sibling(struct amdgpu_device *adev, struct amdgpu_vm_pt_cursor *cursor)¶ go to sibling node
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm_pt_cursor *cursorcurrent state
Description
Walk to the sibling node of the current node.
Return
True if the walk was possible, false otherwise.
-
bool
amdgpu_vm_pt_ancestor(struct amdgpu_vm_pt_cursor *cursor)¶ go to parent node
Parameters
struct amdgpu_vm_pt_cursor *cursorcurrent state
Description
Walk to the parent node of the current node.
Return
True if the walk was possible, false otherwise.
-
void
amdgpu_vm_pt_next(struct amdgpu_device *adev, struct amdgpu_vm_pt_cursor *cursor)¶ get next PD/PT in hierarchy
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm_pt_cursor *cursorcurrent state
Description
Walk the PD/PT tree to the next node.
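The three cursor moves documented above (child, sibling, parent) are enough to build such a walk. The following is a minimal userspace sketch of one way to combine them; the `node` and `cursor` types and the `walk_*` helpers are hypothetical and do not mirror the driver's page table layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tree node; not the driver's PD/PT representation. */
struct node {
    struct node *parent, *first_child, *next_sibling;
    int id;
};

/* Hypothetical cursor: the current node, or NULL once the walk is over. */
struct cursor { struct node *cur; };

static bool walk_descendant(struct cursor *c)
{
    if (!c->cur->first_child)
        return false;
    c->cur = c->cur->first_child;
    return true;
}

static bool walk_sibling(struct cursor *c)
{
    if (!c->cur->next_sibling)
        return false;
    c->cur = c->cur->next_sibling;
    return true;
}

static bool walk_ancestor(struct cursor *c)
{
    if (!c->cur->parent)
        return false;
    c->cur = c->cur->parent;
    return true;
}

/* Advance in pre-order: child first, then sibling, then an ancestor's sibling. */
static void walk_next(struct cursor *c)
{
    assert(c->cur != NULL);
    if (walk_descendant(c))
        return;
    while (!walk_sibling(c)) {
        if (!walk_ancestor(c)) {
            c->cur = NULL;   /* climbed past the root: walk finished */
            return;
        }
    }
}
```

A caller would loop with `while (cursor.cur) { ...; walk_next(&cursor); }`, visiting each node exactly once.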
-
void
amdgpu_vm_pt_first_dfs(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct amdgpu_vm_pt_cursor *start, struct amdgpu_vm_pt_cursor *cursor)¶ start a depth-first search
Parameters
struct amdgpu_device *adevamdgpu_device structure
struct amdgpu_vm *vmamdgpu_vm structure
struct amdgpu_vm_pt_cursor *startoptional cursor to start with
struct amdgpu_vm_pt_cursor *cursorstate to initialize
Description
Starts a depth-first traversal of the PD/PT tree.
-
bool
amdgpu_vm_pt_continue_dfs(struct amdgpu_vm_pt_cursor *start, struct amdgpu_vm_bo_base *entry)¶ check if the depth-first search should continue
Parameters
struct amdgpu_vm_pt_cursor *startstarting point for the search
struct amdgpu_vm_bo_base *entrycurrent entry
Return
True when the search should continue, false otherwise.
-
void
amdgpu_vm_pt_next_dfs(struct amdgpu_device *adev, struct amdgpu_vm_pt_cursor *cursor)¶ get the next node for a depth-first search
Parameters
struct amdgpu_device *adevamdgpu_device structure
struct amdgpu_vm_pt_cursor *cursorcurrent state
Description
Move the cursor to the next node in a depth-first search.
-
void
amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm, struct list_head *validated, struct amdgpu_bo_list_entry *entry)¶ add the VM PD to a validation list
Parameters
struct amdgpu_vm *vmvm providing the BOs
struct list_head *validatedhead of validation list
struct amdgpu_bo_list_entry *entryentry to add
Description
Add the page directory to the list of BOs to validate for command submission.
-
void
amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo)¶ update bulk_moveable flag
Parameters
struct ttm_buffer_object *boBO which was removed from the LRU
Description
Make sure the bulk_moveable flag is updated when a BO is removed from the LRU.
-
void
amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ move all BOs to the end of LRU
Parameters
struct amdgpu_device *adevamdgpu device pointer
struct amdgpu_vm *vmvm providing the BOs
Description
Move all BOs to the end of LRU and remember their positions to put them together.
-
int
amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm, int (*validate)(void *p, struct amdgpu_bo *bo), void *param)¶ validate the page table BOs
Parameters
struct amdgpu_device *adevamdgpu device pointer
struct amdgpu_vm *vmvm providing the BOs
int (*validate)(void *p, struct amdgpu_bo *bo)callback to do the validation
void *paramparameter for the validation callback
Description
Validate the page table BOs on command submission if necessary.
Return
Validation result.
-
bool
amdgpu_vm_ready(struct amdgpu_vm *vm)¶ check VM is ready for updates
Parameters
struct amdgpu_vm *vmVM to check
Description
Check if all VM PDs/PTs are ready for updates
Return
True if VM is not evicting.
-
int
amdgpu_vm_clear_bo(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct amdgpu_bo_vm *vmbo, bool immediate)¶ initially clear the PDs/PTs
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmVM to clear BO from
struct amdgpu_bo_vm *vmboBO to clear
bool immediateuse an immediate update
Description
Root PD needs to be reserved when calling this.
Return
0 on success, errno otherwise.
-
int
amdgpu_vm_pt_create(struct amdgpu_device *adev, struct amdgpu_vm *vm, int level, bool immediate, struct amdgpu_bo_vm **vmbo)¶ create bo for PD/PT
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequesting vm
int levelthe page table level
bool immediateuse an immediate update
struct amdgpu_bo_vm **vmbopointer to the buffer object pointer
-
int
amdgpu_vm_alloc_pts(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct amdgpu_vm_pt_cursor *cursor, bool immediate)¶ Allocate a specific page table
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmVM to allocate page tables for
struct amdgpu_vm_pt_cursor *cursorWhich page table to allocate
bool immediateuse an immediate update
Description
Make sure a specific page table or directory is allocated.
Return
1 if page table needed to be allocated, 0 if page table was already allocated, negative errno if an error occurred.
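The three-way return convention above (1 for newly allocated, 0 for already present, negative errno on error) is easy to misread as a plain success/failure code. The sketch below shows the pattern in userspace C; `ensure_slot()`, `prepare_slot()` and the `slots` table are hypothetical stand-ins, not driver functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical slot table standing in for page table entries. */
#define NUM_SLOTS 4
static void *slots[NUM_SLOTS];

/* Returns 1 if newly allocated, 0 if already present, negative errno on error. */
static int ensure_slot(unsigned int idx, size_t size)
{
    assert(size > 0);
    if (idx >= NUM_SLOTS)
        return -EINVAL;
    if (slots[idx])
        return 0;
    slots[idx] = calloc(1, size);
    if (!slots[idx])
        return -ENOMEM;
    return 1;
}

/* Caller pattern: only negative values are errors; 1 means "fresh table". */
static int prepare_slot(unsigned int idx, int *fresh)
{
    int r = ensure_slot(idx, 64);

    if (r < 0)
        return r;
    *fresh = (r == 1);
    return 0;
}
```

The point of the caller pattern is that `if (r)` would wrongly treat a fresh allocation as a failure; the check has to be `r < 0`.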
-
void
amdgpu_vm_free_table(struct amdgpu_vm_bo_base *entry)¶ free one PD/PT
Parameters
struct amdgpu_vm_bo_base *entryPDE to free
-
void
amdgpu_vm_free_pts(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct amdgpu_vm_pt_cursor *start)¶ free PD/PT levels
Parameters
struct amdgpu_device *adevamdgpu device structure
struct amdgpu_vm *vmamdgpu vm structure
struct amdgpu_vm_pt_cursor *startoptional cursor where to start freeing PDs/PTs
Description
Free the page directory or page table level and all sub levels.
-
void
amdgpu_vm_check_compute_bug(struct amdgpu_device *adev)¶ check whether asic has compute vm bug
Parameters
struct amdgpu_device *adevamdgpu_device pointer
-
bool
amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring, struct amdgpu_job *job)¶ Check if pipe sync is needed for job.
Parameters
struct amdgpu_ring *ringring on which the job will be submitted
struct amdgpu_job *jobjob to submit
Return
True if sync is needed.
-
int
amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, bool need_pipe_sync)¶ hardware flush the vm
Parameters
struct amdgpu_ring *ringring to use for flush
struct amdgpu_job *jobrelated job
bool need_pipe_syncis pipe sync needed
Description
Emit a VM flush when it is necessary.
Return
0 on success, errno otherwise.
-
struct amdgpu_bo_va *
amdgpu_vm_bo_find(struct amdgpu_vm *vm, struct amdgpu_bo *bo)¶ find the bo_va for a specific vm & bo
Parameters
struct amdgpu_vm *vmrequested vm
struct amdgpu_bo *borequested buffer object
Description
Find bo inside the requested vm. Search inside the bo's vm list for the requested vm. Returns the found bo_va or NULL if none is found.
Object has to be reserved!
Return
Found bo_va or NULL.
-
uint64_t
amdgpu_vm_map_gart(const dma_addr_t *pages_addr, uint64_t addr)¶ Resolve gart mapping of addr
Parameters
const dma_addr_t *pages_addroptional DMA address to use for lookup
uint64_t addrthe unmapped addr
Description
Look up the physical address of the page that the pte resolves to.
Return
The pointer for the page table entry.
-
int
amdgpu_vm_update_pde(struct amdgpu_vm_update_params *params, struct amdgpu_vm *vm, struct amdgpu_vm_bo_base *entry)¶ update a single level in the hierarchy
Parameters
struct amdgpu_vm_update_params *paramsparameters for the update
struct amdgpu_vm *vmrequested vm
struct amdgpu_vm_bo_base *entryentry to update
Description
Makes sure the requested entry in parent is up to date.
-
void
amdgpu_vm_invalidate_pds(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ mark all PDs as invalid
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrelated vm
Description
Mark all PD levels as invalid after an error.
-
int
amdgpu_vm_update_pdes(struct amdgpu_device *adev, struct amdgpu_vm *vm, bool immediate)¶ make sure that all directories are valid
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
bool immediatesubmit immediately to the paging queue
Description
Makes sure all directories are up to date.
Return
0 for success, error for failure.
-
void
amdgpu_vm_fragment(struct amdgpu_vm_update_params *params, uint64_t start, uint64_t end, uint64_t flags, unsigned int *frag, uint64_t *frag_end)¶ get fragment for PTEs
Parameters
struct amdgpu_vm_update_params *paramssee amdgpu_vm_update_params definition
uint64_t startfirst PTE to handle
uint64_t endlast PTE to handle
uint64_t flagshw mapping flags
unsigned int *fragresulting fragment size
uint64_t *frag_endend of this fragment
Description
Returns the first possible fragment for the start and end address.
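As an illustration of what "first possible fragment" can mean, the sketch below picks the largest power-of-two block that is aligned at the start address, fits into the range, and does not exceed a maximum. This is an assumption about one plausible computation, not the driver's implementation: `pick_fragment()` is a hypothetical helper, it treats `end` as exclusive, it ignores the flags and update parameters, and it relies on the GCC/Clang builtins `__builtin_ctzll` and `__builtin_clzll`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns log2 of the chosen fragment size (in pages)
 * and stores in *frag_end the address up to which this choice applies.
 * 'max_frag' is the largest allowed fragment, also as log2.
 */
static unsigned int pick_fragment(uint64_t start, uint64_t end,
                                  unsigned int max_frag, uint64_t *frag_end)
{
    unsigned int align, fit, frag;

    assert(end > start && max_frag < 64);

    /* log2 of the alignment of 'start'; address 0 is aligned to anything. */
    align = start ? (unsigned int)__builtin_ctzll(start) : 63;
    /* log2 of the largest power of two that fits into the range. */
    fit = 63 - (unsigned int)__builtin_clzll(end - start);

    frag = align < fit ? align : fit;
    if (frag >= max_frag) {
        /* Capped: the same size stays usable up to the last aligned boundary. */
        frag = max_frag;
        *frag_end = end & ~((UINT64_C(1) << max_frag) - 1);
    } else {
        /* Limited by alignment or size: valid for exactly one fragment. */
        *frag_end = start + (UINT64_C(1) << frag);
    }
    return frag;
}
```

A caller would typically loop, mapping `[start, frag_end)` with the returned fragment size and then calling the helper again with `start = frag_end`.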
-
int
amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params, uint64_t start, uint64_t end, uint64_t dst, uint64_t flags)¶ make sure that page tables are valid
Parameters
struct amdgpu_vm_update_params *paramssee amdgpu_vm_update_params definition
uint64_t startstart of GPU address range
uint64_t endend of GPU address range
uint64_t dstdestination address to map to, the next dst inside the function
uint64_t flagsmapping flags
Description
Update the page tables in the range start - end.
Return
0 for success, -EINVAL for failure.
-
int
amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, struct amdgpu_device *bo_adev, struct amdgpu_vm *vm, bool immediate, bool unlocked, struct dma_resv *resv, uint64_t start, uint64_t last, uint64_t flags, uint64_t offset, struct ttm_resource *res, dma_addr_t *pages_addr, struct dma_fence **fence, bool *table_freed)¶ update a mapping in the vm page table
Parameters
struct amdgpu_device *adevamdgpu_device pointer of the VM
struct amdgpu_device *bo_adevamdgpu_device pointer of the mapped BO
struct amdgpu_vm *vmrequested vm
bool immediateimmediate submission in a page fault
bool unlockedunlocked invalidation during MM callback
struct dma_resv *resvfences we need to sync to
uint64_t startstart of mapped range
uint64_t lastlast mapped entry
uint64_t flagsflags for the entries
uint64_t offsetoffset into nodes and pages_addr
struct ttm_resource *resttm_resource to map
dma_addr_t *pages_addrDMA addresses to use for mapping
struct dma_fence **fenceoptional resulting fence
bool *table_freedreturn true if page table is freed
Description
Fill in the page table entries between start and last.
Return
0 for success, -EINVAL for failure.
-
int
amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, bool clear, bool *table_freed)¶ update all BO mappings in the vm page table
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_bo_va *bo_varequested BO and VM object
bool clearif true clear the entries
bool *table_freedreturn true if page table is freed
Description
Fill in the page table entries for bo_va.
Return
0 for success, -EINVAL for failure.
-
void
amdgpu_vm_update_prt_state(struct amdgpu_device *adev)¶ update the global PRT state
Parameters
struct amdgpu_device *adevamdgpu_device pointer
-
void
amdgpu_vm_prt_get(struct amdgpu_device *adev)¶ add a PRT user
Parameters
struct amdgpu_device *adevamdgpu_device pointer
-
void
amdgpu_vm_prt_put(struct amdgpu_device *adev)¶ drop a PRT user
Parameters
struct amdgpu_device *adevamdgpu_device pointer
-
void
amdgpu_vm_prt_cb(struct dma_fence *fence, struct dma_fence_cb *_cb)¶ callback for updating the PRT status
Parameters
struct dma_fence *fencefence for the callback
struct dma_fence_cb *_cbthe callback function
-
void
amdgpu_vm_add_prt_cb(struct amdgpu_device *adev, struct dma_fence *fence)¶ add callback for updating the PRT status
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct dma_fence *fencefence for the callback
-
void
amdgpu_vm_free_mapping(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct amdgpu_bo_va_mapping *mapping, struct dma_fence *fence)¶ free a mapping
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
struct amdgpu_bo_va_mapping *mappingmapping to be freed
struct dma_fence *fencefence of the unmap operation
Description
Free a mapping and make sure we decrease the PRT usage count if applicable.
-
void
amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ finish all prt mappings
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
Description
Register a cleanup callback to disable PRT support after VM dies.
-
int
amdgpu_vm_clear_freed(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct dma_fence **fence)¶ clear freed BOs in the PT
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
struct dma_fence **fenceoptional resulting fence (unchanged if no work needed to be done or if an error occurred)
Description
Make sure all freed BOs are cleared in the PT. PTs have to be reserved and mutex must be locked!
Return
0 for success.
-
int
amdgpu_vm_handle_moved(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ handle moved BOs in the PT
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
Description
Make sure all BOs which are moved are updated in the PTs.
PTs have to be reserved!
Return
0 for success.
-
struct amdgpu_bo_va *
amdgpu_vm_bo_add(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct amdgpu_bo *bo)¶ add a bo to a specific vm
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
struct amdgpu_bo *boamdgpu buffer object
Description
Add bo into the requested vm. Add bo to the list of bos associated with the vm
Object has to be reserved!
Return
Newly added bo_va or NULL for failure
-
void
amdgpu_vm_bo_insert_map(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, struct amdgpu_bo_va_mapping *mapping)¶ insert a new mapping
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_bo_va *bo_vabo_va to store the address
struct amdgpu_bo_va_mapping *mappingthe mapping to insert
Description
Insert a new mapping into all structures.
-
int
amdgpu_vm_bo_map(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, uint64_t saddr, uint64_t offset, uint64_t size, uint64_t flags)¶ map bo inside a vm
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_bo_va *bo_vabo_va to store the address
uint64_t saddrwhere to map the BO
uint64_t offsetrequested offset in the BO
uint64_t sizeBO size in bytes
uint64_t flagsattributes of pages (read/write/valid/etc.)
Description
Add a mapping of the BO at the specified addr into the VM.
Object has to be reserved and unreserved outside!
Return
0 for success, error for failure.
-
int
amdgpu_vm_bo_replace_map(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, uint64_t saddr, uint64_t offset, uint64_t size, uint64_t flags)¶ map bo inside a vm, replacing existing mappings
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_bo_va *bo_vabo_va to store the address
uint64_t saddrwhere to map the BO
uint64_t offsetrequested offset in the BO
uint64_t sizeBO size in bytes
uint64_t flagsattributes of pages (read/write/valid/etc.)
Description
Add a mapping of the BO at the specified addr into the VM. Replace existing mappings as we do so.
Object has to be reserved and unreserved outside!
Return
0 for success, error for failure.
-
int
amdgpu_vm_bo_unmap(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, uint64_t saddr)¶ remove bo mapping from vm
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_bo_va *bo_vabo_va to remove the address from
uint64_t saddrwhere the BO is mapped
Description
Remove a mapping of the BO at the specified addr from the VM.
Object has to be reserved and unreserved outside!
Return
0 for success, error for failure.
-
int
amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev, struct amdgpu_vm *vm, uint64_t saddr, uint64_t size)¶ remove all mappings in a specific range
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmVM structure to use
uint64_t saddrstart of the range
uint64_t sizesize of the range
Description
Remove all mappings in a range, split them as appropriate.
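Splitting "as appropriate" means that a mapping which only partly overlaps the cleared range keeps the parts outside it. A minimal sketch of that interval arithmetic follows; the `range` type and `range_punch()` are hypothetical names, and the ranges are assumed to be inclusive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inclusive range [start, last]. */
struct range { uint64_t start, last; };

/*
 * Remove the hole [hs, hl] from mapping m. Writes the surviving pieces
 * (0, 1 or 2 of them) to out[] and returns how many there are.
 */
static int range_punch(struct range m, uint64_t hs, uint64_t hl,
                       struct range out[2])
{
    int n = 0;

    assert(m.start <= m.last && hs <= hl);

    /* No overlap: the mapping survives unchanged. */
    if (hl < m.start || hs > m.last) {
        out[n++] = m;
        return n;
    }
    /* Piece in front of the hole. */
    if (m.start < hs)
        out[n++] = (struct range){ m.start, hs - 1 };
    /* Piece behind the hole. */
    if (m.last > hl)
        out[n++] = (struct range){ hl + 1, m.last };
    return n;
}
```

Clearing the middle of a mapping yields two pieces, clearing one end yields one, and a hole covering the whole mapping yields none.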
Return
0 for success, error for failure.
-
struct amdgpu_bo_va_mapping *
amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm, uint64_t addr)¶ find mapping by address
Parameters
struct amdgpu_vm *vmthe requested VM
uint64_t addrthe address
Description
Find a mapping by its address.
Return
The amdgpu_bo_va_mapping matching addr, or NULL if there is none.
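The lookup contract (return the mapping whose range contains the address, else NULL) can be sketched with a plain array. The `mapping` type and `lookup_mapping()` are hypothetical; the linear scan is chosen only for brevity and says nothing about the data structure the driver uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping record covering the inclusive range [start, last]. */
struct mapping { uint64_t start, last; int id; };

/* Linear lookup: returns the mapping containing addr, or NULL. */
static const struct mapping *lookup_mapping(const struct mapping *maps,
                                            size_t count, uint64_t addr)
{
    assert(maps != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (addr >= maps[i].start && addr <= maps[i].last)
            return &maps[i];
    }
    return NULL;
}
```

Callers must check for NULL before dereferencing the result, since an unmapped address is a normal outcome rather than an error.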
-
void
amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx *ticket)¶ trace all reserved mappings
Parameters
struct amdgpu_vm *vmthe requested vm
struct ww_acquire_ctx *ticketCS ticket
Description
Trace all mappings of BOs reserved during a command submission.
-
void
amdgpu_vm_bo_rmv(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va)¶ remove a bo from a specific vm
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_bo_va *bo_varequested bo_va
Description
Remove bo_va->bo from the requested vm.
Object has to be reserved!
-
bool
amdgpu_vm_evictable(struct amdgpu_bo *bo)¶ check if we can evict a VM
Parameters
struct amdgpu_bo *boA page table of the VM.
Description
Check if it is possible to evict a VM.
-
void
amdgpu_vm_bo_invalidate(struct amdgpu_device *adev, struct amdgpu_bo *bo, bool evicted)¶ mark the bo as invalid
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_bo *boamdgpu buffer object
bool evictedis the BO evicted
Description
Mark bo as invalid.
-
uint32_t
amdgpu_vm_get_block_size(uint64_t vm_size)¶ calculate VM page table size as power of two
Parameters
uint64_t vm_sizeVM size
Return
VM page table as power of two
-
void
amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size, uint32_t fragment_size_default, unsigned max_level, unsigned max_bits)¶ adjust vm size, block size and fragment size
Parameters
struct amdgpu_device *adevamdgpu_device pointer
uint32_t min_vm_sizethe minimum vm size in GB if it is set to auto
uint32_t fragment_size_defaultDefault PTE fragment size
unsigned max_levelmax VMPT level
unsigned max_bitsmax address space size in bits
-
long
amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)¶ wait for the VM to become idle
Parameters
struct amdgpu_vm *vmVM object to wait for
long timeouttimeout to wait for VM to become idle
-
int
amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ initialize a vm instance
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
Description
Init vm fields.
Return
0 for success, error for failure.
-
int
amdgpu_vm_check_clean_reserved(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ check if a VM is clean
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmthe VM to check
Description
Check all entries of the root PD. If any subsequent PDs are allocated, page tables are being created and filled, so the VM is not clean.
Return
0 if this VM is clean
-
int
amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ Turn a GFX VM into a compute VM
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
Description
This only works on GFX VMs that don’t have any BOs added and no page tables allocated yet.
Changes the following VM parameters: use_cpu_for_update and pte_supports_ats.
Reinitializes the page directory to reflect the changed ATS setting.
Return
0 for success, -errno for errors.
-
void
amdgpu_vm_release_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ release a compute vm
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vma vm turned into compute vm by calling amdgpu_vm_make_compute
Description
This is the counterpart of amdgpu_vm_make_compute. It decouples the compute pasid from the vm. Compute should stop using the vm after this call.
-
void
amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)¶ tear down a vm instance
Parameters
struct amdgpu_device *adevamdgpu_device pointer
struct amdgpu_vm *vmrequested vm
Description
Tear down vm. Unbind the VM and remove all bos from the vm bo list
-
void
amdgpu_vm_manager_init(struct amdgpu_device *adev)¶ init the VM manager
Parameters
struct amdgpu_device *adevamdgpu_device pointer
Description
Initialize the VM manager structures
-
void
amdgpu_vm_manager_fini(struct amdgpu_device *adev)¶ cleanup VM manager
Parameters
struct amdgpu_device *adevamdgpu_device pointer
Description
Cleanup the VM manager and free resources.
-
int
amdgpu_vm_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)¶ Manages VMID reservation for vm hubs.
Parameters
struct drm_device *devdrm device pointer
void *datadrm_amdgpu_vm
struct drm_file *filpdrm file pointer
Return
0 for success, -errno for errors.
-
void
amdgpu_vm_get_task_info(struct amdgpu_device *adev, u32 pasid, struct amdgpu_task_info *task_info)¶ Extracts task info for a PASID.
Parameters
struct amdgpu_device *adevdrm device pointer
u32 pasidPASID identifier for VM
struct amdgpu_task_info *task_infotask_info to fill.
-
void
amdgpu_vm_set_task_info(struct amdgpu_vm *vm)¶ Sets VMs task info.
Parameters
struct amdgpu_vm *vmvm for which to set the info
-
bool
amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid, uint64_t addr, bool write_fault)¶ graceful handling of VM faults.
Parameters
struct amdgpu_device *adevamdgpu device pointer
u32 pasidPASID of the VM
uint64_t addrAddress of the fault
bool write_faulttrue if write fault, false if read fault
Description
Try to gracefully handle a VM fault. Return true if the fault was handled and shouldn’t be reported any more.
-
void
amdgpu_debugfs_vm_bo_info(struct amdgpu_vm *vm, struct seq_file *m)¶ print BO info for the VM
Parameters
struct amdgpu_vm *vmRequested VM for printing BO info
struct seq_file *mdebugfs file
Description
Print BO information in debugfs file for the VM
Interrupt Handling¶
Interrupts generated within GPU hardware raise interrupt requests that are passed to the amdgpu IRQ handler, which is responsible for detecting the source and type of the interrupt and dispatching matching handlers. If handling an interrupt requires calling kernel functions that may sleep, processing is dispatched to work handlers.
If MSI functionality is not disabled by module parameter then MSI support will be enabled.
For GPU interrupt sources that may be driven by another driver, IRQ domain support is used (with mapping between virtual and hardware IRQs).
-
void
amdgpu_hotplug_work_func(struct work_struct *work)¶ work handler for display hotplug event
Parameters
struct work_struct *workwork struct pointer
Description
This is the hotplug event work handler (all ASICs). The work gets scheduled from the IRQ handler if there was a hotplug interrupt. It walks through the connector table and calls hotplug handler for each connector. After this, it sends a DRM hotplug event to alert userspace.
This design approach is required in order to defer hotplug event handling from the IRQ handler to a work handler, because the hotplug handler has to use mutexes, which cannot be locked in an IRQ handler (since mutex_lock may sleep).
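The split between a handler that must not sleep and a work function that may can be sketched without any kernel APIs. The following single-threaded simulation uses hypothetical names (`deferred_work`, `work_schedule()`, `work_run_pending()`, `hotplug_work()`); it only illustrates the control flow and has none of the locking or atomicity a real implementation needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical deferred-work record. */
struct deferred_work {
    bool pending;
    void (*func)(struct deferred_work *w);
    int runs;
};

/* "Interrupt" side: must not sleep, so it only marks the work as pending. */
static bool work_schedule(struct deferred_work *w)
{
    if (w->pending)
        return false;   /* already queued, nothing more to do */
    w->pending = true;
    return true;
}

/* "Worker" side: runs later, in a context where sleeping is allowed. */
static void work_run_pending(struct deferred_work *w)
{
    assert(w->func != NULL);
    if (!w->pending)
        return;
    w->pending = false;
    w->func(w);
}

/* Demo work function: a real one could take sleeping locks here. */
static void hotplug_work(struct deferred_work *w)
{
    w->runs++;
}
```

Several "interrupts" arriving before the worker runs collapse into a single execution of the work function, which is usually what a hotplug rescan wants.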
-
void
amdgpu_irq_disable_all(struct amdgpu_device *adev)¶ disable all interrupts
Parameters
struct amdgpu_device *adevamdgpu device pointer
Description
Disable all types of interrupts from all sources.
-
irqreturn_t
amdgpu_irq_handler(int irq, void *arg)¶ IRQ handler
Parameters
int irqIRQ number (unused)
void *argpointer to DRM device
Description
IRQ handler for amdgpu driver (all ASICs).
Return
result of handling the IRQ, as defined by irqreturn_t
-
void
amdgpu_irq_handle_ih1(struct work_struct *work)¶ kick off processing for IH1
Parameters
struct work_struct *workwork structure in struct amdgpu_irq
Description
Kick off processing IH ring 1.
-
void
amdgpu_irq_handle_ih2(struct work_struct *work)¶ kick off processing for IH2
Parameters
struct work_struct *workwork structure in struct amdgpu_irq
Description
Kick off processing IH ring 2.
-
void
amdgpu_irq_handle_ih_soft(struct work_struct *work)¶ kick off processing for ih_soft
Parameters
struct work_struct *workwork structure in struct amdgpu_irq
Description
Kick off processing IH soft ring.
-
bool
amdgpu_msi_ok(struct amdgpu_device *adev)¶ check whether MSI functionality is enabled
Parameters
struct amdgpu_device *adevamdgpu device pointer (unused)
Description
Checks whether MSI functionality has been disabled via module parameter (all ASICs).
Return
true if MSIs are allowed to be enabled or false otherwise
-
int
amdgpu_irq_init(struct amdgpu_device *adev)¶ initialize interrupt handling
Parameters
struct amdgpu_device *adevamdgpu device pointer
Description
Sets up work functions for hotplug and reset interrupts, enables MSI functionality, initializes vblank, hotplug and reset interrupt handling.
Return
0 on success or error code on failure
-
void
amdgpu_irq_fini_sw(struct amdgpu_device *adev)¶ shut down interrupt handling
Parameters
struct amdgpu_device *adevamdgpu device pointer
Description
Tears down work functions for hotplug and reset interrupts, disables MSI functionality, shuts down vblank, hotplug and reset interrupt handling, turns off interrupts from all sources (all ASICs).
-
int
amdgpu_irq_add_id(struct amdgpu_device *adev, unsigned client_id, unsigned src_id, struct amdgpu_irq_src *source)¶ register IRQ source
Parameters
struct amdgpu_device *adevamdgpu device pointer
unsigned client_idclient id
unsigned src_idsource id
struct amdgpu_irq_src *sourceIRQ source pointer
Description
Registers IRQ source on a client.
Return
0 on success or error code otherwise
-
void
amdgpu_irq_dispatch(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih)¶ dispatch IRQ to IP blocks
Parameters
struct amdgpu_device *adevamdgpu device pointer
struct amdgpu_ih_ring *ihinterrupt ring instance
Description
Dispatches IRQ to IP blocks.
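Registration by client id and source id, followed by dispatch to the registered source, can be pictured as a two-level table lookup. The sketch below is a userspace illustration under that assumption; the table sizes, the `irq_src` type, the `-EBUSY` and `-ENOENT` return codes and all function names are hypothetical, not the driver's.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_CLIENTS 4
#define MAX_SRC_IDS 8

/* Hypothetical IRQ source: a handler plus a counter for demonstration. */
struct irq_src {
    int (*process)(struct irq_src *src, unsigned int src_id);
    unsigned int handled;
};

/* Hypothetical lookup table indexed by client id, then source id. */
static struct irq_src *sources[MAX_CLIENTS][MAX_SRC_IDS];

static int irq_add_id(unsigned int client_id, unsigned int src_id,
                      struct irq_src *src)
{
    if (client_id >= MAX_CLIENTS || src_id >= MAX_SRC_IDS || !src)
        return -EINVAL;
    if (sources[client_id][src_id])
        return -EBUSY;   /* assumption: double registration is rejected */
    sources[client_id][src_id] = src;
    return 0;
}

static int irq_dispatch(unsigned int client_id, unsigned int src_id)
{
    struct irq_src *src;

    if (client_id >= MAX_CLIENTS || src_id >= MAX_SRC_IDS)
        return -EINVAL;
    src = sources[client_id][src_id];
    if (!src)
        return -ENOENT;  /* nobody registered for this interrupt */
    assert(src->process != NULL);
    return src->process(src, src_id);
}

/* Demo handler: just counts how often it was called. */
static int count_irq(struct irq_src *src, unsigned int src_id)
{
    (void)src_id;
    src->handled++;
    return 0;
}
```

An interrupt for an unregistered pair is reported to the caller instead of being silently dropped, which makes missing registrations easier to spot.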
-
void
amdgpu_irq_delegate(struct amdgpu_device *adev, struct amdgpu_iv_entry *entry, unsigned int num_dw)¶ delegate IV to soft IH ring
Parameters
struct amdgpu_device *adevamdgpu device pointer
struct amdgpu_iv_entry *entryIV entry
unsigned int num_dwsize of IV
Description
Delegate the IV to the soft IH ring and schedule processing of it. Used if the hardware delegation to IH1 or IH2 doesn’t work for some reason.
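Delegating an interrupt vector amounts to copying its dwords into a software ring and arranging for that ring to be processed later. A minimal sketch, assuming a power-of-two ring with free-running read and write indices; `soft_ring`, `soft_ring_delegate()` and the `work_scheduled` flag are hypothetical and stand in for whatever the driver actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_DW 64u   /* ring size in dwords, must be a power of two */

/* Hypothetical software ring; wptr/rptr are dword counters that only grow. */
struct soft_ring {
    uint32_t buf[RING_DW];
    uint32_t wptr, rptr;
    bool work_scheduled;
};

/* Copy one interrupt vector of num_dw dwords; returns false if it won't fit. */
static bool soft_ring_delegate(struct soft_ring *r, const uint32_t *iv,
                               unsigned int num_dw)
{
    assert(num_dw > 0 && num_dw <= RING_DW);

    if (RING_DW - (r->wptr - r->rptr) < num_dw)
        return false;
    for (unsigned int i = 0; i < num_dw; i++)
        r->buf[(r->wptr + i) & (RING_DW - 1)] = iv[i];
    r->wptr += num_dw;
    r->work_scheduled = true;   /* stand-in for scheduling the processing */
    return true;
}
```

Because the indices are unsigned counters masked only on access, `wptr - rptr` gives the fill level even after the counters wrap.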
-
int
amdgpu_irq_update(struct amdgpu_device *adev, struct amdgpu_irq_src *src, unsigned type)¶ update hardware interrupt state
Parameters
struct amdgpu_device *adevamdgpu device pointer
struct amdgpu_irq_src *srcinterrupt source pointer
unsigned typetype of interrupt
Description
Updates interrupt state for the specific source (all ASICs).
-
void
amdgpu_irq_gpu_reset_resume_helper(struct amdgpu_device *adev)¶ update interrupt states on all sources
Parameters
struct amdgpu_device *adevamdgpu device pointer
Description
Updates state of all types of interrupts on all sources on resume after reset.
-
int
amdgpu_irq_get(struct amdgpu_device *adev, struct amdgpu_irq_src *src, unsigned type)¶ enable interrupt
Parameters
struct amdgpu_device *adevamdgpu device pointer
struct amdgpu_irq_src *srcinterrupt source pointer
unsigned typetype of interrupt
Description
Enables specified type of interrupt on the specified source (all ASICs).
Return
0 on success or error code otherwise
-
int
amdgpu_irq_put(struct amdgpu_device *adev, struct amdgpu_irq_src *src, unsigned type)¶ disable interrupt
Parameters
struct amdgpu_device *adevamdgpu device pointer
struct amdgpu_irq_src *srcinterrupt source pointer
unsigned typetype of interrupt
Description
Disables specified type of interrupt on the specified source (all ASICs).
Return
0 on success or error code otherwise
-
bool
amdgpu_irq_enabled(struct amdgpu_device *adev, struct amdgpu_irq_src *src, unsigned type)¶ check whether interrupt is enabled or not
Parameters
struct amdgpu_device *adevamdgpu device pointer
struct amdgpu_irq_src *srcinterrupt source pointer
unsigned typetype of interrupt
Description
Checks whether the given type of interrupt is enabled on the given source.
Return
true if interrupt is enabled, false if interrupt is disabled or on invalid parameters
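The get/put/enabled trio suggests a use count per interrupt type, where the first user enables the interrupt and the last one disables it. The sketch below illustrates that pattern in userspace C. It is an assumption about the general technique, not a copy of the driver: `irq_source`, the `source_*` functions and the rejection of unbalanced puts are hypothetical, and a real implementation would need atomic counters.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_TYPES 4

/* Hypothetical interrupt source with one use count per interrupt type. */
struct irq_source {
    unsigned int refs[NUM_TYPES];
    bool hw_enabled[NUM_TYPES];   /* stand-in for the hardware enable bit */
};

static int source_get(struct irq_source *s, unsigned int type)
{
    if (!s || type >= NUM_TYPES)
        return -EINVAL;
    if (s->refs[type]++ == 0)
        s->hw_enabled[type] = true;    /* first user: enable */
    return 0;
}

static int source_put(struct irq_source *s, unsigned int type)
{
    if (!s || type >= NUM_TYPES)
        return -EINVAL;
    if (s->refs[type] == 0)
        return -EINVAL;                /* assumption: unbalanced put is an error */
    if (--s->refs[type] == 0)
        s->hw_enabled[type] = false;   /* last user gone: disable */
    return 0;
}

static bool source_enabled(const struct irq_source *s, unsigned int type)
{
    if (!s || type >= NUM_TYPES)
        return false;                  /* invalid parameters count as disabled */
    assert(s->hw_enabled[type] == (s->refs[type] > 0));
    return s->refs[type] > 0;
}
```

With this scheme two independent users can each call get and put without one of them switching the interrupt off under the other.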
-
int
amdgpu_irqdomain_map(struct irq_domain *d, unsigned int irq, irq_hw_number_t hwirq)¶ create mapping between virtual and hardware IRQ numbers
Parameters
struct irq_domain *damdgpu IRQ domain pointer (unused)
unsigned int irqvirtual IRQ number
irq_hw_number_t hwirqhardware irq number
Description
Current implementation assigns simple interrupt handler to the given virtual IRQ.
Return
0 on success or error code otherwise
-
int
amdgpu_irq_add_domain(struct amdgpu_device *adev)¶ create a linear IRQ domain
Parameters
struct amdgpu_device *adevamdgpu device pointer
Description
Creates an IRQ domain for GPU interrupt sources that may be driven by another driver (e.g., ACP).
Return
0 on success or error code otherwise
-
void
amdgpu_irq_remove_domain(struct amdgpu_device *adev)¶ remove the IRQ domain
Parameters
struct amdgpu_device *adevamdgpu device pointer
Description
Removes the IRQ domain for GPU interrupt sources that may be driven by another driver (e.g., ACP).
-
unsigned
amdgpu_irq_create_mapping(struct amdgpu_device *adev, unsigned src_id)¶ create mapping between domain and Linux IRQs
Parameters
struct amdgpu_device *adevamdgpu device pointer
unsigned src_idIH source id
Description
Creates a mapping between a domain IRQ (GPU IH src id) and a Linux IRQ. Use this for components that generate a GPU interrupt, but are driven by a different driver (e.g., ACP).
Return
Linux IRQ
IP Blocks¶
GPUs are composed of IP (intellectual property) blocks. These IP blocks provide various functionalities: display, graphics, video decode, etc. The IP blocks that comprise a particular GPU are listed in the GPU’s respective SoC file. amdgpu_device.c acquires the list of IP blocks for the GPU in use on initialization. It can then operate on this list to perform standard driver operations such as: init, fini, suspend, resume, etc.
IP block implementations are named using the following convention: <functionality>_v<version> (E.g.: gfx_v6_0).
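As a small illustration of the naming convention, the helper below formats such a name from its parts. `ip_block_name()` is a hypothetical function, and splitting the version into a major and a minor number is an assumption based on the gfx_v6_0 example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format "<functionality>_v<major>_<minor>" into buf.
 * Returns what snprintf() returns: the length the full name needs.
 */
static int ip_block_name(char *buf, size_t len, const char *functionality,
                         unsigned int major, unsigned int minor)
{
    assert(buf != NULL && functionality != NULL);
    return snprintf(buf, len, "%s_v%u_%u", functionality, major, minor);
}
```

For example, `ip_block_name(name, sizeof(name), "gfx", 6, 0)` fills `name` with `gfx_v6_0`.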
-
enum
amd_ip_block_type¶ Used to classify IP blocks by functionality.
Constants
AMD_IP_BLOCK_TYPE_COMMONGPU Family
AMD_IP_BLOCK_TYPE_GMCGraphics Memory Controller
AMD_IP_BLOCK_TYPE_IHInterrupt Handler
AMD_IP_BLOCK_TYPE_SMCSystem Management Controller
AMD_IP_BLOCK_TYPE_PSPPlatform Security Processor
AMD_IP_BLOCK_TYPE_DCEDisplay and Compositing Engine
AMD_IP_BLOCK_TYPE_GFXGraphics and Compute Engine
AMD_IP_BLOCK_TYPE_SDMASystem DMA Engine
AMD_IP_BLOCK_TYPE_UVDUnified Video Decoder
AMD_IP_BLOCK_TYPE_VCEVideo Compression Engine
AMD_IP_BLOCK_TYPE_ACPAudio Co-Processor
AMD_IP_BLOCK_TYPE_VCNVideo Core/Codec Next
AMD_IP_BLOCK_TYPE_MESMicro-Engine Scheduler
AMD_IP_BLOCK_TYPE_JPEGJPEG Engine
AMD_IP_BLOCK_TYPE_NUMundescribed
-
struct
amd_ip_funcs¶ general hooks for managing amdgpu IP Blocks
Definition
struct amd_ip_funcs {
char *name;
int (*early_init)(void *handle);
int (*late_init)(void *handle);
int (*sw_init)(void *handle);
int (*sw_fini)(void *handle);
int (*early_fini)(void *handle);
int (*hw_init)(void *handle);
int (*hw_fini)(void *handle);
void (*late_fini)(void *handle);
int (*suspend)(void *handle);
int (*resume)(void *handle);
bool (*is_idle)(void *handle);
int (*wait_for_idle)(void *handle);
bool (*check_soft_reset)(void *handle);
int (*pre_soft_reset)(void *handle);
int (*soft_reset)(void *handle);
int (*post_soft_reset)(void *handle);
int (*set_clockgating_state)(void *handle, enum amd_clockgating_state state);
int (*set_powergating_state)(void *handle, enum amd_powergating_state state);
void (*get_clockgating_state)(void *handle, u32 *flags);
int (*enable_umd_pstate)(void *handle, enum amd_dpm_forced_level *level);
};
Members
nameName of IP block
early_initsets up early driver state (pre sw_init), does not configure hw - Optional
late_initsets up late driver/hw state (post hw_init) - Optional
sw_initsets up driver state, does not configure hw
sw_finitears down driver state, does not configure hw
early_finitears down stuff before dev detached from driver
hw_initsets up the hw state
hw_finitears down the hw state
late_finifinal cleanup
suspendhandles IP specific hw/sw changes for suspend
resumehandles IP specific hw/sw changes for resume
is_idlereturns current IP block idle status
wait_for_idlepoll for idle
check_soft_resetcheck whether the IP block needs a soft reset
pre_soft_resetpre soft reset the IP block
soft_resetsoft reset the IP block
post_soft_resetpost soft reset the IP block
set_clockgating_stateenable/disable cg for the IP block
set_powergating_stateenable/disable pg for the IP block
get_clockgating_stateget current clockgating status
enable_umd_pstateenable UMD powerstate
Description
These hooks provide an interface for controlling the operational state of IP blocks. After acquiring a list of IP blocks for the GPU in use, the driver can make chip-wide state changes by walking this list and making calls to hooks from each IP block. This list is ordered to ensure that the driver initializes the IP blocks in a safe sequence.
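Walking an ordered list of blocks and calling the same hook on each can be sketched as follows. This is a userspace illustration with a trimmed-down, hypothetical hook table (`ip_funcs` with only `hw_init` and `hw_fini`); the demo blocks and the call log are invented for the example, and tearing down in reverse order is an assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down hook table with only two hooks. */
struct ip_funcs {
    const char *name;
    int (*hw_init)(void *handle);
    int (*hw_fini)(void *handle);
};

struct ip_block {
    const struct ip_funcs *funcs;
    int hw_ready;
};

/* Initialize blocks in list order; stop at the first failure. */
static int blocks_hw_init(struct ip_block *blocks, size_t count, void *handle)
{
    for (size_t i = 0; i < count; i++) {
        int r;

        assert(blocks[i].funcs && blocks[i].funcs->hw_init);
        r = blocks[i].funcs->hw_init(handle);
        if (r)
            return r;
        blocks[i].hw_ready = 1;
    }
    return 0;
}

/* Tear down initialized blocks in reverse order (assumption of this sketch). */
static void blocks_hw_fini(struct ip_block *blocks, size_t count, void *handle)
{
    for (size_t i = count; i-- > 0; ) {
        if (!blocks[i].hw_ready)
            continue;
        if (blocks[i].funcs->hw_fini)
            blocks[i].funcs->hw_fini(handle);
        blocks[i].hw_ready = 0;
    }
}

/* Demo hooks: record the call order in a log passed as the handle. */
struct call_log { char text[16]; size_t len; };

static int log_call(void *handle, char c)
{
    struct call_log *l = handle;

    assert(l->len + 1 < sizeof(l->text));
    l->text[l->len++] = c;
    l->text[l->len] = '\0';
    return 0;
}

static int gmc_hw_init(void *h) { return log_call(h, 'G'); }
static int gmc_hw_fini(void *h) { return log_call(h, 'g'); }
static int ih_hw_init(void *h)  { return log_call(h, 'I'); }
static int ih_hw_fini(void *h)  { return log_call(h, 'i'); }

static const struct ip_funcs gmc_demo = { "gmc_demo", gmc_hw_init, gmc_hw_fini };
static const struct ip_funcs ih_demo  = { "ih_demo", ih_hw_init, ih_hw_fini };
```

Because `hw_ready` is only set after a successful `hw_init`, calling `blocks_hw_fini()` after a failed `blocks_hw_init()` unwinds exactly the blocks that were brought up.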