Brave Search

1 week ago - Buffer resources may be passed to AMDGPU buffer intrinsics, and they may be converted to and from i128. Casting a buffer resource to a buffer fat pointer is permitted and adds an offset of 0. Buffer resources can be created from 64-bit pointers (which should be either generic or global) using the llvm.amdgcn.make.buffer.rsrc intrinsic, which takes the pointer, which becomes the base of the resource, the 16-bit stride (and swizzle control) field stored in bits 63:48 of a V#, the 32-bit NumRecords/extent field (bits 95:64), and the 32-bit flags field (bits 127:96).
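The bit ranges quoted in this snippet can be illustrated with a small packing sketch. This is only an illustration of the field layout, not LLVM's implementation: the helper names `pack_buffer_rsrc` and `unpack_buffer_rsrc` are hypothetical, and placing the base address in bits 47:0 is an assumption inferred from the stride field starting at bit 48.

```python
# Hypothetical helpers (not an LLVM API) illustrating the 128-bit V# layout
# described in the snippet above.
# Assumption: the base address occupies bits 47:0, since stride starts at bit 48.

FIELDS = (
    # (name, shift, width in bits)
    ("base", 0, 48),          # assumed: bits 47:0
    ("stride", 48, 16),       # stride (and swizzle control), bits 63:48
    ("num_records", 64, 32),  # NumRecords/extent, bits 95:64
    ("flags", 96, 32),        # flags, bits 127:96
)


def pack_buffer_rsrc(base, stride, num_records, flags):
    """Pack the four fields into one 128-bit integer."""
    values = {
        "base": base,
        "stride": stride,
        "num_records": num_records,
        "flags": flags,
    }
    rsrc = 0
    for name, shift, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        rsrc |= value << shift
    return rsrc


def unpack_buffer_rsrc(rsrc):
    """Split a 128-bit integer back into its four fields."""
    return {
        name: (rsrc >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }
```

Packing and then unpacking returns the original field values, which is a quick way to check that the bit ranges do not overlap.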

AMD ROCm

rocm.docs.amd.com › projects › llvm-project › en › latest › LLVM › llvm › html › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 22.0.0git documentation

The memory model supported is based on the HSA memory model [HSA] which is based in turn on HRF-indirect with scope inclusion [HRF]. The happens-before relation is transitive over the synchronizes-with relation independent of scope and synchronizes-with allows the memory scope instances to be inclusive (see table AMDHSA LLVM Sync Scopes). This is different to the OpenCL [OpenCL] memory model which does not have scope inclusion and requires the memory scopes to exactly match. However, this is conservatively correct for OpenCL. The AMDGPU backend implements the following LLVM IR intrinsics.

GitHub

github.com › llvm › llvm-project › blob › main › llvm › lib › Target › AMDGPU › AMDGPU.td

llvm-project/llvm/lib/Target/AMDGPU/AMDGPU.td at main · llvm/llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. - llvm-project/llvm/lib/Target/AMDGPU/AMDGPU.td at main · llvm/llvm-project

Author llvm

MLIR

mlir.llvm.org › docs › Dialects › AMDGPU

'amdgpu' Dialect - MLIR

In general terms, operations or types should be added to this dialect when they wrap some AMD-specific functionality in a way that makes it work better with the MLIR ecosystem and its types or when those builtins would be needlessly complex to work with (such as if they feature magic constants at the LLVM level). An additional set of operations that belong in this dialect are those that have chipset-specific differences that can be abstracted over in a useful way. ... amdgpu.mfma and amdgpu.wmma exist in order to make a large set of intrinsics more compatible with the MLIR type system (such as by allowing 8-bit float vectors to be passed as vector<N x f8E4M3FN> or vector<N x f8E5M2> instead of as packed 32-bit integers whose element type is controlled by separate operator-level constants).

Phoronix

phoronix.com › news › AMDGPU-LLVM-RDNA3-True16

AMDGPU LLVM Backend Flips On "True16" Mode For All RDNA3 GPUs - Phoronix

A late change to the AMDGPU LLVM compiler back-end that may help efforts particularly for the ROCm compute support on RDNA3 hardware is finally merging support for using true 16-bit instructions and registers on all RDNA3 GPUs. Merged earlier this summer was [AMDGPU][True16] set true16 mode as default on gfx110x.

LLVM

llvm.org › docs › AMDGPUOperandSyntax.html

Syntax of AMDGPU Instruction Operands — LLVM 23.0.0git documentation

2. Conversion. The input value is converted to the expected type as described in the table below. Depending on operand kind, this conversion is performed by either assembler or AMDGPU H/W (or both).

LLVM

releases.llvm.org › 10.0.0 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 10 documentation

The AMDGPU backend supports the following memory models: ... The HSA memory model uses a single happens-before relation for all address spaces (see Address Spaces). ... The OpenCL memory model which has separate happens-before relations for the global and local address spaces. Only a fence specifying both global and local address space, and seq_cst instructions join the relationships. Since the LLVM memfence instruction does not allow an address space to be specified the OpenCL fence has to conservatively assume both local and global address space was specified.

libunwind

bcain-llvm.readthedocs.io › projects › llvm › en › latest › AMDGPUUsage

User Guide for AMDGPU Backend — LLVM 8 documentation

The AMDGPU backend supports the following memory models: ... The HSA memory model uses a single happens-before relation for all address spaces (see Address Spaces). ... The OpenCL memory model which has separate happens-before relations for the global and local address spaces. Only a fence specifying both global and local address space, and seq_cst instructions join the relationships. Since the LLVM memfence instruction does not allow an address space to be specified the OpenCL fence has to conservatively assume both local and global address space was specified.

LLVM

releases.llvm.org › 6.0.0 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 6 documentation

The AMDGPU backend supports the following memory models: ... The HSA memory model uses a single happens-before relation for all address spaces (see Address Spaces). ... The OpenCL memory model which has separate happens-before relations for the global and local address spaces. Only a fence specifying both global and local address space, and seq_cst instructions join the relationships. Since the LLVM memfence instruction does not allow an address space to be specified the OpenCL fence has to conservatively assume both local and global address space was specified.

LLVM

releases.llvm.org › 7.1.0 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 7 documentation

The AMDGPU backend supports the following memory models: ... The HSA memory model uses a single happens-before relation for all address spaces (see Address Spaces). ... The OpenCL memory model which has separate happens-before relations for the global and local address spaces. Only a fence specifying both global and local address space, and seq_cst instructions join the relationships. Since the LLVM memfence instruction does not allow an address space to be specified the OpenCL fence has to conservatively assume both local and global address space was specified.

LLVM

releases.llvm.org › 12.0.1 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 12 documentation

July 9, 2021 - The memory model supported is based on the HSA memory model [HSA] which is based in turn on HRF-indirect with scope inclusion [HRF]. The happens-before relation is transitive over the synchronizes-with relation independent of scope and synchronizes-with allows the memory scope instances to be inclusive (see table AMDHSA LLVM Sync Scopes). This is different to the OpenCL [OpenCL] memory model which does not have scope inclusion and requires the memory scopes to exactly match. However, this is conservatively correct for OpenCL. The AMDGPU backend implements the following LLVM IR intrinsics.

LLVM

releases.llvm.org › 9.0.0 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 9 documentation

The AMDGPU backend supports the following memory models: ... The HSA memory model uses a single happens-before relation for all address spaces (see Address Spaces). ... The OpenCL memory model which has separate happens-before relations for the global and local address spaces. Only a fence specifying both global and local address space, and seq_cst instructions join the relationships. Since the LLVM memfence instruction does not allow an address space to be specified the OpenCL fence has to conservatively assume both local and global address space was specified.

LLVM

prereleases.llvm.org › 18.1.0 › rc3 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 18.1.0rc documentation

The memory model supported is based on the HSA memory model [HSA] which is based in turn on HRF-indirect with scope inclusion [HRF]. The happens-before relation is transitive over the synchronizes-with relation independent of scope and synchronizes-with allows the memory scope instances to be inclusive (see table AMDHSA LLVM Sync Scopes). This is different to the OpenCL [OpenCL] memory model which does not have scope inclusion and requires the memory scopes to exactly match. However, this is conservatively correct for OpenCL. The AMDGPU backend implements the following LLVM IR intrinsics.

LLVM

releases.llvm.org › 5.0.2 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 5 documentation

The AMDGPU memory model supports both the HSA [HSA] memory model, and the OpenCL [OpenCL] memory model. The HSA memory model uses a single happens-before relation for all address spaces (see Address Spaces). The OpenCL memory model has separate happens-before relations for the global and local address spaces, and only a fence specifying both global and local address space joins the relationships. Since the LLVM memfence instruction does not allow an address space to be specified the OpenCL fence has to conservatively assume both local and global address space was specified.

LLVM

llvm.org › docs › AMDGPUInstructionSyntax.html

AMDGPU Instruction Syntax — LLVM 22.0.0git documentation

LLVM

releases.llvm.org › 21.1.2 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 21.1.2 documentation

The memory model supported is based on the HSA memory model [HSA] which is based in turn on HRF-indirect with scope inclusion [HRF]. The happens-before relation is transitive over the synchronizes-with relation independent of scope and synchronizes-with allows the memory scope instances to be inclusive (see table AMDHSA LLVM Sync Scopes). This is different to the OpenCL [OpenCL] memory model which does not have scope inclusion and requires the memory scopes to exactly match. However, this is conservatively correct for OpenCL. The AMDGPU backend implements the following LLVM IR intrinsics.

LLVM

releases.llvm.org › 11.0.0 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 11 documentation

The memory model supported is based on the HSA memory model [HSA] which is based in turn on HRF-indirect with scope inclusion [HRF]. The happens-before relation is transitive over the synchronizes-with relation independent of scope and synchronizes-with allows the memory scope instances to be inclusive (see table AMDHSA LLVM Sync Scopes). This is different to the OpenCL [OpenCL] memory model which does not have scope inclusion and requires the memory scopes to exactly match. However, this is conservatively correct for OpenCL. The AMDGPU backend implements the following LLVM IR intrinsics.

Spack

packages.spack.io › package.html

llvm-amdgpu

Contribute to Spack Packages Repository on GitHub

LLVM

releases.llvm.org › 13.0.0 › docs › AMDGPUUsage.html

User Guide for AMDGPU Backend — LLVM 13 documentation

October 19, 2021 - The memory model supported is based on the HSA memory model [HSA] which is based in turn on HRF-indirect with scope inclusion [HRF]. The happens-before relation is transitive over the synchronizes-with relation independent of scope and synchronizes-with allows the memory scope instances to be inclusive (see table AMDHSA LLVM Sync Scopes). This is different to the OpenCL [OpenCL] memory model which does not have scope inclusion and requires the memory scopes to exactly match. However, this is conservatively correct for OpenCL. The AMDGPU backend implements the following LLVM IR intrinsics.