All Questions
2 questions
0 votes, 0 answers, 67 views
Understanding MUBUF instruction in AMD GCN Architecture
I am trying to understand how the MUBUF instruction works using the following kernel. Assume a single wavefront (64 work-items).
According to the ISA reference guide (gcn3-instruction-set-architecture.pdf),
ADDR = Base + ...
3 votes, 3 answers, 363 views
How do I Load Multiple Float4 from Memory to Registers using Inline GCN assembly in AMD HIP?
Motivation
I'm running some micro-benchmarks on AMD GPUs to understand their performance characteristics in order to improve kernel performance. I now suspect that different register allocation and ...