All Questions

0 votes
0 answers
67 views

Understanding MUBUF instruction in AMD GCN Architecture

I am trying to understand how the MUBUF instruction works using the following kernel. Assume only one wavefront (64 work-items). According to the ISA reference guide gcn3-instruction-set-architecture.pdf, ADDR = Base + ...
Lokananda Hari
3 votes
3 answers
363 views

How do I Load Multiple Float4 from Memory to Registers using Inline GCN assembly in AMD HIP?

Motivation: I'm running micro-benchmarks on AMD GPUs to understand their performance characteristics in order to improve kernel performance. I now suspect that different register allocation and ...
比尔盖子