Newest 'amd-gpu' Questions

0 votes

0 answers

44 views

Enormous number of amdxx64.dll threads

I'm sure I provoke it somehow, but maybe somebody knows what causes it and how to deal with it. The only practical problem I have now is that I cannot find my own threads because the number of these ...

dev_null

1,997

asked Dec 6 at 16:51

0 votes

0 answers

131 views

Load kernel 1 failed in NBMiner

I'm trying to use NBMiner_Win but I get the following error: [21:21:46] ERROR - Load kernel 1 failed. [21:21:39] INFO - |ID|PCI| CC| Memory| CU| RamType| RamVendor| [21:21:46] ERROR - Error loading kernel for ...

Laurens Gouwy

23

asked Aug 4 at 19:28

0 votes

1 answer

57 views

Can't seem to achieve anywhere near my GPU global memory bandwidth in OpenCL

Using OpenCL on my AMD GPU, I've only been able to achieve 4% (15 GB/sec) of the GPU global-memory bandwidth reported by clpeak (375 GB/sec). Before resigning myself to this, I want to make sure I'm ...

Kensmosis

103

asked Jul 21 at 17:36

0 votes

0 answers

67 views

Understanding MUBUF instruction in AMD GCN Architecture

I am trying to understand how MUBUF instruction works using the following kernel. Assume only 1 wavefront (64 WIs). According to ISA ref guide gcn3-instruction-set-architecture.pdf, ADDR = Base + ...

Lokananda Hari

1

asked Jun 16 at 17:56

0 votes

0 answers

23 views

How to run high graphics demanding programs on dedicated gpu in laptop

My laptop specs are: AMD Ryzen 5000 series, Nvidia RTX 3050 Laptop GPU, 16 GB RAM, 512 GB SSD, Asus Zenbook 14. The problem is that no matter what I do, in Nvidia control panel and add the programs in ...

BOSS

23

asked May 31 at 17:05

1 vote

0 answers

73 views

Understanding how division works on AMD GPUs

I am trying to understand how AMD GPUs perform integer division, so I disassembled this program: extern "C" __attribute__((global))void __attribute__((amdgpu_flat_work_group_size(1, 1)))test(...

SzymonO

586

asked May 13 at 16:33

0 votes

0 answers

354 views

RuntimeError. Large language model with DirectML on win11+amd gpu

I followed "Enable PyTorch with DirectML on Windows" and can use the AMD GPU to run simple code: import torch import torch_directml dml = torch_directml.device() #Test a simple operation x = torch.tensor([1....

HongE

1

asked Apr 18 at 20:26

0 votes

0 answers

329 views

Android emulator failed to start. Failed to process .ini file

I am getting the following warning when trying to start my Android emulator: Failed to process .ini file C:\Users\shopr\.android\avd\..\avd\Nexus_S_API_25.avd\quickbootChoice.ini for reading. error on ...

Robin Ks

1

asked Apr 13 at 0:24

0 votes

0 answers

92 views

Intel ARC equivalent for NvOptimusEnablement (Nvidia) & AmdPowerXpressRequestHighPerformance (AMD)?

On Intel integrated + NVIDIA dual-GPU "Optimus" setups, an application can export NvOptimusEnablement as explained in OptimusRenderingPolicies.pdf. This option allows an application to ensure the ...

Brady Jessup

194

asked Apr 11 at 20:37

0 votes

1 answer

145 views

OpenCL dynamic parallelism enqueue_kernel() functionality

I am trying to use the functionality provided by OpenCL 2.0 to call kernels from within kernels but cannot seem to get it working. For instance I have these kernels: __kernel void test2(){ printf(&...

Iordan Bogdan

43

asked Mar 26 at 21:44

1 vote

0 answers

524 views

WARNING: amdgpu dkms failed for running kernel

I am new to Linux and I am trying to install AMD graphics drivers. It gives me this error; can someone help me? mawo@DESKTOP-4L5PH5G:~/Downloads$ amdgpu-install -y --usecase=graphics INFO: i386 ...

Mawo

11

asked Mar 15 at 19:21

0 votes

0 answers

135 views

Compiling hip code using hipcc -O0 for AMD GPU

I'm using AMD RADEON RX7600. I've been trying to compile HIP code using the precompiled hipcc compiler for a few weeks now. When compiled with O[1/2/3] everything works fine. When I encountered some ...

The Noderinator

1

asked Mar 12 at 9:45

0 votes

2 answers

693 views

Accelerated PyTorch for Macbook with AMD GPUS

I followed the instructions from the Apple website (https://developer.apple.com/metal/pytorch/) and when I verified MPS support with its Python script, it just gave me back something I do not understand. (It'...

Pica

21

asked Jan 13 at 13:59

0 votes

0 answers

131 views

Blender and other 3D applications don't launch

System specs: Arch, Ryzen 5 5600x, Radeon 570X, 16 GB RAM. I reinstalled Arch a month ago and tried to open Blender, which failed with this message: Writing: /tmp/blender.crash.txt Segmentation fault (core ...

Kinetic

21

asked Nov 24, 2023 at 12:06

0 votes

0 answers

255 views

How to compile clang llvm to amd gcn on linux ubuntu

I've been trying for about two days to compile clang-llvm for amd gcn and I'm stuck. My goal here is to be able to compile a hip program using the triple amdgcn-amd-amdhsa. I cloned the project (llvm ...

The Noderinator

1

asked Nov 22, 2023 at 16:20

2 votes

0 answers

1k views

[drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx_0.0.0 timeout, signaled seq=1552686, emitted seq=1552688

I usually have the problem of the GPU resetting in DirectX 11 and DirectX 12 games run with Wine. How can I solve this problem? Can you help me please? lspci -k | grep -A 3 -E "(...

Alper Okur

21

asked Nov 7, 2023 at 11:34

-1 votes

1 answer

432 views

libc6-dev/libc-dev : "Unable to fix problems, bad packets are in “keep as is” mode."

First, sorry: since my terminal is in French, I'll use Google Translate for all the outputs, so the terminology may not be perfectly accurate. I am running Ubuntu 20.04.6 LTS and I want to use ...

Willy Lutz

314

asked Oct 23, 2023 at 8:58

3 votes

3 answers

363 views

How do I Load Multiple Float4 from Memory to Registers using Inline GCN assembly in AMD HIP?

Motivation: I'm doing some micro-benchmarks on AMD GPUs to understand their performance characteristics in order to improve kernel performance. I'm now suspecting that different register allocation and ...

比尔盖子

3,507

asked Sep 17, 2023 at 6:07

0 votes

1 answer

1k views

Running pytorch or tensorflow in AMD APU

I have a machine with an AMD Radeon APU (gfx90c). I am using Arch Linux. I have been battling to get PyTorch and TensorFlow to use the APU, but so far no results. I already tried to build the arch ...

João Santos

23

asked Sep 12, 2023 at 9:42

0 votes

1 answer

67 views

How can I make fragment_shader have an output to stencil_attachment?

I want to implement a pipeline that can copy a stencil image to another stencil image. I want to use texelFetch to read the src image as a sampler2D in GLSL, and output to the stencil image attachment. Are ...

Lucas

67

asked Jul 28, 2023 at 9:38

-1 votes

2 answers

188 views

GLSL Error: '##' : not supported for these tokens

I have an AMD Radeon Graphics (Ryzen 7000) GPU and I am creating a program using OpenGL. I wrote the shaders in GLSL version 330 with the extension GL_ARB_shading_language_420pack enabled. However, ...

AboodXD

158

asked Jun 26, 2023 at 10:55

0 votes

0 answers

237 views

hipMemcpy fails to copy

I am trying to use <hip/hip_runtime.h> library, but I keep getting gibberish when exchanging data. here is my code: #include <hip/hip_runtime.h> #include <iostream> int main() { ...

Nika Tark

5

asked Jun 22, 2023 at 19:49

0 votes

1 answer

157 views

Linux Stripes on Screen

I was building an Android application using Android Studio Giraffe. Suddenly I saw some colorful squares and the screen froze. I restarted it and it gave me: kfd kfd amdgpu oland not supported in ...

Cagdas

111

asked Jun 19, 2023 at 22:47

1 vote

1 answer

96 views

Are there any separate Texture Caches in AMD GPUs of xDNA architecture?

I have a testbench that captures data about several types of caches, including texture cache. Since I decided to verify whether acquired data is correct or not, I would like to know if AMD GPUs of ...

Max Azatian

73

asked Jun 11, 2023 at 11:35

-3 votes

1 answer

199 views

Personal GPU running while using remote Desktop

I am connected to a remote desktop so I can train my model using its GPU (10 GB). However, I noticed after some time that my local GPU (AMD Intel on my local desktop) is being used as well for about 35%...

Rimee

9

asked Jun 2, 2023 at 9:20

1 vote

1 answer

3k views

RuntimeError: Cannot set version_counter for inference - Trying DirectML in AI Project for AMD

I am currently converting a PyTorch CUDA project (https://github.com/suno-ai/bark) to DirectML so I can use my AMD GPU RX6700xt, and I am having the problem RuntimeError: Cannot set version_counter for ...

Milor123

555

asked May 8, 2023 at 21:26

5 votes

2 answers

280 views

Interpreting AMD RDNA3 instruction names

I am trying to analyze my OpenCL kernel as compiled for an RDNA3 AMD GPU. I use the Radeon GPU Analyzer for that. When I load my OpenCL kernel in the analyzer, it displays the assembly instruction for ...

Bram

8,173

asked May 1, 2023 at 0:09

0 votes

1 answer

757 views

How do I trace power consumption for Radeon GPUs?

I developed an OpenCL application and now I'm trying to assess the energy consumption of the same application running on two different GPUs: one NVIDIA Titan V and one Radeon RX6900 XT. The idea is to ...

Iago Storch

68

asked Apr 10, 2023 at 16:00

0 votes

0 answers

76 views

opencl crashes when using anything but CL_MEM_USE_HOST_PTR

I have a problem with my code trying to utilize the OpenCL capabilities of my GPU. Specifically, I am developing this project: https://github.com/alekstheod/tnnlib The OpenCL-related code is located here:...

AlexTheo

4,164

asked Mar 26, 2023 at 21:31

0 votes

1 answer

5k views

How to run stable-diffusion-webui with directml (for AMD GPU) [closed]

My laptop is a GPD Win Max 2 running Windows 11. I have successfully installed stable-diffusion-webui-directml. It can use the AMD GPU to generate one 512x512 image in about 2.5 minutes. So that is not the CPU mode'...

Fenix Lam

386

asked Mar 18, 2023 at 10:25

2 votes

1 answer

6k views

AMD ROCm with Pytorch on Navi10 (RX 5700 XT) and HSA_OVERRIDE_GFX_VERSION=10.3.0 fails

I saw AMD ROCm with Pytorch on Navi10 (RX 5700 / RX 5700 XT) recommending to use HSA_OVERRIDE_GFX_VERSION=10.3.0 to run Pytorch with ROCm on a 5700XT card, but I couldn't get it to work. My steps: $ ...

Eric Armbruster

29

asked Mar 6, 2023 at 10:20

2 votes

1 answer

687 views

How to use AMD GPU to process data with tensorflow and keras on Windows 11 pc

I'm unable to use my AMD GPU for work with data in Python code and using TensorFlow and Keras in Windows 11 Pro. I already tried some things like Intel plaidML and other third-party software but ...

Walgo2

21

asked Feb 22, 2023 at 23:52

1 vote

1 answer

330 views

Resolving "No Op-Kernel was registered to support Op" on AMD GPU

model = Sequential([ LSTM(units=50,return_sequences=True,input_shape=(lookback,1)), Dense(units=1) ]) model.compile(loss=keras.losses.MSE,optimizer=keras.optimizers.Adam(),metrics=['accuracy'])...

Lakshit Karsoliya

25

asked Feb 22, 2023 at 11:24

-1 votes

1 answer

186 views

Is there any tool to change spir-v shader, so I can insert and delete some instructions

I want to edit spv directly. Is there any tool to change a SPIR-V shader, so I can insert and delete some instructions?

bodino

15

asked Feb 9, 2023 at 3:18

0 votes

1 answer

99 views

How Vulkan application find the address of the function in driver

I am studying Vulkan driver code. I want to know how the application calls the driver function, because the function name is different.

bodino

15

asked Dec 22, 2022 at 3:15

0 votes

1 answer

186 views

OpenACC code runs 17036.0939901 times faster on Nvidia V100 GPU than on AMD MI250 GPU

I am trying to understand why my OpenACC code runs 17036.0939901 times faster on Nvidia V100 GPU than on AMD MI250 GPU. It is a simple matrix-matrix multiplication code. Here is the output which I ...

Ilkhom Abdurakhmanov

13

asked Dec 1, 2022 at 4:30

1 vote

1 answer

239 views

what algorithm 8xmsaa & 16xmsaa use to generate the position of 8 points& 16 points

I want to know what pattern the sample positions follow. I can't find the rule behind these eight points.

bodino

15

asked Nov 23, 2022 at 2:43

0 votes

2 answers

4k views

What platform to use for YOLO output when using AMD GPU?

I have long been tormented by this question and ask for your advice on which direction to take. Objective: to develop a universal application with YOLO on Windows, which can use the computing power of AMD/Nvidia/Intel ...

xeetu

1

asked Nov 22, 2022 at 21:44

1 vote

1 answer

110 views

Using OpenCL to get the energy consumption of my OpenCL Kernel

I am trying to estimate the power consumption of my OpenCL kernel running on an AMD Radeon RX Vega GPU. Is there a way to access the power consumption through OpenCL directly? I tried using profilers but ...

Lama RL

31

asked Nov 21, 2022 at 10:42

3 votes

1 answer

579 views

OpenMP offloading on GPU, 'simd' specificities

I was wondering how to interpret the following OpenMP constructs: #pragma omp target teams distribute parallel for for(int i = 0; i < N; ++i) { // compute } #pragma omp target teams distribute ...

Etienne M

656

asked Nov 10, 2022 at 13:22

2 votes

1 answer

191 views

How to dispatch Kernel to AMD integrated GPU?

AMD provides lots of resources regarding what instructions can be run on their integrated GPUs: http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf ...

KGM

294

asked Sep 27, 2022 at 12:59

2 votes

0 answers

276 views

DirectML InvalidArgumentError: Graph execution error: No OpKernel was registered to support Op 'CudnnRNN'

I was trying to use DirectML to run TensorFlow on my AMD RX 580 graphics card, but I'm having a really hard time pulling this off. I'm getting this error: --------------------------------------------...

Rodrigo Oviedo

41

asked Sep 9, 2022 at 15:24

8 votes

3 answers

15k views

AMD ROCm with Pytorch on Navi10 (RX 5700 / RX 5700 XT)

I am one of those miserable creatures who own an AMD GPU (RX 5700, Navi10). I want to use up-to-date PyTorch libraries to do some Deep Learning on my local machine and stop using cloud instances. I saw ...

makesense

115

asked Aug 4, 2022 at 0:32

1 vote

1 answer

675 views

How can I determine the generation/codename of an AMD GPU in Linux?

I want to detect the AMD GPU generation in Python code. My case is that to run a specific application (DaVinci Resolve), it is required to use AMDGPU-PRO drivers for GPU cards before Vega. And AMDGPU-...

Ashark

843

asked Jul 10, 2022 at 10:57

0 votes

1 answer

294 views

How to get 64-bit addressing, full RAM access using OpenCL with 2019 MacBook Pro 16" Intel/AMD

I have a 2019 MacBook Pro 16". It has an Intel Core i9, 8-core processor and an AMD Radeon Pro 5500M with 8 GB GPU RAM. I have the laptop dual booting Mac OS 12.4 and Windows 11. Running clinfo ...

Vince W.

3,775

asked Jul 6, 2022 at 15:14

2 votes

1 answer

209 views

How to optimize SYCL kernel

I'm studying SYCL at university and I have a question about the performance of some code. In particular, I have this C/C++ code: And I need to translate it into a SYCL kernel with parallelization, and I did this:...

user17271389

asked Jun 30, 2022 at 15:16

0 votes

1 answer

521 views

OpenCL Kernel and traditional loops

I'm studying OpenCL and I don't understand the relationship between a traditional loop in C/C++ code and kernel code. Just to be clear, a situation like this: So my question is: In the traditional ...

user17271389

asked Jun 30, 2022 at 8:11

1 vote

1 answer

404 views

Work-item branch divergence in OpenCL, how does it work?

I'm studying something about OpenCL and I don't understand very well the concept of "work-item divergence or Divergent Control Flow". As we can see in the picture below, there are some warp ...

user17271389

asked Jun 29, 2022 at 16:11

1 vote

1 answer

1k views

Open CL programming with G++ in Windows

I am trying to write OpenCL programs in C++ using G++ compiler in Windows 10 but I am not able to find any SDK for my work. Nvidia CUDA requires Visual Studio compilers to work and AMD AMP SDK seems ...

Gokul Raj V

27

asked Jun 19, 2022 at 16:03

6 votes

2 answers

26k views

How to make AMD GPU available by WSL for use with DALL-E Playground AI Server

I'm trying to run and deploy the Dalle Playground on my local machine using an AMD GPU, I'm on Windows 11 with a WSL instance running. Link to Dalle Playground repo System OS: Windows 11 Pro - ...

Joel Gray

309

asked Jun 12, 2022 at 0:52
