0 votes
0 answers
44 views

Enormous number of amdxx64.dll threads

I'm sure I provoke it somehow, but maybe somebody knows what causes it and how to deal with it. The only practical problem I have now is that I cannot find my own threads because the number of these ...
dev_null's user avatar
  • 1,997
0 votes
0 answers
131 views

Load kernel 1 failed in NBMiner

I'm trying to use NBMiner_Win but I get the following error: [21:21:46] ERROR - Load kernel 1 failed. [21:21:39] INFO - |ID|PCI| CC| Memory| CU| RamType| RamVendor| [21:21:46] ERROR - Error loading kernel for ...
Laurens Gouwy's user avatar
0 votes
1 answer
57 views

Can't seem to achieve anywhere near my GPU global memory bandwidth in OpenCL

Using OpenCL on my AMD GPU, I've only been able to achieve 4% (15 GB/sec) of the GPU global-memory bandwidth reported by clpeak (375 GB/sec). Before resigning myself to this, I want to make sure I'm ...
Kensmosis's user avatar
  • 103
0 votes
0 answers
67 views

Understanding MUBUF instruction in AMD GCN Architecture

I am trying to understand how the MUBUF instruction works using the following kernel. Assume only 1 wavefront (64 WIs). According to the ISA reference guide gcn3-instruction-set-architecture.pdf, ADDR = Base + ...
Lokananda Hari's user avatar
0 votes
0 answers
23 views

How to run high graphics demanding programs on dedicated gpu in laptop

My laptop specs are: AMD Ryzen 5000 series, Nvidia RTX 3050 Laptop GPU, 16 GB RAM, 512 GB SSD, Asus Zenbook 14. The problem is that no matter what I do in the Nvidia Control Panel and add the programs in ...
BOSS's user avatar
  • 23
1 vote
0 answers
73 views

Understanding how division works on AMD GPUs

I am trying to understand how AMD GPUs perform integer division, so I disassembled this program: extern "C" __attribute__((global))void __attribute__((amdgpu_flat_work_group_size(1, 1)))test(...
SzymonO's user avatar
  • 586
0 votes
0 answers
354 views

RuntimeError. Large language model with DirectML on win11+amd gpu

I followed "Enable PyTorch with DirectML on Windows" and can use the AMD GPU to run simple code: import torch import torch_directml dml = torch_directml.device() #Test a simple operation x = torch.tensor([1....
HongE's user avatar
  • 1
0 votes
0 answers
329 views

Android emulator failed to start. Failed to process .ini file

I am getting the following warning when trying to start my Android emulator: Failed to process .ini file C:\Users\shopr\.android\avd\..\avd\Nexus_S_API_25.avd\quickbootChoice.ini for reading. error on ...
Robin Ks's user avatar
0 votes
0 answers
92 views

Intel ARC equivalent for NvOptimusEnablement (Nvidia) & AmdPowerXpressRequestHighPerformance (AMD)?

On Intel integrated + NVIDIA dual-GPU "Optimus" setups, an application can export NvOptimusEnablement as explained in OptimusRenderingPolicies.pdf. This option allows an application to ensure the ...
Brady Jessup's user avatar
0 votes
1 answer
145 views

OpenCL dynamic parallelism enqueue_kernel() functionality

I am trying to use the functionality provided by OpenCL 2.0 to call kernels from within kernels but cannot seem to get it working. For instance I have these kernels: __kernel void test2(){ printf(&...
Iordan Bogdan's user avatar
1 vote
0 answers
524 views

WARNING: amdgpu dkms failed for running kernel

I am new to Linux and I am trying to install the AMD graphics drivers, but it gives me this error. Can someone help me? mawo@DESKTOP-4L5PH5G:~/Downloads$ amdgpu-install -y --usecase=graphics INFO: i386 ...
Mawo's user avatar
  • 11
0 votes
0 answers
135 views

Compiling hip code using hipcc -O0 for AMD GPU

I'm using an AMD Radeon RX 7600. I've been trying to compile HIP code using the precompiled hipcc compiler for a few weeks now. When compiled with O[1/2/3] everything works fine. When I encountered some ...
The Noderinator's user avatar
0 votes
2 answers
693 views

Accelerated PyTorch for Macbook with AMD GPUS

I followed the instructions from the Apple website (https://developer.apple.com/metal/pytorch/) and when I verified MPS support with its Python script, it just gave me back something I do not understand. (It'...
Pica's user avatar
  • 21
0 votes
0 answers
131 views

Blender and other 3D applications don't launch

System specs: Arch, Ryzen 5 5600X, Radeon 570X, 16 GB RAM. I reinstalled Arch a month ago and tried to open Blender, and got this message: Writing: /tmp/blender.crash.txt Segmentation fault (core ...
Kinetic's user avatar
  • 21
0 votes
0 answers
255 views

How to compile clang llvm to amd gcn on linux ubuntu

I've been trying for about two days to compile clang-llvm for AMD GCN and I'm stuck. My goal here is to be able to compile a HIP program using the triple amdgcn-amd-amdhsa. I cloned the project (llvm ...
The Noderinator's user avatar
2 votes
0 answers
1k views

[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1552686, emitted seq=1552688

I regularly run into GPU resets in DirectX 11 and DirectX 12 games running under Wine. How can I solve this problem? Can you help me, please? lspci -k | grep -A 3 -E "(...
Alper Okur's user avatar
-1 votes
1 answer
432 views

libc6-dev/libc-dev : "Unable to fix problems, bad packets are in “keep as is” mode."

First, sorry: since my terminal is in French, I'll use Google Translate for all the outputs, so the terminology may not be perfectly accurate. I am running Ubuntu 20.04.6 LTS and I want to use ...
Willy Lutz's user avatar
3 votes
3 answers
363 views

How do I Load Multiple Float4 from Memory to Registers using Inline GCN assembly in AMD HIP?

Motivation: I'm doing some micro-benchmarks on AMD GPUs to understand their performance characteristics in order to improve kernel performance. I'm now suspecting that different register allocation and ...
比尔盖子's user avatar
  • 3,507
0 votes
1 answer
1k views

Running pytorch or tensorflow in AMD APU

I have a machine with an AMD Radeon APU (gfx90c), and I am using Arch Linux. I have been battling to get PyTorch and TensorFlow to use the APU, but so far with no results. I already tried to build the arch ...
João Santos's user avatar
0 votes
1 answer
67 views

How can I make fragment_shader have an output to stencil_attachment?

I want to implement a pipeline that can copy a stencil image to a stencil image. I want to use texelFetch to read the source image as a sampler2D in GLSL, and output to the stencil image attachment. Are ...
Lucas's user avatar
  • 67
-1 votes
2 answers
188 views

GLSL Error: '##' : not supported for these tokens

I have an AMD Radeon Graphics (Ryzen 7000) GPU and I am creating a program using OpenGL. I wrote the shaders in GLSL version 330 with the extension GL_ARB_shading_language_420pack enabled. However, ...
AboodXD's user avatar
  • 158
0 votes
0 answers
237 views

hipMemcpy fails to copy

I am trying to use the <hip/hip_runtime.h> library, but I keep getting gibberish when exchanging data. Here is my code: #include <hip/hip_runtime.h> #include <iostream> int main() { ...
Nika Tark's user avatar
0 votes
1 answer
157 views

Linux Stripes on Screen

I was building an Android application using Android Studio Giraffe. Suddenly I saw some colorful squares and the screen froze. I restarted it and it gave me: kfd kfd amdgpu oland not supported in ...
Cagdas's user avatar
  • 111
1 vote
1 answer
96 views

Are there any separate Texture Caches in AMD GPUs of xDNA architecture?

I have a testbench that captures data about several types of caches, including the texture cache. Since I decided to verify whether the acquired data is correct, I would like to know if AMD GPUs of ...
Max Azatian's user avatar
-3 votes
1 answer
199 views

Personal GPU running while using remote Desktop

I am connected to a remote desktop so I can train my model using its GPU (10 GB). However, I noticed after some time that my local GPU (AMD Intel on my local desktop) is being used as well for about 35%...
Rimee's user avatar
  • 9
1 vote
1 answer
3k views

RuntimeError: Cannot set version_counter for inference - Trying DirectML in AI Project for AMD

I am currently converting a PyTorch CUDA project (https://github.com/suno-ai/bark) to DirectML so I can use my AMD RX 6700 XT GPU, and I am running into the problem RuntimeError: Cannot set version_counter for ...
Milor123's user avatar
  • 555
5 votes
2 answers
280 views

Interpreting AMD RDNA3 instruction names

I am trying to analyze my OpenCL kernel as compiled for an RDNA3 AMD GPU. I use the Radeon GPU Analyzer for that. When I load my OpenCL kernel in the analyzer, it displays the assembly instruction for ...
Bram's user avatar
  • 8,173
0 votes
1 answer
757 views

How do I trace power consumption for Radeon GPUs?

I developed an OpenCL application and now I'm trying to assess the energy consumption of the same application running on two different GPUs: one NVIDIA Titan V and one Radeon RX6900 XT. The idea is to ...
Iago Storch's user avatar
0 votes
0 answers
76 views

opencl crashes when using anything but CL_MEM_USE_HOST_PTR

I have a problem with my code that tries to use the OpenCL capabilities of my GPU. Specifically, I am developing this project: https://github.com/alekstheod/tnnlib The OpenCL-related code is located here:...
AlexTheo's user avatar
  • 4,164
0 votes
1 answer
5k views

How to run stable-diffusion-webui with directml (for AMD GPU) [closed]

My laptop is a GPD Win Max 2 running Windows 11. I have successfully installed stable-diffusion-webui-directml. It can use the AMD GPU to generate one 512x512 image in about 2.5 minutes. So that is not the CPU mode'...
Fenix Lam's user avatar
  • 386
2 votes
1 answer
6k views

AMD ROCm with Pytorch on Navi10 (RX 5700 XT) and HSA_OVERRIDE_GFX_VERSION=10.3.0 fails

I saw AMD ROCm with Pytorch on Navi10 (RX 5700 / RX 5700 XT) recommending to use HSA_OVERRIDE_GFX_VERSION=10.3.0 to run Pytorch with ROCm on a 5700XT card, but I couldn't get it to work. My steps: $ ...
Eric Armbruster's user avatar
2 votes
1 answer
687 views

How to use AMD GPU to process data with tensorflow and keras on Windows 11 pc

I'm unable to use my AMD GPU to process data in Python code using TensorFlow and Keras on Windows 11 Pro. I already tried some things like Intel PlaidML and other third-party software but ...
Walgo2 's user avatar
1 vote
1 answer
330 views

Resolving "No Op-Kernel was registered to support Op" on AMD GPU

model = Sequential([ LSTM(units=50,return_sequences=True,input_shape=(lookback,1)), Dense(units=1) ]) model.compile(loss=keras.losses.MSE,optimizer=keras.optimizers.Adam(),metrics=['accuracy'])...
Lakshit Karsoliya's user avatar
-1 votes
1 answer
186 views

Is there any tool to change a SPIR-V shader, so I can insert and delete some instructions?

I want to edit the SPV directly. Is there any tool for changing a SPIR-V shader so that I can insert and delete some instructions?
bodino's user avatar
  • 15
0 votes
1 answer
99 views

How does a Vulkan application find the address of a function in the driver?

I am studying Vulkan driver code. I want to know how the application calls the driver function, because the function name is different.
bodino's user avatar
  • 15
0 votes
1 answer
186 views

OpenACC code runs 17036.0939901 times faster on Nvidia V100 GPU than on AMD MI250 GPU

I am trying to understand why my OpenACC code runs 17036.0939901 times faster on Nvidia V100 GPU than on AMD MI250 GPU. It is a simple matrix-matrix multiplication code. Here is the output which I ...
Ilkhom Abdurakhmanov's user avatar
1 vote
1 answer
239 views

What algorithm do 8x MSAA and 16x MSAA use to generate the positions of the 8 and 16 points?

I want to know what pattern the point positions follow. I can't find the rule behind these eight points.
bodino's user avatar
  • 15
0 votes
2 answers
4k views

What platform to use for YOLO output when using AMD GPU?

I have long been tormented by this question, and I'd like your advice on which direction to take. Objective: to develop a universal YOLO application on Windows that can use the computing power of AMD/Nvidia/Intel ...
xeetu's user avatar
  • 1
1 vote
1 answer
110 views

Using OpenCL to get the energy consumption of my OpenCL Kernel

I am trying to estimate the power consumption of my OpenCL kernel running on an AMD Radeon RX Vega GPU. Is there a way to access the power consumption through OpenCL directly? I tried using profilers but ...
Lama RL's user avatar
  • 31
3 votes
1 answer
579 views

OpenMP offloading on GPU, 'simd' specificities

I was wondering how to interpret the following OpenMP constructs: #pragma omp target teams distribute parallel for for(int i = 0; i < N; ++i) { // compute } #pragma omp target teams distribute ...
Etienne M's user avatar
  • 656
2 votes
1 answer
191 views

How to dispatch Kernel to AMD integrated GPU?

AMD provides lots of resources regarding what instructions can be run on their integrated GPUs: http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf ...
KGM's user avatar
  • 294
2 votes
0 answers
276 views

DirectML InvalidArgumentError: Graph execution error: No OpKernel was registered to support Op 'CudnnRNN'

I was trying to use DirectML to run TensorFlow on my AMD RX 580 graphics card, but I'm having a really hard time pulling this off. I'm getting this error: --------------------------------------------...
Rodrigo Oviedo's user avatar
8 votes
3 answers
15k views

AMD ROCm with Pytorch on Navi10 (RX 5700 / RX 5700 XT)

I am one of those miserable creatures who own an AMD GPU (RX 5700, Navi10). I want to use up-to-date PyTorch libraries to do some Deep Learning on my local machine and stop using cloud instances. I saw ...
makesense's user avatar
  • 115
1 vote
1 answer
675 views

How can I determine the generation/codename of an AMD GPU in Linux?

I want to detect the AMD GPU generation in Python code. My case is that running a specific application (DaVinci Resolve) requires the AMDGPU-PRO drivers for GPU cards older than Vega. And AMDGPU-...
Ashark's user avatar
  • 843
0 votes
1 answer
294 views

How to get 64-bit addressing, full RAM access using OpenCL with 2019 MacBook Pro 16" Intel/AMD

I have a 2019 MacBook Pro 16". It has an Intel Core i9 8-core processor and an AMD Radeon Pro 5500M with 8 GB of GPU RAM. I have the laptop dual-booting macOS 12.4 and Windows 11. Running clinfo ...
Vince W.'s user avatar
  • 3,775
2 votes
1 answer
209 views

How to optimize SYCL kernel

I'm studying SYCL at university and I have a question about the performance of some code. In particular, I have this C/C++ code: And I need to translate it into a parallelized SYCL kernel, and I did this:...
0 votes
1 answer
521 views

OpenCL Kernel and traditional loops

I'm studying OpenCL and I don't understand the relationship between a traditional loop in C/C++ code and kernel code. Just to be clear, a situation like this: So my question is: In the traditional ...
1 vote
1 answer
404 views

Work-item branch divergence in OpenCL, how does it work?

I'm studying OpenCL and I don't fully understand the concept of "work-item divergence or Divergent Control Flow". As we can see in the picture below, there are some warp ...
1 vote
1 answer
1k views

Open CL programming with G++ in Windows

I am trying to write OpenCL programs in C++ using the G++ compiler on Windows 10, but I am not able to find any SDK for my work. Nvidia CUDA requires Visual Studio compilers to work and AMD APP SDK seems ...
Gokul Raj V's user avatar
6 votes
2 answers
26k views

How to make AMD GPU available by WSL for use with DALL-E Playground AI Server

I'm trying to run and deploy the Dalle Playground on my local machine using an AMD GPU. I'm on Windows 11 with a WSL instance running. Link to Dalle Playground repo System OS: Windows 11 Pro - ...
Joel Gray's user avatar
  • 309
