All Questions
76 questions
0
votes
1
answer
61
views
What do the Instruction Statistics fields in Nsight Compute mean? How do they relate to elapsed cycles?
In my example, what is the meaning of 'Executed Instructions'? Taken literally, it would mean how many instructions have been executed.
But how does it relate to the total run time (...
0
votes
1
answer
96
views
How can I create a container in which to use the Nvidia Nsight Systems graphical interface?
I am looking to create a container in which I can work with the graphical interface of the Nvidia Nsight Systems tool, so that I can obtain application reports with CUDA and Python. I have found ...
0
votes
0
answers
183
views
How do simple warps cause low warp occupancy and high register usage?
During the warp occupancy investigation of my gbuffer pass, I found that even if I simplify the scene and the shader, Nsight still reports a very low warp occupancy, or even much lower than the ...
1
vote
0
answers
277
views
Can NVIDIA Nsight still be used to debug shaders?
Numerous online resources claim it is possible to debug OpenGL shaders using NVIDIA Nsight Visual Studio Edition. Here is an old video of it being done.
However, the Nsight VSE page mentions "the ...
1
vote
1
answer
440
views
Power Usage Profiling in Nsight?
New to Nsight and GPU programming. I need a way to evaluate the effect my code has on power usage in the GPU.
This article from 2013 shows that the feature was part of Nsight's toolset at some point, ...
0
votes
1
answer
760
views
Error in profiling shared memory atomic kernel in Nsight Compute
I am trying the global atomics vs shared atomics code from NVIDIA blog https://developer.nvidia.com/blog/gpu-pro-tip-fast-histograms-using-shared-atomics-maxwell/
But when I am trying to profile with ...
0
votes
1
answer
376
views
How to use CUPTI to get metrics related to Launch Metrics, Source Metrics and Instructions Per Opcode Metrics
I am able to use ncu to get the metrics related to Launch Metrics, Source Metrics and Instructions Per Opcode Metrics (found here). However I am unable to use CUPTI to get the values after modifying ...
0
votes
1
answer
313
views
Command to run callback_profiling sample from CUPTI
I am running the sample code available for Nvidia CUDA CUPTI in /usr/local/cuda-11.8/extras/CUPTI/samples/callback_profiling. There is a Makefile, but I want to run it using a single command (without ...
0
votes
1
answer
282
views
Difference in SASS using cuobjdump and Nsight compute
I have a simple kernel as
__global__ void hello_cuda() {
int a = 10;
printf("hello from GPU\n");
}
When I use Nsight compute to see the Source and SASS section, I see:
# Address ...
0
votes
1
answer
1k
views
NSight Compute not showing achieved occupancy in the metrics
I want to calculate the achieved occupancy and compare it with the value that is being displayed in Nsight Compute.
ncu says: Theoretical Occupancy [%] 100, and Achieved Occupancy [%] 93,04. What ...
1
vote
1
answer
818
views
Nsight Compute profiling of a __device__ function in a kernel
I am trying to use Nsight Compute to profile kernels in my CUDA code. But how do I profile functions inside a kernel? Say for example, I have 2 functions (device functions) in a kernel (global). ...
0
votes
1
answer
3k
views
Nsys Does not show the CUDA kernels profiling output
My system is V100 with the following information:
| NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.6 |
NVIDIA Nsight Systems version 2021.5.2.53-28d0e6e
sudo sh -c "echo 2 >/proc/...
0
votes
1
answer
2k
views
How to see NVTX markers in Nvidia Nsight Systems? With host and guest being the same Windows machine
I am trying to profile CPU/GPU applications using the Nsight suite.
Currently trying to understand a stuttering problem, I added a range around the simulation step (taking place on the CPU):
#include &...
1
vote
1
answer
215
views
OpenGL - Is there a way to track actually used memory allocated by glBufferData / glBufferSubData?
There is a big codebase which allocates fixed-size blocks of empty GPU memory using the glBufferData function, and partially fills/updates this allocated space using glBufferSubData. Since not all of the ...
1
vote
0
answers
1k
views
How to get detailed Nvidia GPU usage?
Nvidia-smi only provides a few metrics to measure GPU utilization. Most importantly, utilization.gpu represents the percent of time over the past sample period during which one or more kernels was ...
0
votes
1
answer
2k
views
nsys profile multiple processes
I'd like to experiment with MPS on Nvidia GPUs, therefore I'd like to be able to profile two processes running in parallel.
With the, now deprecated, nvprof, there used to be an option "--profile-...
1
vote
0
answers
340
views
Nsys Profile with MPMD (multiple program, multiple data) simulation
I am trying to profile an MPI+OpenACC program with nsys.
I am using OpenMPI (3.1.6) from the Nvidia HPC SDK (20.7) with UCX enabled.
There are three executables, exec1, exec2, exec3. I want to profile for ...
0
votes
1
answer
2k
views
Tracing custom CUDA kernels with Nsight Systems
I work on a library which is implemented in C++20 and CUDA 11. This library is called from Python via ctypes through a C API that just exchanges JSON strings. We compile it using Clang 11.
In order to ...
1
vote
1
answer
2k
views
NVIDIA Nsight Systems CLI not getting memory statistics
I'm using the NVIDIA Nsight Systems CLI (nsys) to profile a simple CUDA program (vector addition). I've already checked the documentation but I think I'm missing something.
I'm running the nsys profile ...
0
votes
1
answer
1k
views
How to measure the amount of data copied in NVIDIA nsight systems?
Trivia
In NVIDIA Nsight Systems you can use the --stats=true flag to get the details for data transfer between GPU and CPU. The output includes a section similar to what follows:
CUDA Memory Operation ...
0
votes
1
answer
129
views
Which OpenACC directive will tell compiler to execute a statement on device only?
I am learning OpenACC with Fortran (with a suite of tools from Nvidia) and am doing it by porting my implementation of the Conjugate Gradient (CG) solver to GPUs.
Clearly, I am trying to keep as much ...
0
votes
2
answers
255
views
NVIDIA Nsight warning: OpenACC injection initialization failed. Is the PGI runtime version greater than 15.7?
I am trying to venture into accelerating my Fortran 2003 programs with OpenACC directives on my Ubuntu 18.04 workstation with an Nvidia GeForce RTX 2070 card. To that end, I have installed Nvidia HPC-...
1
vote
1
answer
857
views
Achieved Occupancy column is not shown in Nsight Profiling result
I have faced a problem that is very weird to me. I cannot see the achieved occupancy column in the Nsight Performance Analysis output. I am using a GeForce 920M GPU, NVIDIA driver version 425.31, Nsight ...
1
vote
0
answers
224
views
Can a GLSL shader be made to store 'intermediate' results for debugging purposes with tools like NSight?
I am writing a rendering system with OpenGL 4.6 and debugging it using NVIDIA's NSight. It has been frustrating to debug issues with my lighting, normal maps / textures, etc. My usual solution to this ...
1
vote
1
answer
491
views
Can I debug CUDA on the device that drives the display output?
I develop on VS2012. I have 3 monitors connected to my PC with one GTX 960 graphics card.
I knew that it's impossible to debug CUDA on the same device that drives the display output. Maybe I'm reading ...
0
votes
1
answer
931
views
nsight EE and nvvp both crash during startup on Ubuntu 16.10
Whenever I start these applications, they crash after the splash screen appears. A small dialog appears with the message "an error has occurred. see the log file null" (I don't know where to find said ...
2
votes
1
answer
2k
views
Installing CUB in nvidia nsight
I want to use CUB with NVIDIA Nsight. I looked for tutorials on the internet for doing that, but I didn't find anything, even in the official pages of CUB.
What do I need to do in order to use CUB in ...
1
vote
1
answer
944
views
Attach a running OpenGL program to NSight for graphics debugging
How do I attach a running OpenGL program to NSight for graphics debugging?
This link talks about how to attach a CUDA application but the same procedure is not applicable for performing graphics ...
1
vote
1
answer
7k
views
How to debug (GLSL) shaders using Nsight?
How can I debug GLSL shaders using Nsight?
I am using Nsight Visual Studio Edition 5.2. I've also tried using Nsight Visual Studio Edition 5.1. Neither works. What I mean is that I've tried using ...
0
votes
0
answers
260
views
"Object is owned by other context" error
I am testing some OpenGL code. Using OpenGL 4.3. Nvidia GTX960M GPU. Windows 10 64-bit. Driver version: 364.72
Trying to debug with NVidia NSight (version: 4.7). The app is running fine until I open NSight ...
1
vote
1
answer
2k
views
What is the correct CUDA project configuration when profiling in Nsight Eclipse 7.5 in order to use NVTX?
I am trying to profile a CUDA program because I want to verify the sequential performance by using NVTX tools and compare it against its corresponding heterogeneous performance.
I recently found this ...
0
votes
1
answer
1k
views
Upper Limit on Matrix Size for Multiplication using cublas gemm function (cublasSgemm)
This is the first time ever that I have not been able to get help from answers of previously posted questions.
I have been using cublasSgemm quite successfully for multiplying square matrices.
But, ...
1
vote
2
answers
4k
views
Visual Studio Nsight "Cuda Toolkit V7.5 directory does not exist" Error
I am trying to start programming CUDA on Windows 10. I have installed Visual Studio 2013 Community edition and I have also downloaded and installed the CUDA toolkit 7.5 for the Windows platform from ...
2
votes
1
answer
575
views
Remote Development with NSight 6.5 with "indirect" ssh
Suppose I can log in to a GPU server named gpu1.sp.sw, and there are gpu2.sp.sw and gpu3.sp.sw, which I cannot log in to directly but which can be reached by ssh gpu-2, ssh gpu-3, after I am already on gpu1....
0
votes
1
answer
71
views
Does Nsight Visual Studio Edition support NVIDIA Quadro K5100M
It is not listed in "Nsight Visual Studio Edition Supported GPUs (Full List)", but I think it is strange if Nsight does not support this card. Does anybody know for sure whether it is supported or not?...
2
votes
1
answer
655
views
How to make sense of the memory statistics section of Nsight profiling?
I'm using a GeForce 820M with
GPU Clock rate: 1124 MHz (1.12 GHz)
memory Clock rate: 900 Mhz
Memory Bus Width: ...
0
votes
1
answer
92
views
Visual Studio 2010, NVIDIA AndroidWorks and some questions about debugging Android Apps
I have Visual Studio 2010 Professional. I've installed NVIDIA AndroidWorks. I also have a tablet with Android 4.1.1. I've been trying to debug a simple Android NDK application in Visual Studio 2010 ...
-1
votes
1
answer
442
views
IntelliSense: identifier "IDXGISwapChain1" is undefined
I have saved a captured frame using NVIDIA Nsight and when I open the saved solution file I get the following IntelliSense error:
IntelliSense: identifier "IDXGISwapChain1" is undefined [..]
I have ...
0
votes
2
answers
3k
views
CUDA missing host_defines.h centos 7
I'm trying to compile some samples of the CUDA toolkit V6.5 in the Nsight Eclipse Edition 6.5 environment under CentOS 7.0.
My Nvidia Card is a Quadro K2000.
So my problem is when I try to build ...
3
votes
2
answers
363
views
Can I use NVIDIA nsight to troubleshoot WPF performance?
I have a WPF application with a bottleneck on the GPU. I thought I could use NVIDIA nsight to see what WPF is doing, but the setup documentation says I should disable WPF hardware acceleration. ...
1
vote
1
answer
1k
views
Running CUDA GUI samples from a passive (inactive) GPU
I managed to successfully run CUDA programs on a GeForce GTX 750 Ti while using an AMD Radeon HD 7900 as the rendering device (actually connected to the display) using this guide; for instance, the ...
2
votes
1
answer
5k
views
Nvidia Nsight connection to localhost failed
After many hours working on this problem, I am asking for help here.
I installed the latest Nvidia Nsight VS Edition 4.2 and I'm not able to connect to the localhost for local debugging. I always get this ...
0
votes
1
answer
305
views
Nsight Graphics Debugging "completed successfully" error
When attempting to "Start Graphics Debugging" in Nsight for a DLL that launches an EXE, I get the following output and nothing else happens:
The operation completed successfully (System....
1
vote
1
answer
641
views
Eclipse Nsight: Insisting synchronization error
I very often get the following message from Eclipse Nsight when I try to compile my code on a remote target system (in particular, a Jetson TK1):
I guess it happened because the remote system ...
0
votes
2
answers
2k
views
NSight (NVIDIA) does not work correctly using 'Pause and Capture frame' functionality with Visual Studio
I installed NSight for Visual Studio 2012 several days ago. But today there is something wrong with the 'Pause and Capture frame' functionality. Actually, when I click on the icon as shown below I ...
3
votes
1
answer
4k
views
Can I connect NVIDIA Nsight to a remote machine?
I don't have a graphics card which supports CUDA on my computer. Can I connect NVIDIA Nsight to a remote machine using ssh (or anything else)?
1
vote
2
answers
2k
views
Nvidia Nsight 4.0 cannot profile code in OpenGL 4.3
I am using Visual Studio 13 with Nvidia Nsight 4.0. In my application I am doing a mix of different types of rendering but, for the purpose of testing the profiler, I did a simple rendering of a scene....
0
votes
1
answer
835
views
Active warps in Cuda programming
I am trying to do performance analysis for my code using Nsight IDE.
I have taken a simple example of matrix addition.
I am calling my kernel like :
VecAdd<<<1,BLOCK_SIZEBLOCK_SIZE>>>(dA,...
2
votes
1
answer
350
views
CUDA architecture -sm_11 compile issue in NSight
As my GPU device, a Quadro FX 3700, doesn't support arch > sm_11, I was not able to use relocatable device code (rdc). Hence I combined all the utilities needed into one large file (say x.cu).
To give a ...
2
votes
1
answer
1k
views
Using Nsight to debug CUDA codes in non-startup project which exports DLL
Is there any chance I can debug a non-startup project which outputs a DLL file inside a solution? I am using CUDA 5.0, GeForce GTX 670, VS2010, Nsight 3.0.013150, local host.
Currently, I got ...