Debugging CUDA programs on Windows can be a challenge. When a program crashes because of an invalid memory access, the Nsight debugger does not show where the error occurs: all CUDA threads exit and no output is produced. You can try using printf inside the CUDA kernel to narrow down where and why the kernel …
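As a rough illustration of the printf approach mentioned above, here is a minimal sketch. The kernel `scale`, the helper `check`, and the sizes are hypothetical names and values chosen for this example, not taken from the original program. It assumes the suspected problem is an index running past the end of a buffer, so each thread reports its index instead of performing the out-of-range access.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: doubles each element of `in`.
__global__ void scale(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) {
        // Report the offending thread instead of touching memory out of range.
        printf("scale: block %d, thread %d computed index %d (n = %d)\n",
               blockIdx.x, threadIdx.x, i, n);
        return;
    }
    out[i] = 2.0f * in[i];
}

// Hypothetical helper: prints a CUDA error and returns false on failure.
static bool check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    const int n = 126;  // deliberately not a multiple of the block size
    float* in = nullptr;
    float* out = nullptr;
    if (!check(cudaMalloc(&in, n * sizeof(float)), "cudaMalloc(in)")) return 1;
    if (!check(cudaMalloc(&out, n * sizeof(float)), "cudaMalloc(out)")) return 1;
    check(cudaMemset(in, 0, n * sizeof(float)), "cudaMemset(in)");

    // 128 threads are launched for 126 elements, so two threads are out of range.
    scale<<<1, 128>>>(in, out, n);
    check(cudaGetLastError(), "kernel launch");

    // Device-side printf output is flushed at synchronization points such as this one.
    check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Checking the return values of `cudaGetLastError` and `cudaDeviceSynchronize` after the launch is a complementary step: if a kernel does fault, the synchronize call is typically where the error code surfaces on the host, even when the kernel itself produced no output.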