Illegal Memory Access Problem CUDA - GPU - Julia Discourse

November 20, 2021, 5:28pm 1

I am creating some dynamic shared memory Boolean arrays in a kernel, and it consistently gives me

ERROR: LoadError: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
Stacktrace:
 [1] throw_api_error(res::CUDA.cudaError_enum)
   @ CUDA C:\Users\1\.julia\packages\CUDA\9T5Sq\lib\cudadrv\error.jl:105
 [2] query
   @ C:\Users\1\.julia\packages\CUDA\9T5Sq\lib\cudadrv\stream.jl:102 [inlined]
 [3] synchronize(stream::CuStream; blocking::Bool)
   @ CUDA C:\Users\1\.julia\packages\CUDA\9T5Sq\lib\cudadrv\stream.jl:130
 [4] synchronize (repeats 2 times)
   @ C:\Users\1\.julia\packages\CUDA\9T5Sq\lib\cudadrv\stream.jl:117 [inlined]
 [5] unsafe_copyto!(dest::Vector{UInt16}, doffs::Int64, src::CuArray{UInt16, 1, CUDA.Mem.DeviceBuffer}, soffs::Int64, n::Int64)
   @ CUDA C:\Users\1\.julia\packages\CUDA\9T5Sq\src\array.jl:389
 [6] copyto!
   @ C:\Users\1\.julia\packages\CUDA\9T5Sq\src\array.jl:349 [inlined]
 [7] getindex(xs::CuArray{UInt16, 1, CUDA.Mem.DeviceBuffer}, I::Int64)
   @ GPUArrays C:\Users\1\.julia\packages\GPUArrays\3sW6s\src\host\indexing.jl:89
 [8] top-level scope
   @ c:\GitHub\GitHub\NuclearMedEval\src\playgrounds\convolutionsPlay.jl:71

I am wondering what is wrong here. My assumption is that the true size of a Boolean array in bytes is the number of its entries divided by 8, i.e. 1 bit per entry. Am I correct?

using CUDA

dataBdim = (32,24,32)
fp = CUDA.zeros(UInt16,1)
sumInBits = (dataBdim[1]+2)+(dataBdim[2]+2)+(dataBdim[3]+2)+dataBdim[1]+dataBdim[2]+dataBdim[3]
shmemSum = cld(sumInBits,8) # in bytes

function testKernelA(dataBdim,fp)
    resShmem = @cuDynamicSharedMem(Bool,((dataBdim[1]+2),(dataBdim[2]+2),(dataBdim[3]+2)))
    sourceShmem = @cuDynamicSharedMem(Bool,(dataBdim[1],dataBdim[2],dataBdim[3]))
    # naive loop just for presentation of problem
    for i in 1:(dataBdim[1]+2), j in 1:(dataBdim[2]+2), n in 1:(dataBdim[3]+2)
        resShmem[i,j,n]=false
    end
    for i in 1:(dataBdim[1]), j in 1:(dataBdim[2]), n in 1:(dataBdim[3])
        sourceShmem[i,j,n]=false
    end
    fp[1]=1
    return
end

@cuda threads=(32,5) blocks=(2) shmem=shmemSum testKernelA(dataBdim,fp)
fp[1]

November 22, 2021, 7:04am 2

Jakub_Mitura:

sumInBits = (dataBdim[1]+2)+(dataBdim[2]+2)+(dataBdim[3]+2)+dataBdim[1]+dataBdim[2]+dataBdim[3]

How is this ‘in bits’ if you’re nowhere multiplying by sizeof(UInt16) (or 8*sizeof if you actually want this size to be bits)?

November 22, 2021, 6:07pm 3

You are right. Still, changing it to

sumInBits = (dataBdim[1]+2)*(dataBdim[2]+2)*(dataBdim[3]+2)+dataBdim[1]*dataBdim[2]*dataBdim[3]

does not solve the problem. But is sizeof needed if this is a Boolean array? Is it not just a BitArray?

November 23, 2021, 6:55am 4

There’s still no sizeof in that expression? And it is needed, CuArray{Bool} doesn’t have the same bitarray-like optimization implemented.

Also, try out CUDA.jl#master, there the dynamic memory accesses are bounds checked so will throw a BoundsError instead of crashing CUDA with an illegal memory access.
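A minimal sketch of the byte count that the sizeof remark points at, assuming one byte per Bool element (sizeof(Bool) is 1 in Julia) and the dimensions from the original post. The names resBytes, sourceBytes, and assumedLimit are illustrative only, and the 48 KiB figure is an assumed default per-block limit that depends on the GPU:

```julia
# Dimensions from the original post.
dataBdim = (32, 24, 32)
resDims = dataBdim .+ 2                      # (34, 26, 34), padded by 2 per axis

# Element counts multiply across dimensions; they do not add.
resBytes    = prod(resDims)  * sizeof(Bool)  # one byte per Bool, not one bit
sourceBytes = prod(dataBdim) * sizeof(Bool)
shmemBytes  = resBytes + sourceBytes         # value to pass as shmem=...

# What the original formula requested (sum of dims, divided by 8).
originalBytes = cld(sum(resDims) + sum(dataBdim), 8)

# ASSUMPTION: 48 KiB default dynamic shared memory per block; check your device.
assumedLimit = 48 * 1024

println("needed: ", shmemBytes, " bytes, originally requested: ", originalBytes, " bytes")
println("fits under assumed limit: ", shmemBytes <= assumedLimit)
```

Under these assumptions the two Bool arrays need far more than the kernel was launched with, and they may not fit in shared memory at all on a device with a 48 KiB per-block limit. Note also that the two @cuDynamicSharedMem calls in the original kernel both start at the same base address; the macro accepts an optional third argument giving an offset in bytes, so the second array would likely need an offset of resBytes to avoid overlapping the first.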

November 23, 2021, 9:23am 5

OK, so can I use a bit type in shared memory?

And what do you mean by master? I suppose you deduced that I am using some branch, which is not what I intended. I had found somewhere that the shared memory initialization macro should now be a function. Is this what you mean?

Thanks!

November 23, 2021, 10:01am 6 Jakub_Mitura:

And what do you mean by master

The master branch on GitHub, i.e. ] add CUDA#master

You’re probably on the latest stable release (if you didn’t do anything fancy).

November 23, 2021, 10:09am 7

OK, thanks

November 24, 2021, 6:00am 8

So I suppose I understand it now :slightly_smiling_face:. Still, is there a way to use a 3-dimensional bit array in shared memory? It would be extremely useful.

November 24, 2021, 7:22am 9

No, the BitArray optimization has not been implemented for CuArray. Just use a regular Bool array. If space is a problem, you’ll need to look into implementing BitArray’s packed layout.
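A rough sketch of what a hand-rolled packed layout could look like, assuming 32-bit words and column-major indexing. The helpers nwords, bitpos, setbit!, and getbit are hypothetical names, not part of CUDA.jl, and the example runs on a plain CPU Vector{UInt32} for illustration only:

```julia
# Number of 32-bit words needed to hold one bit per element of a dims-sized array.
nwords(dims) = cld(prod(dims), 32)

# Map a 3D index to (word index, bit mask) using a zero-based column-major linear index.
@inline function bitpos(dims::NTuple{3,Int}, i, j, k)
    lin = (i - 1) + dims[1] * ((j - 1) + dims[2] * (k - 1))
    return (lin >> 5) + 1, UInt32(1) << (lin & 31)
end

# Set or clear one bit.
@inline function setbit!(words, dims, i, j, k, v::Bool)
    w, mask = bitpos(dims, i, j, k)
    words[w] = v ? (words[w] | mask) : (words[w] & ~mask)
    return words
end

# Read one bit.
@inline function getbit(words, dims, i, j, k)
    w, mask = bitpos(dims, i, j, k)
    return (words[w] & mask) != 0
end

dims = (34, 26, 34)
words = zeros(UInt32, nwords(dims))   # 940 words, i.e. 3760 bytes instead of 30056
setbit!(words, dims, 3, 4, 5, true)
println(getbit(words, dims, 3, 4, 5), " ", getbit(words, dims, 2, 4, 5))
```

In a kernel, the backing storage would presumably be a UInt32 shared array, and since 32 elements share one word, threads writing different bits of the same word would need atomic operations or a work split aligned to word boundaries to avoid lost updates.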
