Out-of-memory in gpuNUFFT/CUDA/inc/cuda_utils.hpp at line 40 #68
I'm sorry that you are experiencing crashes. Can you give me some more details about your setup and data structure (OS, data/grid dimensions)? Are you using the Matlab wrapper or a standalone C++ application?
Of course! I'm using the Matlab wrapper to grid about 36 radial spokes at a time to a 128×128 matrix. If by OS you mean oversampling, it's 1.5, and if you mean operating system, it's Ubuntu 14.04. I used a kernel size of 3, and when I looked into it I noticed I was using an unusually small sector width of 3. I ran a batch job with 40216 images reconstructed with 16 iterations of CG-SENSE, so that would be 1286912 gridding operations (forward and adjoint). In that run I experienced the "out of memory" error about 5 times, so that should give an idea of how often it occurs.
Thanks for sharing the information. A sector width of 3 is indeed small, but should basically work. The batch job is quite large - are you reusing the gridding operator instance inside Matlab? I've also experienced issues with huge batch jobs where GPU memory fragmentation caused these "out of memory" errors, even though enough device memory was still available.
No, 40216 unique gridding operators. I do clear them in between, though. I was thinking it might be a memory leak, but what you're saying about memory fragmentation makes sense. Granted, this is not a normal use case, so we might be able to write it off as "shit happens" and close the issue?
Ok, no, I wouldn't just close it ;) Is every trajectory unique in your 40k measurements? So what I would try is to reuse existing operators or to allocate multiple operators at once... |
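As an illustration of that suggestion, here is a minimal sketch of the reuse pattern, assuming the wrapper's constructor signature `gpuNUFFT(k,w,osf,wg,sw,imageDim,sens)` as shown in the project README; the trajectory, data, and parameter values below are dummies, not taken from the thread:

```matlab
% Sketch: build one operator per unique trajectory and keep it alive for all
% iterations that use it, instead of constructing a short-lived operator per
% reconstruction. Trajectory/data are dummies; osf/wg/sw are illustrative.
nSamples = 36 * 256;                         % 36 spokes, 256 samples each
k   = rand(2, nSamples) - 0.5;               % normalized k-space coordinates
w   = ones(nSamples, 1);                     % density compensation
img = complex(ones(128, 128, 'single'));     % current image estimate

osf = 1.5; wg = 3; sw = 8;                   % oversampling, kernel, sector width
FT = gpuNUFFT(k, w, osf, wg, sw, [128 128], []);
for it = 1:16                                % e.g. CG-SENSE iterations
    kspc = FT  * img;                        % forward gridding
    img  = FT' * kspc;                       % adjoint gridding
end
clear FT;                                    % release device memory afterwards
```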
They are all unique. Binned reconstructions; the sheer amount is due to the fact that I'm batching a year's worth of acquisitions. A new error that is more descriptive: …
Was there ever a resolution to this? I have encountered a potentially similar out-of-memory issue recently when reconstructing a single 2D slice from 3D data (imageDim goes from [128 128 128] to [128 1 128]). This seems to work in most cases; however, when I repeatedly call the gpuNUFFT object it crashes and closes Matlab. I have discovered this happens when mex_gpuNUFFT_adj_atomic_f is called, but unfortunately, because it quickly crashes Matlab, I do not get any feedback on what the problem is. I have tested it using the same data with different sizes and it seems to happen arbitrarily.
No, I didn't have the time to tackle the problem.
I am having the same problem when reconstructing several datasets one after another. GPU memory is not fully cleared after each reconstruction. It adds up and, after several datasets, reaches the maximum of the GPU memory, which then leads to a crash.
I falsely believed that running …
Thanks a lot. That solved my problem. |
Thanks for the follow-up. To be consistent, one should also clear …
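The snippets in the two comments above are truncated, so the following is only a hedged sketch of one plausible cleanup sequence between reconstructions, assuming MATLAB with the Parallel Computing Toolbox; it is not necessarily the exact fix referred to above:

```matlab
% Sketch: clean up explicitly between independent reconstructions so device
% memory cannot accumulate across datasets. The `datasets` struct array is a
% dummy placeholder for the per-dataset trajectory, weights, and raw data.
osf = 1.5; wg = 3; sw = 8; imageDim = [128 128];
for d = 1:numel(datasets)
    FT  = gpuNUFFT(datasets(d).k, datasets(d).w, osf, wg, sw, imageDim, []);
    img = FT' * datasets(d).data;            % adjoint reconstruction

    clear FT;                                % drop the operator handle
    clear mex;                               % unload MEX files, which may
                                             % still pin device allocations
    % reset(gpuDevice);                      % last resort: also wipes all
                                             % gpuArray variables
end
```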
Working through a similar issue:

```
curt@green:~/ngfn_simulation_recon/matlab$ nice matlab
```

At this point I am trying to figure out whether the option settings have any effect (note OS = 1.25, KW = 2.5, imageDim = [480 490 480]):
It should not be an actual out of memory, although it is a big 3D dataset and gridded k-space (estimated total approaching 1 GB). It seems that the memory usage is a fair amount larger than just the data and the oversampled gridding space (are multiple copies made on the GPU?). Any ideas appreciated.

```
curt@green:~$ nvidia-smi
curt@green:~$ cat matlab_crash_dump.97123-1
```
```
Configuration:
  Fault Count: 1
Abnormal termination:
Register State (from fault):
  R8  = 0000000000000000  R9  = 00007fe0ea0adb10
  RIP = 00007fe1062a4f47  EFL = 0000000000000246
  CS  = 0033  FS = 0000  GS = 0000
Stack Trace (from fault):
```

This error was detected while a MEX-file was running. If the MEX-file …
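As a back-of-the-envelope check of those numbers (OS = 1.25, imageDim = [480 490 480]), assuming complex single precision on the device:

```matlab
% Rough size of one complex single-precision volume and its oversampled grid.
imageDim = [480 490 480];
osf      = 1.25;
bpv      = 8;                                      % bytes per complex single
imgGB  = prod(imageDim)           * bpv / 2^30;    % ~0.84 GB
gridGB = prod(ceil(imageDim*osf)) * bpv / 2^30;    % ~1.64 GB
fprintf('image %.2f GB, oversampled grid %.2f GB\n', imgGB, gridGB);
```

The image plus one oversampled grid alone already exceed the ~1 GB estimate, before counting sector buffers or any intermediate copies gpuNUFFT may keep on the device.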
Slightly different behavior after cleaning up arrays (they all used to be single). This probably does not save anything on the GPU (since the data is being loaded as single anyway?) but frees more memory on the CPU side.

```
curt@green:~/ngfn_simulation_recon/matlab$ nice matlab
```
```
Configuration:
  Fault Count: 1
Abnormal termination:
Register State (from fault):
  R8  = 00007f9dfc6ba974  R9  = 00007f9fac05fa90
  RIP = 00007f9f00000001  EFL = 0000000000010202
  CS  = 0033  FS = 0000  GS = 0000
Stack Trace (from fault):
```

This error was detected while a MEX-file was running. If the MEX-file …

```
Configuration:
  Fault Count: 2
Abnormal termination:
Register State (from fault):
  R8  = 00007f9dfc6ba974  R9  = 00007f9fac05fa90
  RIP = 00007f9f00000001  EFL = 0000000000010202
  CS  = 0033  FS = 0000  GS = 0000
Stack Trace (from fault):
Abnormal termination:
Register State (captured):
  R8  = 000000000000ffff  R9  = 29286574616e696d
  RIP = 00007f9fbbfe70be  EFL = 00007f9fbc1bcbc0
  CS  = 001a  FS = 0000  GS = 0000
Stack Trace (captured):
```
I'm experiencing sporadic crashes with an out-of-memory error message that points towards cuda_utils.hpp line 40. Running on an Nvidia Tesla K40M (12 GB GDDR5) and working on 2D data, so I'm nowhere near capacity. Could this be a memory leak?
```
out of memory in /home/alex/gpuNUFFT/CUDA/inc/cuda_utils.hpp at line 40
```
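One way to probe the leak hypothesis from MATLAB, assuming the Parallel Computing Toolbox is available (`FreeMemory` is re-read from the device on each access; newer releases expose it as `AvailableMemory`); the trajectory and data below are dummies:

```matlab
% Sketch: log free device memory across repeated gridding calls. A value that
% shrinks monotonically suggests a leak; sporadic failures at roughly constant
% free memory suggest fragmentation instead.
nSamples = 36 * 256;
k    = rand(2, nSamples) - 0.5;              % normalized k-space coordinates
w    = ones(nSamples, 1);                    % density compensation
data = complex(randn(nSamples, 1, 'single'));

g = gpuDevice;                               % requires Parallel Computing Toolbox
for i = 1:200
    FT  = gpuNUFFT(k, w, 1.5, 3, 8, [128 128], []);
    img = FT' * data;                        % adjoint gridding
    clear FT;
    fprintf('iter %3d: %.1f MB free\n', i, g.FreeMemory / 2^20);
end
```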