This repository has been archived by the owner on May 3, 2024. It is now read-only.
@xxgtxx - Thanks for the error report. It looks like it is running out of memory because of the larger data size you chose. Can you please try dropping the batch size (specifically this parameter: dim: 10) and see if that helps?
Based on a standard bvlc/caffe GoogLeNet test I did today on another HW vendor's platform, it didn't look like your specific configuration fits into that device's (quite large) memory either. Have you observed something different?
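To try smaller batch sizes without hand-editing the file each time, the first dim of the input shape can be rewritten programmatically. The snippet below is only a sketch: `set_batch_size` is a hypothetical helper (not part of Caffe or hipCaffe), and it assumes the input shape is written the way it appears in this issue.

```python
import re


def set_batch_size(prototxt_text: str, batch_size: int) -> str:
    """Replace the first `dim:` value of the first `shape { ... }` entry.

    Hypothetical helper for illustration; it only does a text substitution
    and does not parse the prototxt format.
    """
    pattern = re.compile(r"(shape\s*:?\s*\{\s*dim:\s*)\d+")
    new_text, count = pattern.subn(
        lambda m: m.group(1) + str(batch_size), prototxt_text, count=1
    )
    if count == 0:
        raise ValueError("no input shape found in prototxt text")
    return new_text


# Input line as given in the issue, with the batch size reduced from 10 to 2.
line = "input_param { shape: { dim: 10 dim: 3 dim: 1024 dim: 2048 } }"
print(set_batch_size(line, 2))
```

The rewritten text can then be saved to a copy of deploy.prototxt and passed to `caffe time` through `-model`, stepping the batch size down until the allocation succeeds.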
I couldn't agree more with parallelo. When the training image is too large to fit in GPU memory, there are several things you can do: 1) reduce the batch size (it seems you have already tried this), 2) resize the image to smaller dimensions, or 3) if the object features would vanish after resizing, crop the image into smaller ones. I think this is a data pre-processing issue rather than an issue in Caffe itself.
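For the cropping option, one simple approach is to cut the image into a grid of non-overlapping tiles. The sketch below only computes the crop boxes; `tile_boxes` is a hypothetical helper, and the 512x512 tile size is an arbitrary choice for illustration, not a recommendation from this thread.

```python
def tile_boxes(height, width, tile_h, tile_w):
    """Return (top, left, bottom, right) boxes that cover the whole image.

    Tiles do not overlap; tiles on the bottom and right edges are smaller
    when the image size is not a multiple of the tile size.
    """
    boxes = []
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            bottom = min(top + tile_h, height)
            right = min(left + tile_w, width)
            boxes.append((top, left, bottom, right))
    return boxes


# The 1024 x 2048 input from the issue, split into 512 x 512 tiles.
boxes = tile_boxes(1024, 2048, 512, 512)
print(len(boxes))  # 8
print(boxes[0])    # (0, 0, 512, 512)
```

Each tile can then be fed to the network as a separate, much smaller input. Whether objects that straddle tile borders need overlapping tiles depends on the data.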
Issue summary
Hello everyone,
I get a memory error when benchmarking execution time with hipCaffe.
This only occurs with large input data: shape: { dim: 10 dim: 3 dim: 1024 dim: 2048 }.
However, there is no error with the default input data size: shape: { dim: 10 dim: 3 dim: 224 dim: 224 }
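To put the two shapes in perspective, here is a rough size calculation for the input blob alone. It assumes 4-byte single-precision values, which is what the stock `caffe` tool uses; every later blob in the network scales with the spatial size in a similar way.

```python
from math import prod

BYTES_PER_FLOAT32 = 4  # assumption: single-precision blobs


def blob_bytes(shape):
    """Bytes needed for one dense float32 blob of the given shape."""
    return prod(shape) * BYTES_PER_FLOAT32


large = blob_bytes((10, 3, 1024, 2048))
default = blob_bytes((10, 3, 224, 224))

print(f"large input:   {large / 2**20:.2f} MiB")    # 240.00 MiB
print(f"default input: {default / 2**20:.2f} MiB")  # 5.74 MiB
print(f"ratio:         {large / default:.1f}x")     # 41.8x
```

So the larger shape needs roughly 42 times as much memory per spatially sized blob as the default one.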
Error
I0914 12:29:28.475584 218425 net.cpp:228] conv2/3x3 does not need backward computation.
I0914 12:29:28.475591 218425 net.cpp:228] conv2/relu_3x3_reduce does not need backward computation.
I0914 12:29:28.475597 218425 net.cpp:228] conv2/3x3_reduce does not need backward computation.
I0914 12:29:28.475603 218425 net.cpp:228] pool1/norm1 does not need backward computation.
I0914 12:29:28.475610 218425 net.cpp:228] pool1/3x3_s2 does not need backward computation.
I0914 12:29:28.475616 218425 net.cpp:228] conv1/relu_7x7 does not need backward computation.
I0914 12:29:28.475622 218425 net.cpp:228] conv1/7x7_s2 does not need backward computation.
I0914 12:29:28.475628 218425 net.cpp:228] data does not need backward computation.
I0914 12:29:28.475632 218425 net.cpp:270] This network produces output prob
I0914 12:29:28.475728 218425 net.cpp:283] Network initialization done.
I0914 12:29:28.476884 218425 caffe.cpp:355] Performing Forward
I0914 12:30:55.425948 218425 caffe.cpp:360] Initial loss: 0
I0914 12:30:55.426497 218425 caffe.cpp:361] Performing Backward
I0914 12:30:55.426555 218425 caffe.cpp:369] *** Benchmark begins ***
I0914 12:30:55.426565 218425 caffe.cpp:370] Testing for 2 iterations.
error: 'hipErrorMemoryAllocation'(1002) at src/caffe/syncedmem.cpp:56
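As a rough illustration of why an allocation can fail at this input size, the sketch below estimates the output blob of the first convolution only. It assumes the conv1/7x7_s2 parameters from the standard bvlc_googlenet deploy.prototxt (64 outputs, kernel 7, stride 2, pad 3) and 4-byte floats; the function names are made up for this example.

```python
def conv_out(size, kernel, stride, pad):
    """Output length along one spatial axis of a convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def conv1_output_bytes(batch, height, width,
                       channels=64, kernel=7, stride=2, pad=3):
    """Bytes for the float32 output blob of conv1/7x7_s2 (assumed parameters)."""
    out_h = conv_out(height, kernel, stride, pad)
    out_w = conv_out(width, kernel, stride, pad)
    return batch * channels * out_h * out_w * 4


large = conv1_output_bytes(10, 1024, 2048)
default = conv1_output_bytes(10, 224, 224)

print(f"1024x2048: {large / 2**30:.2f} GiB")    # 1.25 GiB
print(f"224x224:   {default / 2**20:.2f} MiB")  # 30.62 MiB
```

That is 1.25 GiB for a single blob's data. A Caffe blob can also hold a gradient buffer of the same size once the backward pass runs, and GoogLeNet has many more layers after conv1, so running out of a 16 GB card at batch size 10 is plausible. This is an estimate, not a measurement.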
Steps to reproduce
hipCaffe with Makefile.config parameters:
USE_MIOPEN := 1
USE_ROCBLAS := 1
Change data input size to:
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 1024 dim: 2048 } }
}
Then execute the network:
/home/intel/hipCaffe/build/tools/caffe time -gpu 0 -iterations 2 -model /home/intel/hipCaffe/models/bvlc_googlenet/deploy.prototxt
Your system configuration
Operating system: Ubuntu 16.04
Kernel: 4.11.0-kfd-compute-rocm-rel-1.6-148
CPU: Intel Skylake
GPU: AMD Radeon Vega Frontier Edition @ 16GB