GPU Memory cost too much with platoon #84
Comments
Is it CPU or GPU memory? How do you see that 4x difference?
How many GPUs are used in parallel?
Normally, it should not use more memory on the GPU, but it can use more memory on the CPU depending on how you use it: each process/GPU uses extra CPU memory.
It's GPU memory. I used the command "nvidia-smi" to see the GPU memory usage.
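For reference, a minimal sketch of measuring per-GPU memory programmatically with nvidia-smi's CSV query mode. The query flags are nvidia-smi's standard ones; the parser is exercised on a canned sample string so it runs without a GPU, and the sample values are made up for illustration.

```python
import subprocess

def query_gpu_memory():
    """Return a list of used-memory values in MiB, one entry per GPU.

    Requires nvidia-smi (the NVIDIA driver) to be installed.
    """
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        text=True)
    return parse_memory_csv(out)

def parse_memory_csv(text):
    # nvidia-smi prints one line per GPU; with noheader,nounits each
    # line is a bare integer number of MiB.
    return [int(line.strip()) for line in text.splitlines() if line.strip()]

# Canned sample from a hypothetical 2-GPU machine, so the parsing
# can be checked without nvidia-smi installed:
sample = "5021\n312\n"
print(parse_memory_csv(sample))  # -> [5021, 312]
```

Polling this in a loop during training makes it easier to see exactly when the extra memory gets allocated.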
I meet the same problem, and for me it is worse: I get an "out of memory" error, so my NMT system cannot train with platoon at all. Have you solved this problem? Thanks for your help.
The problem may come from NCCL and pygpu. I find that Theano built with NCCL and pygpu costs much more memory than the previous version.
Yes, the higher memory cost is caused by the new back-end of Theano. We prefer to use
I wrote a neural machine translation system with platoon.
The batch size is 80 and I sync every 10 mini-batches.
I found that the memory cost is about 4 times larger than for the same system without platoon.
Does someone else have the same experience?
I have also tested the "lstm" example, which costs about 5 GB of memory with batch size 16 and hidden size 1024.
Could someone else help me find the problem?
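As a sanity check on the 5 GB figure, here is a rough back-of-envelope estimate of what a single-layer LSTM of that size actually needs for weights and per-timestep activations. The input dimension and sequence length are assumptions for illustration (the thread does not state them), but under any reasonable values the model itself is tens of MiB, which suggests the multi-GB usage comes from the back-end's allocator or extra per-process contexts rather than the model.

```python
# Back-of-envelope memory estimate for a single-layer LSTM in float32.
# All sizes below are illustrative assumptions, not the lstm example's
# actual configuration.
BYTES = 4          # float32
hidden = 1024      # hidden size from the report
inp = 1024         # assumed input projection size (equal to hidden)
batch = 16         # batch size from the report
seq_len = 100      # assumed sequence length

# 4 gates, each with input weights, recurrent weights, and a bias.
params = 4 * (inp * hidden + hidden * hidden + hidden)

# Activations kept for backprop: hidden and cell state per timestep.
activations = seq_len * batch * hidden * 2

print(f"parameters: {params:,} (~{params * BYTES / 2**20:.0f} MiB)")
print(f"rough total: {(params + activations) * BYTES / 2**20:.0f} MiB")
```

Even doubling this for gradients and optimizer state stays far below 5 GB, so profiling where the remainder is allocated (allocator preallocation, duplicated contexts per worker) is the natural next step.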