
Memory leak in the server / CPU Training with low batch size #97

Open
jindrahelcl opened this issue Sep 30, 2016 · 4 comments

@jindrahelcl (Member)

While the server is running, its memory consumption should stay constant. Instead, it seems to grow with the amount of data it receives.

@jindrahelcl jindrahelcl changed the title Fix memory leak in the server Memory leak in the server Sep 30, 2016
@jindrahelcl jindrahelcl changed the title Memory leak in the server Memory leak in the server / CPU Training with low batch size Apr 17, 2017
@kocmitom (Contributor)

I have run into this problem many times, most recently while experimenting with the batch size, which made logging and validation run more often. I have no idea what the problem is, and I don't remember all the occasions when it crashed.

It occurred to me that it could also be caused by the growing qsub log, but I can rule that out, because the job that crashed most recently had a log of only 12 MB.

Personally, I suspect the validation, but I have no evidence to back that up.

@jindrahelcl (Member, Author)

It's definitely not the qsub log, because it eats up memory no matter where you run it.

@jindrahelcl (Member, Author)

Some intermediate results must be getting stored somewhere and never freed. I don't know whether it is within our power to fix it, or whether it's a bug in TensorFlow.
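
One way to tell whether it is our code that keeps adding ops to the graph (rather than TensorFlow leaking internally) is to finalize the graph once it is built; any later attempt to create a node then raises an error instead of silently growing memory. A minimal TF 1.x-style sketch, not taken from the Neural Monkey code, with illustrative tensors only:

```python
import tensorflow as tf

# Build the whole graph up front.
x = tf.placeholder(tf.float32, shape=[None, 10])
w = tf.Variable(tf.zeros([10, 1]))
y = tf.matmul(x, w)
init = tf.global_variables_initializer()

graph = tf.get_default_graph()
graph.finalize()  # any later attempt to add an op raises a RuntimeError

with tf.Session(graph=graph) as sess:
    sess.run(init)
    # If validation or logging code created new ops inside the training
    # loop, it would now fail loudly instead of silently growing the graph.
```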

@jlibovicky (Contributor)

Probably caused by summaries: tensorflow/tensorflow#8265
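
For reference, a minimal sketch (assuming the leak comes from summary ops being created repeatedly, which is the pattern discussed in the linked TensorFlow issue): building a summary op inside the training loop adds a new node to the graph on every step, so memory grows without bound; building the summaries once before the loop avoids it. The names and loop below are illustrative, not Neural Monkey code:

```python
import tensorflow as tf

# Leaky pattern (for contrast): calling tf.summary.scalar(...) inside the
# training loop creates a new graph node on every step, so the graph and
# process memory keep growing.

# Safer pattern: create the summary ops once, reuse the merged op in the loop.
loss = tf.placeholder(tf.float32, name="loss")
tf.summary.scalar("loss", loss)
merged = tf.summary.merge_all()

with tf.Session() as sess:
    writer = tf.summary.FileWriter("/tmp/logdir", sess.graph)
    for step in range(1000):
        summary = sess.run(merged, feed_dict={loss: 1.0 / (step + 1)})
        writer.add_summary(summary, step)
    writer.close()
```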
