Hello, I find your code very helpful, but too much memory is consumed when the meta optimizer updates the parameters of the model. On my computer, it always raises an 'out of memory' error when it executes Line 140 of meta_optimizer.py.
I think it could consume less memory if the MetaModel class held a flat version of the parameters instead of wrapping a model. That way, the MetaModel would reshape the parameters and compute its result through nn.functional.conv/linear, so the meta optimizer could use this flat version of the parameters directly, without allocating extra memory for flattened parameters.
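To make the suggestion concrete, here is a minimal sketch of the idea, not the repository's actual MetaModel. It assumes a small two-layer MLP; the class name `FlatMLP`, its layer sizes, and the `unflatten` helper are hypothetical. All weights live in one flat tensor, and the forward pass uses reshaped views of it through `torch.nn.functional.linear`, so no separate flattened copy is needed.

```python
import math

import torch
import torch.nn.functional as F


class FlatMLP:
    """Hypothetical two-layer MLP whose weights live in one flat tensor."""

    def __init__(self, in_dim, hidden_dim, out_dim):
        # Shapes of weight/bias tensors, in the order they are packed.
        self.shapes = [
            (hidden_dim, in_dim), (hidden_dim,),
            (out_dim, hidden_dim), (out_dim,),
        ]
        total = sum(math.prod(s) for s in self.shapes)
        # Single flat leaf tensor; an optimizer can read and update it directly.
        self.flat_params = (0.1 * torch.randn(total)).requires_grad_()

    def unflatten(self):
        # Slice the flat tensor into views (no copy) with the layer shapes.
        views, offset = [], 0
        for shape in self.shapes:
            n = math.prod(shape)
            views.append(self.flat_params[offset:offset + n].view(shape))
            offset += n
        return views

    def forward(self, x):
        w1, b1, w2, b2 = self.unflatten()
        hidden = F.relu(F.linear(x, w1, b1))
        return F.linear(hidden, w2, b2)


model = FlatMLP(in_dim=4, hidden_dim=8, out_dim=2)
out = model.forward(torch.randn(5, 4))
out.sum().backward()
# Gradients arrive already flattened, in the same layout as flat_params.
print(out.shape, model.flat_params.grad.shape)
```

Because the per-layer tensors are views into `flat_params`, the gradient after `backward()` is already a single flat vector, which is the layout the update rule consumes.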
I have kind of the same issue.
On the line of code flat_params = self.f * flat_params - self.i * Variable(flat_grads), my computer takes a long time (building the computation graph for 25000 parameters), and afterwards I can't print flat_params (either in a normal run or in the debugger).
I think my Mac just doesn't have enough memory. A GPU seems to be required to train the meta-optimizer.
Never mind, that was not the problem. The cause was most likely a change in PyTorch's behavior between versions: the operation flat_params = self.f * flat_params - self.i * Variable(flat_grads) now produces a 25450*25450 matrix, which my computer cannot handle. I changed it to:
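The replacement snippet is not included in this copy of the thread. As an illustration only, and not necessarily the change made above, here is a minimal sketch of how this kind of blow-up can come from broadcasting. It assumes the gates `f` and `i` have shape (N, 1) while `flat_params` and `flat_grads` have shape (N,); the variable names and the small N are hypothetical stand-ins for the values in meta_optimizer.py.

```python
import torch

n = 5  # stand-in for the 25450 parameters mentioned above

# Assumption: the gates come out as column vectors of shape (n, 1).
f = torch.full((n, 1), 0.9)
i = torch.full((n, 1), 0.1)

# Assumption: parameters and gradients are flat vectors of shape (n,).
flat_params = torch.ones(n)
flat_grads = torch.ones(n)

# (n, 1) combined with (n,) broadcasts to (n, n): memory grows quadratically.
broadcast = f * flat_params - i * flat_grads
print(broadcast.shape)  # torch.Size([5, 5])

# Aligning the shapes first keeps the result at n elements.
aligned = f * flat_params.view(-1, 1) - i * flat_grads.view(-1, 1)
print(aligned.shape)  # torch.Size([5, 1])
```

With N = 25450, the broadcast result would hold roughly 648 million elements, which is consistent with the memory problems described in this thread.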