Hey, thanks for putting this together and sharing it. I have a PR to make this change to `torch_intermediate_layer_getter.py`:
```python
try:
    if self.keep_output:
        output = self._model(*args, **kwargs)
    else:
        self._model(*args, **kwargs)
        output = None
finally:
    for h in handles:
        h.remove()
```
In the happy path the code works great, but if you use it in code that backs off its batch size after an OOM, the hooks registered on the model's layers retain a reference to the `ret` tensor dictionary, preventing it from being garbage-collected (and so reducing available GPU memory). Happy to push my branch with this change for you to review (though I think you might need to enable that somehow?). Thanks!
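For context, here is a minimal, self-contained sketch of the pattern being proposed (not the library's actual code; `run_with_intermediate_outputs` and its parameters are hypothetical names for illustration). Forward hooks that write into a dict keep that dict alive for as long as they stay registered, so the removal has to happen in a `finally` block to cover the case where the forward pass itself raises (e.g. CUDA OOM):

```python
import torch
import torch.nn as nn

def run_with_intermediate_outputs(model, layer_names, *args, keep_output=True, **kwargs):
    ret = {}       # each hook closure below holds a reference to this dict
    handles = []
    for name, module in model.named_modules():
        if name in layer_names:
            # default-arg trick so each hook captures its own layer name
            def hook(mod, inputs, out, name=name):
                ret[name] = out
            handles.append(module.register_forward_hook(hook))
    try:
        if keep_output:
            output = model(*args, **kwargs)
        else:
            model(*args, **kwargs)
            output = None
    finally:
        # Without this finally, an exception during forward() (such as an OOM)
        # would leave the hooks attached, and their closures would keep `ret`
        # and its GPU tensors alive even after the caller retries with a
        # smaller batch.
        for h in handles:
            h.remove()
    return ret, output

# Usage sketch
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
feats, out = run_with_intermediate_outputs(model, {"0", "2"}, torch.randn(2, 8))
```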