Training with GPU, inference on CPU with a pickled model #244
You're probably seeing the effect of this change. Can you try setting …?
I was referring to the same problem as mentioned here. The option you mentioned would not solve this, or would it? (I can't test it right now.)
That thread is a little confusing, with multiple issues. Can you paste the traceback that you're getting?
I get the same …
I have a solution that seems to work:

```python
import os
import pickle
import tempfile

from nolearn.lasagne import NeuralNet


class PortableNeuralNet(NeuralNet):
    def __setstate__(self, state):  # BBB for pickles that don't have the graph
        with tempfile.TemporaryDirectory() as tmpdirname:
            filename = os.path.join(tmpdirname, 'tmp_weights.pkl')
            with open(filename, 'wb') as f:
                pickle.dump(state['_params_temp_save'], f, -1)
            del state['_params_temp_save']
            self.__dict__.update(state)
            self.initialize()
            self.load_params_from(filename)

    def __getstate__(self):
        state = dict(self.__dict__)
        params = self.get_all_params_values()
        for key in list(state.keys()):  # list() to avoid RuntimeError while mutating
            if key == 'train_history_':
                continue
            if key.endswith('_'):
                del state[key]
        del state['_output_layer']
        del state['_initialized']
        state['_params_temp_save'] = params
        return state
```

I don't know whether this is worth integrating or not. Instead of a new class, it could be a switch in the NeuralNet class. What do you think, Daniel?
@BenjaminBossan Do you mind describing the difference between this and the implementation that was removed in #228?
I looked at the problem a little bit more, and I understand the issue better now. The code that was removed in #228 had the same issue, since it did not do anything with the layer instances (…).

So then I tried to come up with my own variation of the code that you proposed:

```python
from nolearn.lasagne import NeuralNet


class YetAnotherPortableNeuralNet(NeuralNet):
    def __setstate__(self, state):
        params = state.pop('__params__', None)
        self.__dict__.update(state)
        self.initialize()
        if params is not None:
            self.load_params_from(params)

    def __getstate__(self):
        state = dict(self.__dict__)
        if self._initialized:
            params = self.get_all_params_values()
        else:
            params = None
        # Be explicit about which attributes are dropped on the way out.
        for attr in (
                'train_iter_',
                'eval_iter_',
                'predict_iter_',
                '_initialized',
                '_get_output_fn_cache',
                '_output_layer',
                'layers_',
                'layers',
        ):
            if attr in state:
                del state[attr]
        state['__params__'] = params
        return state
```

I thought your proposal was good, it just needed a bit of refactoring (to remove writing out the file; I also wanted to be more explicit about the attributes removed on the way out). But then I found out that this approach has its own problems. Namely, if …

I'm thinking that as long as we can't fix this in the general case, we shouldn't put code like this into nolearn.lasagne itself. But we can point people to solutions that might work for them. One such solution might be this script inside of pylearn2, which I'm about to try out.
Right, I did not think about that possibility. We could raise an error in that case, but that is not a satisfying solution.
I believe @alattner tried that to no avail.
I agree, but it would be nice to be able to somehow use this kludge by checking out a specific nolearn branch or something.
No, I haven't tried that script inside pylearn2. I tried the …
I tried the script and it failed with some weird recursion error.
…
OK, just let me know if that's a joke or if it actually works. ;-)
I would not try it :) Anyway, do you see a working solution for this?
I'll take another look next week. So far I haven't had much luck.
So much for not breaking code in this PR :) For those who use this snippet, in the part shown below, change …
Is there any update on how to train on GPU, save, and load on CPU for inference?
@kungfujam Note that, as per the original post, you can always do this: `save_params_to` on the GPU machine, then `load_params_from` on the CPU machine.

This issue is about not being able to pickle a network trained on a GPU and use it in a CPU environment, which is sometimes more convenient.
Thanks for the clarification. I am using that method currently. Not a huge deal, but it would be nice to be able to pickle.
Training with CUDA on a GPU machine, it is not possible to load the model on a machine without a GPU in a straightforward fashion, because some theano parameters are Cuda Arrays. My reading of this is that this is a theano thing and that there are only workarounds. One way is to:

1. `save_params_to` on the GPU machine
2. `load_params_from` on the CPU machine.

However, this requires quite some extra effort in some situations, e.g. if the NeuralNet is part of an sklearn Pipeline. You have to …
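In code, these two steps might look like the following sketch (`layers`, `X`, and `y` are placeholder names; the same architecture definition is assumed to be available on both machines):

```python
from nolearn.lasagne import NeuralNet

# On the GPU machine: train, then save only the weight arrays
# (plain numpy arrays, so no CUDA types end up in the file).
net = NeuralNet(layers=layers, max_epochs=10)  # plus the usual kwargs
net.fit(X, y)
net.save_params_to('weights.pkl')

# On the CPU machine: rebuild an identical net, then load the weights.
net = NeuralNet(layers=layers, max_epochs=10)
net.initialize()
net.load_params_from('weights.pkl')
```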
I wonder if there is a better way. Maybe it is possible to add a method to NeuralNet that converts Cuda Arrays to normal theano shared variables? Does someone know of a better approach?
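For what it's worth, a speculative sketch of what such a conversion might look like (a hypothetical helper, not an existing nolearn API; it builds fresh CPU-backed shared variables from the numpy values rather than converting in place, and the compiled graph would still need re-initializing):

```python
import numpy as np
import theano

def rebuild_cpu_shared(shared_vars):
    # get_value() returns a plain numpy array even for GPU-backed
    # shared variables, so new CPU shared variables can be created
    # from the values. Note: this does NOT rewire an existing graph.
    return [theano.shared(np.asarray(v.get_value())) for v in shared_vars]
```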