
Training with GPU, inference on CPU with a pickled model #244

Open
BenjaminBossan opened this issue Apr 1, 2016 · 18 comments
@BenjaminBossan (Collaborator)

When training with CUDA on a GPU machine, it is not possible to load the model in a straightforward fashion on a machine without a GPU, because some Theano parameters are CudaNdarrays. My reading is that this is a Theano limitation and that there are only workarounds. One way is to:

  1. save_params_to on the GPU machine
  2. initialize an identical NeuralNet on the CPU machine
  3. load_params_from on that machine.
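For reference, the three steps above can be sketched with a minimal stand-in; the TinyNet class below is hypothetical and only mimics nolearn's save_params_to/load_params_from interface:

```python
import os
import pickle
import tempfile

class TinyNet:
    """Hypothetical stand-in for nolearn's NeuralNet parameter API."""
    def __init__(self):
        self.params = {}

    def save_params_to(self, fname):
        # Persist only the parameter values, never the compiled graph.
        with open(fname, 'wb') as f:
            pickle.dump(self.params, f, -1)

    def load_params_from(self, fname):
        with open(fname, 'rb') as f:
            self.params = pickle.load(f)

fname = os.path.join(tempfile.mkdtemp(), 'weights.pkl')

# 1. On the GPU machine: save the trained parameters.
gpu_net = TinyNet()
gpu_net.params = {'w': [1.0, 2.0], 'b': [0.5]}
gpu_net.save_params_to(fname)

# 2. + 3. On the CPU machine: build an identically configured net,
# then load the saved parameter values into it.
cpu_net = TinyNet()
cpu_net.load_params_from(fname)
```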

However, in some situations this requires quite some extra effort, e.g. when the NeuralNet is part of an sklearn Pipeline. You then have to:

  1. save the Pipeline's steps except for the one containing the NeuralNet on the GPU machine
  2. save the latter's parameters separately
  3. load the Pipeline on the CPU machine
  4. add a fresh NeuralNet to the right place in the Pipeline
  5. load the parameters to that net.
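Those five steps can be sketched with toy stand-ins; all class and file names here are hypothetical, Net only mimics nolearn's parameter API, and the pipeline is reduced to a plain list of named steps:

```python
import os
import pickle
import tempfile

class Scaler:
    """Toy preprocessing step (hypothetical)."""

class Net:
    """Toy net with a nolearn-like parameter save/load interface."""
    def __init__(self):
        self.params = {}
    def save_params_to(self, fname):
        with open(fname, 'wb') as f:
            pickle.dump(self.params, f, -1)
    def load_params_from(self, fname):
        with open(fname, 'rb') as f:
            self.params = pickle.load(f)

tmpdir = tempfile.mkdtemp()
steps_file = os.path.join(tmpdir, 'steps.pkl')
params_file = os.path.join(tmpdir, 'net_params.pkl')

# On the GPU machine: pickle every step except the net (step 1),
# and save the net's parameters separately (step 2).
pipeline = [('scale', Scaler()), ('net', Net())]
pipeline[1][1].params = {'w': [1.0]}
with open(steps_file, 'wb') as f:
    pickle.dump([s for s in pipeline if s[0] != 'net'], f, -1)
pipeline[1][1].save_params_to(params_file)

# On the CPU machine: load the other steps (step 3), insert a fresh
# net in the right place (step 4), and load its parameters (step 5).
with open(steps_file, 'rb') as f:
    steps = pickle.load(f)
fresh_net = Net()
fresh_net.load_params_from(params_file)
steps.append(('net', fresh_net))
```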

I wonder if there is a better way. Maybe it is possible to add a method to NeuralNet that converts CudaNdarrays to ordinary Theano shared variables? Does anyone know of a better approach?

@dnouri (Owner) commented Apr 1, 2016

You're probably seeing the effect of this change.

Can you try setting config.reoptimize_unpickled_function to True?
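In case it helps anyone trying this, Theano config flags like that one can also be set through the environment; a hedged sketch, where the script name is a placeholder and the flag's availability depends on your Theano version:

```shell
# Set the flag for a single run; predict.py stands in for whatever
# script unpickles the trained network.
THEANO_FLAGS='reoptimize_unpickled_function=True' python predict.py
```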

@BenjaminBossan (Collaborator, Author)

I was referring to the same problem as mentioned here. The option you mentioned would not solve this, would it? (I can't test it right now.)

@dnouri (Owner) commented Apr 1, 2016

That thread is a little confusing, with multiple issues. Can you paste the traceback that you're getting?

@BenjaminBossan (Collaborator, Author)

I get the same "Cuda not found. Cannot unpickle CudaNdarray" error. Shared variables are saved as CudaNdarrays, which you can't load on a machine without CUDA. One suggestion would be a method that somehow implements the solution proposed in that thread. I'm not sure, though, whether this would cover all use cases and whether there might not be a better solution.

@BenjaminBossan (Collaborator, Author)

I have a solution that seems to work:

import os
import pickle
import tempfile

class PortableNeuralNet(NeuralNet):
    def __setstate__(self, state):  # BBB for pickles that don't have the graph
        with tempfile.TemporaryDirectory() as tmpdirname:
            filename = os.path.join(tmpdirname, 'tmp_weights.pkl')
            with open(filename, 'wb') as f:
                pickle.dump(state['_params_temp_save'], f, -1)

            del state['_params_temp_save']
            self.__dict__.update(state)

            self.initialize()
            self.load_params_from(filename)

    def __getstate__(self):
        state = dict(self.__dict__)
        params = self.get_all_params_values()

        for key in list(state.keys()):  # copy keys to avoid RuntimeError from mutating the dict
            if key == 'train_history_':
                continue
            if key.endswith('_'):
                del state[key]
        del state['_output_layer']
        del state['_initialized']
        state['_params_temp_save'] = params
        return state

I don't know whether this is worth integrating or not. Instead of a new class, it could be a switch in the NeuralNet class. What do you think, Daniel?

@dnouri (Owner) commented Apr 6, 2016

@BenjaminBossan Do you mind describing the difference between this and the implementation that was removed in #228?

@dnouri (Owner) commented Apr 6, 2016

I looked at the problem a little more and now understand the issue better. The code that was removed in #228 had the same issue, since it did not do anything with the layer instances (layers_) in __getstate__.

So then I tried to come up with my own variation of the code that you proposed:

class YetAnotherPortableNeuralNet(NeuralNet):
    def __setstate__(self, state):
        params = state.pop('__params__', None)
        self.__dict__.update(state)
        self.initialize()
        if params is not None:
            self.load_params_from(params)

    def __getstate__(self):
        state = dict(self.__dict__)
        if self._initialized:
            params = self.get_all_params_values()
        else:
            params = None

        for attr in (
            'train_iter_',
            'eval_iter_',
            'predict_iter_',
            '_initialized',
            '_get_output_fn_cache',
            '_output_layer',
            'layers_',
            'layers',
                ):
            if attr in state:
                del state[attr]
        state['__params__'] = params
        return state

I thought your proposal was good and just needed a bit of refactoring: I removed the detour of writing the parameters out to a temporary file, and I wanted to be more explicit about which attributes are removed on the way out.

But then I found out that this approach has its own problems. Namely, if self.layers is already a list of layer instances, it won't work: those instances contain the CUDA arrays, which will then be pickled. Deleting those instances on the way out doesn't work either, for obvious reasons.
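The pitfall can be reproduced with plain Python: deleting a top-level attribute in __getstate__ does not help if the same object stays reachable through another attribute that remains in the state. The classes below are toy stand-ins, not nolearn code; DeviceArray plays the role of a CudaNdarray by refusing to be pickled:

```python
import pickle

class DeviceArray:
    """Stand-in for a CudaNdarray: refuses to be pickled."""
    def __reduce__(self):
        raise TypeError('cannot pickle device array')

class Layer:
    def __init__(self):
        self.W = DeviceArray()   # parameter lives on the "GPU"

class Net:
    def __init__(self):
        layer = Layer()
        self.layers = [layer]        # user-supplied layer instances
        self.layers_ = {'l0': layer} # built layers

    def __getstate__(self):
        state = dict(self.__dict__)
        del state['layers_']  # dropping the built layers is not enough:
        return state          # state['layers'] still holds the same Layer

net = Net()
try:
    pickle.dumps(net)
    pickled_ok = True
except TypeError:
    pickled_ok = False  # the DeviceArray is still reachable via `layers`
```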

I'm thinking that as long as we can't fix this in the general case, we shouldn't put code like this into nolearn.lasagne itself. But we can point people to solutions that might work for them. One such solution might be this script inside of pylearn2, which I'm about to try out.

@BenjaminBossan (Collaborator, Author)

But then I found out that this approach has its own problems. Namely, if self.layers is already a list of layer instances

Right, I did not think about that possibility. We could raise an error in that case, but that is not a satisfying solution.

One such solution might be this script inside of pylearn2 which I'm about to try out.

I believe @alattner tried that to no avail.

I'm thinking that as long as we can't fix this in the general case, we shouldn't put code like this into nolearn.lasagne itself.

I agree, but it would be nice to be able to use this kludge somehow, e.g. by checking out a specific nolearn branch.

@alattner commented Apr 6, 2016

I believe @alattner tried that to no avail.

No, I haven't tried that script inside pylearn2. I tried the config.experimental.unpickle_gpu_on_cpu option with no success.
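For completeness, this is how such an experimental flag would be set from the environment; treat the exact spelling and the script name as assumptions that depend on your Theano version:

```shell
# load_model.py stands in for whatever script unpickles the network.
THEANO_FLAGS='experimental.unpickle_gpu_on_cpu=True' python load_model.py
```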

@dnouri (Owner) commented Apr 6, 2016

I tried the script and it failed with some weird recursion error.

@BenjaminBossan (Collaborator, Author)

sys.setrecursionlimit(10 ** 999)

@dnouri (Owner) commented Apr 6, 2016

OK just let me know if that's a joke or if it actually works. ;-)

@BenjaminBossan (Collaborator, Author)

I would not try it :)

Anyway, do you see a working solution for this?

@dnouri (Owner) commented Apr 7, 2016

I'll take another look next week. So far I haven't had much luck.

@BenjaminBossan (Collaborator, Author)

So much for not breaking code in this PR :)

For those who use this snippet, in the part shown below, change '_output_layer' to '_output_layers':

        for attr in (
            'train_iter_',
            'eval_iter_',
            'predict_iter_',
            '_initialized',
            '_get_output_fn_cache',
            '_output_layer',
            'layers_',
            'layers',
                ):
            if attr in state:
                del state[attr]

@JamesOwers

Is there any update on how to train on GPU, save, and load on CPU for inference?

@dnouri (Owner) commented Jan 25, 2017

@kungfujam Note that as per the original post, you can always do this:

  1. save_params_to on the GPU machine
  2. initialize an identical NeuralNet on the CPU machine
  3. load_params_from on that machine.

This issue is about being unable to take a network pickled after training on a GPU and use it in a CPU environment, which is sometimes more convenient.

dnouri changed the title from "Training with GPU, inference on CPU" to "Training with GPU, inference on CPU with a pickled model" on Jan 25, 2017
@JamesOwers

Thanks for the clarification. I am using that method currently. Not a huge deal, but it would be nice to be able to pickle.
