
LSTM example raises a dtype difference error #34

Open
sjhddh opened this issue Jul 19, 2015 · 2 comments

sjhddh commented Jul 19, 2015

Hello,
I used exactly the same example as Passage/mnist.py; the only modification is changing GatedRecurrent to LstmRecurrent:

import ...
...

trX, teX, trY, teY = load_mnist()

#Use generic layer - RNN processes a size 28 vector at a time scanning from left to right
layers = [
    Generic(size=28),
    LstmRecurrent(size=512, p_drop=0.2),
    Dense(size=10, activation='softmax', p_drop=0.5)
]

#A bit of l2 helps with generalization, higher momentum helps convergence
updater = NAG(momentum=0.95, regularizer=Regularizer(l2=1e-4))

#Linear iterator for real valued data, cce cost for softmax
model = RNN(layers=layers, updater=updater, iterator='linear', cost='cce')
model.fit(trX, trY, n_epochs=20)

tr_preds = model.predict(trX[:len(teY)])
te_preds = model.predict(teX)

tr_acc = np.mean(trY[:len(teY)] == np.argmax(tr_preds, axis=1))
te_acc = np.mean(teY == np.argmax(te_preds, axis=1))

# Test accuracy should be between 98.9% and 99.3%
print 'train accuracy', tr_acc, 'test accuracy', te_acc

However, this raised an error:

Traceback (most recent call last):
  File "/.../ex2.py", line 24, in <module>
    model = RNN(layers=layers, updater=updater, iterator='linear', cost='cce')
  File "/.../models.py", line 44, in __init__
    self.y_tr = self.layers[-1].output(dropout_active=True)
  File "/.../layers.py", line 297, in output
    X = self.l_in.output(dropout_active=dropout_active)
  File "/.../layers.py", line 190, in output
    truncate_gradient=self.truncate_gradient
  File "/.../theano/scan_module/scan.py", line 1042, in scan
    scan_outs = local_op(*scan_inputs)
  File "/.../theano/gof/op.py", line 507, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/.../theano/scan_module/scan_op.py", line 374, in make_node
    inner_sitsot_out.type.dtype))
ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (`outputs_info` in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 4) has dtype float32, while the result of the inner function (`fn`) has dtype float64. This can happen if the inner function of scan results in an upcast or downcast.

How can I fix this, or is there anything I can do to make the program run smoothly?
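For context on what scan is complaining about: the initial state of the recurrence is float32 (Theano's floatX), but the inner step produces float64, usually because some array or weight was created without an explicit dtype. A minimal NumPy sketch of the same silent upcast (variable names are illustrative, not from Passage):

```python
import numpy as np

# Initial recurrent state, allocated as float32 (matching floatX=float32)
h0 = np.zeros(4, dtype=np.float32)

# An array created without an explicit dtype defaults to float64
W = np.ones(4)

# float32 * float64 upcasts the result to float64 -- the same dtype
# mismatch that scan reports between outputs_info and the inner function
h1 = h0 * W
assert h1.dtype == np.float64
```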

@naeemulhassan
I was having the same problem. Running it like the following worked for me:

THEANO_FLAGS='floatX=float32'  python myprogram.py
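If the flag alone doesn't resolve it, explicitly casting the input arrays to float32 before fitting can also avoid the upcast. A sketch with stand-in data (I'm assuming load_mnist returns float64 arrays here; that may not hold in your setup):

```python
import numpy as np

# Stand-in for the arrays returned by load_mnist()
trX = np.random.rand(100, 28, 28)    # NumPy defaults to float64

# Cast so the data matches Theano's float32 graph
trX = trX.astype(np.float32)
assert trX.dtype == np.float32
```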

@madisonmay
Contributor

You can also configure these settings in your ~/.theanorc file.

[global]
floatX = float32
