FastQA Throws InvalidArgumentError on ConcatOp #390

Open
antonyscerri opened this issue Sep 7, 2018 · 2 comments

@antonyscerri

Hi

Happy to try and provide more details/data as required. The minimal details of the problem: while training a FastQA model using TensorFlow with the latest code as of 24th August, we are getting the following error with some mixes of data (stack trace below). We are using TensorFlow 1.10.0 due to the machine build, so it may be some incompatibility.

We can run on other similar datasets without any problem. I've looked for empty inputs but nothing obvious has jumped out. We are inputting quite short questions with supporting content of about 2000 characters; we have also seen the error with much longer content. I wasn't sure if some mix of the words in the question or answer and the vocabulary (using GloVe 6B word vectors) was causing an empty tensor. It looks like it's not so straightforward and it's in the internal graph computation, but I'd like some assistance or to hear if anyone else has experienced anything similar.

I was able to reduce one example to two input records which I could put through a newly initialised model using the prediction call (not training), and this caused the same problem. Yet if you pass the records in individually, both return a prediction without any error.

Thanks

Tony

Traceback (most recent call last):
File ".../lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
return fn(*args)
File ".../lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File ".../lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Expected concatenating dimensions in the range [0, 0), but got 0
[[Node: jtreader/fast_qa/cond_1/segment_top_k/concat = ConcatV2[N=2, T=DT_INT64, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](jtreader/fast_qa/cond_1/segment_top_k/Squeeze, jtreader/fast_qa/cond_1/segment_top_k/sub_1, jtreader/fast_qa/cond_2/GatherV2/axis)]]
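
For reference, the ConcatOp message itself appears to mean that tf.concat was handed rank-0 (scalar) tensors at run time, so axis 0 falls outside the valid range [0, 0). A minimal standalone sketch (placeholder shapes and names are illustrative only, nothing to do with the actual FastQA graph) reproduces the same error under TensorFlow 1.x:

import tensorflow as tf

# Unknown-rank placeholders, so shape inference cannot reject the concat
# at graph-construction time (mirroring the dynamic shapes in segment_top_k).
a = tf.placeholder(tf.int64, shape=None)
b = tf.placeholder(tf.int64, shape=None)
c = tf.concat([a, b], axis=0)

with tf.Session() as sess:
    # Feeding scalars makes both inputs rank 0, so the kernel raises:
    # InvalidArgumentError: ConcatOp : Expected concatenating dimensions
    # in the range [0, 0), but got 0
    sess.run(c, feed_dict={a: 1, b: 2})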

@dirkweissenborn
Collaborator

Hi,
I think I know what the bug is but cannot fix it right now. If my suspicions are correct, a quick fix is to make sure that the number of examples in the dataset does not leave a remainder of 2 when divided by the batch size. So either change the batch size or change the size of the dataset slightly (e.g., add another example or remove one). Please let me know if this works.
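
Something along these lines should work as a stop-gap until a proper fix lands; examples and batch_size are just placeholders for your own data loading, not tied to any particular API:

def avoid_remainder_two(examples, batch_size):
    # Workaround sketch: drop (or alternatively duplicate) one example so
    # that the dataset size never leaves a remainder of 2 modulo batch_size.
    if len(examples) % batch_size == 2:
        return examples[:-1]
    return examples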

@antonyscerri
Author

Hi

Thanks for that hint. I ran a whole series of runs over various data splits and I can confirm that all the ones which failed had a remainder of 2 when dividing the test-set portion by the batch size. It seems it's OK to have a remainder of 2 in the training portion.

Thanks

Tony
