This repository has been archived by the owner on Dec 30, 2022. It is now read-only.
I followed all steps from the tutorial (https://cloud.google.com/solutions/automating-iot-machine-learning). I also updated gcloud on my Mac before executing the steps...
Model training fails with the exceptions below, taken from the log.
(It seems I cannot simply download the whole log and share it as a zip file?)
Do you need additional information to help?
2018-07-20 10:46:38.633 CEST worker-replica-1 gapic-google-cloud-logging-v2 0.91.3 has requirement google-gax<0.16dev,>=0.15.7, but you'll have google-gax 0.12.5 which is incompatible.
2018-07-20 10:46:38.634 CEST worker-replica-1 google-cloud-logging 1.0.0 has requirement google-cloud-core<0.25dev,>=0.24.0, but you'll have google-cloud-core 0.28.1 which is incompatible.
2018-07-20 10:46:39.018 CEST worker-replica-1 The script chardetect is installed in '/root/.local/bin' which is not on PATH.
The replica master 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 570, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 329, in main
    run(model, argv)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 465, in run
    dispatch(args, model, cluster, task)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 505, in dispatch
    Trainer(args, model, cluster, task).run_training()
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 206, in run_training
    self.args.batch_size)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 307, in build_train_graph
    return self.build_graph(data_paths, batch_size, GraphMod.TRAIN)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 231, in build_graph
    num_epochs=None if is_training else 2)
  File "/root/.local/lib/python2.7/site-packages/trainer/util.py", line 47, in read_examples
    filename_queue = tf.train.string_input_producer(files, num_epochs, shuffle)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 217, in string_input_producer
    raise ValueError(not_null_err)
ValueError: string_input_producer requires a non-null input tensor

The replica worker 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 570, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 329, in main
    run(model, argv)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 465, in run
    dispatch(args, model, cluster, task)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 509, in dispatch
    Trainer(args, model, cluster, task).run_training()
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 206, in run_training
    self.args.batch_size)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 307, in build_train_graph
    return self.build_graph(data_paths, batch_size, GraphMod.TRAIN)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 231, in build_graph
    num_epochs=None if is_training else 2)
  File "/root/.local/lib/python2.7/site-packages/trainer/util.py", line 47, in read_examples
    filename_queue = tf.train.string_input_producer(files, num_epochs, shuffle)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 217, in string_input_producer
    raise ValueError(not_null_err)
ValueError: string_input_producer requires a non-null input tensor

The replica worker 1 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 570, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 329, in main
    run(model, argv)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 465, in run
    dispatch(args, model, cluster, task)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 509, in dispatch
    Trainer(args, model, cluster, task).run_training()
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 206, in run_training
    self.args.batch_size)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 307, in build_train_graph
    return self.build_graph(data_paths, batch_size, GraphMod.TRAIN)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 231, in build_graph
    num_epochs=None if is_training else 2)
  File "/root/.local/lib/python2.7/site-packages/trainer/util.py", line 47, in read_examples
    filename_queue = tf.train.string_input_producer(files, num_epochs, shuffle)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 217, in string_input_producer
    raise ValueError(not_null_err)
ValueError: string_input_producer requires a non-null input tensor

The replica worker 2 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 570, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 329, in main
    run(model, argv)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 465, in run
    dispatch(args, model, cluster, task)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 509, in dispatch
    Trainer(args, model, cluster, task).run_training()
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 206, in run_training
    self.args.batch_size)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 307, in build_train_graph
    return self.build_graph(data_paths, batch_size, GraphMod.TRAIN)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 231, in build_graph
    num_epochs=None if is_training else 2)
  File "/root/.local/lib/python2.7/site-packages/trainer/util.py", line 47, in read_examples
    filename_queue = tf.train.string_input_producer(files, num_epochs, shuffle)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 217, in string_input_producer
    raise ValueError(not_null_err)
ValueError: string_input_producer requires a non-null input tensor

The replica worker 3 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 570, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 329, in main
    run(model, argv)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 465, in run
    dispatch(args, model, cluster, task)
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 509, in dispatch
    Trainer(args, model, cluster, task).run_training()
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 206, in run_training
    self.args.batch_size)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 307, in build_train_graph
    return self.build_graph(data_paths, batch_size, GraphMod.TRAIN)
  File "/root/.local/lib/python2.7/site-packages/trainer/model.py", line 231, in build_graph
    num_epochs=None if is_training else 2)
  File "/root/.local/lib/python2.7/site-packages/trainer/util.py", line 47, in read_examples
    filename_queue = tf.train.string_input_producer(files, num_epochs, shuffle)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 217, in string_input_producer
    raise ValueError(not_null_err)
ValueError: string_input_producer requires a non-null input tensor

To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=317506745947&resource=ml_job%2Fjob_id%2Fequipmentparts_1_1532076159&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22equipmentparts_1_1532076159%22
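For anyone debugging the same trace: `tf.train.string_input_producer` raises this `ValueError` when the file list it receives is empty or `None`. One possible cause, which is an assumption and not confirmed by the log, is that the training data path passed to the job matched no files. Below is a minimal sketch of a guard that fails earlier with a clearer message. The names `expand_data_paths` and `match_fn` are hypothetical and not part of the tutorial's trainer; `glob.glob` only handles local paths, so a `gs://` path would need a GCS-aware matcher passed in as `match_fn`.

```python
import glob


def expand_data_paths(path_specs, match_fn=glob.glob):
    """Expand comma-separated path patterns into a sorted list of files.

    Hypothetical helper: raises a descriptive ValueError when nothing
    matches, rather than letting an empty list reach
    tf.train.string_input_producer.
    """
    files = []
    for spec in path_specs:
        # Each spec may hold several patterns separated by commas.
        for pattern in spec.split(','):
            files.extend(match_fn(pattern))
    if not files:
        raise ValueError('No input files matched: %r' % (path_specs,))
    return sorted(files)
```

Calling such a helper before building the graph would make the job report which path pattern came back empty, rather than the generic "non-null input tensor" message.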