All scripts assume you are in the mlengine folder.
python trainer/task.py
or
gcloud ml-engine local train --module-name trainer.task --package-path trainer
or with custom hyperparameters:
gcloud ml-engine local train --module-name trainer.task --package-path trainer -- --hp-iterations 3000 --hp-dropout 0.5
Do not forget the bare -- separating gcloud's parameters from your own. The list of tunable hyperparameters is displayed on each run. Check the code to add more.
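For reference, command-line hyperparameters like --hp-iterations and --hp-dropout are typically wired up with argparse. This is only a sketch of how trainer/task.py might declare them (the defaults and the exact flag set here are assumptions; the real definitions live in the sample's code):

```python
# Hypothetical sketch of hyperparameter flags as trainer/task.py might
# declare them; defaults shown here are illustrative, not the sample's.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--hp-iterations", type=int, default=10000,
                    help="number of training iterations")
parser.add_argument("--hp-dropout", type=float, default=0.75,
                    help="dropout keep probability")

# parse_known_args ignores flags it does not know about, which lets
# gcloud-injected arguments pass through untouched
args, _ = parser.parse_known_args(["--hp-iterations", "3000",
                                   "--hp-dropout", "0.5"])
print(args.hp_iterations, args.hp_dropout)  # 3000 0.5
```

Everything after the bare -- on the gcloud command line is forwarded to this parser.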
(jobXXX, jobs/jobXXX, <project> and <bucket> must be replaced with your own values)
gcloud ml-engine jobs submit training jobXXX --job-dir gs://<bucket>/jobs/jobXXX --project <project> --config config.yaml --module-name trainer.task --package-path trainer --runtime-version 1.4
--runtime-version specifies the version of TensorFlow to use.
To launch the same training with hyperparameter tuning, use the config-hptune.yaml configuration instead (again replacing jobXXX, jobs/jobXXX, <project> and <bucket> with your own values):
gcloud ml-engine jobs submit training jobXXX --job-dir gs://<bucket>/jobs/jobXXX --project <project> --config config-hptune.yaml --module-name trainer.task --package-path trainer --runtime-version 1.4
Use the Cloud ML Engine UI to create a model and a version from the export produced by your training run. You will find it in this folder:
gs://<bucket>/jobs/jobXXX/export/Servo/XXXXXXXXXX
Set this version of the model as the default version, then create the JSON payload. You can use the provided script:
python digits.py > digits.json
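The --json-instances format is one JSON object per line, one per prediction instance. As a rough sketch of what digits.py produces (the "image" key and the flattened 28x28 pixel layout are assumptions; the real instance format is dictated by the serving input function in trainer/task.py):

```python
# Minimal sketch of a digits.json payload: newline-delimited JSON, one
# instance per line. Key name and pixel layout are assumptions; check
# the serving input function for the actual expected format.
import json

def make_instance(pixels):
    # pixels: 784 floats in [0, 1] for one flattened 28x28 MNIST digit
    return json.dumps({"image": pixels})

blank = [0.0] * 784  # stand-in image; digits.py uses real test digits
payload = "\n".join(make_instance(blank) for _ in range(5))
print(payload.count("\n") + 1)  # 5 instances, one per line
```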
Then call the online predictions service, replacing <model_name> with the name you have assigned:
gcloud ml-engine predict --model <model_name> --json-instances digits.json
It should return a perfect scorecard:
CLASSES | PREDICTIONS
---|---
8 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
7 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
7 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
5 | [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
5 | [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
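Each PREDICTIONS row is a probability vector over the ten digit classes, so the predicted class is simply its argmax. A pure-Python sketch of reading one row:

```python
# Map a 10-way probability vector (one PREDICTIONS row) to its class.
def predicted_class(probs):
    # index of the largest probability = predicted digit
    return max(range(len(probs)), key=lambda i: probs[i])

row = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
print(predicted_class(row))  # 8
```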
You can also simulate the prediction service locally, replacing XXXXX with the number of your exported model directory:
gcloud ml-engine local predict --model-dir checkpoints/export/Servo/XXXXX --json-instances digits.json
You can read more about batch norm here.
If you want to experiment with TF Records, the standard TensorFlow data format, you can run this script (available in the TensorFlow distribution) to reformat the MNIST dataset into TF Records. It is not necessary for this sample, though.
python <YOUR-TF-DIR>/tensorflow/examples/how_tos/reading_data/convert_to_records.py --directory=data --validation_size=0
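Under the hood, a TFRecord file is just a sequence of length-prefixed byte records: an 8-byte little-endian length, a 4-byte checksum of the length, the record bytes, and a 4-byte checksum of the data. This pure-Python sketch mimics that wire format for illustration only, with the checksum fields stubbed to zero; real TensorFlow readers compute and verify masked CRC32C checksums:

```python
# Illustrative sketch of the TFRecord wire format:
#   [uint64 length][uint32 length CRC][data bytes][uint32 data CRC]
# CRC fields are written as zeros and skipped on read here; real
# readers verify masked CRC32C checksums for both fields.
import io
import struct

def write_record(f, data):
    f.write(struct.pack("<Q", len(data)))   # record length, little-endian
    f.write(struct.pack("<I", 0))           # length CRC (placeholder)
    f.write(data)                           # the serialized record
    f.write(struct.pack("<I", 0))           # data CRC (placeholder)

def read_records(f):
    while True:
        header = f.read(8)
        if not header:                      # end of file
            return
        (length,) = struct.unpack("<Q", header)
        f.read(4)                           # skip length CRC
        yield f.read(length)
        f.read(4)                           # skip data CRC

buf = io.BytesIO()
write_record(buf, b"example-proto-bytes")
buf.seek(0)
print(list(read_records(buf)))  # [b'example-proto-bytes']
```

In a real TFRecord file each record would be a serialized tf.train.Example protocol buffer, which is what convert_to_records.py writes.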