This project provides a setup script that deploys an all-in-one PredictionIO environment for development.
- Java 8
Run the following command to deploy PredictionIO with Spark and Elasticsearch:
./bin/pio-setup deploy
This command creates the predictionio, spark and elasticsearch directories.
To start PredictionIO, run:
./bin/pio-setup start
To stop PredictionIO, run:
./bin/pio-setup stop
To remove all deployed directories, run:
./bin/pio-setup clean
This removes the target, predictionio, spark and elasticsearch directories.
pio-setup provides template commands.
To get a template repository from GitHub, run pio-setup template get [user] [repository].
For example, to clone the recommender template from apache/incubator-predictionio-template-recommender, run:
./bin/pio-setup template get apache incubator-predictionio-template-recommender
You can also use the short template name instead of the user and full repository name:
./bin/pio-setup template get recommender
Cloned repositories are put into the templates directory.
To use a template on PredictionIO, you need to register it as a PredictionIO app. The init sub-command does this:
./bin/pio-setup template init recommender
The import sub-command invokes a Python script to import training data. (Note: for the recommender template, you need to put sample_movielens_data.txt in the data directory.)
./bin/pio-setup template import recommender
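As a rough illustration of what an import step does with sample_movielens_data.txt, the sketch below turns rating lines into event dictionaries. It assumes the file uses the user::item::rating layout of the sample data shipped with Spark and that ratings are stored as "rate" events. The function name and the event fields are illustrative assumptions, not taken from the template's import script.

```python
import json


def rating_line_to_event(line):
    """Convert one 'user::item::rating' line into an event dictionary.

    The '::' layout and the 'rate' event name are assumptions; check the
    template's import script for the format it actually expects.
    """
    user, item, rating = line.strip().split("::")
    return {
        "event": "rate",
        "entityType": "user",
        "entityId": user,
        "targetEntityType": "item",
        "targetEntityId": item,
        "properties": {"rating": float(rating)},
    }


if __name__ == "__main__":
    # Two made-up lines in the assumed format.
    for sample in ["0::2::3", "1::5::4.5"]:
        print(json.dumps(rating_line_to_event(sample)))
```

Running the sketch against a few lines of your data file is a quick way to spot malformed rows before starting the real import.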
The build sub-command runs pio build to compile the template.
(Note: for the recommender template, you need to set scalaVersion := "2.11.0" in build.sbt, and also uncomment sc.setCheckpointDir("checkpoint") in ALSAlgorithm.scala if a StackOverflowError occurs.)
./bin/pio-setup template build recommender
To run the training process, use the train sub-command:
./bin/pio-setup template train recommender
To launch the predict API, run the deploy sub-command:
./bin/pio-setup template deploy recommender
For the recommender template, send the following request to check the predict API response:
curl -H "Content-Type: application/json" -d '{ "user": "1", "num": 4 }' http://localhost:8000/queries.json
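The same query can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above; the helper names build_query_request and send_query are hypothetical, and the response is simply decoded as JSON without assuming its fields.

```python
import json
import urllib.request


def build_query_request(payload, url="http://localhost:8000/queries.json"):
    """Build a POST request carrying the JSON query, like the curl call."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(payload, url="http://localhost:8000/queries.json"):
    """Send the query and return the decoded JSON response.

    Requires the predict API to be running (see the deploy sub-command).
    """
    with urllib.request.urlopen(build_query_request(payload, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the predict API running, send_query({"user": "1", "num": 4}) issues the same request as the curl command.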
To stop the predict API, run the undeploy sub-command:
./bin/pio-setup template undeploy recommender
The following repositories are used:
- Meta Data: Elasticsearch
- Event Data: Elasticsearch
- Model Data: Local FS
This project contains the following directories:
- bin: Executable files
- predictionio: PredictionIO
- spark: Spark
- elasticsearch: Elasticsearch
- target: Temporary files
- templates: Template Repositories
This section walks through template development with pio-setup, using predictionio-template-iris as an example.
First of all, download and build the PredictionIO environment, including Spark and Elasticsearch.
git clone https://github.com/jpioug/predictionio-setup.git
cd predictionio-setup
./bin/pio-setup deploy
If the deploy command succeeds, run the start command:
./bin/pio-setup start
You can check the status of PredictionIO with the following command:
./bin/pio-setup status
This walkthrough uses predictionio-template-iris. The get sub-command downloads it from jpioug/predictionio-template-iris on GitHub.
./bin/pio-setup template get iris
To use it on PredictionIO, you need to register it as a PredictionIO application.
The init sub-command invokes the pio app new command.
./bin/pio-setup template init iris
To fit a learning model from a training dataset, you need to insert the data into the PredictionIO event server.
The import sub-command runs python data/import_eventserver.py with the app's access key.
./bin/pio-setup template import iris
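For orientation, importing data comes down to posting events to the event server's REST API together with the app's access key. The sketch below builds one such request with the standard library. It assumes the event server listens on PredictionIO's default address http://localhost:7070. The access key placeholder, the "$set" event, the entity type and the property names are hypothetical and may differ from what data/import_eventserver.py in the template actually sends.

```python
import json
import urllib.parse
import urllib.request


def build_event_request(access_key, event, base_url="http://localhost:7070"):
    """Build a POST request that stores one event on the event server."""
    query = urllib.parse.urlencode({"accessKey": access_key})
    return urllib.request.Request(
        f"{base_url}/events.json?{query}",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical event: one sample stored as properties of an entity.
# The entity type and property names are assumptions for illustration.
example_event = {
    "event": "$set",
    "entityType": "iris",
    "entityId": "0",
    "properties": {"attr0": 5.1, "attr1": 3.5, "attr2": 1.4, "attr3": 0.2},
}
```

Passing the result of build_event_request("YOUR_ACCESS_KEY", example_event) to urllib.request.urlopen would send the event; the access key is the one generated when the app was registered.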
Move to the template directory:
cd templates/predictionio-template-iris/
This template includes a Jupyter notebook, which you can launch with the following command.
(If you are not using pyenv, adjust the environment variables accordingly.)
PYSPARK_PYTHON=$PYENV_ROOT/shims/python PYSPARK_DRIVER_PYTHON=$PYENV_ROOT/shims/jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" ../../predictionio/bin/pio-shell --with-pyspark
This template contains eda.ipynb, which runs a sample data analysis and creates a learning model. After finishing your work, download the notebook as Python code and copy it into train.py.
Move back to the predictionio-setup directory:
cd ../..
This template contains some Scala code.
The build sub-command runs pio build in the template directory.
./bin/pio-setup template build iris
To fit a learning model with the template, run the train sub-command.
This command invokes pio train.
./bin/pio-setup template train iris
The deploy sub-command launches the predict REST API.
./bin/pio-setup template deploy iris
You can check it with the following request:
curl -s -H "Content-Type: application/json" -d '{"attr0":5.1,"attr1":3.5,"attr2":1.4,"attr3":0.2}' http://localhost:8000/queries.json
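When the feature values come from code rather than being typed by hand, the JSON body can be generated. The sketch below maps a list of values to the attr0, attr1, ... keys used in the request above; the helper name iris_query_body is hypothetical.

```python
import json


def iris_query_body(features):
    """Map a list of feature values to the attr0..attrN JSON body."""
    return json.dumps({f"attr{i}": value for i, value in enumerate(features)})


if __name__ == "__main__":
    # Same values as in the curl request above.
    print(iris_query_body([5.1, 3.5, 1.4, 0.2]))
```

The printed string can be passed to curl with -d in place of the literal JSON.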
To stop the predict API, run the undeploy sub-command:
./bin/pio-setup template undeploy iris
docker build --rm -t jpioug/pio-setup .
docker-compose up --abort-on-container-exit