Investigate integration with vetiver #163
To get appropriate versioning support, I imagine this will require rstudio/pins-r#572 to be implemented. The deployment piece on its own doesn't necessarily require the model object to be stored as a pin. |
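As a rough local illustration of what pin versioning looks like, here is a minimal sketch. It assumes a temporary board from `pins::board_temp()` stands in for whatever remote board would eventually be used; the pin name `demo_model` and the toy data frames are hypothetical and not part of this issue.

```r
library(pins)

# Temporary local board standing in for a remote one (assumption for illustration)
board <- board_temp(versioned = TRUE)

# Write two different objects under the same hypothetical pin name
pin_write(board, data.frame(x = 1:3), name = "demo_model", type = "rds")
Sys.sleep(1) # keep the version timestamps distinct
pin_write(board, data.frame(x = 4:6), name = "demo_model", type = "rds")

# Each write is kept as a separate version; reading without a version
# returns the most recent one
versions <- pin_versions(board, "demo_model")
latest <- pin_read(board, "demo_model")
```

With a versioned board, an earlier model can still be read back by passing its `version` to `pin_read()`.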
A setup script is here:

library(parsnip)
library(workflows)
data(Sacramento, package = "modeldata")
rf_spec <- rand_forest(mode = "regression")
rf_form <- price ~ type + sqft + beds + baths
rf_fit <-
workflow(rf_form, rf_spec) %>%
fit(Sacramento)
library(vetiver)
v <- vetiver_model(rf_fit, "sacramento_rf")
root <- file.path("inst","vetiver")
library(pins)
model_board <- board_folder(file.path(root,"plumber/pins"))
model_board %>% vetiver_pin_write(v)
library(googleCloudRunner)
# the docker takes a long time to install arrow so build it first to cache
repo <- cr_buildtrigger_repo("MarkEdmondson1234/googleCloudRunner",
branch = "vetiver")
#cr_buildtrigger_delete("docker-vetiver")
cr_deploy_docker_trigger(repo, "vetiver",
location = "inst/vetiver/docker/",
includedFiles = "inst/vetiver/**",
projectId_target = "gcer-public",
timeout = 3600)
cr_deploy_plumber(file.path(root,"plumber"))

I changed the plumber deployment:

pr <- plumber::plumb("api.R")
pr <- vetiver::vetiver_pr_predict()
pr$run(host = "0.0.0.0", port = as.numeric(Sys.getenv("PORT")), swagger = TRUE)

The main bottleneck at the moment is getting a Docker image with |
The |
There has been some discussion of making the arrow dependency optional. You might want to check out rstudio/pins-r#537 and see if anything in there helps. FWIW arrow isn't really needed for the model publishing use case. |
Makes sense, yes, it seemed like a lot of installation for features that aren't used. I've left a comment to see if there is a way, though, since it would be nice to have an arrow image available. |
The Docker image now builds in about 20 minutes, so it is available at

I haven't seen the actual plumber router being modified before, so I made a new script file to load that in. This would be fairly boilerplate though, I think:

#server.r
pr <- plumber::plumb("api.R")
v <- vetiver::vetiver_pin_read(pins::board_folder("pins"), name = "sacramento_rf")
pr <- vetiver::vetiver_pr_predict(pr, v, debug = TRUE)
pr$run(host = "0.0.0.0", port = as.numeric(Sys.getenv("PORT")), swagger = TRUE)

It's built on top of the example plumber script I have, so there are endpoints at /plot and /hello too. I think it would be nice to make a PubSub target for it. How would vetiver work within an api.R script? This successfully deployed with this simple Docker - I guess in real life some more dependencies or renv lockfiles could be involved.
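One small robustness note on the `pr$run()` call: Cloud Run injects the `PORT` environment variable, but when the same script is run locally `PORT` is usually unset and `as.numeric(Sys.getenv("PORT"))` evaluates to `NA`. A minimal sketch of a fallback follows; `get_port()` and the default of 8080 are hypothetical and not part of the deployed script.

```r
# Hypothetical helper: read the port from the PORT environment variable,
# falling back to a default when it is unset, empty, or not a number.
get_port <- function(default = 8080L) {
  port <- suppressWarnings(as.integer(Sys.getenv("PORT", unset = NA)))
  if (is.na(port)) default else port
}

# Possible use in server.R (assumption, mirroring the call above):
# pr$run(host = "0.0.0.0", port = get_port(), swagger = TRUE)
```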
Example endpoint live at

Runs the example from the vetiver docs:

data(Sacramento, package = "modeldata")
new_sac <- Sacramento %>%
slice_sample(n = 20) %>%
select(type, sqft, beds, baths)
endpoint <- vetiver::vetiver_endpoint("https://vetiver-ewjogewawq-ew.a.run.app/predict")
predict(endpoint, new_sac)
# A tibble: 20 x 1
.pred
<dbl>
1 236325.
2 427492.
3 417112.
4 258001.
5 339775.
...

In real life you could also add a build trigger for any changes to the R script that builds the model, to update the deployment as needed. With the pins integration calling outside services such as GCS, this would be needed less often. The full setup script is below:

library(parsnip)
library(workflows)
data(Sacramento, package = "modeldata")
rf_spec <- rand_forest(mode = "regression")
rf_form <- price ~ type + sqft + beds + baths
rf_fit <-
workflow(rf_form, rf_spec) %>%
fit(Sacramento)
library(vetiver)
v <- vetiver_model(rf_fit, "sacramento_rf")
root <- file.path("inst","vetiver")
library(pins)
model_board <- board_folder(file.path(root,"plumber/pins"))
model_board %>% vetiver_pin_write(v)
library(googleCloudRunner)
# the docker takes a long time to install arrow so build it first to cache
repo <- cr_buildtrigger_repo("MarkEdmondson1234/googleCloudRunner",
branch = "vetiver")
#cr_buildtrigger_delete("docker-vetiver")
cr_deploy_docker_trigger(repo, "vetiver",
location = "inst/vetiver/docker/",
includedFiles = "inst/vetiver/**",
projectId_target = "gcer-public",
timeout = 3600)
# use the vetiver docker image built above to deploy a Cloud Run instance of the model
# deploys folder with api.R, Dockerfile, pins/ and server.R contained
run <- cr_deploy_plumber(file.path(root,"plumber"), remote = "vetiver")
# on successful deployment
endpoint <- vetiver::vetiver_endpoint(paste0(run$status$url, "/predict"))
library(tidyverse)
data(Sacramento, package = "modeldata")
new_sac <- Sacramento %>%
slice_sample(n = 20) %>%
select(type, sqft, beds, baths)
predict(endpoint, new_sac)
# A tibble: 20 x 1
.pred
<dbl>
1 236325.
2 427492.
3 417112.
4 258001.
5 339775.
... |
Folder structure of working deployment here https://github.com/MarkEdmondson1234/googleCloudRunner/tree/vetiver/inst/vetiver |
I've been working lately on generating Docker containers more, if you'd like to take a look and give any feedback. This demo might be helpful for how I am setting things up. |
Thanks very much, will take a look. |
https://vetiver.tidymodels.org/