Long overdue refactoring for quality of life and performance improvements. #140

Merged
merged 213 commits into from
Aug 24, 2023

Conversation

n8mellis
Contributor

This PR represents a near-total refactoring of the existing Chassis code to fix bugs and add some long-awaited features. Even though this is a significant refactor, there should be no* breaking changes for users of the Chassis SDK.

Notable improvements:

  1. Chassis now supports local Docker builds. Using a remote build server is still supported but is no longer required.
  2. The dependency on MLFlow has been removed. Any function that can be cloudpickle'd is now supported (MLFlow also uses cloudpickle under the hood.)
  3. Models now support multiple inputs and outputs per inference.
  4. The number of pip dependencies needed to install the Chassis SDK has been dramatically reduced. Features that need additional packages (like kserve support) are available as optional dependencies, so those packages are installed only when requested.
  5. The use of conda to satisfy dependencies is no longer required. This, combined with the removal of MLFlow, results in up to a 10x size reduction in the generated model container.
  6. Due to all of the above, container build times are now 2-5x faster.
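
As a rough illustration of items 2 and 3, here is a minimal sketch of what a plain Python inference function with several named inputs and outputs could look like. The dict-of-bytes signature, the function name `predict`, and the keys `text`, `params`, `summary.txt`, and `stats.json` are all assumptions made for this example; the PR description does not specify the exact interface. The function depends only on the standard library, which is the kind of self-contained callable that cloudpickle can typically serialize.

```python
import json
from typing import Mapping


def predict(inputs: Mapping[str, bytes]) -> Mapping[str, bytes]:
    """Hypothetical inference function: two named inputs, two named outputs."""
    # Decode each named input. The keys "text" and "params" are illustrative,
    # not names defined by Chassis.
    text = inputs["text"].decode("utf-8")
    params = json.loads(inputs.get("params", b"{}"))
    max_words = int(params.get("max_words", 3))

    # Stand-in for a real model: keep only the first few words.
    words = text.split()
    summary = " ".join(words[:max_words])
    stats = {"word_count": len(words), "truncated": len(words) > max_words}

    # Return several named outputs from a single inference call.
    return {
        "summary.txt": summary.encode("utf-8"),
        "stats.json": json.dumps(stats).encode("utf-8"),
    }


if __name__ == "__main__":
    outputs = predict(
        {
            "text": b"Chassis builds model containers quickly",
            "params": b'{"max_words": 2}',
        }
    )
    for name, payload in outputs.items():
        print(name, payload.decode("utf-8"))
```

Keeping every input and output as named bytes leaves encoding decisions (JSON, images, text) to the function itself, which is one plausible way to support arbitrary models without a framework-specific wrapper.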

In addition, the remote build server has been completely rewritten (in Rust). It no longer has any dependencies on object storage and has much better support for parallel builds. kaniko has also been replaced by BuildKit, which enables multi-platform images to be built. Finally, the Helm chart has been updated to allow changing all Kubernetes values, such as the amount of resources to allocate to the BuildKit pods and various timeouts.

bmunday3 and others added 26 commits August 17, 2023 09:35
Automatically use CPU-only version of `torch*` packages if no CUDA version is specified.
# Conflicts:
#	chassisml_sdk/examples/fastai/fastai_tabular_training.ipynb
#	chassisml_sdk/examples/hugging-face/huggingface_distilbert_text_classification.ipynb
#	chassisml_sdk/examples/lightgbm/lightgbm_classification.ipynb
#	chassisml_sdk/examples/mxnet/mxnet_mobilenet_image_classification.ipynb
#	chassisml_sdk/examples/onnx/onnx_mobilenet_image_classification.ipynb
#	chassisml_sdk/examples/onnx/onnx_transformer_text_generation.ipynb
#	chassisml_sdk/examples/pmml/pmml_iris_forest_classification.ipynb
#	chassisml_sdk/examples/pmml/pmml_linear_regression.ipynb
#	chassisml_sdk/examples/pytorch/pytorch_deeplab_resnet50_semantic_segmentation.ipynb
#	chassisml_sdk/examples/pytorch/pytorch_fastrcnn_object_detection.ipynb
#	chassisml_sdk/examples/pytorch/pytorch_resnet50_image_classification.ipynb
#	chassisml_sdk/examples/pytorch/pytorch_resnet50_image_classification_batch_gpu.ipynb
#	chassisml_sdk/examples/pytorch/pytorch_tabular_shelter_animal_outcome.ipynb
#	chassisml_sdk/examples/sklearn/data/random_forest.joblib
#	chassisml_sdk/examples/sklearn/sklearn_logreg_image_classification.ipynb
#	chassisml_sdk/examples/sklearn/sklearn_svm_image_classification.ipynb
#	chassisml_sdk/examples/sklearn/sklearn_tree_wine_classification.ipynb
@n8mellis n8mellis requested a review from bmunday3 August 24, 2023 18:45
@n8mellis n8mellis merged commit 2567337 into main Aug 24, 2023
@n8mellis n8mellis deleted the nathan/overhaul branch August 24, 2023 18:46
3 participants