Module 4 Dagster
truskovskiyk committed Jul 20, 2024
1 parent 6b363e3 commit 26b778b
Showing 8 changed files with 419 additions and 360 deletions.
48 changes: 28 additions & 20 deletions .github/workflows/module-4.yaml
Original file line number Diff line number Diff line change
@@ -10,29 +10,37 @@ env:
jobs:
  dagster-image:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Echo
        run: |
          echo 123

      - name: Login to Docker Hub
        uses: docker/login-action@v1
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Log in to the Container registry
        uses: docker/login-action@65b78e6e13532edd9afa3aa52ac7964289d1a9c1
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@9ec57ed1fcdbf14dcef7dfbe97b2010124a938b7
        with:
          username: ${{ secrets.DOCKER_HUB_USERNAME }}
          password: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}
          images: ghcr.io/kyryl-opens-ml/dagster-pipeline

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
      # See explanation: https://github.com/orgs/community/discussions/25678
      - name: Clean disk
        run: |
          rm -rf /opt/hostedtoolcache
      - name: Build
        uses: docker/build-push-action@v2
      - name: Build and push Docker image
        uses: docker/build-push-action@f2a1d5e99d037542a71f64918e516c093c6f3fc4
        with:
          context: week-3/nlp-sample
          file: week-3/nlp-sample/Dockerfile
          context: module-4/dagster_pipelines/
          push: true
          tags: ${{ secrets.DOCKER_HUB_USERNAME }}/${{ env.IMAGE_MAIN_NAME }}:${{ env.IMAGE_MAIN_TAG }}
          cache-from: type=registry,ref=${{ secrets.DOCKER_HUB_USERNAME }}/${{ env.IMAGE_MAIN_NAME }}:buildcache
          cache-to: type=registry,ref=${{ secrets.DOCKER_HUB_USERNAME }}/${{ env.IMAGE_MAIN_NAME }}:buildcache,mode=max
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
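The workflow now takes its image tags from `steps.meta.outputs.tags` instead of a hand-written `tags:` value. As a rough illustration, the sketch below approximates how a Git ref might map to a tag under the default settings of `docker/metadata-action` (branch name, Git tag name, or `pr-<number>`), which is consistent with the `pr-11` tag used for another image in this commit. The function name `default_tag_for_ref` is hypothetical, and the logic is a simplification that ignores `latest`, scheduled runs, and the action's other options.

```python
import re

# Image name taken from the workflow's `images:` input.
IMAGE = "ghcr.io/kyryl-opens-ml/dagster-pipeline"


def default_tag_for_ref(git_ref: str) -> str:
    """Approximate the default image tag for a Git ref (simplified assumption)."""
    # Pull request events use refs of the form refs/pull/<number>/merge.
    pr = re.fullmatch(r"refs/pull/(\d+)/merge", git_ref)
    if pr:
        return f"pr-{pr.group(1)}"
    for prefix in ("refs/heads/", "refs/tags/"):
        if git_ref.startswith(prefix):
            name = git_ref[len(prefix):]
            # Docker tags cannot contain "/", so replace disallowed characters.
            return re.sub(r"[^A-Za-z0-9._-]", "-", name)
    raise ValueError(f"unsupported ref: {git_ref}")


def image_reference(git_ref: str) -> str:
    """Join the image name and the approximated tag."""
    return f"{IMAGE}:{default_tag_for_ref(git_ref)}"
```

Under these assumptions, a push to `main` would publish `ghcr.io/kyryl-opens-ml/dagster-pipeline:main`, while a run for pull request 11 would publish the `pr-11` tag.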
10 changes: 10 additions & 0 deletions module-4/README.md
@@ -128,3 +128,13 @@ python kubeflow_pipelines/kfp_inference_pipeline.py http://0.0.0.0:3000

- [Create, use, pass, and track ML artifacts](https://www.kubeflow.org/docs/components/pipelines/v2/data-types/artifacts/#new-pythonic-artifact-syntax)
- [Vertex AI](https://cloud.google.com/vertex-ai/docs/pipelines/introduction)


# Dagster


```bash
mkdir ./dagster_pipelines/dagster-home
export DAGSTER_HOME=$PWD/dagster_pipelines/dagster-home
dagster dev -f dagster_pipelines/text2sql_pipeline.py -p 3000 -h 0.0.0.0
```
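The three shell commands above create a local `DAGSTER_HOME`, export it, and start the Dagster UI on port 3000. For scripted setups, the same steps might be expressed in Python, as in the sketch below. The helper name `build_dagster_dev_command` is hypothetical; only the paths, flags, and port come from the README snippet. Unlike plain `mkdir`, the directory creation here does not fail when the directory already exists.

```python
import os
from pathlib import Path
from typing import List


def build_dagster_dev_command(base_dir: Path) -> List[str]:
    """Prepare DAGSTER_HOME under base_dir and return the `dagster dev` argv."""
    home = base_dir / "dagster_pipelines" / "dagster-home"
    # Equivalent to `mkdir -p`: safe to call repeatedly.
    home.mkdir(parents=True, exist_ok=True)
    # Equivalent to `export DAGSTER_HOME=...` for this process and its children.
    os.environ["DAGSTER_HOME"] = str(home.resolve())
    return [
        "dagster", "dev",
        "-f", "dagster_pipelines/text2sql_pipeline.py",
        "-p", "3000",
        "-h", "0.0.0.0",
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` from the `module-4` directory would start the server, assuming `dagster` is installed in the active environment.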
1 change: 1 addition & 0 deletions module-4/dagster_pipelines/.gitignore
@@ -0,0 +1 @@
dagster-home/
37 changes: 37 additions & 0 deletions module-4/dagster_pipelines/modal_functions.py
@@ -0,0 +1,37 @@
import os

import modal
from modal import Image

app = modal.App("ml-in-production-practice")
env = {
    "WANDB_PROJECT": os.getenv("WANDB_PROJECT"),
    "WANDB_API_KEY": os.getenv("WANDB_API_KEY"),
}
custom_image = Image.from_registry("ghcr.io/kyryl-opens-ml/generative-example:pr-11").env(env)


@app.function(image=custom_image, gpu="A100", timeout=10 * 60 * 60)
def run_generative_example():
    from pathlib import Path

    from generative_example.data import load_sql_data
    from generative_example.predictor import run_evaluate_on_json
    from generative_example.train import train
    from generative_example.utils import load_from_registry, upload_to_registry

    load_sql_data(path_to_save=Path("/tmp/data"))
    train(config_path=Path("/app/conf/example-modal.json"))
    upload_to_registry(model_name="modal_generative_example", model_path=Path("/tmp/phi-3-mini-lora-text2sql"))
    load_from_registry(model_name="modal_generative_example:latest", model_path=Path("/tmp/loaded-model"))
    run_evaluate_on_json(json_path=Path("/tmp/data/test.json"), model_load_path=Path("/tmp/loaded-model"), result_path=Path("/tmp/data/results.json"))


def main():
    fn = modal.Function.lookup("ml-in-production-practice", "run_generative_example")
    fn_id = fn.spawn()
    print(f"Run training object: {fn_id}")


if __name__ == "__main__":
    main()
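
`modal_functions.py` builds the image environment with `os.getenv`, which returns `None` for any variable that is not set locally, so a missing `WANDB_PROJECT` or `WANDB_API_KEY` is passed along as `None` rather than reported. A small guard such as the one sketched below could surface the problem before the image is defined. The helper `require_env` is hypothetical and not part of the commit; it relies only on the standard library.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def require_env(names: Iterable[str], environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the named variables, raising if any is unset or empty."""
    source = os.environ if environ is None else environ
    wanted = list(names)  # materialize once so any iterable can be passed
    missing = [name for name in wanted if not source.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: source[name] for name in wanted}
```

With this helper, the `env` dictionary could be built as `env = require_env(["WANDB_PROJECT", "WANDB_API_KEY"])`, which fails with a clear message when a variable is absent.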