Commit

Module 2
truskovskiyk committed Jul 1, 2024
1 parent a75eac5 commit b4c1678
Showing 19 changed files with 378 additions and 407 deletions.
54 changes: 54 additions & 0 deletions .github/workflows/module-2.yaml
@@ -0,0 +1,54 @@
name: Module 2


on:
  workflow_dispatch:
  # push:
  pull_request:
    branches:
      - main


jobs:
  ci-test-bash-code:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Test echo
        run: |
          echo 'test'
      - name: Test ls
        run: |
          ls -all .
  app-ml-docker-but-with-cli:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'

      - name: Run minio
        run: |
          docker run -it -p 9000:9000 -p 9001:9001 quay.io/minio/minio server /data --console-address ":9001"
      - name: Setup env
        run: |
          export AWS_ACCESS_KEY_ID=minioadmin
          export AWS_SECRET_ACCESS_KEY=minioadmin
          export AWS_ENDPOINT_URL=http://127.0.0.1:9000
          pip install -r module-2/requirements.txt
      - name: Run test
        run: |
          pytest -ss ./module-2/minio_storage/test_minio_client.py
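
Review note on the workflow above: each `run` step in GitHub Actions starts its own shell, so the variables exported in "Setup env" are not visible in "Run test" unless they are written to `$GITHUB_ENV` or declared under `env:`. Separately, `docker run -it` keeps the MinIO server in the foreground and requests a TTY, so the "Run minio" step likely needs `-d` instead. The contents of `test_minio_client.py` are not shown in this view, so the following is only a minimal sketch, assuming the client reads the three `AWS_*` variables named in the workflow. `StorageConfig` and `load_storage_config` are hypothetical names, not part of the repository. The fallbacks reuse the values from the "Setup env" step, so the settings still resolve when the variables are missing.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class StorageConfig:
    """Connection settings for an S3-compatible store (hypothetical helper)."""

    endpoint_url: str
    access_key: str
    secret_key: str


def load_storage_config(env: Mapping[str, str] = os.environ) -> StorageConfig:
    """Read connection settings from the environment, with local fallbacks.

    The variable names and fallback values are taken from the workflow's
    "Setup env" step; everything else here is an illustrative assumption.
    """
    return StorageConfig(
        endpoint_url=env.get("AWS_ENDPOINT_URL", "http://127.0.0.1:9000"),
        access_key=env.get("AWS_ACCESS_KEY_ID", "minioadmin"),
        secret_key=env.get("AWS_SECRET_ACCESS_KEY", "minioadmin"),
    )


if __name__ == "__main__":
    # Shows which settings a test process would actually pick up.
    print(load_storage_config())
```

Passing the mapping as an argument keeps the helper testable without touching the real process environment.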
4 changes: 4 additions & 0 deletions module-2/.gitignore
@@ -0,0 +1,4 @@
random-data
.lancedb/
.ruff_cache
cache
13 changes: 5 additions & 8 deletions module-2/PRACTICE.md
@@ -1,11 +1,10 @@
# Practice

***

# H3: Data storage & processing

## Reading list:

- [Data engineer roadmap](https://github.com/datastacktv/data-engineer-roadmap)
- [Minio using Kubernetes](https://github.com/kubernetes/examples/tree/master/staging/storage/minio)
@@ -24,7 +23,6 @@
- [Course: CMU Database Systems](https://15445.courses.cs.cmu.edu/fall2023/)
- [Course: Advanced Database Systems](https://15721.courses.cs.cmu.edu/spring2024/)


## Task:

- PR1: Write README instructions detailing how to deploy MinIO with the following options: Local, Docker, Kubernetes (K8S)-based.
@@ -35,10 +33,9 @@
- PR7: Write code for transforming your dataset into a vector format, and utilize VectorDB for ingestion and querying.
- Google Doc: Update your proposal by adding a section on data storage and processing.

## Criteria:

- 7 PRs are merged
- Description of the data section (storage and processing) in the Google Doc.

