✨ Add orchestration module #917

Merged
merged 276 commits into from
Aug 22, 2024
Changes from 249 commits
850ab79
🐛 Fixed bug in `viadot-lite.Dockerfile`
djagoda881 Jun 5, 2024
dd999e5
🔖 Upgraded version to `2.0.0-alpha.1`
djagoda881 Jun 5, 2024
f95694d
👷 Updated `docker-publish.yml`
djagoda881 Jun 5, 2024
cf22f58
🚚 Moved `orchiestration` folder into `src/viadot`
djagoda881 Jun 6, 2024
e1415d0
🚚 Renamed path from `prefect-viadot-test` to `prefect-test`
djagoda881 Jun 6, 2024
5f50f97
🔖 Bumped version to `2.0.0-alpha.2`
djagoda881 Jun 6, 2024
d7922f0
♻️ Synchronized `prefect-viadot` with `orchiestration/prefect`
djagoda881 Jun 6, 2024
c39ff5c
🐛 Fixed import in `test_git.py`
djagoda881 Jun 6, 2024
674343b
🧱 Updated `docker-compose.yml`
djagoda881 Jun 6, 2024
df3e4ce
🚚 Moved `prefect_viadot` to `src/viadot/orchestration`
djagoda881 May 16, 2024
bb7f65c
🚚 Changes imports in prefect-viadot
djagoda881 May 16, 2024
bec9d3e
⬆️ Added prefect-viadot dependencies to viadot
djagoda881 May 16, 2024
fb62dc2
⬆️ Upgraded `prefect` dependencie
djagoda881 May 17, 2024
deedb07
🔧 Updated `Dockerfile`
djagoda881 May 20, 2024
d89f1e4
⬆️ Upgraded dependecies
djagoda881 May 27, 2024
f97c97c
🔥 Depreacted `datahub.py`
djagoda881 May 27, 2024
aee4d0b
➕ Added `viadot-azure` and `viadot-aws` dependecies
djagoda881 May 28, 2024
0a148c7
🧱 Added `viadot-azure.Dockerfile`
djagoda881 May 28, 2024
035589e
🐛 Added import error handlig to all optional sources
djagoda881 May 28, 2024
e3e7c17
🐛 Fixed adls import
djagoda881 May 28, 2024
cadf4ea
🧱 Added `viadot-aws.Dockerfile`
djagoda881 May 28, 2024
363541c
🐛 Fixed import errors in `prefect-viadot`
djagoda881 May 28, 2024
7ba3915
✅ Added prefect-viadot test and refactored viadot tests
djagoda881 Jun 3, 2024
b1619e2
🙈 Updated .gitignore file
djagoda881 Jun 3, 2024
1d49a83
➕ Added new dev dependencies
djagoda881 Jun 3, 2024
4c19a04
🧱 Removed not needed packages from `viadot-azure.Dockerfile`
djagoda881 Jun 3, 2024
a39c1d9
➕ Added dependecies to `pyproject.toml`
djagoda881 Jun 3, 2024
7dcc38f
⬆️ Upgraded `viadot-azure` packages
djagoda881 Jun 4, 2024
b066f6e
🐛 Fixed imports in viadot integration tests
djagoda881 Jun 4, 2024
e24a6d4
🧱 Refacroed `viadot-azure.Dockerfile`
djagoda881 Jun 4, 2024
e59dc28
⬆️ Upgraded aws dependecies in `pyproject.toml`
djagoda881 Jun 5, 2024
e5941a7
⬆️ Upgraded dependecies
djagoda881 Jun 5, 2024
7323b7f
🧱 Added viadot-lite image
djagoda881 Jun 5, 2024
da48de5
♻️ Refactored viadot-aws image
djagoda881 Jun 5, 2024
763e4ca
🧱 Updated `docker-compose.yml`
djagoda881 Jun 5, 2024
8189c41
🐛 Fixed bug in `viadot-lite.Dockerfile`
djagoda881 Jun 5, 2024
7bc5882
🔖 Upgraded version to `2.0.0-alpha.1`
djagoda881 Jun 5, 2024
030b010
👷 Updated `docker-publish.yml`
djagoda881 Jun 5, 2024
f4988f8
🚚 Moved `orchiestration` folder into `src/viadot`
djagoda881 Jun 6, 2024
39a9a26
🚚 Renamed path from `prefect-viadot-test` to `prefect-test`
djagoda881 Jun 6, 2024
86d006a
🔖 Bumped version to `2.0.0-alpha.2`
djagoda881 Jun 6, 2024
f9f79f9
♻️ Synchronized `prefect-viadot` with `orchiestration/prefect`
djagoda881 Jun 6, 2024
bbbdc7a
🐛 Fixed import in `test_git.py`
djagoda881 Jun 6, 2024
260579c
🧱 Updated `docker-compose.yml`
djagoda881 Jun 6, 2024
bec901b
➕ Added docs dependencies
djagoda881 Jun 11, 2024
53e9e18
🎨 Fixed rye formatting
djagoda881 Jun 11, 2024
ebdeb9e
➖ Removed duplicated dependecies
djagoda881 Jun 12, 2024
c74474f
🐛 Fixed mkdocs config bug
djagoda881 Jun 12, 2024
1de5fc5
Merge branch '2.0-new-repository-structure' of https://github.com/dyv…
Diego-H-S Jun 13, 2024
24507e0
🧱 Moved images into one multistage `Dockerfile` (#932)
djagoda881 Jun 19, 2024
e0ec929
🔖 Bumped version to `2.0.0-alpha.3`
djagoda881 Jun 20, 2024
36641a0
Merge branch '2.0-new-repository-structure' of https://github.com/dyv…
Diego-H-S Jun 21, 2024
baec9e5
⬇️ Downgraded `requests` package
djagoda881 Jun 24, 2024
fad8396
🔖 Bumped to `2.0.0-alpha.4` version
djagoda881 Jun 24, 2024
ccb2024
🧱 Upgraded images in `docker-compose.yml`
djagoda881 Jun 24, 2024
ee1a0c2
Add documentation for viadot 2.0 with new repository structure (#929)
djagoda881 Jun 24, 2024
46861ba
Merge branch '2.0-new-repository-structure' of https://github.com/dyv…
Diego-H-S Jun 25, 2024
2dda7fd
🚚 Moved `prefect_viadot` to `src/viadot/orchestration`
djagoda881 May 16, 2024
473f87b
🚚 Changes imports in prefect-viadot
djagoda881 May 16, 2024
9996ca1
⬆️ Added prefect-viadot dependencies to viadot
djagoda881 May 16, 2024
e991f7c
⬆️ Upgraded `prefect` dependencie
djagoda881 May 17, 2024
a347965
🔧 Updated `Dockerfile`
djagoda881 May 20, 2024
fd085a7
⬆️ Upgraded dependecies
djagoda881 May 27, 2024
c826c88
🔥 Depreacted `datahub.py`
djagoda881 May 27, 2024
9db8eb3
➕ Added `viadot-azure` and `viadot-aws` dependecies
djagoda881 May 28, 2024
7f185a7
🧱 Added `viadot-azure.Dockerfile`
djagoda881 May 28, 2024
3d235a3
🐛 Added import error handlig to all optional sources
djagoda881 May 28, 2024
cba0ae0
🐛 Fixed adls import
djagoda881 May 28, 2024
2feb0f4
🧱 Added `viadot-aws.Dockerfile`
djagoda881 May 28, 2024
c86a9a4
🐛 Fixed import errors in `prefect-viadot`
djagoda881 May 28, 2024
c57ad62
✅ Added prefect-viadot test and refactored viadot tests
djagoda881 Jun 3, 2024
8b2f0b7
🙈 Updated .gitignore file
djagoda881 Jun 3, 2024
471273b
➕ Added new dev dependencies
djagoda881 Jun 3, 2024
3b4146d
🧱 Removed not needed packages from `viadot-azure.Dockerfile`
djagoda881 Jun 3, 2024
d992a31
➕ Added dependecies to `pyproject.toml`
djagoda881 Jun 3, 2024
b2a83f8
⬆️ Upgraded `viadot-azure` packages
djagoda881 Jun 4, 2024
ce43343
🐛 Fixed imports in viadot integration tests
djagoda881 Jun 4, 2024
025b9c9
🧱 Refacroed `viadot-azure.Dockerfile`
djagoda881 Jun 4, 2024
d2326d1
⬆️ Upgraded aws dependecies in `pyproject.toml`
djagoda881 Jun 5, 2024
398c6c5
⬆️ Upgraded dependecies
djagoda881 Jun 5, 2024
1ae1661
🧱 Added viadot-lite image
djagoda881 Jun 5, 2024
1254151
♻️ Refactored viadot-aws image
djagoda881 Jun 5, 2024
61262fc
🧱 Updated `docker-compose.yml`
djagoda881 Jun 5, 2024
36960d5
🐛 Fixed bug in `viadot-lite.Dockerfile`
djagoda881 Jun 5, 2024
a67406b
🔖 Upgraded version to `2.0.0-alpha.1`
djagoda881 Jun 5, 2024
840e5b2
👷 Updated `docker-publish.yml`
djagoda881 Jun 5, 2024
860bdec
🚚 Moved `orchiestration` folder into `src/viadot`
djagoda881 Jun 6, 2024
14d9284
🚚 Renamed path from `prefect-viadot-test` to `prefect-test`
djagoda881 Jun 6, 2024
3f67aa6
🔖 Bumped version to `2.0.0-alpha.2`
djagoda881 Jun 6, 2024
b592f72
♻️ Synchronized `prefect-viadot` with `orchiestration/prefect`
djagoda881 Jun 6, 2024
a8a2844
🐛 Fixed import in `test_git.py`
djagoda881 Jun 6, 2024
e0d8e6c
🧱 Updated `docker-compose.yml`
djagoda881 Jun 6, 2024
0c99d9e
➕ Added docs dependencies
djagoda881 Jun 11, 2024
1f02e79
🎨 Fixed rye formatting
djagoda881 Jun 11, 2024
3f1da80
➖ Removed duplicated dependecies
djagoda881 Jun 12, 2024
25ef425
🧱 Moved images into one multistage `Dockerfile` (#932)
djagoda881 Jun 19, 2024
62d56a3
🔖 Bumped version to `2.0.0-alpha.3`
djagoda881 Jun 20, 2024
17f221f
⬇️ Downgraded `requests` package
djagoda881 Jun 24, 2024
db919fb
🔖 Bumped to `2.0.0-alpha.4` version
djagoda881 Jun 24, 2024
bb167ec
🧱 Upgraded images in `docker-compose.yml`
djagoda881 Jun 24, 2024
a50320c
Add documentation for viadot 2.0 with new repository structure (#929)
djagoda881 Jun 24, 2024
83ce192
✨ Added new param to `sharepoint_to_readshift_spectrum`
djagoda881 Jul 2, 2024
26f1fc5
✨ Added new param to `sharepoint.py`
djagoda881 Jul 2, 2024
e21af5b
✨ Added `basename_template` to MinIO source
Jul 3, 2024
b588854
✨ Added `SQLServer` source and tasks for it
Jul 3, 2024
2c5bf7d
✨ Added handling for `DatabaseCredentials` and `Secret` in get_creden…
Jul 3, 2024
c987afa
✨ Added `df_to_minio` task for prefect
Jul 3, 2024
783355f
Added `sql_server_to_minio` flow for prefect
Jul 3, 2024
e2f0fbf
✅ Added tests sql_server_to_minio
Jul 3, 2024
f0f4337
📝 Updated changelog with `sql_server_to_mino` and related functions
Jul 3, 2024
0404004
🐛 Added missing package to Dockerfile
djagoda881 Jul 5, 2024
beb93b2
⬆️ Upgraded `prefect` version to `2.19.7`
djagoda881 Jul 5, 2024
908341e
🔖 Bumped viadot version to `2.0.0-alpha.5`
djagoda881 Jul 5, 2024
8ce8f89
✅ Added tests
Jul 8, 2024
6bc3512
🎨 Updated credentials options
Jul 8, 2024
d4ed0d9
🔧 Updated docker setup
Jul 8, 2024
53e9a5a
🎨 Updated data type
Jul 8, 2024
f5e0870
🎨 Added contexlib for MinIO
Jul 8, 2024
97d640a
📝 Updated requirements.lock `s
Jul 9, 2024
95437e1
📝 Updates SQL Server docs
Jul 9, 2024
724b38e
🎨 Added whitespaces
Jul 9, 2024
442745b
Merge pull request #941 from dyvenia/sql_server_to_minio
djagoda881 Jul 9, 2024
506ca20
⬇️ Downgraded dependecies
djagoda881 Jul 9, 2024
75f3d1f
🔖 Bumped viadot to version `2.0.0-alpha.6`
djagoda881 Jul 9, 2024
8141c1c
Merge branch '2.0-new-repository-structure' of https://github.com/dyv…
Diego-H-S Jul 10, 2024
25ef74a
📝 updated CHANGELOG.md
Diego-H-S Jun 25, 2024
ffaacf6
✨ updated Outlook connector version 1.
Diego-H-S Jun 25, 2024
2f398b8
✨ updated Outlook connector version 2.
Diego-H-S Jun 26, 2024
a095f35
📝 updated docstrings.
Diego-H-S Jun 26, 2024
128964f
✅ added outlook test file.
Diego-H-S Jun 26, 2024
53f1b56
👔 updated some files to aling the rebase.
Diego-H-S Jul 10, 2024
ca1dc5e
📝 updated CHANGELOG.md
Diego-H-S Jun 25, 2024
d568c4c
✨ added Hubspot connector version 1.
Diego-H-S Jun 25, 2024
dee7ae8
✅ added hubspot test file.
Diego-H-S Jun 25, 2024
b144ef7
📝 updated docstrings.
Diego-H-S Jun 25, 2024
f536c00
✅ updated local lock file.
Diego-H-S Jun 25, 2024
0cb3f00
🔊 updated logger in source.
Diego-H-S Jun 25, 2024
e8f19ec
👔 updated some files to aling the rebase.
Diego-H-S Jul 10, 2024
9fd66c3
👔 updated some more files to aling the rebase.
Diego-H-S Jul 10, 2024
96e23a1
📝 updated CHANGELOG.
Diego-H-S Jun 7, 2024
cf201a8
✨ added Mindful to __init__ files.
Diego-H-S Jun 7, 2024
150e01a
✨ created new Minsful connector.
Diego-H-S Jun 7, 2024
30ff4c8
🎨 updated mindful flow and task connector.
Diego-H-S Jun 10, 2024
66963e2
✅ added mindful test file.
Diego-H-S Jun 11, 2024
e3e2bcf
📝 updated mindful docstrings.
Diego-H-S Jun 11, 2024
9a2e132
⚡️ added sep parameter in adls task.
Diego-H-S Jun 11, 2024
9206a8b
🔊 updated logs.
Diego-H-S Jun 13, 2024
e5f5cdd
📝 updated docstrings.
Diego-H-S Jun 13, 2024
7ea818e
🔊 updated logger in source.
Diego-H-S Jun 25, 2024
bc22643
👔 updated some files to aling the rebase.
Diego-H-S Jul 10, 2024
419218f
📝 update CHANGELOG.md and __init__ files.
Diego-H-S Jun 13, 2024
4a0cda2
✨ added Genesys file structure version 1.
Diego-H-S Jun 13, 2024
0347bf4
📝 updated rebased files.
Diego-H-S Jun 13, 2024
d2c3721
✨ added Genesys file structure version 2.
Diego-H-S Jun 17, 2024
56c20de
✨ added Genesys file structure version 3.
Diego-H-S Jun 18, 2024
eaea63a
📝 adding some extra log information.
Diego-H-S Jun 18, 2024
2b6c90e
✨ added Genesys file structure version 4.
Diego-H-S Jun 20, 2024
7812bcf
✅ added genesys test files.
Diego-H-S Jun 20, 2024
dc9482e
✅ upsted genesys test file.
Diego-H-S Jun 21, 2024
2d0fa75
🔊 updated logger in source.
Diego-H-S Jun 25, 2024
3fbb27c
👔 updated some files to aling the rebase.
Diego-H-S Jul 10, 2024
5831ac0
📝 updated docstring.
Diego-H-S Jul 10, 2024
0eba969
🎨 implemented flake8 and pylint tests.
Diego-H-S Jul 16, 2024
eaed4b0
💄 added prints to source level.
Diego-H-S Jul 16, 2024
2d57ab8
📝 updated variable names.
Diego-H-S Jul 16, 2024
267c6e8
Duckdb connectors (#945)
angelika233 Jul 16, 2024
eeb4d8c
Delete .python_history
trymzet Jul 16, 2024
fb6360c
✅ updated test file.
Diego-H-S Jul 17, 2024
fdc83ff
🎨 updated code performance.
Diego-H-S Jul 17, 2024
6a50834
✅ updated test file.
Diego-H-S Jul 17, 2024
83cb1cc
c4c code checker passed and tests coverage passed
fdelgadodyvenia Jul 17, 2024
e819f99
🎨 updated code performance.
Diego-H-S Jul 17, 2024
4cd6e2c
✅ updated test file.
Diego-H-S Jul 17, 2024
b182aa6
🎨 updated code performance.
Diego-H-S Jul 17, 2024
ac87397
✅ updated test file.
Diego-H-S Jul 17, 2024
4c8542b
flows_tasks_for c4c
fdelgadodyvenia Jul 17, 2024
57a6b64
✅ updated test file to reach 80% coverage.
Diego-H-S Jul 19, 2024
40acc95
✏️ corrected a typo.
Diego-H-S Jul 19, 2024
932b3df
✅ updated test file to reach 80% coverage.
Diego-H-S Jul 19, 2024
27809b7
✅ updated test file.
Diego-H-S Jul 19, 2024
ebd119e
✏️ fixed a typo.
Diego-H-S Jul 19, 2024
00fe031
✏️ fixed another typo.
Diego-H-S Jul 19, 2024
798f542
✨ Added sap_to_parquet flow (#947)
judynah Jul 19, 2024
b442372
✅ updated test file to reach 80% coverage.
Diego-H-S Jul 22, 2024
2cfab52
✅ updated test file.
Diego-H-S Jul 22, 2024
927ac35
✅ updated test file.
Diego-H-S Jul 22, 2024
054fbe3
Merge branch '2.0-new-repository-structure' into mindful_migration
Diego-H-S Jul 22, 2024
f06e6c4
Merge branch '2.0-new-repository-structure' into hubspot_migration
Diego-H-S Jul 22, 2024
c83a41d
✅ updated test file to reach 80% coverage.
Diego-H-S Jul 22, 2024
d4d99b6
Merge branch '2.0-new-repository-structure' into outlook_migration
Diego-H-S Jul 22, 2024
30fbe16
🦺 added `return` in flow file.
Diego-H-S Jul 22, 2024
b898ee0
Merge branch 'outlook_migration' of https://github.com/Diego-H-S/viad…
Diego-H-S Jul 22, 2024
baf9440
🦺 added `return` in flow file.
Diego-H-S Jul 22, 2024
9989cd9
🦺 added `return` in flow file.
Diego-H-S Jul 22, 2024
26b1307
🦺 added `return` in flow file.
Diego-H-S Jul 22, 2024
20b31c1
Merge branch '2.0-new-repository-structure' into genesys_migration
Diego-H-S Jul 22, 2024
965f508
✅ added test integration file.
Diego-H-S Jul 23, 2024
b0cda1f
Merge branch 'genesys_migration' of https://github.com/Diego-H-S/viad…
Diego-H-S Jul 23, 2024
f505640
✅ added test integration file.
Diego-H-S Jul 23, 2024
214c375
✅ added test integration file.
Diego-H-S Jul 23, 2024
baafabd
📝 updated credential typo.
Diego-H-S Jul 23, 2024
3c71189
✅ added test integration file.
Diego-H-S Jul 23, 2024
d9d67f8
Merge pull request #923 from Diego-H-S/mindful_migration
fdelgadodyvenia Jul 24, 2024
2130a4f
Merge branch '2.0-new-repository-structure' into hubspot_migration
fdelgadodyvenia Jul 24, 2024
05d2546
Merge pull request #936 from Diego-H-S/hubspot_migration
fdelgadodyvenia Jul 24, 2024
19214ab
Merge branch '2.0-new-repository-structure' into outlook_migration
fdelgadodyvenia Jul 24, 2024
c404979
Merge pull request #939 from Diego-H-S/outlook_migration
fdelgadodyvenia Jul 24, 2024
ce6dbec
Merge branch '2.0-new-repository-structure' into genesys_migration
fdelgadodyvenia Jul 24, 2024
e609b03
Merge pull request #934 from Diego-H-S/genesys_migration
fdelgadodyvenia Jul 24, 2024
ee568e9
➕ Added `duckdb` to dependecies
djagoda881 Jul 25, 2024
4d286c1
➕ Added `prefect-aws` dependecy
djagoda881 Jul 25, 2024
c65cd9c
Merge pull request #960 from dyvenia/duckdb_dependency_bug_fix
djagoda881 Jul 25, 2024
27535c2
🚀 Relase 2.0.0-beta.1
djagoda881 Jul 25, 2024
dd4878c
Merge pull request #961 from dyvenia/2.0.0-beta.1
djagoda881 Jul 25, 2024
e8068f9
cloud for customer improvement
fdelgadodyvenia Jul 25, 2024
99faec2
recover gitignore
fdelgadodyvenia Jul 25, 2024
4ff50c6
removing unuseless files
fdelgadodyvenia Jul 25, 2024
029c90a
docker initial
fdelgadodyvenia Jul 25, 2024
78d54fc
rollback gitignore
fdelgadodyvenia Jul 25, 2024
cd05bba
update ignore
fdelgadodyvenia Jul 25, 2024
b4fc75b
rollback gitignore
fdelgadodyvenia Jul 25, 2024
f667ab6
remove unuseless file
fdelgadodyvenia Jul 25, 2024
22a4957
Merge pull request #962 from fdelgadodyvenia/c4c_test
fdelgadodyvenia Jul 25, 2024
0c43153
Sharepoint orchestration code refactor (#950)
Rafalz13 Aug 6, 2024
e08fc0a
Sharepoint - multiple files logic applied to the source class (#942)
Rafalz13 Aug 6, 2024
3303435
✨ Added 0365 (#969)
Rafalz13 Aug 7, 2024
ebc1a79
Orchestration last changes (#953)
trymzet Aug 13, 2024
923b95f
✨ Add GitHub release step
trymzet Aug 14, 2024
a37ce7b
📝 Document the new release process
trymzet Aug 14, 2024
51af7d7
📌 Bump version
trymzet Aug 14, 2024
5c81b74
♻️ Add last changes from other branches
trymzet Aug 14, 2024
38dde53
♻️ Update some sources' test configuration to match rest of lib
trymzet Aug 14, 2024
fac1183
📝 Add more docs on contributing
trymzet Aug 14, 2024
662ad38
📝 Update a link
trymzet Aug 14, 2024
17e91a5
🐛 Update lock files, removing optional deps
trymzet Aug 14, 2024
f752987
Merge branch '2.0' into 2.0-new-repository-structure
trymzet Aug 14, 2024
91d185d
⬆️ Update dependencies
trymzet Aug 19, 2024
04912f4
🚨 Linting
trymzet Aug 19, 2024
c2c9d41
🐛 Add TOML support to coverage
trymzet Aug 19, 2024
5b81385
✅ Fix `_cast_df()` test failing on datetimes in pandas 2.0
trymzet Aug 19, 2024
090d6bc
⬆️ Run CI on Python 3.12
trymzet Aug 19, 2024
1972760
➖ Remove unused `pytest-cov`
trymzet Aug 19, 2024
5ecad24
⬆️ Upgrade Python version so Rye CI action uses 3.12
trymzet Aug 19, 2024
ee3b66e
⬆️ Upgrade Python to 3.12 in the images
trymzet Aug 19, 2024
bf32428
📝 Improve container env docs
trymzet Aug 19, 2024
795c895
⬇️ Rollback `pyarrow` to v10.x
trymzet Aug 19, 2024
51f8c2d
♻️ Use a `skip_test_on_missing_extra()` utils to simplify life
trymzet Aug 19, 2024
ad42983
🧑‍💻 Install dev dependencies in local containers
trymzet Aug 19, 2024
9db5a06
🐛 Fix for broken `numpy` version
trymzet Aug 19, 2024
075b333
🚧 RedshiftSpectrum source unit tests - WIP
trymzet Aug 20, 2024
26 changes: 24 additions & 2 deletions .github/workflows/docker-publish.yml
Original file line number Diff line number Diff line change
@@ -35,12 +35,34 @@ jobs:
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Build
- name: Build and publish viadot-lite image
uses: docker/build-push-action@v3
with:
context: .
file: docker/Dockerfile
platforms: linux/amd64
push: true
tags: ghcr.io/${{ github.repository }}/viadot:${{ github.event.inputs.tag }}
target: viadot-lite
tags: ghcr.io/${{ github.repository }}/viadot-lite:${{ github.event.inputs.tag }}

- name: Build and publish viadot-aws image
uses: docker/build-push-action@v3
with:
context: .
file: docker/Dockerfile
platforms: linux/amd64
push: true
target: viadot-aws
tags: ghcr.io/${{ github.repository }}/viadot-aws:${{ github.event.inputs.tag }}

- name: Build and publish viadot-azure image
uses: docker/build-push-action@v3
with:
context: .
file: docker/Dockerfile
platforms: linux/amd64
push: true
target: viadot-azure
tags: ghcr.io/${{ github.repository }}/viadot-azure:${{ github.event.inputs.tag }}
build-args: INSTALL_DATABRICKS=${{ github.event.inputs.if_databricks }}
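
The three publish steps above differ only in the build target and the image name. As a rough cross-check, the following Python sketch expands the `tags:` template for each target. It is not part of the PR: `image_refs` is a hypothetical helper, and the repository and tag values are example inputs.

```python
# Hypothetical helper, not part of the workflow: expands the `tags:` template
# `ghcr.io/<repository>/<target>:<tag>` used by each publish step.
TARGETS = ("viadot-lite", "viadot-aws", "viadot-azure")


def image_refs(repository: str, tag: str) -> dict[str, str]:
    """Map each Dockerfile target to the image reference the workflow pushes."""
    return {target: f"ghcr.io/{repository}/{target}:{tag}" for target in TARGETS}


if __name__ == "__main__":
    for target, ref in image_refs("dyvenia/viadot", "2.0.0-beta.1").items():
        print(f"{target}: {ref}")
```

With `dyvenia/viadot` and `2.0.0-beta.1` as inputs, the results have the same form as the image names referenced in `docker/docker-compose.yml`.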

21 changes: 18 additions & 3 deletions .gitignore
@@ -82,6 +82,9 @@ target/
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
@@ -126,13 +129,14 @@ dmypy.json
# Pyre type checker
.pyre/

# Linux
# OS files
.DS_Store
.bash_history
.bashrc
.viminfo
.netrwhist
.ssh

.python_history

# Azure
.azure
@@ -162,6 +166,17 @@ profiles.yaml
# DataHub
.datahub

# local/env
*.prefect/
*.config/
*.local/

# VS Code
.vscode-server/

# Jupyter notebook
*.ipynb

# Git
.gitconfig

@@ -171,4 +186,4 @@ profiles.yaml
*.secret

# AWS
.aws
.aws
20 changes: 20 additions & 0 deletions CHANGELOG.md
@@ -7,8 +7,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]


- Added new version of `Genesys` connector and test files.
- Added new version of `Outlook` connector and test files.
- Added new version of `Hubspot` connector and test files.
- Added `Mindful` connector and test file.


### Added

- Added `sap_to_parquet` Prefect flow.
- Added `duckdb_to_sql_server`, `duckdb_to_parquet`, `duckdb_transform` Prefect flows.
- Added `bcp` and `duckdb_query` Prefect tasks.
- Added `DuckDB` source class.
- Added `sql_server_to_minio` Prefect flow.
- Added `df_to_minio` Prefect task.
- Added handling for `DatabaseCredentials` and `Secret` blocks in `prefect/utils.py:get_credentials`.
- Added `SQLServer` source and tasks `create_sql_server_table`, `sql_server_to_df`, `sql_server_query`.
- Added `basename_template` to `MinIO` source.
- Added `_empty_column_to_string` and `_convert_all_to_string_type` to convert data types to string.
- Added `na_values` parameter to `Sharepoint` class to parse `N/A` values coming from the excel file columns.
- Added `get_last_segment_from_url` function to sharepoint file.
@@ -35,6 +51,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Changed

- Changed location of `task_utils.py` and removed unused/prefect1-related tasks.
- Changed the way of handling `NA` string values and mapped column types to `str` for `Sharepoint` source.
- Added `SQLServerToDF` task
- Added `SQLServerToDuckDB` flow, which downloads data from a SQL Server table, saves it to a Parquet file, and then uploads it to DuckDB
@@ -53,6 +70,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Removed Prefect dependency from the library (Python library, Docker base image)
- Removed `catch_extra_separators()` from `SAPRFCV2` class

### Fixed
- Fixed the typo in credentials in `SQLServer` source

## [0.4.3] - 2022-04-28

### Added
38 changes: 37 additions & 1 deletion CONTRIBUTING.md
@@ -22,7 +22,43 @@ We provide the extensions, settings, and tasks for VSCode in the `.vscode` folde

### Development Docker container

If you wish to develop in a Docker container, viadot comes with a VSCode task to make that simple. You can easily spin up a terminal in the container with the `Ctrl+Shift+B` shortcut. The container will have all of the contents of the root `viadot` directory mapped to `/home/viadot`.
#### Building containers

**NOTE**: All the following commands must be executed from within the `viadot/docker/` directory.

To build all available containers, run the following command:

```bash
docker compose up -d
```
If you want to build a specific one, add its service name (as defined in `docker-compose.yml`) at the end of the command:

```bash
docker compose up -d viadot_2_azure
```

#### Building docker images

All necessary Docker images are published to `ghcr.io` and referenced in the `docker-compose.yml` file. If you want to create your own custom Docker image, follow the instructions below.

In the repository, we have three possible images to build:

- `viadot-lite`
- `viadot-azure`
- `viadot-aws`

To build an image, run the following command from the root directory of the repository, with the selected target:

```bash
docker build --target viadot-azure -t <name of your image>:<version of your image> -f docker/Dockerfile .
```
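
If you build more than one target, the command above can be generated rather than retyped. The following is a minimal sketch, assuming Python 3.9+ and Docker on the `PATH`; `build_command` and the `my-...` image names are illustrative and not part of viadot.

```python
import shlex

# The three targets defined in docker/Dockerfile.
TARGETS = ("viadot-lite", "viadot-azure", "viadot-aws")


def build_command(target: str, image: str, version: str) -> list[str]:
    """Return the `docker build` argv for one target; run it from the repo root."""
    if target not in TARGETS:
        raise ValueError(f"unknown target {target!r}; expected one of {TARGETS}")
    return [
        "docker", "build",
        "--target", target,
        "-t", f"{image}:{version}",
        "-f", "docker/Dockerfile",
        ".",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; pass a command list to
    # subprocess.run(cmd, check=True) to actually build.
    for target in TARGETS:
        print(shlex.join(build_command(target, f"my-{target}", "dev")))
```

Validating the target name up front gives a clearer error than letting `docker build` fail on an unknown stage.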


#### Working inside the container

```bash
docker exec -it viadot_2_azure bash
```

### Environment variables

105 changes: 83 additions & 22 deletions docker/Dockerfile
@@ -1,4 +1,4 @@
FROM python:3.10-slim-bullseye
FROM python:3.10-slim-bullseye AS base

# Add user
RUN useradd --non-unique --uid 1000 --create-home viadot && \
@@ -15,15 +15,28 @@ SHELL ["/bin/sh", "-c"]
RUN groupadd docker && \
usermod -aG docker viadot

# Release File Error
# https://stackoverflow.com/questions/63526272/release-file-is-not-valid-yet-docker
RUN echo "Acquire::Check-Valid-Until \"false\";\nAcquire::Check-Date \"false\";" | cat > /etc/apt/apt.conf.d/10no--check-valid-until
Comment on lines -18 to -20
@trymzet (Contributor), Aug 2, 2024:

This was needed in the past to build the image reliably on WSL, since the WSL hardware clock would sometimes drift out of sync and cause intermittent failures such as this one.


# System packages
RUN apt update -q && yes | apt install -q gnupg vim unixodbc-dev build-essential \
curl python3-dev libboost-all-dev libpq-dev python3-gi sudo git software-properties-common
RUN apt update -q && yes | apt install -q gnupg vim curl git unixodbc
ENV PIP_NO_CACHE_DIR=1
RUN pip install --upgrade cffi

# This one's needed for the SAP RFC connector.
# It must be installed here as the SAP package does not define its dependencies,
# so `pip install pyrfc` breaks if all deps are not already present.
RUN pip install cython==0.29.24

# Python env
RUN pip install --upgrade pip


ENV USER viadot
ENV HOME="/home/$USER"
ENV PATH="$HOME/.local/bin:$PATH"
ENV RYE_HOME="$HOME/rye"
ENV PATH="$RYE_HOME/shims:$PATH"

# Install Rye and uv.
RUN curl -sSf https://rye.astral.sh/get | RYE_TOOLCHAIN_VERSION="3.10" RYE_INSTALL_OPTION="--yes" bash && \
rye config --set-bool behavior.use-uv=true

# Fix for old SQL Servers still using TLS < 1.2
RUN chmod +rwx /usr/lib/ssl/openssl.cnf && \
@@ -40,6 +53,13 @@ RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \

COPY docker/odbcinst.ini /etc


####################
### viadot-azure ###
####################

FROM base as viadot-azure

ARG INSTALL_DATABRICKS=false

# Databricks source setup
@@ -56,31 +76,72 @@ RUN if [ "$INSTALL_DATABRICKS" = "true" ]; then \

ENV SPARK_HOME /usr/local/lib/python3.10/site-packages/pyspark

# This one's needed for the SAP RFC connector.
# It must be installed here as the SAP package does not define its dependencies,
# so `pip install pyrfc` breaks if all deps are not already present.
RUN pip install cython==0.29.24

# Python env
RUN pip install --upgrade pip

ENV USER viadot
ENV HOME="/home/$USER"
ENV PATH="$HOME/.local/bin:$PATH"
ARG INSTALL_DATABRICKS=false

WORKDIR ${HOME}

COPY --chown=${USER}:${USER} . ./viadot

RUN rye lock --reset --features viadot-azure --pyproject viadot/pyproject.toml
RUN sed '/-e/d' ./viadot/requirements.lock > ./viadot/requirements.txt
RUN pip install --no-cache-dir -r ./viadot/requirements.txt

# Dependency install
RUN if [ "$INSTALL_DATABRICKS" = "true" ]; then \
pip install ./viadot/.[databricks]; \
else \
pip install ./viadot; \
fi

# Dependency install
RUN pip install ./viadot/.[viadot-azure]

# Cleanup.
RUN rm -rf ./viadot

USER ${USER}


###################
### viadot-lite ###
###################

FROM base as viadot-lite

WORKDIR ${HOME}

COPY --chown=${USER}:${USER} . ./viadot

RUN rye lock --reset --features viadot-lite --pyproject viadot/pyproject.toml
RUN sed '/-e/d' ./viadot/requirements.lock > ./viadot/requirements.txt
RUN pip install --no-cache-dir -r ./viadot/requirements.txt

# Dependency install
RUN pip install ./viadot/

# Cleanup.
RUN rm -rf ./viadot

USER ${USER}

##################
### viadot-aws ###
##################

FROM base as viadot-aws


WORKDIR ${HOME}

COPY --chown=${USER}:${USER} . ./viadot

RUN rye lock --reset --features viadot-aws --pyproject viadot/pyproject.toml
RUN sed '/-e/d' ./viadot/requirements.lock > ./viadot/requirements.txt
RUN pip install --no-cache-dir -r ./viadot/requirements.txt

# Dependency install
RUN pip install ./viadot/.[viadot-aws]

# Cleanup.
RUN rm -rf ./viadot

USER ${USER}
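
Each stage above turns `requirements.lock` into `requirements.txt` with `sed '/-e/d'`. That expression deletes every line matching the regular expression `-e` anywhere in the line, not only editable installs such as `-e file:.`. The Python sketch below mimics that behaviour and contrasts it with a stricter filter. The function names and the sample lock content are illustrative assumptions, not taken from the repository.

```python
def drop_lines_containing_dash_e(lines):
    """Equivalent of `sed '/-e/d'`: drop any line that contains "-e"."""
    return [line for line in lines if "-e" not in line]


def drop_editable_installs(lines):
    """Stricter variant: drop only lines that start with the `-e` option."""
    return [line for line in lines if not line.lstrip().startswith("-e ")]


if __name__ == "__main__":
    # Made-up sample; a real lock file will differ.
    sample = [
        "-e file:.",
        "pandas==2.2.2",
        "typing-extensions==4.12.2",  # contains "-e" inside the package name
    ]
    print(drop_lines_containing_dash_e(sample))
    print(drop_editable_installs(sample))
```

Whether the difference matters depends on what the lock file actually contains: a pinned package whose name includes `-e`, such as `typing-extensions`, would be dropped by the first filter and kept by the second.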


27 changes: 24 additions & 3 deletions docker/docker-compose.yml
@@ -1,13 +1,34 @@
version: "3"

services:
viadot_2:
image: ghcr.io/dyvenia/viadot/viadot:2.0-latest
container_name: viadot_2
viadot_2_lite:
image: ghcr.io/dyvenia/viadot/viadot-lite:2.0.0-beta.1
container_name: viadot_2_lite
volumes:
# - ${HOME}/.databricks-connect:/home/viadot/.databricks-connect
# - ${HOME}/.config/viadot/config.yaml:/home/viadot/.config/viadot/config.yaml
- ../:/home/viadot
shm_size: "4gb"
command: sleep infinity
restart: "unless-stopped"
viadot_2_azure:
image: ghcr.io/dyvenia/viadot/viadot-azure:2.0.0-beta.1
container_name: viadot_2_azure
volumes:
# - ${HOME}/.databricks-connect:/home/viadot/.databricks-connect
# - ${HOME}/.config/viadot/config.yaml:/home/viadot/.config/viadot/config.yaml
- ../:/home/viadot
shm_size: "4gb"
command: sleep infinity
restart: "unless-stopped"
viadot_2_aws:
image: ghcr.io/dyvenia/viadot/viadot-aws:2.0.0-beta.1
container_name: viadot_2_aws
volumes:
# - ${HOME}/.databricks-connect:/home/viadot/.databricks-connect
# - ${HOME}/.config/viadot/config.yaml:/home/viadot/.config/viadot/config.yaml
- ../:/home/viadot
shm_size: "4gb"
command: sleep infinity
restart: "unless-stopped"

5 changes: 5 additions & 0 deletions docs/advanced_usage/docker_containers.md
@@ -0,0 +1,5 @@
Currently, there are three containers available to build:

- `viadot-lite` - Includes the default dependencies and supports only non-cloud-specific sources.
- `viadot-azure` - Includes the default and `viadot-azure` dependencies. Supports Azure-based sources as well as non-cloud-specific ones.
- `viadot-aws` - Includes the default and `viadot-aws` dependencies. Supports AWS-based sources as well as non-cloud-specific ones.
24 changes: 24 additions & 0 deletions docs/getting_started/getting_started.md
@@ -0,0 +1,24 @@

### Prerequisites

We assume that you have [Rye](https://rye-up.com/) installed:

```console
curl -sSf https://rye-up.com/get | bash
```

### Installation

Clone the `2.0` branch, and set up and run the environment:

```console
git clone https://github.com/dyvenia/viadot.git -b 2.0 && \
cd viadot && \
rye sync
```

### Configuration

To start using sources, you must configure them with the required credentials. Credentials can be specified either in the viadot config file (by default, `~/.config/viadot/config.yaml`) or passed directly to each source's `credentials` parameter.

You can find specific information about each source's credentials in [the documentation](../references/sources/sql_sources.md).
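
The two options can be pictured as a simple lookup rule. The sketch below is not viadot's implementation: `resolve_credentials`, the `config_key` argument, the layout of the parsed config dictionary, and the rule that directly passed credentials win are all assumptions made for illustration, and the credential values are placeholders.

```python
from typing import Optional


def resolve_credentials(
    credentials: Optional[dict] = None,
    config: Optional[dict] = None,
    config_key: Optional[str] = None,
) -> dict:
    """Prefer credentials passed directly; otherwise look them up in the config.

    `config` stands in for the parsed contents of ~/.config/viadot/config.yaml
    (hypothetical layout: {key: {"credentials": {...}}}).
    """
    if credentials is not None:
        return credentials
    if config is not None and config_key in config:
        return config[config_key]["credentials"]
    raise ValueError("no credentials passed and none found in the config")


if __name__ == "__main__":
    parsed_config = {"my_source": {"credentials": {"user": "from-config"}}}
    # Looked up in the (already parsed) config file contents.
    print(resolve_credentials(config=parsed_config, config_key="my_source"))
    # Passed directly: in this sketch, these take precedence.
    print(resolve_credentials({"user": "direct"}, parsed_config, "my_source"))
```

Check each source's documentation for the credential keys it actually expects.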