Add performance integration tests (#827)
## Description

This PR adds a step to our CI to measure how quickly Cosmos can run models. This is part of a larger initiative to make the project more performant now that it's reached a certain level of maturity.

How it works:

- We now have [a test that generates a dbt project with a certain number of sequential models](https://github.com/astronomer/astronomer-cosmos/blob/performance-int-tests/tests/perf/test_performance.py) (based on a parameter that gets passed in), runs a simple DAG, and measures task throughput (in models run per second).
- I've extended our CI to run this test for 1, 10, 50, and 100 models to start.
- The CI job reports its results as a GitHub Actions output that is shown in the actions summary, [at the bottom](https://github.com/astronomer/astronomer-cosmos/actions/runs/7894490582).

While this isn't perfect, it's a step in the right direction: we now have some general visibility! Note that these numbers may not be indicative of a production Airflow environment running something like the Kubernetes Executor, because this runs a local executor on GH Actions runners. Still, it's meant as a benchmark to show whether we're moving in the right direction.

As part of this, I've also refactored our test commands: the pyproject file now calls standalone scripts instead of embedding the commands directly. This should make them easier to maintain and changes easier to track.

<!-- Add a brief but complete description of the change. -->

## Related Issue(s)

<!-- If this PR closes an issue, you can use a keyword to auto-close. -->
<!-- i.e. "closes #0000" -->
#800

## Breaking Change?

<!-- If this introduces a breaking change, specify that here. -->

## Checklist

- [ ] I have made corresponding changes to the documentation (if required)
- [ ] I have added tests that prove my fix is effective or that my feature works
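The throughput measurement described under "How it works" can be sketched roughly as follows. This is a minimal illustration, not the code in `tests/perf/test_performance.py`: the function name `measure_throughput` and the `run_dag` callable are hypothetical stand-ins for whatever the real test uses to execute the DAG.

```python
import time
from typing import Callable


def measure_throughput(run_dag: Callable[[], None], num_models: int) -> float:
    """Time a single call to run_dag and return models run per second.

    run_dag is a hypothetical callable standing in for the real DAG run.
    """
    start = time.perf_counter()
    run_dag()
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very coarse clocks.
    return num_models / elapsed if elapsed > 0 else float("inf")
```

In CI, `num_models` would come from the `MODEL_COUNT` environment variable that the workflow sets for each matrix entry, for example `int(os.getenv("MODEL_COUNT", "10"))`, where the default of 10 is an arbitrary choice for this sketch.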
Showing 24 changed files with 351 additions and 104 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,10 +11,8 @@ concurrency: | |
cancel-in-progress: true | ||
|
||
jobs: | ||
|
||
Authorize: | ||
environment: | ||
${{ github.event_name == 'pull_request_target' && | ||
environment: ${{ github.event_name == 'pull_request_target' && | ||
github.event.pull_request.head.repo.full_name != github.repository && | ||
'external' || 'internal' }} | ||
runs-on: ubuntu-latest | ||
|
@@ -30,8 +28,8 @@ jobs: | |
|
||
- uses: actions/setup-python@v3 | ||
with: | ||
python-version: '3.9' | ||
architecture: 'x64' | ||
python-version: "3.9" | ||
architecture: "x64" | ||
|
||
- run: pip3 install hatch | ||
- run: hatch run tests.py3.9-2.7:type-check | ||
|
@@ -294,6 +292,55 @@ jobs: | |
AIRFLOW_CONN_AIRFLOW_DB: postgres://postgres:[email protected]:5432/postgres | ||
PYTHONPATH: /home/runner/work/astronomer-cosmos/astronomer-cosmos/:$PYTHONPATH | ||
|
||
Run-Performance-Tests: | ||
runs-on: ubuntu-latest | ||
strategy: | ||
matrix: | ||
python-version: ["3.11"] | ||
airflow-version: ["2.7"] | ||
num-models: [1, 10, 50, 100] | ||
|
||
steps: | ||
- uses: actions/checkout@v3 | ||
with: | ||
ref: ${{ github.event.pull_request.head.sha || github.ref }} | ||
- uses: actions/cache@v3 | ||
with: | ||
path: | | ||
~/.cache/pip | ||
.nox | ||
key: perf-test-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.airflow-version }}-${{ hashFiles('pyproject.toml') }}-${{ hashFiles('cosmos/__init__.py') }} | ||
|
||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v4 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
|
||
- name: Install packages and dependencies | ||
run: | | ||
python -m pip install hatch | ||
hatch -e tests.py${{ matrix.python-version }}-${{ matrix.airflow-version }} run pip freeze | ||
- name: Run performance tests against Airflow ${{ matrix.airflow-version }} and Python ${{ matrix.python-version }} | ||
id: run-performance-tests | ||
run: | | ||
hatch run tests.py${{ matrix.python-version }}-${{ matrix.airflow-version }}:test-performance-setup | ||
hatch run tests.py${{ matrix.python-version }}-${{ matrix.airflow-version }}:test-performance | ||
# read the performance results and write them to the job summary for this step | ||
# format: NUM_MODELS={num_models}\nTIME={end - start}\n | ||
cat /tmp/performance_results.txt > $GITHUB_STEP_SUMMARY | ||
env: | ||
AIRFLOW_HOME: /home/runner/work/astronomer-cosmos/astronomer-cosmos/ | ||
AIRFLOW_CONN_AIRFLOW_DB: postgres://postgres:[email protected]:5432/postgres | ||
AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT: 90.0 | ||
PYTHONPATH: /home/runner/work/astronomer-cosmos/astronomer-cosmos/:$PYTHONPATH | ||
MODEL_COUNT: ${{ matrix.num-models }} | ||
|
||
env: | ||
AIRFLOW_HOME: /home/runner/work/astronomer-cosmos/astronomer-cosmos/ | ||
AIRFLOW_CONN_AIRFLOW_DB: postgres://postgres:[email protected]:5432/postgres | ||
PYTHONPATH: /home/runner/work/astronomer-cosmos/astronomer-cosmos/:$PYTHONPATH | ||
|
||
Code-Coverage: | ||
if: github.event.action != 'labeled' | ||
|
@@ -309,7 +356,7 @@ jobs: | |
- name: Set up Python 3.11 | ||
uses: actions/setup-python@v3 | ||
with: | ||
python-version: '3.11' | ||
python-version: "3.11" | ||
- name: Install coverage | ||
run: | | ||
pip3 install coverage | ||
|
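The `Run-Performance-Tests` job above copies `/tmp/performance_results.txt` into the job summary as-is. As a rough sketch of how that file could be turned into a models-per-second figure, the snippet below parses the `NUM_MODELS=...` / `TIME=...` format mentioned in the workflow comment. The function names `parse_results` and `summary_line` are hypothetical and are not part of this commit.

```python
from typing import Dict


def parse_results(text: str) -> Dict[str, float]:
    """Parse KEY=VALUE lines (e.g. NUM_MODELS=10, TIME=4.0) into numbers."""
    values: Dict[str, float] = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, _, raw = line.partition("=")
        values[key.strip()] = float(raw)
    return values


def summary_line(text: str) -> str:
    """Format the parsed results as a one-line throughput summary."""
    results = parse_results(text)
    num_models = int(results["NUM_MODELS"])
    seconds = results["TIME"]
    rate = num_models / seconds
    return f"{num_models} models in {seconds:.2f}s ({rate:.2f} models/second)"
```

Reporting a rate rather than the raw time makes the four matrix entries (1, 10, 50, and 100 models) easier to compare at a glance.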
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
|
||
target/ | ||
dbt_packages/ | ||
logs/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
dbt project for running performance tests. | ||
|
||
The `models` directory gets populated by an integration test defined in `tests/perf`. |
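One way the `models` directory could be populated with a chain of sequential models is sketched below. This is an assumption about the approach, not the actual test code: the function name `generate_sequential_models`, the `model_<i>.sql` file names, and the SQL bodies are all hypothetical.

```python
from pathlib import Path
from typing import List


def generate_sequential_models(models_dir: Path, num_models: int) -> List[Path]:
    """Write num_models dbt models where each one selects from the previous."""
    models_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for i in range(num_models):
        if i == 0:
            # The first model has no upstream dependency.
            sql = "select 1 as id\n"
        else:
            # Doubled braces emit a literal dbt `{{ ref(...) }}` call.
            sql = f"select * from {{{{ ref('model_{i - 1}') }}}}\n"
        path = models_dir / f"model_{i}.sql"
        path.write_text(sql)
        paths.append(path)
    return paths
```

Because every model references the one before it, dbt has to run them strictly in order, which gives a DAG shaped as a single chain of `num_models` tasks.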
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Name your project! Project names should contain only lowercase characters | ||
# and underscores. A good package name should reflect your organization's | ||
# name or the intended use of these models | ||
name: "perf" | ||
version: "1.0.0" | ||
config-version: 2 | ||
|
||
model-paths: ["models"] | ||
analysis-paths: ["analyses"] | ||
test-paths: ["tests"] | ||
seed-paths: ["seeds"] | ||
macro-paths: ["macros"] | ||
snapshot-paths: ["snapshots"] | ||
|
||
clean-targets: # directories to be removed by `dbt clean` | ||
- "target" | ||
- "dbt_packages" |
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
simple: | ||
target: dev | ||
outputs: | ||
dev: | ||
type: sqlite | ||
threads: 1 | ||
database: "database" | ||
schema: "main" | ||
schemas_and_paths: | ||
main: "{{ env_var('DBT_SQLITE_PATH') }}/imdb.db" | ||
schema_directory: "{{ env_var('DBT_SQLITE_PATH') }}" |
Empty file.
Empty file.
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
""" | ||
A DAG that uses Cosmos to render a dbt project for performance testing. | ||
""" | ||
|
||
import airflow | ||
from datetime import datetime | ||
import os | ||
from pathlib import Path | ||
|
||
from cosmos import DbtDag, ProjectConfig, ProfileConfig, RenderConfig | ||
|
||
DEFAULT_DBT_ROOT_PATH = Path(__file__).parent / "dbt" | ||
DBT_ROOT_PATH = Path(os.getenv("DBT_ROOT_PATH", DEFAULT_DBT_ROOT_PATH)) | ||
DBT_SQLITE_PATH = str(DEFAULT_DBT_ROOT_PATH / "data") | ||
|
||
profile_config = ProfileConfig( | ||
profile_name="simple", | ||
target_name="dev", | ||
profiles_yml_filepath=(DBT_ROOT_PATH / "simple/profiles.yml"), | ||
) | ||
|
||
cosmos_perf_dag = DbtDag( | ||
project_config=ProjectConfig( | ||
DBT_ROOT_PATH / "perf", | ||
env_vars={"DBT_SQLITE_PATH": DBT_SQLITE_PATH}, | ||
), | ||
profile_config=profile_config, | ||
render_config=RenderConfig( | ||
dbt_deps=False, | ||
), | ||
# normal dag parameters | ||
schedule_interval=None, | ||
start_date=datetime(2024, 1, 1), | ||
catchup=False, | ||
dag_id="performance_dag", | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,16 +9,8 @@ description = "Orchestrate your dbt projects in Airflow" | |
readme = "README.rst" | ||
license = "Apache-2.0" | ||
requires-python = ">=3.8" | ||
authors = [ | ||
{ name = "Astronomer", email = "[email protected]" }, | ||
] | ||
keywords = [ | ||
"airflow", | ||
"apache-airflow", | ||
"astronomer", | ||
"dags", | ||
"dbt", | ||
] | ||
authors = [{ name = "Astronomer", email = "[email protected]" }] | ||
keywords = ["airflow", "apache-airflow", "astronomer", "dags", "dbt"] | ||
classifiers = [ | ||
"Development Status :: 3 - Alpha", | ||
"Environment :: Web Environment", | ||
|
@@ -56,48 +48,23 @@ dbt-all = [ | |
"dbt-spark", | ||
"dbt-vertica", | ||
] | ||
dbt-athena = [ | ||
"dbt-athena-community", | ||
"apache-airflow-providers-amazon>=8.0.0", | ||
] | ||
dbt-bigquery = [ | ||
"dbt-bigquery", | ||
] | ||
dbt-databricks = [ | ||
"dbt-databricks", | ||
] | ||
dbt-exasol = [ | ||
"dbt-exasol", | ||
] | ||
dbt-postgres = [ | ||
"dbt-postgres", | ||
] | ||
dbt-redshift = [ | ||
"dbt-redshift", | ||
] | ||
dbt-snowflake = [ | ||
"dbt-snowflake", | ||
] | ||
dbt-spark = [ | ||
"dbt-spark", | ||
] | ||
dbt-vertica = [ | ||
"dbt-vertica<=1.5.4", | ||
] | ||
openlineage = [ | ||
"openlineage-integration-common", | ||
"openlineage-airflow", | ||
] | ||
all = [ | ||
"astronomer-cosmos[dbt-all]", | ||
"astronomer-cosmos[openlineage]" | ||
] | ||
docs =[ | ||
dbt-athena = ["dbt-athena-community", "apache-airflow-providers-amazon>=8.0.0"] | ||
dbt-bigquery = ["dbt-bigquery"] | ||
dbt-databricks = ["dbt-databricks"] | ||
dbt-exasol = ["dbt-exasol"] | ||
dbt-postgres = ["dbt-postgres"] | ||
dbt-redshift = ["dbt-redshift"] | ||
dbt-snowflake = ["dbt-snowflake"] | ||
dbt-spark = ["dbt-spark"] | ||
dbt-vertica = ["dbt-vertica<=1.5.4"] | ||
openlineage = ["openlineage-integration-common", "openlineage-airflow"] | ||
all = ["astronomer-cosmos[dbt-all]", "astronomer-cosmos[openlineage]"] | ||
docs = [ | ||
"sphinx", | ||
"pydata-sphinx-theme", | ||
"sphinx-autobuild", | ||
"sphinx-autoapi", | ||
"apache-airflow-providers-cncf-kubernetes>=5.1.1" | ||
"apache-airflow-providers-cncf-kubernetes>=5.1.1", | ||
] | ||
tests = [ | ||
"packaging", | ||
|
@@ -137,9 +104,7 @@ Documentation = "https://astronomer.github.io/astronomer-cosmos" | |
path = "cosmos/__init__.py" | ||
|
||
[tool.hatch.build.targets.sdist] | ||
include = [ | ||
"/cosmos", | ||
] | ||
include = ["/cosmos"] | ||
|
||
[tool.hatch.build.targets.wheel] | ||
packages = ["cosmos"] | ||
|
@@ -175,51 +140,20 @@ matrix.airflow.dependencies = [ | |
[tool.hatch.envs.tests.scripts] | ||
freeze = "pip freeze" | ||
type-check = "mypy cosmos" | ||
test = 'pytest -vv --durations=0 . -m "not integration" --ignore=tests/test_example_dags.py --ignore=tests/test_example_dags_no_connections.py' | ||
test-cov = """pytest -vv --cov=cosmos --cov-report=term-missing --cov-report=xml --durations=0 -m "not integration" --ignore=tests/test_example_dags.py --ignore=tests/test_example_dags_no_connections.py""" | ||
# we install using the following workaround to overcome installation conflicts, such as: | ||
# apache-airflow 2.3.0 and dbt-core [0.13.0 - 1.5.2] and jinja2>=3.0.0 because these package versions have conflicting dependencies | ||
test-integration-setup = """pip uninstall -y dbt-postgres dbt-databricks dbt-vertica; \ | ||
rm -rf airflow.*; \ | ||
airflow db init; \ | ||
pip install 'dbt-core' 'dbt-databricks' 'dbt-postgres' 'dbt-vertica' 'openlineage-airflow'""" | ||
test-integration = """rm -rf dbt/jaffle_shop/dbt_packages; | ||
pytest -vv \ | ||
--cov=cosmos \ | ||
--cov-report=term-missing \ | ||
--cov-report=xml \ | ||
--durations=0 \ | ||
-m integration \ | ||
-k 'not (sqlite or example_cosmos_sources or example_cosmos_python_models or example_virtualenv)'""" | ||
test-integration-expensive = """pytest -vv \ | ||
--cov=cosmos \ | ||
--cov-report=term-missing \ | ||
--cov-report=xml \ | ||
--durations=0 \ | ||
-m integration \ | ||
-k 'example_cosmos_python_models or example_virtualenv'""" | ||
test-integration-sqlite-setup = """pip uninstall -y dbt-core dbt-sqlite openlineage-airflow openlineage-integration-common; \ | ||
rm -rf airflow.*; \ | ||
airflow db init; \ | ||
pip install 'dbt-core==1.4' 'dbt-sqlite<=1.4' 'dbt-databricks<=1.4' 'dbt-postgres<=1.4' """ | ||
test-integration-sqlite = """ | ||
pytest -vv \ | ||
--cov=cosmos \ | ||
--cov-report=term-missing \ | ||
--cov-report=xml \ | ||
--durations=0 \ | ||
-m integration \ | ||
-k 'example_cosmos_sources or sqlite'""" | ||
test = 'sh scripts/test/unit.sh' | ||
test-cov = 'sh scripts/test/unit-cov.sh' | ||
test-integration-setup = 'sh scripts/test/integration-setup.sh' | ||
test-integration = 'sh scripts/test/integration.sh' | ||
test-integration-expensive = 'sh scripts/test/integration-expensive.sh' | ||
test-integration-sqlite-setup = 'sh scripts/test/integration-sqlite-setup.sh' | ||
test-integration-sqlite = 'sh scripts/test/integration-sqlite.sh' | ||
test-performance-setup = 'sh scripts/test/performance-setup.sh' | ||
test-performance = 'sh scripts/test/performance.sh' | ||
|
||
[tool.pytest.ini_options] | ||
filterwarnings = [ | ||
"ignore::DeprecationWarning", | ||
] | ||
filterwarnings = ["ignore::DeprecationWarning"] | ||
minversion = "6.0" | ||
markers = [ | ||
"integration", | ||
"sqlite" | ||
] | ||
markers = ["integration", "sqlite", "perf"] | ||
|
||
###################################### | ||
# DOCS | ||
|
@@ -233,7 +167,7 @@ dependencies = [ | |
"sphinx-autobuild", | ||
"sphinx-autoapi", | ||
"openlineage-airflow", | ||
"apache-airflow-providers-cncf-kubernetes>=5.1.1" | ||
"apache-airflow-providers-cncf-kubernetes>=5.1.1", | ||
] | ||
|
||
[tool.hatch.envs.docs.scripts] | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
pytest -vv \ | ||
--cov=cosmos \ | ||
--cov-report=term-missing \ | ||
--cov-report=xml \ | ||
--durations=0 \ | ||
-m integration \ | ||
--ignore=tests/perf \ | ||
-k 'example_cosmos_python_models or example_virtualenv' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# we install using the following workaround to overcome installation conflicts, such as: | ||
# apache-airflow 2.3.0 and dbt-core [0.13.0 - 1.5.2] and jinja2>=3.0.0 because these package versions have conflicting dependencies | ||
pip uninstall -y dbt-postgres dbt-databricks dbt-vertica; \ | ||
rm -rf airflow.*; \ | ||
airflow db init; \ | ||
pip install 'dbt-core' 'dbt-databricks' 'dbt-postgres' 'dbt-vertica' 'openlineage-airflow' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
pip uninstall -y dbt-core dbt-sqlite openlineage-airflow openlineage-integration-common; \ | ||
rm -rf airflow.*; \ | ||
airflow db init; \ | ||
pip install 'dbt-core==1.4' 'dbt-sqlite<=1.4' 'dbt-databricks<=1.4' 'dbt-postgres<=1.4' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
pytest -vv \ | ||
--cov=cosmos \ | ||
--cov-report=term-missing \ | ||
--cov-report=xml \ | ||
--durations=0 \ | ||
-m integration \ | ||
--ignore=tests/perf \ | ||
-k 'example_cosmos_sources or sqlite' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
rm -rf dbt/jaffle_shop/dbt_packages; | ||
pytest -vv \ | ||
--cov=cosmos \ | ||
--cov-report=term-missing \ | ||
--cov-report=xml \ | ||
--durations=0 \ | ||
-m integration \ | ||
--ignore=tests/perf \ | ||
-k 'not (sqlite or example_cosmos_sources or example_cosmos_python_models or example_virtualenv)' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
pip uninstall -y dbt-core dbt-sqlite openlineage-airflow openlineage-integration-common; \ | ||
rm -rf airflow.*; \ | ||
airflow db init; \ | ||
pip install 'dbt-core==1.4' 'dbt-sqlite<=1.4' 'dbt-databricks<=1.4' 'dbt-postgres<=1.4' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
pytest -vv \ | ||
-s \ | ||
-m 'perf' \ | ||
--ignore=tests/test_example_dags.py \ | ||
--ignore=tests/test_example_dags_no_connections.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
pytest \ | ||
-vv \ | ||
--cov=cosmos \ | ||
--cov-report=term-missing \ | ||
--cov-report=xml \ | ||
--durations=0 \ | ||
-m "not (integration or perf)" \ | ||
--ignore=tests/perf \ | ||
--ignore=tests/test_example_dags.py \ | ||
--ignore=tests/test_example_dags_no_connections.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
pytest \ | ||
-vv \ | ||
--durations=0 \ | ||
-m "not (integration or perf)" \ | ||
--ignore=tests/perf \ | ||
--ignore=tests/test_example_dags.py \ | ||
--ignore=tests/test_example_dags_no_connections.py |