- Python implements Transport interface - HTTP and Kafka transports are available @mobuchowski
- Airflow: custom extractors lookup uses only `get_operator_classnames` method @mobuchowski
- README.md created at OpenLineage/integrations for compatibility matrix @howardyoo
- CI: add integration tests for Airflow's SnowflakeOperator and dbt-snowflake @mobuchowski
- Introduce DatasetVersion facet in spec @pawel-big-lebowski
- Airflow: add external query id facet @mobuchowski
- Complete fix of Snowflake extractor `get_hook()` bug @denimalpaca
- Update artwork @rossturk
- Catch possible failures when emitting events and log them @mobuchowski
- dbt: jinja2 code using `do` extensions does not crash @mobuchowski
- Extract source code of PythonOperator code similar to SQL facet @mobuchowski
- Add DatasetLifecycleStateDatasetFacet to spec @pawel-big-lebowski
- Airflow: extract source code from BashOperator @mobuchowski
- Add generic facet to collect environmental properties (EnvironmentFacet) @harishsune
- OpenLineage sensor for OpenLineage-Dagster integration @dalinkim
- Java-client: make generator generate enums as well @pawel-big-lebowski
- Added `UnknownOperatorAttributeRunFacet` to Airflow integration to record operators that don't produce lineage @collado-mike
- Airflow: increase import timeout in tests, fix exit from integration @mobuchowski
- Reduce logging level for import errors to info @rossturk
- Remove AWS secret keys and extraneous Snowflake parameters from connection uri @collado-mike
- Convert to LifecycleStateChangeDatasetFacet @pawel-big-lebowski
- Proxy backend example using Kafka @wslulciuc
- Support Databricks Delta Catalog naming convention with DatabricksDeltaHandler @wjohnson
- Add javadoc as part of build task @mobuchowski
- Include TableStateChangeFacet in non V2 commands for Spark @mr-yusupov
- Support for SqlDWRelation on Databricks' Azure Synapse/SQL DW Connector @wjohnson
- Implement input visitors for v2 commands @pawel-big-lebowski
- Enabled SparkListenerJobStart events to trigger OpenLineage events @collado-mike
- dbt: job namespaces for given dbt run match each other @mobuchowski
- Fix Breaking SnowflakeOperator Changes from OSS Airflow @denimalpaca
- Made corrections to account for DeltaDataSource handling @collado-mike
- Support for dbt-spark adapter @mobuchowski
- New backend to proxy OpenLineage events to one or more event streams 🎉 @mandy-chessell @wslulciuc
- Add Spark extensibility API with support for custom Dataset and custom facet builders @collado-mike
- airflow: fix import failures when dependencies for bigquery, dbt, great_expectations extractors are missing @lukaszlaszko
- Fixed openlineage-spark jar to correctly rename bundled dependencies @collado-mike
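The connection-URI cleanup noted above (removing AWS secret keys and extraneous Snowflake parameters) can be illustrated with a short sketch. This is not the integration's actual code, and the deny-listed parameter names are hypothetical stand-ins for the keys it filters:

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Hypothetical deny-list; the real integration's filtered keys may differ.
FILTERED_PARAMS = {"aws_access_key_id", "aws_secret_access_key", "insecure_mode"}

def sanitize_connection_uri(uri: str) -> str:
    """Drop sensitive or extraneous query parameters from a connection URI."""
    parts = urlparse(uri)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in FILTERED_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))
```

The rest of the URI (scheme, host, path) passes through unchanged, so downstream naming based on the connection is unaffected.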
0.4.0 - 2021-12-13
- Spark output metrics @OleksandrDvornik
- Separated tests between Spark 2 & 3 @pawel-big-lebowski
- Databricks install README and init scripts @wjohnson
- Iceberg integration with unit tests @pawel-big-lebowski
- Kafka read and write support @OleksandrDvornik / @collado-mike
- Arbitrary parameters supported in HTTP URL construction @wjohnson
- Increased visitor coverage for Spark commands @mobuchowski / @pawel-big-lebowski
- dbt: column descriptions are properly filled from metadata.json @mobuchowski
- dbt: allow parsing artifacts with version higher than officially supported @mobuchowski
- dbt: dbt build command is supported @mobuchowski
- dbt: fix crash when build command is used with seeds in dbt 1.0.0rc3 @mobuchowski
- spark: increase logical plan visitor coverage @mobuchowski
- spark: fix logical serialization recursion issue @OleksandrDvornik
- Use URL#getFile to fix build on Windows @mobuchowski
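The "arbitrary parameters in HTTP URL construction" entry above can be sketched in Python, even though the integration itself is Java; the host, path, and parameter names below are made up for illustration:

```python
from urllib.parse import urlencode, urlunsplit

def build_lineage_url(host: str, path: str, params: dict) -> str:
    """Construct an HTTP endpoint URL, appending arbitrary query parameters."""
    return urlunsplit(("https", host, path, urlencode(params), ""))
```

Any caller-supplied key/value pairs are URL-encoded and appended, rather than being limited to a fixed set of recognized parameters.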
0.3.1 - 2021-10-21
- fix import in spark3 visitor @mobuchowski
0.3.0 - 2021-10-21
- Spark3 support @OleksandrDvornik / @collado-mike
- LineageBackend for Airflow 2 @mobuchowski
- Adding custom spark version facet to spark integration @OleksandrDvornik
- Adding dbt version facet @mobuchowski
- Added support for Redshift profile @AlessandroLollo
- Sanitize JDBC URLs @OleksandrDvornik
- strip openlineage url in python client @OleksandrDvornik
- deploy spec if spec file changes @mobuchowski
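The JDBC URL sanitization mentioned above can be sketched with two simplified rules, stripping embedded credentials from both the authority and the query string. This is an illustrative approximation, not the integration's actual logic:

```python
import re

def sanitize_jdbc_url(url: str) -> str:
    """Mask credentials embedded in a JDBC URL (simplified, illustrative rules)."""
    # Drop user:password@ authority credentials, e.g. jdbc:mysql://user:pw@host/db
    url = re.sub(r"//[^@/]+@", "//", url)
    # Drop user/password query parameters, e.g. ...?user=x&password=y
    url = re.sub(r"(?i)[?&](user|password)=[^&]*", "", url)
    return url
```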
0.2.3 - 2021-10-07
- Add dbt `v3` manifest support @mobuchowski
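Supporting a new manifest version starts with detecting it; dbt records a schema URL in the manifest's metadata block. A sketch of that detection (not the integration's actual code):

```python
import re

def manifest_schema_version(manifest: dict) -> int:
    """Read the schema version from a dbt manifest's metadata block.

    dbt stores a schema URL such as
    https://schemas.getdbt.com/dbt/manifest/v3.json under
    metadata.dbt_schema_version.
    """
    url = manifest.get("metadata", {}).get("dbt_schema_version", "")
    match = re.search(r"/v(\d+)\.json$", url)
    if not match:
        raise ValueError(f"unrecognized dbt_schema_version: {url!r}")
    return int(match.group(1))
```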
0.2.2 - 2021-09-08
- Implement OpenLineageValidationAction for Great Expectations @collado-mike
- facet: add expectations assertions facet @mobuchowski
- airflow: pendulum formatting fix, add tests @mobuchowski
- dbt: do not emit events if run_result file was not updated @mobuchowski
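The run_result guard above amounts to a freshness check on dbt's results file before emitting anything. A minimal sketch, with the function name and file handling chosen for illustration:

```python
import os

def run_results_updated(path: str, started_at: float) -> bool:
    """Return True only if the results file was (re)written after the run
    started; stale results should not produce events."""
    try:
        return os.path.getmtime(path) >= started_at
    except OSError:  # file missing: the run produced no results
        return False
```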
0.2.1 - 2021-08-27
- Default `--project-dir` argument to current directory in `dbt-ol` script @mobuchowski
0.2.0 - 2021-08-23
- Parse dbt command line arguments when invoking `dbt-ol` @mobuchowski. For example: `$ dbt-ol run --project-dir path/to/dir`
- Set `UnknownFacet` for Spark (captures metadata about unvisited nodes from the Spark plan that are not yet supported) @OleksandrDvornik
- Remove `model` from dbt job name @mobuchowski
- Default dbt job namespace to output dataset namespace @mobuchowski
- Rename `openlineage.spark.*` to `io.openlineage.spark.*` @OleksandrDvornik
- Remove instance references to extractors from DAG and avoid copying log property for serializability @collado-mike
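Argument handling of the kind the `dbt-ol` entries describe can be sketched with `argparse.parse_known_args`, which splits the wrapper's own options from everything forwarded to dbt. An illustrative sketch only; the real script's parsing may differ:

```python
import argparse

def parse_dbt_ol_args(argv):
    """Split dbt-ol's own options from the arguments forwarded to dbt."""
    parser = argparse.ArgumentParser(prog="dbt-ol")
    # --project-dir defaults to the current directory
    parser.add_argument("--project-dir", default=".")
    known, passthrough = parser.parse_known_args(argv)
    return known, passthrough
```

With `dbt-ol run --project-dir path/to/dir`, the wrapper keeps `--project-dir` for itself and forwards `run` to dbt unchanged.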
0.1.0 - 2021-08-12
OpenLineage is an Open Standard for lineage metadata collection, designed to record metadata for a job in execution. The initial public release includes:
- An initial specification. The initial version `1-0-0` of the OpenLineage specification defines the core model and facets.
- Integrations that collect lineage metadata as OpenLineage events:
  - Apache Airflow, with support for BigQuery, Great Expectations, Postgres, Redshift, and Snowflake
  - Apache Spark
  - dbt
- Clients that send OpenLineage events to an HTTP backend. Both `java` and `python` are initially supported.
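A minimal sketch of the run event shape such a client emits, with field names following the core model; the namespace and job name below are hypothetical, and facets are omitted for brevity:

```python
import json
from datetime import datetime, timezone
from uuid import uuid4

# A minimal run-start event following the core OpenLineage model
# (illustrative only; "my-namespace" and "my-job" are made-up names).
event = {
    "eventType": "START",
    "eventTime": datetime.now(timezone.utc).isoformat(),
    "run": {"runId": str(uuid4())},
    "job": {"namespace": "my-namespace", "name": "my-job"},
    "inputs": [],
    "outputs": [],
    "producer": "https://github.com/OpenLineage/OpenLineage",
}
payload = json.dumps(event)  # serialized body for an HTTP POST to the backend
```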