Commit

Merge pull request #92 from bitol-io/dev
Dev
pflooky authored Oct 20, 2024
2 parents c125b04 + 13b2d3d commit 51bffe2
Showing 43 changed files with 10,595 additions and 4,236 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/docs-site-deploy.yaml
@@ -1,6 +1,8 @@
name: docs-site-deploy
on:
push:
tags:
- *
branches:
- main
permissions:
@@ -26,10 +28,8 @@ jobs:
path: .cache
restore-keys: |
mkdocs-material-
- run: pip install mkdocs-open-in-new-tab mike
- run: pip install mkdocs-material
- run: pip install "mkdocs-material[imaging]"
- run: bash script/build_docs.sh
- run: pip install mkdocs-material mkdocs-open-in-new-tab "mkdocs-material[imaging]" mkdocs-awesome-pages-plugin mike
- run: bash src/script/build_docs.sh
- run: |
latest_tag=$(git describe --tags --abbrev=0)
mike deploy --push --update-aliases "$latest_tag" latest
18 changes: 18 additions & 0 deletions .github/workflows/validate-examples.yaml
@@ -0,0 +1,18 @@
name: validate-examples
on:
push:
branches: ["*"]

jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: latest
- name: Install ajv
run: |
npm i -g ajv-cli ajv-formats
- name: Validate examples
run: bash src/script/validate-examples.sh
7 changes: 7 additions & 0 deletions .gitignore
@@ -3,3 +3,10 @@
.idea
site
*.iml
.cache

docs/changelog.md
docs/contributing.md
docs/home.md
docs/vendors.md
docs/examples/**/*.md
88 changes: 81 additions & 7 deletions CHANGELOG.md
@@ -6,6 +6,76 @@ image: "https://raw.githubusercontent.com/bitol-io/artwork/main/horizontal/color

This document tracks the history and evolution of the **Open Data Contract Standard**.

# v3.0.0 - 2024-10-05 - IN REVIEW

* **New section**: Support & communication channels.
* **New section**: Servers.
* **Changes** to fundamentals:
* Rename `uuid` to `id`.
* Add `name`.
* Rename `quantumName` to `dataProduct` and make it optional.
* Rename `datasetDomain` to `domain` (we avoid the dataset prefix).
* Drop `datasetKind` (example: `virtualDataset`, was optional, have not seen any usage).
* Drop `userConsumptionMode` (example: `analytical`, was optional, already deprecated in v2).
* Drop `sourceSystem` (example: `bigQuery`, information will be encoded in servers).
* Drop `sourcePlatform` (example: `googleCloudPlatform`, information will be encoded in servers).
* Drop `productSlackChannel` (will move to support channels).
* Drop `productFeedbackUrl` (will move to support channels).
* Drop `productDl` (will move to support channels).
* Drop `username` (credentials should not be stored in the data contract).
* Drop `password` (credentials should not be stored in the data contract).
* Drop `driverVersion` (will move to servers if needed).
* Drop `driver` (will move to servers if needed).
* Drop `server` (will move to servers if needed).
* Drop `project` (BigQuery-specific, will move to servers).
* Drop `datasetName` (BigQuery-specific, will move to servers).
* Drop `database` (BigQuery-specific, will move to servers).
* Drop `schedulerAppName` (not part of the contract).
* **Changes** to Schema:
* Major changes, check spec.
* Adds support for non-table formats, hierarchies, and arrays.
* `name` is a new field.
* `items` is a new field.
* `priorTableName` is not supported anymore, if needed, consider a custom property.
* `table` is not supported anymore, if needed, consider using `name`.
* `columns` is now `properties`.
* `dataGranularity` is now `dataGranularityDescription`.
* `encryptedColumnName` is now `encryptedName`.
* `partitionStatus` is now `partitioned`.
* `clusterStatus` is not supported anymore, if needed, consider a custom property.
* `clusterKeyPosition` is not supported anymore, if needed, consider a custom property.
* `sampleValues` is now `examples`.
* `isNullable` is now `required`.
* `isUnique` is now `unique`.
* `isPrimaryKey` is now `primaryKey`.
* `criticalDataElementStatus` is now `criticalDataElement`.
* `transformSourceTables` is now `transformSourceObjects`.
* Restrict `schema.*.logicalType` to be one of `string`, `date`, `number`, `integer`, `object`, `array`, `boolean`.
* Add `schema.*.logicalTypeOptions`.
* **Changes** to Data Quality:
* Significant changes have been applied to support more tools and use cases. Please review the new section.
* If needed, `templateName` is a custom property.
* `toolName` is obsolete, replaced by `type=custom; engine: <engine name>`.
* `scheduleCronExpression` is replaced by `schedule` and `scheduler`. `scheduleCronExpression: 0 20 * * *` becomes `schedule: 0 20 * * *` and `scheduler: cron`.
* Pricing:
* No changes.
* **Changes** to team (fka stakeholders):
* Replaces `stakeholders`. Content stays the same.
* **Changes** to Role:
* Added `description`.
* `access` is not required anymore.
* Security:
* No changes.
* **Changes** to SLA:
* Starting with v3, the schema is not purely tables and columns, hence minor modifications: columns are now elements.
* `slaDefaultColumn` is now `slaDefaultElement`.
* `column` is now `element`.
* Explicit reference to Data QoS.
* **Changes** to custom and other properties:
* `systemInstance` is not supported anymore, if needed, consider a custom property.
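
The top-level renames and drops listed above are mechanical enough to script. The following is a minimal sketch, not an official migration tool: it assumes a v2 contract has already been loaded into a plain `dict`, the key lists are copied from the "Changes to fundamentals" and "Changes to Data Quality" entries, and the function names `migrate_fundamentals` and `migrate_schedule` are hypothetical. Dropped keys are simply discarded, so anything that should move to servers or support channels still has to be moved by hand.

```python
# Key lists are copied from the changelog entries above.
# The function names are hypothetical, chosen only for this illustration.
FUNDAMENTAL_RENAMES = {
    "uuid": "id",
    "quantumName": "dataProduct",
    "datasetDomain": "domain",
}

FUNDAMENTAL_DROPS = frozenset({
    "datasetKind", "userConsumptionMode", "sourceSystem", "sourcePlatform",
    "productSlackChannel", "productFeedbackUrl", "productDl",
    "username", "password", "driverVersion", "driver", "server",
    "project", "datasetName", "database", "schedulerAppName",
})


def migrate_fundamentals(contract):
    """Return a new dict with the v3 top-level renames and drops applied."""
    migrated = {}
    for key, value in contract.items():
        if key in FUNDAMENTAL_DROPS:
            continue  # dropped in v3; relocate manually if still needed
        migrated[FUNDAMENTAL_RENAMES.get(key, key)] = value
    return migrated


def migrate_schedule(rule):
    """Replace `scheduleCronExpression` with `schedule` plus `scheduler: cron`."""
    migrated = dict(rule)
    if "scheduleCronExpression" in migrated:
        migrated["schedule"] = migrated.pop("scheduleCronExpression")
        migrated["scheduler"] = "cron"
    return migrated


if __name__ == "__main__":
    v2_contract = {"uuid": "abc", "datasetDomain": "seller", "password": "x"}
    print(migrate_fundamentals(v2_contract))
    print(migrate_schedule({"scheduleCronExpression": "0 20 * * *"}))
```

The sketch does not add the new `name` field, because its value cannot be derived from a v2 contract.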

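The property-level renames under "Changes to Schema" can be handled the same way. This is a hedged sketch under a few assumptions: each v2 column is a plain `dict`, the helper names are hypothetical, and `isNullable` is deliberately set aside rather than renamed, since a nullable flag and a required flag most likely have opposite polarity and the changelog does not spell out the conversion. The allowed `logicalType` values are the ones listed above.

```python
# Renames copied from the "Changes to Schema" entries above.
# Helper names are hypothetical, chosen only for this illustration.
PROPERTY_RENAMES = {
    "encryptedColumnName": "encryptedName",
    "partitionStatus": "partitioned",
    "sampleValues": "examples",
    "isUnique": "unique",
    "isPrimaryKey": "primaryKey",
    "criticalDataElementStatus": "criticalDataElement",
    "transformSourceTables": "transformSourceObjects",
}

# Keys with no direct v3 equivalent, plus `isNullable` (assumption: its
# boolean likely needs inverting to become `required`, so a human decides).
NEEDS_REVIEW = frozenset({"clusterStatus", "clusterKeyPosition", "isNullable"})

ALLOWED_LOGICAL_TYPES = frozenset(
    {"string", "date", "number", "integer", "object", "array", "boolean"}
)


def migrate_column(column):
    """Return (v3_property, leftovers) for one v2 column dict."""
    prop, leftovers = {}, {}
    for key, value in column.items():
        if key in NEEDS_REVIEW:
            leftovers[key] = value  # keep for a manual decision
        else:
            prop[PROPERTY_RENAMES.get(key, key)] = value
    return prop, leftovers


def has_valid_logical_type(prop):
    """True when `logicalType` is absent or one of the v3 allowed values."""
    logical_type = prop.get("logicalType")
    return logical_type is None or logical_type in ALLOWED_LOGICAL_TYPES
```

Returning the leftovers separately keeps the unsupported keys visible, so they can be turned into custom properties where that makes sense.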

# v2.2.2 - 2024-05-23 - APPROVED

* In JSON schema validation:
@@ -14,12 +84,13 @@ This document tracks the history and evolution of the **Open Data Contract Stand
* Change `price.priceAmount` data type from `string` to `number`.
* Change `slaProperties.value` data type from `string` to `oneOf[string, number]`.
* Change `slaProperties.valueExt` data type from `string` to `oneOf[string, number]`.
* Update [examples](docs/examples) to adhere to JSON schema.
* Full example from README directs to [full-example.yaml](docs/examples/all/full-example.yaml).
* Update [examples](docs/examples/README.md) to adhere to JSON schema.
* Full example from README directs to [full-example.yaml](docs/examples/all/full-example.odcs.yaml).
* Add in mkdocs for creating a [documentation website](https://bitol-io.github.io/open-data-contract-standard/). Check [building-doc.md](building-doc.md).
* Add vendors page [vendors.md](vendors.md). Feel free to add anyone there.

# v2.2.1 - 2023-12-18 - APPROVED

# v2.2.1 - 2023-12-18 - REPLACED BY V2.2.2

* Reformat quality examples to be valid YAML.
* Type of definition for authority have standard values: `businessDefinition`, `transformationImplementation`, `videoTutorial`, `tutorial`, and `implementation`.
@@ -28,7 +99,8 @@ This document tracks the history and evolution of the **Open Data Contract Stand
* Integrated as part of [Bitol](https://lfaidata.foundation/projects/bitol/).
* Reformat Markdown tables.

# v2.2.0 - 2023-07-27 - APPROVED

# v2.2.0 - 2023-07-27 - REPLACED BY V2.2.1

* New name to Open Data Contract Standard.
* `templateName` is now called `standardVersion`, v2.2.0 parsers should account for this change and support both to avoid a breaking change.
@@ -37,12 +109,14 @@ This document tracks the history and evolution of the **Open Data Contract Stand
* Various improvements and typo corrections.
* Finalization of fork under AIDA User Group.

# v2.1.1 - 2023-04-26 - APPROVED

# v2.1.1 - 2023-04-26 - REPLACED BY V2.2.0

* Open source version.
* Additional value field `valueExt` in SLA.

# v2.1.0 - 2023-03-23 - APPROVED

# v2.1.0 - 2023-03-23 - REPLACED BY V2.1.1

## Data Quality
The data contract adds elements specifically for interfacing with the Data Quality tooling.
@@ -68,7 +142,7 @@ The service-level agreements not previously used are more detailed to follow the
## Other
Removed the weight for system ratings from the data contract. Their default values remain.

# v2.0.0 - SUPERSEED BY V2.1.0
# v2.0.0 - REPLACED BY V2.1.0

## Guidelines & Evolution
* [Type case](https://google.github.io/styleguide/jsoncstyleguide.xml?showone=Property_Name_Format#Property_Name_Format)
85 changes: 2 additions & 83 deletions CONTRIBUTING.md
@@ -4,88 +4,7 @@ description: "How you can contribute to the Open Data Contract Standard (ODCS)."
image: "https://raw.githubusercontent.com/bitol-io/artwork/main/horizontal/color/Bitol_Logo_color.svg"
---

# Open Data Contract Standard
# Contributing to Open Data Contract Standard

## Executive summary
First off, thanks for taking the time to contribute! ❤️
Thank you for your interest in contributing to Open Data Contract Standard (ODCS). Please refer to the [TSC contributing guidelines](https://github.com/bitol-io/tsc/blob/main/CONTRIBUTING.md).

All types of contributions are encouraged and valued. See the [Table of Contents](#table-of-contents) for different ways to help and details about how this project handles them. Please make sure to read the relevant section before making your contribution. It will make it a lot easier for us maintainers and smooth out the experience for all involved. The community looks forward to your contributions. 🎉

You do not have to be a member of AIDA User Group to contribute, although becoming a member is free. Strength is always in the number. Check [it out](https://aidausergroup.org/join/).

> And if you like the project, but just don't have time to contribute, that's fine. There are other easy ways to support the project and show your appreciation, which we would also be very happy about:
> - Star the project.
> - Tweet about it.
> - Refer this project in your project's readme.
> - Mention the project at local meetups and tell your friends/colleagues.
<!-- omit in toc -->
## Table of Contents

- [Code of Conduct](#code-of-conduct)
- [I Have a Question](#i-have-a-question)
- [I Want To Contribute](#i-want-to-contribute)
- [Suggesting Enhancements](#suggesting-enhancements)
- [Improving The Documentation](#improving-the-documentation)
- [Join The Project Team](#join-the-project-team)


## Code of Conduct

This project and everyone participating in it is governed by the
[Open Data Contract Standard Code of Conduct](blob/master/CODE_OF_CONDUCT.md).
By participating, you are expected to uphold this code. Please report unacceptable behavior
to [@jgperrin](https://github.com/jgperrin).


## I Have a Question

** New **

AIDA User Group also opened its Slack for Data Contract discussion. It is an alternate way of contributing to this project. The Slack channel is now [available](https://aidaug.slack.com/archives/C05UZRSBKLY).

You have to be a member of AIDA User Group (it's free) to have access to our Slack channel. All the details are [here](https://aidausergroup.org/welcome/).

> If you want to ask a question, we assume that you have read the available [Documentation](https://github.com/AIDAUserGroup/open-data-contract-standard).
Before you ask a question, it is best to search for existing [Issues](https://github.com/bitol-io/open-data-contract-standard/issues) that might help you. In case you have found a suitable issue and still need clarification, you can write your question in this issue. It is also advisable to search the internet for answers first.

If you then still feel the need to ask a question and need clarification, we recommend the following:

- Open a [New Issue](https://github.com/bitol-io/open-data-contract-standard/issues/new).
- Provide as much context as you can about what you're running into.

We will then take care of the issue as soon as possible.

## I Want To Contribute

> ### Legal Notice <!-- omit in toc -->
> When contributing to this project, you must agree that you have authored 100% of the content, that you have the necessary rights to the content and that the content you contribute may be provided under the project license.
### Suggesting Enhancements

This section guides you through submitting an enhancement suggestion for Data Contract Template, **including completely new features and minor improvements to existing functionality**. Following these guidelines will help maintainers and the community to understand your suggestion and find related suggestions.

<!-- omit in toc -->
#### Before Submitting an Enhancement

- Make sure that you are using the latest version.
- Read the [documentation](https://github.com/AIDAUserGroup/open-data-contract-standard) carefully and find out if the functionality is already covered.
- Perform a [search](https://github.com/bitol-io/open-data-contract-standard/issues) to see if the enhancement has already been suggested. If it has, add a comment to the existing issue instead of opening a new one.
- Find out whether your idea fits with the scope and aims of the project. It's up to you to make a strong case to convince the project's developers of the merits of this feature. Keep in mind that we want features that will be useful to the majority of our users and not just a small subset.

<!-- omit in toc -->
#### How Do I Submit a Good Enhancement Suggestion?

Enhancement suggestions are tracked as [GitHub issues](https://github.com/bitol-io/open-data-contract-standard/issues).

- Use a **clear and descriptive title** for the issue to identify the suggestion.
- Provide a **step-by-step description of the suggested enhancement** in as many details as possible.
- **Describe the current behavior** and **explain which behavior you expected to see instead** and why. At this point you can also tell which alternatives do not work for you.
- **Explain why this enhancement would be useful** to most Open Data Contract Standard users. You may also want to point out the other projects that solved it better and which could serve as inspiration.

### Improving The Documentation
Please contact [@jgperrin](https://github.com/jgperrin). Examples are always welcome.

## Join The Project Team
Please contact [@jgperrin](https://github.com/jgperrin).
45 changes: 15 additions & 30 deletions README.md
@@ -13,7 +13,7 @@ Welcome!
Thanks for your interest and for taking the time to come here! ❤️

## Executive summary
This standard describes a structure for a **data contract**. Its current version is v2.2.2. It is available for you as an Apache 2.0 license. Contributions are welcome!
This standard describes a structure for a **data contract**. Its current version is v3.0.0. It is available for you as an Apache 2.0 license. Contributions are welcome!

## Discover the open standard
A reader-friendly version of the standard can be found on its [dedicated site](https://bitol-io.github.io/open-data-contract-standard/).
@@ -25,14 +25,18 @@ Discover the [Open Data Contract Standard](docs/README.md). This file contains s
### The basics of a data contract
A data contract defines the agreement between a data producer and consumers. A data contract contains several sections:

* [Fundamentals](docs/README.md#demographics).
* [Schema](docs/README.md#dataset-and-schema).
* [Data quality](docs/README.md#data-quality-).
* [Service-level agreement (SLA)](docs/README.md#service-level-agreement).
* [Security & stakeholders](docs/README.md#stakeholders).
* [Custom properties](docs/README.md#other-properties).
* [Fundamentals](docs/README.md#fundamentals).
* [Schema](docs/README.md#schema).
* [Data quality](docs/README.md#data-quality).
* [Support & communication channels](docs/README.md#support-and-communication-channels).
* [Pricing](docs/README.md#pricing).
* [Team](docs/README.md#team).
* [Roles](docs/README.md#roles).
* [Service-level agreement (SLA)](docs/README.md#service-level-agreement-sla).
* [Infrastructure & servers](docs/README.md#infrastructure-and-servers).
* [Custom properties](docs/README.md#custom-properties).

![Data contract schema](docs/img/data-contract-v2.2.1-schema.svg "Data contract schema")
![Data contract schema](docs/img/data-contract-diagram-latest.svg "Data contract schema")

*Figure 1: illustration of a data contract, its principal contributors, sections, and usage.*

@@ -44,30 +48,11 @@ validation of your YAML files. Links below show how you can import the schema:
- [IntelliJ](https://www.jetbrains.com/help/idea/json.html#ws_json_schema_add_custom)
- [VS Code](https://code.visualstudio.com/docs/languages/json#_json-schemas-and-settings)

## Contributing to the project
Check out the [CONTRIBUTING](./CONTRIBUTING.md) file.

## Articles and Other Resources
Check out the [resources](resources.md) page.

* 2024-07-17 - [Data Contracts in Action: Testing](https://medium.com/@pflooky/data-contracts-in-action-testing-111631338657)
* 2024-06-12 - [The Future of Data Management: An Enabler of AI Development? A Basic Illustration with RAG, Open Standards, and Data Contracts](https://blog.owulveryck.info/2024/06/12/the-future-of-data-management-an-enabler-of-ai-development-a-basic-illustration-with-rag-open-standards-and-data-contracts.html)
* 2024-05-30 - [ODCS Roadmap](https://medium.com/abeadata/odcs-roadmap-9b9a17367af4)
* 2024-05-25 - [Conceptual model of Data Quality of Service as Code by Jarkko Moilanen](https://aidausergroup.org/2024/05/25/aida-user-group-forecaster-pi-day-highlights-data-quality-whats-new/)
* 2024-02-06 - [Getting started with ODCS](https://medium.com/abeadata/getting-started-with-odcs-3ba790707879)
* 2023-12-08 - [Why the Need for Standardizing Data Contracts?](https://medium.com/abeadata/why-the-need-for-standardizing-data-contracts-133bc3491148)
* 2023-11-30 - [Linux Foundation AI & Data - Bitol Joins LF AI & Data as New Sandbox Project](https://lfaidata.foundation/blog/2023/11/30/bitol-joins-lf-ai-data-as-new-sandbox-project/)
* 2023-11-30 - [AIDAUG - Bitol Joins LF AI & Data as New Sandbox Project](https://aidausergroup.org/2023/11/30/bitol-joins-lf-ai-data-as-new-sandbox-project/)
* 2023-11-22 - [What is, and what isn’t, a data contract](https://datacreation.substack.com/p/what-is-and-what-isnt-a-data-contract)
* 2023-10-01 - [Data Contracts: A Bridge Connecting Two Worlds](https://medium.com/@atanas.iliev.ai/data-contracts-a-bridge-connecting-two-worlds-404eff1d970d)
* 2023-09-10 - [Data Contracts 101](https://medium.com/p/568a9adbf9a9)
* 2023-08-10 - [Welcome to the Open Data Contract Standard](https://jgp.ai/2023/08/09/welcome-to-the-open-data-contract-standard/)
* 2023-05-11 - [Data Contracts – Everything You Need to Know](https://www.montecarlodata.com/blog-data-contracts-explained/)
* 2023-05-07 - [Data Engineering Weekly #130 - Data Contract in the Wild with PayPal’s Data Contract Template](https://www.dataengineeringweekly.com/p/data-engineering-weekly-130)
* 2023-05-06 - [PayPal เปิด Data Contract เป็น Open Source Template ให้ไปใช้งานกัน](https://discuss.dataengineercafe.io/t/paypal-data-contract-open-source-template/581/1)
* 2023-05-05 - [Jonathan Neo (j__neo ) on Reddit](https://www.reddit.com/r/dataengineering/comments/137glbo/comment/jixw5hj/?utm_source=reddit&utm_medium=web2x&context=3)
* 2023-05-01 - [PayPal open sources its data contract template](https://jgp.ai/2023/05/01/paypal-open-sources-its-data-contract-template/)

If you spot an article about the Open Data Contract Standard, make a pull request!
## Contributing to the project
Check out the [CONTRIBUTING](./CONTRIBUTING.md) page.

## More
