Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add data-classification.md extension #1317

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions cloudevents/extensions/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ for more information.

- [Auth Context](authcontext.md)
- [BAM](bam.md)
- [Data Classification](data-classification.md)
- [Dataref (Claim Check Pattern)](dataref.md)
- [Deprecation](deprecation.md)
- [Distributed Tracing](distributed-tracing.md)
Expand Down
90 changes: 90 additions & 0 deletions cloudevents/extensions/data-classification.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
# Data Classification Extension

CloudEvents might contain payloads which are subjected to data protection
regulations like GDPR or HIPAA. For intermediaries and consumers knowing how
event payloads are classified, which data protection regulation applies and how
payloads are categorized, enables compliant processing of events.

This extension defines attributes to describe to
[consumers](../spec.md#consumer) or [intermediaries](../spec.md#intermediary)
how an event and its payload is classified, category of the payload and any
applicable data protection regulations.

These attributes are intended for classification at an event and payload level
and not at a `data` field level. Classification at a field level is best defined
in the schema specified via the `dataschema` attribute.

## Notational Conventions

As with the main [CloudEvents specification](../spec.md), the key words "MUST",
"MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as
described in [RFC 2119](https://tools.ietf.org/html/rfc2119).

However, the scope of these key words is limited to when this extension is used.
For example, an attribute being marked as "REQUIRED" does not mean it needs to
be in all CloudEvents, rather it needs to be included only when this extension
is being used.

## Attributes

### dataclassification

- Type: `String`
- Description: Data classification level for the event payload within the
context of a `dataregulation`. In situations where `dataregulation` is
undefined or the data protection regulation does not define any labels, then
RECOMMENDED labels are: `public`, `internal`, `confidential`, or
`restricted`.
- Constraints:
- REQUIRED

### dataregulation

- Type: `String`
- Description: A comma-delimited list of applicable data protection regulations.
For example: `GDPR`, `HIPAA`, `PCI-DSS`, `ISO-27001`, `NIST-800-53`, `CCPA`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this potentially goes down a rabbit-hole of trying to maintain catalogs but is there value is formalizing some of the regulation codes or referencing some well-known external catalog (if one exists).

In addition, does the applicability of some of these regulations vary by jurisdiction? if so, does that need to be represented in some fashion ?

- Constraints:
- OPTIONAL
- if present, MUST be a non-empty string without internal spaces. Leading and
trailing spaces around each entry MUST be ignored.

### datacategory

- Type: `String`
- Description: Data category of the event payload within the context of a
`dataregulation` and `dataclassification`. For GDPR personal data typical
labels are: `non-sensitive`, `standard`, `sensitive`, `special-category`. For
US personal data this could be: `sensitive-pii`, `non-sensitive-pii`,
`non-pii`. And for personal health information under HIPAA: `phi`.
- Constraints:
- OPTIONAL
- if present, MUST be a non-empty string

## Usage

When this extension is used, producers MUST set the value of the
`dataclassification` attribute. When applicable the `dataregulation` and
`datacategory` attributes MAY be set to provide additional details on the
classification context.

When an implementation supports this extension, then intermediaries and
consumers MUST take these attributes into account and act accordingly to data
regulations and/or internal policies in processing the event and payload. If
intermediaries or consumers cannot meet such requirements, they MUST reject or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should a client do if they know "support" this extension, but see values they don't know about (e.g. a new dataregulation value)? We may want to adjust this to say "if you don't know you can meet requirements, you should assume you can't".

ignore the event.

Intermediaries SHOULD NOT modify the `dataclassification`, `dataregulation`, and
jskeet marked this conversation as resolved.
Show resolved Hide resolved
`datacategory` attributes.

## Use cases

Examples where data classification of events can be useful are:

- When an event contains PII or restricted information and therefore processing
by intermediaries or consumers need to adhere to certain policies. For example
having separate processing pipelines by sensitivity or having logging,
auditing and access policies based upon classification.
- When an event payload is subjected to regulation and therefore retention
policies apply. For example, having event retention policies based upon data
classification or to enable automated data purging of durable topics.
2 changes: 2 additions & 0 deletions cloudevents/languages/he/extensions/data-classification.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Data Classification Extension
מסמך זה טרם תורגם. בבקשה תשתמשו [בגרסה האנגלית של המסמך](../../../extensions/data-classification.md) לבינתיים.
6 changes: 6 additions & 0 deletions cloudevents/languages/zh-CN/extensions/data-classification.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# Data Classification Extension

本文档尚未被翻译,请先阅读英文[原版文档](../../../extensions/data-classification.md) 。

如果您迫切地需要此文档的中文翻译,请[提交一个issue](https://github.com/cloudevents/spec/issues) ,
我们会尽快安排专人进行翻译。
Loading