import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';

Ownership

Why Would You Use Users and Groups?

Users and groups are essential for managing ownership of data. By creating or updating user accounts and assigning them to the appropriate groups, administrators can ensure that the right people have access to the data they need to do their jobs. Clear ownership also avoids confusion or conflict over who is responsible for a specific dataset.

Goal Of This Guide

This guide will show you how to

  • Create: create or update users and groups.
  • Read: read the owners attached to a dataset.
  • Add: add a user or group as an owner of a dataset.
  • Remove: remove an owner from a dataset.

Pre-requisites

For this tutorial, you need to deploy DataHub Quickstart and ingest sample data. For detailed information, please refer to the DataHub Quickstart Guide.

:::note
In this guide, ingesting sample data is optional.
:::

Upsert Users

Save this user.yaml as a local file.

- id: [email protected]
  first_name: The
  last_name: Bar
  email: [email protected]
  slack: "@the_bar_raiser"
  description: "I like raising the bar higher"
  groups:
    - [email protected]
- id: datahub
  slack: "@datahubproject"
  phone: "1-800-GOT-META"
  description: "The DataHub Project"
  picture_link: "https://raw.githubusercontent.com/datahub-project/datahub/master/datahub-web-react/src/images/datahub-logo-color-stable.svg"

Execute the following CLI command to ingest the user data. Because the user datahub already exists in the sample data, the values in user.yaml overwrite that user's existing information.

datahub user upsert -f user.yaml

If you see the following logs, the operation was successful:

Update succeeded for urn urn:li:corpuser:[email protected].
Update succeeded for urn urn:li:corpuser:datahub.
{{ inline /metadata-ingestion/examples/library/upsert_user.py show_path_as_comment }}
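
The success logs above report one URN per user, and each URN looks like the `id` from user.yaml prefixed with `urn:li:corpuser:`. The following is a minimal sketch of that mapping, assuming the convention holds for every id. The helpers `corp_user_urn` and `expected_urns` and the sample records are hypothetical and are not part of the DataHub SDK or CLI.

```python
from typing import Any, Dict, List

CORP_USER_PREFIX = "urn:li:corpuser:"


def corp_user_urn(user_id: str) -> str:
    """Return the corpuser URN for an id, leaving an existing URN untouched."""
    if user_id.startswith(CORP_USER_PREFIX):
        return user_id
    return CORP_USER_PREFIX + user_id


def expected_urns(users: List[Dict[str, Any]]) -> List[str]:
    """List the URNs that an upsert of these user records should report."""
    return [corp_user_urn(str(user["id"])) for user in users]


# Hypothetical records shaped like the entries in user.yaml.
users = [
    {"id": "datahub", "slack": "@datahubproject"},
    {"id": "jdoe", "first_name": "John"},
]

for urn in expected_urns(users):
    print(urn)
```

Comparing this list with the CLI output is a quick way to confirm that every entry in the file was processed.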

Expected Outcomes of Upserting User

You can see that the user The Bar has been created and the user DataHub has been updated under Settings > Access > Users & Groups.

Upsert Group

Save this group.yaml as a local file. The group lists the users who are its owners and its members. Within these lists, you can refer to a user by id or by urn, and you can also specify a user's metadata inline within the group definition. See the example below to understand how this works, and feel free to modify the file locally to see the effect of your changes in your local DataHub instance.

id: [email protected]
display_name: Foo Group
owners:
  - datahub
members:
  - [email protected] # refer to a user either by id or by urn
  - id: [email protected] # inline specification of user
    slack: "@joe_shmoe"
    display_name: "Joe's Hub"

Execute the following CLI command to ingest this group's information.

datahub group upsert -f group.yaml

If you see the following logs, the operation was successful:

Update succeeded for group urn:li:corpGroup:[email protected].
{{ inline /metadata-ingestion/examples/library/upsert_group.py show_path_as_comment }}
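
The group file accepts three member forms: a bare id, a full urn, or an inline mapping with an `id` key. The sketch below shows one way to normalize all three to corpuser URNs, which can help when checking a group.yaml before ingesting it. It assumes the `urn:li:corpuser:` prefix seen in the user logs; `member_urn`, `group_member_urns`, and the sample group are hypothetical and do not reflect how the CLI is implemented.

```python
from typing import Any, Dict, List, Union

CORP_USER_PREFIX = "urn:li:corpuser:"

# A member is either a string (id or urn) or an inline user mapping.
Member = Union[str, Dict[str, str]]


def member_urn(member: Member) -> str:
    """Normalize one member entry to a corpuser URN."""
    if isinstance(member, dict):
        member = member["id"]  # inline specification: take its id
    if member.startswith(CORP_USER_PREFIX):
        return member  # already a urn
    return CORP_USER_PREFIX + member


def group_member_urns(group: Dict[str, Any]) -> List[str]:
    """Return the URNs of every member listed in a parsed group record."""
    return [member_urn(m) for m in group.get("members") or []]


# Hypothetical record shaped like group.yaml after YAML parsing.
group = {
    "id": "foo-group",
    "display_name": "Foo Group",
    "owners": ["datahub"],
    "members": [
        "jdoe",                              # by id
        "urn:li:corpuser:datahub",           # by urn
        {"id": "joe", "slack": "@joe_shmoe"},  # inline
    ],
}

print(group_member_urns(group))
```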

Expected Outcomes of Upserting Group

You can see that the group Foo Group has been created under Settings > Access > Users & Groups.

Read Owners

query {
  dataset(urn: "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)") {
    ownership {
      owners {
        owner {
          ... on CorpUser {
            urn
            type
          }
          ... on CorpGroup {
            urn
            type
          }
        }
      }
    }
  }
}

If you see the following response, the operation was successful:

{
  "data": {
    "dataset": {
      "ownership": {
        "owners": [
          {
            "owner": {
              "urn": "urn:li:corpuser:jdoe",
              "type": "CORP_USER"
            }
          },
          {
            "owner": {
              "urn": "urn:li:corpuser:datahub",
              "type": "CORP_USER"
            }
          }
        ]
      }
    }
  },
  "extensions": {}
}
curl --location --request POST 'http://localhost:8080/api/graphql' \
--header 'Authorization: Bearer <my-access-token>' \
--header 'Content-Type: application/json' \
--data-raw '{ "query": "{ dataset(urn: \"urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)\") { ownership { owners { owner { ... on CorpUser { urn type } ... on CorpGroup { urn type } } } } } }", "variables":{}}'

Expected Response:

{
  "data": {
    "dataset": {
      "ownership": {
        "owners": [
          { "owner": { "urn": "urn:li:corpuser:jdoe", "type": "CORP_USER" } },
          { "owner": { "urn": "urn:li:corpuser:datahub", "type": "CORP_USER" } }
        ]
      }
    }
  },
  "extensions": {}
}
{{ inline /metadata-ingestion/examples/library/dataset_query_owners.py show_path_as_comment }}
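
Once the response arrives, the owner URNs sit several levels deep in the JSON. The following sketch pulls them out with the standard library only, using the response shape shown above. The function name `owner_urns` is hypothetical, and the defensive handling of missing keys is an assumption about how an absent dataset or empty ownership might be returned.

```python
import json
from typing import List

# Response body copied from the expected response above.
RESPONSE = """
{
  "data": {
    "dataset": {
      "ownership": {
        "owners": [
          { "owner": { "urn": "urn:li:corpuser:jdoe", "type": "CORP_USER" } },
          { "owner": { "urn": "urn:li:corpuser:datahub", "type": "CORP_USER" } }
        ]
      }
    }
  },
  "extensions": {}
}
"""


def owner_urns(response_text: str) -> List[str]:
    """Extract owner URNs from a dataset ownership GraphQL response."""
    payload = json.loads(response_text)
    # Each level may be missing or null, so fall back to an empty container.
    dataset = (payload.get("data") or {}).get("dataset") or {}
    ownership = dataset.get("ownership") or {}
    return [entry["owner"]["urn"] for entry in ownership.get("owners") or []]


print(owner_urns(RESPONSE))
```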

Add Owners

mutation addOwners {
    addOwner(
      input: {
        ownerUrn: "urn:li:corpGroup:bfoo",
        resourceUrn: "urn:li:dataset:(urn:li:dataPlatform:hive,fct_users_created,PROD)",
        ownerEntityType: CORP_GROUP,
        type: TECHNICAL_OWNER
      }
    )
}

Expected Response:

{
  "data": {
    "addOwner": true
  },
  "extensions": {}
}
curl --location --request POST 'http://localhost:8080/api/graphql' \
--header 'Authorization: Bearer <my-access-token>' \
--header 'Content-Type: application/json' \
--data-raw '{ "query": "mutation addOwners { addOwner(input: { ownerUrn: \"urn:li:corpGroup:bfoo\", resourceUrn: \"urn:li:dataset:(urn:li:dataPlatform:hive,fct_users_created,PROD)\", ownerEntityType: CORP_GROUP, type: TECHNICAL_OWNER }) }", "variables":{}}'
{{ inline /metadata-ingestion/examples/library/dataset_add_owner.py show_path_as_comment }}
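
The curl call above can also be assembled with Python's standard library, which makes it easier to vary the owner and dataset. This sketch builds the same mutation text and wraps it in an HTTP request against the endpoint from the curl example. The helper names `add_owner_query` and `build_request` are hypothetical, and the network call is left commented out so the snippet runs without a live DataHub instance.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:8080/api/graphql"  # endpoint from the curl example


def add_owner_query(
    owner_urn: str,
    resource_urn: str,
    owner_entity_type: str = "CORP_GROUP",
    ownership_type: str = "TECHNICAL_OWNER",
) -> str:
    """Build the addOwner mutation text; json.dumps quotes the string values."""
    return (
        "mutation addOwners { addOwner(input: { "
        f"ownerUrn: {json.dumps(owner_urn)}, "
        f"resourceUrn: {json.dumps(resource_urn)}, "
        f"ownerEntityType: {owner_entity_type}, "
        f"type: {ownership_type} "
        "}) }"
    )


def build_request(query: str, token: str) -> urllib.request.Request:
    """Wrap a GraphQL query in a POST request with the headers curl sends."""
    body = json.dumps({"query": query, "variables": {}}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


query = add_owner_query(
    "urn:li:corpGroup:bfoo",
    "urn:li:dataset:(urn:li:dataPlatform:hive,fct_users_created,PROD)",
)
request = build_request(query, token="<my-access-token>")

# Uncomment to send the request to a running DataHub instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```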

Expected Outcomes of Adding Owner

You can now see that the group bfoo has been added as an owner of the fct_users_created dataset.

Remove Owners

mutation removeOwners {
    removeOwner(
      input: {
        ownerUrn: "urn:li:corpuser:jdoe",
        resourceUrn: "urn:li:dataset:(urn:li:dataPlatform:hdfs,SampleHdfsDataset,PROD)"
      }
    )
}

Note that you can also remove owners from multiple entities or subresources using batchRemoveOwners.

mutation batchRemoveOwners {
    batchRemoveOwners(
      input: {
        ownerUrns: ["urn:li:corpuser:jdoe"],
        resources: [
          { resourceUrn: "urn:li:dataset:(urn:li:dataPlatform:hdfs,SampleHdfsDataset,PROD)" },
          { resourceUrn: "urn:li:dataset:(urn:li:dataPlatform:hive,fct_users_created,PROD)" }
        ]
      }
    )
}

Expected Response (for removeOwner):

{
  "data": {
    "removeOwner": true
  },
  "extensions": {}
}
curl --location --request POST 'http://localhost:8080/api/graphql' \
--header 'Authorization: Bearer <my-access-token>' \
--header 'Content-Type: application/json' \
--data-raw '{ "query": "mutation removeOwner { removeOwner(input: { ownerUrn: \"urn:li:corpuser:jdoe\", resourceUrn: \"urn:li:dataset:(urn:li:dataPlatform:hdfs,SampleHdfsDataset,PROD)\" }) }", "variables":{}}'
{{ inline /metadata-ingestion/examples/library/dataset_remove_owner_execute_graphql.py show_path_as_comment }}
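
When the same owner must be removed from many datasets, writing the batchRemoveOwners mutation by hand becomes tedious. The sketch below generates the mutation text from lists of owner and dataset URNs, following the dataset URN layout used throughout this guide. Both `dataset_urn` and `batch_remove_owners_query` are hypothetical stdlib-only helpers. The resulting string can be sent as the `query` field of the same POST body used in the curl examples.

```python
import json
from typing import Iterable


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Format a dataset URN in the layout used by the examples in this guide."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def batch_remove_owners_query(
    owner_urns: Iterable[str], resource_urns: Iterable[str]
) -> str:
    """Build a batchRemoveOwners mutation for the given owners and resources."""
    owners = ", ".join(json.dumps(urn) for urn in owner_urns)
    resources = ", ".join(
        f"{{ resourceUrn: {json.dumps(urn)} }}" for urn in resource_urns
    )
    return (
        "mutation batchRemoveOwners { batchRemoveOwners(input: { "
        f"ownerUrns: [{owners}], resources: [{resources}] "
        "}) }"
    )


query = batch_remove_owners_query(
    ["urn:li:corpuser:jdoe"],
    [
        dataset_urn("hdfs", "SampleHdfsDataset"),
        dataset_urn("hive", "fct_users_created"),
    ],
)
print(query)
```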

Expected Outcomes of Removing Owners

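You can now see that John Doe (jdoe) has been removed as an owner of the SampleHdfsDataset dataset. If you ran batchRemoveOwners, he has also been removed from the fct_users_created dataset.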