feat: add ingestion config map in values (#423)
* feat: add ingestion config map in values

Add support for providing the ingestion recipe's
definition in `datahub-ingestion-cron`'s `values.yaml`.

The current datahub helm implementation relies on an
externally managed config map to specify an ingestion
recipe. This requires either production k8s access
or a separate management system for k8s config maps.

Now helm can manage the config map for datahub.

This is backwards compatible: if `fileContent` is not
provided, the chart assumes the config map is managed externally.

* fix: improve some of the commented-out examples in values.yaml

---------

Co-authored-by: Travis Cook <[email protected]>
travis-cook-sfdc and Travis Cook authored Jan 9, 2024
1 parent 9e3524f commit 5366a37
Showing 5 changed files with 46 additions and 5 deletions.
4 changes: 2 additions & 2 deletions charts/datahub/Chart.yaml
@@ -4,7 +4,7 @@ description: A Helm chart for LinkedIn DataHub
 type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
-version: 0.3.24
+version: 0.3.25
 # This is the version number of the application being deployed. This version number should be
 # incremented each time you make changes to the application.
 appVersion: 0.12.1
@@ -26,7 +26,7 @@ dependencies:
     repository: file://./subcharts/datahub-mce-consumer
     condition: global.datahub_standalone_consumers_enabled
   - name: datahub-ingestion-cron
-    version: 0.2.139
+    version: 0.2.140
     repository: file://./subcharts/datahub-ingestion-cron
     condition: datahub-ingestion-cron.enabled
   - name: acryl-datahub-actions
2 changes: 1 addition & 1 deletion charts/datahub/subcharts/datahub-ingestion-cron/Chart.yaml
@@ -12,7 +12,7 @@ description: A Helm chart for Kubernetes
 type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
-version: 0.2.139
+version: 0.2.140
 # This is the version number of the application being deployed. This version number should be
 # incremented each time you make changes to the application.
 appVersion: v0.11.0
3 changes: 2 additions & 1 deletion charts/datahub/subcharts/datahub-ingestion-cron/README.md
@@ -17,6 +17,7 @@ A Helm chart for datahub's metadata-ingestion framework with kerberos authentica
 | crons.recipe | object | `{}` | Recipe configuration to be executed (required) |
 | crons.recipe.configmapName | string | `""` | Name of configmap to be mounted containing recipe to be executed |
 | crons.recipe.fileName | string | `""` | Name of property within configMap referenced by `recipe.configName` with the concrete recipe definition |
+| crons.recipe.fileContent | object | `{}` | Recipe for ingestion. If not present, assumes an externally managed config map |
 | crons.command | array | `["/bin/sh", "-c", "datahub ingest -c /etc/recipe/<crons.recipe.fileName>"]` | Array of strings denoting the crawling command to be invoked in the cron job. By default it will execute the recipe defined in the `crons.recipe` object. Cron crawling customization is possible by having extra volumes with custom logic to be executed. |
 | crons.hostAliases | array | `[]` | host aliases |
 | crons.env | object | `{}` | Environment variables to add to the cronjob container |
@@ -38,4 +39,4 @@ A Helm chart for datahub's metadata-ingestion framework with kerberos authentica
 | crons.affinity | object | `{}` | Affinity for pod assignment |
 | crons.tolerations | list | `[]` | Tolerations for pod assignment |
 | crons.extraSidecars | list | `[]` | Add sidecar containers to the pod |
-| crons.suspend | boolean | false | Suspend execution of a cron |
+| crons.suspend | boolean | false | Suspend execution of a cron |
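
For contrast with the new `crons.recipe.fileContent` option, the externally managed mode described in the table might look like the sketch below. The config map name `datahub-hive-ingestion`, the key `hive.yaml`, and the recipe body are illustrative assumptions rather than values taken from the chart. The only relationship the README states is that `configmapName` names the config map to mount and `fileName` names the property inside it. The block shows two separate files, trimmed to the recipe-related keys.

```yaml
# File 1 (sketch): a config map created outside helm, for example with kubectl.
# All names and the recipe body are hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: datahub-hive-ingestion   # must equal crons.<job>.recipe.configmapName
data:
  hive.yaml: |-                  # key must equal crons.<job>.recipe.fileName
    source:
      type: hive
      config:
        host_port: localhost:10000   # placeholder
    sink:
      type: datahub-rest
      config:
        server: http://localhost:8080
---
# File 2 (sketch): matching subchart values. fileContent is absent,
# so the chart is expected to leave the config map to be managed externally.
crons:
  hive:
    schedule: "0 0 * * *"
    recipe:
      configmapName: datahub-hive-ingestion
      fileName: hive.yaml
```

With `fileContent` left out, nothing changes for existing users, which is the backwards-compatible path the commit message describes.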
@@ -0,0 +1,15 @@
+{{- $labels := include "datahub-ingestion-cron.labels" .}}
+{{- range $jobName, $val := .Values.crons }}
+{{- if $val.recipe.fileContent }}
+apiVersion: {{ include "datahub-ingestion-cron.cronjob.apiVersion" $}}
+kind: ConfigMap
+metadata:
+  name: {{ $val.recipe.configmapName }}
+  labels: {{- $labels | nindent 4 }}
+data:
+  {{ $val.recipe.fileName }}: |-
+    {{- toYaml $val.recipe.fileContent | nindent 4 }}
+---
+{{- end }}
+{{- end }}
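
As a rough illustration, for a `crons.mysql` entry with `configmapName: datahub-mysql-ingestion`, `fileName: mysql.yaml`, and the MySQL recipe from the `values.yaml` example as `fileContent`, this template would be expected to render something like the sketch below. The `apiVersion` and label values come from chart helpers that are not part of this diff, so they appear as quoted placeholders. `toYaml` emits map keys in sorted order and drops comments, so the recipe layout differs from the one written in `values.yaml`.

```yaml
# Approximate rendered output (sketch). Quoted <...> values are placeholders,
# not real output of the chart helpers.
apiVersion: "<result of datahub-ingestion-cron.cronjob.apiVersion>"
kind: ConfigMap
metadata:
  name: datahub-mysql-ingestion
  labels:
    placeholder: "<result of datahub-ingestion-cron.labels>"
data:
  mysql.yaml: |-
    sink:
      config:
        server: http://localhost:8080
      type: datahub-rest
    source:
      config:
        database: dbname
        host_port: localhost:3306
        password: example
        username: root
      type: mysql
---
```

One point that may be worth checking: ConfigMap objects belong to the core `v1` API, while the helper used for `apiVersion` is named after the CronJob resource. If that helper returns a `batch/...` version, the rendered manifest would likely be rejected by the API server.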

27 changes: 26 additions & 1 deletion charts/datahub/subcharts/datahub-ingestion-cron/values.yaml
@@ -13,7 +13,7 @@ podSecurityContext: {}
 # fsGroup: 2000

 crons: {}
-#### Example data
+#### Example data with externally managed config map
 #hive:
 ## Daily at midnight (we may want to offset this to not conflict with other processes)
 #schedule: "0 0 * * *"
@@ -25,6 +25,31 @@ crons: {}
 ## Command to be executed
 #command: "datahub ingest -c <recipe.fileName>"

+# Example data with helm managed config map
+# mysql:
+#   schedule: "0 0 0 0 0"
+#   recipe:
+#     configmapName: datahub-mysql-ingestion
+#     fileName: mysql.yaml
+#     # Example mysql -> datahub source recipe
+#     fileContent:
+#       source:
+#         type: mysql
+#         config:
+#           # Coordinates
+#           host_port: localhost:3306
+#           database: dbname
+#           # Credentials
+#           username: root
+#           password: example
+#       sink:
+#         type: datahub-rest
+#         config:
+#           server: http://localhost:8080
+
+# Command to be executed
+#   command: "datahub ingest -c <recipe.fileName>"
+
 ## Deployment pod host aliases
 ## https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
 ##
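
Uncommented and supplied through the parent `datahub` chart, the helm managed example might look like the sketch below. It assumes the subchart is switched on through the `datahub-ingestion-cron.enabled` condition listed in `charts/datahub/Chart.yaml`. It also swaps the schedule for `0 0 * * *` (the value used in the hive example), because standard cron syntax does not accept `0` in the day-of-month and month fields, so `0 0 0 0 0` is unlikely to validate. Host, database, and credentials are the placeholders from the commented example. `command` is omitted on the assumption that the README's documented default applies.

```yaml
# Sketch of a values override for the parent datahub chart.
# Connection details are placeholders; replace them before use.
datahub-ingestion-cron:
  enabled: true
  crons:
    mysql:
      # Daily at midnight
      schedule: "0 0 * * *"
      recipe:
        configmapName: datahub-mysql-ingestion
        fileName: mysql.yaml
        # Presence of fileContent makes helm render the config map itself
        fileContent:
          source:
            type: mysql
            config:
              host_port: localhost:3306
              database: dbname
              username: root
              password: example
          sink:
            type: datahub-rest
            config:
              server: http://localhost:8080
```

Because the recipe is stored in a config map, credentials written this way end up in plain text in the cluster, which is worth keeping in mind for anything beyond a local test.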