name: 'Run Databricks Notebook'
description: 'Triggers a one-time run of a Databricks notebook'
author: 'Databricks'
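# Example usage (a minimal sketch; the databricks/run-notebook@v0 ref and the
# DATABRICKS_TOKEN secret name are assumptions about how this Action is published
# and how credentials are stored in the calling repo):
#
#   - uses: databricks/run-notebook@v0
#     with:
#       databricks-host: https://my-workspace.cloud.databricks.com
#       databricks-token: ${{ secrets.DATABRICKS_TOKEN }}
#       local-notebook-path: notebooks/my_notebook.py
#       new-cluster-json: >
#         {
#           "num_workers": 1,
#           "spark_version": "11.3.x-scala2.12",
#           "node_type_id": "i3.xlarge"
#         }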
inputs:
local-notebook-path:
description: >
Note: either local-notebook-path or workspace-notebook-path must be specified.
Relative path to the notebook in the current Git repo, e.g. "path/to/my_notebook.py".
      If specified, the notebook at this local path will be uploaded to a temporary
workspace directory under workspace-temp-dir and executed as a one-time job.
Known limitation: When used with `git-commit`, `git-branch` or `git-tag`, the supplied `local-notebook-path`
must be a Databricks notebook exported as source via the UI (https://docs.databricks.com/notebooks/notebooks-manage.html#export-a-notebook)
or REST API (https://docs.databricks.com/dev-tools/api/latest/workspace.html#export),
rather than an arbitrary Python/Scala/R file.
required: false
workspace-notebook-path:
description: >
Note: either local-notebook-path or workspace-notebook-path must be specified.
Absolute path in the Databricks workspace of an existing notebook to run, e.g.
"/Users/[email protected]/My Notebook", "/Repos/my-repo-name/notebooks/My Notebook".
required: false
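  # For instance, to run an existing workspace notebook on an existing cluster instead
  # of uploading a local one (a sketch; the notebook path and cluster ID are
  # placeholders, and host/token inputs are omitted for brevity):
  #
  #   - uses: databricks/run-notebook@v0
  #     with:
  #       workspace-notebook-path: /Users/[email protected]/My Notebook
  #       existing-cluster-id: 1234-567890-abcdefgh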
databricks-host:
description: >
Hostname of the Databricks workspace in which to run the notebook. If unspecified, the hostname
will be inferred from the DATABRICKS_HOST environment variable. Either this parameter or the
DATABRICKS_HOST environment variable must be set.
required: false
databricks-token:
description: >
Databricks REST API token to use to run the notebook. If unspecified, the token
will be inferred from the DATABRICKS_TOKEN environment variable. Either this parameter or the
DATABRICKS_TOKEN environment variable must be set.
required: false
workspace-temp-dir:
description: >
      Optional base directory in the Databricks workspace under which to upload the notebook
      specified by local-notebook-path.
      If specified, the Action will create a random subdirectory under the specified directory
      and upload the notebook directly under that subdirectory.
      For example, if /tmp/actions is specified along with a notebook path of
      machine-learning/notebooks/my-notebook.py, the Action will upload to /tmp/actions/<uuid>/my-notebook.py.
required: false
default: '/tmp/databricks-github-actions'
new-cluster-json:
description: >
Note: either existing-cluster-id or new-cluster-json must be specified.
      JSON string describing the cluster on which to execute the notebook. A new cluster
with the specified attributes will be launched to run the notebook. For example, on Azure
Databricks:
{
"num_workers": 1,
"spark_version": "11.3.x-scala2.12",
"node_type_id": "Standard_D3_v2"
}
On AWS:
{
"num_workers": 1,
"spark_version": "11.3.x-scala2.12",
"node_type_id": "i3.xlarge"
}
On GCP:
{
"num_workers": 1,
"spark_version": "11.3.x-scala2.12",
"node_type_id": "n1-highmem-4"
}
See docs for the "new_cluster" field in
https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsRunsSubmit for more details
on how to specify this input.
required: false
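  # The JSON may use any new_cluster fields from the linked Jobs API docs; for example,
  # an autoscaling cluster (a sketch):
  #
  #   new-cluster-json: >
  #     {
  #       "spark_version": "11.3.x-scala2.12",
  #       "node_type_id": "i3.xlarge",
  #       "autoscale": {"min_workers": 1, "max_workers": 2}
  #     }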
existing-cluster-id:
description: >
Note: either existing-cluster-id or new-cluster-json must be specified.
The string ID of an existing cluster on which to execute the notebook. We recommend
specifying new-cluster-json instead for greater reliability. See docs for
the "existing_cluster_id" field in
https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsRunsSubmit for details.
required: false
libraries-json:
description: >
JSON string containing a list of libraries (e.g. [{"pypi": "sklearn"}, {"pypi": "mlflow"}]) to
be installed on the cluster that will execute the notebook as a job.
See docs at
https://docs.databricks.com/dev-tools/api/latest/libraries.html#library
for details on how to specify this input.
required: false
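  # Passed in a workflow, this might look like (a sketch following the example
  # format above; see the linked library docs for the full schema):
  #
  #   libraries-json: >
  #     [
  #       {"pypi": "sklearn"},
  #       {"pypi": "mlflow"}
  #     ]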
notebook-params-json:
description: >
      JSON string containing a list of parameters (e.g. [{"key": "...", "value": "..."}])
passed to the notebook run.
See docs for the "base_parameters" field in
https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsRunsSubmit
for details on how to specify this input.
required: false
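  # For example (a sketch; the parameter keys and values are placeholders):
  #
  #   notebook-params-json: >
  #     [
  #       {"key": "env", "value": "staging"},
  #       {"key": "run_date", "value": "2023-01-01"}
  #     ]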
access-control-list-json:
description: >
JSON string containing a list of permissions to set on the job.
      Each element maps either a user email address or a group name to a permission level, e.g.
[
{"user_name": "[email protected]", "permission_level": "CAN_MANAGE"},
{"group_name": "users", "permission_level": "CAN_VIEW"}
]
Please refer to access_control_list at
https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsRunsSubmit
for details on how to configure this input.
required: false
timeout-seconds:
description: >
Timeout, in seconds, for the one-time Databricks notebook job run.
If not specified, the Databricks job will continue to run
      even if the current GitHub Workflow times out.
required: false
run-name:
description: >
An optional name for the one-time notebook job run.
Please refer to run_name at
https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsRunsSubmit
for details on how to configure this input.
required: false
git-commit:
description: >
If specified, the notebook is run on Databricks within a temporary checkout of
the entire git repo at the specified git commit.
The local-notebook-path input must be specified when using this parameter, and you
must configure Git credentials (see https://docs.databricks.com/repos/index.html
#configure-your-git-integration-with-databricks).
This functionality can be useful for running a notebook that depends on other
      files within the same repo (e.g. imports Python modules within the same repo) or
      that was developed using Databricks Repos
      (https://docs.databricks.com/repos/index.html). For example, you can set
`git-commit` to `GITHUB_SHA` to run the notebook in the context of its enclosing
repo against the commit that triggered the current GitHub Workflow,
e.g. to run a notebook from the current PR branch.
Known limitation: The supplied `local-notebook-path` must be a Databricks notebook
exported as source via the UI (https://docs.databricks.com/notebooks/notebooks-manage.html#export-a-notebook)
or REST API (https://docs.databricks.com/dev-tools/api/latest/workspace.html#export),
rather than an arbitrary Python/Scala/R file.
required: false
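  # For example, to run the notebook from the repo at the commit that triggered the
  # workflow (a sketch; github.sha is the standard GitHub Actions context value, and
  # host/token/cluster inputs are omitted for brevity):
  #
  #   - uses: databricks/run-notebook@v0
  #     with:
  #       local-notebook-path: notebooks/my_notebook.py
  #       git-commit: ${{ github.sha }}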
git-branch:
description: >
If specified, the notebook is run on Databricks within a temporary checkout of
the entire git repo at the specified git branch.
The local-notebook-path input must be specified when using this parameter, and you
must configure Git credentials (see https://docs.databricks.com/repos/index.html
#configure-your-git-integration-with-databricks).
This functionality can be useful for running a notebook that depends on other
      files within the same repo (e.g. imports Python modules within the same repo) or
      that was developed using Databricks Repos
      (https://docs.databricks.com/repos/index.html).
Known limitation: The supplied `local-notebook-path` must be a Databricks notebook
exported as source via the UI (https://docs.databricks.com/notebooks/notebooks-manage.html#export-a-notebook)
or REST API (https://docs.databricks.com/dev-tools/api/latest/workspace.html#export),
rather than an arbitrary Python/Scala/R file.
required: false
git-tag:
description: >
If specified, the notebook is run on Databricks within a temporary checkout of
the entire git repo at the specified git tag.
The local-notebook-path input must be specified when using this parameter, and you
must configure Git credentials (see https://docs.databricks.com/repos/index.html
#configure-your-git-integration-with-databricks).
This functionality can be useful for running a notebook that depends on other
      files within the same repo (e.g. imports Python modules within the same repo) or
      that was developed using Databricks Repos
      (https://docs.databricks.com/repos/index.html).
Known limitation: The supplied `local-notebook-path` must be a Databricks notebook
exported as source via the UI (https://docs.databricks.com/notebooks/notebooks-manage.html#export-a-notebook)
or REST API (https://docs.databricks.com/dev-tools/api/latest/workspace.html#export),
rather than an arbitrary Python/Scala/R file.
required: false
git-provider:
description: >
If specified, the notebook is run on Databricks within a temporary checkout of
the entire git repo using the specified git provider.
The local-notebook-path input as well as one of git-tag, git-commit, or git-branch must be specified when using this parameter,
and you must configure Git credentials (see https://docs.databricks.com/repos/index.html
#configure-your-git-integration-with-databricks).
      The default value for this input is 'gitHub'. If the repository is hosted on
      GitHub Enterprise, specify 'gitHubEnterprise' instead.
required: false
default: 'gitHub'
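  # For example, for a repository hosted on GitHub Enterprise (a sketch; the branch
  # name is a placeholder, and other required inputs are omitted for brevity):
  #
  #   - uses: databricks/run-notebook@v0
  #     with:
  #       local-notebook-path: notebooks/my_notebook.py
  #       git-branch: main
  #       git-provider: gitHubEnterprise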
pr-comment-github-token:
description: >
      If specified, this Action will post a PR comment containing the notebook output on run
      completion, using the specified GitHub token to post the comment.
      No comment will be posted if this Action is used in a workflow triggered from a source
      other than a pull request (e.g. a manually triggered workflow), or if the token is not
      provided.
      For example, you can set the value of this input to '${{ secrets.GITHUB_TOKEN }}', which
      by default has the necessary permissions to comment on PRs (see 'Default access (permissive)' in
      https://docs.github.com/en/actions/security-guides/automatic-token-authentication#permissions-for-the-github_token).
required: false
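  # For example (a sketch; the job may need to grant the token permission to comment,
  # depending on the workflow's default token permissions):
  #
  #   jobs:
  #     run-notebook:
  #       runs-on: ubuntu-latest
  #       permissions:
  #         pull-requests: write
  #       steps:
  #         - uses: databricks/run-notebook@v0
  #           with:
  #             local-notebook-path: notebooks/my_notebook.py
  #             pr-comment-github-token: ${{ secrets.GITHUB_TOKEN }}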
outputs:
notebook-output:
description: >
The output from the executed notebook.
This output is set from the notebook via dbutils.notebook.exit(...).
      See https://docs.databricks.com/dev-tools/databricks-utils.html#exit-command-dbutilsnotebookexit
for further details.
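  # For example, a later workflow step can read this output via the step's id
  # (a sketch; "run-notebook" is a placeholder id set on the step that uses this Action):
  #
  #   - name: Print notebook output
  #     run: echo '${{ steps.run-notebook.outputs.notebook-output }}'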
truncated:
description: >
      This boolean value indicates whether notebook-output has been truncated.
run-url:
description: >
      The URL of the one-time job run for the notebook.
run-id:
description: >
      The ID of the one-time job run for the notebook.
runs:
using: 'node16'
main: 'dist/main/index.js'
post: 'dist/post/index.js'