Docs about job schedule batch status #367

Merged
merged 3 commits into from
Jul 4, 2024
7 changes: 7 additions & 0 deletions public-site/docs/guides/jobs/configure-jobs.md
@@ -41,6 +41,12 @@ spec:
      node:
        gpu: nvidia-k80
        gpuCount: 2
      batchStatusRules:
        - condition: Any
          operator: In
          jobStatuses:
            - Failed
          batchStatus: Failed
```

## Options
@@ -61,6 +67,7 @@ Jobs have three extra configuration options; `schedulerPort`, `payload` and `tim
- `timeLimitSeconds` (optional) defines maximum running time for a job.
- `backoffLimit` (optional) defines the number of times a job will be restarted if its container exits in error.
- `notifications.webhook` (optional) the Radix application component or job component endpoint, where Radix batch events will be posted when any of its job-component's running jobs or batches changes states.
- `batchStatusRules` (optional) defines rules that set the status of a batch based on the statuses of its jobs. See the job-level [batchStatusRules](/radix-config/index.md#batchstatusrules) for more information.

### schedulerPort

63 changes: 63 additions & 0 deletions public-site/docs/radix-config/index.md
@@ -1197,6 +1197,43 @@ spec:

`webhook` is an optional URL to the Radix application component or job component, which will be called when any of the job component's running jobs or batches changes state. Only changes are sent, using the POST method with an `application/json` `ContentType`, in a [batch event format](/guides/jobs/notifications.md#radix-batch-event). Read [more](/guides/jobs/notifications)

### `batchStatusRules`

```yaml
spec:
  jobs:
    - name: compute
      batchStatusRules:
        - condition: Any
          operator: In
          jobStatuses:
            - Failed
          batchStatus: Failed
        - condition: All
          operator: NotIn
          jobStatuses:
            - Waiting
            - Active
            - Running
          batchStatus: Completed
```
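`batchStatusRules` is an optional list of rules that set the status of a batch based on the statuses of its jobs. Each rule has the following fields:
- `condition` - which jobs in the batch must match: `Any`, `All`
- `operator` - whether a job's status must be in the `jobStatuses` list or not in it: `In`, `NotIn`
- `jobStatuses` - job statuses to test against: `Waiting`, `Active`, `Running`, `Succeeded`, `Failed`, `Stopped`
- `batchStatus` - status given to the batch when the rule matches: `Waiting`, `Active`, `Running`, `Succeeded`, `Failed`, `Stopping`, `Stopped`, `DeadlineExceeded`, `Completed`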

Rules are evaluated in order, from the top of the list to the bottom. The first rule that matches sets the batch status, and the rules after it are ignored.
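
To illustrate why the order matters, here is a minimal sketch. The job component name `compute` and this particular rule set are assumptions made for the example, and it reads `Any` as "at least one job" and `All` as "every job". For a batch in which one job is `Failed` and the others are `Succeeded`, both rules would match, but only the first is applied, so the batch status would be `Failed`:

```yaml
spec:
  jobs:
    - name: compute # hypothetical job component
      batchStatusRules:
        # Evaluated first: matches as soon as at least one job is Failed.
        - condition: Any
          operator: In
          jobStatuses:
            - Failed
          batchStatus: Failed
        # Evaluated only when the rule above did not match:
        # matches when every job is Succeeded, Failed or Stopped.
        - condition: All
          operator: In
          jobStatuses:
            - Succeeded
            - Failed
            - Stopped
          batchStatus: Completed
```

With the two rules swapped, the same batch would presumably get `Completed`, because the `All` rule would then match first.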

If `batchStatusRules` is not defined, or no rule matches, the batch status is set by the following default rules:
* `Waiting` - no jobs have started
* `Active` - at least one job is in the `Active` or `Running` state
* `Completed` - no jobs are in the `Waiting`, `Active` or `Running` states
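
One way to picture these defaults is as an explicit rule list. The fragment below is only an approximation for illustration: it assumes that "no jobs have started" means every job is still `Waiting`, and the job component name `compute` is hypothetical. The defaults apply on their own and do not need to be written out in `radixconfig.yaml`.

```yaml
spec:
  jobs:
    - name: compute # hypothetical job component
      batchStatusRules:
        # Waiting: assumed here to mean that every job is still Waiting
        - condition: All
          operator: In
          jobStatuses:
            - Waiting
          batchStatus: Waiting
        # Active: at least one job is Active or Running
        - condition: Any
          operator: In
          jobStatuses:
            - Active
            - Running
          batchStatus: Active
        # Completed: no job is Waiting, Active or Running
        - condition: All
          operator: NotIn
          jobStatuses:
            - Waiting
            - Active
            - Running
          batchStatus: Completed
```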

Batch statuses, whether default or defined by rules, are the same in the Radix console, in [job notifications](/guides/jobs/notifications.md) and in the [Job Manager API](/guides/jobs/job-manager-and-job-api.md). Changed rules are applied on the next deployment of an application environment, and they also affect the statuses of batches that already exist in that environment.

`batchStatusRules` [can be overridden](#batchstatusrules-1) for individual environments.

### `monitoring`

```yaml
@@ -1375,6 +1412,32 @@ spec:

See [notifications](#notifications) for a component for more information.

### `batchStatusRules`

```yaml
spec:
  jobs:
    - name: compute
      batchStatusRules:
        - condition: All
          operator: NotIn
          jobStatuses:
            - Waiting
            - Active
            - Running
          batchStatus: Completed
      environmentConfig:
        - environment: prod
          batchStatusRules:
            - condition: All
              operator: In
              jobStatuses:
                - Succeeded
              batchStatus: Succeeded
When `batchStatusRules` is defined for an environment, it fully overrides the job's `batchStatusRules`.
See the job-level [batchStatusRules](#batchstatusrules) for more information.
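
Because the environment list replaces the job-level list rather than merging with it, a job-level rule that should still apply in an environment has to be repeated there. A minimal sketch, assuming a hypothetical job component `compute` and an environment named `prod`:

```yaml
spec:
  jobs:
    - name: compute # hypothetical job component
      batchStatusRules:
        - condition: Any
          operator: In
          jobStatuses:
            - Failed
          batchStatus: Failed
      environmentConfig:
        - environment: prod
          batchStatusRules:
            # Repeated from the job level; without it, prod would not have this rule.
            - condition: Any
              operator: In
              jobStatuses:
                - Failed
              batchStatus: Failed
            # Extra rule that applies only in prod.
            - condition: All
              operator: In
              jobStatuses:
                - Succeeded
              batchStatus: Succeeded
```

Environments without their own `batchStatusRules` keep using the job-level list.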

#### `monitoring`

```yaml
21 changes: 11 additions & 10 deletions public-site/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 8 additions & 7 deletions public-site/package.json
@@ -15,20 +15,20 @@
"typecheck": "tsc"
},
"dependencies": {
- "@docusaurus/preset-classic": "^3.2.1",
+ "@docusaurus/preset-classic": "^3.4.0",
"@mdx-js/react": "^3.0.1",
"clsx": "^2.1.1",
- "docusaurus-lunr-search": "^3.3.2",
+ "docusaurus-lunr-search": "^3.4.0",
"prism-react-renderer": "^2.3.1",
"react": "^18.3.1",
"react-dom": "^18.3.1",
- "sass": "^1.75.0"
+ "sass": "^1.77.6"
},
"devDependencies": {
- "@docusaurus/module-type-aliases": "^3.2.1",
- "@docusaurus/tsconfig": "^3.2.1",
- "@docusaurus/types": "^3.2.1",
- "typescript": "~5.4.5"
+ "@docusaurus/module-type-aliases": "^3.4.0",
+ "@docusaurus/tsconfig": "^3.4.0",
+ "@docusaurus/types": "^3.4.0",
+ "typescript": "~5.5.2"
},
"browserslist": {
"production": [
@@ -46,3 +46,4 @@
"node": ">=18.0"
}
}