Update monitor.md
alexkuzmik committed Apr 25, 2024
1 parent ae4ebb7 commit c5f482b
Showing 2 changed files with 10 additions and 1 deletion.
11 changes: 10 additions & 1 deletion docs/_tutorials/monitor.md
@@ -11,7 +11,7 @@ In this tutorial, we introduce the DeepSpeed Monitor and provide examples of its

## Overview

-Monitoring model and system metrics during training is vital to ensure hardware resources are fully utilized. The DeepSpeed Monitor enables live logging of metrics through one or more monitoring backends such as PyTorch's [TensorBoard](https://pytorch.org/docs/1.8.0/tensorboard.html), [WandB](https://docs.wandb.ai/quickstart), and simple CSV files.
+Monitoring model and system metrics during training is vital to ensure hardware resources are fully utilized. The DeepSpeed Monitor enables live logging of metrics through one or more monitoring backends such as PyTorch's [TensorBoard](https://pytorch.org/docs/1.8.0/tensorboard.html), [WandB](https://docs.wandb.ai/quickstart), [CometML](https://www.comet.com/docs/v2/guides/quickstart/) and simple CSV files.

Below is a live monitoring view for TensorBoard:

@@ -21,6 +21,10 @@ Below is a live monitoring view for WandB:

![WandB Example Output](/assets/images/wandb_monitor.PNG){: .align-center}

+Below is a live monitoring view for CometML
+
+![CometML Example Output](/assets/images/comet_monitor.PNG){: .align-center}

## Usage

The DeepSpeed Monitor is configured within the deepspeed [configuration file](/docs/config-json/#monitoring-module-tensorboard-wandb-csv). DeepSpeed will automatically monitor key training metrics, including those tracked with the `wall_clock_breakdown` configuration option. In addition, users can log their own custom events and metrics.
@@ -45,6 +49,11 @@ When using DeepSpeed for model training, the Monitor can be configured in the De
"group": "my_group",
"project": "my_project"
}
+"comet": {
+"enabled": true,
+"project": "my_project",
+"experiment_name": "my_experiment"
+}
"csv_monitor": {
"enabled": true,
"output_path": "output/ds_logs/",
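
As shown, the JSON fragment in this hunk has no commas between the sibling `wandb`, `comet`, and `csv_monitor` objects, so it would not parse if copied verbatim. Below is a minimal sketch of one way to produce a valid version. It assumes only the keys visible in this hunk (anything outside the hunk is omitted rather than guessed), and the names `monitor_config` and `config_text` are illustrative, not part of DeepSpeed.

```python
import json

# Monitor-related sections as a Python dict. Only keys visible in the diff
# hunk are included; keys that fall outside the hunk are left out on purpose.
monitor_config = {
    "wandb": {
        "group": "my_group",
        "project": "my_project",
    },
    "comet": {
        "enabled": True,
        "project": "my_project",
        "experiment_name": "my_experiment",
    },
    "csv_monitor": {
        "enabled": True,
        "output_path": "output/ds_logs/",
    },
}

# json.dumps inserts the commas between sibling objects that the
# hand-written fragment leaves out.
config_text = json.dumps(monitor_config, indent=2)

# Round-trip check: the generated text parses back to the same structure.
assert json.loads(config_text) == monitor_config

print(config_text)
```

Building the fragment from a dict leaves separator placement to `json.dumps`, which avoids this class of hand-editing error when adding a new backend section.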
Binary file added docs/assets/images/comet_monitor.png