
Several wandb init #934

Open
fzyzcjy opened this issue Nov 26, 2024 · 0 comments
Labels
bug Something isn't working

fzyzcjy commented Nov 26, 2024

Describe the bug

Hi, thanks for the lib! When running the script https://github.com/vllm-project/llm-compressor/blob/main/examples/quantization_kv_cache/llama3_fp8_kv_example.py, I see something odd: `wandb.init` is invoked several times. Since this appears to be a single run, at most one init seems reasonable.

2024-11-26T05:25:38.849137+0000 | _check_create_state | INFO - State created for compression lifecycle
ERROR:wandb.jupyter:Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: ch271828n. Use `wandb login --relogin` to force relogin
Waiting for wandb.init()...
wandb version 0.18.7 is available! To upgrade, please run: $ pip install wandb --upgrade
Tracking run with wandb version 0.17.5
Run data is saved locally in /host_home/research/code/research_mono/notebooks/math_ai/ad_hoc/wandb/run-20241126_052540-ggjblhrs
Syncing run glorious-mountain-19 to Weights & Biases: https://wandb.ai/ch271828n/uncategorized (docs: https://wandb.me/run)
View project at https://wandb.ai/ch271828n/uncategorized
View run at https://wandb.ai/ch271828n/uncategorized/runs/ggjblhrs
2024-11-26T05:25:52.860459+0000 | pre_initialize_structure | INFO - Compression lifecycle structure pre-initialized for 0 modifiers
Finishing last run (ID:ggjblhrs) before initializing another...
View run glorious-mountain-19 at: https://wandb.ai/ch271828n/uncategorized/runs/ggjblhrs
View project at: https://wandb.ai/ch271828n/uncategorized
Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
Find logs at: ./wandb/run-20241126_052540-ggjblhrs/logs
The new W&B backend becomes opt-out in version 0.18.0; try it out with `wandb.require("core")`! See https://wandb.me/wandb-core for more information.
Successfully finished last run (ID:ggjblhrs). Initializing new run:
Waiting for wandb.init()...
wandb version 0.18.7 is available! To upgrade, please run: $ pip install wandb --upgrade
Tracking run with wandb version 0.17.5
Run data is saved locally in /host_home/research/code/research_mono/notebooks/math_ai/ad_hoc/wandb/run-20241126_052552-um22ijva
Syncing run vague-universe-20 to Weights & Biases: https://wandb.ai/ch271828n/uncategorized (docs: https://wandb.me/run)
View project at https://wandb.ai/ch271828n/uncategorized
View run at https://wandb.ai/ch271828n/uncategorized/runs/um22ijva
2024-11-26T05:26:12.416123+0000 | pre_initialize_structure | INFO - Compression lifecycle structure pre-initialized for 0 modifiers
Finishing last run (ID:um22ijva) before initializing another...
View run vague-universe-20 at: https://wandb.ai/ch271828n/uncategorized/runs/um22ijva
View project at: https://wandb.ai/ch271828n/uncategorized
Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
Find logs at: ./wandb/run-20241126_052552-um22ijva/logs
The new W&B backend becomes opt-out in version 0.18.0; try it out with `wandb.require("core")`! See https://wandb.me/wandb-core for more information.
Successfully finished last run (ID:um22ijva). Initializing new run:
Waiting for wandb.init()...
wandb version 0.18.7 is available! To upgrade, please run: $ pip install wandb --upgrade
Tracking run with wandb version 0.17.5
Run data is saved locally in /host_home/research/code/research_mono/notebooks/math_ai/ad_hoc/wandb/run-20241126_052612-l8wousmz
Syncing run fast-water-21 to Weights & Biases: https://wandb.ai/ch271828n/uncategorized (docs: https://wandb.me/run)
View project at https://wandb.ai/ch271828n/uncategorized
View run at https://wandb.ai/ch271828n/uncategorized/runs/l8wousmz
[2024-11-26 05:26:33,277] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/opt/conda/lib/python3.11/site-packages/llmcompressor/transformers/finetune/session_mixin.py:95: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `Trainer.__init__`. Use `processing_class` instead.
  super().__init__(**kwargs)
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.5
 [WARNING]  using untested triton version (3.1.0), only 1.0.0 is known to be compatible
/opt/conda/lib/python3.11/site-packages/deepspeed/runtime/zero/linear.py:47: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @autocast_custom_fwd
/opt/conda/lib/python3.11/site-packages/deepspeed/runtime/zero/linear.py:66: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
  @autocast_custom_bwd
2024-11-26T05:26:34.104132+0000 | one_shot | INFO - *** One Shot ***
Finishing last run (ID:l8wousmz) before initializing another...
View run fast-water-21 at: https://wandb.ai/ch271828n/uncategorized/runs/l8wousmz
View project at: https://wandb.ai/ch271828n/uncategorized
Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
Find logs at: ./wandb/run-20241126_052612-l8wousmz/logs
The new W&B backend becomes opt-out in version 0.18.0; try it out with `wandb.require("core")`! See https://wandb.me/wandb-core for more information.
Successfully finished last run (ID:l8wousmz). Initializing new run:
Waiting for wandb.init()...
wandb version 0.18.7 is available! To upgrade, please run: $ pip install wandb --upgrade
Tracking run with wandb version 0.17.5
Run data is saved locally in /host_home/research/code/research_mono/notebooks/math_ai/ad_hoc/wandb/run-20241126_052634-6zgyfa3j
Syncing run desert-firefly-22 to Weights & Biases: https://wandb.ai/ch271828n/uncategorized (docs: https://wandb.me/run)
View project at https://wandb.ai/ch271828n/uncategorized
View run at https://wandb.ai/ch271828n/uncategorized/runs/6zgyfa3j
2024-11-26T05:26:53.409986+0000 | _check_compile_recipe | INFO - Recipe compiled and 1 modifiers created
2024-11-26T05:26:53.535971+0000 | _calibrate | INFO - Running QuantizationModifier calibration with 10 samples...
100%|██████████| 10/10 [00:05<00:00,  1.86it/s]
manager stage: Modifiers initialized
2024-11-26T05:26:58.922362+0000 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
manager stage: Modifiers finalized
2024-11-26T05:26:58.923158+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
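The log above shows four separate runs (glorious-mountain-19 through desert-firefly-22) created for one script. The expected single-init behavior can be sketched as an idempotent guard; everything below is a hypothetical illustration of the pattern, not llm-compressor or wandb code:

```python
# Idempotent init guard: repeated init requests reuse one tracking run
# instead of finishing it and starting another. All names here
# (_active_run, init_run) are stand-ins for illustration only.
_active_run = None

def init_run(name):
    """Return the active run if one exists; otherwise create and record it."""
    global _active_run
    if _active_run is None:
        _active_run = {"name": name, "init_requests": 0}
    _active_run["init_requests"] += 1  # count requests without spawning runs
    return _active_run

first = init_run("run-a")
second = init_run("run-b")  # without the guard this would start a second run

print(first is second)         # True: the same run object is reused
print(first["name"])           # run-a: the first init wins
print(first["init_requests"])  # 2: two requests, one run
```

With a guard like this, the lifecycle hooks that currently trigger a fresh init each time (`pre_initialize_structure`, `one_shot`, ...) would all attach to the same run.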

Expected behavior
`wandb.init` is called at most once, since the script corresponds to a single run.

Environment
Include all relevant environment information:

  1. OS [e.g. Ubuntu 20.04]:
  2. Python version: 3.11 (per the site-packages paths in the log)
  3. LLM Compressor version or commit hash [e.g. 0.1.0, f7245c8]:
  4. ML framework version(s): torch 2.5 (per the log)
  5. Other Python package versions: wandb 0.17.5, triton 3.1.0 (per the log)
  6. Other relevant environment information [e.g. hardware, CUDA version]:

To Reproduce
Run the example script https://github.com/vllm-project/llm-compressor/blob/main/examples/quantization_kv_cache/llama3_fp8_kv_example.py (in a Jupyter notebook, as in the log above) and watch the wandb output.

Errors
If applicable, add a full print-out of any errors or exceptions that are raised or include screenshots to help explain your problem.

Additional context
Add any other context about the problem here. Also include any relevant files.

@fzyzcjy fzyzcjy added the bug Something isn't working label Nov 26, 2024