Zyda2 tutorial - TypeError when initializing Dask CPU cluster #344

Open
ronjer30 opened this issue Nov 5, 2024 · 0 comments
Labels
bug Something isn't working


ronjer30 commented Nov 5, 2024

Describe the bug

In the Zyda2 tutorial, several scripts, such as process_dclm.py, attempt to start a Dask LocalCluster. These scripts read the worker count from an environment variable with CPU_WORKERS = os.environ.get("CPU_WORKERS") and pass it straight to the cluster constructor: cluster = LocalCluster(n_workers=CPU_WORKERS, processes=True, memory_limit="48GB"). Because os.environ.get returns a string (or None if the variable is unset), a TypeError is raised: n_workers is expected to be an integer.
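The failure can be reproduced without Dask. The snippet below is a minimal sketch, not code from the tutorial: it sets CPU_WORKERS to an example value of "4" and uses os.cpu_count() as a stand-in (an assumption) for the CPU_COUNT constant in distributed, then repeats the division shown in the traceback.

```python
import math
import os

# Example value only; in the tutorial this is set outside the script.
os.environ["CPU_WORKERS"] = "4"

# os.environ.get always returns a str (or None), never an int.
n_workers = os.environ.get("CPU_WORKERS")

# Stand-in for distributed's CPU_COUNT (assumption for illustration).
cpu_count = os.cpu_count() or 1

try:
    # Same expression as in distributed/deploy/local.py from the traceback.
    threads_per_worker = max(1, int(math.ceil(cpu_count / n_workers)))
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")
```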

Steps/Code to reproduce bug

  1. Follow the steps in the tutorial.
  2. Run python3 0_processing/process_dclm.py
  3. The script fails with the following error:
Traceback (most recent call last):
  File "...NeMo-Curator/tutorials/zyda2-tutorial/0_processing/process_dclm.py", line 21, in <module>
    cluster = LocalCluster(n_workers=CPU_WORKERS, processes=True, memory_limit="48GB")
  File "/usr/local/lib/python3.10/dist-packages/distributed/deploy/local.py", line 211, in __init__
    threads_per_worker = max(1, int(math.ceil(CPU_COUNT / n_workers)))
TypeError: unsupported operand type(s) for /: 'int' and 'str' 

Expected behavior

The Dask cluster is created, the data is processed, and the script completes successfully.
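One possible fix is to convert the environment variable to an integer before passing it to LocalCluster. The following is a minimal sketch under stated assumptions, not the maintainers' fix: get_cpu_workers and start_cluster are hypothetical helper names, and falling back to os.cpu_count() when CPU_WORKERS is unset is an assumed default.

```python
import os


def get_cpu_workers(default=None):
    """Return CPU_WORKERS from the environment as a positive int.

    Hypothetical helper. If the variable is unset, fall back to `default`,
    or to os.cpu_count() when no default is given (assumed behavior).
    """
    raw = os.environ.get("CPU_WORKERS")
    if raw is None:
        return default if default is not None else (os.cpu_count() or 1)
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"CPU_WORKERS must be an integer, got {raw!r}") from None
    if value < 1:
        raise ValueError(f"CPU_WORKERS must be >= 1, got {value}")
    return value


def start_cluster():
    """Create the cluster with an integer worker count (hypothetical wrapper)."""
    # Imported lazily so the helper above can be used without Dask installed.
    from dask.distributed import LocalCluster

    return LocalCluster(
        n_workers=get_cpu_workers(),
        processes=True,
        memory_limit="48GB",
    )
```

Validating the value up front means a malformed CPU_WORKERS produces a clear message at startup rather than a TypeError from inside distributed.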

Environment overview

  • Environment location: Slurm
  • Method of NeMo-Curator install: Docker container, dev image from nvcr.io/nvidia/nemo:dev
ronjer30 added the bug label Nov 5, 2024