Run_genes.py global_args not defined. #144

Open
kkerns85 opened this issue Aug 26, 2024 · 14 comments

Comments

@kkerns85

Hi, I am testing Metagenomic Intra-Species Diversity Analysis System (MIDAS), version 3.0.1.
I am following the run_genes step but keep hitting an error during the MIDAS2::multiprocessing_map::start step: "NameError: name 'global_args' is not defined".
I don't see an option or flag to define global_args.
Thanks in advance!

1724583425.9: MIDAS2::multiprocessing_map::start
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/lib/python3.12/site-packages/midas/subcommands/run_genes.py", line 306, in compute_pileup_per_chunk
if global_args.debug and os.path.exists(headerless_sliced_path):
^^^^^^^^^^^
NameError: name 'global_args' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/opt/homebrew/anaconda3/bin/midas", line 8, in <module>
sys.exit(main())
^^^^^^
File "/opt/homebrew/anaconda3/lib/python3.12/site-packages/midas/main.py", line 25, in main
return subcommand_main(subcommand_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/lib/python3.12/site-packages/midas/subcommands/run_genes.py", line 563, in main
run_genes(args)
File "/opt/homebrew/anaconda3/lib/python3.12/site-packages/midas/subcommands/run_genes.py", line 557, in run_genes
raise error
File "/opt/homebrew/anaconda3/lib/python3.12/site-packages/midas/subcommands/run_genes.py", line 530, in run_genes
list_of_chunks_depth = multiprocessing_map(compute_pileup_per_chunk, args_list, number_of_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/lib/python3.12/site-packages/midas/common/utils.py", line 532, in multiprocessing_map
return _multi_map(func, items, num_procs, multiprocessing.Pool)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/lib/python3.12/site-packages/midas/common/utils.py", line 520, in _multi_map
return p.map(func, items, chunksize=1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/pool.py", line 367, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/pool.py", line 774, in get
raise self._value
NameError: name 'global_args' is not defined

Full command and error attached.
MIDAS3_run_gene_error.txt

@zhaoc1
Contributor

zhaoc1 commented Aug 26, 2024

From the error message, it seems the global variable global_args is not shared with the Pool worker processes. I think this might be a Python version issue: MIDAS has only been tested up to Python 3.9. Can you try downgrading your Python version? Thank you.

@kkerns85
Author

Thanks for your fast reply. I am currently running Python 3.9.6. Are you suggesting 3.9.0 or 3.8.19?

@zhaoc1
Contributor

zhaoc1 commented Aug 26, 2024

3.9.6 should be okay. From the error message, there might be two versions of Python in your environment: the traceback points to /opt/homebrew/anaconda3/lib/python3.12/multiprocessing/pool.py, which belongs to Python 3.12 rather than 3.9.
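
One way to confirm which interpreter actually runs midas is to print a few runtime details from inside the same environment. The following is a minimal sketch, not part of MIDAS; `describe_runtime` is a hypothetical helper name. It also reports the multiprocessing start method, which is typically "fork" on Linux and "spawn" on macOS (since Python 3.8) and affects whether globals set in the parent process are visible in Pool workers.

```python
import multiprocessing
import platform
import sys


def describe_runtime():
    """Collect the interpreter details that matter for this traceback."""
    return {
        "executable": sys.executable,
        "python_version": platform.python_version(),
        "system": platform.system(),
        # Default start method for this platform and Python version.
        "start_method": multiprocessing.get_start_method(),
    }


for key, value in describe_runtime().items():
    print(f"{key}: {value}")
```

If the reported executable or version differs from the environment you expect, the midas entry point is probably being picked up from another installation.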

@kkerns85
Author

Okay, so I have uninstalled and reinstalled, and made sure the correct Python 3.9.6 environment is active and used to execute midas, but I am still getting the same issue at the "multiprocessing_map" step. Please see below:
1724875263.1: MIDAS2::multiprocessing_map::start
1724875265.1: Deleting untrustworthy outputs due to error. Specify --debug flag to keep.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/subcommands/run_genes.py", line 306, in compute_pileup_per_chunk
if global_args.debug and os.path.exists(headerless_sliced_path):
NameError: name 'global_args' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/opt/homebrew/anaconda3/envs/Midas3/bin/midas", line 8, in <module>
sys.exit(main())
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/main.py", line 25, in main
return subcommand_main(subcommand_args)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/subcommands/run_genes.py", line 563, in main
run_genes(args)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/subcommands/run_genes.py", line 557, in run_genes
raise error
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/subcommands/run_genes.py", line 530, in run_genes
list_of_chunks_depth = multiprocessing_map(compute_pileup_per_chunk, args_list, number_of_chunks)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/common/utils.py", line 532, in multiprocessing_map
return _multi_map(func, items, num_procs, multiprocessing.Pool)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/common/utils.py", line 520, in _multi_map
return p.map(func, items, chunksize=1)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
NameError: name 'global_args' is not defined
/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 72 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

Thanks again for helping resolve this!

@zhaoc1
Contributor

zhaoc1 commented Aug 29, 2024

Hi,

Thanks for testing out the Python version. This bug is new to me as well, so I will share some ideas for debugging it.

Under the hood, I keep the read-only global_args in a global variable to avoid pickling it when using multiprocessing, which in our experience can be time-consuming. From the error message, it seems this mechanism doesn't work in your case. To test this idea, can you add print(global_args) after line 300 of run_genes.py, then run python setup.py build && python setup.py install?

Secondly, MIDAS has only been tested on Linux. Have you tried the unit test, bash tests/test_analysis.sh 8, to see whether MIDAS runs successfully?

Thanks,
Chunyu
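
The global-variable mechanism described above depends on how worker processes are started. With the fork start method (the usual default on Linux for the Python versions in this thread), workers inherit whatever globals the parent has already set. With spawn (the default on macOS since Python 3.8), each worker re-imports the module, so a global assigned at runtime in the parent is missing in the worker, which would be consistent with this NameError. Below is a minimal sketch of one common workaround, a Pool initializer that sets the global once per worker. It is not the MIDAS code, and `init_worker`, `compute_chunk` and `run_chunks` are hypothetical names.

```python
import argparse
import multiprocessing

# Hypothetical stand-in for the module-level variable in run_genes.py.
global_args = None


def init_worker(args):
    """Runs once in every worker process and stores the shared arguments."""
    global global_args
    global_args = args


def compute_chunk(chunk_id):
    """Toy worker: reads the shared arguments instead of receiving them per task."""
    return (chunk_id, global_args.debug)


def run_chunks(args, chunk_ids, start_method="spawn", num_procs=2):
    """Map compute_chunk over chunk_ids using an explicit start method."""
    ctx = multiprocessing.get_context(start_method)
    with ctx.Pool(num_procs, initializer=init_worker, initargs=(args,)) as pool:
        return pool.map(compute_chunk, chunk_ids, chunksize=1)
```

Under spawn, the call has to sit behind an `if __name__ == "__main__":` guard in an importable script, for example `run_chunks(argparse.Namespace(debug=True), [0, 1, 2])`. The arguments are still pickled, but only once per worker rather than once per task, which keeps the original goal of avoiding repeated pickling.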

@kkerns85
Author

kkerns85 commented Sep 3, 2024

Still no luck; the same error is attached. I think I might try Docker. run_species did work, so I am not sure why this error only shows up in run_genes.py.
Thanks.

1725373643.4: MIDAS2::multiprocessing_map::start
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/subcommands/run_genes.py", line 301, in compute_pileup_per_chunk
print(global_args)
NameError: name 'global_args' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/opt/homebrew/anaconda3/envs/Midas3/bin/midas", line 8, in <module>
sys.exit(main())
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/main.py", line 25, in main
return subcommand_main(subcommand_args)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/subcommands/run_genes.py", line 564, in main
run_genes(args)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/subcommands/run_genes.py", line 558, in run_genes
raise error
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/subcommands/run_genes.py", line 531, in run_genes
list_of_chunks_depth = multiprocessing_map(compute_pileup_per_chunk, args_list, number_of_chunks)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/common/utils.py", line 532, in multiprocessing_map
return _multi_map(func, items, num_procs, multiprocessing.Pool)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/site-packages/midas/common/utils.py", line 520, in _multi_map
return p.map(func, items, chunksize=1)
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/opt/homebrew/anaconda3/envs/Midas3/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
NameError: name 'global_args' is not defined

@zhaoc1
Contributor

zhaoc1 commented Sep 3, 2024

Hmm, have you tried the unit test, and does it pass? => bash tests/test_analysis.sh 8

@zhaoc1
Contributor

zhaoc1 commented Sep 3, 2024

If the unit test passes, then the problem might be related to either the reads or the database. I would also run wc -l on the CG1-P3/genes/CG1-P3.bam.idxstats file to make sure it is not empty.
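
Beyond counting lines, it can help to check whether any reads were actually mapped. The sketch below assumes the file uses the four tab-separated columns written by `samtools idxstats` (reference name, sequence length, mapped reads, unmapped reads); whether MIDAS writes exactly this layout is an assumption, and `summarize_idxstats` is a hypothetical helper. The demo data is made up.

```python
import os
import tempfile


def summarize_idxstats(path):
    """Return (number_of_rows, total_mapped_reads) for an idxstats-style file."""
    rows = 0
    mapped = 0
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue  # skip blank or malformed lines
            rows += 1
            mapped += int(fields[2])  # third column: mapped read count
    return rows, mapped


# Made-up demo file; replace demo_path with the real .bam.idxstats path.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bam.idxstats")
    with open(demo_path, "w") as handle:
        handle.write("gene_A\t900\t12\t0\n")
        handle.write("gene_B\t1500\t0\t0\n")
        handle.write("*\t0\t0\t3\n")
    print(summarize_idxstats(demo_path))  # prints (3, 12)
```

A result of (0, 0) would mean the file is empty, while a non-zero row count with zero mapped reads would point at the reads or the database rather than at multiprocessing.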

@kkerns85
Author

kkerns85 commented Sep 3, 2024

Yes. It seems to work.

(Midas3) kris_mega_beast@KristopacStudio MIDAS-3 % bash tests/test_analysis.sh 8

+ '[' 1 -ne 1 ']'
+ num_cores=8
++ pwd
+ basedir=/Users/kris_mega_beast/MIDAS-3
+ testdir=/Users/kris_mega_beast/MIDAS-3/tests
+ outdir=/Users/kris_mega_beast/MIDAS-3/tests/midas_output
+ rm -rf /Users/kris_mega_beast/MIDAS-3/tests/midas_output
+ mkdir -p /Users/kris_mega_beast/MIDAS-3/tests/midas_output
+ midas_outdir=/Users/kris_mega_beast/MIDAS-3/tests/midas_output/single_sample
+ merge_midas_outdir=/Users/kris_mega_beast/MIDAS-3/tests/midas_output/across_samples
+ midas_dbname=gtdb
+ midas_db=/Users/kris_mega_beast/MIDAS-3/tests/midas_output/midasdb_gtdb
+ logs_dir=/Users/kris_mega_beast/MIDAS-3/tests/midas_output/logs
+ mkdir -p /Users/kris_mega_beast/MIDAS-3/tests/midas_output/logs
+ samples_fp=/Users/kris_mega_beast/MIDAS-3/tests/midas_output/samples.txt
+ pool_fp=/Users/kris_mega_beast/MIDAS-3/tests/midas_output/samples_list.tsv
+ rm -rf /Users/kris_mega_beast/MIDAS-3/tests/midas_output/samples.txt
+ rm -rf /Users/kris_mega_beast/MIDAS-3/tests/midas_output/samples_list.tsv
+ ls /Users/kris_mega_beast/MIDAS-3/tests/reads
+ awk -F _ '{print $1}'
+ echo 'MIDASv3 Unit Testing Start'
MIDASv3 Unit Testing Start
+ echo -e 'sample_name\tmidas_outdir'
+ cat /Users/kris_mega_beast/MIDAS-3/tests/midas_output/samples.txt
+ awk -v 'OFS=\t' -v dir=/Users/kris_mega_beast/MIDAS-3/tests/midas_output/single_sample '{print $1, dir}'
+ echo 'Testing Single-Sample Species Module'
Testing Single-Sample Species Module
+ cat /Users/kris_mega_beast/MIDAS-3/tests/midas_output/samples.txt
+ xargs -Ixx bash -c 'midas run_species --sample_name xx -1 /Users/kris_mega_beast/MIDAS-3/tests/reads/xx_R1.fastq.gz --num_cores 8 --midasdb_name gtdb --midasdb_dir /Users/kris_mega_beast/MIDAS-3/tests/midas_output/midasdb_gtdb /Users/kris_mega_beast/MIDAS-3/tests/midas_output/single_sample &> /Users/kris_mega_beast/MIDAS-3/tests/midas_output/logs/xx_species_8.log'
xargs: command line cannot be assembled, too long

@zhaoc1
Contributor

zhaoc1 commented Sep 3, 2024

There seems to be an xargs error message?

Did you see MIDASv3 Unit Testing SUCCESS printed at the end of the unit test?
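
The xargs failure is likely a platform difference rather than a MIDAS problem: as far as I know, BSD xargs on macOS caps the size of arguments produced by -I replacement (255 bytes by default), and the bash -c string in the log is much longer than that. One way to sidestep the limit is to build each command without xargs. The sketch below only assembles the per-sample command from the flags visible in the log above; `build_run_species_command`, the sample names and the relative paths are hypothetical.

```python
import shlex


def build_run_species_command(sample, reads_dir, midasdb_dir, out_dir,
                              num_cores=8, midasdb_name="gtdb"):
    """Build the argument list for one sample, mirroring the logged command."""
    return [
        "midas", "run_species",
        "--sample_name", sample,
        "-1", f"{reads_dir}/{sample}_R1.fastq.gz",
        "--num_cores", str(num_cores),
        "--midasdb_name", midasdb_name,
        "--midasdb_dir", midasdb_dir,
        out_dir,
    ]


# Hypothetical sample names and paths, for illustration only.
for sample in ["sample1", "sample2"]:
    cmd = build_run_species_command(
        sample,
        "tests/reads",
        "tests/midas_output/midasdb_gtdb",
        "tests/midas_output/single_sample",
    )
    print(shlex.join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would execute it; the sketch prints the commands instead so nothing runs by accident.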

@kkerns85
Author

kkerns85 commented Sep 3, 2024

Oh no, I did not get the SUCCESS message.

@zhaoc1
Contributor

zhaoc1 commented Sep 3, 2024

One more question: are you testing MIDAS on a Linux machine? I don't think it would work on macOS; I have never tested it there.

@kkerns85
Author

kkerns85 commented Sep 3, 2024

I am on macOS with an Apple Silicon M2 chip.
Constant problem, it seems, lol.
I think Docker might be the safest option at this point. Is there a Midas3 container I can pull, or do I need to build it?
Thanks again for all your help.

@zhaoc1
Contributor

zhaoc1 commented Sep 3, 2024

That makes sense.

I just updated the MIDAS version in the Dockerfile because of a recent fix, so you may need to build the image yourself.
