--deterministic flag should set a random seed for random processes #640

Open
kenibrewer opened this issue Oct 16, 2023 · 3 comments

@kenibrewer

As discussed in a comment on #509, Flye still produces different assembly outputs when the --deterministic flag is used. Multi-threading is not the only source of non-determinism: the assembly process also uses random variables.

An improved --deterministic flag should set a random seed for these other processes to allow the entire assembly process to occur deterministically.

A fully deterministic Flye would be very helpful for integrating Flye into bioinformatic pipelines that are tested against reference datasets.
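
To illustrate the general idea of seeding (this is not Flye's code; `subsample_reads` and its parameters are hypothetical names made up for the example), here is a minimal Python sketch. It assumes a step that picks items at random, and shows that a private generator built from a fixed seed repeats the same choice on every run:

```python
import random


def subsample_reads(read_ids, k, seed=None):
    """Pick k read IDs at random (hypothetical helper, not part of Flye).

    With a fixed seed the selection is repeatable; with seed=None the
    generator is seeded from the OS or clock and the result varies.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return rng.sample(read_ids, k)


reads = [f"read_{i}" for i in range(1000)]
first = subsample_reads(reads, 10, seed=42)
second = subsample_reads(reads, 10, seed=42)

# Same seed, same input: the two selections are identical.
assert first == second
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the seeded stream from being disturbed by other code that also draws random numbers.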

@mikolmogorov
Owner

Makes sense, will add this to the list of TODOs for the next release, thanks!

@riyasj327

@mikolmogorov thank you so much for this amazing tool! Just wondering whether this has been fixed. As of now, does using --deterministic give the most reproducible assembly?

@mikolmogorov
Owner

Not at the moment, unfortunately. True determinism is very hard to achieve and maintain in an assembler, since the algorithm is complex and involves multiple, completely different stages. Every stage has to be completely deterministic to make the pipeline deterministic. Given that our resources are limited, this is not a priority right now.
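
The point about every stage needing to be deterministic can be sketched with a toy pipeline in Python. Everything here is an assumption for illustration: `stage_a`, `stage_b`, and `run_pipeline` are hypothetical and unrelated to Flye's actual stages. Each stage receives its own generator derived from one master seed, and the final output is hashed so that two runs can be compared the way a pipeline test would compare against a reference:

```python
import hashlib
import random


def stage_a(data, rng):
    # Hypothetical stage: processes items in a random order.
    out = list(data)
    rng.shuffle(out)
    return out


def stage_b(data, rng):
    # Hypothetical stage: keeps each item with probability 0.5.
    return [x for x in data if rng.random() < 0.5]


def run_pipeline(data, seed):
    """Run all stages and return a SHA-256 digest of the final output."""
    master = random.Random(seed)
    for stage in (stage_a, stage_b):
        # Derive an independent, reproducible generator for each stage.
        stage_rng = random.Random(master.getrandbits(64))
        data = stage(data, stage_rng)
    return hashlib.sha256(",".join(map(str, data)).encode()).hexdigest()


# Identical seed and input give an identical digest across runs.
assert run_pipeline(range(100), seed=1) == run_pipeline(range(100), seed=1)
```

If even one stage drew from an unseeded source, or from threads finishing in a varying order, the digests would generally differ, which is why a single seed is only sufficient when every stage honours it.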
