Trying to load a partially recorded benchmark from disk raises ValueError #134

Open
janosh opened this issue May 3, 2022 · 2 comments

Member

janosh commented May 3, 2022

This code

from matbench.bench import MatbenchBenchmark

benchmark_path = "tmp-benchmark.json"
mbbm = MatbenchBenchmark.from_file(benchmark_path)

raises

ValueError: Cannot validate task matbench_jdft2d unless all folds recorded!; folds [0, 2, 3, 4] not recorded!

if

mbbm.to_file(benchmark_path)

was previously written to disk with only some folds recorded.

For the purpose of splitting folds into Slurm array jobs, it would be very useful to be able to read and write partial benchmarks. I tried commenting out the validation line

obj.validate()

and everything appears to be working fine. The line probably has a purpose, but perhaps that could be achieved differently while still allowing partial benchmarks to be written and read?
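
To make the use case concrete, here is a minimal sketch of how one Slurm array job might pick its fold. It assumes the job was submitted as an array (e.g. `sbatch --array=0-4`) so that Slurm sets `SLURM_ARRAY_TASK_ID`; the helper name `fold_for_this_job` and the default of five folds are illustrative assumptions, not part of Matbench.

```python
import os


def fold_for_this_job(n_folds: int = 5, env=os.environ) -> int:
    """Return the fold index this array job should process.

    Hypothetical helper: reads SLURM_ARRAY_TASK_ID, which Slurm sets for
    each task of a job array.
    """
    raw = env.get("SLURM_ARRAY_TASK_ID")
    if raw is None:
        raise RuntimeError("SLURM_ARRAY_TASK_ID is not set; not running in a job array")
    fold = int(raw)
    if not 0 <= fold < n_folds:
        raise ValueError(f"array task id {fold} is outside the fold range 0..{n_folds - 1}")
    return fold
```

Each job would then record only that one fold and write its own file, which is the kind of partial benchmark that `from_file` currently rejects.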

Collaborator

ardunn commented Aug 19, 2022

This is a good point. I never really considered that people would use Matbench in a parallel fashion, but now that they are, it makes sense to think of a more comprehensive and robust solution. I'll do some thinking on my side, but if you have ideas for how to do this while still allowing validation on loading, I'm open to suggestions.

Something off the top of my head is to introduce a conditional that validates only if all folds are recorded. The purpose of the validation is to check for any possible errors before a benchmark is saved as complete (and used by the doc builder to create the docs), to avoid downstream debugging chaos. But I can't immediately foresee any scenario where the benchmark checker would allow an incomplete task without error, so maybe a simple conditional would work.
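
A minimal sketch of such a conditional, using toy stand-in classes rather than Matbench's real ones: `ToyTask`, `all_folds_recorded`, `load_tasks`, and the dict layout are all assumptions made for illustration. Loading skips validation for tasks with missing folds and validates the complete ones.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTask:
    """Stand-in for a benchmark task; not Matbench's actual class."""
    name: str
    folds: tuple = (0, 1, 2, 3, 4)
    results: dict = field(default_factory=dict)  # fold index -> recorded data

    @property
    def all_folds_recorded(self) -> bool:
        return all(fold in self.results for fold in self.folds)

    def validate(self) -> None:
        missing = [f for f in self.folds if f not in self.results]
        if missing:
            raise ValueError(f"Cannot validate task {self.name}: folds {missing} not recorded")
        # Further consistency checks on complete results would go here.


def load_tasks(raw: dict) -> list:
    """Build tasks from a JSON-like dict, validating only the complete ones."""
    tasks = []
    for name, results in raw.items():
        # JSON object keys are strings, so convert fold indices back to int.
        task = ToyTask(name=name, results={int(k): v for k, v in results.items()})
        if task.all_folds_recorded:
            task.validate()
        tasks.append(task)
    return tasks
```

With this, `load_tasks({"matbench_jdft2d": {"1": [0.1, 0.2]}})` returns the partial task without raising, while a task with all five folds present still goes through `validate()`.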

Member Author

janosh commented Sep 22, 2022

Yes, only doing the validation once all folds are recorded would be a good solution. 👍
