Avoid keeping calculated values in memory #49
Comments
Hello, Do you know if you are using 'pure Python' Metview, or do you have the binaries installed? Unless you have the environment variable Thanks, |
Hi Iain, thanks for your feedback. I use the Metview bundle and compile it with only the Metview UI switched off, since I don't need it for my use case. Afterwards I install metview-python via pip. All of this runs in a Docker environment. I just set the METVIEW_PYTHON_ONLY environment variable and tested again. This did not change the behavior. If I use "fork" or just let a single process iterate over the tasks, the memory usage grows. If I use "spawn", metview-python raises an exception: Exception: Command "metview" did not respond within 8 seconds. This timeout is configurable by setting environment variable METVIEW_PYTHON_START_TIMEOUT in seconds. At least Metview 5 is required, so please ensure it is in your PATH, as earlier versions will not work with the Python interface. I would say I don't do any special calculations, just some unit conversion, calculating wind speed / direction from u/v, and calculating the maximum/minimum/difference of two fields. Please see some examples below:
Best regards |
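For readers unfamiliar with the calculations Dennis lists, here is a minimal sketch in plain Python. It is not his code and does not use Metview: plain lists stand in for fields, and the function names are made up for illustration. It assumes the usual meteorological convention in which direction is where the wind blows from, measured clockwise from north.

```python
import math

def wind_speed(u, v):
    """Wind speed from the eastward (u) and northward (v) components."""
    return math.hypot(u, v)

def wind_direction(u, v):
    """Direction the wind blows FROM, in degrees clockwise from north."""
    return (180.0 + math.degrees(math.atan2(u, v))) % 360.0

def kelvin_to_celsius(t_kelvin):
    """Simple unit conversion of a temperature value."""
    return t_kelvin - 273.15

# Plain lists stand in for gridded fields (one value per grid point).
u = [0.0, 10.0, 0.0, -3.0]
v = [-10.0, 0.0, 10.0, -4.0]

speed = [wind_speed(a, b) for a, b in zip(u, v)]
direction = [wind_direction(a, b) for a, b in zip(u, v)]

# Point-wise maximum, minimum and difference of two fields.
field_max = [max(a, b) for a, b in zip(u, v)]
field_min = [min(a, b) for a, b in zip(u, v)]
field_diff = [a - b for a, b in zip(u, v)]
```

Each list comprehension creates a new intermediate result, which is the kind of calculated value the issue title refers to.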
Hi Dennis, Thank you for providing us with the sample code. I wonder which Metview version you are using. You can get it by printing the result of mv.version_info().
There were significant memory usage improvements in Metview last year, so if your version is too old, that could explain your problem. However, judging by the code above, the excessive memory usage probably comes from the code performing the looping. Would it be possible to share some code showing the main iteration through your input data? As Iain pointed out, the Python code that does not release the fieldset memory is only used when METVIEW_PYTHON_ONLY is set. It is an experimental feature, and if you build Metview from a bundle you most probably do not use it. However, to be on the safe side, please try the following: make sure METVIEW_PYTHON_ONLY is unset and call gradient() on a fieldset. If it works, we can be sure that you are not using the pure Python implementation (since gradient() is not available in it). In this case we just need to focus on the binary (C++) version to find out why memory accumulates for you. Kind regards, |
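One way to confirm the state of the variable from inside Python, before importing metview, is sketched below. It assumes that the mere presence of METVIEW_PYTHON_ONLY in the environment is what matters; how metview-python interprets its value is not stated in this thread. The helper name is made up for illustration.

```python
import os

VAR_NAME = "METVIEW_PYTHON_ONLY"

def pure_python_requested(environ=None):
    """Return True if METVIEW_PYTHON_ONLY is present in the environment.

    Assumption: presence alone selects the experimental pure Python mode.
    """
    env = os.environ if environ is None else environ
    return VAR_NAME in env

# Report the current state, then make sure the variable is unset
# for this process (and any child processes) before metview is imported.
print(f"{VAR_NAME} set: {pure_python_requested()}")
os.environ.pop(VAR_NAME, None)
```

Removing the variable with `os.environ.pop` only affects the current process and the children it starts, not the parent shell or the Docker image.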
Hi Sandor, thanks, it took a while but I have now also prepared a bit of sample code. In the meantime I also upgraded to Metview-Bundle but that did not solve the issue. The output from mv.version_info() is: mv.gradient(f) did not cause an error, so METVIEW_PYTHON_ONLY is not set. In my repository at https://github.com/meteoiq/metview I added a few examples. /examples/test_version.py shows the version output and tests whether mv.gradient(f) works. /examples/test_regrid.py is a trimmed-down version of one of the use cases that results in the memory issues. In the example I download some GRIB fields from the open data server and then parallelise some calculations and the regridding to a new grid. I output the memory usage of the child processes, and they show the increasing memory consumption:
In production I work with much higher resolution and many more parameters, so the memory usage adds up to 40 GB. But I hope the small example outlines the issue. I also added an option to switch to the spawn method for creating new processes, which results in an error:
I hope this provides sufficient information to further review this issue. If you require any more information, please reach out to me. Thank you very much for your support, |
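A minimal sketch of how per-process memory can be logged using only the standard library follows. This is not the code from the repository above; the function names are made up, and the `resource` module is only available on Unix-like systems. Note that `ru_maxrss` is a high-water mark, so it shows growth but never shows memory being released.

```python
import os
import resource
import sys

def peak_rss_mib():
    """Peak resident set size of the current process in MiB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor

def log_memory(label):
    """Print the process id and its peak memory so far."""
    print(f"pid={os.getpid()} {label}: peak RSS {peak_rss_mib():.1f} MiB")

log_memory("start")
buffer = [0.0] * 1_000_000  # stand-in for a processing step
log_memory("after allocation")
del buffer
log_memory("after release")  # the peak value does not go down
```

Calling `log_memory` at the end of each task inside a worker gives one line per iteration, which makes a steady increase easy to spot.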
Hi Dennis, |
Hi Dennis, |
Hi Sandor, operationally we mainly interpolate O1280 to regular 0.125°x0.125°. We intend to run 8 parallel processes but due to the memory accumulation this exceeds the machine limitations, so currently we just run 2 parallel processes. Best regards |
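To give a sense of scale for this interpolation, the sketch below counts the grid points involved. It relies on the octahedral reduced Gaussian grid having 4i + 16 points on the i-th latitude row from each pole, and it assumes a global regular grid that includes both poles. The 8 bytes per value is an assumption for illustration, not a statement about how Metview stores fields.

```python
def octahedral_points(n):
    """Grid points in an octahedral reduced Gaussian grid O<n>.

    Row i (counted from a pole, i = 1..n) has 4*i + 16 points,
    and there are n rows in each hemisphere.
    """
    return 2 * sum(4 * i + 16 for i in range(1, n + 1))

def regular_ll_points(step_deg):
    """Grid points in a global regular lat/lon grid including both poles."""
    n_lon = round(360 / step_deg)
    n_lat = round(180 / step_deg) + 1
    return n_lon * n_lat

BYTES_PER_VALUE = 8  # assumption: one double-precision value per grid point

for name, points in (
    ("O1280", octahedral_points(1280)),
    ("0.125 x 0.125", regular_ll_points(0.125)),
):
    mib = points * BYTES_PER_VALUE / 2**20
    print(f"{name}: {points} points, about {mib:.1f} MiB per unpacked field")
```

With several million points per field on both sides of the interpolation, even a modest number of retained intermediate fields per process adds up quickly.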
Hi Dennis, Many thanks! So each process is using up ~20 GB of memory? I fixed a memory leak related to GRIB handling in the C++ Metview code (not yet released), but unfortunately it did not fix your case. I have not found the reason for the memory leak using your examples so far. Actually, the memory accumulates very slowly: I tried to scale it up by repeating the processing loops without seeing any significant growth. On my Mac I even noticed that after a while the memory usage occasionally decreases! Nevertheless, if I use the garbage collector explicitly by calling Best regards, |
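The call Sandor names is not preserved above. As a generic illustration, the sketch below assumes it is `gc.collect()` from Python's standard library and shows what an explicit collection does: it frees objects that are only kept alive by reference cycles. The `Field` class is a hypothetical stand-in, not a Metview type, and this says nothing about memory held on the C++ side.

```python
import gc
import weakref

class Field:
    """Hypothetical stand-in for a large intermediate result."""

    def __init__(self, n):
        self.values = [0.0] * n
        self.self_ref = self  # reference cycle: refcounting alone cannot free it

def process(n):
    """Create a temporary field, use it, and drop the local name."""
    field = Field(n)
    probe = weakref.ref(field)  # lets us observe whether the object still exists
    total = sum(field.values)
    del field
    return total, probe

gc.disable()  # keep the demonstration deterministic: no automatic collection
try:
    total, probe = process(1000)
    alive_before = probe() is not None  # the cycle keeps the object alive
    gc.collect()                        # explicit collection breaks the cycle
    alive_after = probe() is not None
finally:
    gc.enable()

print(f"alive before collect: {alive_before}, after collect: {alive_after}")
```

In a long processing loop, dropping references to intermediate results and calling `gc.collect()` at the end of each iteration is a cheap experiment to see whether Python-side objects are what holds the memory.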
Hi Dennis, I have to admit that so far I have only run the tests as a single process (PARALLEL=False) and could not really reproduce a proper memory leak. However, when I used the PARALLEL=True mode the memory leak became obvious: as I do more and more iterations, the memory consistently increases. Unfortunately, Metview should not be used in parallel applications like that. It might work, but it is completely unsupported and untested. The recommended way is to run two or more Metview Python scripts at the same time, independently. My guess is that the memory would not accumulate in that case. So far this has been the best idea I could come up with. Best regards, |
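One possible way to follow this recommendation from a single driver script is sketched below: each task runs in its own freshly started Python interpreter, so all memory is returned to the operating system when that interpreter exits. The function name, the inline worker and the file names are hypothetical; in practice the worker would be a separate script that imports metview and processes one chunk of input. Whether this avoids the accumulation with Metview is Sandor's guess above, not something this sketch verifies.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_independent(commands, max_parallel=2):
    """Run each command as a separate OS process, max_parallel at a time.

    Threads are used only to wait for the child processes; no processing
    happens inside the driver itself. Returns exit codes in input order.
    """
    def run_one(command):
        return subprocess.run(command).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))

# Hypothetical worker: a real one would be a script that imports metview.
WORKER = "import sys; print('processing', sys.argv[1])"

commands = [
    [sys.executable, "-c", WORKER, name]
    for name in ("chunk_a.grib", "chunk_b.grib", "chunk_c.grib")
]
codes = run_independent(commands, max_parallel=2)
print("exit codes:", codes)
```

Keeping the chunks small bounds the peak memory of each interpreter, at the cost of paying the Metview start-up time once per chunk.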
Hi Sandor, thank you for the feedback. I will then refactor the processing so that metview-python runs in different processes. I am just wondering if you are aware of other ways or best practices to use MIR interpolation, easily apply arithmetic operations on gridded data and output GRIB again. My understanding was that Metview is the only possibility, as MIR is not available as a separate package. Best regards and thanks again for your help! |
Hi Dennis, Actually, MIR is available as a separate package: it is on GitHub, and you can build it e.g. as part of the Metview bundle. We are fully aware of the limitations of using Metview when it comes to parallel runs. There is a new project at ECMWF called earthkit that will offer functionality similar to Metview in Python but will be fully scalable. However, it is not yet available. As for Metview best practices, I can make the following recommendations.
runs almost 2.5 times faster than this:
Best regards, |
Thank you Sandor, that was very helpful. Feel free to close this issue. I am happy to test earthkit with my specific use case and give feedback once it is a bit more mature. Best regards |
Hello,
I am using metview-python to work with large amounts of GRIB data and during the processing a lot of memory is allocated (even though I eventually only extract some grid points). When inspecting the code it seems that by default the data is not kept in memory. But it seems that for calculated values this is not the case. Would it be possible to find a better way for memory handling, even if the performance may suffer (e.g. storage in temporary files, invalidating/removing values again from memory when no longer needed)?
Alternatively, I tried to parallelize jobs using Python multiprocessing, but I can only use the "fork" method, which retains the memory usage; with "spawn" the process loses the connection to Metview.
Thanks
Dennis