This repository has been archived by the owner on Dec 4, 2023. It is now read-only.

Close netcdf dataset after getting its size #29

Open
wants to merge 3 commits into
base: master
from

Conversation

ninsbl

@ninsbl ninsbl commented Oct 1, 2021

Thanks for this cool Python library. I looked at alternatives (like siphon), but this one is still the most to-the-point solution with the features and functions I need.
Hopefully, the fact that there have been no commits to the master branch in recent years is a sign that the library is reliable and just works (and not that it is no longer actively maintained).

When I used it to crawl a larger THREDDS server, I noticed that at some point the server returned a 502 Bad Gateway error. This may be related to the issue I try to address in this PR: netCDF files are not closed after their size is computed, which possibly leaves the server with plenty of open datasets.
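A minimal sketch of the idea, not the library's actual code: the dataset is opened, the variable sizes are summed, and the handle is closed even if an error occurs. The function name `dataset_size_mb`, the `open_dataset` parameter, and the megabyte conversion are assumptions made for illustration; `netCDF4.Dataset` is only used as the default opener.

```python
from contextlib import closing


def dataset_size_mb(url, open_dataset=None):
    """Return the summed size of all variables in megabytes (hypothetical helper)."""
    if open_dataset is None:
        # Imported lazily so the sketch can be exercised without netCDF4 installed.
        import netCDF4
        open_dataset = netCDF4.Dataset

    # closing() calls .close() on exit, so the dataset is released
    # as soon as its size is known, even when an exception is raised.
    with closing(open_dataset(url)) as nc:
        # Assumes fixed-size dtypes; variable-length types have no itemsize.
        total_bytes = sum(
            var.dtype.itemsize * var.size for var in nc.variables.values()
        )
    return total_bytes * 1e-6  # bytes -> megabytes (assumed unit)
```

Passing the opener as a parameter is only a convenience for testing; in real use the default would open the OPeNDAP URL directly.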

A related question: if I am not interested in the size of a dataset and just want to get the URLs, opening the dataset and computing its size is time spent unnecessarily. Would it be acceptable for you to change the default so that the size is only computed on user request? I could have a look at that and make a separate PR...
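One way the "size only on request" behaviour could look is a lazily evaluated attribute. This is a rough sketch under assumptions: `LazyLeaf` and `size_func` are hypothetical names, not part of the library, and `functools.cached_property` requires Python 3.8 or newer.

```python
from functools import cached_property


class LazyLeaf:
    """Hypothetical dataset record that defers the size computation."""

    def __init__(self, url, size_func):
        self.url = url
        self._size_func = size_func  # callable that opens the dataset and measures it

    @cached_property
    def size(self):
        # Runs only on first access; the result is cached on the instance,
        # so crawls that only need URLs never open the dataset.
        return self._size_func(self.url)
```

With this shape, collecting `leaf.url` for every dataset costs no extra requests, and the dataset is opened at most once per leaf when `leaf.size` is read.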

But if this library is no longer maintained, I would be really happy if you could point me to an alternative library that could be used as a replacement (with the same features)...

@ninsbl
Author

ninsbl commented Oct 3, 2021

So it seems the main problem was actually the TCP connection that is kept alive (for some time) with a newer version of requests. These connections accumulated and led to thousands of open connections on a bigger THREDDS server, effectively killing the server.

This could probably be solved more elegantly, e.g. as described here:
https://stackoverflow.com/questions/54876452/run-parallel-request-session-in-python
but closing the connection after the data is read limits the number of open connections to the number of workers. Performance seems unchanged.
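A minimal sketch of the "close after reading" approach, assuming a thread pool of workers: each worker opens a `requests.Session`, reads the whole body, and closes the session before returning, so at most one connection per worker is open at a time. The names `fetch_and_close`, `crawl`, and the `session_factory` parameter are hypothetical and not taken from this PR.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_and_close(url, session_factory=None, timeout=30):
    """Fetch url, read the full body, then close the underlying connections."""
    if session_factory is None:
        # Imported lazily; requests is a third-party dependency.
        import requests
        session_factory = requests.Session

    # Leaving the with-block closes the session and its pooled
    # keep-alive connections instead of leaving them open on the server.
    with session_factory() as session:
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
        return response.content  # body is fully read before the session closes


def crawl(urls, workers=4, session_factory=None):
    """Fetch all urls with a bounded pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: fetch_and_close(u, session_factory), urls))
```

The trade-off is that keep-alive reuse is given up in exchange for a hard bound on open connections; a shared session per worker would be the more elaborate alternative.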
