We have a somewhat unusual situation: when `nodeenv` downloads Node and then calls `extractall`, the extraction can take three to four hours. This is because of the security scanners we are required to run and because `extractall` extracts files synchronously, one at a time. Note that this is on Windows, where there currently isn't an option to use the system Node (which would also solve a lot of problems).
I downloaded a single version of the Node.js zip file locally just to test the difference. I (roughly) replicated the `download_node_src` method and had it extract the files the same way nodeenv does now.
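The test script itself is not reproduced in this text. As a rough sketch of what such a sequential extraction might look like, assuming only the standard-library `zipfile` module (the function name `extract_sequential` and its arguments are hypothetical, not taken from nodeenv):

```python
import zipfile
from pathlib import Path


def extract_sequential(zip_path, dest_dir):
    """Extract every member of the archive, one after another.

    Hypothetical helper for illustration; not nodeenv's actual code.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall() writes the members one at a time in a single thread,
        # so anything that delays each file write delays the whole loop.
        archive.extractall(dest)
    return dest
```

Calling `extract_sequential("node.zip", "node_src")` would unpack the whole archive in one thread; both paths here are placeholders.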
Running this script takes three hours to finish extracting for me.
I found this interesting blog article that explained how to use `ThreadPoolExecutor` to unzip in parallel. This lets me unzip in three minutes, because the security scanner can do its thing in parallel along with the thread pool. In the example below I have it set to 100 threads. If I increase that to 200 threads, it cuts the time in half, to about 90 seconds.
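The original example is not reproduced in this text, so the following is only a minimal sketch of the general technique, not the blog's code or a proposed nodeenv patch. The names `extract_parallel` and `_extract_batch` are hypothetical, and the default of 100 workers simply mirrors the thread count mentioned above. It assumes a trusted archive with ordinary relative paths, since directories are pre-created from the raw member names to avoid races between threads.

```python
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def _extract_batch(zip_path, names, dest):
    # Each task opens its own ZipFile handle so threads never share a file object.
    with zipfile.ZipFile(zip_path) as archive:
        for name in names:
            archive.extract(name, dest)


def extract_parallel(zip_path, dest_dir, max_workers=100):
    """Extract a zip archive using a pool of worker threads (illustrative sketch)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        infos = archive.infolist()

    # Create all directories up front, in one thread, so that workers do not
    # race each other when creating the same parent directory.
    # Assumption: member names are trusted, plain relative paths.
    file_names = []
    for info in infos:
        target = dest / info.filename
        if info.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            file_names.append(info.filename)

    # Split the files round-robin into at most max_workers non-empty batches.
    batches = [file_names[i::max_workers] for i in range(max_workers)]
    batches = [batch for batch in batches if batch]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(_extract_batch, zip_path, batch, dest)
            for batch in batches
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker thread
    return dest
```

Batching the member names keeps the number of open archive handles bounded by `max_workers`, rather than opening the zip once per file. Whether this helps in a given environment depends on what is slowing down the per-file writes.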
I'm curious if this project would be interested in a pull request to update the zip file extraction to work in parallel. I'm not a huge Python developer but I'd be happy to give it a go.
For context, we're running into this when using `pre-commit`, which sets up a separate Node environment with `nodeenv` for each Node-based pre-commit hook. If you have four or five Node-based hooks, it can take up to a day to get `pre-commit` initialized, and when it's time to update a hook, be prepared to spend some more time.