Improving installing packages with conda #97
Comments
An issue with having pyctdev run two
From a folder containing this:

    # dodo.py
    def task_reproduce():
        return {'actions': [
            'conda install --yes "python=3.8.8"',
            'echo "I will be reported as failing :("'
        ]}

Leads to this error:
Note that the action reported as failing is the second one, which has nothing to do with Python. The error may occur in between the two actions, but it isn't well reported. This is actually a pretty important point, as this error occurs quite often and the suggestion I made in my previous post doesn't take that into account.
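One way to check which step really fails is to run the actions one at a time and record each exit code. The following is only a minimal sketch using the standard library; it is not how doit or pyctdev execute actions, and `run_actions` is a hypothetical helper name.

```python
import subprocess
import sys


def run_actions(actions):
    """Run shell commands in order and stop at the first non-zero exit code.

    Returns a list of (command, returncode) pairs for the commands that ran.
    """
    results = []
    for command in actions:
        completed = subprocess.run(command, shell=True)
        results.append((command, completed.returncode))
        if completed.returncode != 0:
            break  # later actions are not run, so they cannot be blamed
    return results


if __name__ == "__main__":
    # Harmless stand-ins for the actions of the reproducer above.
    ok = f'"{sys.executable}" -c "import sys; sys.exit(0)"'
    bad = f'"{sys.executable}" -c "import sys; sys.exit(3)"'
    for command, code in run_actions([ok, bad, ok]):
        print(code, command)
```

Running the two actions of the reproducer through such a helper, outside of doit, would at least show whether `conda install` itself returns a non-zero code or whether the failure only appears once control returns to the calling process.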
Ideally you would not install pyctdev (or any similar tool) in the environment you're testing or building, as its own dependencies may affect those of the software to test/build. It's even more true for pyctdev because of the bug reported above. One may think that pyctdev could be installed in the base environment and as such would be made accessible to all the other environments:
While we could accommodate 1), 2) is a blocker. To unlock 2), we could convert all the projects to using
Currently the pyctdev conda package depends on conda itself, which means that when you install pyctdev in any environment other than base, it will install conda there as well, which causes all sorts of confusion and problems. Really, any tool that depends on having conda in the same environment should not be installed anywhere except base.
I'm opening this issue to discuss how we could improve the installation of packages with conda, with the goals of making it faster, more flexible and more predictable.
This is motivated by the recent work done by @philippjfr to improve the speed of the test workflows (starting from Panel) and by various difficulties experienced with using `pyctdev` for almost a year now.

How it works
Installing packages with `pyctdev` first requires creating an environment. This is usually done by first installing `pyctdev` and then running:

    doit env_create --name envname --python=3.x -c channel1 -c channel2

which:

1. creates the environment
2. installs `pyctdev` in that new environment:
   - if `pyctdev` (the one installed originally and running these steps) is a pre-release version, install from the `pyviz/label/dev` channel
   - if `PYCTDEV_SELF_CHANNEL` is provided, install from the channel provided as the value of that environment variable
   - otherwise, install from the `pyviz` channel

Note, as this may be important, that in step 2 the `conda install` step does not include all the channels listed in the `doit env_create` call.

Then the environment is activated; there's no `pyctdev` command for that.

The main installation can now take place. This is done by running:

    doit develop_install -c channel1 -c channel2 -o options1 -o options2 -o options3

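which:

1. installs the build dependencies with `conda install` (using the `-c` channels)
2. installs all the other required dependencies with `conda install` (using the `-c` channels)
3. runs `python -m pip install --no-deps --no-build-isolation -e .` (`--no-deps` as all the dependencies should already be there, `--no-build-isolation` to avoid creating a virtual environment, as all the build dependencies are already installed anyway)

`--conda-mode=mamba` can be set to use `mamba` instead of `conda` in steps 1 and 2.

Making it faster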
There are, I believe, two main avenues to make this faster.

The first one would be to use a faster solver by default, either mamba or the libmamba solver. The `--conda-mode` option already offers the possibility to run the slowest install steps with `mamba`. Ideally though we wouldn't have to use `mamba`, and we would rely on the libmamba solver implemented in `conda`, which hopefully should one day become the default one, or at least be available without an experimental flag.

The second one would consist in reducing the number of `conda install` steps. There are currently 4 conda install steps: the first one when the environment is created, and the three other ones to install `pyctdev`, the build dependencies and then all the other required dependencies. Installing multiple times in a conda environment is known to lead to long solving times:

- `doit env_create` could install `pyctdev` when it is creating the environment, reducing it basically to `conda create -n new-env python=3.x pyctdev` (as a matter of fact I have replaced `doit env_create` by this in a number of workflows already)
- `doit develop_install`
We could go even further and have a single command to install all the dependencies, adding to `doit env_create` some of the features of `doit develop_install`, which would then be called as such: `doit env_create --name my-env --python=3.x -c channel1 -c channel2 -o options1 -o options2 -o options3`. As this might end up installing a version of `pyctdev` that is not the latest, another command line parameter could be added to set a version constraint, e.g. `--pyctdev-install=">=1.1"`.
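To illustrate the idea of collapsing several install steps into one solve, here is a small sketch that merges groups of package specs into a single `conda create` command line. It is not pyctdev code: `single_create_command` is a hypothetical helper, and the package names in the usage example are placeholders chosen for illustration.

```python
import shlex


def single_create_command(name, spec_groups, channels=(), tool="conda"):
    """Build one `<tool> create` argv from several groups of package specs.

    Duplicate specs are dropped (first occurrence wins, order preserved) so
    that everything is handed to the solver in a single call.
    """
    seen = set()
    specs = []
    for group in spec_groups:
        for spec in group:
            if spec not in seen:
                seen.add(spec)
                specs.append(spec)
    argv = [tool, "create", "--yes", "--name", name]
    for channel in channels:
        argv += ["-c", channel]
    return argv + specs


if __name__ == "__main__":
    cmd = single_create_command(
        "my-env",
        [
            ["python=3.9", "pyctdev>=1.1"],  # environment creation + pyctdev
            ["setuptools", "param"],         # placeholder build dependencies
            ["pytest", "param"],             # placeholder test dependencies
        ],
        channels=["pyviz", "conda-forge"],
    )
    # shlex.join quotes specs such as pyctdev>=1.1 for safe pasting in a shell.
    print(shlex.join(cmd))
```

Passing `tool="mamba"` would produce the equivalent `mamba create` call, in the spirit of the existing `--conda-mode=mamba` option.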
Making it more flexible

Some packages that are needed to run the test suite or the docs build are not available on PyPI. Because of that they are not listed anywhere in the `setup.py` file; instead, they are installed directly in the GitHub workflow files with `conda install`. `pyctdev` should offer a way to install these packages without having to resort to using `conda` directly.

Some packages are not available on Anaconda.org (a recent example is `pytest-playwright`). These packages are usually installed manually with `pip` after running `doit develop_install`. There should be a way to declare a list of packages that `pyctdev` should install with `pip`.

Regarding these two points, one could think that the packages to install only with conda and only with pip could be declared in a config file (e.g. in `setup.cfg`). However, I believe that for maximum flexibility it would actually be better to add command line parameters to `pyctdev` instead, as sometimes the packages to install depend on the operating system or on some other conditions. I would suggest something like `doit develop_install --conda-install "nodejs>15" --conda-install mesalib --pip-install pytest-playwright --pip-install ...`.
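As a rough sketch of how such repeatable options could behave, the snippet below parses `--conda-install` and `--pip-install` and turns them into two install commands. It uses `argparse` purely for illustration, which is an assumption: pyctdev is built on doit and declares its parameters differently. `build_parser` and `extra_install_commands` are hypothetical names.

```python
import argparse


def build_parser():
    """Parser for the two proposed, repeatable options."""
    parser = argparse.ArgumentParser(prog="develop_install")
    parser.add_argument("--conda-install", action="append", default=[],
                        metavar="SPEC", help="package to install with conda only")
    parser.add_argument("--pip-install", action="append", default=[],
                        metavar="SPEC", help="package to install with pip only")
    return parser


def extra_install_commands(argv):
    """Return the extra install commands (as argv lists) implied by the options."""
    args = build_parser().parse_args(argv)
    commands = []
    if args.conda_install:
        commands.append(["conda", "install", "--yes", *args.conda_install])
    if args.pip_install:
        commands.append(["python", "-m", "pip", "install", *args.pip_install])
    return commands


if __name__ == "__main__":
    for command in extra_install_commands([
        "--conda-install", "nodejs>15",
        "--conda-install", "mesalib",
        "--pip-install", "pytest-playwright",
    ]):
        print(command)
```

Because the values come from the command line, a workflow can add or omit an option per operating system, which is the flexibility argued for above. In practice the conda-only specs would probably be better merged into the main install step than run as a separate solve.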
Making it more predictable

What sometimes makes the installation process difficult to predict, and even not so robust, is the "channel dance" 💃, whereby some packages get re-installed from another channel because different channels are specified in the install steps. This was the source of a bad bug, by which Python itself was being re-installed during a `doit develop_install` call, leading to a cryptic doit/pyctdev error. It took months to find on the HoloViews test suite as it happened on only one platform, and it still happens from time to time in the ecosystem.

One of the steps that I think is a source of this problem is the second step of `doit env_create`, the one that installs `pyctdev`, because it doesn't re-use the channels passed to `doit env_create`, and because it chooses the channel to install `pyctdev` from based on some rather implicit conditions. I would suggest that this step install `pyctdev` with the channels provided to `doit env_create`, and that a command line parameter be added to `env_create` to override the channel it should be installed from, e.g. `doit env_create ... --pyctdev-channel "pyviz/label/dev"`. Note that in most HoloViz cases you wouldn't use that new parameter as either `pyviz` or `pyviz/label/dev` is specified in the channels list.

An approach that I have recently tried and that I find very appealing is to:
conda create -n my-env
conda activate my-env
conda config --env --append channels channel1 --append channels channel2
conda config --set channel_priority strict
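As a toy illustration of why `channel_priority strict` in the commands above makes channel selection predictable: under strict priority, once a package name is found in a higher-priority channel, lower-priority channels are not considered for it. The sketch below models only that rule and does no version solving; `pick_channel_strict` is a hypothetical function and the channel contents are invented for the example.

```python
def pick_channel_strict(package, channels, index):
    """Return the first channel, in priority order, that offers `package`.

    `channels` is ordered from highest to lowest priority and `index` maps a
    channel name to the set of package names it provides.
    """
    for channel in channels:
        if package in index.get(channel, ()):
            return channel
    return None


if __name__ == "__main__":
    channels = ["pyviz/label/dev", "conda-forge"]
    # Invented contents, not the real state of these channels.
    index = {
        "pyviz/label/dev": {"panel", "holoviews"},
        "conda-forge": {"python", "panel", "nodejs"},
    }
    for package in ("panel", "python", "mesalib"):
        print(package, "->", pick_channel_strict(package, channels, index))
```

With a fixed, ordered channel list, the same package always resolves to the same channel, which is what prevents a later install step from switching a package to another channel.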
The `conda config --env` call creates a local `.condarc` file associated with that environment. The benefit of this approach is that all the channels are declared prior to installing anything, in the order they are supposed to be used. Setting the channel priority to `strict` makes the environment solving even more predictable (and faster, I believe). So the environment is set up, and the later `conda install` calls don't have to specify any channel at all. I think that this approach also offers a better separation from the user's conda configuration: their system `.condarc` is less likely to leak its configuration during the installation procedure. Another situation that can benefit from this approach is when, in a local setup, you want to install a new package or update a package. In that case you would do that using `conda install` directly, and you would have to remember the channels you should use and their order in order to avoid the channel dance. With the suggested approach you don't have to remember anything about the channels.

Suggestion
If I combined all the suggestions I've made into a rather ambitious proposal, it would be to extend `doit env_create` so that the following is allowed:

which would do:
conda create -n my-env
conda activate my-env
conda config --env --append channels pyviz/label/dev --append channels conda-forge --append channels nodefaults
conda config --set channel_priority strict
mamba install python=3.x "pyctdev>=1.1" all the other tests and examples dependencies "nodejs>15"
python -m pip install pytest-playwright
python -m pip install --no-deps --no-build-isolation -e .
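The sequence above can be sketched as a function that turns the proposed parameters into the list of commands to run. This is only an illustration of the proposal, not an existing pyctdev feature: `proposed_plan` and its parameters are hypothetical, the test and example dependencies are taken as an explicit input rather than read from the package metadata, and the function only builds strings without executing anything.

```python
import shlex


def proposed_plan(name, python, channels, conda_specs, pip_specs,
                  pyctdev_spec="pyctdev", tool="mamba"):
    """Build the command sequence of the proposal as a list of shell strings."""
    append_channels = " ".join(f"--append channels {c}" for c in channels)
    commands = [
        f"conda create -n {name}",
        f"conda activate {name}",
        f"conda config --env {append_channels}",
        "conda config --set channel_priority strict",
    ]
    # One single solve for Python, pyctdev and every conda dependency.
    specs = [f"python={python}", pyctdev_spec, *conda_specs]
    commands.append(f"{tool} install " + " ".join(shlex.quote(s) for s in specs))
    if pip_specs:
        quoted = " ".join(shlex.quote(s) for s in pip_specs)
        commands.append(f"python -m pip install {quoted}")
    commands.append("python -m pip install --no-deps --no-build-isolation -e .")
    return commands


if __name__ == "__main__":
    for line in proposed_plan(
        "my-env", "3.9",
        channels=["pyviz/label/dev", "conda-forge", "nodefaults"],
        conda_specs=["nodejs>15"],
        pip_specs=["pytest-playwright"],
        pyctdev_spec="pyctdev>=1.1",
    ):
        print(line)
```

Note that `conda activate` only affects the shell it runs in, so a real implementation would need to run the later steps in the same shell session or target the environment explicitly.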
I would appreciate any feedback on this issue. If the last suggestion is too ambitious, implementing some of the first suggestions separately should already be an improvement. Note that I have not given any thought to the `pip` version of `doit develop_install`, which I think doesn't suffer from the issues reported here, at least not the performance-related problems.