This repository has been archived by the owner on Nov 13, 2020. It is now read-only.

mkl increasing slug size significantly #21

Open
mszheng opened this issue Feb 18, 2016 · 12 comments

Comments

@mszheng

mszheng commented Feb 18, 2016

Since the upgrade described here (Feb 5 2016): https://www.continuum.io/blog/developer-blog/anaconda-25-release-now-mkl-optimizations, conda is defaulting to the mkl optimized numpy and scipy, which require the ~120 MB mkl package. This can easily bump the slug size over 300 MB. It's simple to work around this by specifying "nomkl" in conda-requirements.txt, but perhaps that should be the default for this buildpack.

@htylab

htylab commented Feb 20, 2016

I recommend this. Remove mkl from the default settings.

pierrelb referenced this issue in pierrelb/flask-demo Feb 24, 2016
Running into slug size problems:
```
remote: -----> Compressing...
remote:  !     Compiled slug size: 321.9M is too large (max is 300M).
```

Seems mkl is causing this.
https://github.com/kennethreitz/conda-buildpack/issues/21
Current discussion says to use 'nomkl'

Reduced slug size from 321.9M (over the 300M limit) to 182.5M
@durkode

durkode commented Mar 1, 2016

I agree with this. It took me a long while to work out how to reduce my slug size, as the lowest I could get it was 310MB. Adding nomkl instantly dropped it to 166MB. It would be great to either have this as the default or feature it in the readme to increase its visibility.

@alexlouden

👍 Adding nomkl reduced my slug from 420 MB to 280 MB. I'm using scikit-learn, scipy, numpy and opencv.

@mrgordon

+1, this project is borderline unusable on Heroku without this. Thanks for the hard work as this buildpack is a pleasure other than the slug size issues!

@evdoks

evdoks commented May 10, 2016

Putting nomkl in conda-requirements.txt does not seem to help in my case. Having the following packages in conda-requirements.txt

nomkl
scipy=0.17.0
scikit-learn=0.17.1
pandas=0.18.0
nltk=3.2.1
sqlalchemy=1.0.12
joblib=0.9.4

still leads to mkl being downloaded:

     The following packages will be downloaded:
remote:        
remote:            package                    |            build
remote:            ---------------------------|-----------------
remote:            libgcc-5.2.0               |                0         1.1 MB
remote:            libgfortran-3.0.0          |                1         281 KB
remote:            mkl-11.3.1                 |                0       121.2 MB
remote:            nomkl-1.0                  |                0          402 B
remote:            system-5.8                 |                2         170 KB
remote:            openblas-0.2.14            |                0         6.6 MB
remote:            joblib-0.9.4               |           py27_0         121 KB
remote:            nltk-3.2.1                 |           py27_0         1.7 MB
remote:            numpy-1.10.4               |     py27_nomkl_0         6.0 MB
remote:            pytz-2016.3                |           py27_0         178 KB
remote:            six-1.10.0                 |           py27_0          16 KB
remote:            sqlalchemy-1.0.12          |           py27_0         1.3 MB
remote:            python-dateutil-2.5.2      |           py27_0         236 KB
remote:            scipy-0.17.0               |      np110py27_3        31.3 MB
remote:            pandas-0.18.0              |      np110py27_0        12.0 MB
remote:            scikit-learn-0.17.1        |np110py27_nomkl_0         8.6 MB
remote:            ------------------------------------------------------------
remote:                                                   Total:       190.6 MB
remote:        
remote:        The following NEW packages will be INSTALLED:
remote:        
remote:            joblib:          0.9.4-py27_0            
remote:            libgcc:          5.2.0-0                 
remote:            libgfortran:     3.0.0-1                 
remote:            mkl:             11.3.1-0                
remote:            nltk:            3.2.1-py27_0            
remote:            nomkl:           1.0-0                   
remote:            numpy:           1.10.4-py27_nomkl_0      [nomkl]
remote:            openblas:        0.2.14-0                
remote:            pandas:          0.18.0-np110py27_0      
remote:            python-dateutil: 2.5.2-py27_0            
remote:            pytz:            2016.3-py27_0           
remote:            scikit-learn:    0.17.1-np110py27_nomkl_0 [nomkl]
remote:            scipy:           0.17.0-np110py27_3      
remote:            six:             1.10.0-py27_0           
remote:            sqlalchemy:      1.0.12-py27_0           
remote:            system:          5.8-2  

Am I missing something?

@dtran320

@evdoks For some reason this broke for me recently as well, perhaps due to a regression in Conda or a change in the way dependencies are handled. The workaround I finally stumbled on was to pass the --no-deps flag to conda install and explicitly list out all your packages in conda-requirements.txt. See conda/conda#2032 (comment)

This means, unfortunately, that you'll need to fork and edit this buildpack (which I was already doing to make it work well with my multi-buildpack setup). The line in question is in bin/steps/conda_compile.

It previously was:

conda install --file conda-requirements.txt --yes | indent

and should be changed to:

conda install --no-deps --file conda-requirements.txt --yes | indent

@evdoks

evdoks commented May 12, 2016

@dtran320 Thanks! Adding --no-deps solved the problem.

@mcg1969

mcg1969 commented May 13, 2016

That's a great workaround. The transition from mkl to nomkl has proven... difficult.

@jake17007

jake17007 commented Nov 22, 2016

@dtran320, @evdoks

Instead of forking and adding --no-deps, I added nomkl and the highest-level package I needed, scikit-learn, and pinned that dependency to its nomkl build.

My conda-requirements.txt:

nomkl
scikit-learn=0.18.1=np111py27_nomkl_0

That allowed me to use this conda buildpack and let the solver find the dependencies, so I did not have to specify lower-level packages like numpy, scipy, etc.

@dtran320

@jake17007 Ah, using --no-deps actually broke some things for us with the newest versions, so we've migrated away from that solution.

@dtran320

Hmm, did everyone's workarounds just break? For some reason, my buildpack went back to mkl:

       The following packages will be downloaded:
       
           package                    |            build
           ---------------------------|-----------------
           mkl-11.3.3                 |                0       122.1 MB
           openblas-0.2.19            |                0         3.0 MB
           numpy-1.11.2               |           py27_0         6.2 MB
           scipy-0.18.1               |      np111py27_0        30.9 MB
           scikit-learn-0.18.1        |      np111py27_0        10.9 MB
           ------------------------------------------------------------
                                                  Total:       173.1 MB
       
       The following NEW packages will be INSTALLED:
       
           mkl:          11.3.3-0                
       
       The following packages will be UPDATED:
       
           numpy:        1.11.2-py27_nomkl_0      [nomkl] --> 1.11.2-py27_0     
           openblas:     0.2.14-4                 --> 0.2.19-0          
           scikit-learn: 0.18.1-np111py27_nomkl_0 [nomkl] --> 0.18.1-np111py27_0
           scipy:        0.18.1-np111py27_nomkl_0 [nomkl] --> 0.18.1-np111py27_0
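
A regression like this is easy to miss in a long build log. Here is a minimal sketch that scans the "UPDATED" part of a conda plan and flags packages moving from a nomkl build to a regular one. The helper name `nomkl_regressions` is hypothetical, and the pattern assumes the `name: old-build [feature] --> new-build` layout shown above.

```python
import re

# Matches plan lines such as:
#   numpy:  1.11.2-py27_nomkl_0 [nomkl] --> 1.11.2-py27_0
# The "[nomkl]" feature tag is optional; lines without "-->" are ignored.
CHANGE = re.compile(
    r"^(?:remote:)?\s*(?P<name>[A-Za-z0-9_.\-]+):\s+(?P<old>\S+)"
    r"(?:\s+\[[^\]]*\])?\s+-->\s+(?P<new>\S+)"
)


def nomkl_regressions(plan_text):
    """Return (name, old_build, new_build) for packages leaving a nomkl build."""
    found = []
    for line in plan_text.splitlines():
        match = CHANGE.match(line)
        if not match:
            continue
        old, new = match.group("old"), match.group("new")
        if "nomkl" in old and "nomkl" not in new:
            found.append((match.group("name"), old, new))
    return found


if __name__ == "__main__":
    plan = """
           numpy:        1.11.2-py27_nomkl_0      [nomkl] --> 1.11.2-py27_0
           openblas:     0.2.14-4                 --> 0.2.19-0
"""
    for name, old, new in nomkl_regressions(plan):
        print(f"{name} is switching away from nomkl: {old} -> {new}")
```

A deploy script could fail the build when this list is non-empty, rather than finding out from the slug size afterwards.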

@icoxfog417

icoxfog417 commented Dec 24, 2016

In my experience, the following steps are needed.

  • First, run conda install nomkl --yes | indent
    • To avoid mkl, nomkl has to be installed before anything else.
  • Add --no-deps when installing the packages (conda install --no-deps --file conda-requirements.txt --yes | indent).
    • Do not worry about --no-deps: if you create conda-requirements.txt from conda list --export, all dependencies are already listed in the file.
  • Check that the nomkl versions of the libraries are listed in conda-requirements.txt.
    • You have to use the nomkl builds of packages even if you have installed nomkl.
    • For example, not numpy-1.11.2-py35_0.tar.bz2, but numpy-1.11.2-py35_nomkl_0.tar.bz2
    • The nomkl build of a package has different dependencies (for example, the nomkl build of numpy depends on openblas), so check these if you don't use the nomkl builds in your local environment.
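
The last check in the list above can be sketched in a few lines of Python. This is only an illustration: `parse_export_line` and `audit_nomkl` are made-up names, the input is assumed to use the single-equals `name=version=build` format that `conda list --export` writes, and the default `blas_packages` tuple is an assumption based on the packages that show nomkl builds in the logs in this thread.

```python
def parse_export_line(line):
    """Split a 'name=version=build' line; return None for comments and blanks."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    parts = line.split("=")
    # Pad so that bare names ('nomkl') and 'name=version' pins also parse.
    name, version, build = (parts + [None, None])[:3]
    return name, version, build


def audit_nomkl(requirements_text, blas_packages=("numpy", "scipy", "scikit-learn")):
    """Return a list of problems that could let MKL back into the environment."""
    specs = [s for s in map(parse_export_line, requirements_text.splitlines()) if s]
    names = {name for name, _, _ in specs}
    problems = []
    if "nomkl" not in names:
        problems.append("nomkl is not listed")
    if "mkl" in names:
        problems.append("mkl is listed explicitly")
    for name, _version, build in specs:
        # A missing build string means the solver is free to pick an MKL build.
        if name in blas_packages and (build is None or "nomkl" not in build):
            problems.append(f"{name} is not pinned to a nomkl build")
    return problems


if __name__ == "__main__":
    with open("conda-requirements.txt") as handle:
        for problem in audit_nomkl(handle.read()):
            print("warning:", problem)
```

Running it against conda-requirements.txt before pushing gives an early warning, though it cannot tell whether the solver will still pull mkl in through some other dependency.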

Below is my repository, which I recently deployed to Heroku successfully.

icoxfog417/machine_learning_in_application

My patched buildpack is below (it also supports Python 3).

icoxfog417/conda-buildpack
