Thank you for your interest in contributing to this open-source Machine Learning book! We greatly value feedback and contributions from our community.
Please read through this document before you submit any pull requests or issues. It will help us work together more effectively.
To contribute, send us a pull request. Please review our general Guidelines for contributing and Style guide before you start.
This section describes the development environment setup and the workflow to follow when modifying or porting Python code and when changing one of the machine learning frameworks used in the book. We follow a pre-defined Style guide to keep code quality consistent throughout the book, and we expect the same from our community contributors. You may also need to check chapters written by other contributors for this step.
All the chapter sections are generated by JupyterBook.
Before you start, you will need Python and Conda on your computer.
Add the following paths (depending on your OS) to the `PATH` environment variable if needed. On Windows, for example:
D:\Python\Python310\Scripts\
D:\Python\Python310\
D:\anaconda3\Scripts
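To check which of these directories are still missing from `PATH`, a small helper can compare them against the current value. This is only a sketch: `missing_from_path` is a hypothetical name, and the case-insensitive comparison assumes Windows path semantics.

```python
import os


def missing_from_path(required, path_value, sep=os.pathsep):
    """Return the entries of `required` that do not appear in `path_value`."""

    def norm(p):
        # Assumption: Windows-style comparison, so ignore case and trailing slashes.
        return p.strip().rstrip("\\/").lower()

    present = {norm(p) for p in path_value.split(sep) if p.strip()}
    return [d for d in required if norm(d) not in present]


if __name__ == "__main__":
    # The directories listed in this guide; adjust them to your own installation.
    wanted = [
        "D:\\Python\\Python310\\Scripts\\",
        "D:\\Python\\Python310\\",
        "D:\\anaconda3\\Scripts",
    ]
    for directory in missing_from_path(wanted, os.environ.get("PATH", "")):
        print("not on PATH:", directory)
```

Anything it prints still needs to be added to `PATH` before you continue.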
Follow the Jupyter Book official guidance to install the latest version.
draw.io is needed to generate draw.io-based diagrams at build time. Install the draw.io desktop application on your local machine. By default, the draw.io executable is located at the platform-appropriate path:
- Windows: C:\Program Files\draw.io\draw.io.exe (Attention: Don't change the installation path.)
- Linux: /opt/drawio/drawio or /opt/draw.io/drawio (older versions)
- macOS: /Applications/draw.io.app/Contents/MacOS/draw.io
In most cases, you don't need to do anything here. The executable is picked up by sphinxcontrib-drawio automatically.
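If a build fails on diagrams, it can help to confirm that draw.io really is installed at one of the default locations listed above. The following is a minimal sketch of such a check. `find_drawio` and `DEFAULT_DRAWIO_PATHS` are hypothetical names, and the lookup only mirrors the paths in this guide, not the actual search logic of sphinxcontrib-drawio.

```python
import os
import platform

# Default install locations quoted in this guide, keyed by platform.system().
DEFAULT_DRAWIO_PATHS = {
    "Windows": ["C:\\Program Files\\draw.io\\draw.io.exe"],
    "Linux": ["/opt/drawio/drawio", "/opt/draw.io/drawio"],
    "Darwin": ["/Applications/draw.io.app/Contents/MacOS/draw.io"],
}


def find_drawio(system=None, exists=os.path.isfile):
    """Return the first default draw.io path that exists, or None."""
    system = system or platform.system()
    for candidate in DEFAULT_DRAWIO_PATHS.get(system, []):
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(find_drawio() or "draw.io not found at a default location")
```

The `exists` parameter is there only so the lookup can be tried out without touching the real file system.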
Clone the source code from remote through your preferred protocol.
# through HTTPS
git clone https://github.com/ocademy-ai/machine-learning.git
Move to the working directory.
cd machine-learning/open-machine-learning-jupyter-book/
Initialize the Conda env.
# first time setup
conda env create -f environment.yml
# or update
conda env update -f environment.yml
On Mac,
Warning
You may see the TensorFlow installation failure below, especially on an ARM-based M1 Mac.
ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow
Solution:
- Comment out TensorFlow in environment.yml.
- Follow Apple's official documentation to install TensorFlow.
- Run conda env update -f environment.yml again to install the remaining dependencies.
- Optional: try uncommenting TensorFlow in environment.yml.
Warning
You may see the error below when you have trouble accessing GitHub.
error: RPC failed; curl 56 LibreSSL SSL_read: error:02FFF03C:system library:func(4095):Operation timed out, errno 60
fatal: expected flush after ref listing
Solution:
Switch to a different network. Resolve this problem before moving on so that the later steps go smoothly.
On Windows,
Warning
You may see the HTTP error below first.
An HTTP error occurred when trying to retrieve this URL. HTTP errors are often intermittent, and a simple retry will get you on your way.
Solution:
Create the .condarc conda configuration file by running:
conda config --set show_channel_urls yes
This file is in your user directory by default, for example:
C:\Users\gouha\.condarc
Delete the initial content of .condarc, then add the following content to it:
channels:
  - defaults
show_channel_urls: true
default_channels:
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
custom_channels:
  conda-forge: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
Warning
You may see the error below when you have trouble accessing GitHub.
error: RPC failed; curl 56 LibreSSL SSL_read: error:02FFF03C:system library:func(4095):Operation timed out, errno 60
fatal: expected flush after ref listing
Solution:
Switch to a different network. Resolve this problem before moving on so that the later steps go smoothly.
Warning
You may encounter download or run failures due to lack of administrator privileges.
error: Could not install packages due to an OSError: [WinError 5] Access denied. Consider using the `--user` option or check the permissions.
Solution:
Turn off administrator privileges from the command prompt:
- Run cmd as Administrator.
- Enter and run the command NET USER administrator /active:no
Warning
When you are building the book, you may encounter an error when running commands in a terminal (such as PowerShell).
error: Failed building wheel for jupyter-nbextensions-configurator
or
Unable to load file: C:\Users\87897\Documents\WindowsPowerShell\profile.ps1
Solution:
Enter the command set-ExecutionPolicy RemoteSigned, then enter Y.
Tip: You can use the command get-ExecutionPolicy to check the setting. If RemoteSigned appears, the modification was successful.
conda activate open-machine-learning-jupyter-book
# official guidance - https://jupyterbook.org/en/stable/start/build.html
# Windows
jupyter-book build .
# Mac
# if you are using bash
bash ./build.sh
# or you can rebuild everything
bash ./build-force-all.sh
Then you should be able to follow the build success message to view the book locally.
On Mac,
Warning
You may encounter the following problem when working on an ARM-based M1 Mac.
OSError: no library called "cairo-2" was found
no library called "cairo" was found
no library called "libcairo-2" was found
Solution:
- Install Homebrew.
- Fetch Homebrew sources:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Install the below missing dependencies through Homebrew:
brew install cairo pango gdk-pixbuf libxml2 libxslt libffi
- Find out the installation paths of cairo, glib and pango, and export them to DYLD_LIBRARY_PATH:
# for example
export DYLD_LIBRARY_PATH=/opt/homebrew/Cellar/cairo/1.16.0_5/lib/:/opt/homebrew/Cellar/pango/1.50.9/lib/:/opt/homebrew/Cellar/glib/2.72.3_1/lib/
How do you find these paths? Here is an example for cairo:
  - Run the command which brew.
  - If the response is /opt/homebrew/bin/brew, the Homebrew root path is /opt/homebrew/. (The result may depend on your OS.)
  - Check whether cairo, glib and pango exist in /opt/homebrew/Cellar.
  - Find the lib path for each of these libraries, such as /opt/homebrew/Cellar/cairo/1.16.0_5/lib. (Again, the result may depend on your OS.)
- Rerun jupyter-book build .
- If the error still exists, run pip uninstall xcffib and then try again.
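Looking up the lib directories by hand is error-prone because the version folders change with every Homebrew upgrade. The sketch below assembles the DYLD_LIBRARY_PATH value by scanning the Cellar. It assumes the `<Cellar>/<package>/<version>/lib` layout shown in the example above, and `brew_lib_dirs` and `dyld_library_path` are hypothetical helper names.

```python
import glob
import os


def brew_lib_dirs(cellar, packages=("cairo", "pango", "glib")):
    """Collect the <cellar>/<package>/<version>/lib directories that exist."""
    found = []
    for package in packages:
        pattern = os.path.join(cellar, package, "*", "lib")
        # sorted() keeps the order stable when several versions are installed.
        found.extend(d for d in sorted(glob.glob(pattern)) if os.path.isdir(d))
    return found


def dyld_library_path(cellar):
    """Join the discovered lib directories into a DYLD_LIBRARY_PATH value."""
    return ":".join(brew_lib_dirs(cellar))


if __name__ == "__main__":
    # Assumption: Homebrew root is /opt/homebrew, as in the example above.
    print("export DYLD_LIBRARY_PATH=" + dyld_library_path("/opt/homebrew/Cellar"))
```

If the printed value is empty, the libraries are probably installed under a different Homebrew root, so check the output of which brew first.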
On Windows,
Warning
You may encounter the following problem while working on the book.
OSError: no library called "cairo-2" was found
no library called "cairo" was found
no library called "libcairo-2" was found
Solution:
- Download GTK3.
- Run the following command.
pip uninstall xcffib
- Restart the terminal and build again.
The slides are implemented as notebooks in slides/, powered by RISE.
If you want to edit or preview the slides locally, you need to use Jupyter Notebook. Once you use Jupyter Notebook/JupyterLab to load the project, the slide will be launched in live mode after you open any corresponding notebook.
# Install javascript and css files
jupyter contrib nbextension install --user
# Enabling extensions
jupyter nbextension enable init_cell/main
# Launch the notebook
jupyter notebook
Warning
Please make sure that Jupyter Notebook is running in trusted mode and that init_cell is configured for the first cell of the slide notebook, so that the first cell is executed automatically to load the CSS.
Regarding deleting and restoring entries in the _toc.yml file:
* The _toc.yml file is located in the open-machine-learning-jupyter-book [directory](https://github.com/ocademy-ai/machine-learning/blob/main/open-machine-learning-jupyter-book/_toc.yml).
* In Jupyter Book, the _toc.yml file defines the structure of the book: its chapters, sub-chapters and page hierarchy.
* When you build your book using Jupyter Book, it reads the _toc.yml file and generates a navigation bar based on that structure.
* To speed up the local build, you can keep only the chapters you changed in _toc.yml. This shortens the build and skips errors reported by other chapters.
* However, when deleting other chapters, take care to preserve the integrity of the overall book structure; otherwise the build may report errors. When you are first getting started, it is recommended to delete one CAPTION at a time.
* After the preview, please restore the original _toc.yml structure.
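The trimming described above can also be done programmatically on a parsed copy of the file. This is a rough sketch that assumes the common Jupyter Book layout of `parts`, each with a `caption` and a list of `chapters` identified by `file`. The function name `trim_toc` and the file names in the usage example are hypothetical, and loading or saving the YAML (for example with PyYAML) is left out.

```python
import copy


def trim_toc(toc, keep_files):
    """Return a copy of a parsed _toc.yml that keeps only the chapters in keep_files."""
    trimmed = copy.deepcopy(toc)  # never modify the original structure
    keep = set(keep_files)
    parts = []
    for part in trimmed.get("parts", []):
        chapters = [c for c in part.get("chapters", []) if c.get("file") in keep]
        if chapters:
            # Drop a whole caption only when none of its chapters remain.
            part["chapters"] = chapters
            parts.append(part)
    if "parts" in trimmed:
        trimmed["parts"] = parts
    return trimmed


if __name__ == "__main__":
    # Hypothetical table of contents, not the real one from this repository.
    toc = {
        "format": "jb-book",
        "root": "intro",
        "parts": [
            {"caption": "A", "chapters": [{"file": "a/one"}, {"file": "a/two"}]},
            {"caption": "B", "chapters": [{"file": "b/three"}]},
        ],
    }
    print(trim_toc(toc, ["a/two"]))
```

Because the function works on a deep copy, the original structure stays available for restoring _toc.yml after the preview. Entries that are not identified by `file` are dropped by this sketch, so review the result before building.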
Non-consecutive header level increase
You may see the warning below when building the book:
WARNING: Non-consecutive header level increase; 0 to 2 [myst.header]
This warning is caused by a non-consecutive heading level increase in the specified Jupyter notebook file. In this case, the heading level jumps directly from 0 to 2 without going through level 1. To resolve this issue, follow these steps:
1. Open the specified Jupyter notebook file.
2. Check the heading levels, which are given by the number of '#' characters.
3. Ensure that the heading level increases one step at a time without skipping any levels.
4. If you find a non-consecutive heading level increase, adjust it to a consecutive one.
5. Save the file and re-run the build to confirm that the warning has been resolved.
6. Watch out for '---', which can be recognized as a heading. If you are converting an md file to an ipynb file and have confirmed that none of the problems above exist, delete the last '---'.
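Steps 2 to 4 can be approximated with a short script that scans the Markdown text for headings that skip a level. This is a sketch rather than the check MyST itself performs: `header_level_jumps` is a hypothetical helper, it only looks at '#' headings outside fenced code blocks, and it does not detect headings created by a '---' line.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+\S")
# Three or more backticks or tildes open or close a fenced code block.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")


def header_level_jumps(markdown_text):
    """Return (line_number, previous_level, level) for each heading that skips a level."""
    jumps = []
    previous = 0  # level 0 means no heading has been seen yet
    in_fence = False
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if FENCE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # '#' inside code blocks is a comment, not a heading
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if level > previous + 1:
            jumps.append((number, previous, level))
        previous = level
    return jumps


if __name__ == "__main__":
    sample = "## Intro\n\nSome text\n### Detail\n"
    for number, previous, level in header_level_jumps(sample):
        print(f"line {number}: heading level jumps from {previous} to {level}")
```

For a notebook, run the check on the source of each Markdown cell in order. Each reported line is a heading to promote, or a place where an intermediate heading is missing.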
Can't run the code locally
You may find that the code cannot be run locally. In that case, try uploading the notebook to Google Colab, run it there, and then download the file containing the results so that you can submit a PR locally.
Couldn't find cache key
You may encounter an error like this:
ERROR: Execution Failed: /home/runner/work/machine-learning/machine-learning/open-machine-learning-jupyter-book/data-science/data-visualization/visualization-distributions.md
ERROR: Couldn't find cache key for notebook file data-science/data-visualization/visualization-distributions.md. Outputs will not be inserted.
To solve this error, find the file and add a ' ' (a space) anywhere in it, just so that it is resubmitted. Then delete the ' ' in the next commit so that the file does not show up as changed in your PR.