If you have cloned Tesseract from GitHub, you must generate the configure script.
If you have a Tesseract 4.0x installation on your system, please remove it before starting a new build.
You need Leptonica 1.74.2 (minimum) for Tesseract 4.0x.
Known dependencies for the training tools (excluding Leptonica):
- compiler with C++17 support
- automake
- pkg-config
- pango-devel
- cairo-devel
- icu-devel
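Before running the build, it can help to check that the tools and libraries listed above are visible. The following is a minimal sketch, not part of the Tesseract build itself: the helper names `missing_tools` and `missing_modules` are hypothetical, and the pkg-config module names in the usage note are assumptions that may differ between distributions.

```shell
#!/bin/sh
# Print the name of every given command that is NOT found on PATH,
# one per line. Prints nothing when all commands are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Print the name of every given pkg-config module that is NOT found,
# one per line. If pkg-config itself is missing, every module is reported.
missing_modules() {
    for mod in "$@"; do
        pkg-config --exists "$mod" 2>/dev/null || printf '%s\n' "$mod"
    done
}
```

For example, `missing_tools automake pkg-config` lists the build tools that still need installing, and `missing_modules lept pango cairo icu-uc icu-i18n` does the same for the libraries (module names assumed).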
The steps for building Tesseract are:
./autogen.sh
./configure
make
sudo make install
sudo ldconfig
make training
sudo make training-install
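If you script the steps above, it is worth stopping at the first failing command, since each step depends on the previous one. The following is a sketch only; `run_steps` is a hypothetical helper and not something the Tesseract sources provide.

```shell
#!/bin/sh
# Run each argument as a shell command, in order.
# Echo every step before running it; stop and return 1 at the first failure.
run_steps() {
    for step in "$@"; do
        printf '==> %s\n' "$step"
        sh -c "$step" || {
            printf 'failed: %s\n' "$step" >&2
            return 1
        }
    done
}
```

A possible invocation from the source directory would be `run_steps ./autogen.sh ./configure make "sudo make install" "sudo ldconfig"`.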
You need to install at least the English language and OSD traineddata files into the TESSDATA_PREFIX directory.
You can retrieve a single file with tools such as wget, curl, GithubDownloader, or a browser.
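As an illustration, the two required files could be fetched with curl roughly as follows. This is a sketch under assumptions: the raw-download URL pattern and the `main` branch name are assumed rather than taken from this page, and `traineddata_url` and `fetch_traineddata` are hypothetical helper names.

```shell
#!/bin/sh
# Build the download URL for one language's traineddata file.
# Assumption: files are served from the "main" branch of the tessdata repository.
traineddata_url() {
    printf 'https://github.com/tesseract-ocr/tessdata/raw/main/%s.traineddata\n' "$1"
}

# Download the given languages into the directory named by TESSDATA_PREFIX.
# Returns 1 if TESSDATA_PREFIX is unset or any download fails.
fetch_traineddata() {
    if [ -z "${TESSDATA_PREFIX:-}" ]; then
        printf 'TESSDATA_PREFIX is not set\n' >&2
        return 1
    fi
    mkdir -p "$TESSDATA_PREFIX" || return 1
    for lang in "$@"; do
        # -f: fail on HTTP errors, -L: follow redirects, -o: write to this file
        curl -fL -o "$TESSDATA_PREFIX/$lang.traineddata" \
            "$(traineddata_url "$lang")" || return 1
    done
}
```

With TESSDATA_PREFIX exported, `fetch_traineddata eng osd` would download the English and OSD files. You may need write permission for that directory.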
All language data files can be retrieved from the git repository (useful only for packagers!). The repository is huge, more than 1.2 GB. You do NOT need to download traineddata files for all languages.
git clone https://github.com/tesseract-ocr/tessdata.git tesseract-ocr.tessdata
You need an Internet connection and curl to compile ScrollView.jar, because the build automatically downloads piccolo2d-core-3.0.1.jar, piccolo2d-extras-3.0.1.jar, and jaxb-api-2.3.1.jar and places them in tesseract/java.
Just run:
make ScrollView.jar
and follow the instructions on Viewer Debugging.
There is an alternative build system based on cross-platform CMake:
mkdir build
cd build && cmake .. && make
sudo make install
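The same CMake build can also be driven entirely through the `cmake` command, which makes it easy to choose an install prefix. This is a sketch, not the project's documented procedure: `cmake_build` is a hypothetical helper, and it assumes CMake 3.15 or newer for the `-S`, `-B`, and `--install` options.

```shell
#!/bin/sh
# Configure, build, and install a CMake project out of source.
#   $1: path to the source directory
#   $2: install prefix (passed as CMAKE_INSTALL_PREFIX)
# The build tree is placed in "$1/build"; stops at the first failing step.
cmake_build() {
    src_dir=$1
    prefix=$2
    cmake -S "$src_dir" -B "$src_dir/build" \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_INSTALL_PREFIX="$prefix" &&
    cmake --build "$src_dir/build" &&
    cmake --install "$src_dir/build"
}
```

For instance, `cmake_build . "$HOME/.local"` installs into a user-writable prefix, so the install step does not need sudo.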
You need to use Leptonica with the CMake patch:
git clone https://github.com/DanBloomberg/leptonica.git
cd leptonica
mkdir build
cd build
cmake ..
cmake --build .
cd ..\..
git clone https://github.com/tesseract-ocr/tesseract.git
cd tesseract
mkdir build
cd build
cmake .. -DLeptonica_BUILD_DIR=\abs\path\to\leptonica\build
cmake --build .
Another build system is available that uses only regular makefiles and should work on any Unix-like system.
To configure install paths, library locations, etc., edit the simplemake/config.mk file.
$ cd simplemake
$ make
$ make training # (optional)
Please read http://vorba.ch/2014/tesseract-3.03-vs2013.html
See the documentation for more information on this.