OctShuffle-MLT

Repository for the paper OctShuffleMLT: A Compact Octave Based Neural Network for End-to-End Multilingual Text Detection and Recognition.

We use E2E-MLT (https://github.com/MichalBusta/E2E-MLT) as the baseline and modify it to obtain a more compact model.

To use Oct-MLT, the more robust but less compact model, check out the octmlt branch.

Requirements

Similar to E2E-MLT, we use the following datasets:

ICDAR 2019 MLT Dataset
ICDAR 2017 MLT Dataset
ICDAR 2015 Dataset
RCTW-17
Synthetic MLT Data (Arabic, Bangla, Chinese, Japanese, Korean, Latin, Hindi)
and ground truth (GT) converted to the ICDAR MLT format (see: http://rrc.cvc.uab.es/?ch=8&com=tasks) for the same languages
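As an illustration of the converted ground truth, the sketch below parses one annotation line. It assumes the ICDAR MLT text localization layout of eight corner coordinates followed by the script name and the transcription (x1,y1,x2,y2,x3,y3,x4,y4,script,transcription); check the task page linked above for the authoritative description. The names GTBox and parse_gt_line are hypothetical and are not part of this repository.

```python
from typing import List, NamedTuple, Tuple


class GTBox(NamedTuple):
    """One annotated text region (hypothetical container, not from the repo)."""
    points: List[Tuple[int, int]]  # four (x, y) corners in file order
    script: str                    # e.g. "Latin", "Arabic"
    text: str                      # transcription; may itself contain commas


def parse_gt_line(line: str) -> GTBox:
    """Parse a line assumed to be x1,y1,x2,y2,x3,y3,x4,y4,script,transcription."""
    # Drop a possible UTF-8 byte order mark and surrounding whitespace.
    cleaned = line.lstrip("\ufeff").strip()
    # Split at most 9 times so commas inside the transcription are preserved.
    parts = cleaned.split(",", 9)
    if len(parts) < 10:
        raise ValueError(f"expected 10 comma-separated fields, got {len(parts)}")
    coords = [int(float(value)) for value in parts[:8]]
    points = list(zip(coords[0::2], coords[1::2]))
    return GTBox(points=points, script=parts[8], text=parts[9])


box = parse_gt_line("10,20,110,20,110,60,10,60,Latin,Hello, world")
print(box.script, box.text, box.points)
```

Limiting the split to nine separators keeps a transcription such as "Hello, world" intact instead of cutting it at its own comma.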

Use the train.py script to start training. It has the following arguments:

-train_list: Text file listing the images used for detection training. Default='dataset/images/trainMLT.txt'
-ocr_feed_list: Text file listing the images used for recognition training. Default='dataset/crops/crops_list.txt'
-save_path: Directory where model checkpoints are saved. Default='backup'
-model: Model to load before training; if not set, training starts from scratch. Default=''
-debug: Prints some information during training. Default=0
-batch_size: Batch size for detection training. Default=32
-ocr_batch_size: Batch size for recognition training. Default=256
-num_readers: Number of readers. Default=1
-cuda: Enables use of the GPU. Default=True
-input_size: Input image size. Default=256
-base_lr: Base learning rate for the Adam optimizer. Default=0.0001
-max_iters: Maximum number of training iterations. Default=300000
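The arguments above can be mirrored with a small argparse sketch. This is not the parser from train.py, which may declare types differently; it only reproduces the documented names and defaults. The helpers str2bool and build_parser are hypothetical, and treating -cuda as a true/false string is an assumption.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: turn 'true'/'false'-style strings into a bool."""
    text = str(value).strip().lower()
    if text in ("1", "true", "yes"):
        return True
    if text in ("0", "false", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser with the argument names and defaults listed above."""
    parser = argparse.ArgumentParser(description="Training options (sketch)")
    parser.add_argument("-train_list", default="dataset/images/trainMLT.txt")
    parser.add_argument("-ocr_feed_list", default="dataset/crops/crops_list.txt")
    parser.add_argument("-save_path", default="backup")
    parser.add_argument("-model", default="")
    parser.add_argument("-debug", type=int, default=0)
    parser.add_argument("-batch_size", type=int, default=32)
    parser.add_argument("-ocr_batch_size", type=int, default=256)
    parser.add_argument("-num_readers", type=int, default=1)
    # Assumption: the flag takes an explicit value such as "true" or "false".
    parser.add_argument("-cuda", type=str2bool, default=True)
    parser.add_argument("-input_size", type=int, default=256)
    parser.add_argument("-base_lr", type=float, default=0.0001)
    parser.add_argument("-max_iters", type=int, default=300000)
    return parser


# Example: override two options and keep every other default.
args = build_parser().parse_args(["-batch_size", "16", "-cuda", "false"])
print(args.batch_size, args.cuda, args.train_list)
```

Passing an explicit list to parse_args keeps the example independent of the real command line; omitting the list would read the options from sys.argv instead.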
