A UNet (or any other FCN)-based repo for segmentation, especially for binarization.

This repo also holds the official code for "Alleviating pseudo-touching in attention U-Net-based binarization approach for the historical Tibetan document images" (deprecated) and for order prediction in IACC-DAR-AlphX-Code.

Introduction

This project aims to provide an image segmentation solution that can be used in many fields, e.g. document binarization. The repo is simple, efficient, and flexible; you can modify anything you want.

Installation

Ensure that the following Python packages have been installed:

pip install numpy
pip install torch
pip install torchvision
pip install opencv-python
pip install tensorboard
pip install tqdm

Alternatively, just pip install whichever packages are missing.
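To see which of the packages above are still missing, a small check along these lines may help. This is only a sketch: the helper name missing_packages is hypothetical and not part of this repo, and it relies on the fact that opencv-python is imported as cv2.

```python
import importlib.util

# Map import name -> pip package name for the packages listed above.
# opencv-python is imported as cv2; the others use the same name for both.
REQUIRED = {
    "numpy": "numpy",
    "torch": "torch",
    "torchvision": "torchvision",
    "cv2": "opencv-python",
    "tensorboard": "tensorboard",
    "tqdm": "tqdm",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print a ready-to-copy install command for the missing packages only.
        print("pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```

Running the script prints a single pip install command covering only the packages that could not be found.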

Usage

Prepare the data

Set imgs_dir and masks_dir to the directories that hold your input images and your ground-truth masks.

The --input option is the path to the input image, and --output is the path where the model will write its output.
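Before training, it can be useful to confirm that every image has a mask. The sketch below assumes that an image and its mask share the same file stem (for example page1.png in both directories); the repo does not state its naming convention, so treat that as an assumption. The helper name pair_images_and_masks is hypothetical.

```python
from pathlib import Path


def pair_images_and_masks(imgs_dir, masks_dir):
    """Match each image to the mask that has the same file stem.

    Assumption: image and mask share a stem; file extensions may differ.
    Returns (pairs, unmatched), where pairs is a list of (image, mask)
    paths and unmatched lists the images that have no mask.
    """
    masks = {p.stem: p for p in Path(masks_dir).iterdir() if p.is_file()}
    pairs, unmatched = [], []
    for img in sorted(Path(imgs_dir).iterdir()):
        if not img.is_file():
            continue
        mask = masks.get(img.stem)
        if mask is None:
            unmatched.append(img)
        else:
            pairs.append((img, mask))
    return pairs, unmatched
```

If unmatched is not empty, fix the dataset before passing the two directories as imgs_dir and masks_dir.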

Training the Model

If you wish to train the model or use your own dataset, follow these steps:

Prepare your data as described above.
Navigate to the base directory in the terminal and run the following command:

python train.py

Args :

--imgs_dir: Directory of input images

--masks_dir: Directory of GT masks

--dir_checkpoint: Directory to save the checkpoints.

--input_size: Size of input images

--epoch: Number of epochs for training

--batch_size: Batch size for training

--val_percent: Percentage of the dataset held out for validation

--lr: Learning rate for training

--weight_decay: Weight decay factor for training

--momentum: Momentum factor for the optimizer

Add any of these arguments to the command as needed.
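As an illustration of combining these arguments, the sketch below assembles a train.py command line from keyword options. The flag names come from the list above; every path and value is a placeholder, the expected value formats are an assumption, and build_train_command is a hypothetical helper, not part of this repo.

```python
import shlex
import sys


def build_train_command(**options):
    """Turn keyword options into a train.py command line (list of strings)."""
    cmd = [sys.executable, "train.py"]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


cmd = build_train_command(
    imgs_dir="data/imgs",          # hypothetical path
    masks_dir="data/masks",        # hypothetical path
    dir_checkpoint="checkpoints",  # hypothetical path
    input_size=256,                # placeholder values from here on
    epoch=50,
    batch_size=8,
    lr=1e-4,
)

# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
# To launch training from Python instead: subprocess.run(cmd, check=True)
```

Arguments that are left out fall back to whatever defaults train.py defines.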

Model Inference

python infer.py

Args :

--imgs_dir: The directory where input images are located.

--out_dir: The directory to save the output images after processing.

--model_pth: The path to the trained network model.

--batch_size: The number of images to process in each batch.

--patch_size: The size of each image patch to be processed; 256 is recommended.

--bitwise_img_size: The size of the images after bitwise operations. We recommend setting this value as large as possible.

Add any of these arguments to the command as needed.
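To give a feel for what --patch_size controls, the sketch below shows one common way to tile a large image into fixed-size patches. It is an assumption about patch-based inference in general, not a description of infer.py, which may pad, overlap, or resize patches differently. The function name patch_boxes is hypothetical.

```python
def patch_boxes(height, width, patch_size=256):
    """Return (top, left, bottom, right) boxes that tile a height x width image.

    Boxes on the right and bottom edges are clipped to the image bounds,
    so edge patches may be smaller than patch_size.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    boxes = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            boxes.append(
                (
                    top,
                    left,
                    min(top + patch_size, height),
                    min(left + patch_size, width),
                )
            )
    return boxes


boxes = patch_boxes(600, 500, patch_size=256)
print(len(boxes))  # 3 rows x 2 columns = 6 patches
```

Each box can be used to crop a patch, run the model on it, and write the prediction back into the same region of the output image.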
