Depixelizer

AI model to remove pixel noise from text images.

Description

Depixelizer is a convolutional autoencoder. Its goal is to learn a function that reconstructs clean text images from their "pixeled" versions.

To train the model, I generate two kinds of text images: clean and pixeled. A clean image has black text on a white background; the corresponding pixeled image is the same image with pixel noise added. You can download the dataset from this directory, use your own dataset, or generate a new one with the script in the Colab notebook.

In my experiments I generated images using only one font (Roboto). To improve the model's ability to generalize, you can use multiple fonts.

Input (x) Desired output (y)

Dataset Structure

  • dataset
    • clean
      • 20,000 clean images (each file is named <ID>.png)
    • pixeled
      • 20,000 pixeled images (each file is named <ID>.png)
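
Given the layout above, training pairs can be built by matching file names across the two folders. The following is a minimal sketch using only the standard library; the function name `paired_image_paths` is hypothetical and not part of this repository, and it assumes that a pixeled image and its clean counterpart share the same `<ID>.png` name.

```python
import tempfile
from pathlib import Path


def paired_image_paths(root):
    """Return sorted (pixeled_path, clean_path) pairs that share a file name.

    Hypothetical helper: assumes root/clean and root/pixeled both contain
    <ID>.png files, as in the dataset structure described above.
    """
    root = Path(root)
    clean = {p.name: p for p in (root / "clean").glob("*.png")}
    pixeled = {p.name: p for p in (root / "pixeled").glob("*.png")}
    # Keep only IDs present in both folders; unmatched files are skipped.
    shared = sorted(clean.keys() & pixeled.keys())
    return [(pixeled[name], clean[name]) for name in shared]


# Small demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for folder, ids in (("clean", [0, 1, 2]), ("pixeled", [0, 1])):
        (Path(tmp) / folder).mkdir()
        for i in ids:
            (Path(tmp) / folder / f"{i}.png").touch()
    pairs = paired_image_paths(tmp)
    pair_names = [(x.name, y.name) for x, y in pairs]
```

Each pair is ordered as (input x, desired output y), so it can be passed directly to whatever image loader you use. Image `2.png` is dropped in the demonstration because it has no pixeled counterpart.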

Convolutional AutoEncoder Model

Layer (type:depth-idx)                   Param #
=================================================================
AutoEncoder                              --
├─Sequential: 1-1                        --
│    └─Conv2d: 2-1                       168
│    └─ReLU: 2-2                         --
│    └─MaxPool2d: 2-3                    --
│    └─Conv2d: 2-4                       660
│    └─ReLU: 2-5                         --
│    └─MaxPool2d: 2-6                    --
│    └─Conv2d: 2-7                       2,616
│    └─ReLU: 2-8                         --
│    └─MaxPool2d: 2-9                    --
├─Sequential: 1-2                        --
│    └─ConvTranspose2d: 2-10             1,164
│    └─ReLU: 2-11                        --
│    └─ConvTranspose2d: 2-12             294
│    └─ReLU: 2-13                        --
│    └─ConvTranspose2d: 2-14             75
├─MSELoss: 1-3                           --
=================================================================
Total params: 4,977
Trainable params: 4,977
Non-trainable params: 0
=================================================================
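
The summary above lists layer types and parameter counts but not the layer hyperparameters. The PyTorch sketch below is one configuration that reproduces those counts exactly. It is not taken from the repository: the 3-channel input, the channel widths (6, 12, 24), the 3x3 convolution kernels, and the 2x2 transposed-convolution kernels are inferred from the parameter counts, while `padding=1`, `stride=2`, the 2x2 pooling window, and the 64x64 test input are assumptions chosen so that the output has the same size as the input.

```python
import torch
from torch import nn


class AutoEncoder(nn.Module):
    """Sketch of a convolutional autoencoder matching the summary's parameter counts."""

    def __init__(self) -> None:
        super().__init__()
        # Encoder: each stage halves the spatial size via 2x2 max pooling.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 6, kernel_size=3, padding=1),    # 3*6*9 + 6 = 168
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(6, 12, kernel_size=3, padding=1),   # 6*12*9 + 12 = 660
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(12, 24, kernel_size=3, padding=1),  # 12*24*9 + 24 = 2,616
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Decoder: each transposed convolution doubles the spatial size.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(24, 12, kernel_size=2, stride=2),  # 24*12*4 + 12 = 1,164
            nn.ReLU(),
            nn.ConvTranspose2d(12, 6, kernel_size=2, stride=2),   # 12*6*4 + 6 = 294
            nn.ReLU(),
            nn.ConvTranspose2d(6, 3, kernel_size=2, stride=2),    # 6*3*4 + 3 = 75
        )
        # Reconstruction loss between the prediction and the clean image.
        self.loss_fn = nn.MSELoss()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


model = AutoEncoder()
n_params = sum(p.numel() for p in model.parameters())

# A random batch standing in for a pixeled image; 64x64 is an arbitrary size
# divisible by 8, so three poolings and three upsamplings restore the shape.
x = torch.rand(1, 3, 64, 64)
y = model(x)
```

With these settings the encoder reduces a 64x64 input to an 8x8 feature map with 24 channels, and the decoder restores it to 64x64. Input sizes that are not divisible by 8 would come back slightly smaller under these assumptions.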

Model prediction examples

Todo

  • Improve prediction quality
  • Data augmentation
  • Use more fonts to generate the dataset

Authors

Samir Salman