Support for Handwritten text #1049

harindercnvrg · 2022-09-07T12:55:30Z

🚀 The feature

Addition of new / Fine Tuning of existing models to support OCR for Handwritten Text.
As a first step we can start with detection/prediction models that work specifically for Handwritten Text and down the line we can launch a model that works well for both Handwritten and Typed text.

Motivation, pitch

Thousands of forms, documents and notes are scanned and stored in archives but are not accessible by search. This feature would make it possible to digitise documents that contain handwritten text and to search them.

Alternatives

No response

Additional context

No response

frgfm · 2022-09-08T09:35:30Z

Hi @harindercnvrg 👋

That's indeed a long term goal for us! I would suggest dissecting the problem as follows:

- Character classification on handwritten characters
- Figure out our class structure for both handwritten & typed characters (i.e. should we double the number of output classes so that each class also says whether the character is typed or handwritten, or should typed & handwritten share the same symbol)
- Text recognition on handwritten text
- Text detection on handwritten text
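
The two class-structure options in the second step can be sketched as follows. This is only an illustration, not docTR code: the tiny character set and the names `shared_classes`, `doubled_classes` and `decode_doubled` are assumptions made up for the example.

```python
import string

# Assumption: a small illustrative character set; real vocabularies are larger.
BASE_VOCAB = string.ascii_lowercase + string.digits

# Option A: one class per symbol; typed and handwritten share the same index.
shared_classes = {ch: i for i, ch in enumerate(BASE_VOCAB)}

# Option B: two classes per symbol, so the class index also encodes the style.
STYLES = ("typed", "handwritten")
doubled_classes = {
    (ch, style): i
    for i, (style, ch) in enumerate((s, c) for s in STYLES for c in BASE_VOCAB)
}


def decode_doubled(index):
    """Map a class index from option B back to (character, style)."""
    style_id, char_id = divmod(index, len(BASE_VOCAB))
    return BASE_VOCAB[char_id], STYLES[style_id]
```

Option B doubles the output layer but lets the model report the style of each character; option A keeps the output size unchanged and leaves the style implicit.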

harindercnvrg · 2022-09-12T15:47:45Z

@frgfm thank you for detailing the steps. Also, I was wondering if we cannot work with a model that is able to work with both digital and handwritten text, would it be possible to include an extra layer of classification and use separate models for handwritten and digital text based on the classification?
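
The routing idea above could look roughly like the sketch below. Everything here is hypothetical: `make_router`, the style labels and the toy classifier and recognizers are placeholders for illustration and are not part of docTR.

```python
from typing import Callable, Dict

Crop = bytes  # stand-in for an image crop; a real pipeline would use an array


def make_router(
    classify_style: Callable[[Crop], str],
    recognizers: Dict[str, Callable[[Crop], str]],
) -> Callable[[Crop], str]:
    """Build a recognizer that first classifies the style, then dispatches."""

    def recognize(crop: Crop) -> str:
        style = classify_style(crop)
        if style not in recognizers:
            raise KeyError(f"no recognizer registered for style {style!r}")
        return recognizers[style](crop)

    return recognize


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_style_classifier(crop: Crop) -> str:
    return "handwritten" if crop.startswith(b"hw:") else "typed"


def toy_typed_recognizer(crop: Crop) -> str:
    return "typed:" + crop.decode()


def toy_handwritten_recognizer(crop: Crop) -> str:
    return "handwritten:" + crop[3:].decode()


recognize = make_router(
    toy_style_classifier,
    {"typed": toy_typed_recognizer, "handwritten": toy_handwritten_recognizer},
)
```

One trade-off of this design is that any mistake made by the style classifier is passed on to the wrong recognizer, so the extra step adds its own error source as well as extra latency.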

harindercnvrg · 2022-09-12T16:02:13Z

@frgfm also for reference CRAFT + TrOCR
The model works fine for handwritten and digital text. It was fine tuned on the IAM dataset.

frgfm · 2022-09-16T10:22:46Z

> Also, I was wondering if we cannot work with a model that is able to work with both digital and handwritten text

Yes of course, but I suggested handwritten only first because if that first step doesn't work, it's extremely unlikely that handling both will work 😅

> @frgfm also for reference CRAFT + TrOCR
> The model works fine for handwritten and digital text. It was fine tuned on the IAM dataset.

Thanks a lot for the heads up 🙏 I added those to the wishlist of new model implementations for docTR (#1007)

felixdittrich92 · 2022-09-16T11:25:17Z

@frgfm I have to disagree with both model additions. CRAFT is really not a performance beast (with a VGG backbone), and let's not talk about TrOCR... this is a beast from Microsoft to show how big a model can be to perform OCR 😆 Now back to the facts: TrOCR uses RoBERTa as its decoder, and we really don't want to integrate a big LM. I also think we would not be able to train it from scratch (that would only be possible if we took hf transformers as a dependency).
Another point is that TrOCR is really slow even on an actual GPU, and it does not work at the character level: it needs sentences.

ParSeq would also be a good fit for handwritten text (there it is solved in the decoding strategy, without using any big LM)

frgfm · 2022-09-21T11:35:03Z

Yeah you're right! We'll have to filter the models once we have gathered all requests (& compare them)

tobiascornille · 2023-05-13T20:26:31Z

What is the status on this? If I understand correctly, some handwriting datasets are already added (#587), so is this issue still relevant?

felixdittrich92 · 2023-05-14T10:13:08Z

Hi @tobiascornille 👋 ,

Yes, it is still not solved: we have some architectures that should be able to perform well on handwritten text (sar, master, vitstr), but we lack training data.
Imgur5k contains handwritten samples but is too small overall, so it could only be used for validation.
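
Using a small handwriting set purely for validation could amount to something like the sketch below, which scores predicted strings against ground truth with character error rate and exact match. This is a generic illustration, not docTR's own metric code; `edit_distance` and `evaluate` are names chosen for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def evaluate(predictions, targets):
    """Return character error rate and exact-match rate over a validation set."""
    pairs = list(zip(predictions, targets))
    total_edits = sum(edit_distance(p, t) for p, t in pairs)
    total_chars = sum(len(t) for _, t in pairs)
    exact = sum(p == t for p, t in pairs)
    return {
        "cer": total_edits / max(total_chars, 1),
        "exact_match": exact / max(len(pairs), 1),
    }
```

For example, `evaluate(["hello", "wrld"], ["hello", "world"])` compares two predicted words against their labels, where the second prediction is missing one character.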

tobiascornille · 2023-05-14T11:53:08Z

@felixdittrich92 Good to know. I will be trying to collect some handwriting data in the coming months, so I might be able to contribute to this then.

One more question: are the current models already trained on Imgur5k? This might actually be problematic for some use cases, since the dataset is licensed under CC BY-NC 4.0 (see https://github.com/facebookresearch/IMGUR5K-Handwriting-Dataset/blob/main/LICENSE).

tobiascornille · 2023-05-14T12:38:17Z

@felixdittrich92 Have you considered adding the IIIT-HWS dataset? It's a synthetic dataset, but considering it has 9M words and ~750 fonts, it seems promising. The first author is also the same guy behind IMGUR5k and TextStyleBrush, @kris314

felixdittrich92 · 2023-05-14T20:01:28Z

@tobiascornille the current pretrained models are trained on a custom dataset (internal Mindee data) :)

About the dataset request (it looks like MJSynth, but only with fonts that look handwritten!?):
I didn't see any license in the repository... if it's freely available and we could add it to doctr in a similar way to MJSynth, then we'd be happy to do that. Would you be interested in working on it? :)

tobiascornille · 2023-05-14T20:24:26Z

@felixdittrich92 I've opened an issue (kris314/hwnet#7). Let's see if the author responds.

And yes, this would be very relevant for a project I'm working on. It's a side project, so I cannot commit to any timeline, but I'd like to collect some data and fine-tune a handwriting model this summer.

ffalkenberg · 2023-10-04T09:22:16Z

@tobiascornille any progress on the fine-tuned handwriting model? ✌️

tobiascornille · 2023-10-09T07:57:17Z

> @tobiascornille any progress on the fine-tuned handwriting model? ✌️

Hey @felixdittrich92 , I'm afraid not. For the side project I ended up using a cloud provider because it was faster :/
Will post again if anything changes.

ArsalanYounus007 · 2024-01-15T14:44:48Z

Hi,

I am about to start training on handwriting and wanted to ask whether anyone has made any progress on this.
An update would be really helpful.

Thank you.

felixdittrich92 · 2024-02-09T08:30:24Z

@odulcy-mindee Do you have internal datasets we could use to train one detection and one recognition model for each backend ? :) (pinned to 2.0.0 so no stress 😅)

sh1man999 · 2024-04-14T21:25:09Z

Is there still no support for handwritten text? Can someone tell me which project is currently relevant in this direction?

harindercnvrg added the "type: enhancement" label on Sep 7, 2022

frgfm added the "topic: character classification" and "type: new feature" labels and removed the "type: enhancement" label on Sep 8, 2022

frgfm added this to the 1.0.0 milestone Sep 8, 2022

tobiascornille mentioned this issue May 14, 2023: License? kris314/hwnet#7 (Open)

felixT2K mentioned this issue May 16, 2023: [datasets] Add IIIT HWS dataset #1199 (Merged)

felixdittrich92 modified the milestones: 1.0.0, 2.0.0 Feb 9, 2024
