add SVTR LCNet PP-OCRv3 CH model #614

tonytonglt · 2023-11-17T08:23:37Z

Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:

You have read the Contributing Guidelines on pull requests
Your code builds clean without any errors or warnings
You are using approved terminology
You have added unit tests

Motivation

(Write your motivation for proposed changes here.)

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)

panshaowu · 2023-11-18T04:22:25Z

configs/rec/svtr/README_CN_PP-OCRv3.md

+
+## 2. 权重转换
+
+如您已经有采用PaddleOCR训练好的PaddlePaddle模型，想在MindOCR下直接进行推理或进行微调续训，您可以对训练好的模型转换为MindSpore格式的ckpt文件。


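Wording: "对训练好的模型转换为" -> "将训练好的模型转换为" (use 将 instead of 对 so the sentence reads "convert the trained model to ...").

Fixed; the same change was made in the DBNet README.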

panshaowu · 2023-11-18T09:35:16Z

mindocr/data/transforms/svtr_transform.py

+        data_out = dict()
+        data_out["img_path"] = data.get("img_path", None)
+        data_out["image"] = data["image"]
+        ctc = self.ctc_encode.__call__(data_ctc)


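Why does this have to be invoked through `__call__`?

The instances in the preprocessing transform_pipeline are created up front, and each batch is then preprocessed step by step with transform(data), so the call is written in the `__call__` form here. Reference code: https://github.com/mindspore-lab/mindocr/blob/main/mindocr/data/transforms/transforms_factory.py#L68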
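For readers unfamiliar with the pattern described in the reply, here is a minimal sketch of a pre-instantiated transform list applied one step at a time. The classes `ToUpper` and `AddLength` and the helper `apply_pipeline` are hypothetical and are not taken from MindOCR. The sketch also shows that, for an ordinary class instance, `obj(data)` and `obj.__call__(data)` produce the same result.

```python
# Hypothetical transforms; names are illustrative only, not MindOCR code.
class ToUpper:
    def __call__(self, data: dict) -> dict:
        out = dict(data)  # copy so the caller's dict is not mutated
        out["text"] = out["text"].upper()
        return out


class AddLength:
    def __call__(self, data: dict) -> dict:
        out = dict(data)
        out["length"] = len(out["text"])
        return out


def apply_pipeline(data: dict, transforms: list) -> dict:
    """Apply already-instantiated transforms in order, one call per step."""
    for transform in transforms:
        data = transform(data)  # same as transform.__call__(data)
    return data


# Instances are built once, then reused for every sample.
pipeline = [ToUpper(), AddLength()]
result = apply_pipeline({"text": "abc"}, pipeline)

# Both call styles give the same result for an ordinary instance.
same = ToUpper()({"text": "x"}) == ToUpper().__call__({"text": "x"})
```

Under these assumptions, nesting one transform inside another (as `ctc_encode` is nested in the diff above) only requires that the inner object be callable on the same kind of `data` dict.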

panshaowu · 2023-11-18T10:31:38Z

mindocr/data/transforms/svtr_transform.py

+    return padding_im, valid_ratio, valid_width_mask
+
+
+def resize_norm_img(


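There should already be a function, or a combination of functions, that does the same thing. Please refactor now or in a follow-up to avoid redundant code.

This case is somewhat special: the valid_ratio from the reference code has to be converted into valid_width_mask, to avoid the dynamic shape that slicing a tensor by a dynamic length would introduce.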
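The idea in the reply can be sketched in plain Python. This is an illustration only, not the code from `svtr_transform.py`: the function names are hypothetical, and the rule `valid_width = min(width, ceil(width * valid_ratio))` is an assumption about how a ratio maps to a column count. The point is that a 0/1 mask always has the full width, whereas slicing by `valid_width` yields a length that changes from sample to sample.

```python
import math


def width_mask(valid_ratio: float, width: int) -> list:
    """Return a fixed-length 0/1 mask marking the valid (non-padded) columns.

    Assumption: the number of valid columns is ceil(width * valid_ratio),
    clipped to the full width.
    """
    valid_width = min(width, math.ceil(width * valid_ratio))
    return [1 if i < valid_width else 0 for i in range(width)]


def masked_mean(features: list, mask: list) -> float:
    """Average only the valid columns without slicing the sequence."""
    total = sum(f * m for f, m in zip(features, mask))
    return total / max(sum(mask), 1)  # guard against an all-zero mask


mask = width_mask(0.5, 8)  # half of 8 columns are valid
mean = masked_mean([2.0, 4.0, 100.0, 100.0], [1, 1, 0, 0])
```

Multiplying by the mask keeps every intermediate value at the same length, which is the property a static-shape graph needs.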

panshaowu · 2023-11-18T10:34:06Z

mindocr/models/backbones/rec_svtr_enhance.py

+        return out
+
+
+class SEModule(nn.Cell):


How does this differ from the SE module under the utils folder? Can that code be reused?

Fixed; the SEModule under utils is now reused.

panshaowu · 2023-11-18T10:35:10Z

mindocr/models/heads/rec_multi_head.py

+        return result
+
+
+class ConvBNLayer(nn.Cell):


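There may be similar code under the mindcv folder. Please check whether it can be reused, to avoid and reduce redundant code.
The MLP and other code in this folder should be reviewed in the same way.

This case is somewhat special because it needs to decide whether to use the custom Swish activation function. In addition, mindcv is normally used to import whole models as backbones; single-purpose modules are rarely pulled in from it.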

panshaowu · 2023-11-18T12:25:21Z

mindocr/models/heads/rec_sar_head.py

+            valid_width_masks = img_metas[-1]
+
+        lab_embedding = self.embedding(label)
+        # bsz * seq_len * emb_dim


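Code that is no longer needed should be deleted outright rather than kept as comments. The same applies below.

These are mainly comments on tensor shapes, kept to make the code easier to read and debug. They have been changed to trailing comments after the code.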

panshaowu · 2023-11-18T12:35:09Z

tools/infer/text/postprocess.py

-                    character_dict_path=dict_path,
-                    use_space_char=False,
-                )
+                dict_path = "mindocr/utils/dict/ch_dict.txt" if algo == "CRNN_CH" or algo == "SVTR_PPOCRv3_CH" else None


Suggest `if algo in ["CRNN_CH", "SVTR_PPOCRv3_CH"]`, which is easier to extend later.
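A minimal sketch of the suggested form, assuming the dictionary path and algorithm names shown in the diff above. The constant `CH_DICT_ALGOS` and the function `select_dict_path` are hypothetical names introduced for illustration, not part of `tools/infer/text/postprocess.py`.

```python
# Hypothetical helper illustrating the membership test from the review.
CH_DICT_PATH = "mindocr/utils/dict/ch_dict.txt"
CH_DICT_ALGOS = ("CRNN_CH", "SVTR_PPOCRv3_CH")  # add new Chinese models here


def select_dict_path(algo: str):
    """Return the Chinese dictionary path for listed algorithms, else None."""
    return CH_DICT_PATH if algo in CH_DICT_ALGOS else None


ch_path = select_dict_path("SVTR_PPOCRv3_CH")
other_path = select_dict_path("CRNN")
```

With this shape, supporting another algorithm means adding one entry to the tuple instead of chaining another `or` comparison.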

tonytonglt force-pushed the main branch 4 times, most recently from d5f437e to 39839c7 Compare November 17, 2023 08:40

tonytonglt requested review from SamitHuang, HaoyangLee and panshaowu November 17, 2023 08:41

panshaowu reviewed Nov 18, 2023

tonytonglt force-pushed the main branch 2 times, most recently from c09c363 to 701cd54 Compare November 20, 2023 01:49

add SVTR LCNet PP-OCRv3 CH model

445f8fe

tonytonglt force-pushed the main branch from 701cd54 to 445f8fe Compare November 20, 2023 02:12

panshaowu approved these changes Nov 20, 2023

VictorHe-1 approved these changes Nov 20, 2023

tonytonglt merged commit 1ca2764 into mindspore-lab:main Nov 20, 2023
2 checks passed
