add SVTR LCNet PP-OCRv3 CH model #614

Merged 1 commit, Nov 20, 2023

tonytonglt
Collaborator

Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:

Motivation

(Write your motivation for proposed changes here.)

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)


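## 2. Weight Conversion

If you already have a PaddlePaddle model trained with PaddleOCR and want to run inference or continue fine-tuning directly in MindOCR, you can convert the trained model into a MindSpore-format ckpt file.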
Collaborator

Wording fix in the Chinese README sentence: "对训练好的模型转换为" -> "将训练好的模型转换为"

Collaborator Author

Fixed. The DBNet README has been updated to match.

data_out = dict()
data_out["img_path"] = data.get("img_path", None)
data_out["image"] = data["image"]
ctc = self.ctc_encode.__call__(data_ctc)
Collaborator

Why does this have to be invoked through __call__?

Collaborator Author

The list of preprocessing transform_pipeline instances is generated in advance, and each batch is then preprocessed step by step with transform(data), so this has to be written in the __call__ form. Reference code: https://github.com/mindspore-lab/mindocr/blob/main/mindocr/data/transforms/transforms_factory.py#L68
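For readers unfamiliar with the pattern described in this reply, the following is a minimal sketch of a pre-built transform list applied step by step with transform(data). It is not MindOCR's actual code: the names AddOne, Double, and run_transforms are hypothetical and exist only for illustration. It also shows that, for an ordinary Python class, the plain call syntax and an explicit __call__ return the same result.

```python
class AddOne:
    """Hypothetical transform: adds 1 to data["value"]."""

    def __call__(self, data):
        out = dict(data)  # copy so the input dict is not mutated
        out["value"] = out["value"] + 1
        return out


class Double:
    """Hypothetical transform: doubles data["value"]."""

    def __call__(self, data):
        out = dict(data)
        out["value"] = out["value"] * 2
        return out


def run_transforms(data, transforms):
    """Apply each pre-built transform instance in order via transform(data)."""
    for transform in transforms:
        data = transform(data)
    return data


# The instance list is created once, then reused for every sample or batch.
pipeline = [AddOne(), Double()]
result = run_transforms({"value": 3}, pipeline)

# For a plain Python class, obj(x) dispatches to __call__, so both spellings agree.
encoder = AddOne()
same = encoder({"value": 3}) == encoder.__call__({"value": 3})
```

Because each transform only needs to be callable, a transform can also hold other transform instances as attributes and invoke them inside its own __call__.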

return padding_im, valid_ratio, valid_width_mask


def resize_norm_img(
Collaborator

A function, or a combination of functions, with the same behavior should already exist. Please refactor in this PR or a follow-up to avoid redundant code.

Collaborator Author

This case is somewhat special: the valid_ratio from the reference code has to be converted into valid_width_mask, to avoid the dynamic shape that slicing a tensor by a dynamic length would introduce.
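The idea in this reply can be sketched in plain Python. This is an illustration under stated assumptions, not the PR's resize_norm_img: the helper names width_mask and masked_mean are hypothetical, and the rounding rule (ceil, clamped to at least one column) is an assumption. The point is that the mask always has the full feature width, so downstream code multiplies by it instead of slicing to a data-dependent length.

```python
import math


def width_mask(valid_ratio, feat_width):
    """Return a fixed-length 0/1 mask over the valid (non-padded) columns.

    The ceil-and-clamp rounding is an assumption made for this sketch.
    """
    valid_width = min(feat_width, max(1, math.ceil(feat_width * valid_ratio)))
    return [1.0 if col < valid_width else 0.0 for col in range(feat_width)]


def masked_mean(features, mask):
    """Average only the valid columns without slicing to a variable length."""
    total = sum(f * m for f, m in zip(features, mask))
    return total / sum(mask)


# Half of an 8-column feature row is real content; the rest is padding.
mask = width_mask(0.5, 8)
features = [1.0, 2.0, 3.0, 4.0, 9.0, 9.0, 9.0, 9.0]
mean_valid = masked_mean(features, mask)
```

Whatever the value of valid_ratio, the mask length stays equal to feat_width, which is what keeps the shapes static.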

return out


class SEModule(nn.Cell):
Collaborator

How does this differ from the SE module under the utils folder? Can that code be reused?

Collaborator Author

Fixed. It now reuses SEModule from utils.

return result


class ConvBNLayer(nn.Cell):
Collaborator

There may be similar code under the mindcv folder. Please check whether it can be reused, to avoid and reduce redundant code.
The MLP and other code in this folder should be reviewed the same way.

Collaborator Author

tonytonglt, Nov 20, 2023

This case is somewhat special because it has to decide whether to use the custom Swish activation function. In addition, mindcv is usually imported for whole models that serve as backbones; single-purpose modules are rarely pulled in from it.

valid_width_masks = img_metas[-1]

lab_embedding = self.embedding(label)
# bsz * seq_len * emb_dim
Collaborator

Code that is no longer needed should be deleted outright, not kept around as comments. The same applies below.

Collaborator Author

These are mainly tensor shape comments that make the code easier to read and debug. They have been changed to trailing comments on the code lines.

character_dict_path=dict_path,
use_space_char=False,
)
dict_path = "mindocr/utils/dict/ch_dict.txt" if algo == "CRNN_CH" or algo == "SVTR_PPOCRv3_CH" else None
Collaborator

if algo in ["CRNN_CH", "SVTR_PPOCRv3_CH"]
This makes it easier to extend later.

Collaborator Author

Fixed.
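A minimal sketch of the suggested membership test, assuming only the two algorithm names and the dictionary path that appear in the diff above. The constant CH_DICT_ALGOS and the helper get_dict_path are hypothetical names introduced for illustration.

```python
# Hypothetical constant: algorithms that use the Chinese character dictionary.
CH_DICT_ALGOS = ("CRNN_CH", "SVTR_PPOCRv3_CH")


def get_dict_path(algo):
    """Return the Chinese dictionary path for listed algorithms, else None."""
    return "mindocr/utils/dict/ch_dict.txt" if algo in CH_DICT_ALGOS else None
```

Supporting another Chinese model then only means adding its name to the tuple, rather than chaining another "or" comparison.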

tonytonglt force-pushed the main branch 2 times, most recently from c09c363 to 701cd54, November 20, 2023 01:49
tonytonglt merged commit 1ca2764 into mindspore-lab:main, Nov 20, 2023
2 checks passed
3 participants