add SVTR LCNet PP-OCRv3 CH model #614
d5f437e to 39839c7
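## 2. Weight Conversion

If you already have a PaddlePaddle model trained with PaddleOCR and want to run inference or continue fine-tuning it in MindOCR, you can convert the trained model ("对训练好的模型转换为") into a MindSpore-format ckpt file.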
Suggested wording for the Chinese text: "对训练好的模型转换为" -> "将训练好的模型转换为"
Fixed. The DBNet README has been updated to match.
data_out = dict()
data_out["img_path"] = data.get("img_path", None)
data_out["image"] = data["image"]
ctc = self.ctc_encode.__call__(data_ctc)
Why does this have to be invoked through __call__?
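The list of transform instances for the preprocessing transform_pipeline is created in advance, and each batch is then preprocessed step by step with transform(data), so this needs to be written as a __call__ invocation. See the reference code at https://github.com/mindspore-lab/mindocr/blob/main/mindocr/data/transforms/transforms_factory.py#L68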
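The pattern described in the reply can be sketched as follows. This is a minimal illustration under assumptions, not MindOCR's code: `ToyCTCEncode`, `ToyMultiLabelEncode`, and `run_transforms` are hypothetical names. Transform instances are built once, and each sample is then passed through them by calling the instance. For an ordinary class that defines `__call__`, `obj(data)` and `obj.__call__(data)` run the same method.

```python
from typing import Any, Callable, Dict, List

Sample = Dict[str, Any]


class ToyCTCEncode:
    """Hypothetical transform: maps label characters to integer ids."""

    def __init__(self, charset: str) -> None:
        # Index 0 is reserved for the CTC blank, so characters start at 1.
        self.char_to_id = {ch: i + 1 for i, ch in enumerate(charset)}

    def __call__(self, data: Sample) -> Sample:
        data["text_seq"] = [self.char_to_id[ch] for ch in data["label"]]
        return data


class ToyMultiLabelEncode:
    """Hypothetical transform that wraps another transform instance."""

    def __init__(self, charset: str) -> None:
        self.ctc_encode = ToyCTCEncode(charset)

    def __call__(self, data: Sample) -> Sample:
        # Explicit __call__ on the inner instance, as in the diff above.
        ctc = self.ctc_encode.__call__(dict(data))
        return {"image": data["image"], "label_ctc": ctc["text_seq"]}


def run_transforms(data: Sample, transforms: List[Callable[[Sample], Sample]]) -> Sample:
    """Apply pre-built transform instances one after another."""
    for transform in transforms:
        data = transform(data)
    return data


# Instances are created once, then reused for every sample.
pipeline = [ToyMultiLabelEncode("abc")]
result = run_transforms({"image": "img-0", "label": "cab"}, pipeline)
```

Here `result` holds the image plus the encoded label; the outer pipeline only ever sees callables, which is why each transform exposes its work through `__call__`.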
return padding_im, valid_ratio, valid_width_mask


def resize_norm_img(
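There should already be a function, or a combination of functions, that does the same thing. I suggest refactoring in this PR or a later one to avoid redundant code.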
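This case is somewhat special: the valid_ratio from the reference code has to be converted into a valid_width_mask, to avoid the dynamic shapes that slicing a tensor to a dynamic length would introduce.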
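The conversion mentioned in the reply might look like the sketch below. It assumes that `valid_ratio` is the fraction of the padded width occupied by real image content and that the mask is 1 over that region and 0 over the padding. `valid_width_mask_from_ratio` and `masked_mean` are hypothetical helpers, not functions from the PR.

```python
import math
from typing import List


def valid_width_mask_from_ratio(valid_ratio: float, width: int) -> List[float]:
    """Build a fixed-length 0/1 mask covering the valid part of the width."""
    # Clamp so that at least one column and at most `width` columns are valid.
    valid_width = min(width, max(1, math.ceil(width * valid_ratio)))
    return [1.0 if i < valid_width else 0.0 for i in range(width)]


def masked_mean(row: List[float], mask: List[float]) -> float:
    """Average over valid columns without slicing to a per-sample length."""
    total = sum(value * m for value, m in zip(row, mask))
    return total / sum(mask)


mask = valid_width_mask_from_ratio(0.5, 8)
mean_valid = masked_mean([1.0, 2.0, 3.0, 4.0, 100.0, 100.0, 100.0, 100.0], mask)
```

Because the mask always has the full padded width, downstream code can multiply by it instead of slicing each sample to its own length, so every tensor keeps a static shape.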
return out


class SEModule(nn.Cell):
How does this differ from the SE module under the utils folder? Can that code be reused?
Fixed. It now reuses the SEModule under utils.
return result


class ConvBNLayer(nn.Cell):
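There may be similar code under the mindcv folder. Please check whether it can be reused, to avoid redundant code.
I suggest reviewing the MLP and other code in this folder in the same way.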
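This case is somewhat special because the layer has to decide whether to use the custom Swish activation function. In addition, mindcv is usually imported for whole models used as backbones; single-purpose modules are rarely pulled in from it.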
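The activation choice described in the reply can be illustrated with a small scalar sketch. Everything here is hypothetical: `make_activation` and `ToyConvBNAct` are made-up names, a scale-and-shift stands in for the convolution and batch norm, and Swish is written as x * sigmoid(x), which may differ from the custom implementation in the PR.

```python
import math
from typing import Callable, Optional

Activation = Callable[[float], float]


def swish(x: float) -> float:
    """Swish: x * sigmoid(x), computed in a numerically stable way."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # Safe for negative x: e lies in [0, 1).
    return x * e / (1.0 + e)


def relu(x: float) -> float:
    return max(0.0, x)


def make_activation(name: Optional[str]) -> Activation:
    """Return the activation for `name`, or the identity when name is None."""
    if name is None:
        return lambda x: x
    table = {"relu": relu, "swish": swish}
    if name not in table:
        raise ValueError(f"unsupported activation: {name}")
    return table[name]


class ToyConvBNAct:
    """Hypothetical stand-in for a conv + batch-norm + activation block."""

    def __init__(self, scale: float, shift: float, act: Optional[str] = None) -> None:
        self.scale = scale
        self.shift = shift
        self.act = make_activation(act)

    def __call__(self, x: float) -> float:
        # Scale-and-shift plays the role of conv + batch norm here.
        return self.act(self.scale * x + self.shift)


layer = ToyConvBNAct(2.0, 1.0, act="swish")
y = layer(0.5)  # swish(2.0)
```

A generic building block imported from another library would need the same switch added on top, which is the kind of special-casing the reply refers to.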
mindocr/models/heads/rec_sar_head.py
Outdated
valid_width_masks = img_metas[-1]

lab_embedding = self.embedding(label)
# bsz * seq_len * emb_dim
Code that is no longer needed should be deleted outright, not kept around as comments. The same applies below.
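These are mainly comments on tensor shapes, kept to make the code easier to read and debug. They have been changed to trailing comments on the corresponding code lines.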
tools/infer/text/postprocess.py
Outdated
character_dict_path=dict_path,
use_space_char=False,
)
dict_path = "mindocr/utils/dict/ch_dict.txt" if algo == "CRNN_CH" or algo == "SVTR_PPOCRv3_CH" else None
if algo in ["CRNN_CH", "SVTR_PPOCRv3_CH"]
This makes it easier to extend later.
Fixed.
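The reviewer's suggestion amounts to a membership test against a collection of algorithm names. A minimal sketch, in which `CH_DICT_ALGOS`, `CH_DICT_PATH`, and `select_dict_path` are hypothetical names introduced for illustration; only the two algorithm names and the dictionary path come from the diff:

```python
from typing import Optional

# Algorithms that use the Chinese character dictionary (names from the diff).
CH_DICT_ALGOS = ("CRNN_CH", "SVTR_PPOCRv3_CH")
CH_DICT_PATH = "mindocr/utils/dict/ch_dict.txt"


def select_dict_path(algo: str) -> Optional[str]:
    """Return the Chinese dictionary path for listed algorithms, else None."""
    return CH_DICT_PATH if algo in CH_DICT_ALGOS else None


path_for_svtr = select_dict_path("SVTR_PPOCRv3_CH")
path_for_other = select_dict_path("CRNN")
```

Supporting another Chinese model then means adding one name to the collection instead of chaining another `or` comparison.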
c09c363 to 701cd54
Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:
Motivation
(Write your motivation for proposed changes here.)
Test Plan
(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)
Related Issues and PRs
(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)