CV2-5247-support-yake-keyword-extraction-for-chinese #112

ahmednasserswe · 2024-09-24T22:50:15Z

Description

Use jieba to segment Mandarin text so that YAKE keyword extraction works on Chinese input.
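A minimal sketch of the idea, not the code in this PR: Chinese text is not space-delimited, so it is segmented first and re-joined with spaces before being handed to a whitespace-oriented keyword extractor. The names `default_segmenter` and `prepare_text_for_yake` are hypothetical; `jieba.lcut` is the jieba call that returns a list of tokens.

```python
from typing import Callable, List


def default_segmenter(text: str) -> List[str]:
    """Segment text with jieba (third-party; imported lazily so the
    module still loads when jieba is not installed)."""
    import jieba  # assumption: installed via `pip install jieba`

    return jieba.lcut(text)


def prepare_text_for_yake(
    text: str,
    language: str,
    segmenter: Callable[[str], List[str]] = default_segmenter,
) -> str:
    """Return text with Chinese input split into space-separated tokens.

    Non-Chinese input is returned unchanged.
    """
    if language[:2] == "zh":
        # Drop whitespace-only tokens so the output has single spaces.
        tokens = [tok for tok in segmenter(text) if tok.strip()]
        return " ".join(tokens)
    return text
```

The returned string could then be passed to something like `yake.KeywordExtractor(lan=language).extract_keywords(...)`; how the PR actually wires this into `lib/model/yake_keywords.py` is not shown here.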

Reference: CV2-5247 and CV2-4909

How has this been tested?

Has it been tested locally? Are there automated tests?
Tested locally. Automated tests will be added as well.

Are there any external dependencies?

Are there changes required in sysops terraform for this feature or fix?

Have you considered secure coding practices when writing this code?

Please list any security concerns that may be relevant.

skyemeedan

Can you add a test with a text example that shows this working? Ideally it would be a chunk of non-space-delimited Chinese text that is badly parsed without jieba and then returns appropriate keywords when processed with jieba.
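One possible shape for such a test, as a sketch only (it is not the test added in this PR): the sample sentence and the class name `TestChineseSegmentation` are assumptions, and the jieba-dependent case is skipped when jieba is not installed. It checks that whitespace splitting leaves the sentence as one token, while `jieba.lcut` breaks it into several.

```python
import importlib.util
import unittest

# True only when the third-party jieba package can be imported.
HAS_JIEBA = importlib.util.find_spec("jieba") is not None

# Illustrative sample: a short Chinese sentence with no spaces.
CHINESE_TEXT = "我来到北京清华大学"


class TestChineseSegmentation(unittest.TestCase):
    def test_whitespace_split_yields_single_token(self):
        # Without segmentation the whole sentence is one "word".
        self.assertEqual(len(CHINESE_TEXT.split()), 1)

    @unittest.skipUnless(HAS_JIEBA, "jieba is not installed")
    def test_jieba_yields_multiple_tokens(self):
        import jieba

        tokens = jieba.lcut(CHINESE_TEXT)
        # Segmentation should produce several tokens...
        self.assertGreater(len(tokens), 1)
        # ...that together still cover the original text.
        self.assertEqual("".join(tokens), CHINESE_TEXT)
```

A fuller version would also run the segmented text through YAKE and assert on the returned keywords; that part is left out because the expected keywords depend on the extractor settings.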

skyemeedan

Seems like a really cool library!

lib/model/yake_keywords.py

DGaffney · 2024-09-26T17:08:21Z

One extremely minor note; otherwise good to go!

lib/model/yake_keywords.py

using jieba to segment mandarin text

72123ae

ahmednasserswe requested review from DGaffney, computermacgyver and skyemeedan as code owners September 24, 2024 22:50

skyemeedan requested changes Sep 24, 2024

adding tests to jieba and chinese text with yake

6da8a98

skyemeedan approved these changes Sep 26, 2024

DGaffney reviewed Sep 26, 2024

lib/model/yake_keywords.py

DGaffney approved these changes Sep 26, 2024

computermacgyver reviewed Sep 26, 2024

lib/model/yake_keywords.py (outdated)

lib/model/yake_keywords.py

lib/model/yake_keywords.py

Code styling in yake_keywords.py, and change `if language == 'zh-CN' or language == 'zh' or language == 'zh-TW'` to `if language[:2]=="zh"`

923d43d
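The condition change named in this commit message can be sketched as follows. The function names `is_chinese_explicit` and `is_chinese_prefix` are hypothetical and only wrap the two conditions quoted above so they can be compared side by side.

```python
def is_chinese_explicit(language: str) -> bool:
    """Original condition: only three exact language codes match."""
    return language == 'zh-CN' or language == 'zh' or language == 'zh-TW'


def is_chinese_prefix(language: str) -> bool:
    """Revised condition: any code whose first two characters are 'zh'."""
    return language[:2] == "zh"


if __name__ == "__main__":
    for code in ("zh", "zh-CN", "zh-TW", "zh-HK", "en"):
        print(code, is_chinese_explicit(code), is_chinese_prefix(code))
```

The prefix form is shorter and also covers variants such as `zh-HK`. The trade-off is that it matches any string starting with `zh`, so it assumes the incoming codes are two-letter language tags, optionally followed by a region.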

ahmednasserswe merged commit 6167257 into master Sep 30, 2024
2 checks passed

computermacgyver deleted the CV2-5247-support-yake-keyword-extraction-for-chinese branch September 30, 2024 15:36
