Support text normalization #55

csukuangfj · 2023-11-04T11:06:22Z

TODOs

Add documentation on how to use it
Add examples showing how to generate rule.fst

Example

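#!/usr/bin/env python3

import kaldifst

# Path to the rewrite-rule FST (the one used here is attached below)
rule = "./rule.fst"
normalizer = kaldifst.TextNormalizer(rule)

# Input containing digits to be verbalized
text = "3年前中国总人口为1411778724 人"

# The normalizer is callable: it applies the rule to the input string
out = normalizer(text)
print(out)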

The output is given below:

三年前中国总人口为十四亿一千一百七十七万八千七百二十四 人
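To sanity-check the output of a number-verbalization rule without depending on rule.fst, a plain-Python reference can be handy. The sketch below is not how kaldifst or the attached rule.fst works internally; it is an independent, hand-written reading of non-negative integers below 10^12 in Chinese. The names `int_to_chinese` and `reference_normalize` are hypothetical helpers introduced only for this illustration.

```python
import re

DIGITS = "零一二三四五六七八九"
SMALL_UNITS = ["", "十", "百", "千"]  # units inside one 4-digit group
BIG_UNITS = ["", "万", "亿"]          # units between 4-digit groups


def read_group(n: int) -> str:
    """Read one group with 0 < n < 10000, inserting 零 for inner zeros."""
    out = []
    pending_zero = False
    for power in (3, 2, 1, 0):
        d = (n // 10**power) % 10
        if d == 0:
            # A zero only matters if a nonzero digit came before it.
            pending_zero = bool(out)
            continue
        if pending_zero:
            out.append("零")
            pending_zero = False
        out.append(DIGITS[d] + SMALL_UNITS[power])
    return "".join(out)


def int_to_chinese(n: int) -> str:
    """Hypothetical reference reading for 0 <= n < 10**12."""
    if n < 0 or n >= 10**12:
        raise ValueError("supported range is 0 <= n < 10**12")
    if n == 0:
        return "零"
    groups = []
    while n:
        groups.append(n % 10000)
        n //= 10000
    out = []
    need_zero = False
    for idx in reversed(range(len(groups))):
        g = groups[idx]
        if g == 0:
            need_zero = bool(out)
            continue
        # A skipped group, or a group without a thousands digit, needs 零.
        if out and (need_zero or g < 1000):
            out.append("零")
        out.append(read_group(g) + BIG_UNITS[idx])
        need_zero = False
    text = "".join(out)
    # A leading 一十 is conventionally read as 十 (14 is 十四).
    return text[1:] if text.startswith("一十") else text


def reference_normalize(text: str) -> str:
    """Replace every run of ASCII digits with its Chinese reading."""
    return re.sub(r"[0-9]+", lambda m: int_to_chinese(int(m.group())), text)


print(reference_normalize("3年前中国总人口为1411778724 人"))
```

For the input used in the example above, this reference produces the same string as the normalizer output shown. It covers only cardinal readings of plain digit runs, so it is useful as a cross-check and not as a replacement for the FST rule.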

The rule.fst used in the above example is attached below.
rule.fst.zip

Note: The above example normalizes Chinese numbers, but the implementation is generic: the behavior is determined by the rule.fst passed to TextNormalizer.

csukuangfj added 5 commits November 4, 2023 18:56

Support text normalization

2e4e135

Release v1.7.7

30abc8e

Fix typos

1dd5a5d

minor fixes

2ea4820

Add ccache to CI

53f81d0

csukuangfj merged commit 0872dbc into k2-fsa:master Nov 4, 2023
30 checks passed

csukuangfj deleted the text-normalization branch November 4, 2023 13:40
