Proposal: OpenUtau g2p phonemizers for machine learning voicebanks #46
oxygen-dioxide started this conversation in Ideas
Currently, for some languages supported by machine learning renderers like ENUNU, users have to manually input phonemes instead of words. This proposal suggests a set of G2P phonemizers for these languages that enable users to input words and use existing ustx project files.
How does it work
Take French as an example. The G2P first converts the word into phonemes:
rangée → rr en jj ei
The phonemes are then distributed across notes:
rangée → rr | en jj | ei
(Here | means the border between notes.)
Use + to place a syllable, or use +~ or +* to extend the current syllable, like how we use EN VCCV in OpenUtau.

Dictionary format
Each phonemizer corresponds to a dictionary named <type>-<lang>.yaml. For example, the ENUNU French phonemizer uses "enudict-fr.yaml" in the voicebank. Voicebank developers should place a yaml dictionary in their voicebanks.

The dictionary consists of 3 parts: "replacements", "symbols" and "entries".
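To make the naming rule and the three-part layout concrete, here is a minimal Python sketch. It is only an illustration, not OpenUtau's actual code: the function names are hypothetical, the dictionary is modeled as an already-parsed Python dict, and the only facts taken from the proposal are the <type>-<lang>.yaml pattern, the three part names, and that only "symbols" is marked as necessary in the descriptions that follow.

```python
# Hypothetical helpers; only the naming pattern and part names come from the proposal.
REQUIRED_PARTS = ("symbols",)  # the only part described as necessary


def dictionary_filename(dict_type, lang):
    """Build the dictionary file name from the <type>-<lang>.yaml pattern."""
    return f"{dict_type}-{lang}.yaml"


def missing_parts(dictionary):
    """Return the required top-level parts that an already-parsed dictionary lacks."""
    return [part for part in REQUIRED_PARTS if part not in dictionary]


print(dictionary_filename("enudict", "fr"))  # enudict-fr.yaml
print(missing_parts({"entries": {}}))        # ['symbols']
```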
replacements
This part is optional. The voicebank may use a different phoneme set from the one used by the G2P in OpenUtau. This part tells OpenUtau how to convert the phonemes produced by the G2P into the phonemes supported by the voicebank.
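As a rough sketch of what such a conversion could look like, the Python snippet below treats the replacements as a simple one-to-one lookup table. This is an assumption for illustration: the function name is hypothetical, and the mapping shown (a voicebank that writes "r" and "j" where the G2P emits "rr" and "jj") is made up rather than taken from any real voicebank.

```python
def apply_replacements(phonemes, replacements):
    """Map each G2P phoneme to the voicebank's phoneme.

    Phonemes without a replacement are passed through unchanged.
    """
    return [replacements.get(p, p) for p in phonemes]


# Made-up mapping for illustration only.
replacements = {"rr": "r", "jj": "j"}

print(apply_replacements(["rr", "en", "jj", "ei"], replacements))
# ['r', 'en', 'j', 'ei']
```

Passing unlisted phonemes through unchanged means a voicebank only has to list the phonemes where its set differs from the G2P's.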
symbols
This part is required. It tells OpenUtau which phonemes are vowels and which are consonants. OpenUtau requires this information to split words into syllables.
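The role of the vowel/consonant information can be sketched with the rangée example from earlier. The Python snippet below is one reading of that example, not OpenUtau's actual algorithm: it starts a new group at every vowel, so consonants before a vowel stay with the preceding group. The function name is hypothetical, and the classification of "rr" and "jj" as consonants and "en" and "ei" as vowels is an assumption.

```python
def split_into_note_groups(phonemes, symbols):
    """Group phonemes so that every group after the first starts with a vowel.

    `symbols` maps each phoneme to "vowel" or "consonant".
    Leading consonants (if any) form the first group on their own.
    """
    groups = [[]]
    for p in phonemes:
        # Open a new group at each vowel, unless the current group is still empty.
        if symbols[p] == "vowel" and groups[-1]:
            groups.append([])
        groups[-1].append(p)
    return groups


# Assumed classification for the four phonemes in the example.
symbols = {"rr": "consonant", "en": "vowel", "jj": "consonant", "ei": "vowel"}

groups = split_into_note_groups(["rr", "en", "jj", "ei"], symbols)
print(" | ".join(" ".join(g) for g in groups))  # rr | en jj | ei
```

Without the "symbols" part there would be no way to tell where a group should begin, which is why this part cannot be omitted.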
entries
This part is optional. Voicebank developers can use it to define custom words.
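A minimal Python sketch of how such entries might interact with the G2P is shown below. It assumes that a word defined in "entries" takes priority over the G2P output, which the proposal does not state explicitly. The function names, the stand-in G2P, and the made-up word "jéran" are all hypothetical.

```python
def phonemes_for(word, entries, g2p):
    """Return the phonemes for a word, preferring a voicebank-defined entry."""
    if word in entries:
        return entries[word]
    return g2p(word)


def fake_g2p(word):
    """Stand-in for a real G2P model; it only knows the word from the example."""
    known = {"rangée": ["rr", "en", "jj", "ei"]}
    return known.get(word, [])


# Made-up custom word defined by a voicebank developer.
entries = {"jéran": ["jj", "ei", "rr", "en"]}

print(phonemes_for("jéran", entries, fake_g2p))   # ['jj', 'ei', 'rr', 'en']
print(phonemes_for("rangée", entries, fake_g2p))  # ['rr', 'en', 'jj', 'ei']
```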
example
Here is an example French dictionary:
Note that I don't know French, so I wrote this documentation based on existing resources and code, including https://enunufr.carrd.co/ and FrenchVCCVPhonemizer.cs. If there is anything wrong in my documentation, feel free to tell me.