Will switching to SeamlessM4Tv2 be better #4
Comments
Probably yes. Advancements in this area come crazy fast; I feel stuff like this expires in about 4 weeks.
I think it may have to be a flag to switch between Whisper and Meta's model, as SeamlessM4T v2 is still under a CC-BY-NC license, which is incompatible with your MIT license.
Damn. You are right, and this also applies to Coqui. I need to revoke the MIT license here too, ASAP.
Probably this one, https://huggingface.co/spaces/styletts2/styletts2, can replace Coqui, and it's MIT.
Not sure about that. StyleTTS2 is only good in English and can't do zero-shot voice cloning.
This one claims to capture tone and emotion: https://research.myshell.ai/open-voice/zero-shot-cross-lingual-voice-cloning. It may work for your TurnVoice project.
FYI, this is what I found as extra information (Mac). Use Apple's Metal for GPU acceleration: for PyTorch, there's an experimental project called PyTorch-Metal that aims to bring Metal GPU acceleration to PyTorch on macOS. Use PlaidML:
Maybe MeloTTS/OpenVoice would be a good replacement, and also distil-whisper.
You can already use distil-whisper models. Update faster-whisper to the latest version (pip install -U faster-whisper), then change the model to one of the supported distil models (distil-large-v2, distil-medium.en, distil-small.en) in this line: recorder = AudioToTextRecorder(model="tiny.en", language="en", spinner=False). I found Melo to have rather bad quality (so few emotions), and OpenVoice is a research project that does not get updates. So I won't implement those in RealtimeTTS (a TTS engine has to offer a lot before I consider making it realtime).
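A minimal sketch of the model swap described in the comment above, for illustration only. It assumes AudioToTextRecorder is imported from the RealtimeSTT package (the thread does not show the import), and the helper names recorder_kwargs and make_recorder are hypothetical. The model names and the other arguments are the ones quoted in the comment.

```python
# Distil-Whisper model names listed in the comment above.
DISTIL_MODELS = ("distil-large-v2", "distil-medium.en", "distil-small.en")


def recorder_kwargs(model: str = "distil-small.en") -> dict:
    """Build keyword arguments for AudioToTextRecorder (hypothetical helper)."""
    if model not in DISTIL_MODELS:
        raise ValueError(f"not a listed distil model: {model!r}")
    # Same arguments as the original line, with only the model name changed.
    return {"model": model, "language": "en", "spinner": False}


def make_recorder(model: str = "distil-small.en"):
    """Create the recorder; requires an up-to-date faster-whisper install."""
    # Assumption: the class lives in the RealtimeSTT package. Imported lazily
    # so recorder_kwargs() can be used without the package installed.
    from RealtimeSTT import AudioToTextRecorder

    return AudioToTextRecorder(**recorder_kwargs(model))
```

Calling make_recorder("distil-medium.en") would then stand in for the original recorder = AudioToTextRecorder(model="tiny.en", ...) line; the first run is expected to download the model weights.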
SeamlessM4Tv2, released today, seems to have all of this plus translation with streaming support. Will it be better than Whisper and Coqui?