X-Voice: The universal translator
X-Voice is a voice-cloning app that lets you clone voices in any language.
The Zero-Shot Voice Cloning tab works like any other TTS app: upload your voice, write your text, and clone the voice. With X-Voice Stage2, you can do this in any language, as long as the reference audio and the reference text are in the same language.

But the real magic happens in the Translate & Clone tab. You can upload a voice in any language, then translate and clone it into one or more of 27 different languages. The reference voice is transcribed and translated automatically, but you can edit the translation, so you are not stuck with the translated reference text. You can clone a voice and have it say whatever you want in 27 different languages.
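
For anyone curious how such a flow might fit together, here is a rough Python sketch of the Translate & Clone steps described above: transcribe the reference once, then translate (or use an edited text) and synthesize per target language. This is not X-Voice's actual code. The function `translate_and_clone`, its parameters, and the `fake_*` stubs are all hypothetical names invented for illustration, and the real app may work differently.

```python
from typing import Callable, Dict, Iterable, Optional


def translate_and_clone(
    reference_audio: bytes,
    target_languages: Iterable[str],
    transcribe: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[bytes, str, str, str], bytes],
    edited_texts: Optional[Dict[str, str]] = None,
) -> Dict[str, bytes]:
    """Hypothetical sketch: transcribe once, then translate and clone per language."""
    edited_texts = edited_texts or {}
    # The reference clip is transcribed a single time.
    reference_text = transcribe(reference_audio)
    outputs: Dict[str, bytes] = {}
    for lang in target_languages:
        # A user-edited text takes priority over the automatic translation.
        text = edited_texts.get(lang) or translate(reference_text, lang)
        outputs[lang] = synthesize(reference_audio, reference_text, text, lang)
    return outputs


# Stand-in stubs so the sketch runs without any model.
# They are assumptions for illustration, not real X-Voice calls.
def fake_transcribe(audio: bytes) -> str:
    return "hello world"


def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"


def fake_synthesize(ref_audio: bytes, ref_text: str, text: str, lang: str) -> bytes:
    return text.encode("utf-8")


clips = translate_and_clone(
    b"reference-wav-bytes",
    ["fr", "id"],
    fake_transcribe,
    fake_translate,
    fake_synthesize,
    edited_texts={"id": "Halo semuanya"},  # overrides the automatic translation
)
print(clips)
```

Passing the three steps in as callables keeps the sketch independent of any particular speech-recognition, translation, or TTS model. The `edited_texts` override mirrors the point that you are not tied to the automatic translation.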

The app uses at most about 8.5 GB of VRAM, so it also runs on smaller GPUs. On an RTX 5080 (16 GB), cloning a voice in all 27 available languages takes around 150 seconds, an average of about 5.6 seconds per language.
Original:
French:
Indonesian:
