Nuitka compilation of Mimic3

uras · December 11, 2023, 2:44pm

What we are trying to do
We are a team of Computer Science students trying to build a voice-banking solution. We want to use mimic3 for voice cloning and text-to-speech. For that, however, we need to make sure we can compile our code to C using Nuitka (this is unfortunately non-negotiable, as the tool we are building for uses MSVC).

What we have done and what the issue is
We created a venv in which we installed all the requirements plus Nuitka. We ran the Nuitka compilation with the following command (the build script is shown below):

python build.py --lto path\to\project\mimic3\mimic3_tts\__main__.py

The compilation succeeded, but when we ran the generated .exe we got errors about files it was unable to locate. The mimic3_tts directory was not among the generated files, and many of the library folders contained nothing except one or two .pyd files.

Workaround
We copied the mimic3_tts directory into the directory created by the Nuitka compilation, then copied over every file from the venv library folder, getting rid of the file-not-found errors one by one.
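
Rather than copying files by hand, the same result can probably be requested from Nuitka at build time. Below is a minimal sketch, assuming the `--include-package` and `--include-package-data` options of a recent Nuitka release. The helper name `include_args` and the choice of packages are illustrative only; we have not verified this against mimic3.

```python
def include_args(packages):
    """Build Nuitka options that bundle each package's code and its data files.

    Hypothetical helper: the package names passed in are assumptions.
    """
    args = []
    for name in packages:
        # Compile the whole package, including modules that are only imported dynamically.
        args.append(f"--include-package={name}")
        # Copy the package's non-Python files (configs, voices, ...) into the dist folder.
        args.append(f"--include-package-data={name}")
    return args


extra_args = include_args(["mimic3_tts"])
print(extra_args)
```

The resulting options could be appended to `nuitka_args` in build.py before Nuitka is invoked. Other dependencies that ship data files may need the same treatment.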

The issue that we couldn’t solve
As I am writing this, it successfully created a voice file without me changing anything else. Now that we know we can compile the TTS, we are wondering how we could clone someone’s voice. Does mimic3 offer any tools for that? We would greatly appreciate any suggestions or help.

build.py

import argparse
import subprocess
import sys

nuitka_args = [
    '--standalone',  # Create a standalone folder with all the required files and a .exe
    '--remove-output',  # Remove the build folder after creating the dist folder
    '--assume-yes-for-downloads',  # Assume yes for all downloads
    #'--include-data-dir=data=data',  # Copy over the data folder to the output directory
    '--output-dir=Release',  # Output directory

    # Include the following plugins
    '--user-plugin=plugins/mediapipe.py',
    '--user-plugin=plugins/sounddevice.py'
]


def build() -> None:
    """
    Sets up an argument parser for building a MotionInput project with Nuitka, providing options for
    link time optimization (LTO) and target file specification.

    Command line arguments:
        * --lto: Use link time optimization. Defaults to False.
        * target_file: Target file to build. Required.

    :Example:

        Run the following command in the terminal:

        .. code-block:: bash

            python build.py --lto motioninput.py

    :return: None
    """
    parser = argparse.ArgumentParser(description='Build motion input with Nuitka.')
    parser.add_argument('--lto', action='store_true',
                        help='Use link time optimisation. Default: False')
    parser.add_argument('target_file', type=str,
                        help='Target file to build.')
    args = parser.parse_args()

    print(args)

    if args.lto:
        nuitka_args.append('--lto=yes')
    else:
        nuitka_args.append('--lto=no')

    subprocess.call([sys.executable, "-m", "nuitka"] + nuitka_args + [args.target_file])


if __name__ == "__main__":
    build()

ChanceNCounter · December 11, 2023, 6:47pm

Mimic3 development stopped when MycroftAI did, but the successor projects recommend Piper, from the same developer, which can itself be regarded as a successor to Mimic3.

I like your odds much better if you make that switch, as help is much likelier to be forthcoming on Piper. Over here, it’s life after Mimic.

Sorry about that!

baconator · December 11, 2023, 10:49pm

You should really be looking at other solutions, possibly piper, but also coqui and tortoise’s TTS tools.

mikejgray · December 12, 2023, 1:07am

What does the man who’s cloned his voice multiple times suggest? @Thorsten

Thorsten · January 1, 2024, 10:20pm

Sorry for the late response, @mikejgray. I’d recommend going with “Piper TTS”, as it’s fast on small compute devices and it’s developed by Mike Hansen, who developed Mimic 3 too. IMHO Mike still has some items on the Piper TODO list, such as SSML and better native Python integration, but as @ChanceNCounter said, you’ll get better support/help on Piper than on Mimic 3.
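
For reference, here is a minimal sketch of driving Piper from Python through its command-line tool. It assumes the `piper` executable is on the PATH and that a voice model has already been downloaded. The model filename and the helper names `piper_command` and `synthesize` are placeholders, not part of Piper itself.

```python
import subprocess


def piper_command(model_path, output_path):
    """Return the argument list for a Piper CLI call (the text is read from stdin)."""
    return ["piper", "--model", model_path, "--output_file", output_path]


def synthesize(text, model_path, output_path):
    """Pipe text into Piper and write a WAV file; raises if Piper exits with an error."""
    subprocess.run(
        piper_command(model_path, output_path),
        input=text.encode("utf-8"),
        check=True,
    )


# Placeholder model name; substitute the voice that was actually downloaded.
# Only the command is printed here; call synthesize(...) to run Piper for real.
print(piper_command("en_US-lessac-medium.onnx", "hello.wav"))
```

Calling the CLI keeps the Python side small, which may also simplify a Nuitka build, though that has not been tested here.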
Coqui TTS might offer some higher-quality voices but requires more compute power. As it supports a MaryTTS API, you might be able to integrate it easily.
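
A rough sketch of what a MaryTTS-style request could look like from Python, using only the standard library. The query parameters follow the classic MaryTTS `/process` interface. The host, port, and helper names are assumptions, and whether a given Coqui server accepts exactly these parameters would need checking.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def marytts_url(base_url, text, locale="en_US"):
    """Build a MaryTTS-style /process request URL for the given text."""
    query = urlencode({
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
    })
    return f"{base_url.rstrip('/')}/process?{query}"


def fetch_wav(base_url, text, out_path):
    """Request synthesized audio from the server and save the response bytes."""
    with urlopen(marytts_url(base_url, text)) as response, open(out_path, "wb") as f:
        f.write(response.read())


# Assumed address: 59125 is the usual MaryTTS port; adjust for your server.
print(marytts_url("http://localhost:59125", "Hello world"))
```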