TextToSpeech engine

zzuegg · July 30, 2014, 10:43am

[java]
// Register the MARY loader for the "tts" extension, plus the MARY locator.
getAssetManager().registerLoader(MaryLoader.class, "tts");
getAssetManager().registerLocator("tts", MaryLocator.class);

// The key name is the text to speak, followed by the ".tts" extension.
AudioKey audioKey = new AudioKey("Hello fellow monkeys. This text is automatically converted to a mono audiostream.tts");
AudioData audioData = assetManager.loadAudio(audioKey);
new AudioNode(audioData, audioKey).play();
[/java]

One known bug to solve: at the end of the audio stream it jumps back a little and replays a short chunk.
Some missing features still to implement, probably through a custom AudioKey.

ndebruyn · July 30, 2014, 12:07pm

Wow, cool. How can I make use of this library?

Fissll · July 30, 2014, 1:07pm

This would really be awesome for my game
Can I use it? … And if so, how can I use it?

zzuegg · July 30, 2014, 1:32pm

Good news: I fixed the error, and the sound now plays completely.
I am going to refactor everything and upload the code to GitHub this evening.

@ndebruyn: I don't think this lib is suitable for Android, however; generation already takes a few seconds on my PC. Also, the whole lib is too big to fit in an APK. But I am going to add some lines to allow pre-generation of .wav files.

@Fissll: Besides the jars, the code above is nearly everything you need to use it.

simon_heinen · August 11, 2014, 7:21am

Is the TTS process done online? The quality is quite good.

zzuegg · August 11, 2014, 12:50pm

Oh, good that you reminded me of this one.

@simon.heinen said: Is the TTS process done online? The quality is quite good.

In the tests I generated the WAVs locally. But there is also the possibility of using a remote “generation server”.

simon_heinen · August 11, 2014, 12:52pm

ok nice, so the generation process can be done offline on the fly?

zzuegg · August 11, 2014, 12:53pm

Yeah, but it does take some time, so it might be better to do it on a separate thread.
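A minimal sketch of the separate-thread idea, using only `java.util.concurrent`. The class name `BackgroundSpeech` and the `synthesizer` function are hypothetical stand-ins for whatever call actually produces the audio bytes; nothing here is taken from the posted integration code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical helper: runs a slow text-to-audio function off the calling thread.
final class BackgroundSpeech {
    // One daemon worker thread, so a pending synthesis never blocks JVM exit.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "tts-worker");
        t.setDaemon(true);
        return t;
    });

    // Assumption: the real synthesis call can be wrapped as text -> audio bytes.
    private final Function<String, byte[]> synthesizer;

    BackgroundSpeech(Function<String, byte[]> synthesizer) {
        this.synthesizer = synthesizer;
    }

    // Returns immediately; the future completes once the audio has been generated.
    CompletableFuture<byte[]> synthesizeAsync(String text) {
        return CompletableFuture.supplyAsync(() -> synthesizer.apply(text), worker);
    }

    void shutdown() {
        worker.shutdown();
    }
}
```

In a jME application the result would presumably still have to be handed back to the render thread (for example via `Application.enqueue`) before it is attached to an AudioNode.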

Empire_Phoenix · August 11, 2014, 1:38pm

So, where do I find a download for the lib?

zzuegg · August 11, 2014, 1:55pm

Well, it's basically only a few lines of code. Calling it a library is an overstatement.

I created a repository and uploaded the needed stuff as well as a very simple test.

https://github.com/zzuegg/Jme3MaryIntegration

normen · August 11, 2014, 1:56pm

What's with the fancy reverb on the speech?

zzuegg · August 11, 2014, 1:58pm

Not really sure, the German male voice does not have that reverb…

fabsterpal · August 11, 2014, 5:10pm

I would not use AssetKey here, and would instead call a generator class directly. Use the loader to load cached audio where available (you could use an MD5 hash of the entire string as the file name).
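The file-naming suggestion can be sketched with the JDK's `MessageDigest`. The class and method names (`SpeechFileNames`, `md5FileName`) and the `.wav` suffix are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a stable cache file name from the spoken text.
final class SpeechFileNames {
    static String md5FileName(String text) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(text.getBytes(StandardCharsets.UTF_8));
            // Render the 16 digest bytes as 32 lowercase hex characters.
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex + ".wav";
        } catch (NoSuchAlgorithmException e) {
            // Every conforming Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

For example, `SpeechFileNames.md5FileName("hello")` yields `5d41402abc4b2a76b9719d911017c592.wav`. MD5 is used here only as a naming scheme, not for security.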

zzuegg · August 11, 2014, 5:14pm

Indeed, I use the generator class directly in my project, but I thought this was more the jME way of doing things.

fabsterpal · August 11, 2014, 5:16pm

@zzuegg said: Indeed, I use the generator class directly in my project, but I thought this was more the jME way of doing things ;)

Hmm, well the assetloader is really that - an asset loader.
By the way, how does MARY work without the server? I had a look before but it seemed a bit… meh when it comes to portable design.

zzuegg · August 11, 2014, 5:25pm

Hm, not sure if I understand “portable” the same way you do.

[java]
MaryInterface maryInterface = new LocalMaryInterface();
[/java]

is the only line you have to change if you want to switch from server-based generation to local.

fabsterpal · August 11, 2014, 5:29pm

@zzuegg said: Hm, not sure if I understand "portable" the same way you do.
[java]
MaryInterface maryInterface = new LocalMaryInterface();
[/java]

is the only line you have to change if you want to switch from server-based generation to local.

So the client library does have generation support…

normen · August 11, 2014, 8:26pm

@zzuegg said: Indeed, I use the generator class directly in my project, but I thought this was more the jME way of doing things ;)

I think it's the proper way. It's an asset file you edit while you create the game and which you want to load as audio data into your game later. You wouldn’t unpack an .ogg file with some “generator” either, you simply load the .ogg and get audio data inside the game.

fabsterpal · August 11, 2014, 9:23pm

@normen said: I think it's the proper way. It's an asset file you edit while you create the game and which you want to load as audio data into your game later. You wouldn't unpack an .ogg file with some "generator" either, you simply load the .ogg and get audio data inside the game.

But it’s not unpacking it, it’s generating it… This point would be valid if the files were stored somewhere. Here, the asset key is being used as the single parameter to the generator.
By the way, if you’re going to take the path of using an asset loader, I would advise making your own AssetKey as with ModelKey:
https://code.google.com/p/jmonkeyengine/source/browse/branches/jme3/src/core/com/jme3/asset/ModelKey.java?r=6060
This way, you could set the path as the path to the voice file (or something along those lines), and have the text, speed, volume or any other variables separately. Just an idea
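A rough sketch of what such a key could carry, written as a plain value class so it compiles without the engine. The name `SpeechKey` and its fields are hypothetical; in an actual jME integration the class would presumably extend `AudioKey` (or `AssetKey`) and pass the voice path to the superclass as the asset name.

```java
import java.util.Objects;

// Hypothetical key: the voice identifies the "asset", the rest are generation parameters.
final class SpeechKey {
    private final String voicePath; // assumed: path or name of the voice to use
    private final String text;      // the sentence to synthesize
    private final float speed;      // assumed playback-rate multiplier
    private final float volume;     // assumed gain multiplier

    SpeechKey(String voicePath, String text, float speed, float volume) {
        this.voicePath = voicePath;
        this.text = text;
        this.speed = speed;
        this.volume = volume;
    }

    String getVoicePath() { return voicePath; }
    String getText() { return text; }
    float getSpeed() { return speed; }
    float getVolume() { return volume; }

    // All parameters take part in equality, so two requests that differ only
    // in text or speed are never mistaken for the same cached result.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SpeechKey)) return false;
        SpeechKey other = (SpeechKey) o;
        return voicePath.equals(other.voicePath)
                && text.equals(other.text)
                && Float.compare(speed, other.speed) == 0
                && Float.compare(volume, other.volume) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(voicePath, text, speed, volume);
    }
}
```

Keeping the text out of the asset name also avoids having to encode a whole sentence, punctuation included, into something that looks like a file path.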

Empire_Phoenix · August 11, 2014, 9:33pm

Hm, it is not asset loading, but it behaves similarly, i.e. you call it, it takes some time, and it returns usable data.

Having a custom key where all the data can be added makes sense.
In fact, it might make sense to even allow the configuration of a cache directory, and to add a boolean for whether a string should be cached.
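One way the cache directory and the per-string flag could fit together, as a sketch under stated assumptions: `SpeechCache`, its `generator` function, and the `.wav` suffix are all hypothetical, and the file name comes from the JDK's name-based UUID (which is itself MD5-derived).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;
import java.util.function.Function;

// Hypothetical disk cache in front of a slow text-to-audio generator.
final class SpeechCache {
    private final Path cacheDir;
    // Assumption: the real synthesis call can be wrapped as text -> audio bytes.
    private final Function<String, byte[]> generator;

    SpeechCache(Path cacheDir, Function<String, byte[]> generator) {
        this.cacheDir = cacheDir;
        this.generator = generator;
    }

    // useCache == false bypasses the disk entirely, e.g. for one-off dynamic strings.
    byte[] get(String text, boolean useCache) {
        if (!useCache) {
            return generator.apply(text);
        }
        // Name-based UUID: the same text always maps to the same file name.
        String name = UUID.nameUUIDFromBytes(text.getBytes(StandardCharsets.UTF_8)) + ".wav";
        Path file = cacheDir.resolve(name);
        try {
            if (Files.exists(file)) {
                return Files.readAllBytes(file);
            }
            byte[] audio = generator.apply(text);
            Files.createDirectories(cacheDir);
            Files.write(file, audio);
            return audio;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real version would also need to fold the voice and any other generation parameters into the file name; otherwise switching voices would return stale audio.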