TextToSpeech engine

[java]
// Register the Mary loader and locator for the "tts" pseudo-extension.
getAssetManager().registerLoader(MaryLoader.class, "tts");
getAssetManager().registerLocator("tts", MaryLocator.class);

// The asset name is the text to speak, followed by the ".tts" extension.
AudioKey audioKey = new AudioKey("Hello fellow monkeys. This text is automatically converted to a mono audiostream.tts");
AudioData audioData = assetManager.loadAudio(audioKey);
new AudioNode(audioData, audioKey).play();

[/java]

One known bug to solve: at the end of the audio stream, playback jumps back a little and replays a short chunk.
Some missing features to implement (probably through a custom AudioKey).


Wow, cool. How can I make use of this library?

This would really be awesome for my game :smiley:
Can I use it? … And if so, how can I use it?

Good news: I fixed the error, and the sound now plays completely.
I'm going to refactor everything and upload the code to GitHub this evening.

@ndebruyn: I don't think this lib is suitable for Android, however; generation already takes a few seconds on my PC. Also, the whole lib is too big to fit in an APK. But I am going to add some lines to allow pregeneration of .wav files.

@Fissll: Besides the jars, the above code is nearly everything you need to use it.


Is the TTS process done online? The quality is quite good.

Oh, good that you reminded me of this one.

@simon.heinen said: Is the TTS process done online? The quality is quite good.

In the tests I generated the WAVs locally. But there is also the possibility of using a remote “generation server”.

OK, nice. So the generation process can be done offline, on the fly?

Yeah. It does take some time, though, so it might be better to do it on a separate thread.
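
For illustration, here is a minimal sketch of pushing the slow generation step onto a worker thread using only the JDK. `BackgroundSpeech` and the `generator` function are hypothetical names, not part of the integration; the generator is a stand-in for whatever call actually turns text into audio bytes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class BackgroundSpeech {
    // A single worker thread: generation requests simply queue up behind each other.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "tts-worker");
        t.setDaemon(true); // do not keep the JVM alive when the game exits
        return t;
    });

    // Hypothetical stand-in for the slow text-to-audio step.
    private final Function<String, byte[]> generator;

    public BackgroundSpeech(Function<String, byte[]> generator) {
        this.generator = generator;
    }

    // Returns immediately; the future completes once the audio bytes are ready.
    public CompletableFuture<byte[]> generateAsync(String text) {
        return CompletableFuture.supplyAsync(() -> generator.apply(text), worker);
    }

    public void shutdown() {
        worker.shutdown();
    }
}
```

In a jME application the finished result would probably be handed back to the render thread (for example through `Application.enqueue`) before an audio node is created and played.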

So, where do I find a download for the lib?

Well, it's basically only a few lines of code. Calling it a library is overstating it.

I created a repository and uploaded the needed stuff as well as a very simple test.

https://github.com/zzuegg/Jme3MaryIntegration

What's with the fancy reverb on the speech?

Not really sure; the German male voice does not have that reverb…


I would not use AssetKey here, and would instead call a generator class directly. Use the loader to load cached audio where available (you could use an MD5 hash of the entire string as the file name).
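
A minimal sketch of that naming idea, assuming the cached audio is stored as `.wav` files: hash the full text with MD5 and use the hex digest as the file name. `SpeechCacheNames` and `cacheFileName` are hypothetical names made up for this example.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SpeechCacheNames {
    // Maps the text to a stable file name: the same text always yields the same name.
    public static String cacheFileName(String text) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(text.getBytes(StandardCharsets.UTF_8));
            // 16 digest bytes become 32 hex characters, left-padded with zeros.
            return String.format("%032x", new BigInteger(1, digest)) + ".wav";
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this should not happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

If the voice or other settings can vary, they would need to be part of the hashed string as well, otherwise two different renderings of the same text would share one cache file.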

Indeed, I use the generator class directly in my project, but I thought this was more the jME way of doing things :wink:

@zzuegg said: Indeed, I use the generator class directly in my project, but I thought this was more the jME way of doing things ;)

Hmm, well the assetloader is really that - an asset loader.
By the way, how does MARY work without the server? I had a look before but it seemed a bit… meh when it comes to portable design.

Hm, I'm not sure I understand “portable” the same way you do.

[java]
MaryInterface maryInterface = new LocalMaryInterface();
[/java]

is the only line you have to change if you want to switch from server-based generation to local generation.

@zzuegg said: Hm, I'm not sure I understand "portable" the same way you do.

[java]
MaryInterface maryInterface = new LocalMaryInterface();
[/java]

is the only line you have to change if you want to switch from server-based generation to local generation.

So the client library does have generation support…

@zzuegg said: Indeed, I use the generator class directly in my project, but I thought this was more the jME way of doing things ;)

I think it's the proper way. It's an asset file you edit while you create the game and which you want to load as audio data into your game later. You wouldn’t unpack an .ogg file with some “generator” either; you simply load the .ogg and get audio data inside the game.

@normen said: I think it's the proper way. It's an asset file you edit while you create the game and which you want to load as audio data into your game later. You wouldn't unpack an .ogg file with some "generator" either; you simply load the .ogg and get audio data inside the game.

But it’s not unpacking it, it’s generating it… This point would be valid if the files were stored somewhere. In this case, the asset key is being used as the single parameter to the generator.
By the way, if you’re going to take the path of using an asset loader, I would advise making your own AssetKey as with ModelKey:
https://code.google.com/p/jmonkeyengine/source/browse/branches/jme3/src/core/com/jme3/asset/ModelKey.java?r=6060
This way, you could set the path as the path to the voice file (or something along those lines), and have the text, speed, volume or any other variables separately. Just an idea :wink:

Hm, it is not asset loading, but it behaves similarly: you call it, it needs some time, and it returns usable data.

Having a custom key that can carry all the data makes sense.
In fact, it might even make sense to allow a cache directory to be configured, and to add a boolean that says whether a string should be cached.
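
As a rough sketch of what such a key could carry, assuming the parameters mentioned in this thread (voice, text, speed, volume, and a cache flag): `SpeechKey` is a hypothetical, engine-independent class written only for illustration. In a jME project it would presumably extend `AudioKey` or `AssetKey`, in the way `ModelKey` does.

```java
import java.util.Objects;

public final class SpeechKey {
    private final String voice;   // which voice to use
    private final String text;    // the text to speak
    private final float speed;    // playback rate multiplier
    private final float volume;   // gain, 0.0 to 1.0
    private final boolean cached; // whether the generated audio should be cached

    public SpeechKey(String voice, String text, float speed, float volume, boolean cached) {
        this.voice = Objects.requireNonNull(voice);
        this.text = Objects.requireNonNull(text);
        this.speed = speed;
        this.volume = volume;
        this.cached = cached;
    }

    public String getVoice() { return voice; }
    public String getText() { return text; }
    public float getSpeed() { return speed; }
    public float getVolume() { return volume; }
    public boolean isCached() { return cached; }

    // Two keys describe the same request only if every parameter matches,
    // so a cache keyed on this class never mixes up different renderings.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SpeechKey)) return false;
        SpeechKey other = (SpeechKey) o;
        return voice.equals(other.voice)
                && text.equals(other.text)
                && Float.compare(speed, other.speed) == 0
                && Float.compare(volume, other.volume) == 0
                && cached == other.cached;
    }

    @Override
    public int hashCode() {
        return Objects.hash(voice, text, speed, volume, cached);
    }
}
```

The cache directory is left out of the key here on purpose: it reads more like a setting of the loader or generator than a property of a single request, but that is a design choice either way.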