TextToSpeech engine

fabsterpal · August 11, 2014, 9:45pm

I think, considering how long MARY takes to generate each short line of text, it may be best if all static lines are pre-generated?
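A minimal sketch of that pre-generation idea, assuming the synthesis call can be wrapped as a function from text to raw audio bytes. `PregeneratedLines` and its methods are hypothetical names, not part of MARY or jME; the real MARY call would go inside the function passed to the constructor.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: synthesizes each static line once and keeps the result.
final class PregeneratedLines {
    private final Map<String, byte[]> cache = new LinkedHashMap<>();
    // Assumption: the slow TTS call is wrapped as text -> audio bytes.
    private final Function<String, byte[]> synthesizer;

    PregeneratedLines(Function<String, byte[]> synthesizer) {
        this.synthesizer = synthesizer;
    }

    // Synthesize all known static lines up front, skipping repeats.
    void pregenerate(List<String> staticLines) {
        for (String line : staticLines) {
            cache.computeIfAbsent(line, synthesizer);
        }
    }

    // Return cached audio; lines that were not pre-generated are synthesized on demand.
    byte[] get(String line) {
        return cache.computeIfAbsent(line, synthesizer);
    }

    // Number of distinct lines currently cached.
    int size() {
        return cache.size();
    }
}
```

Calling `pregenerate` during a loading screen would pay the synthesis cost once; later `get` calls for the same line return the cached bytes.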

normen · August 11, 2014, 10:11pm

It's exactly the same as uncompressing an .ogg file. The ogg decoder also generates PCM data from encoded information; the entropy encoding part is exactly the same principle as making voice sounds from text data.

fabsterpal · August 11, 2014, 10:24pm

@normen said: It's *exactly* the same as uncompressing an .ogg file. The ogg decoder also generates PCM data from encoded information; the entropy encoding part is exactly the same principle as making voice sounds from text data.

But doesn't the simple principle of it being an asset LOADER indicate that the final asset already exists in some form? The ogg is being decompressed and perhaps converted to a new format, in the same way as images are. This data is being calculated and created on the spot based off template data.

normen · August 11, 2014, 11:27pm

@fabsterpal said: But doesn't the simple principle of it being an asset LOADER indicate that the final asset already exists in some form? The ogg is being decompressed and perhaps converted to a new format, in the same way as images are. This data is being calculated and created on the spot based off template data.

It seems like you think ogg/mp3 compression works like zip compression or something. In fact it does exactly what you describe as “data being calculated and created on the spot based off template data”. I guess Wikipedia has some good primers on entropy encoding in the context of lossy audio compression if you want to learn more about that. In any case, using a loader is the most elegant and correct solution here in terms of the jME API, and I’m out of the discussion. Good work, zzuegg.

fabsterpal · August 12, 2014, 11:33am

@normen said: It seems like you think ogg/mp3 compression works like zip compression or something. In fact it does exactly what you describe as "data being calculated and created on the spot based off template data". I guess Wikipedia has some good primers on entropy encoding in the context of lossy audio compression if you want to learn more about that. In any case, using a loader is the most elegant and correct solution here in terms of the jME API, and I'm out of the discussion. Good work, zzuegg :)

With MP3/OGG compression, the final data still exists in some form, however. The file doesn’t contain every possible sound and then reference them.
I don’t see what’s elegant about assetManager.loadAsset("some really long string of text here which is in no way a path.tts"). I think you are just disagreeing for the sake of it because you’re upset about something that happened a year ago.

Let’s compare this to SkyFactory. It essentially does the same thing: it creates a sky node from textures that are loaded using the asset loader. You don’t do .loadAsset("Some/Path/To/Sky.png.sky").

normen · August 12, 2014, 11:45am

@fabsterpal said: With MP3/OGG compression, the final data still exists in some form, however. The file doesn't contain every possible sound and then reference them. I don't see what's elegant about assetManager.loadAsset("some really long string of text here which is in no way a path.tts"). I think you are just disagreeing for the sake of it because you're upset about something that happened a year ago.
Let’s compare this to SkyFactory. It essentially does the same thing: it creates a sky node from textures that are loaded using the asset loader. You don’t do .loadAsset("Some/Path/To/Sky.png.sky").

Well, I kind of expected that it can also load actual .tts text files which you can edit while you develop. Otherwise, if it’s only the path interpreted as the text, I’d agree. You are still wrong about mp3 though. Believe me, lossy audio compression was one of the topics of my final exam when I studied audio engineering, so…

And no, I don’t even know who you are, so why would I be upset with you? I do remember that you got pretty upset and called me names because I tried to explain to you that if you want to believe that I am antagonizing you, there’s no way for me to change that, not even by saying I’m not. But I think you didn’t understand that, just like in this case you will probably continue to believe I am out to get you.

The only reason I answered in this thread is to avoid giving zzuegg the wrong impression. I actively tried to avoid replying to you directly; still, you jump on me. So now I’m really out of this, and I WILL remember your name now to avoid getting into such situations with you again.

fabsterpal · August 12, 2014, 11:53am

@normen said: Well, I kind of expected that it can also load actual .tts text *files* which you can edit while you develop. Otherwise, if it's *only* the path interpreted as the text, I'd agree. You are still wrong about mp3 though. Believe me, lossy audio compression was one of the topics of my final exam when I studied audio engineering, so...
And no, I don’t even know who you are, so why would I be upset with you? I do remember that you got pretty upset and called me names because I tried to explain to you that if you want to believe that I am antagonizing you, there’s no way for me to change that, not even by saying I’m not. But I think you didn’t understand that, just like in this case you will probably continue to believe I am out to get you.

The only reason I answered in this thread is to avoid giving zzuegg the wrong impression. I actively tried to avoid replying to you directly; still, you jump on me. So now I’m really out of this, and I WILL remember your name now to avoid getting into such situations with you again.

That's not how I remember it. I remember you sticking by a disprovable point, me providing you with countless links disproving or nullifying that point (I can’t remember the topic exactly; it may have been newer OpenGL support), and you getting so upset by it that you started throwing thinly veiled insults at me.

Additionally, not quoting somebody but still directly responding to their point is an ineffective way of avoiding direct response when the forum subscribes you to the thread by default via email.

Fissll · August 13, 2014, 2:30pm

Ok, I finally figured out how to get this working.

The only problem I have is that I can’t change the language.
I installed the marytts-lang-de-5.0.jar and tried everything explained here, but none of it worked.

I basically want to have the option to change the language in game.
I don’t want to just have one language to use.

zzuegg · August 14, 2014, 8:13am

I'm not quite sure what you tried. In order to make German work, you also have to add a voice that supports German.

I am quite sure that the default voice does not. I used “pavoque” in my tests for German.
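To illustrate the point that each language needs its own installed voice, here is a rough sketch of an in-game language switch. `VoiceRegistry` is a hypothetical helper, not part of MaryTTS or the binding, and the voice name is a placeholder taken from this post, not necessarily the exact installed voice identifier. The looked-up name would then be handed to whatever voice setter the TTS engine offers.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical registry mapping a language code to an installed voice name.
final class VoiceRegistry {
    private final Map<String, String> voiceByLanguage = new HashMap<>();

    // Register the voice to use for a locale's language (e.g. "de").
    void register(Locale locale, String voiceName) {
        voiceByLanguage.put(locale.getLanguage(), voiceName);
    }

    // Look up the voice for a language; fail loudly if no matching voice is installed,
    // since a voice for another language will not speak the text correctly.
    String voiceFor(Locale locale) {
        String voice = voiceByLanguage.get(locale.getLanguage());
        if (voice == null) {
            throw new IllegalArgumentException(
                    "No voice installed for language: " + locale.getLanguage());
        }
        return voice;
    }
}
```

Registering `Locale.GERMAN` with "pavoque" and then asking for `Locale.GERMANY` resolves to the same voice, because only the language part of the locale is used as the lookup key.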

Empire_Phoenix · August 14, 2014, 11:56am

Probably something for the custom AssetKey then, to select a voice to use (and also a style).
And maybe some methods in the binding that return the currently supported ones in a simple way.
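A rough sketch of what such a key could carry, written as a plain class so it stands alone. `TtsKey` and its fields are hypothetical; a real version would presumably extend jME's `AssetKey`, which is left out here. The point is only that text, voice and style together identify one generated sound, so all three take part in `equals` and `hashCode`.

```java
import java.util.Objects;

// Hypothetical key: text, voice and style together identify one generated sound.
final class TtsKey {
    private final String text;
    private final String voice;
    private final String style;

    TtsKey(String text, String voice, String style) {
        this.text = Objects.requireNonNull(text, "text");
        this.voice = Objects.requireNonNull(voice, "voice");
        this.style = Objects.requireNonNull(style, "style");
    }

    String getText() { return text; }
    String getVoice() { return voice; }
    String getStyle() { return style; }

    // Two keys are equal only if text, voice and style all match,
    // so a cache never returns audio spoken in the wrong voice or style.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof TtsKey)) return false;
        TtsKey key = (TtsKey) other;
        return text.equals(key.text) && voice.equals(key.voice) && style.equals(key.style);
    }

    @Override
    public int hashCode() {
        return Objects.hash(text, voice, style);
    }
}
```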