ALAudioRenderer: code improvements

Hi everyone,

I’ve recently been working on enhancing the sound system within the demo I am currently developing. I’ve found that the current jME API appears to lack support for implementing custom AudioRenderer solutions beyond the existing lwjgl and joal implementations. This presents a significant limitation. Furthermore, extending the functionality of the current AudioRenderer seems to require modifications to the engine’s core codebase, which is challenging given the complexity of the existing source.

JmeDesktopSystem.newAudioRenderer()

// API doesn't really allow for writing custom AudioRenderer 
// implementations using AppSettings.

    @Override
    public AudioRenderer newAudioRenderer(AppSettings settings) {
        initialize(settings);

        AL al;
        ALC alc;
        EFX efx;
        if (settings.getAudioRenderer().startsWith("LWJGL")) {
            al = newObject("com.jme3.audio.lwjgl.LwjglAL");
            alc = newObject("com.jme3.audio.lwjgl.LwjglALC");
            efx = newObject("com.jme3.audio.lwjgl.LwjglEFX");
        } else if (settings.getAudioRenderer().startsWith("JOAL")) {
            al = newObject("com.jme3.audio.joal.JoalAL");
            alc = newObject("com.jme3.audio.joal.JoalALC");
            efx = newObject("com.jme3.audio.joal.JoalEFX");
        } else {
            throw new UnsupportedOperationException(
                    "Unrecognizable audio renderer specified: "
                    + settings.getAudioRenderer());
        }

        if (al == null || alc == null || efx == null) {
            return null;
        }

        return new ALAudioRenderer(al, alc, efx);
    }

LegacyApplication.initAudio()

// why put a Listener object in the LegacyApplication class?
// why does the Listener class have such a generic, non-specialized name?
// why not AudioListener?

    private void initAudio() {
        if (settings.getAudioRenderer() != null && context.getType() != Type.Headless) {
            audioRenderer = JmeSystem.newAudioRenderer(settings);
            audioRenderer.initialize();
            AudioContext.setAudioRenderer(audioRenderer);

            listener = new Listener();
            audioRenderer.setListener(listener);
        }
    }

It also became apparent that many capabilities of the OpenAL library, such as Effects and Sound Filters, have not been fully utilized in the current jME integration.

ALAudioRenderer.setEnvironment()

// ALAudioRenderer only supports ReverbEffect (alias Environment)

    @Override
    public void setEnvironment(Environment env) {
        checkDead();
        synchronized (threadLock) {
            if (audioDisabled || !supportEfx) {
                return;
            }

            efx.alEffectf(reverbFx, EFX.AL_REVERB_DENSITY, env.getDensity());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_DIFFUSION, env.getDiffusion());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_GAIN, env.getGain());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_GAINHF, env.getGainHf());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_DECAY_TIME, env.getDecayTime());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_DECAY_HFRATIO, env.getDecayHFRatio());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_REFLECTIONS_GAIN, env.getReflectGain());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_REFLECTIONS_DELAY, env.getReflectDelay());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_LATE_REVERB_GAIN, env.getLateReverbGain());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_LATE_REVERB_DELAY, env.getLateReverbDelay());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_AIR_ABSORPTION_GAINHF, env.getAirAbsorbGainHf());
            efx.alEffectf(reverbFx, EFX.AL_REVERB_ROOM_ROLLOFF_FACTOR, env.getRoomRolloffFactor());

            // attach effect to slot
            efx.alAuxiliaryEffectSloti(reverbFxSlot, EFX.AL_EFFECTSLOT_EFFECT, reverbFx);
        }
    }

ALAudioRenderer.updateFilter()

// ALAudioRenderer only supports LowPassFilter

    private void updateFilter(Filter f) {
        int id = f.getId();
        if (id == -1) {
            ib.position(0).limit(1);
            efx.alGenFilters(1, ib);
            id = ib.get(0);
            f.setId(id);

            objManager.registerObject(f);
        }

        if (f instanceof LowPassFilter) {
            LowPassFilter lpf = (LowPassFilter) f;
            efx.alFilteri(id, EFX.AL_FILTER_TYPE, EFX.AL_FILTER_LOWPASS);
            efx.alFilterf(id, EFX.AL_LOWPASS_GAIN, lpf.getVolume());
            efx.alFilterf(id, EFX.AL_LOWPASS_GAINHF, lpf.getHighFreqVolume());
        } else {
            throw new UnsupportedOperationException("Filter type unsupported: "
                    + f.getClass().getName());
        }

        f.clearUpdateNeeded();
    }

I temporarily paused development on the demo to thoroughly review the OpenAL documentation and identify untapped potential.

Based on those findings, I discovered a way to bypass the standard AudioRenderer initialization and integrate my own custom SoundManager, offering a significantly simpler and more modular API. Attached is a screenshot showcasing the functionality I have implemented so far:

    public static void main(String[] args) {
        AppSettings settings = new AppSettings(true);
        settings.setResolution(640, 480);
        settings.setAudioRenderer(null); // disable jME AudioRenderer

        Test_SoundManager app = new Test_SoundManager();
        app.setSettings(settings);
        app.setShowSettings(false);
        app.setPauseOnLostFocus(false);
        app.start();
    }
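
To give an idea of what the custom path looks like: once the jME renderer is disabled, the SoundManager simply creates its own OpenAL device and context. Below is a minimal sketch of that bootstrap, assuming the LWJGL 3 OpenAL bindings (the class and method names here are placeholders, not the final API):

import java.nio.ByteBuffer;
import java.nio.IntBuffer;

import org.lwjgl.openal.AL;
import org.lwjgl.openal.ALC;
import org.lwjgl.openal.ALC10;
import org.lwjgl.openal.ALCCapabilities;

public class SoundManagerSketch {

    private long device;
    private long context;

    public void initialize() {
        // Open the default output device and create a context on it.
        device = ALC10.alcOpenDevice((ByteBuffer) null);
        ALCCapabilities deviceCaps = ALC.createCapabilities(device);
        context = ALC10.alcCreateContext(device, (IntBuffer) null);
        ALC10.alcMakeContextCurrent(context);
        AL.createCapabilities(deviceCaps);
    }

    public void cleanup() {
        // Tear everything down in reverse order.
        ALC10.alcMakeContextCurrent(0);
        ALC10.alcDestroyContext(context);
        ALC10.alcCloseDevice(device);
    }
}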

At this stage, I am still working on integrating multichannel audio management, audio streams, and a dedicated WAV loader. However, I have successfully implemented:

  • SoundManager
  • SoundSource
  • SoundBuffer
  • SoundListener
  • AudioEffects: Chorus, Compressor, Distortion, Echo, Flanger, PitchShift, Reverb (a rough sketch of the underlying EFX calls follows this list)
  • AudioFilters: LowPassFilter & HighPassFilter
  • the ability to modify the distance attenuation model equation
  • OGGLoader
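
As a taste of what the AudioEffects wrap underneath, here is a rough sketch of creating an Echo effect and routing a source through it via an auxiliary effect slot, assuming the LWJGL 3 EXTEfx bindings (sourceId stands for an already-generated OpenAL source; the class name is a placeholder):

import org.lwjgl.openal.AL11;
import org.lwjgl.openal.EXTEfx;

public class EchoEffectSketch {

    public static void attachEcho(int sourceId) {
        // Create the effect object and configure it as an echo.
        int effect = EXTEfx.alGenEffects();
        EXTEfx.alEffecti(effect, EXTEfx.AL_EFFECT_TYPE, EXTEfx.AL_EFFECT_ECHO);
        EXTEfx.alEffectf(effect, EXTEfx.AL_ECHO_DELAY, 0.2f);
        EXTEfx.alEffectf(effect, EXTEfx.AL_ECHO_FEEDBACK, 0.5f);

        // Load the effect into an auxiliary effect slot.
        int slot = EXTEfx.alGenAuxiliaryEffectSlots();
        EXTEfx.alAuxiliaryEffectSloti(slot, EXTEfx.AL_EFFECTSLOT_EFFECT, effect);

        // Feed the source's output into the slot (aux send 0, no extra send filter).
        AL11.alSource3i(sourceId, EXTEfx.AL_AUXILIARY_SEND_FILTER, slot, 0, EXTEfx.AL_FILTER_NULL);
    }
}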

I am eager to explore the extent to which I can improve the audio system and integrate advanced features, drawing inspiration from concepts found in engines like Unity, specifically the AudioMixer and AudioMixerGroup.

Edit:
Maybe something to think about for jme4

In the meantime, here is the new look of my editor for the current jme AudioNodes:

12 Likes

I agree the name is bad.

But this is not a classic Java XxxListener. This is “the listener”. The “ears of the camera”.

2 Likes

How about PositionalAudio or SoundEmitter?
I agree that changing the name of an already established class is not a good idea, but I also agree that Listener is not a good name due to the listener design pattern. Anyway, just giving my two cents.

1 Like

Ear.

It’s not a PositionalAudio. It’s a LISTENER. It’s hearing the sound. PositionalAudio would be MAKING the sound that the “listener” is “hearing”. So it’s not a SoundEmitter, either.

It’s the “ear”… the “listener” (but “listener” is an overloaded word as already discussed).

“AudioReceiver” is probably the most accurate descriptive term… but even that I think could be confusing. AudioReceiverPosition would also work but might confuse people that “position” extends to include orientation and velocity (which in some circles is normal).

It’s always fun when a perfectly good word for a thing doesn’t work because that word is already heavily used for something else… because really, if there were no such things as the “observer pattern”, the word “Listener” would be crystal clear here. :slight_smile:

1 Like

The audio system is one of the things whose complexity I underestimated by far. After reading the specs and prototyping, my audio pipeline looks quite similar to the graphics pipeline. Besides the naming, of course, you have different “render targets” with post-processing and different per-object rendering information. Very, very similar.

Fun fact: I was playing with other engines in my exploration time. And while it is super easy in all of them to set up an in-game CCTV camera and an in-game monitor, it is super hard to get audio support for the camera/monitor setup. (In the end it is just a matter of allowing multiple listeners and transforming coordinates, taking into account the player->monitor and camera->audio source relationships.)
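
Roughly, the coordinate mapping amounts to re-expressing each source relative to the in-game camera and then re-emitting it from the monitor's frame. A sketch with jME math types (camera, monitor, and sourceWorldPos are placeholders for the relevant Spatials/positions):

    // Express the audio source in the in-game camera's local space...
    Vector3f local = camera.getWorldRotation().inverse()
            .mult(sourceWorldPos.subtract(camera.getWorldTranslation()));
    // ...then re-emit it from the monitor's frame, so the real listener hears it there.
    Vector3f virtualPos = monitor.getWorldRotation()
            .mult(local).add(monitor.getWorldTranslation());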

Hi everyone,
Here are some updates on studies done so far:

Why I Built a wav2ogg Converter Instead of a WAVLoader

Handling audio in applications, especially games, can be surprisingly tricky. When working with Java, LWJGL, and OpenAL, a common requirement is loading sound files. WAV seems like a straightforward choice, right? It’s uncompressed and widely supported. But as I delved deeper into building a robust audio loading system, I hit some roadblocks that led me to a different conclusion and a pivot in my approach.

My initial thought was to build a dedicated WAVLoader. However, after wrestling with the nuances of the WAV format and the tools available in Java, the path became clear: converting WAV files to OGG Vorbis and using LWJGL’s stb_vorbis bindings is often a far more practical and efficient solution.

Here’s why this shift makes sense:

  1. File Size: WAV files are uncompressed PCM data, meaning they are significantly larger. OGG Vorbis, on the other hand, uses lossy compression, resulting in much smaller files. This is crucial for reducing disk space and improving loading times, particularly in projects with many sound assets.
  2. Loading Simplicity: LWJGL provides excellent, relatively simple bindings for the native stb_vorbis library. Loading an OGG file into memory (typically using stb_vorbis_decode_memory) is quite direct. You get the decoded PCM data ready for OpenAL without needing to manually parse complex headers or navigate the quirks of javax.sound.sampled (see the sketch after this list).
  3. Reliability: stb_vorbis is a mature, well-tested decoder for the OGG format. Relying on it lets you bypass all the potential headaches related to the variations, internal codecs, and header issues within the WAV container format that can lead to the dreaded UnsupportedAudioFileException. If the OGG file is valid, stb_vorbis will decode it.
  4. Standard Format: OGG Vorbis is an open, well-defined standard, widely supported across platforms and libraries.
  5. OpenAL Compatibility: Regardless of whether you start with WAV or OGG, the data loaded into an OpenAL buffer is uncompressed PCM. From OpenAL’s perspective, the original source format doesn’t matter once the data is in the buffer.
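
To make point 2 concrete, here is a rough sketch of the decode-and-upload path using LWJGL 3's stb_vorbis and OpenAL bindings (the class name and error handling are mine, not part of any existing API):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.ShortBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.lwjgl.BufferUtils;
import org.lwjgl.openal.AL10;
import org.lwjgl.stb.STBVorbis;
import org.lwjgl.system.MemoryStack;

public class OggLoaderSketch {

    /** Decodes a whole OGG file to 16-bit PCM and uploads it into a new OpenAL buffer. */
    public static int load(String path) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        ByteBuffer encoded = BufferUtils.createByteBuffer(bytes.length);
        encoded.put(bytes).flip();

        try (MemoryStack stack = MemoryStack.stackPush()) {
            IntBuffer channels = stack.mallocInt(1);
            IntBuffer sampleRate = stack.mallocInt(1);

            // Decode the entire file in one call; returns interleaved 16-bit samples.
            ShortBuffer pcm = STBVorbis.stb_vorbis_decode_memory(encoded, channels, sampleRate);
            if (pcm == null) {
                throw new IOException("Failed to decode OGG: " + path);
            }

            int format = channels.get(0) == 1 ? AL10.AL_FORMAT_MONO16 : AL10.AL_FORMAT_STEREO16;
            int buffer = AL10.alGenBuffers();
            AL10.alBufferData(buffer, format, pcm, sampleRate.get(0)); // OpenAL copies the data
            // Note: pcm is allocated natively by the decoder and should be freed once uploaded.
            return buffer;
        }
    }
}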

Of course, the OGG approach isn’t without its downsides:

  • Lossy Compression: Being a lossy format, OGG involves a small, theoretical quality loss compared to the original uncompressed WAV. For most game sound effects, this is practically imperceptible, but it might be a consideration for extremely high-fidelity audio needs.
  • Preprocessing Step: It requires converting your WAV files to OGG before running your application. This adds an extra step to your asset pipeline or workflow.

The Conclusion and the Pivot:

After weighing these points, I concluded that the advantages of using OGG Vorbis with stb_vorbis for loading audio in LWJGL/OpenAL scenarios significantly outweigh the complexity and potential instability of trying to build a universally robust WAV parser in Java.

What are your thoughts on handling audio formats in your projects?

4 Likes

I use ogg for anything that is particularly long or that I want “absolutely no control over”… and wav for anything I want to be able to index into.

If the seeking behavior of an ogg solution works well, then that mitigates 99% of the problem I have with .ogg files. I want to be able to start playing at a specific time, and in my experience, .ogg makes that difficult. (Oftentimes you can go forward but not backward, or must restart the stream, etc.)

This could partially be related to JME’s current implementation but when I looked into it, I couldn’t find a way to hack in the support I wanted, either.

2 Likes

Hi guys,
I am working on a PR for JME’s ALAudioRenderer to improve its readability and maintainability by adding logs, comments, and error checking for the most relevant OpenAL operations. This will make it easier to understand what is going on under the hood in case of problems.

The current implementation has no logs or comments, making it impossible for a newbie to understand what is going on without reading the OpenAL documentation (which I’m doing).

This PR also fixes a bug where the Environment and Listener would be lost on restart of the AudioRenderer context.

I’ll notify you when it’s ready for review. Initial testing hasn’t revealed any obvious issues, and I plan to explore adding more targeted tests to the jme3-examples module.

I will try to document the API with a UML class diagram to get the big picture. Furthermore, this work opens the possibility of incorporating additional OpenAL filters—HighPassFilter and BandPassFilter—beyond the existing LowPassFilter. This is made possible by the com.jme3.audio.Filter interface, where I recently fixed a cloning bug.
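
For illustration only, the HighPassFilter branch in updateFilter() could look roughly like the fragment below. This is a sketch: it assumes the EFX interface is extended with AL_FILTER_HIGHPASS, AL_HIGHPASS_GAIN, and AL_HIGHPASS_GAINLF constants and that a HighPassFilter class with matching getters exists, neither of which is in the code quoted above.

        } else if (f instanceof HighPassFilter) {
            // Hypothetical branch mirroring the LowPassFilter handling above.
            HighPassFilter hpf = (HighPassFilter) f;
            efx.alFilteri(id, EFX.AL_FILTER_TYPE, EFX.AL_FILTER_HIGHPASS);
            efx.alFilterf(id, EFX.AL_HIGHPASS_GAIN, hpf.getVolume());
            efx.alFilterf(id, EFX.AL_HIGHPASS_GAINLF, hpf.getLowFreqVolume());
        }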

Let me know what you think and if you are open to this kind of PR.

9 Likes

Thanks for working on this long-neglected area of JME.

3 Likes

Hi all,
the jME sound system overhaul is complete! Expect significantly improved robustness, stability, documentation, and testing. I’ve addressed fragile areas and hidden bugs, and expanded the examples/tests to showcase best practices.
The updated examples and new visual tests should help you leverage all the features.
Hope you find this a solid improvement and look forward to continued collaboration.

For anyone reviewing, I recommend taking a quick look at the OpenAL documentation. Some of the function logic and design decisions might seem unusual at first glance, but they’re often dictated by the OpenAL library itself.

Here are the PRs:

UML Class Diagram

Cheers!

10 Likes

Thanks for working on this @capdevon !

I will leave your big PR improving the audio system (ALAudioRenderer: code improvements by capdevon · Pull Request #2423 · jMonkeyEngine/jmonkeyengine) open for a while since it could use the most review. Then I’ll aim to merge it in about a week or two, at which point I’ll also release the first alpha version of 3.9 so all of your changes can start being officially tested.

But I plan to merge all of your other PRs updating javadocs and many of the tests in jme-examples in ~3 days from now. I’ve reviewed most of them and everything looks okay but let me know if you want me to wait longer on any of those.

1 Like

Hi @yaRnMcDonuts ,
thanks for your collaboration and for speeding up the release process. The PRs are complete, and I agree with your planning. If any other improvements come to mind, I’ll let you know in good time. :wink:

My goal was to make this part of the engine clearer, more robust, and easier to understand for future development. For this reason, the ALAudioRenderer class doesn’t contain substantial changes to the previous logic; it does the same things as before, but in a clearer way. I think it’s better to implement further evolutions with subsequent dedicated PRs.


Legacy OpenAL Context Issue

However, there are still some unresolved legacy issues. The current behavior of destroying and recreating the OpenAL context when the audio output device disconnects (for example, headphones) isn’t the best solution. I’ve written another version of the ALAudioRenderer class, inspired by the libgdx library, and I’ve verified on PC using the lwjgl3 libraries that the libgdx version effectively solves the problem by switching between the available output devices instead of restarting everything. Due to the complicated API of jME’s AudioRenderer and the shared (perhaps not optimal) interfaces between the core, lwjgl2, lwjgl3, android, and iOS modules (see the UML diagram and the AL, ALC, and EFX classes), I preferred not to make changes to this aspect, leaving things as they are. To be revised with a future PR…
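
For reference, one way to switch output devices without tearing the context down is the ALC_SOFT_reopen_device extension. A minimal sketch of the idea, assuming the LWJGL 3 bindings and an already-open device handle (the class and method names are mine; real code should also check that the extension is actually available):

import java.nio.ByteBuffer;
import java.nio.IntBuffer;

import org.lwjgl.openal.ALC10;
import org.lwjgl.openal.EXTDisconnect;
import org.lwjgl.openal.SOFTReopenDevice;

public class DeviceRecoverySketch {

    /** If the current device reports as disconnected, re-route the existing context to the default device. */
    public static boolean recoverIfDisconnected(long device) {
        int connected = ALC10.alcGetInteger(device, EXTDisconnect.ALC_CONNECTED);
        if (connected != ALC10.ALC_FALSE) {
            return true; // still connected, nothing to do
        }
        // A null specifier means "reopen on the current default device"; buffers and sources survive.
        return SOFTReopenDevice.alcReopenDeviceSOFT(device, (ByteBuffer) null, (IntBuffer) null);
    }
}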

Potential Audio Enhancements

We could introduce additional Filters (HighPass & BandPass), configuration parameters to the AudioNode (see rolloffFactor and coneOuterGain), and also the possibility to modify the sound attenuation model. I’m currently verifying how useful these extra parameters and filters might be and whether they actually add further nuances to the sound.

import org.lwjgl.openal.AL10;
import org.lwjgl.openal.AL11;

/**
 * Enum for OpenAL attenuation models:
 * The default distance model in OpenAL is AL_INVERSE_DISTANCE_CLAMPED.
 * 
 * @author capdevon
 */
public enum AttenuationModel {

    NONE(AL10.AL_NONE),
    INVERSE_DISTANCE(AL10.AL_INVERSE_DISTANCE),
    INVERSE_DISTANCE_CLAMPED(AL10.AL_INVERSE_DISTANCE_CLAMPED),
    LINEAR_DISTANCE(AL11.AL_LINEAR_DISTANCE),
    LINEAR_DISTANCE_CLAMPED(AL11.AL_LINEAR_DISTANCE_CLAMPED),
    EXPONENT_DISTANCE(AL11.AL_EXPONENT_DISTANCE),
    EXPONENT_DISTANCE_CLAMPED(AL11.AL_EXPONENT_DISTANCE_CLAMPED);

    private final int model;

    AttenuationModel(int model) {
        this.model = model;
    }

    public int getModel() {
        return model;
    }
}
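
Applying the selected model is then a single OpenAL state call, for example:

    // Switches the global attenuation equation; affects all sources on the current context.
    AL10.alDistanceModel(AttenuationModel.LINEAR_DISTANCE_CLAMPED.getModel());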

Listener Encapsulation (Philosophical)

I’m a bit perplexed about the choice of having a Listener variable in the LegacyApplication class. In my opinion, for better encapsulation, the Listener should be present and accessible only through the ALAudioRenderer class. Although this is likely something that can no longer be changed, it remains a philosophical point for me.

see LegacyApplication

see ALAudioRenderer

AudioNode Design Concerns

Another aspect that leaves me puzzled is the AudioNode class. Why mix the graphical properties of a scene graph element with those of an audio component? And why is it possible to set an AudioData to an AudioNode only once? The AudioNode implements the AudioSource interface, and as such, it serves to play the sound data contained in the buffer of an AudioData. In theory, it would be possible to use a single AudioNode/AudioSource object and switch between different AudioData when you need it and the sound is not playing, without duplicating the same variables in memory each time (see volume, pitch, maxDistance, refDistance, flags, etc.).

see AudioNode.setAudioData(AudioData audioData, AudioKey audioKey)

AudioNode as AbstractControl (Unity Inspired)

I’ve transformed the AudioNode class into an AudioSourceControl associated with a simple Node. I’m studying various hypotheses and doing some tests inspired by Unity. At the moment, all these technical and functional improvements work correctly and are applicable in jME.

e.g.

        Node node = new Node("AudioNode");
        AudioSourceControl audioSource = new AudioSourceControl();
        audioSource.setLooping(true);
        audioSource.setPositional(true);
        audioSource.setMaxDistance(100);
        audioSource.setRefDistance(5);
        node.addControl(audioSource);

        AudioData data = loadAudioData("Sound/Effects/Foot steps.ogg", AudioData.DataType.Buffer);
        audioSource.setAudioData(data);
        audioSource.play();

        rootNode.attachChild(node);


    private AudioData loadAudioData(String filepath, AudioData.DataType type) {
        boolean stream = (type == AudioData.DataType.Stream);
        boolean streamCache = true;
        AudioKey audioKey = new AudioKey(filepath, stream, streamCache);
        AudioData data = assetManager.loadAsset(audioKey);
        return data;
    }

public class AudioSourceControl extends AbstractControl implements AudioSource {

    private boolean loop = false;
    private float volume = 1;
    private float pitch = 1;
    private float rolloffFactor = 1.0f; // Standard attenuation factor ** NEW **
    private float timeOffset = 0;
    private AudioKey audioKey;
    private AudioData data = null;
    private volatile AudioSource.Status status = AudioSource.Status.Stopped;
    private volatile int channel = -1;
    private boolean reverbEnabled = false;
    private Filter reverbFilter;
    private Filter dryFilter;
    private boolean positional = true;
    private float maxDistance = 200; // 200 meters
    private float refDistance = 10; // 10 meters
    private boolean directional = false;
    private Vector3f direction = new Vector3f(0, 0, 1);
    private float innerAngle = 360; // (fully open)
    private float outerAngle = 360; // (fully open)
    private float coneOuterGain = 0.0f; // Gain outside outer cone (silent) ** NEW **
    private final Vector3f previousWorldTranslation = Vector3f.NAN.clone();
    private Vector3f velocity = new Vector3f();
    private boolean velocityFromTranslation = false;
    private float lastTpf;

...

}
3 Likes

You probably do not want volume shared between different audio files; usually the audio data is normalized, and the resulting raw volume depends on the audio file. I guess that using AudioNode as a Savable to store this information was the best option. (At least that’s what I am doing as a complete newbie when it comes to anything audio related.)

I am not sure whether reusing the AudioNode for different audio files might not complicate the management code in my game. I would have to check whether something is currently playing and, if so, duplicate the AudioNode anyway, and then clean up afterwards.

1 Like

Hi @zzuegg ,
thanks for the feedback! I came across this video that sparked an idea about reusing the same AudioSource with different AudioData. I’m now trying to wrap my head around whether this approach would even play nice with jME’s current audio system architecture.

youtube - randomized footsteps
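
Roughly what I have in mind, reusing a single source with several clips (a sketch built on the hypothetical AudioSourceControl shown above):

    private final List<AudioData> footstepClips = new ArrayList<>();
    private final Random random = new Random();

    public void playRandomFootstep(AudioSourceControl audioSource) {
        // Only swap clips while the source is idle, then pick one at random.
        if (audioSource.getStatus() != AudioSource.Status.Playing) {
            AudioData clip = footstepClips.get(random.nextInt(footstepClips.size()));
            audioSource.setAudioData(clip);
            audioSource.setPitch(0.9f + random.nextFloat() * 0.2f); // small pitch variation
            audioSource.play();
        }
    }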

Personally, I still think there’s merit in keeping things separate with a Node + AudioSourceControl setup instead of the monolithic AudioNode. Both are Savable, so once linked, you could save them just like an AudioNode without losing any save/load capabilities. Plus, having that clear separation of concerns could be a win down the line.

It’s strange to be able to execute this instruction on an object you are creating in order to play a sound:

AudioNode audioNode = ...;
audioNode.setShadowMode(ShadowMode.CastAndReceive);

I’m not planning on pushing a PR for a brand-new AudioSourceControl or anything, since everyone’s got their own way of doing things. But I was really keen to throw this concept out there, see what you all think, and maybe hash out some pros and cons together.

Yeah, I agree that the abstraction hierarchy is strange. I would actually prefer:
Spatial > Node > Geometry
Spatial > AudioNode (or AudioSource)

Additionally, for my liking, Spatial has too much responsibility for stuff that only matters to graphics-type spatials, as in your example with shadow mode.

I like the control-based approach too; composition over inheritance usually pays off. API-wise, I suspect Spatial is missing a getter to access the second AudioSourceControl (in case you have more than one).

In the big picture, the main decision is whether jME wants to move to more of a component-based scene graph. (IMHO this idea is a step in that direction if we eliminate the ‘no logic’ rule from an ECS.)

The word “component” means multiple things, and we should absolutely NOT conflate them. And a scene graph is absolutely 10000000000000000000% no place for ECS components.

Component meaning a “bit of something”… sure. Composition over inheritance. JME chose to call this “controls”. A “composition-based scene graph” is a good idea and would have prevented us from having half a dozen special spatials like LightNode, CameraNode, etc… just use controls.

Component meaning “a data only thing in an ECS” is a very specific thing, though… and has no real place in this discussion, in my opinion. It’s confusing enough as it is.

2 Likes

Agreed. I will use control.

2 Likes