Enhancements to the JME3 Rendering Engine

oxplay2 · October 10, 2023, 12:29pm

That's cool.
I was always afraid to use too many lights in JME before ;p

I understand the general idea of tile-based rendering, though it's still very hard for me to follow just from reading the code.

What about shadows? I believe shadows and lights were the main weaknesses in JME, not counting the OpenGL-only backend. Do you plan to touch them too? Or will they be affected by this as well?

If it's possible to remove the main weaknesses of JME, that would be awesome.

JhonKkk · October 10, 2023, 12:31pm

There are two fallback rendering paths I didn’t port: Forward+ and ClusterDeferredShading. I think Tile-BasedDeferredShading should be enough to handle most scenes with a large number of lights, so I have no plan to port ClusterDeferredShading. For Forward+, it may be in my subsequent plans, mainly for mobile (but mobile usually has better solutions, i.e. implementing DeferredShading based on on-chip caches and MRT).

JhonKkk · October 10, 2023, 12:36pm

Currently, shadows can only be used for a small number of lights, even though deferred rendering allows 5000 PointLights. In practice you can only enable shadows for a few main lights, because each shadow-casting light has to render its own ShadowMap separately. However, I noticed two techniques for mass shadowing from many lights, one being Virtual Shadow Maps (VSM) in UE5, the other from: https://www.researchgate.net/publication/304344770_Fast_Many-Lights_Rendering_with_Temporal_Shadow_Maps
But I’m currently planning to prioritize implementing the GI part (LightProbeVolume and light propagation volumes), so for now you can only enable post-process shadows for the main lights.

zzuegg · October 10, 2023, 4:20pm

I get an assertion error when using your master.

java.lang.AssertionError
	at com.jme3.renderer.framegraph.FrameGraph.reset(FrameGraph.java:106)
	at com.jme3.renderer.RenderManager.renderViewPort(RenderManager.java:1312)
	at com.jme3.renderer.RenderManager.render(RenderManager.java:1537)
	at com.jme3.app.SimpleApplication.update(SimpleApplication.java:283)
	at com.jme3.system.lwjgl.LwjglAbstractDisplay.runLoop(LwjglAbstractDisplay.java:160)
	at com.jme3.system.lwjgl.LwjglDisplay.runLoop(LwjglDisplay.java:225)
	at com.jme3.system.lwjgl.LwjglAbstractDisplay.run(LwjglAbstractDisplay.java:242)
	at java.base/java.lang.Thread.run(Thread.java:833)

JhonKkk · October 11, 2023, 2:31am

Thank you for testing, I noticed this is a low level error, and have fixed it now. (I’m on the other side of the earth, so I just woke up…, sorry for the late reply)

zzuegg · October 11, 2023, 5:41am

No need to hurry, or even to reply fast. Take your time to fix things.
We all have lives out there. I'm going to test it after work.

zzuegg · October 11, 2023, 8:22am

Hi,

There seems to be an issue with your tiled light list generation algorithm. I guess the top tiles are not filled correctly, see the following screenshots:

Also, I noticed some flickering of lights when moving the camera, maybe due to the same bug.

Lastly, performance-wise there seems to be quite an issue. I have not yet had time to investigate with the GPU profiler; I will if the API is stable enough to be considered for merging into core. I assume it has to do with the way you manage the lights. But as I said, that is just an unproven thought.

In theory, your tiled deferred renderer should outperform illuminas by far. I'm not sure whether this is a representative scene, since everything is static and there is not a lot of overdraw, so it might favor a basic deferred renderer.
Anyway, here are the results of your TestTiled* vs illuminas from GitHub, on nearly the same scene. Illuminas uses the PBR pipeline because the BlinnPhong path is bugged, and since I am currently using a new implementation, fixing the GitHub version is far down on my todo list.

I have removed the postprocessing in your example to match the output as closely as I could. We are talking about 6.5 ms vs 1.72 ms, which is quite a big deal. (In illuminas I can enable shadow casting on 10% of the lights and still render faster.)

I am fairly sure we can track the issue down quite fast since a 5ms loss should be easy to spot in the profiler.

JhonKkk · October 11, 2023, 9:34am

Thank you for taking the time to compare. Regarding your doubts, I think I can answer some of them for now.

I’m using Minimize Visibility Culling for light collection, so when the camera is far from the lights, there are some precision errors. However, when the camera is close to the lights, the top tiles should be collected correctly.
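For readers following along, here is a minimal CPU-side sketch of what per-tile light collection generally looks like. It is not the code from the PR, and it does not implement the culling method named above: it simply bins each light's screen-space bounding square into tiles. The class and method names (`TileBinningSketch`, `binLights`, `ScreenLight`) are made up for illustration, and the lights are assumed to be already projected to pixel coordinates with a conservative pixel radius.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical CPU-side tile binning; an illustration, not the PR's implementation. */
final class TileBinningSketch {

    /** A light already projected to screen space (assumption): center and radius in pixels. */
    static final class ScreenLight {
        final float x, y, radius;
        ScreenLight(float x, float y, float radius) {
            this.x = x;
            this.y = y;
            this.radius = radius;
        }
    }

    /** Returns, per tile in row-major order, the indices of the lights overlapping that tile. */
    static List<List<Integer>> binLights(List<ScreenLight> lights,
                                         int width, int height, int tileSize) {
        // Round up so partially covered tiles at the right/bottom edge still exist.
        int tilesX = (width + tileSize - 1) / tileSize;
        int tilesY = (height + tileSize - 1) / tileSize;
        List<List<Integer>> tiles = new ArrayList<>();
        for (int i = 0; i < tilesX * tilesY; i++) {
            tiles.add(new ArrayList<>());
        }
        for (int i = 0; i < lights.size(); i++) {
            ScreenLight l = lights.get(i);
            // Skip lights whose bounding square lies completely off screen.
            if (l.x + l.radius < 0 || l.y + l.radius < 0
                    || l.x - l.radius >= width || l.y - l.radius >= height) {
                continue;
            }
            int minX = clamp((int) Math.floor((l.x - l.radius) / tileSize), 0, tilesX - 1);
            int maxX = clamp((int) Math.floor((l.x + l.radius) / tileSize), 0, tilesX - 1);
            int minY = clamp((int) Math.floor((l.y - l.radius) / tileSize), 0, tilesY - 1);
            int maxY = clamp((int) Math.floor((l.y + l.radius) / tileSize), 0, tilesY - 1);
            for (int ty = minY; ty <= maxY; ty++) {
                for (int tx = minX; tx <= maxX; tx++) {
                    tiles.get(ty * tilesX + tx).add(i);
                }
            }
        }
        return tiles;
    }

    private static int clamp(int v, int lo, int hi) {
        return Math.max(lo, Math.min(hi, v));
    }
}
```

A bounding square is conservative: it may assign a light to a tile that its sphere does not actually touch, but it should never miss a tile as long as the projected radius itself is not underestimated.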

The flickering is likely caused by dynamic texture recreation; however, I remember that in my last submission I made each texture recreation grow the texture size by 1.5x.

For this last issue, I should know where the problem lies, because I am using 2 dynamic textures to store tiles, and 3 textures to store lightData. All this data is fully updated every frame. Currently, I have not fully optimized these parts, but instead prioritized completing the overall framework first. I guess you could optimize this, because my main plans going forward are to test the compatibility of the existing code, and implement the GI portion.

zzuegg · October 11, 2023, 10:00am

Did you implement the texture-based system to overcome a jme limitation? If so, since the scope of this change is already super big, we should work on those limitations as well. I guess it's the lack of SSBO support that requires all of this. Most implementations I have seen, and the one I have implemented, use an SSBO for the global light list and a linked-list SSBO for each tile, in combination with an AtomicInteger texture that maps each tile to its first list item.

It offers the possibility to offload the generation to the gpu. Also most of the code stays the same when switching to clustered shading.
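To make the data layout described above a bit more concrete, here is a small CPU-side model of it in Java. This is only a sketch with hypothetical names (`TileLightLinkedList`, `add`, `lightsOf`); on the GPU the node arrays would live in an SSBO, the head array would be the integer texture, and the counter would be an atomic counter, none of which this snippet touches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerArray;

/** CPU-side model of a per-tile linked list of lights; all names are hypothetical. */
final class TileLightLinkedList {
    private static final int END = -1; // end-of-list marker

    // Stands in for the integer "head pointer" texture: one entry per tile.
    private final AtomicIntegerArray heads;
    // Stand in for the node SSBO: each node stores a light index and a next pointer.
    private final int[] nodeLight;
    private final int[] nodeNext;
    // Stands in for the atomic counter used to allocate nodes.
    private final AtomicInteger nodeCounter = new AtomicInteger();

    TileLightLinkedList(int tileCount, int maxNodes) {
        heads = new AtomicIntegerArray(tileCount);
        for (int i = 0; i < tileCount; i++) {
            heads.set(i, END);
        }
        nodeLight = new int[maxNodes];
        nodeNext = new int[maxNodes];
    }

    /** Links a light into a tile's list; returns false when the node buffer is full. */
    boolean add(int tile, int lightIndex) {
        int node = nodeCounter.getAndIncrement(); // allocate the next free node
        if (node >= nodeLight.length) {
            return false;
        }
        nodeLight[node] = lightIndex;
        // Atomic exchange: the new node becomes the head and points at the old head.
        nodeNext[node] = heads.getAndSet(tile, node);
        return true;
    }

    /** Walks a tile's list, most recently added light first. */
    List<Integer> lightsOf(int tile) {
        List<Integer> out = new ArrayList<>();
        for (int n = heads.get(tile); n != END; n = nodeNext[n]) {
            out.add(nodeLight[n]);
        }
        return out;
    }
}
```

The shading pass would do the equivalent of `lightsOf` per pixel: read the tile's head, then follow the next pointers until it hits the end marker. The node buffer size is the one thing that has to be budgeted up front.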

There is already a UBO/SSBO pull request from Riccardo waiting…

But yeah, work on the GI, since I am curious to see that. It is one thing I have never implemented myself up to now, and it has been on my engine tech todo list for ages.

JhonKkk · October 11, 2023, 10:35am

I tested all ShadowFilter shadow types; the rendering paths now support them.

JhonKkk · October 11, 2023, 10:42am

Yes, and I think the way it’s currently being used is not efficient enough, on either the CPU or the GPU side. I rebuilt and updated the dynamic texture data in the following way:

Also, in order to be fully compatible with existing stuff, I retained all the effect data calculations in the material (diffuse, specular, ao, lightmap, reflectionVec… basically all the data from the existing lighting/pbrlighting packed together). I think being compatible with all effects in one shading is more important than prioritizing performance. This could perhaps also become an optimization option to be addressed later.

JhonKkk · October 11, 2023, 10:53am

Moreover, most importantly, I update all the texture data every frame instead of caching the light source list or storing it only once. This is where the main performance overhead lies (both GPU and CPU). However, even without fully optimizing the code here, it should be able to meet and be compatible with most existing jme3.7 projects.
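As a rough illustration of the caching idea mentioned above (this is a sketch, not code from the PR), the packed light data could be compared against what was last uploaded and the texture update skipped when nothing changed. `LightDataCache` and `uploadIfChanged` are hypothetical names, and the actual texture update is only marked by a comment.

```java
import java.util.Arrays;

/** Hypothetical helper that skips re-uploading light data when it has not changed. */
final class LightDataCache {
    private float[] lastUploaded = new float[0];
    private int uploads = 0;

    /** Returns true when the data differed from the last upload and was uploaded again. */
    boolean uploadIfChanged(float[] packedLights) {
        if (Arrays.equals(lastUploaded, packedLights)) {
            return false; // same contents as last frame: keep the existing texture
        }
        lastUploaded = packedLights.clone(); // defensive copy, the caller may reuse its array
        uploads++;
        // A real renderer would update the light data texture here.
        return true;
    }

    int uploadCount() {
        return uploads;
    }
}
```

Comparing arrays every frame still costs CPU time, so a dirty flag set whenever a light changes would probably be cheaper. Also, this would only help the light data: the tile lists depend on the camera, so they would likely still need updating whenever the view changes.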

I think anyone with time and experience (like you @zzuegg ) can go and optimize this part of the code, because I wanted to focus on fixing existing bugs and adding new features.

zzuegg · October 11, 2023, 11:13am

Well, it is far easier to implement a tech demo, or an implementation that works for me, than an implementation that works for everyone.

JhonKkk · October 11, 2023, 11:20am

I think it would be meaningful to have someone with graphics knowledge like you to participate in enhancing the jme3 renderer. I look forward to your work on it (of course I know everyone has their own job and life).

zzuegg · October 11, 2023, 11:47am

I lack many of the skills a project of this scope requires. It just turns out that graphics tech is a subject that interests me, and I have spent a lot of time implementing various techniques.

On all other sides of software development I suck big time!

JhonKkk · October 11, 2023, 1:22pm

Tested implementing corresponding shading models for different materials in deferred rendering:

oxplay2 · October 16, 2023, 11:10am

Hello Jhon,

I noticed there is a topic about pull requests and code review.

I'm not sure if it's related, but I believe it might be helpful for your pull requests.

JhonKkk · October 16, 2023, 2:50pm

This is the realistic skin shading model (subsurface scattering for skin) I recently implemented. However, it requires multiple passes, which may be necessary for realistic AAA games. For cheap mobile platforms, I also implemented a subsurface shading technique based on approximated skin shading.

The images below show a comparison of the effects with realistic skin subsurface scattering turned off and on:

JhonKkk · October 16, 2023, 2:51pm

Thank you, I will read the information you provided later.

JhonKkk · October 16, 2023, 3:22pm

Hi guys @RiccardoBlb @sgold @pspeed @Ali_RS (I might have forgotten to @ others; if so, please let me know, thanks!), before moving forward with this PR (https://github.com/jMonkeyEngine/jmonkeyengine/pull/2090), I think we should discuss the following questions:
I think the current PR, as a 1.0 version containing framegraph, renderPath, and shadingModel, should be built on compatibility with the existing architecture and on stable operation. So this PR is not perfect code, as @zzuegg's earlier comparison showed. However, let’s have the 1.0 version first, then add optimization PRs and ongoing fix PRs based on continuous feedback.
Additionally, should I also include the global illumination content as part of this PR (i.e. as part of a 1.0 version of “enhancing renderer capabilities”)?

Another possibility is that I keep fixing bugs and optimizing the code in this PR until everyone thinks it is ready to merge into the core. In that case the PR may take longer. Also, before it is merged into the core, how can other jme3 users test it and give me feedback to fix (should they pull the branch separately)?