FrameGraph API

Over the last several weeks, I’ve been working on improving @JhonKkk’s FrameGraph API. That project turned into a complete rewrite, which turned into another rewrite, so I guess this would be “JME FrameGraph, version 3.” Hopefully this one will make it into the core engine.

This API is based on the GDC presentation linked below, and on @JhonKkk’s work.

The goal is to provide a modular rendering architecture and a render resource manager to minimize resource creation and binding. While @JhonKkk’s implementation worked, it did not provide a clean, reusable API. Therefore, I’ve reworked virtually all of the related Java code and API. I am finally ready to present the work I’ve done to the community for review, so that hopefully this will get integrated into the core engine.

How the FrameGraph works

The FrameGraph, put simply, is an object that renders a ViewPort according to a list of passes. The FrameGraph can be thought of as a node editor (like Blender’s shader editor), where each pass is a node, and passes, like nodes, can share resources with each other. The tricky part is managing the resources that move between passes as efficiently as possible.

To handle resources, I’ve implemented a “lazy” resource manager, which attempts to minimize the number of existing resources by reusing resources where possible. It also includes a “reservation” protocol that allows passes to snag the exact resource they used last frame to minimize the number of texture binds that occur.
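
As a rough illustration of the reservation idea (a toy model only; `ToyTexture`, `ToyReservingPool`, and the pass names are hypothetical and are not classes from the PR), a pool can remember which pass reserved which texture and hand that same texture back to the same pass on the next frame, even if passes ask in a different order. In this sketch reservations persist until overwritten, which may differ from how the real resource manager expires them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of the reservation idea; these names are not from the PR.
class ToyTexture {
    final int id;
    boolean inUse = false;
    ToyTexture(int id) { this.id = id; }
}

class ToyReservingPool {
    private final List<ToyTexture> textures = new ArrayList<>();
    // Maps a texture id to the name of the pass that reserved it.
    private final Map<Integer, String> reservations = new HashMap<>();

    // A pass asks to get this same texture back the next time it acquires.
    void reserve(String pass, ToyTexture tex) {
        reservations.put(tex.id, pass);
    }

    ToyTexture acquire(String pass) {
        // 1. Prefer a free texture that this pass reserved earlier.
        for (ToyTexture t : textures) {
            if (!t.inUse && pass.equals(reservations.get(t.id))) {
                return take(t);
            }
        }
        // 2. Otherwise reuse any free texture that nobody has reserved.
        for (ToyTexture t : textures) {
            if (!t.inUse && !reservations.containsKey(t.id)) {
                return take(t);
            }
        }
        // 3. Only create a new texture as a last resort.
        ToyTexture t = new ToyTexture(textures.size());
        textures.add(t);
        return take(t);
    }

    private ToyTexture take(ToyTexture t) {
        t.inUse = true;
        return t;
    }

    // Called at the end of a frame: every texture becomes available again.
    void releaseAll() {
        for (ToyTexture t : textures) {
            t.inUse = false;
        }
    }

    int size() { return textures.size(); }

    public static void main(String[] args) {
        ToyReservingPool pool = new ToyReservingPool();
        // Frame 1: each pass acquires a texture and reserves it.
        ToyTexture a = pool.acquire("PassA");
        pool.reserve("PassA", a);
        ToyTexture b = pool.acquire("PassB");
        pool.reserve("PassB", b);
        pool.releaseAll();
        // Frame 2: PassB asks first but still gets its own texture back.
        System.out.println(pool.acquire("PassB") == b);
        System.out.println(pool.acquire("PassA") == a);
    }
}
```

Without step 1, the first pass to ask in frame 2 would simply receive the first free texture, so a pass could end up with a different texture every frame and have to rebind it.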

Rendering a ViewPort with a FrameGraph has 4 stages:

  1. Resource Declaration. All passes are expected to declare whatever resources they plan to use for rendering ahead of time, so that the resource manager can plan out which resources are allocated where and when.

  2. Pass Culling. Any passes which produce resources that are not then referenced by another pass are culled. This process is recursive, so once a pass is culled, whatever pass it was getting resources from might be culled as well.

  3. Execution. During this step, all unculled passes are expected to acquire whatever resources are necessary, do whatever is needed (like rendering), and release all declared or referenced resources.

  4. Reset. All resource pointers are destroyed (not the resources themselves), and passes do whatever cleanup is necessary.

Any resource that goes unused for the entire rendering process across all ViewPorts is deleted.
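
The culling stage can be pictured with a small self-contained model. This is only a sketch under assumptions: `ToyPass`, `ToyResource`, and `ToyCuller` are hypothetical names rather than classes from the PR, and the `alwaysRuns` flag is an invented stand-in for passes that must never be culled (for example, one that writes to the screen). The sketch counts readers per resource and recursively culls any pass whose outputs nobody reads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical toy model of pass culling; none of these names come from the PR.
class ToyResource {
    final ToyPass producer;
    int refs = 0; // number of unculled passes that read this resource
    ToyResource(ToyPass producer) { this.producer = producer; }
}

class ToyPass {
    final String name;
    final boolean alwaysRuns; // assumption: marks passes that are never culled
    final List<ToyResource> inputs = new ArrayList<>();
    final List<ToyResource> outputs = new ArrayList<>();
    boolean culled = false;

    ToyPass(String name, boolean alwaysRuns) {
        this.name = name;
        this.alwaysRuns = alwaysRuns;
    }

    // Declare a new output resource produced by this pass.
    ToyResource declare() {
        ToyResource r = new ToyResource(this);
        outputs.add(r);
        return r;
    }

    // Reference a resource produced by another pass.
    void reference(ToyResource r) {
        inputs.add(r);
        r.refs++;
    }

    // A pass is unused when none of its outputs has a reader.
    boolean isUnused() {
        if (alwaysRuns) return false;
        for (ToyResource r : outputs) {
            if (r.refs > 0) return false;
        }
        return true;
    }
}

class ToyCuller {
    static void cull(List<ToyPass> passes) {
        Deque<ToyPass> pending = new ArrayDeque<>();
        for (ToyPass p : passes) {
            if (p.isUnused()) pending.push(p);
        }
        while (!pending.isEmpty()) {
            ToyPass p = pending.pop();
            if (p.culled) continue;
            p.culled = true;
            // Culling a pass removes its reads, which may orphan its producers.
            for (ToyResource in : p.inputs) {
                in.refs--;
                ToyPass producer = in.producer;
                if (!producer.culled && producer.isUnused()) pending.push(producer);
            }
        }
    }

    public static void main(String[] args) {
        ToyPass geometry = new ToyPass("Geometry", false);
        ToyPass downsample = new ToyPass("Downsample", false);
        ToyPass blur = new ToyPass("Blur", false);
        ToyPass output = new ToyPass("Output", true);

        ToyResource color = geometry.declare();
        downsample.reference(color);
        ToyResource small = downsample.declare();
        blur.reference(small);
        blur.declare(); // nobody reads the blurred result
        output.reference(color);

        List<ToyPass> passes = Arrays.asList(geometry, downsample, blur, output);
        cull(passes);
        for (ToyPass p : passes) {
            System.out.println(p.name + (p.culled ? " culled" : " kept"));
        }
    }
}
```

In this example Blur is culled because nothing reads its output, which in turn leaves Downsample without readers, so it is culled as well. Geometry survives because Output still reads its color resource.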

Here is the PR for this feature:

I would greatly appreciate any advice and testing. Thank you all for your support!

I am not a big fan of enums for things that are meant to be customizable.

enum RenderPath //in RenderManager.java

for example.

Actually, that enum could probably be removed right now, as I’m not using it. It was left over from JhonKkk’s code.

Nice. From a quick look over the code, I was not sure how much of JhonKkk’s code is still used.

Most of his shader code is still in use, but his Java code is rapidly approaching zero at this point.

Thank you for taking on the challenge of creating version 3 of the FrameGraph enhancement initiative.
So, if I understand correctly, there are two parts to this feature. The first is the “framegraph framework that supports 3 render paths,” which is what JhonKkk did to improve JME’s FrameGraph.
The second is the API on top of that, which the community asked to have improved before merging the entire feature.
So if I’m reading this right, you modified both parts to meet the community’s requirements (in the API part), and while doing so you also changed and improved large parts of JhonKkk’s work.
This is really exciting, because now we have new ownership for this feature! Did you try integrating it with v3.7? I’m curious to see whether we have major conflicts with existing code.

You’re welcome, glad to do it. :smiley:

I have not tried integrating with 3.7. I had assumed this feature would be integrated in the next engine release past 3.7, since we are now in feature freeze. Also, a feature as large as this will need a good deal of testing and bug fixing, so it will probably be a while longer before this is ready for release.

For merge conflicts, unless another PR has been messing around with rendering internals as well, I believe there will be very few, as the overwhelming majority of the changes are on new files.

Yes, of course it will need to be integrated in a future release. I was just curious about conflicts. Actually, the way we work here, there is not much difference between updating from v3.7 and updating from master, so whenever you update from master we will know about any conflicts.
I think Riccardo contributed some rendering code to v3.7 (a few thousand lines, actually, so quite big), so conflicts might occur.

Hi @codex.

I recently wanted to test this PR but ran into a small problem: I cannot run certain example classes because they reference classes that do not exist.

By the way, I downloaded the full project (zip) from GitHub. Could I be doing something I shouldn’t? :thinking:

Hi Swiftwolf, thanks for trying it out!

No, you’re fine. There are several tests that use API that has been deleted, and I haven’t gotten around to fixing those tests up yet. The primary test class I am using is TestShadingModule.java, which should work on the latest commit that I uploaded a few minutes ago.

I understand, thanks for clarifying. As soon as I have a little more time, I would like to do more testing with this new API. :yum:

Progress Update

  • FrameGraphs can more easily be constructed in code.
  • Tickets can directly reference each other, so ticket getters/setters are no longer necessary.
  • Tickets are now created once during initialization, instead of at every render setup.
  • Attribute passes allow game logic to communicate with the framegraph.
  • Junction passes allow game logic to alter the layout of the framegraph.
  • All tests under jme3test.renderpath now pass (with the exception of the Android tests, which I haven’t tried).
  • FrameGraphs can be loaded from .j3g binary files using the asset manager.
  • Three simple framegraphs have been added under Common/FrameGraphs/ in core.
  • ViewPorts can opt-out of using framegraphs, even if the default framegraph is set (perhaps useful for GUIs).
  • FrameGraph rendering can be profiled.
  • Javadoc and licenses have been added.

Issues

  • Having two framegraph-driven viewports overlaid can often be problematic, depending on how the framegraphs are set up.
  • Deferred and tiled deferred pipelines render the background black under certain unknown circumstances.
  • Lighting looks “messy” with deferred and tiled deferred pipelines. See below.

After taking care of a number of crippling bugs, I am pleased to announce that the PR is out of draft status, and is ready to be reviewed and tested! As mentioned before, Javadoc is available, especially for the important bits.

For those who don’t want to dive into the nitty-gritty, FrameGraphs can be tested by loading them from framegraph files (j3g) or from a factory class.

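// Option 1: load a FrameGraph from a .j3g file and assign it to a ViewPort.
FrameGraph graph = assetManager.loadFrameGraph("Common/FrameGraphs/Forward.j3g");
viewPort.setFrameGraph(graph);

// Option 2: build a FrameGraph with the factory class and assign it through the RenderManager.
FrameGraph factoryGraph = FrameGraphFactory.forward(assetManager);
renderManager.setFrameGraph(factoryGraph);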

You might consider copying your nice writeup from the PR here… it will get a few more eyes if folks don’t have to click through. (Silly but true.)

Copied from the PR discussion…

To speed things along, I will try to explain how the resource handling works:

Each FrameGraph has a ResourceList, which handles RenderResources. RenderResources are not actual resources (those are called RenderObjects in the code); they are simply promises of RenderObjects that will exist in the future. RenderPasses do not have direct access to the RenderResources, but can perform operations on them through ResourceTickets, which are essentially indices pointing to particular RenderResources within an array.

When a RenderPass makes a request for a RenderObject from the ResourceList using a ResourceTicket (called acquiring), the ResourceList will check if the RenderResource located by the ResourceTicket is virtual (is not associated with a RenderObject). If not, great! The associated RenderObject is simply returned. But if the RenderResource is virtual, the ResourceList asks the RenderObjectMap (held by the RenderManager) to either create a new RenderObject to use or find an existing one to reallocate.
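
The ticket and "virtual resource" idea can be sketched in a few lines. This is a toy model, not the PR's code: `ToyTicket`, `ToyResourceSlot`, and `ToyResourceList` are hypothetical names, and a plain `Supplier` stands in for the step where the real ResourceList asks the RenderObjectMap for an object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical toy model of tickets and virtual resources; not the PR's classes.
class ToyTicket<T> {
    final int index; // position of the promised resource within the list
    ToyTicket(int index) { this.index = index; }
}

class ToyResourceSlot<T> {
    final Supplier<T> factory; // stands in for asking the RenderObjectMap
    T object;                  // null while the resource is still virtual
    ToyResourceSlot(Supplier<T> factory) { this.factory = factory; }
    boolean isVirtual() { return object == null; }
}

class ToyResourceList {
    private final List<ToyResourceSlot<?>> slots = new ArrayList<>();

    // Declaring only records a promise; no object is created yet.
    <T> ToyTicket<T> declare(Supplier<T> factory) {
        slots.add(new ToyResourceSlot<>(factory));
        return new ToyTicket<>(slots.size() - 1);
    }

    // Acquiring turns the promise into a real object on first use.
    @SuppressWarnings("unchecked")
    <T> T acquire(ToyTicket<T> ticket) {
        ToyResourceSlot<T> slot = (ToyResourceSlot<T>) slots.get(ticket.index);
        if (slot.isVirtual()) {
            slot.object = slot.factory.get();
        }
        return slot.object;
    }

    boolean isVirtual(ToyTicket<?> ticket) {
        return slots.get(ticket.index).isVirtual();
    }

    public static void main(String[] args) {
        ToyResourceList list = new ToyResourceList();
        ToyTicket<StringBuilder> ticket = list.declare(() -> new StringBuilder("texture"));
        System.out.println(list.isVirtual(ticket));                        // still a promise
        System.out.println(list.acquire(ticket) == list.acquire(ticket)); // same object both times
    }
}
```

The point of the indirection is that a declared resource costs nothing until some pass actually acquires it, so a culled pass never causes an allocation.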

The RenderObjectMap considers creating new RenderObjects expensive, so it tries to reallocate existing RenderObjects wherever it can. The ResourceDef (which defines the behavior of a RenderResource and the associated RenderObject) attached to the RenderResource is used to determine which RenderObjects are suitable for reallocation. Also the RenderObjectMap doesn’t like to keep unused RenderObjects alive, so once a RenderObject has gone a number of frames without being used, it is disposed.
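
A minimal sketch of that reallocate-or-create policy with idle-frame disposal is shown here. The names `PooledObject` and `ToyObjectMap` are hypothetical, a plain string stands in for whatever a ResourceDef would compare, and the timeout value is arbitrary; the actual RenderObjectMap may differ in all of these details.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a pooled render object; not a class from the PR.
class PooledObject {
    final String spec; // stands in for what a ResourceDef would match on
    boolean inUse = false;
    boolean usedThisFrame = false;
    int idleFrames = 0;
    boolean disposed = false;
    PooledObject(String spec) { this.spec = spec; }
}

class ToyObjectMap {
    private final List<PooledObject> objects = new ArrayList<>();
    private final int timeout; // idle frames allowed before disposal
    int created = 0;

    ToyObjectMap(int timeout) { this.timeout = timeout; }

    PooledObject allocate(String spec) {
        // Reallocate a free object with a matching spec if one exists.
        for (PooledObject o : objects) {
            if (!o.inUse && o.spec.equals(spec)) {
                o.inUse = true;
                o.usedThisFrame = true;
                return o;
            }
        }
        // Otherwise pay the cost of creating a new one.
        PooledObject o = new PooledObject(spec);
        o.inUse = true;
        o.usedThisFrame = true;
        objects.add(o);
        created++;
        return o;
    }

    void release(PooledObject o) { o.inUse = false; }

    // Call once per frame: dispose objects that have sat idle for too long.
    void endFrame() {
        Iterator<PooledObject> it = objects.iterator();
        while (it.hasNext()) {
            PooledObject o = it.next();
            if (o.usedThisFrame) {
                o.usedThisFrame = false;
                o.idleFrames = 0;
            } else if (++o.idleFrames >= timeout) {
                o.disposed = true;
                it.remove();
            }
        }
    }

    int size() { return objects.size(); }

    public static void main(String[] args) {
        ToyObjectMap map = new ToyObjectMap(2);
        PooledObject a = map.allocate("512x512");
        map.release(a);
        PooledObject b = map.allocate("512x512"); // reuses a instead of creating
        System.out.println(a == b);
        System.out.println(map.created);
    }
}
```

Two passes that need the same kind of texture at different points in a frame can share a single object this way, as long as the first pass releases it before the second one allocates.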

As an example of how render passes and resource handling work, here is a basic render pass that performs downsampling. That is, it takes an input texture and renders to an output texture with half the width and half the height (a quarter of the area).

public class MyRenderPass extends RenderPass {
    
    private ResourceTicket<Texture2D> inTex, outTex;
    private TextureDef<Texture2D> texDef;
    
    @Override
    protected void initialize(FrameGraph frameGraph) {
        
        // Initialize is called when the pass is assigned to the FrameGraph.
        
        // Add an input named "Texture". This allows this pass to access a texture
        // from another pass.
        inTex = addInput("Texture");
        
        // Add an output named "Texture". This allows other passes to access the
        // resulting texture.
        outTex = addOutput("Texture");
        
        // Declare the texture definition, defining how the output texture is created
        // and how it behaves.
        texDef = new TextureDef<>(Texture2D.class, img -> new Texture2D(img));
        
    }
    @Override
    protected void prepare(FGRenderContext context) {
        
        // Prepare is called before execution for every ViewPort rendered, and is used primarily to determine
        // what resources are referenced where, so that unused passes can be culled.
        
        // Note that a prepare call does not necessarily precede an execute call, due to culling.
        
        // Declare a new resource as the output texture
        declare(texDef, outTex);
        
        // Reserve the output texture. This greatly increases the chances of getting
        // the same texture from frame to frame, and thus minimizes texture binds.
        reserve(outTex);
        
        // Reference the input texture, so this pass can safely access it
        reference(inTex);
        
    }
    @Override
    protected void execute(FGRenderContext context) {
    
        // Execute is called to perform the rendering operations. This is not guaranteed
        // to be called for every ViewPort rendered, because of culling.
    
        // Acquire the input texture provided by another pass
        Texture2D texture = resources.acquire(inTex);
        Image img = texture.getImage();
        
        // Set the size of the output texture to half the input texture in each dimension
        int w = img.getWidth() / 2;
        int h = img.getHeight() / 2;
        texDef.setSize(w, h);
        
        // Set the format of the output texture to match the input texture
        texDef.setFormat(img.getFormat());
        
        // Get a framebuffer object matching the width, height, and samples. If no
        // such framebuffer exists, a new one will be created. This is handled only by
        // the RenderPass superclass and not the resource manager.
        FrameBuffer fb = getFrameBuffer(w, h, 1);
        
        // Acquire the output texture, which is immediately attached to the framebuffer.
        // This operation is specifically designed to limit texture binds and framebuffer updates.
        resources.acquireColorTargets(fb, outTex);
        
        // Attach the framebuffer for rendering. The background color is set before
        // clearing so that the clear uses it.
        context.getRenderer().setFrameBuffer(fb);
        context.getRenderer().setBackgroundColor(ColorRGBA.BlackNoAlpha);
        context.getRenderer().clearBuffers(true, true, true);
        
        // Set the camera width and height to the output texture width and height
        context.resizeCamera(w, h, false, false, false);

        // Render the input texture to the output texture on a fullscreen quad.
        // The depth texture is null, so the quad is rendered at a depth of one.
        context.renderTextures(texture, null);
        
    }
    @Override
    protected void reset(FGRenderContext context) {
        // Reset is called after execution. Passes cannot be culled from this step.
    }
    @Override
    protected void cleanup(FrameGraph frameGraph) {
        // Cleanup is called when this pass is removed from the FrameGraph.
    }
    
}

Further details for each function and class can be found in the relevant Javadoc.

I’m a bit curious whether this PR includes the frame graph visualization, used for profiling and debugging.

I mean something like this:

[Image: viewer]

[Image: deferred_pipeline]

In this code, it is a bit strange that the texture definition (texDef) is changed but then never used in any of the following commands. At first glance, this does not seem very intuitive.

Thanks for the work, i hope to find some time this week to play with it.

There is no visualization or editor in this PR, but I hope to add something to the effect later on, probably as an external plugin.

Edit: I did add a dedicated event logger that writes key framegraph events over so many frames to a file. That could already be utilized to make a detailed visualization app.

Updates

  • Moved deferred lighting logic from inflexible TechniqueDefLogic to the deferred pass.
  • Merged deferred and tiled deferred into the same files, because they are so similar.
  • Added ticket groups, which are basically arrays of resource tickets that are used similarly.
  • Updated Junction pass and added GroupAttribute pass for ticket group support.
  • Added a Savable settings map in FrameGraph that passes can access to apply user settings.
  • Moved light texture packing and tiling logic to a dedicated pass.
  • Made passes support GeometryLists as arguments, instead of accessing the viewport’s hardcoded render queue to find geometries to render. GeometryLists can also be combined now.
  • Moved scene geometry queuing to a dedicated pass, instead of being hardcoded into RenderManager.
  • Added support for primitive resources, which are much lighter and do not require a resource definition, but lack fine-tuned control. Many passes now use this feature.
  • Fixed GBuffer light accumulation making (n^2)/2 checks to ensure duplicate lights are not added to the light list.
  • Attribute, Junction, and GroupAttribute passes now save Savable GraphTargets and GraphSources.
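
On the duplicate-light item in the list above: the post does not say how the (n^2)/2 checks were removed, so the following is only one common approach, shown with hypothetical names (`ToyLightAccumulator` is not from the PR, and plain `Object` stands in for a light). An identity-based set remembers which lights were already added, so each light costs one set lookup instead of a comparison against every light gathered so far.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of duplicate-free light accumulation; not the PR's code.
class ToyLightAccumulator {
    private final List<Object> lights = new ArrayList<>();
    // Identity semantics: two references are duplicates only if they are the same object.
    private final Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());

    // Adds the light unless this exact object was added before.
    void add(Object light) {
        if (seen.add(light)) {
            lights.add(light);
        }
    }

    void addAll(Iterable<?> source) {
        for (Object light : source) {
            add(light);
        }
    }

    List<Object> getLights() { return lights; }

    // Reset between frames so the accumulator can be reused.
    void clear() {
        lights.clear();
        seen.clear();
    }

    public static void main(String[] args) {
        Object sun = new Object();
        Object lamp = new Object();
        ToyLightAccumulator acc = new ToyLightAccumulator();
        // Two geometries that both report the sun as affecting them.
        acc.addAll(java.util.Arrays.asList(sun, lamp));
        acc.addAll(java.util.Arrays.asList(sun));
        System.out.println(acc.getLights().size());
    }
}
```

The list keeps insertion order for later packing, while the set only answers the "already added?" question.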

Known Issues

  • PBR looks wrong when a material’s metallic value is one.
  • Tiled deferred variation is unusable if the forced tile size is at or below 64.
  • Many files lack licenses and javadoc.

Road Map

  • I’ve decided that the current filter API is incompatible with the FrameGraph, so filters will probably have to be ported over. Fortunately, the process is not very difficult, and I’ve already ported a few.

  • A visual FrameGraph editor should be built. Constructing FrameGraphs in code is ok at the moment, but the whole concept is begging for a visual editor. I hope a developer of an existing JME editor will agree to collaborate on this (@ndebruyn, @FoxCC, @capdevon).

  • I’m going to try to make a few videos explaining how to use the FrameGraph and some of what goes on behind the scenes.

  • Edit: Add some sort of documentation that explains how to use each RenderPass implementation (inputs, outputs, settings, etc). This may be included in metadata files specific to each RenderPass implementation that editors can also read.

  • Edit: Add support for multithreading. This would allow passes that don’t do any rendering to be performed in parallel.

Benchmarks

Graphics capabilities:

INFO: LWJGL 3.3.3+5 context running on thread jME3 Main
 * Graphics Adapter: GLFW 3.4.0 Wayland X11 GLX Null EGL OSMesa monotonic shared
Jun 22, 2024 10:52:37 AM com.jme3.renderer.opengl.GLRenderer loadCapabilitiesCommon
INFO: OpenGL Renderer Information
 * Vendor: NVIDIA Corporation
 * Renderer: NVIDIA GeForce GTX 1060 6GB/PCIe/SSE2
 * OpenGL Version: 4.5.0 NVIDIA 545.29.06
 * GLSL Version: 4.50 NVIDIA
 * Profile: Core

Scene: 1,000 white PBR spheres (not instanced); 1 large 200x200 white PBR plane; 1,000 point lights with radius 30 and random color. No light probes. All tests were performed with almost all geometries visible and at roughly the same camera angle.

Forward (no framegraph): 2 fps (window became unresponsive, so likely lower)
Simple Deferred: 26 fps
Tiled Deferred (7 divisions): 62 fps

Tiled Deferred:

This is extremely good! Especially for a scene as ridiculous as this! Performance will likely be even better once tiling is fixed to allow for more divisions than 7, but this performance gain is already staggering.
