I need some help with data visualisation

Hi everyone,

I’m trying to build a data visualisation tool using jME. The tool should be able to receive data over the network and display a real-time (scene)graph on screen, which you can travel through and where you can point-and-click an element to get more information about it.

Data

What data

Here’s a description of the data:

  • 800 elements. Let’s call them “nodes” (about five floats, three strings, and a long - could be truncated to several longs if necessary)
  • each with 255 references to another element. Let’s call them “connections” (about two longs, a float, and a string)
  • also, all nodes have a separate culled hit box to aid point-and-click.
  • that’s a total of 204000 connections + 2 * 800 elements = 205600 things to draw.

It should preferably display everything.

I have already attempted this problem twice. In the last iteration I used Zay-ES because it has a simple way to detect changes, and I did not yet know how much data I would be displaying.

Input

I already have pretty solid networking, so that isn’t a problem.
However, this is a list of the data that updates, ranked from most frequent to least frequent:

  • color (really fast)
  • size
  • position
  • hitbox size (really slow; might as well be constant)

My main problems are:

How should I store so much data in Java?

I think there are three options:

  • Use Zay-ES
    Positives: easy change detection; I already have most of the code.
    Negatives: Uuhhhh that’d be at least 206k Java objects and at most 822k Java objects. I don’t know if Java is happy with that.
  • Use big arrays:
    Positives: no overhead - might as well do it in C.
    Negatives: speaks for itself really
  • Use Lists:
    Positives: It’s a compromise between arrays and ES, which means it doesn’t use 822k objects, but it’s also friendlier with memory allocation than pure arrays.
    Negatives: I need to write some sort of change detection myself to optimally update buffers and what not.

I just don’t know what would be a smart way to do this, as I never had to use this much data.
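To make the "big arrays" option more concrete, here is a minimal sketch of a struct-of-arrays store with hand-written change detection. The class name `NodeStore`, its layout (three floats for position, four for color, one for size), and the `drainDirty` method are all assumptions made for illustration; they are not part of jME or Zay-ES.

```java
import java.util.BitSet;

/** Hypothetical struct-of-arrays store: one flat array per attribute, no per-node objects. */
final class NodeStore {
    private final float[] position; // x, y, z per node
    private final float[] color;    // r, g, b, a per node
    private final float[] size;     // one float per node
    private final BitSet dirty;     // a set bit means the node changed since the last drain

    NodeStore(int capacity) {
        position = new float[capacity * 3];
        color = new float[capacity * 4];
        size = new float[capacity];
        dirty = new BitSet(capacity);
    }

    void setPosition(int node, float x, float y, float z) {
        int i = node * 3;
        position[i] = x;
        position[i + 1] = y;
        position[i + 2] = z;
        dirty.set(node);
    }

    void setColor(int node, float r, float g, float b, float a) {
        int i = node * 4;
        color[i] = r;
        color[i + 1] = g;
        color[i + 2] = b;
        color[i + 3] = a;
        dirty.set(node);
    }

    void setSize(int node, float s) {
        size[node] = s;
        dirty.set(node);
    }

    /** Reads one position component; axis is 0 for x, 1 for y, 2 for z. */
    float positionComponent(int node, int axis) {
        return position[node * 3 + axis];
    }

    /** Returns the indices of all changed nodes in ascending order and clears the flags. */
    int[] drainDirty() {
        int[] changed = dirty.stream().toArray();
        dirty.clear();
        return changed;
    }
}
```

The networking thread would call the setters, and the render update would call `drainDirty()` once per frame and rewrite only those slots in the mesh buffers. With 800 nodes this is a handful of arrays rather than hundreds of thousands of objects, at the cost of writing the bookkeeping yourself.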

How do I visualize 206k objects efficiently with jME?

I tried to represent connections with debug arrows and elements with spheres, but that gives a total of 206k draw calls (which doesn’t sound good). So I tried batching it, but the batching thing would freeze completely after more than ~150k objects.
That basically means I need to create a custom mesh, which isn’t that bad.
The last two options are:

  • Use the buffers in the custom mesh to pass ‘real’ data to a custom shader.
    Positives: I don’t need to pre-process data on cpu, and I can truncate the original data slightly, because it has to be less specific.
    Negatives: I’d have to use buffers like TexCoords or Size or something, which seems like a hacky way to send data to a shader.
  • Use the buffers in the custom mesh to pass ‘correct’ data to unshaded mat.
    Positives: I don’t have to write glsl.
    Negatives: Whenever data is received, it needs to get pre-processed into the correct attributes (i.e. color, size), and that’d happen on cpu.
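As a rough idea of what the CPU-side pre-processing in the second option might look like, here is a small sketch that turns one received value into an RGBA color and writes it into a color buffer. It assumes one vertex per node with four floats per vertex, and the blue-to-red gradient and the name `ColorAttributeWriter` are invented for illustration. In jME such a buffer would presumably be attached to the mesh as its color vertex buffer; that part is not shown.

```java
import java.nio.FloatBuffer;

/** Hypothetical helper: maps a raw value to a color and writes it for one node. */
final class ColorAttributeWriter {
    /**
     * Writes RGBA for the given node into a buffer laid out as 4 floats per node.
     * The value is clamped to [0, 1]; 0 gives blue, 1 gives red (an arbitrary choice).
     */
    static void writeColor(FloatBuffer colors, int node, float value) {
        float t = Math.max(0f, Math.min(1f, value));
        int i = node * 4;
        colors.put(i, t);          // red grows with the value
        colors.put(i + 1, 0f);     // green unused in this gradient
        colors.put(i + 2, 1f - t); // blue shrinks with the value
        colors.put(i + 3, 1f);     // fully opaque
    }
}
```

The work per update is a few float writes per changed node, so for 800 nodes the CPU cost of this step is likely small compared with uploading the buffer.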

What is the best way to do this? I honestly don’t care about development time - only about performance.


p.s.: Also, I’d like to have, like, 30-60 fps, because I should be able to move the camera.

Do you have a picture of what it’s supposed to look like?

I’d definitely use one (or a few) custom meshes that each contain large numbers of your objects.

I think you are trying to describe a graph of nodes and edges? The edges could be one set of large meshes and the nodes another.
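A minimal sketch of the "edges as one large mesh" idea: pack both endpoints of every edge into a single position buffer, two vertices per edge, so that all edges can be drawn as lines in one draw call. The class `EdgeLineBuffer` and its input layout are assumptions for illustration. The code uses only `java.nio`; in jME the resulting buffer would be handed to a `Mesh` in `Mesh.Mode.Lines` via `setBuffer(VertexBuffer.Type.Position, 3, buffer)`, which is not exercised here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical packer: one direct float buffer holding two vertices (6 floats) per edge. */
final class EdgeLineBuffer {
    private final FloatBuffer positions;
    private final int edgeCount;

    EdgeLineBuffer(int edgeCount) {
        this.edgeCount = edgeCount;
        // Direct, native-order buffer, the kind a graphics API expects.
        positions = ByteBuffer.allocateDirect(edgeCount * 6 * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /**
     * Refills the buffer.
     * nodePositions holds x, y, z per node; edgeEndpoints holds (from, to) node index pairs.
     */
    void fill(float[] nodePositions, int[] edgeEndpoints) {
        positions.clear();
        for (int e = 0; e < edgeCount; e++) {
            int from = edgeEndpoints[2 * e] * 3;
            int to = edgeEndpoints[2 * e + 1] * 3;
            positions.put(nodePositions, from, 3); // start vertex of the line
            positions.put(nodePositions, to, 3);   // end vertex of the line
        }
        positions.flip(); // ready for reading from index 0
    }

    FloatBuffer buffer() {
        return positions;
    }
}
```

For 204000 edges this is about 1.2 million floats (roughly 4.9 MB), which could also be split across a few meshes so that a position change only forces a re-upload of the affected chunk.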

Whether you use Zay-ES or not is up to you. It will certainly add some overhead but you are already talking about a large number of objects anyway. Maybe just up the RAM requirements until it’s a problem. Either way, the thing that applies the changes to your meshes won’t care where the data came from. You can always swap out the networking later if you’ve done your job properly.

kinda

This is v1, a close-up of a node and a connection, with some data on the side:

This is also v1, and it shows everything (only 400 nodes and no connections though).

As you can see, version one was based on JSON. You can also see the hitboxes.

This is v2, with a debug setup (I couldn’t find the python script where I sent 800 nodes through networking, maybe I can find it tomorrow):

As you can see, it uses arrows instead of lines, and there is some sort of lighting to give you a sense of direction (I also plan on adding three coloured arrows in the GUI, but that’s beside the point). The black smudge at the bottom is a grid, but its lines are very thick because it uses the same material as the arrows.

I will search for that python script tomorrow, so I can show what it looks like with more arrows.

It is a fun practical case where physics becomes math. It is a visualisation of a network, which indeed becomes a graph of nodes and edges.

Not sure I understand this part unless you are referring to layout.

I have a lot of experience with graph visualization and layouts… that’s why I asked.

Given the graph it sounds like you are trying to display (highly connected), you have your work cut out for you even making it make sense for the user, let alone performance. And the solution to one is often the solution to the other as well. LOD, level of interest, some amount of imposed structure… all of these things can help in both cases.