.

.

4 Likes

Nice find.



As a general rule you should never implement your own sorting algorithms, just use comparators and the built-in ones. They are almost always going to be a lot faster, and why reinvent the wheel as well.

[java]/**

  • Quick and merge sort implementations that create no garbage, unlike {@link
  • Arrays#sort}. The merge sort is stable, the quick sort is not.

    */

    [/java]

    I guess this comment explains. Does saving garbage really justify a 7 or 15 times difference though?
@zarch said:
[java]/**
* Quick and merge sort implementations that create no garbage, unlike {@link
* Arrays#sort}. The merge sort is stable, the quick sort is not.
*/
[/java]
I guess this comment explains. Does saving garbage really justify a 7 or 15 times difference though?

Shit from hell, err, I mean, Android.

garbage collection might be an valid argument but the stable argument is not (in the case of geometry lists).

@kwando said:
garbage collection might be an valid argument but the stable argument is not (in the case of geometry lists).


What do you mean by this exactly?

I think the goal was that objects not be reordered unless needed. Thus if you pass a sorted list then you will get no change. I haven't looked at the Arrays.sort() algorithm in a while but I think it reorders even a sorted array.

The GC churn is pretty significant, though. There was a lot of work done to keep memory stable through the render loop. That's why a custom implementation was used to let the target buffers get reused. Whatever sort implementation is picked, that's a pretty significant improvement overall. Otherwise, you create an array of length geometry count every frame.

So if another sort is faster and possibly stable then we could reimplement it in SortUtils to allow passing the target buffer and try it.

We could implement this timSort, and keeping the extra array needed to sort to avoid garbage generation as well like we do for the merge sort.



I made some tests on android (android 2.3, with a nexus one) with the current sort algorithms :

GSort 12979.786585 ms

QSort 92053.292683 ms

MSort 107548.993902 ms

ASort 38413.780488 ms

My definition of stable sort is that if two objects are equal then the first one added to the list should be first one in the sorted list.



The point I’m having is that one maybe should not rely on the queue ordering, if the order is important then another comparator should be used so that sorting behaves as one expects.



Why is it bad if objects is reordered btw? (It might just be me misunderstanding something here).



The GC and memory issues sounds like valid reasons to have one custom sorting algorithm.

Hio about putting that into a interface where the user can use the implementation they like? Eg. with a javaj7 the g1 can run concurrently , so on multicore platforms the gc overhead is barely noticeable if at all.



On certain androids, the gc however is really bad (especially older 2.x versions), so it makes sense to use a different algorithm there.

@phate666 said:
Your implementation creates more "garbage" then Array.sort(), because you do a System.arraycopy with n elements, where TimSort only uses (in worst case) n/2.

The problem is android garbage collection. It's terrible. As a result the main render loop in jME3 is careful not to allocate/deallocate items unless absolutely needed in order to reduce load on the GC.

I've not checked the code but my understanding is that the current implementation may do an array copy but it tries to avoid allocating any new buffers if needed and instead re-uses internal ones....
@phate666 said:
Your implementation creates more "garbage" then Array.sort(), because you do a System.arraycopy with n elements, where TimSort only uses (in worst case) n/2.

uh...i didn't make any implementation of timsort yet.
The current sort used for geometry list is a merge sort and indeed an arraycopy is done just before the sort. but since arraycopy is a JNI call i don't see how it would generate garbage. the 2 arrays are kept in the geometry list and allocated by chunks of 32.

The jdk timsort will allocate n/2 space on each sort. the idea is to have this space allocated once and reused for each sort of the same geometrylist.

We'd also like an implementation that is not specific to a java version, and have the same sort algorithm across platforms/jre.

Anyway, this is clearly a nice area of improvement to boost up the rendering time.

since arraycopy is a JNI call



Actually you can allocate java object sfrom c++, the whole jni stuff is just like reflection in java (excpet its in c++, so everything that reflection can do, can also be done with jni)

@EmpirePhoenix said:
since arraycopy is a JNI call


Not in Hotspot. It is optimized, often inlined, sometimes calling helper function, but I don't think it ever uses JNI for actual resolution. No idea how it is done in Delvik.

Just for updates i’m still looking into this.

IMO there is an issue with our merge sort, it’s very slow compared to the Arrays merge sort of java 6.

Anyway, I’m planning to add the TimSort algorithm.



I made some more performance tests and following pspeed’s advice, I increased the number of elements to be sorted to 500 and decreased the number of iteration to 100k (instead of 1M) to make the test more reliable.

Then i did the test several times in a row (5 for now) to let Hotspot kicks in and do its magic. There is a huge benefit from HotSpot for the gnome sort. the others not much (maybe 5 iterations is not enough).



I tested with java 6 and java 7. I’m really surprised by the numbers pulled by java 7. All algorithms are twice faster than with java6, except the Arrays.sort that is 7 times faster (due to the change of implementation from merge to timSort).

Timsort pulls out the best numbers in any situation, and also benefits a lot from Hotspot with few iterations…so that definitely is the way to go.

1 Like
@nehon said:
The current sort used for geometry list is a merge sort and indeed an arraycopy is done just before the sort. but since arraycopy is a JNI call i don't see how it would generate garbage. the 2 arrays are kept in the geometry list and allocated by chunks of 32.


Ouch. That's an O(N*N) algorithm, because it's copying increasing amounts of array contents.
Increasing the array size by a fixed percentage will reduce this to O(N*log N), because copying becomes rarer as the size increases.


GeometryList actually doubles the array size each time, so all is well.

Oh, and it's not arraycopy that creates the garbage, it's abandoning the previous smaller-sized array :)

@pspeed said:
I think the goal was that objects not be reordered unless needed. Thus if you pass a sorted list then you will get no change.


In three decades of programming, I have very rarely seen a situation where stability mattered.
And if it did, it was easy to add some last-ditch criterion to the Comparator when comparison turned out to say "equivalent", so it's easy to have a unique sort order and you don't need to worry about stability.
Other possible strategies on the list were:

  • Ignore the problem because sort-equivalent items are equivalent for all other purposes anyway

  • Use a multiset if the previous strategy causes too many useless swaps

  • Add an artificial integer to the objects and initialize it with some unique value, use that as last-ditch sort criterion



Stability is important if, for some reason, you need to fake a multi-criterion sort by doing multiple sort passes over the same data. I don't know when you need to do that; I could imagine some weird situations involving tiny amounts of RAM and tape sorting where that might become a factor.
Not applicable to JME I guess :)