# Scale whole world to use approximately Ints

Hello, I’d like to know if there is a way to scale up the whole world so that a pixel has approximately integer values, so my coordinates look more like 100, 102 rather than 0.001, 0.00102.

Edit: this is just so I can do bitwise operations (mods, divisions, etc.)

Multiply them by 1000?

I don’t understand what you are trying to do or why you need to use “bitwise operations”, especially since neither mod nor division is a bitwise operation.

You almost certainly do not need to do what you are trying to do but we would need to know more about the meta-problem to give a proper answer.


x & (2^n - 1) == x mod 2^n
x >> n == x / 2^n
but this only works on ints.
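Written out with the parentheses that operator precedence requires (and spelling 2^n as `1 << n`, since `^` is XOR in Java), the identities can be sketched as below. Note that the region width itself should be 2^n, while 2^n − 1 is the mask, and that the identities only hold for non-negative ints; Java's `%` and `/` round toward zero, while `&` and `>>` behave like floor-mod/floor-div. The class and method names here are illustrative only:

```java
public class BitwiseMath {
    // For non-negative x and a power-of-two divisor 2^n:
    //   x & ((1 << n) - 1) == x % (1 << n)
    //   x >> n             == x / (1 << n)
    // Caveat: for negative x, % and / round toward zero while & and >>
    // behave like floor-mod/floor-div, so these only hold for x >= 0.
    static int fastMod(int x, int n) {
        return x & ((1 << n) - 1);
    }

    static int fastDiv(int x, int n) {
        return x >> n;
    }

    public static void main(String[] args) {
        for (int x = 0; x < 1000; x++) {
            if (fastMod(x, 4) != x % 16 || fastDiv(x, 4) != x / 16) {
                throw new AssertionError("identity failed at x=" + x);
            }
        }
        System.out.println("identities hold for non-negative ints");
    }
}
```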

So if I make my region width 2^n, I can take a coordinate mod the region size really fast (by masking with 2^n - 1).
This is all because I don’t want to search inside R-trees or B+ trees for the nearest regions.

I thought there was a built-in scaling that changes the units of the world, but I guess I’ll just move my camera really high and scale all my models.
Then I’ll send ints to the server by casting; even though casting float to int is expensive, since players don’t click every nanosecond, it’s affordable.

@navycougar said: x & (2^n - 1) == x mod 2^n, x >> n == x / 2^n, but this only works on ints. […]

I still don’t understand why you have to do the bitwise versions when you can just do the regular math.

You can scale your whole scene. Everything is the child of the root node after all. But the scene-related information wasn’t provided in the first post. At any rate, they will always be floats… they will just be bigger floats. So you’d still have to cast them to int to do bitwise ops… instead of just doing mods directly on the floats.

As for the region search, depending on where the data is coming from, it’s almost always better to just use sparse fixed size grids. Neighbor lookups are basically constant-time in that case.

Thanks pspeed for your fast replies. Well, I want to do bitwise because I just don’t want to use regular % and /, because they are too slow. I am doing those operations millions of times a second, so they kind of break everything.

Thanks for pointing me to “sparse fixed size grids”. Do you have any examples of an implementation?

thanks.

You’re optimizing an operation that will happen a few hundred times per frame.
What you really want to optimize is operations that happen 100k times per frame.

Also, you’re replacing a multiplication with a truncate + bitshift. I.e., you have two operations where you formerly had one; if you’re unlucky, you might end up with slower code.

General advice is: Avoid premature optimization.
First, get it to run.
Then, make it run right.
Then, go and measure where your bottlenecks are. In the vast majority of cases, it’s not where you expected it.

Another problem with premature optimization is that any decision you make constrains your design choices later. If you optimize first, you’ll be limited in which data structures you can use, and sometimes the chosen data structure will be less efficient. This can result in inefficiencies that outweigh the minimal performance gain from bit-shifting by orders of magnitude.

@navycougar said: Thanks pspeed for your fast replies. Well, I want to do bitwise because I just don’t want to use regular % and /, because they are too slow. I am doing those operations millions of times a second, so they kind of break everything.

“they are too slow”

Where do you get this information? This hasn’t been true since the 90s at the latest. Any one or two cycle performance difference is going to be dwarfed by all of the other stuff going on. You aren’t writing assembly language, after all. I agree with toolforger, this is premature optimization of the worst kind, I think.

@navycougar said: Thanks for pointing me to “sparse fixed size grids”. Do you have any examples of an implementation?

Decide on a grid size. Turn locations into cell IDs with simple math (division :-o ). Use the cell IDs as keys into a map or database or whatever (don’t know what your data is or how you are keeping it) that contains the list of objects in that cell.
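To make those steps concrete, here is a minimal sketch of such a sparse grid, assuming a 2D world and an in-memory HashMap as the store; with Redis, the packed cell ID would simply become the key. The cell size, class, and method names are all made up for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SparseGrid {
    static final int CELL_SIZE = 64; // assumed grid resolution in world units

    // Only occupied cells ever get an entry, which is what makes the grid sparse.
    private final Map<Long, List<Object>> cells = new HashMap<>();

    // Pack both cell coordinates into one long so it can serve as a map key.
    static long key(int cx, int cy) {
        return ((long) cx << 32) | (cy & 0xFFFFFFFFL);
    }

    // Turn a world location into a cell ID with simple math (division).
    static long cellId(float x, float y) {
        return key((int) Math.floor(x / CELL_SIZE), (int) Math.floor(y / CELL_SIZE));
    }

    public void add(float x, float y, Object obj) {
        cells.computeIfAbsent(cellId(x, y), k -> new ArrayList<>()).add(obj);
    }

    // Neighbor lookup: visit the 3x3 block of cells around (x, y).
    // The cost is nine map lookups, independent of world size.
    public List<Object> nearby(float x, float y) {
        int cx = (int) Math.floor(x / CELL_SIZE);
        int cy = (int) Math.floor(y / CELL_SIZE);
        List<Object> result = new ArrayList<>();
        for (int dx = -1; dx <= 1; dx++) {
            for (int dy = -1; dy <= 1; dy++) {
                List<Object> bucket = cells.get(key(cx + dx, cy + dy));
                if (bucket != null) {
                    result.addAll(bucket);
                }
            }
        }
        return result;
    }
}
```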

Thanks a lot, toolforger, for those wise words.

The problem I am facing is that I don’t want the clients to do those calculations; I want it all done by a single machine or a small number of machines in the cloud. (I also need to minimize cloud costs, so minimizing communication between machines is my main objective.)

So the problem is that I’ll have to do this calculation for each player many times per frame, and that will certainly create a lot of latency/response-time problems.

I am not basing this on pure supposition; I’ve already implemented this and created 500 players, each spamming the server every 20 ms,
and started to feel some latency, even though I am using my laptop as the server and simulating the clients on my PC.

(I am kind of playing with the limits of the “system”.)

@pspeed said: Where do you get this information? This hasn’t been true since the 90s at the latest. […] Decide on a grid size. Turn locations into cell IDs with simple math (division :-o ). Use the cell IDs as keys into a map or database or whatever. […]

For the database I am using Redis (a NoSQL key/value database).

pspeed, Java HashMaps are really slow :/ They calculate the hash twice: the first time for the hashCode of your object, using your hashCode method, and a second time (hidden inside put) to minimize the number of collisions.
Then when it reaches the bucket, it goes to the heap to retrieve the entry object, but that object is stored at a random place in the heap, so you’ll get page faults (because you don’t have spatial locality), etc. That’s why C#’s Dictionary is 10 times faster than Java’s HashMap: the Dictionary’s data is stored inside an array, and that spatial locality gives faster retrieval of the data.
Which will also slow everything down.

[java]
/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with key, or
 *         null if there was no mapping for key.
 *         (A null return can also indicate that the map
 *         previously associated null with key.)
 */
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
[/java]
@navycougar said: pspeed, Java HashMaps are really slow :/ They calculate the hash twice… that’s why C#’s Dictionary is 10 times faster than Java’s HashMap. […]

Yeah… you should go back to using C# and assembler then, I think. You are worried about optimizations that just don’t really matter. For a properly constructed key the hash code calculation should be fast. And it’s impossible to guess about page faults in Java since the memory is 100% managed.

Whatever you are trying to do, there is a 100x better way where these “issues” wouldn’t matter. You are worrying about the nanoseconds when you should be worrying about the milliseconds. You can optimize your “1 million times a second” or you can figure out how to turn it into “1000 times a second”. Which do you think will be faster ultimately?

I can guarantee you that a hashmap lookup will always be faster than a B-tree traversal for neighbor lookups (as long as the hashing isn’t a degenerate worst case).
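As one illustration of a “properly constructed key”: for grid-cell lookups you can pack both cell coordinates into a single primitive long, so building the key allocates no object and Long.hashCode() is a single XOR of the two halves. The names here are hypothetical:

```java
public class CellKey {
    // Pack two int cell coordinates into one long. Long.hashCode() is
    // (int)(value ^ (value >>> 32)), i.e. one XOR, so hashing this key
    // is about as cheap as hashing gets.
    static long pack(int cx, int cy) {
        return ((long) cx << 32) | (cy & 0xFFFFFFFFL);
    }

    static int unpackX(long key) {
        return (int) (key >> 32);
    }

    static int unpackY(long key) {
        return (int) key;
    }
}
```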


^^ Don’t worry about me, pspeed; I am a Java developer. I really love the JVM and want to show people what it can do, just like you.

I’m not saying Java is bad; I’m just saying that the regular HashMap is bad, and there are better alternatives (in Java). And yes, hashmaps are faster than B-trees: O(1), with some constant factor a, beats O(log₂ N), and that constant matters when you are manipulating millions of objects.

I’ve heard about some implementations of hashmaps using arrays.
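The array-backed maps mentioned here usually mean open addressing: keys and values live in flat primitive arrays and collisions are resolved by probing to the next slot, which keeps lookups in contiguous memory. A minimal sketch, with several simplifying assumptions: no resizing or removal, capacity must be a power of two, the load must stay below capacity, and one long value is reserved as the empty marker:

```java
// A minimal open-addressing map from primitive long keys to int values,
// using linear probing. Sketch only: no resizing, no removal; capacity
// must be a power of two and must never be completely filled.
public class LongIntMap {
    private static final long EMPTY = Long.MIN_VALUE; // assumed never used as a key
    private final long[] keys;
    private final int[] values;
    private final int mask;

    LongIntMap(int capacityPow2) {
        keys = new long[capacityPow2];
        values = new int[capacityPow2];
        mask = capacityPow2 - 1;
        java.util.Arrays.fill(keys, EMPTY);
    }

    private static int hash(long k) {
        // Cheap mix so that sequential keys spread across buckets.
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        return (int) (k ^ (k >>> 33));
    }

    void put(long key, int value) {
        int i = hash(key) & mask;
        while (keys[i] != EMPTY && keys[i] != key) {
            i = (i + 1) & mask; // linear probe to the next slot
        }
        keys[i] = key;
        values[i] = value;
    }

    int get(long key, int defaultValue) {
        int i = hash(key) & mask;
        while (keys[i] != EMPTY) {
            if (keys[i] == key) return values[i];
            i = (i + 1) & mask;
        }
        return defaultValue;
    }
}
```

Production-quality versions of this idea exist as primitive-keyed maps in libraries such as fastutil and Trove.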

As far as big O, hashmaps/hashtables are generally considered “constant time”. O(1) for lookups. For well distributed hashes, this is pretty easy to see since it amounts to an indexed array access.

If you explain your larger problem then we can help optimize that instead of trying to micro-optimize around things that won’t matter. Otherwise, you are looking at a swimming pool and worrying about the temperature of a few drops of water.

I’m not a noob at this. I’ve been optimizing Java code since 1998 or so, in some high-end microwave engineering packages. I’ve been using it for graphics almost as long… I even wrote a software-based Quake level renderer, all Java: Paul Speed's Java Projects - JQMap … way back when.

Thanks pspeed. I’m sorry; I never meant to insult or offend you or anything. I’m not English, so I probably said things in a bad way.
I really love jME, that’s why I decided to use it, and I always consider any advice people give me.

JQMap looks awesome!
I am not at all a 3D developer; my background is JEE, and I just want to make a really scalable server, using jMonkey as a client.
It’s not about a real use case; it’s more about trying to do something that works at the extremes. I see it more as a challenge.

But I guess we’ve drifted too far off topic.

Yeah, then you are definitely worrying about the wrong stuff. Think about your larger picture and optimize that with appropriate algorithms and data structures. Especially on the server, the difference between an int bit shift and a float mod is irrelevant.

@navycougar said: the problem I am facing is that I don’t want the clients to make those calculations… I’ve already implemented this, and created 500 players each spamming the server every 20 ms, and started to feel some latency, even though I am using my laptop as a server and simulating the clients on my PC. […]

You should locate the actual bottleneck before starting optimization.
500 players * (1 message / 20 ms) = 500 * 50 messages/s = 25,000 messages/s. Even an Atom should be able to handle that with floating-point math, particularly if you don’t recalculate this for every player on every message.
First thing to check: is networking latency involved in the lag you’re seeing? Does it originate on the client or on the server? You need to inspect network packets to determine that - heck, it’s possible that your network hardware drops more packets than you think and the lag you’re seeing is missing packets that your hub drops. (One typical tool to inspect network packets is Wireshark. You may want to google for alternatives if you find the learning curve of Wireshark too steep.)
Once you know what’s happening on the wire, you can determine whether it’s the server or the client.
Then check whether Java is really eating 100% CPU on either machine; if it isn’t, the bottleneck is outside the JVM. It could be an unsuitable network card, a misconfigured network stack, or simply a Windows installation tailored toward private consumer use instead of server-style loads (MS wants to sell Server Edition, after all; it wouldn’t be the first time they’ve crippled their software to limit it to specific use cases).