Which "collection of values" is better?

i've been trying to solve a problem for several months now, and i narrowed it down to this:

i got several collections of numbers, for example a = 1,2,4,4,4,5 and b = 0,0,4,10,12. now, i have to find the "better" one, where better is defined as "contains more numbers which are higher than numbers of the other collection than the other way round". ties count for nobody.

in this case, collection a wins:

6 times against 0 (once for every number)

6 times against 0 (once for every number)

1 time against 4 (5 > 4)

never against the 10 and 12

-> 13 wins



collection b wins:

2 times because of the 4

6 times because of the 10

6 times because of the 12

-> 14 wins



so, if i randomly compare the numbers of the collections to each other for a very long time, b will get slightly more wins.
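to make the definition concrete, here is a minimal brute-force sketch that just compares every number of one collection with every number of the other. the class and method names (`PairwiseWins`, `countWins`) are made up for illustration; this is the slow reference version, not an optimized one.

```java
final class PairwiseWins {
    /** Counts pairs (x, y) with x from first, y from second and x > y. Ties count for nobody. */
    static long countWins(int[] first, int[] second) {
        long wins = 0;
        for (int x : first) {
            for (int y : second) {
                if (x > y) {
                    wins++;
                }
            }
        }
        return wins;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 4, 4, 4, 5};
        int[] b = {0, 0, 4, 10, 12};
        System.out.println("a wins: " + countWins(a, b)); // 13
        System.out.println("b wins: " + countWins(b, a)); // 14
    }
}
```

this needs one comparison per pair, so the work grows with the product of the two collection sizes.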



however, this calculation is done veeeeery often in my not yet working solution. is it possible to speed it up? it's not necessary to know how often a beats b, just IF a beats b.



or is number crunching the only way to get this done?

Here is a linear time algorithm to calculate the scores. I hope it is fast enough  :wink:



    int i, j, scoreA = 0, scoreB = 0, ca, cb;
    //add a sentinel (smaller than every real value) as element 0 of both arrays
    double[] a = new double[] { -100000, 1,2,4,4,4,5 };
    double[] b = new double[] { -100000, 0,0,4,10,12 };
    i = a.length-1;
    j = b.length-1;
    //Assume a and b are sorted in ascending order!
    while( i != 0 || j != 0 )
    {
      if( a[i] > b[j] )
      {
        //System.out.println( "a: " + a[i] + " bigger than b: " + b[j] );
        scoreA += j;
        i = ( i == 0 ? 0 : i-1 );
      }
      else if( a[i] < b[j] )
      {
        //System.out.println( "b: " + b[j] + " bigger than a: " + a[i] );
        scoreB += i;
        j = ( j == 0 ? 0 : j-1 );
      }
      else //If they are equal decrement both, but keep track of them
      {
        ca = cb = 1; //How many are equal in both arrays.
        while( a[i] == a[--i] )
          ca++;
        while( b[j] == b[--j] )
          cb++;
        scoreA += j * ca;
        scoreB += i * cb;
        //System.out.println( "Found equal values: " + ca + ", " + cb );
      }
    }
    System.out.println( "Scores: " + scoreA + " vs " + scoreB );
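If the sentinel element is inconvenient, the same idea can be sketched as a small method that needs no extra array entry. This is only a sketch under the same assumption (both arrays sorted ascending); the names `SortedWins` and `countWins` are hypothetical. It walks each array once per call, so calling it in both directions is still linear.

```java
final class SortedWins {
    /** For ascending-sorted arrays, counts pairs (x, y) with x from xs, y from ys and x > y. */
    static long countWins(double[] xs, double[] ys) {
        long wins = 0;
        int smaller = 0; // how many elements of ys are strictly below the current x
        for (double x : xs) {
            // xs is ascending, so this index only ever moves forward
            while (smaller < ys.length && ys[smaller] < x) {
                smaller++;
            }
            wins += smaller;
        }
        return wins;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 4, 4, 4, 5};
        double[] b = {0, 0, 4, 10, 12};
        System.out.println("Scores: " + countWins(a, b) + " vs " + countWins(b, a)); // Scores: 13 vs 14
    }
}
```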

seems to work… thx



but another problem just appeared. this algorithm i'm working on really is damn cursed, bloody cursed, and heavily cursed. however, if i ever get it done, it will throw tons of money and/or glory back at me :slight_smile:

Well, if you tell us what the problem is, perhaps we could force it to throw money and glory at you

imagine a maze.

in the maze, there are several treasure chests. every chest has a unique set of possible contents. example: chest 1 contains 20 gold coins or 100 silver coins or… or…, chest 2 contains a car or a horse…



until now, the solution is: just open all chests. but your time in the maze is limited. you cannot teleport yourself, so by the choice of a chest, you limit the remaining choices until time runs out.

what i figured out so far is how to compute the path which gives the maximum average score, but in my case it's not the best solution. it's a 2-player-game, the other player gets his own (equal) maze. the winner is the one which gets a higher score than the other player. it doesn't matter how big the score difference is - so my idea was to identify all unique situations (you can end up in the same situation by more than one path) and simply brute force my way through and count which values i can get so i can compare them and choose the path which leads to the highest collection of possible values. the remaining problem is that the maze is really big (a few million unique situations). my corei7@3.6ghz can compute the average expectation values of 2k situations per second (fast enough to get it done within hours), but now i have to use collections of values instead of a single one.

which means i have to optimize the hell out of my code to make it finish the calculations before i'm dead :slight_smile:






Given that the problem is random in nature, you necessarily have to settle for optimizing the average payoff… or do something like optimizing the average while minimizing the variance, but for now let's just assume that the payoff of each chest is fixed (i.e. the average of the possible payoffs inside). This is a classic Dynamic Programming problem.



It can be solved by building up a solution from partial optimal solutions. Assume, for instance, that you have a step count (i.e. how many steps you can take before the time runs out), and you have a matrix of distances between all chests (telling you how many steps it takes to go from one chest to another). If you could compute the best solution (payoff) of the maze for a particular number of steps and a particular initial position, then you would have an overall optimal solution.



So start building up your optimal solution as follows. Set your steps = 1. For each initial position (here I am assuming you start at a chest, or at a dummy chest with zero payoff), calculate the best solution you can make in one step… Then bump up the number of steps by one, and for each starting position compare the places you can reach directly with two steps against going to each place you can reach with one step and taking the best one-step solution you already computed from there… Pick the best for each starting position. Bump up the number of steps again, and repeat… The table has #chests * #steps entries and each entry looks at every other chest, so the solution is O(#chests^2 * #steps) or something similar.



I obviously left many details out, but you should get the idea… This is a very nice problem, indeed, but has a (reasonably) fast solution  :wink:
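A rough sketch of the table-filling described above, under loud assumptions: `payoff[c]` is the fixed (average) payoff of chest c, `dist[c][d]` is a positive step count between chests, and index 0 can be a dummy start chest with zero payoff. All names are hypothetical. Note that this simplified version does not remember which chests were already opened, so it only gives the right answer when revisiting a chest never pays off; tracking opened chests is one of the details left out.

```java
final class ChestDp {
    /**
     * best[s][c] = highest payoff still collectable when standing at chest c
     * with s steps left (the payoff of c itself is already counted elsewhere).
     */
    static double bestPayoff(double[] payoff, int[][] dist, int steps, int start) {
        int n = payoff.length;
        double[][] best = new double[steps + 1][n]; // row 0 stays all zero
        for (int s = 1; s <= steps; s++) {
            for (int c = 0; c < n; c++) {
                double top = 0.0; // doing nothing is always an option
                for (int d = 0; d < n; d++) {
                    int cost = dist[c][d];
                    if (d != c && cost > 0 && cost <= s) {
                        // go to d, take its payoff, continue with the steps that remain
                        top = Math.max(top, payoff[d] + best[s - cost][d]);
                    }
                }
                best[s][c] = top;
            }
        }
        return best[steps][start];
    }

    public static void main(String[] args) {
        double[] payoff = {0, 20, 50};                    // chest 0 is the dummy start
        int[][] dist = {{0, 1, 3}, {1, 0, 2}, {3, 2, 0}}; // symmetric step counts
        System.out.println(bestPayoff(payoff, dist, 3, 0)); // 70.0: chest 1 first, then chest 2
    }
}
```

The three nested loops make this sketch O(#steps * #chests^2).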

I'm curious if this worked for you.  :?

i think i am doing what you described. i made a little "solving problems of this kind"-framework.

at first i need to identify all unique situations. then, i sort them into "layers". layer one gets all situations which can be reached in one step, layer 2 contains all which can be reached in 2 steps and so on. just backwards. so layer one actually contains all possible "last steps" in an otherwise completely traversed maze. this way, i do calculate a lot of situations that can never be reached, but it solved the problem of finding the reachable ones by ignoring the fact that some of them can't be reached :smiley:

this is my solution. i used duenez's suggestion and adjusted it for my needs. in my case, there are not many different possible scores, but each of them appears quite a few times. i'm not sure about the exact numbers, probably billions or more. so, instead of storing the numbers themselves, i store the number of their occurrences. for example, if the 5 can be scored 10 times, m_data[5] is 10.


private static class Scores implements Serializable, Cloneable
  {
    private int[] m_data;

    public Scores(final int p_size)
    {
      super();
      m_data = new int[p_size];
    }

    public Scores clone()
    {
      final int[] l_copy = Arrays.copyOf(m_data, m_data.length);
      try
      {
        return ((Scores) super.clone()).replaceArray(l_copy);
      }
      catch (CloneNotSupportedException e)
      {
        throw new ThisShouldNeverHappen(e);
      }
    }

    private Scores replaceArray(final int[] p_copy)
    {
      m_data = p_copy;
      return this;
    }

    public void add(int p_score)
    {
      m_data[p_score]++;
    }

    public void addAll(int... p_score)
    {
      for (int i : p_score)
      {
        add(i);
      }
    }

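    // long instead of int: with very large occurrence counts the products below can overflow an int
    public long countHigherElements(final Scores p_other)
    {
      int l_ownCursor = getFirstOccurence();
      int l_otherCursor = p_other.getFirstOccurence();

      long l_ownWins = 0;
      long l_ownImplicitWinsIfBigger = 0;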

      while (l_ownCursor < m_data.length && l_otherCursor < p_other.m_data.length)
      {
        if (l_ownCursor > l_otherCursor)
        {
          do
          {
            l_ownImplicitWinsIfBigger += p_other.m_data[l_otherCursor];
            l_otherCursor = p_other.getNextOccurence(l_otherCursor);
          }
          while (l_ownCursor > l_otherCursor);
          do
          {
            l_ownWins += l_ownImplicitWinsIfBigger * m_data[l_ownCursor];
            l_ownCursor = getNextOccurence(l_ownCursor);
          }
          while (l_ownCursor <= l_otherCursor && l_ownCursor < m_data.length);
        }
        else
        {
          l_ownCursor = getNextOccurence(l_ownCursor);
        }
      }

      return l_ownWins;
    }


    public int getFirstOccurence()
    {
      for (int i = 0; i < m_data.length; i++)
      {
        if (m_data[i] > 0)
        {
          return i;
        }
      }
      return 0;
    }

    public int getNextOccurence(final int p_beginAt)
    {
      for (int i = p_beginAt + 1; i < m_data.length; i++)
      {
        if (m_data[i] > 0)
        {
          return i;
        }
      }
      return m_data.length;
    }

    public boolean isHigherThan(final Scores p_scores)
    {
      return countHigherElements(p_scores) > p_scores.countHigherElements(this);
    }

    public void addAll(final Scores p_scores)
    {
      for (int i = 0; i < m_data.length; i++)
      {
        m_data[i] += p_scores.m_data[i];
      }
    }

    public void mult(final int p_occurences)
    {
      for (int i = 0; i < m_data.length; i++)
      {
        m_data[i] *= p_occurences;
      }
    }

    public void addEachElementToAll(final Scores p_scores)
    {
      int[] l_tmp = new int[m_data.length];
      for (int i = 0; i < m_data.length; i++)
      {
        int l_count = m_data[i];
        if (l_count > 0)
        {
          p_scores.addToAll(l_count, i, l_tmp);
        }
      }
      m_data = l_tmp;
    }

    private void addToAll(final int p_count, final int p_value, final int[] p_storeIn)
    {
      for (int i = m_data.length; --i >= p_value;)
      {
        if (m_data[i - p_value] > 0)
        {
          p_storeIn[i] += m_data[i - p_value] * p_count;
        }
      }
    }
  }


  public static class TestScoreCounter extends TestCase
  {
    public void testSimple()
    {
      Scores l_score1 = new Scores(375);
      l_score1.addAll(100, 80, 60, 40, 2, 1);

      Scores l_score2 = new Scores(375);
      l_score2.addAll(70, 20, 10, 8);

      assertEquals(10, l_score2.countHigherElements(l_score1));
      assertEquals(14, l_score1.countHigherElements(l_score2));
    }

    public void testDoubleElements()
    {
      Scores l_score1 = new Scores(375);
      l_score1.addAll(100, 80, 60, 40, 2, 1);

      Scores l_score2 = new Scores(375);
      l_score2.addAll(70, 20, 10, 8, 8, 8);

      assertEquals(14, l_score2.countHigherElements(l_score1));
      assertEquals(22, l_score1.countHigherElements(l_score2));
    }

    public void testEqualElements()
    {
      Scores l_score1 = new Scores(375);
      l_score1.addAll(100, 80, 70, 60, 40, 2, 1);

      Scores l_score2 = new Scores(375);
      l_score2.addAll(70, 20, 10);

      Scores l_score3 = new Scores(375);
      l_score3.addAll(80, 70, 20, 10);

      Scores l_score4 = new Scores(375);
      l_score4.addAll(80, 70, 70, 70, 70, 20, 10);

      Scores l_score5 = new Scores(375);
      l_score5.addAll(1, 1, 2, 2, 3, 3, 4, 4, 5, 5);

      assertEquals(8, l_score2.countHigherElements(l_score1));
      assertEquals(12, l_score1.countHigherElements(l_score2));
      assertEquals(6, l_score3.countHigherElements(l_score2));
      assertEquals(9, l_score3.countHigherElements(l_score4));
      assertEquals(12, l_score4.countHigherElements(l_score3));
      assertEquals(15, l_score4.countHigherElements(l_score4));
      assertEquals(40, l_score5.countHigherElements(l_score5));
    }


    public void testOneElement()
    {
      Scores l_score1 = new Scores(375);
      l_score1.addAll(100);

      Scores l_score2 = new Scores(375);
      l_score2.addAll(110, 100, 10);

      Scores l_score3 = new Scores(375);
      l_score3.addAll(101, 99);

      assertEquals(1, l_score2.countHigherElements(l_score1));
      assertEquals(1, l_score1.countHigherElements(l_score2));
      assertEquals(3, l_score3.countHigherElements(l_score2));
    }

    public void testEvilCases()
    {
      Scores l_score1 = new Scores(375);
      l_score1.addAll(100);

      Scores l_score2 = new Scores(375);
      l_score2.addAll(1, 2, 3);

      Scores l_score3 = new Scores(375);
      l_score3.addAll(101, 102, 103);

      assertEquals(3, l_score1.countHigherElements(l_score2));
      assertEquals(3, l_score3.countHigherElements(l_score1));
      assertEquals(0, l_score2.countHigherElements(l_score3));
      assertEquals(9, l_score3.countHigherElements(l_score2));
    }

    public void testMergeTwoLevels()
    {
      Scores l_score1 = new Scores(375);
      l_score1.addAll(1,2,2,3,3,3);

      Scores l_score2 = new Scores(375);
      l_score2.addAll(10,20,20,30,30,30);

      l_score1.addEachElementToAll(l_score2);
      System.out.println("");
    }
  }



the unittests are working :)

however, i wonder if it's possible to optimize the isHigherThan-method. why? right now, it counts how many elements of a collection are higher than elements of the other, but the actual number is not important. i only need to know IF, not BY HOW MUCH, one collection is better. can it be optimized? the method is my last bottleneck. if this is solved, i have an abstract strategy generator for any type of single player "treasure chest maze" game. i even went so far as to abstract the score type away, so i can switch max wins with max median, max average and so on... *proud*
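one possible direction, only as a sketch: since only the sign of (own wins - other wins) matters, both counts can be folded into a single pass over the two occurrence arrays with running "how many are below this score" totals, instead of calling countHigherElements twice. the class `HistogramCompare` and its methods are hypothetical and work on plain int[] occurrence arrays of equal length rather than on the Scores class. the accumulators are long; with extremely large counts even that might overflow.

```java
final class HistogramCompare {
    /** Builds an occurrence array: result[v] = how often v appears in values. */
    static int[] histogram(int size, int... values) {
        int[] counts = new int[size];
        for (int v : values) {
            counts[v]++;
        }
        return counts;
    }

    /** Returns (pairs where own is higher) - (pairs where other is higher). */
    static long winBalance(int[] own, int[] other) {
        long ownBelow = 0;   // own occurrences with a score below v
        long otherBelow = 0; // other occurrences with a score below v
        long balance = 0;
        for (int v = 0; v < own.length; v++) {
            // every own occurrence at v beats all lower other scores, and vice versa
            balance += own[v] * otherBelow - other[v] * ownBelow;
            ownBelow += own[v];
            otherBelow += other[v];
        }
        return balance;
    }

    static boolean isHigherThan(int[] own, int[] other) {
        return winBalance(own, other) > 0;
    }

    public static void main(String[] args) {
        int[] a = histogram(13, 1, 2, 4, 4, 4, 5);
        int[] b = histogram(13, 0, 0, 4, 10, 12);
        System.out.println(winBalance(a, b));   // -1 (13 wins minus 14 losses)
        System.out.println(isHigherThan(b, a)); // true
    }
}
```

this touches every index once, so one comparison costs a single loop over the score range no matter how many occurrences are stored.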