Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-4726380

Perf Bug in Arrays.sort(Object[], int, int).

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: P4 P4
    • 1.4.2
    • 5.0
    • core-libs
    • None
    • mantis
    • generic
    • solaris_8
    • Verified

      There is (IMHO) a nasty performance bug in the methods

        Arrays.sort(Object[], int, int)
        Arrays.sort(Object[], int, int, Comparator)

      The javadoc says

        The sorting algorithm is a modified mergesort (in which the merge is omitted if the highest
        element in the low sublist is less than the lowest element in the high sublist). This
        algorithm offers guaranteed n*log(n) performance, and can approach linear performance
        on nearly sorted lists.

      Unfortunately, the "n" in this specification is undefined. The most
      obvious interpretation is that it would be the size of the subregion
      of the array being sorted, that is, the difference between "toIndex"
      and "fromIndex". Unfortunately, the algorithmic complexity of the
      implementation has a term that is linear in the size of the input
      array, because it does a "clone" operation! (I can tell from looking
      at the source, but a non-Sun person could tell by observing that the
      GC time differs wildly for sorting equal-sized subarrays of small and
      large arrays.)

      This is a quite pernicious problem -- a user can use this according to
      a reasonable interpretation of the spec (which *does* claim to specify
      its performance!) and get totally anomolous performance results,
      unrelated to code they can modify. For example, below I attach a
      program that exercises the bug (derived from code I was writing when I
      discovered this) where I just measured the following executions:

       trash 72 =>time java -ms24m -XX:-PrintGCDetails SortPerfBug 50 10000 10000 0
      5.53u 0.47s 0:06.11 98.1%
       trash 73 =>time java -ms24m -XX:-PrintGCDetails SortPerfBug 50 10000 10000 1
      1.04u 0.14s 0:01.21 97.5%

      The last argument value "1" means use a workaround. I could contrive
      an arbitrarily large ratio between the execution times with different
      other command lines. Here is the program:

      --------------------------------------------------
      import java.lang.*;
      import java.util.*;
      import java.lang.reflect.*;

      class SortPerfBug {

        private static void rangeCheck(int arrayLen, int fromIndex, int toIndex) {
          if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex +
      ") > toIndex(" + toIndex+")");
          if (fromIndex < 0)
            throw new ArrayIndexOutOfBoundsException(fromIndex);
          if (toIndex > arrayLen)
            throw new ArrayIndexOutOfBoundsException(toIndex);
        }

        private static Object cloneSubarray(Object a, int from, int to) {
          Class c = a.getClass();
          if (!c.isArray())
            throw new IllegalArgumentException("first argument is not an array.");
          int len = Array.getLength(a);
          rangeCheck(len, from, to);
          Class compClass = c.getComponentType();
          int n = to - from;
          Object result = Array.newInstance(compClass, n);
          System.arraycopy(a, from, result, 0, n);
          return result;
        }

        public static void main(String[] arg) {
          int arr_size = Integer.parseInt(arg[0]);
          int buf_size = Integer.parseInt(arg[1]);
          int n_sorts = Integer.parseInt(arg[2]);
          int do_clone = Integer.parseInt(arg[3]);
          Integer[] ia = new Integer[arr_size + 2 * buf_size];
          for (int i = buf_size; i < buf_size + arr_size; i++) {
            ia[i] = new Integer(i);
          }

          for (int sort_num = 0; sort_num < n_sorts; sort_num++) {
            if (do_clone != 0) {
      Integer[] ia_tmp = (Integer[])cloneSubarray(ia, buf_size, buf_size+arr_size);
      Arrays.sort(ia_tmp, 0, arr_size);
      for (int i = 0; i < arr_size; i++) {
      ia[buf_size + i] = ia_tmp[i];
      }
            } else {
      Arrays.sort(ia, buf_size, buf_size+arr_size);
            }
          }
        }
      }
      --------------------------------------------

            mmcclosksunw Michael Mccloskey (Inactive)
            duke J. Duke
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

              Created:
              Updated:
              Resolved:
              Imported:
              Indexed: