JDK-6714245

[Col] Collator - Faster Comparison for identical strings.

Details

    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P5
    • Fix Version/s: 21
    • Affects Version/s: 6u10, 21
    • Component/s: core-libs
    • Resolved In Build: b23
    • CPU: generic
    • OS: generic

    Description

      A DESCRIPTION OF THE REQUEST :
      Collator (RuleBasedCollator?) does not check whether the two compared strings are the same reference (==) before applying its complex comparison rules.

      JUSTIFICATION :
      It seems wasteful to compare every character of a string against itself only to conclude that the two are equal. TableRowSorter calls Collator.getInstance() for all String sorting, and JTables often contain duplicated data. The test program shows a significant speed increase with the == check, although Collator remains very slow compared to String.compareTo() or String.CASE_INSENSITIVE_ORDER. I assume this is because of Unicode support. It would be great if the US/English Collator were as fast as String.compareTo().

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Collator should check for == before moving on to the complicated comparisons.
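
      As a rough illustration of the requested behavior (not the actual JDK change), the check can be sketched as a wrapper around any comparator. The class name IdentityFirstDemo and the helper identityFirst are hypothetical names introduced here; the counting comparator exists only to show when the underlying Collator is consulted.

```java
import java.text.Collator;
import java.util.Comparator;
import java.util.concurrent.atomic.AtomicInteger;

class IdentityFirstDemo {

    /**
     * Hypothetical helper: returns 0 when both arguments are the same
     * reference, otherwise defers to the wrapped comparator.
     */
    static <T> Comparator<T> identityFirst(Comparator<? super T> delegate) {
        return (a, b) -> (a == b) ? 0 : delegate.compare(a, b);
    }

    public static void main(String[] args) {
        AtomicInteger delegateCalls = new AtomicInteger();
        Collator collator = Collator.getInstance();

        // Counts how often the (expensive) Collator comparison actually runs.
        Comparator<String> counting = (a, b) -> {
            delegateCalls.incrementAndGet();
            return collator.compare(a, b);
        };
        Comparator<String> fast = identityFirst(counting);

        String s = "Mississippi";
        String copy = new String(s); // equal content, different reference

        System.out.println(fast.compare(s, s));    // same reference: Collator skipped
        System.out.println(fast.compare(s, copy)); // different reference: Collator runs
        System.out.println("Collator invocations: " + delegateCalls.get());
    }
}
```

      Note that == only short-circuits comparisons of the very same object. Two distinct String objects with equal content still go through the full collation rules.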

      ---------- BEGIN SOURCE ----------
      package ScaleModels;

      import java.text.Collator;
      import java.util.ArrayList;
      import java.util.Collections;
      import java.util.Comparator;

      public class CollatorTest {

          /**
           * Sorts the same list with four different comparators and prints
           * the elapsed time of each sort in nanoseconds.
           *
           * @param args unused
           */
          public static void main(String[] args) {

              System.out.println("ComparableComparator Based Sort:");
              // long rather than float: a float cannot hold nanoTime() values exactly
              long startTime = System.nanoTime();
              Collections.sort(makeStateList());
              long endTime = System.nanoTime();
              System.out.println("Total Elapsed Time: " + (endTime - startTime));

              System.out.println("\nString.CASE_INSENSITIVE_ORDER Based Sort:");
              startTime = System.nanoTime();
              Collections.sort(makeStateList(), String.CASE_INSENSITIVE_ORDER);
              endTime = System.nanoTime();
              System.out.println("Total Elapsed Time: " + (endTime - startTime));

              System.out.println("\nCollator w/ == check Based Sort:");
              startTime = System.nanoTime();
              Collections.sort(makeStateList(), compareTestEq);
              endTime = System.nanoTime();
              System.out.println("Total Elapsed Time: " + (endTime - startTime));

              System.out.println("\nCollator Based Sort:");
              startTime = System.nanoTime();
              Collections.sort(makeStateList(), Collator.getInstance());
              endTime = System.nanoTime();
              System.out.println("Total Elapsed Time: " + (endTime - startTime));
          }

          /** Builds a list of 100000 entries that reuse the same three String references. */
          private static ArrayList<String> makeStateList() {
              String state1 = "Mississippi";
              String state2 = "New Mexico";
              String state3 = "New Jersey";

              ArrayList<String> listOfStates = new ArrayList<String>();
              for (int i = 0; i < 100000; i++) {
                  // i % 3 yields 0, 1 or 2; the original "case 3" was never reached
                  switch (i % 3) {
                      case 0:
                          listOfStates.add(state1);
                          break;
                      case 1:
                          listOfStates.add(state2);
                          break;
                      case 2:
                          listOfStates.add(state3);
                          break;
                  }
              }
              return listOfStates;
          }

          /** Collator-based comparator that returns early for identical references. */
          private static Comparator<String> compareTestEq = new Comparator<String>() {

              // Obtained once so that each compare() call reuses the same Collator
              private Comparator<Object> base = Collator.getInstance();

              @Override
              public int compare(String o1, String o2) {
                  if (o1 == o2) {
                      return 0;
                  }
                  return base.compare(o1, o2);
              }

          };

      }

      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      private Comparator<String> compareTestEq = new Comparator<String>() {

          @Override
          public int compare(String o1, String o2) {
              if (o1 == o2) {
                  return 0;
              }
              return Collator.getInstance().compare(o1, o2);
          }

      };
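
      The workaround above asks for a Collator on every comparison, whereas the test source obtains it once and reuses it. A minimal sketch of the workaround in that second form follows; the class name CachedCollatorComparator and the choice of Locale.US are assumptions made for this example, not part of the report.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Hypothetical comparator: identity check first, then a Collator obtained once. */
class CachedCollatorComparator implements Comparator<String> {

    // Held for the lifetime of the comparator instead of being
    // requested from Collator.getInstance() on every compare() call.
    private final Collator base;

    CachedCollatorComparator(Collator base) {
        this.base = base;
    }

    @Override
    public int compare(String o1, String o2) {
        if (o1 == o2) {
            return 0; // same reference, no collation needed
        }
        return base.compare(o1, o2);
    }

    public static void main(String[] args) {
        List<String> states = new ArrayList<>(
                Arrays.asList("New Mexico", "Mississippi", "New Jersey", "Mississippi"));
        // Locale.US is an assumption for this example.
        states.sort(new CachedCollatorComparator(Collator.getInstance(Locale.US)));
        System.out.println(states);
    }
}
```

      One behavioral difference to keep in mind: with the == check in front, compare(null, null) returns 0 without ever reaching the Collator.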

            People

              jlu Justin Lu
              ndcosta Nelson Dcosta (Inactive)