JDK-4174436

Correct Norwegian and Danish sort orders


    • tiger
    • generic
    • generic



      Name: bb33257 Date: 09/17/98


      We need to modify our collation orders for Norwegian and Danish
      to correspond with the official national standards for those
      countries. According to the ISO representatives for those
      countries, the correct rules are as follows:

      Some official wording on Danish, taken from the new ISO/IEC 15897
      cultural register:


         Ordering in Danish is defined in Danish Standard DS 377,
         3rd edition (1980) and the Danish Orthography Dictionary
         ("Retskrivningsordbogen", 2. udgave, Aschehoug, København
         1996. ISBN 87-11-10000-1).

         Normal <a> to <z> ordering is used on the Latin script, except
         for the following letters: The letters <æ> <ø> <å> are
         ordered as 3 separate letters after <z>. <ü> is ordered as <y>,
         <ä> as <æ>, <ö> as <ø>, <ð> as <d>, <þ> as <t><h>, French <œ>
         as <o><e>. Two <a>s are ordered as <å>, except when denoting two
         sounds (which is normally the case only in combined words).
         Nonaccented letters come before accented letters, and capital
         letters come before small letters, when words otherwise compare
         equally. There is no explicit ordering of accents specified
         in "Retskrivningsordbogen", and whether case or accents
         are the most important is not specified.
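
A minimal sketch of how the quoted Danish rules could be written for java.text.RuleBasedCollator. The rule string and the class name DanishRulesSketch are hypothetical and hand-written for illustration; this is not the JDK's shipped da_DK data. It covers only the basic Latin letters, the three extra letters after z, and the ü/ä/ö equivalences. It treats every double a as å (it cannot detect the "two sounds" exception) and leaves out the ð, þ and œ expansions.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

public class DanishRulesSketch {
    // Hypothetical rule string (an assumption, not the JDK's locale data).
    // '<' marks a primary difference, ';' a secondary one, ',' a tertiary one.
    // Capitals are listed first so they sort before small letters on ties.
    static final String RULES =
          "< A, a < B, b < C, c < D, d < E, e < F, f < G, g < H, h < I, i"
        + "< J, j < K, k < L, l < M, m < N, n < O, o < P, p < Q, q < R, r"
        + "< S, s < T, t < U, u < V, v < W, w < X, x"
        + "< Y, y ; \u00DC, \u00FC"            // u-diaeresis ordered as y
        + "< Z, z"
        + "< \u00C6, \u00E6 ; \u00C4, \u00E4"  // ae after z; a-diaeresis ordered as ae
        + "< \u00D8, \u00F8 ; \u00D6, \u00F6"  // o-slash; o-diaeresis ordered as o-slash
        + "< \u00C5, \u00E5 ; AA, Aa, aa";     // a-ring last; double a ordered as a-ring

    // Builds the collator; a parse failure would mean the rule string above is malformed.
    static RuleBasedCollator build() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalStateException("rule string did not parse", e);
        }
    }

    public static void main(String[] args) {
        RuleBasedCollator collator = build();
        collator.setStrength(Collator.PRIMARY);
        // Each line prints whether the left string sorts before the right one.
        System.out.println("z < ae: " + (collator.compare("z", "\u00E6") < 0));
        System.out.println("ae < o-slash: " + (collator.compare("\u00E6", "\u00F8") < 0));
        System.out.println("o-slash < a-ring: " + (collator.compare("\u00F8", "\u00E5") < 0));
        System.out.println("z < aa: " + (collator.compare("z", "aa") < 0));
    }
}
```

Because the rules are written from scratch rather than appended to a locale's existing rules, the result does not depend on which locale data a given JDK ships.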

      Data from the ISO/IEC (and CEN) cultural register is available
      at http://www.dkuug.dk/cultreg/

      -----------------------------------------------------------
      Norwegian ordering is as follows

      Aa, Bb, Cc....,Yy:Üü, Vv..., Zz,Ææ:Ää, Øø:Öö,Åå:<Aa><aa>.

      Æ has the name LATIN LETTER AE ... (Ash) in 10 646, by the way.
      Notice that a double a (Aa or aa) is ordered as Å and å. Å replaced Aa in
      the 1917 Norwegian writing reform. The same happened in Danish in 1948.

      Æ can be displayed as AE, Ø as OE and Å as AA in 7-bit ASCII when there
      are no alternative ways of displaying them. AE and
      OE are not legitimate variants of Æ and Ø as such and should therefore be
      regarded as separate characters when written as separate characters.

      I have been representing Norway in the JTC1/SC2 in the ISO/IEC 10 646 work.
      Kolbjørn Aambø,
      University of Oslo Library.


      ======================================================================


      Name: rlT66838 Date: 07/20/99


      For Danish (da_DK) and Norwegian (no_NO, no_NO_B) users, Java sorts incorrectly:
      when sorting catalog products by description, v and w are treated as the same
      character (they have the same weighting).
      Example:
      1. waffle
      2. verkehrt
      3. Victor
      4. wood
      5. vox
      6. wrench
      The correct result is for v and w to have different primary weightings. This
      desired behavior has been confirmed by a native Norwegian in our company.

      // Get the Collator for no_NO (or da_DK or no_NO_B) and set its strength to PRIMARY.
      // Requires: import java.text.Collator; import java.util.Locale;
      Collator no_NO_Collator = Collator.getInstance(new Locale("no", "NO"));
      no_NO_Collator.setStrength(Collator.PRIMARY);
      if (no_NO_Collator.compare("waffle", "vaffle") == 0)
      {
          System.out.println("Strings are equivalent");
      }

      The (incorrect) result is that "Strings are equivalent" is printed. This
      happens for the locales no_NO, no_NO_B and da_DK, at both
      Collator.PRIMARY and Collator.SECONDARY strength. The difference only
      shows up at Collator.TERTIARY.

      The correct behavior is for "w" and "v" to compare as different characters
      at Collator.PRIMARY and Collator.SECONDARY strength. This is the correct
      behavior in Danish and Norwegian, as described by our native Norwegian
      employee.
      (Review ID: 85477)
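
The reported symptom (equal at PRIMARY and SECONDARY, different only at TERTIARY) is what one would expect if w were defined as a tertiary variant of v. The sketch below reproduces that pattern with two tiny, hypothetical rule sets covering only the letters in "waffle" and "vaffle". The class name VwStrengthSketch and both rule strings are assumptions for illustration, not the JDK's actual locale rules.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

public class VwStrengthSketch {
    // Hypothetical rules: w differs from v only at the tertiary level (',').
    static final String W_AS_VARIANT_OF_V = "< a < e < f < l < v, w";
    // Hypothetical rules: w is a separate letter with its own primary weight ('<').
    static final String W_AS_OWN_LETTER = "< a < e < f < l < v < w";

    // Compares s and t with a collator built from the given rules at the given strength.
    static int compareAt(String rules, int strength, String s, String t) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules);
            collator.setStrength(strength);
            return collator.compare(s, t);
        } catch (ParseException e) {
            throw new IllegalStateException("rule string did not parse", e);
        }
    }

    public static void main(String[] args) {
        int[] strengths = {Collator.PRIMARY, Collator.SECONDARY, Collator.TERTIARY};
        String[] names = {"PRIMARY", "SECONDARY", "TERTIARY"};
        for (int i = 0; i < strengths.length; i++) {
            int asVariant = compareAt(W_AS_VARIANT_OF_V, strengths[i], "waffle", "vaffle");
            int asLetter = compareAt(W_AS_OWN_LETTER, strengths[i], "waffle", "vaffle");
            // A result of 0 means the two strings are treated as equivalent.
            System.out.println(names[i] + ": variant rules=" + asVariant
                    + ", separate-letter rules=" + asLetter);
        }
    }
}
```

With the second rule set the two words already differ at PRIMARY strength, which is the behavior the report asks for.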
      ======================================================================

            kcolfersunw Kieran Colfer (Inactive)
            bcbeck Brian Beck (Inactive)