JDK-6959785

UTF-8 encoding does not recognize initial BOM


    • Type: Bug
    • Resolution: Unresolved
    • Priority: P4
    • tbd
    • 6u10, 8
    • core-libs
    • Cause Known
    • generic, x86
    • generic, windows_xp

      FULL PRODUCT VERSION :


      ADDITIONAL OS VERSION INFORMATION :
      all OS

      A DESCRIPTION OF THE PROBLEM :
      A UTF-8 stream can optionally begin with a byte order mark (see, for example, http://www.unicode.org/unicode/faq/utf_bom.html). This is the character U+FEFF, which is represented as EF BB BF in UTF-8. Java's UTF-8 encoding does not recognize this character as a BOM, though; the result of reading such a stream is a sequence of characters beginning with U+FEFF.

      see bug:
      http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058

      Look at the comments too.
      Look at the vote count.


      REPRODUCIBILITY :
      This bug can be reproduced always.
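
A minimal reproduction sketch, assuming Java 7 or later for `StandardCharsets`. The class and method names (`BomRepro`, `firstDecodedChar`) are illustrative only and do not come from the original report. It decodes the bytes EF BB BF followed by the letter `a` and prints the first decoded character:

```java
import java.nio.charset.StandardCharsets;

class BomRepro {
    // Hypothetical helper: decodes the bytes as UTF-8 and returns the first char.
    static char firstDecodedChar(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8).charAt(0);
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 encoding of U+FEFF; 'a' is the first real character.
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a'};
        // Prints "first char: U+FEFF" because the decoder passes the BOM through.
        System.out.printf("first char: U+%04X%n", (int) firstDecodedChar(withBom));
    }
}
```

The sketch uses the `String(byte[], Charset)` constructor for brevity; reading the same bytes through an `InputStreamReader` with the UTF-8 charset should behave the same way.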

      CUSTOMER SUBMITTED WORKAROUND :
      Application code must recognize and skip the BOM itself.
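
One possible shape for this workaround is sketched below, assuming Java 8 or later. The names `BomSkippingReader`, `open`, and `decode` are hypothetical, not part of the JDK or of the original report. The sketch peeks at the first three bytes through a `PushbackInputStream`, drops them only if they are exactly EF BB BF, and otherwise pushes them back before decoding:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BomSkippingReader {
    // Returns a UTF-8 Reader over `in` with a leading EF BB BF consumed, if present.
    static Reader open(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // Loop because a single read() may return fewer bytes than requested.
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) break;
            n += r;
        }
        boolean hasBom = n == 3
                && head[0] == (byte) 0xEF
                && head[1] == (byte) 0xBB
                && head[2] == (byte) 0xBF;
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n); // not a BOM: give the bytes back
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }

    // Convenience for in-memory data: decodes the bytes, skipping a leading BOM.
    static String decode(byte[] utf8Bytes) {
        try (Reader reader = open(new ByteArrayInputStream(utf8Bytes))) {
            StringBuilder out = new StringBuilder();
            for (int c = reader.read(); c != -1; c = reader.read()) {
                out.append((char) c);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this sketch, `BomSkippingReader.decode` returns `"hi"` both for the bytes EF BB BF 68 69 and for 68 69 alone. Only a BOM at the very start of the stream is removed; a U+FEFF appearing later in the data is left untouched.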

            sherman Xueming Shen
            webbuggrp Webbug Group