JDK should provide support for charset detection.


    • Type: Enhancement
    • Resolution: Won't Fix
    • Priority: P5
    • None
    • Affects Version/s: 6
    • Component/s: core-libs

      A DESCRIPTION OF THE REQUEST :
      The ICU Unicode utilities include a CharsetDetector class that is able to scan a byte stream and detect the character encoding of character data in an unknown format. It produces a CharsetMatch that contains the name of the detected charset and indicates the level of confidence of the match. It can also guess the language of the text. This would be a very useful addition to the core Java libraries.

      JUSTIFICATION :
      Charset detection is useful when reading in text documents where the character set is unknown. It would allow the documents to be decoded correctly with a reasonable level of confidence. The language detection allows the text to be processed correctly in a language-sensitive manner.


      CUSTOMER SUBMITTED WORKAROUND :
      Bundle the ICU libraries with the application.
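
      Where bundling ICU is not practical, a much cruder fallback can be written with the JDK alone. The sketch below is not the ICU CharsetDetector and does no statistical analysis, confidence scoring, or language guessing: it tries each candidate charset in order with a strict decoder and returns the first one that decodes the bytes without error. The class name StrictDecodeGuess, the method firstStrictMatch, and the candidate list are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class StrictDecodeGuess {

    // Hypothetical helper: returns the first candidate charset that decodes
    // the bytes with no malformed or unmappable input, or empty if none does.
    public static Optional<Charset> firstStrictMatch(byte[] data, List<Charset> candidates) {
        for (Charset cs : candidates) {
            try {
                cs.newDecoder()
                  .onMalformedInput(CodingErrorAction.REPORT)
                  .onUnmappableCharacter(CodingErrorAction.REPORT)
                  .decode(ByteBuffer.wrap(data));
                return Optional.of(cs);
            } catch (CharacterCodingException e) {
                // The bytes are not valid in this charset; try the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed sample input: Latin-1 bytes that are not valid UTF-8.
        byte[] sample = "na\u00efve caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        List<Charset> candidates =
                Arrays.asList(StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(firstStrictMatch(sample, candidates));
    }
}
```

      Candidate order matters: single-byte charsets such as ISO-8859-1 accept any byte sequence, so they should come last, and a match only means the bytes are valid in that charset, not that it is the one the author used. This limitation is why a statistical detector like the one in ICU is requested.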

            Assignee:
            Xueming Shen
            Reporter:
            Nelson Dcosta (Inactive)
            Votes:
            0
            Watchers:
            0
