JDK-8251119

Implement UTF-8 / UTF-16 conversion intrinsics on x86 and AArch64


    • x86, x86_64, aarch64
    • generic

      UTF-16 is the in-memory encoding of Java strings (whenever they cannot be stored as compact Latin-1 strings), while UTF-8 is the most common encoding for external text. Java applications therefore convert between the two encodings frequently.

      The current implementation of UTF-8 / UTF-16 conversion in `sun.nio.cs.UTF_8` can be accelerated with vectorization (see "A Case Study in SIMD Text Processing with Parallel Bit Streams: UTF-8 to UTF-16 Transcoding", https://dl.acm.org/doi/pdf/10.1145/1345206.1345222, for an academic reference). We expect such acceleration to benefit any application that processes UTF-8 text.
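For illustration, the conversion paths in question can be exercised through the public charset API. This is a minimal sketch, not the proposed intrinsic: the class name `Utf8RoundTrip` and its helper methods are hypothetical, and it assumes an OpenJDK runtime, where `StandardCharsets.UTF_8` is backed by `sun.nio.cs.UTF_8`.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, used only to illustrate the two conversion directions.
public class Utf8RoundTrip {

    // UTF-8 bytes -> UTF-16 chars, via the charset's decoder.
    static String decodeUtf8(byte[] utf8) {
        try {
            CharBuffer chars = StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(utf8));
            return chars.toString();
        } catch (CharacterCodingException e) {
            // A fresh decoder reports malformed input instead of replacing it.
            throw new IllegalArgumentException("malformed UTF-8 input", e);
        }
    }

    // UTF-16 chars -> UTF-8 bytes, via the charset's encoder.
    static byte[] encodeUtf8(String s) {
        try {
            ByteBuffer bytes = StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(s));
            byte[] out = new byte[bytes.remaining()];
            bytes.get(out);
            return out;
        } catch (CharacterCodingException e) {
            // Raised, for example, for an unpaired surrogate in the input string.
            throw new IllegalArgumentException("unencodable UTF-16 input", e);
        }
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo \u20ac";   // 7 UTF-16 code units
        byte[] utf8 = encodeUtf8(text);      // 1 + 2 + 3 + 1 + 3 = 10 bytes
        System.out.println(text.length() + " chars -> " + utf8.length + " bytes");
        System.out.println(decodeUtf8(utf8).equals(text));
    }
}
```

Both directions walk the input element by element in the scalar implementation, which is the work a vectorized intrinsic would aim to speed up, particularly for long ASCII runs where each byte maps to exactly one char.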

            luhenry Ludovic Henry