CharsetEncoder.canEncode(CharSequence) is much slower than necessary

    • Type: Bug
    • Resolution: Unresolved
    • Priority: P4
    • Fix Version/s: tbd
    • Affects Version/s: 26
    • Component/s: core-libs
    • None

      Subclasses of `CharsetEncoder` often override `canEncode(char)` to make it very fast. The same is not true of `canEncode(CharSequence)`, which usually has to run the full encoding process. As a result, `canEncode(CharSequence)` is about 20x slower than `canEncode(char)` when the input is encodable, and about 1600x slower when it is not. Un-encodable input is so much slower because the internal logic relies on a thrown exception to detect that the input cannot be encoded, which incurs the cost of creating that exception (stack trace setup, etc.).
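The difference between the two paths can be sketched as follows. This is only an illustration, not the fix proposed for this issue: `canEncodeViaEncode` and `canEncodeFast` are hypothetical helper names, and the fast variant assumes a stateless encoder for which per-character answers can be combined (true of the charsets benchmarked here, but not necessarily of every charset).

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class CanEncodeSketch {

    /** Hypothetical helper: runs a full encode and treats an exception as "no". */
    static boolean canEncodeViaEncode(CharsetEncoder enc, CharSequence cs) {
        try {
            // A fresh encoder reports malformed/unmappable input by throwing.
            enc.encode(CharBuffer.wrap(cs));
            return true;
        } catch (CharacterCodingException e) {
            // Creating this exception is what makes the "No" case expensive.
            return false;
        } finally {
            enc.reset();
        }
    }

    /** Hypothetical helper: uses the cheap per-char check wherever possible. */
    static boolean canEncodeFast(CharsetEncoder enc, CharSequence cs) {
        int len = cs.length();
        int i = 0;
        while (i < len) {
            char c = cs.charAt(i);
            boolean pair = Character.isHighSurrogate(c)
                    && i + 1 < len
                    && Character.isLowSurrogate(cs.charAt(i + 1));
            if (pair) {
                // A surrogate pair is one code point; check both chars together.
                if (!enc.canEncode(cs.subSequence(i, i + 2))) {
                    return false;
                }
                i += 2;
            } else {
                // BMP char or unpaired surrogate: the per-char check decides.
                if (!enc.canEncode(c)) {
                    return false;
                }
                i++;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        CharsetEncoder ascii = StandardCharsets.US_ASCII.newEncoder();
        System.out.println(canEncodeViaEncode(ascii, "hello"));
        System.out.println(canEncodeFast(ascii, "hello"));
        System.out.println(canEncodeViaEncode(ascii, "h\u00e9llo"));
        System.out.println(canEncodeFast(ascii, "h\u00e9llo"));
    }
}
```

Both helpers should agree on their answers for the charsets above. The second one avoids allocating an output buffer and never throws for un-encodable BMP input.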

      JMH benchmark results (rows marked `<<` are the `canEncode(CharSequence)` cases):

      Benchmark                                    Mode  Cnt    Score    Error  Units
      CharsetCanEncode.asciiCanEncodeCharNo        avgt   30    0.502 ±  0.004  ns/op
      CharsetCanEncode.asciiCanEncodeCharYes       avgt   30    0.503 ±  0.003  ns/op
      CharsetCanEncode.asciiCanEncodeStringNo      avgt   30  821.635 ±  7.055  ns/op  <<
      CharsetCanEncode.asciiCanEncodeStringYes     avgt   30    8.875 ±  0.115  ns/op  <<
      CharsetCanEncode.iso88591CanEncodeCharNo     avgt   30    0.508 ±  0.006  ns/op
      CharsetCanEncode.iso88591CanEncodeCharYes    avgt   30    0.506 ±  0.004  ns/op
      CharsetCanEncode.iso88591CanEncodeStringNo   avgt   30  833.165 ±  7.315  ns/op  <<
      CharsetCanEncode.iso88591CanEncodeStringYes  avgt   30   10.357 ±  1.427  ns/op  <<
      CharsetCanEncode.iso88592CanEncodeCharNo     avgt   30    0.957 ±  0.009  ns/op
      CharsetCanEncode.iso88592CanEncodeCharYes    avgt   30    1.407 ±  0.010  ns/op
      CharsetCanEncode.iso88592CanEncodeStringNo   avgt   30  826.478 ±  4.409  ns/op  <<
      CharsetCanEncode.iso88592CanEncodeStringYes  avgt   30   13.223 ±  1.479  ns/op  <<
      CharsetCanEncode.shiftjisCanEncodeCharNo     avgt   30    1.370 ±  0.012  ns/op
      CharsetCanEncode.shiftjisCanEncodeCharYes    avgt   30    1.386 ±  0.010  ns/op
      CharsetCanEncode.shiftjisCanEncodeStringNo   avgt   30  850.336 ± 20.107  ns/op  <<
      CharsetCanEncode.shiftjisCanEncodeStringYes  avgt   30   10.672 ±  0.088  ns/op  <<
      CharsetCanEncode.utf16leCanEncodeCharNo      avgt   30    0.518 ±  0.005  ns/op
      CharsetCanEncode.utf16leCanEncodeCharYes     avgt   30    0.517 ±  0.005  ns/op
      CharsetCanEncode.utf16leCanEncodeStringNo    avgt   30  857.907 ± 15.492  ns/op  <<
      CharsetCanEncode.utf16leCanEncodeStringYes   avgt   30   12.492 ±  1.444  ns/op  <<
      CharsetCanEncode.utf8CanEncodeCharNo         avgt   30    0.522 ±  0.008  ns/op
      CharsetCanEncode.utf8CanEncodeCharYes        avgt   30    0.518 ±  0.004  ns/op
      CharsetCanEncode.utf8CanEncodeStringNo       avgt   30  869.428 ± 11.116  ns/op  <<
      CharsetCanEncode.utf8CanEncodeStringYes      avgt   30   19.587 ±  0.190  ns/op  <<
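The slowdown factors can be recomputed from the scores above. The short sketch below divides each string score by the matching char score. The class and method names are illustrative only, the scores are copied from the table, and the error margins are ignored. The per-charset ratios vary quite a bit (roughly 8x to 38x for encodable input and roughly 600x to 1700x for un-encodable input), largely because `canEncode(char)` itself is slower for some charsets.

```java
final class SlowdownRatios {

    static final String[] NAMES = {
        "ascii", "iso88591", "iso88592", "shiftjis", "utf16le", "utf8"
    };

    // Columns: charYes, stringYes, charNo, stringNo (ns/op, from the table).
    static final double[][] NS = {
        {0.503,  8.875, 0.502, 821.635},
        {0.506, 10.357, 0.508, 833.165},
        {1.407, 13.223, 0.957, 826.478},
        {1.386, 10.672, 1.370, 850.336},
        {0.517, 12.492, 0.518, 857.907},
        {0.518, 19.587, 0.522, 869.428},
    };

    /** How many times slower the string check is than the char check. */
    static double slowdown(double stringNs, double charNs) {
        return stringNs / charNs;
    }

    public static void main(String[] args) {
        for (int i = 0; i < NAMES.length; i++) {
            double yes = slowdown(NS[i][1], NS[i][0]);
            double no = slowdown(NS[i][3], NS[i][2]);
            System.out.printf("%-9s encodable: %6.1fx  un-encodable: %7.1fx%n",
                    NAMES[i], yes, no);
        }
    }
}
```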

            Assignee: Daniel Gredler
            Reporter: Daniel Gredler