JDK-8356979

Convert unicode sequences in tests to UTF-8


    • Type: Bug
    • Resolution: Unresolved
    • Priority: P4
    • 25
    • 25
    • infrastructure
    • None
    • In Review

      Now that the source base has been converted to be fully UTF-8, we no longer need to use Unicode escape sequences (like \u0123) in string literals, and are free to replace them with real UTF-8 characters.
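      As a small illustration of why such a replacement is behavior-neutral: the Java compiler translates Unicode escapes before it tokenizes the source, so the escaped and the literal spelling of a character produce the same string. The sketch below assumes the file is compiled as UTF-8; the class and method names are hypothetical and not taken from the patch.

```java
public class EscapeEquivalence {
    // Returns true when the escaped and the literal spelling denote the same string.
    static boolean sameString() {
        String escaped = "caf\u00e9"; // escape sequence for the accented e
        String literal = "café";      // the same character written directly (needs UTF-8 source)
        return escaped.equals(literal);
    }

    // Both spellings yield a four-character string.
    static int escapedLength() {
        return "caf\u00e9".length();
    }

    public static void main(String[] args) {
        System.out.println(sameString());
        System.out.println(escapedLength());
    }
}
```

      The only difference is in how the source reads, which is why the decision to convert is a readability question rather than a functional one.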

      Whether that makes sense or not depends very much on the actual circumstances. In contrast to the sibling patch JDK-8356978 (Convert unicode sequences in Java source code to UTF-8), which deals with the `src` directory and where basically all sequences made sense to convert, the situation for the `test` directory is very different.

      First of all, there are a lot more non-ASCII Unicode characters, due to the need to be able to test with these kinds of characters. Secondly, in many cases the unicode characters are contrived and supposed to provoke a specific behavior, rather than to be readable text.

      I did an automatic conversion of all unicode sequences to UTF-8 characters in the test files, and then I went through the result and immediately reverted most of the changes. If at first glance something did not make sense, it was reverted without hesitation. I then made several passes at the remaining files. In the end, I kept those changes where I believe the readability of the test is improved by having real UTF-8 characters rather than abstract unicode sequences. Since this is a judgement call, opinions may vary.
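      For illustration only, an automatic first pass of this kind could look roughly like the sketch below. It is not the tool used for this patch; the class name `UnescapeSketch` and its `unescape` method are hypothetical. It is also deliberately simplified: it leaves ASCII, control, and surrogate code units escaped, and it does not check whether the backslash is itself escaped (as in `\\u0041`), so its output still needs the manual review described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnescapeSketch {
    // Matches a backslash, one or more 'u', then exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    // Replaces escape sequences in the given source text with the real character.
    // Simplification: a doubled backslash before the 'u' is not detected.
    static String unescape(String source) {
        Matcher m = ESCAPE.matcher(source);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // Keep ASCII, control characters and surrogates in escaped form.
            boolean keepEscaped = c < 0x80
                    || Character.isISOControl(c)
                    || Character.isSurrogate(c);
            String replacement = keepEscaped ? m.group() : String.valueOf(c);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash makes this a literal backslash in the input text.
        System.out.println(unescape("String s = \"caf\\u00e9\";"));
    }
}
```

      A mechanical pass like this cannot tell readable text from contrived test input, which is why most of its output was reverted by hand.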

            Assignee: Magnus Ihse Bursie (ihse)
            Reporter: Magnus Ihse Bursie (ihse)
