JDK-8020596

Initialization of white space strings in scanner should be done with \u strings

    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • 8
    • 8
    • core-libs
    • None
    • b102
    • generic
    • generic

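      Iterating through every char value from 0x0000 to 0xffff in the static initializer, just to pick out the 26 whitespace characters, seems inefficient. The strings could instead be written directly as \u-escaped string literals.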

          static {
              final StringBuilder ws = new StringBuilder();
              final StringBuilder wsEOL = new StringBuilder();
              final StringBuilder wsRegExp = new StringBuilder();
              final StringBuilder jsonWs = new StringBuilder();

              jsonWs.append((char)0x000a);
              jsonWs.append((char)0x000d);
              JSON_WHITESPACE_EOL = jsonWs.toString();

              jsonWs.append((char)0x0009);
              jsonWs.append((char)0x0020);
              JSON_WHITESPACE = jsonWs.toString();

              for (int i = 0; i <= 0xffff; i++) {
                  switch (i) {
                  case 0x000a: // line feed
                  case 0x000d: // carriage return (ctrl-m)
                  case 0x2028: // line separator
                  case 0x2029: // paragraph separator
                      wsEOL.append((char)i);
                      // fall through: line terminators are also whitespace
                  case 0x0009: // tab
                  case 0x0020: // ASCII space
                  case 0x000b: // line tabulation (vertical tab)
                  case 0x000c: // ff (ctrl-l)
                  case 0x00a0: // Latin-1 space
                  case 0x1680: // Ogham space mark
                  case 0x180e: // separator, Mongolian vowel
                  case 0x2000: // en quad
                  case 0x2001: // em quad
                  case 0x2002: // en space
                  case 0x2003: // em space
                  case 0x2004: // three-per-em space
                  case 0x2005: // four-per-em space
                  case 0x2006: // six-per-em space
                  case 0x2007: // figure space
                  case 0x2008: // punctuation space
                  case 0x2009: // thin space
                  case 0x200a: // hair space
                  case 0x202f: // narrow no-break space
                  case 0x205f: // medium mathematical space
                  case 0x3000: // ideographic space
                  case 0xfeff: // byte order mark
                      ws.append((char)i);

                      wsRegExp.append(Lexer.unicodeEscape((char)i));
                      break;

                  default:
                      break;
                  }
              }

              JAVASCRIPT_WHITESPACE = ws.toString();
              JAVASCRIPT_WHITESPACE_EOL = wsEOL.toString();
              JAVASCRIPT_WHITESPACE_IN_REGEXP = wsRegExp.toString();

          }
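A minimal sketch of how the same strings might be written as literals is shown here. The field names follow the snippet above; the class name WhitespaceTables and the formatting used for the regexp string (standing in for Lexer.unicodeEscape, whose body is not shown in this report) are assumptions for illustration, not necessarily the change that was committed. Line feed and carriage return are spelled \n and \r because Java translates Unicode escapes before lexing, so the escaped forms of those two characters would break the string literal.

```java
import java.util.stream.Collectors;

// Hypothetical holder class; in the report these fields live in the scanner.
class WhitespaceTables {
    // Same contents and order as the StringBuilder version: LF, CR, then tab, space.
    static final String JSON_WHITESPACE_EOL = "\n\r";
    static final String JSON_WHITESPACE = JSON_WHITESPACE_EOL + "\t ";

    // Line terminators only, in ascending code point order as the loop produced them.
    static final String JAVASCRIPT_WHITESPACE_EOL =
        "\n" +      // line feed
        "\r" +      // carriage return
        "\u2028" +  // line separator
        "\u2029";   // paragraph separator

    // All whitespace, including the line terminators, in ascending order.
    static final String JAVASCRIPT_WHITESPACE =
        "\t" +      // tab
        "\n" +      // line feed
        "\u000b" +  // line tabulation
        "\f" +      // form feed
        "\r" +      // carriage return
        " " +       // ASCII space
        "\u00a0" +  // Latin-1 no-break space
        "\u1680" +  // Ogham space mark
        "\u180e" +  // Mongolian vowel separator
        "\u2000\u2001\u2002\u2003\u2004\u2005" + // en quad through four-per-em space
        "\u2006\u2007\u2008\u2009\u200a" +       // six-per-em space through hair space
        "\u2028" +  // line separator
        "\u2029" +  // paragraph separator
        "\u202f" +  // narrow no-break space
        "\u205f" +  // medium mathematical space
        "\u3000" +  // ideographic space
        "\ufeff";   // byte order mark

    // Assumption: each character is rendered as a backslash, the letter u and
    // four lowercase hex digits. The real helper may format differently.
    static final String JAVASCRIPT_WHITESPACE_IN_REGEXP =
        JAVASCRIPT_WHITESPACE.chars()
            .mapToObj(c -> String.format("\\u%04x", c))
            .collect(Collectors.joining());

    public static void main(String[] args) {
        System.out.println(JAVASCRIPT_WHITESPACE.length());     // prints 26
        System.out.println(JAVASCRIPT_WHITESPACE_EOL.length()); // prints 4
    }
}
```

Because the literals are compile-time constants, no loop runs at class initialization; only the regexp string is still computed, and it walks 26 characters rather than 65,536.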

            Assignee: Jim Laskey (jlaskey)
            Reporter: Jim Laskey (jlaskey)
            Votes: 0
            Watchers: 2
