JDK-4106810

java.io.StreamTokenizer cannot parse "/" alphanumeric and C/C++ comments simulta

    • Type: Bug
    • Resolution: Fixed
    • Priority: P4
    • 1.2.0
    • 1.1.4
    • core-libs
    • 1.2beta4
    • generic
    • generic
    • Not verified



      Name: dgC58589 Date: 01/26/98


      java.io.StreamTokenizer should be able to parse
      "/" as a word constituent and strip C and/or C++
      comments simultaneously.

      My application parses ASCII files containing
      market data with "/"-delimited dates, and I can think
      of other uses. The documentation for the
      StreamTokenizer class is inadequate and gives no
      hint that this won't work.

      "/" should be allowed to be a word constituent, so date
      strings like "1/16/98" get parsed as words. Otherwise, if "/" is an
      ordinary character (and " " is white space), there's no way to tell the
      difference between "1/1" and "1 / 1", "1/ 1", or "1 /1".
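
      The ambiguity described above can be reproduced with a short sketch.
      The class name SlashAmbiguity and the helper tokens() are hypothetical
      names chosen for illustration; only the StreamTokenizer calls come
      from java.io. The word-constituent variant resets the syntax table
      first, on the assumption that digits should be collected into words
      rather than parsed as numbers.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the JDK or of the reporter's patch.
class SlashAmbiguity {
    /** Tokenizes input with '/' as either an ordinary character or a word constituent. */
    static List<String> tokens(String input, boolean slashIsWordChar) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        if (slashIsWordChar) {
            // Start from an empty table so digits become word characters, not numbers.
            st.resetSyntax();
            st.whitespaceChars(0, ' ');
            st.wordChars('0', '9');
            st.wordChars('/', '/');
        } else {
            // '/' is a comment character by default; make it ordinary instead.
            st.ordinaryChar('/');
        }
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add("NUM " + st.nval);
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add("WORD " + st.sval);
                } else {
                    out.add("CHAR " + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // With '/' ordinary, these two inputs produce the same token list.
        System.out.println(tokens("1/1", false));
        System.out.println(tokens("1 / 1", false));
        // With '/' as a word constituent, the date stays in one piece.
        System.out.println(tokens("1/16/98", true));
        System.out.println(tokens("1 / 1", true));
    }
}
```

      With "/" ordinary, "1/1" and "1 / 1" should be indistinguishable from
      the token stream alone; with "/" as a word constituent they differ,
      but no comment handling is enabled in this sketch.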

      The clause in the main loop of the tokenizer that begins
      "if ((ctype & CT_ALPHA) != 0)", which parses words, appears before the
      one that begins "if c == '/' && (slashSlashCommentsP...", so if "/" is
      set to be a word constituent, there's no way the tokenizer can possibly
      parse C or C++ comments. Anyway, I've already got a fix for the source
      code. I can send it to you if you're interested.
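
      The ordering issue can be illustrated outside of StreamTokenizer with
      a tiny hand-written splitter. This is only a sketch of the idea, not
      the reporter's patch and not the JDK source: the class
      SlashFirstTokenizer and its methods are hypothetical, and it assumes
      whitespace-separated words with the comment test applied only at the
      start of a token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "check for a comment before the word clause".
class SlashFirstTokenizer {
    /** Splits input into words; '/' is a word character unless it opens a comment. */
    static List<String> words(String s) {
        List<String> out = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        int i = 0;
        int n = s.length();
        while (i < n) {
            char c = s.charAt(i);
            boolean atTokenStart = word.length() == 0;
            boolean hasNext = i + 1 < n;
            if (atTokenStart && c == '/' && hasNext && s.charAt(i + 1) == '/') {
                // C++ comment: skip to end of line before any word handling.
                while (i < n && s.charAt(i) != '\n' && s.charAt(i) != '\r') {
                    i++;
                }
            } else if (atTokenStart && c == '/' && hasNext && s.charAt(i + 1) == '*') {
                // C comment: skip past the closing "*/", or to end of input if unterminated.
                int end = s.indexOf("*/", i + 2);
                i = (end < 0) ? n : end + 2;
            } else if (Character.isWhitespace(c)) {
                flush(word, out);
                i++;
            } else {
                // Everything else, including a lone '/', is a word constituent.
                word.append(c);
                i++;
            }
        }
        flush(word, out);
        return out;
    }

    private static void flush(StringBuilder word, List<String> out) {
        if (word.length() > 0) {
            out.add(word.toString());
            word.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(words("1/16/98 // trade date\n2/3/98 /* note */ 4/5/98"));
    }
}
```

      Because the comment test runs first, a date such as "1/16/98" is kept
      as one word while both comment styles are still stripped.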


      Fix diff against the JDK 1.1.5 FCS source code

      567,578c567
      <
      < // +++ Modified segment begins here:
      < //
      < if (specialSlash(c)) {
      < return nextToken();
      < }
      < buf[0] = (char) c;
      < c = peekc;
      < int i = 1;
      < //
      < // +++ Modified segment ends here.
      <
      ---
      > int i = 0;
      667,678d655
      <
      < // +++ Modified code segment begins here:
      < //
      < if (specialSlash(c)) {
      < return nextToken();
      < }
      < //
      < // +++ Modified code segment ends here.
      < return ttype = c;
      < }
      <
      < private boolean specialSlash(int c) throws java.io.IOException {
      696,700c673,674
      < if (c < 0) {
      < String s =
      < "reached eof while parsing C-style comment";
      < throw new RuntimeException(s);
      < }
      ---
      > if (c < 0)
      > return ttype = TT_EOF;
      704c678
      < return true;
      ---
      > return nextToken();
      708c682
      < return true;
      ---
      > return nextToken();
      711c685
      < return false;
      ---
      > return ttype = '/';
      713,715d686
      < } else {
      < peekc = read();
      < return false;
      716a688,689
      > peekc = read();
      > return ttype = c;

      (Review ID: 23516)
      ======================================================================

      mircea.oancea@canada 1998-02-25

      More information from the client received on Fri, 20 Feb 1998 18:41:39

      We've been parsing lots of files with my modified version of
      StreamTokenizer, and we found a bug. My "fix" resulted in a failure to
      parse one-character words. (The character after the first word
      constituent was always treated as a word constituent.) The attached
      version adds one more line and changes a "do" to a "while". It could
      still use more testing (especially since I've only tested the features I
      need).
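
      The one-character-word failure comes down to the loop shape: a
      do/while appends the lookahead character before testing it. The
      following sketch isolates that difference. WordScan, scanWord,
      isWordChar and next are hypothetical names, and the word-character
      test is a simplified stand-in for the tokenizer's ctype table.

```java
// Hypothetical illustration of the "do" versus "while" change; not the JDK source.
class WordScan {
    /** Simplified stand-in for the (CT_ALPHA | CT_DIGIT) test. */
    static boolean isWordChar(int c) {
        return c >= 0 && (Character.isLetterOrDigit(c) || c == '/');
    }

    /** Returns the character at index i, or -1 at end of input. */
    static int next(String s, int i) {
        return i < s.length() ? s.charAt(i) : -1;
    }

    /**
     * Scans a word whose first character is already known to be a word
     * constituent. With testFirst == false the lookahead is appended
     * before it is tested, as in the first version of the patch.
     */
    static String scanWord(String s, boolean testFirst) {
        StringBuilder buf = new StringBuilder();
        buf.append(s.charAt(0));
        int i = 1;
        int c = next(s, i);
        if (testFirst) {
            while (isWordChar(c)) {
                buf.append((char) c);
                c = next(s, ++i);
            }
        } else {
            do {
                buf.append((char) c);
                c = next(s, ++i);
            } while (isWordChar(c));
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + scanWord("a b", false) + "]");
        System.out.println("[" + scanWord("a b", true) + "]");
    }
}
```

      For a one-character word followed by a space, the do/while form
      swallows the space and keeps going, while the while form stops after
      the first character. For longer words the two forms agree, which is
      presumably why the problem only showed up on one-character words.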

      The following diff is taken against the patched version, that is, the
      original source with the above patch applied.

      574d573
      < c = peekc;
      576,577c575,576
      < //
      < // +++ Modified segment ends here.
      ---
      > c = peekc;
      > ctype = c < 0 ? CT_WHITESPACE : c < 256 ? ct[c] : CT_ALPHA;
      579c578
      < do {
      ---
      > while ((ctype & (CT_ALPHA | CT_DIGIT)) != 0) {
      588c587,589
      < } while ((ctype & (CT_ALPHA | CT_DIGIT)) != 0);
      ---
      > }
      > //
      > // +++ Modified segment ends here.

            zlisunw Zhenghua Li (Inactive)
            dgrahamcsunw David Graham-cumming (Inactive)