JDK-8176371

(scanner) Scanner fails when string length equals buffer size and latest characters are the delimiter


Details

    • generic
    • generic

    Description

      FULL PRODUCT VERSION :


      ADDITIONAL OS VERSION INFORMATION :
      Microsoft Windows 8.x

      A DESCRIPTION OF THE PROBLEM :
      I've found strange behaviour in the java.util.Scanner class. I tried to split a String into a set of tokens separated by the delimiter ";" using a Scanner.

      Given a string of the form "<any_char>[*1022]" + ";[*n]", I expect the Scanner to return n tokens. However, when n=3 the Scanner fails: it "sees" just 2 tokens instead of 3. I think this is related to the internal char buffer size of the Scanner class (1024 characters), and I've observed the issue only when the last characters are exactly the delimiter set for the Scanner.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Generate a string composed of 2 parts:
      1- 1022 arbitrary characters (which may include the delimiter)
      2- a trailing run of exactly 3 delimiter characters (in my case ";;;")

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      For a string of "a[*1022]" + ";[*n]" I expect n tokens. However, when n=3 the Scanner fails: it "sees" just 2 tokens instead of 3. I think this is related to the internal char buffer size of the Scanner class.

      a[x1022]; -> 1 token

      a[x1022];; -> 2 tokens

      a[x1022];;; -> 3 tokens

      a[x1022];;;; -> 4 tokens
      ACTUAL -
      a[x1022]; -> 1 token: correct

      a[x1022];; -> 2 tokens: correct

      a[x1022];;; -> 2 tokens: wrong (I expect 3 tokens)

      a[x1022];;;; -> 4 tokens: correct

      REPRODUCIBILITY :
      This bug can be reproduced always.
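
The expected/actual table above can be checked mechanically. The following is a minimal sketch, not part of the original report: the class name `TokenCountTable` and the helpers `countTokens` and `buildInput` are hypothetical names introduced here for illustration. It prints the token count found for n = 1..4 next to the count the reporter expects; the counts actually printed depend on the JDK in use.

```java
import java.util.Scanner;

// Hypothetical helper class, not from the original report.
class TokenCountTable {

    // Counts the tokens a Scanner reports for the given input and delimiter.
    static int countTokens(String input, String delimiter) {
        int count = 0;
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiter);
            while (scanner.hasNext()) {
                scanner.next();
                count++;
            }
        }
        return count;
    }

    // Builds prefixLength 'a' characters followed by n ';' characters.
    static String buildInput(int prefixLength, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < prefixLength; i++) {
            sb.append('a');
        }
        for (int i = 0; i < n; i++) {
            sb.append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 1022 is the prefix length used in the report.
        for (int n = 1; n <= 4; n++) {
            int found = countTokens(buildInput(1022, n), ";");
            System.out.println("n=" + n + " expected=" + n + " found=" + found);
        }
    }
}
```

Changing the prefix length passed to `buildInput` may help confirm whether the miscount is tied to the 1024-character buffer, as the reporter suspects.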

      ---------- BEGIN SOURCE ----------
      A simple example:

      import java.util.Scanner;

      // class name chosen for illustration; the original snippet had no enclosing class
      public class ScannerDelimiterTest {

          public static void main(String[] args) {

              // generate test string: (1022x "a") + (3x ";")
              StringBuilder sb = new StringBuilder();
              for (int i = 0; i < 1022; i++) {
                  sb.append('a');
              }
              String testLine = sb.append(";;;").toString();

              // set up the Scanner with ";" as the delimiter
              String delimiter = ";";
              Scanner lineScanner = new Scanner(testLine);
              lineScanner.useDelimiter(delimiter);
              int p = 0;

              // tokenization: print every token the Scanner finds
              while (lineScanner.hasNext()) {
                  p++;
                  String currentToken = lineScanner.next();
                  System.out.println("token" + p + ": '" + currentToken + "'");
              }
              lineScanner.close();
          }
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Use the String.split method instead of Scanner.
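
The report does not show the workaround code, so the following is only a sketch of one way it might look. The class `SplitWorkaround` and the helper `tokenize` are hypothetical names. Note that a plain `split(";")` discards trailing empty strings, so the sketch passes a limit of -1 and then drops the single trailing empty element, which reproduces the token counts listed in this report. It assumes the input does not start with a delimiter; that case is not covered by the report and is not handled here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not from the original report.
class SplitWorkaround {

    // Splits line on delimiterRegex and returns the tokens the reporter
    // expects from a Scanner using the same delimiter.
    static List<String> tokenize(String line, String delimiterRegex) {
        // A limit of -1 keeps trailing empty strings, which split() drops by default.
        String[] parts = line.split(delimiterRegex, -1);
        int count = parts.length;
        // Input ending in a delimiter yields one extra trailing "" that is
        // not counted as a token in the report; drop it.
        if (count > 0 && parts[count - 1].isEmpty()) {
            count--;
        }
        return Arrays.asList(parts).subList(0, count);
    }

    public static void main(String[] args) {
        // Same test string as in the report: (1022x "a") + (3x ";")
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1022; i++) {
            sb.append('a');
        }
        String testLine = sb.append(";;;").toString();

        List<String> tokens = tokenize(testLine, ";");
        System.out.println(tokens.size()); // prints 3
    }
}
```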

            People

              sherman Xueming Shen
              webbuggrp Webbug Group