ADDITIONAL SYSTEM INFORMATION :
This issue was noticed on Java 8, but it is reproducible on Java 10 as well.
A DESCRIPTION OF THE PROBLEM :
For a certain input string, StringTokenizer does not tokenize according to the given delimiter. For instance, when delimiter="DELIM" and input="Text1DELIMText2|Text3", it tokenizes correctly as token1="Text1" and token2="Text2|Text3". But when input="142104DELIM500-00004|DUMMY", it tokenizes as token1="142104" and token2="500-00004|". I expect token2="500-00004|DUMMY".
The attached source code also demonstrates the behavior of org.apache.commons.lang3.StringUtils and com.google.common.base.Splitter. Please note that StringUtils also behaves in the wrong way, whereas Splitter behaves in the right way.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the attached source code. All asserts should pass.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
When delimiter="DELIM" and input="142104DELIM500-00004|DUMMY",
expected tokens: token1="142104" & token2="500-00004|DUMMY"
ACTUAL -
actual tokens: token1="142104" & token2="500-00004|"
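The behavior described above can be reproduced with java.util.StringTokenizer alone, without the third-party libraries used in the attached source. This minimal sketch (class name and helper are illustrative, not part of the original report) collects every token produced for the failing input:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class MinimalRepro {

    // Collect every token StringTokenizer produces for the given input and delimiter string.
    static List<String> tokens(final String input, final String delim) {
        final List<String> out = new ArrayList<>();
        final StringTokenizer tokenizer = new StringTokenizer(input, delim);
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    public static void main(final String[] args) {
        System.out.println(tokens("142104DELIM500-00004|DUMMY", "DELIM"));
        // prints [142104, 500-00004|, U, Y]
    }
}
```

Note that the trailing "DUMMY" is broken into further single-character tokens ("U", "Y"), since every character of the delimiter string acts as a delimiter here.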
---------- BEGIN SOURCE ----------
import java.util.List;
import java.util.StringTokenizer;

import org.apache.commons.lang3.StringUtils;
import org.junit.Assert;

import com.google.common.base.Splitter;

public class TestToken {

    public static void main(final String[] args) {
        final String delim = "DELIM";
        String token1 = "Text1";
        String token2 = "Text2|Text3";
        tokenize(token1, token2, delim);

        token1 = "142104";
        token2 = "500-00004|DUMMY";
        tokenize(token1, token2, delim);
    }

    private static void tokenize(final String token1, final String token2, final String delim) {
        final String input = token1 + delim + token2;
        System.out.println("input=" + input);

        // tokenize using Guava Splitter
        final List<String> tokens = Splitter.on(delim).trimResults().omitEmptyStrings().splitToList(input);
        System.out.println("Splitter token1=" + tokens.get(0));
        System.out.println("Splitter token2=" + tokens.get(1));
        System.out.println();
        Assert.assertEquals(token1, tokens.get(0));
        Assert.assertEquals(token2, tokens.get(1));

        // tokenize using java.util.StringTokenizer
        final StringTokenizer tokenizer = new StringTokenizer(input, delim);
        final String text1 = tokenizer.nextToken();
        final String text2 = tokenizer.nextToken();
        System.out.println("StringTokenizer token1=" + text1);
        System.out.println("StringTokenizer token2=" + text2);
        System.out.println();
        Assert.assertEquals(token1, text1);
        Assert.assertEquals(token2, text2);

        // tokenize using org.apache.commons.lang3.StringUtils
        final String[] split = StringUtils.split(input, delim);
        System.out.println("StringUtils.split token1=" + split[0]);
        System.out.println("StringUtils.split token2=" + split[1]);
        System.out.println();
        Assert.assertEquals(token1, split[0]);
        Assert.assertEquals(token2, split[1]);
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Use com.google.common.base.Splitter
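A JDK-only alternative, not part of the original submission: String.split treats its argument as a regular expression and matches the whole pattern rather than a set of characters, so quoting the delimiter as a literal yields the expected whole-string split. A sketch (class name is illustrative):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitWorkaround {

    public static void main(final String[] args) {
        final String input = "142104DELIM500-00004|DUMMY";
        // Pattern.quote makes "DELIM" a literal regex, so the split
        // happens only at the full five-character sequence.
        final String[] parts = input.split(Pattern.quote("DELIM"));
        System.out.println(Arrays.toString(parts));
        // prints [142104, 500-00004|DUMMY]
    }
}
```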
FREQUENCY : always