Since jdk7u (JDK-4513622), java.lang.String.substring() returns a brand-new String object with its own copy of the characters, so that a substring no longer keeps a long parent String alive. We understand the rationale behind this decision and agree with it. On the other hand, we observe that substring() sometimes accounts for a noticeable share of runtime allocation. For example, SPECjvm2008 xml.validation spends 1% of CPU time and 6% of allocation events in java.lang.String.substring().
We found that JIT compilers can optimize certain patterns in user applications, because the JRE already provides alternative APIs that avoid the intermediate substring.
Consider the following code:
    public static boolean foo(String s) {
        return s.substring(1).startsWith("a");
    }
foo() above can be transformed into the following form:
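    public static boolean foo(String s) {
        // The prefix cannot match if it is longer than the remaining characters.
        if ("a".length() > s.length() - 1) return false;
        // Compare in place, starting at offset 1, without creating a substring.
        return s.startsWith("a", 1);
    }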
The overload startsWith(String prefix, int toffset) removes the need for the intermediate substring. With it, the allocation and array copy introduced by substring() can hopefully be eliminated.
More similar cases are listed here:
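As a rough illustration of the equivalence, the sketch below compares the two forms on a few non-empty inputs. The class and method names (SubstringPrefixDemo, viaSubstring, viaOffset) are hypothetical and exist only for this example. It assumes a non-empty argument: for an empty string, s.substring(1) throws StringIndexOutOfBoundsException, whereas startsWith("a", 1) simply returns false, so an actual compiler transformation would presumably have to account for that case.

```java
// Hypothetical demo class; the names are illustrative and not part of the JDK.
final class SubstringPrefixDemo {

    // Original idiom: allocates a temporary String for s.substring(1).
    static boolean viaSubstring(String s) {
        return s.substring(1).startsWith("a");
    }

    // Rewritten form: compares in place, starting at offset 1.
    static boolean viaOffset(String s) {
        return s.startsWith("a", 1);
    }

    public static void main(String[] args) {
        // Assumption: all samples are non-empty, so substring(1) never throws.
        String[] samples = {"xa", "xab", "xb", "x", "aa", "ba"};
        for (String s : samples) {
            boolean expected = viaSubstring(s);
            boolean actual = viaOffset(s);
            if (expected != actual) {
                throw new AssertionError("mismatch for: " + s);
            }
            System.out.println(s + " -> " + actual);
        }
    }
}
```

Both methods should agree on every sample; only the first one creates a temporary String.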
https://github.com/navyxliu/StringFunc/blob/master/note/substr_opt.md
This issue focuses only on the first idiomatic pattern. Once it is solved, extending the approach to cover more cases should be easy.
We believe this API-level transformation can help a variety of parsers, tokenizers, and textual data representations, such as JSON, URLs, and HTTP.