Summary
Introduce entab and detab instance methods to java.lang.String.
Problem
Tab (U+0009) characters have long been problematic for developers
because their semantics depend on context. In general terms, a tab
means "leave enough white space to position the next character at the
next tab stop." In graphic terms, that usually means moving the graphic
cursor to the next horizontal point in a collection of stop points.
For developers working with source or data text, usually displayed in fixed-width fonts, a tab most often means "advance to the next character position p such that p modulo n == 0." What differs between developers and contexts is the choice of n (e.g., 4 or 8).
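The modulo rule above can be made concrete with a small calculation. This is only an illustrative sketch: the class name `TabStops` and the method `nextTabStop` are hypothetical and are not part of this proposal. Columns are assumed to be zero-based.

```java
final class TabStops {
    // Hypothetical helper: returns the first column greater than `column`
    // that is a multiple of `n` (zero-based columns, tab stops every n).
    static int nextTabStop(int column, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        return column + (n - column % n);
    }

    public static void main(String[] args) {
        // The same tab at column 3 lands in different places
        // depending on the choice of n.
        System.out.println(nextTabStop(3, 4)); // 4
        System.out.println(nextTabStop(3, 8)); // 8
        // A tab at a column already on a stop advances a full width.
        System.out.println(nextTabStop(8, 8)); // 16
    }
}
```

The first two calls show why the choice of n matters: identical text renders with different alignment under n = 4 and n = 8.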
The introduction of raw string literals in JDK 12 brings this issue to the forefront, because raw string literals containing tabs may not be interpreted as the developer intended.
Solution
Introduce a String instance method detab that replaces tab (U+0009) characters with sufficient space (U+0020) characters to align with the tab stops intended by the developer.
Also, introduce the inverse String instance method entab that replaces space (U+0020) characters with tab (U+0009) characters to align with the tab stops intended by the developer.
Example:
String s = "a\tb\tc\td".detab(4);
String t = s.entab(4);
Result:
a   b   c   d
a\tb\tc\td
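For readers who want to try the detab behavior without the proposed method, the expansion can be approximated with a static helper. This is a minimal sketch under stated assumptions, not the proposed implementation: `DetabSketch.detab` is a hypothetical name, it counts UTF-16 `char` values rather than code points, it treats only `\n` as a line terminator, and it does not remove whitespace at the end of lines.

```java
final class DetabSketch {
    // Hypothetical static stand-in for the proposed String::detab.
    // Simplifications: counts chars, not code points; only '\n' resets
    // the column; trailing whitespace is left in place.
    static String detab(String s, int tabWidth) {
        if (tabWidth <= 0) {
            throw new IllegalArgumentException("tabWidth must be positive");
        }
        StringBuilder out = new StringBuilder();
        int column = 0; // zero-based column of the next character
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\t') {
                // Pad with spaces up to the next multiple of tabWidth.
                int spaces = tabWidth - column % tabWidth;
                out.append(" ".repeat(spaces));
                column += spaces;
            } else if (c == '\n') {
                out.append(c);
                column = 0; // a new line starts at column 0
            } else {
                out.append(c);
                column++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints a, b, c and d separated by three spaces each.
        System.out.println(detab("a\tb\tc\td", 4));
    }
}
```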
Example:
String r = `
abc
def
ghi
`.detab(8) // remove tabs in source before aligning
.align();
Result:
abc
def
ghi
Specification
/**
* Replaces each tab (U+0009) code point within this string with sufficient
* space (U+0020) code points such that the resulting string is visually
* unaffected when displayed on a device with fixed tab settings of
* {@code tabWidth}.
* <blockquote><pre>
* Example:
* String withTabs = "a\tbc\tdef\tghij\tklmno\tpqrstuvwxyz";
* String withSpaces = withTabs.detab(4);
* System.out.println(withSpaces);
* System.out.println("^   ".repeat(8));
*
* Result:
* a   bc  def ghij    klmno   pqrstuvwxyz
* ^   ^   ^   ^   ^   ^   ^   ^
* </pre></blockquote>
* <p>
* Tab and space code points at the end of lines are removed.
*
* @param tabWidth number of code points between tab stops
*
* @return this string with tabs replaced with spaces
*
* @throws IllegalArgumentException if tabWidth is less than or equal to zero.
*
* @since 12
*/
public String detab(int tabWidth) {
/**
* Replaces runs of space (U+0020) code points within this string with tab
* (U+0009) code points wherever the resulting string is shorter but visually
* unaffected when displayed on a device with fixed tab settings of {@code
* tabWidth}.
* <blockquote><pre>
* Example:
* String withSpaces = "a   bc  def ghij    klmno   pqrstuvwxyz";
* String withTabs = withSpaces.entab(4);
* System.out.println(withTabs.replace("\t", "\\t"));
*
* Result:
* a\tbc\tdef ghij\tklmno\tpqrstuvwxyz
* </pre></blockquote>
* <p>
* Tab and space code points at the end of lines are removed.
*
* @param tabWidth number of code points between tab stops
*
* @return this string with a subset of spaces replaced with tabs
*
* @throws IllegalArgumentException if tabWidth is less than or equal to zero.
*
* @since 12
*/
public String entab(int tabWidth) {
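The entab rule specified above (use a tab only where it shortens the string, and drop whitespace at the end of lines) can be sketched as a static helper. This is an illustration under stated assumptions rather than the proposed implementation: `EntabSketch.entab` and `flush` are hypothetical names, the sketch counts UTF-16 `char` values rather than code points, and it treats only `\n` as a line terminator.

```java
final class EntabSketch {
    // Hypothetical static stand-in for the proposed String::entab.
    static String entab(String s, int tabWidth) {
        if (tabWidth <= 0) {
            throw new IllegalArgumentException("tabWidth must be positive");
        }
        StringBuilder out = new StringBuilder();
        int column = 0;    // zero-based column of the next character
        int runStart = -1; // column where the pending whitespace run began
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ' ' || c == '\t') {
                if (runStart < 0) {
                    runStart = column;
                }
                // A space covers one column; an existing tab covers
                // everything up to the next tab stop.
                column = (c == ' ') ? column + 1
                                    : column + (tabWidth - column % tabWidth);
            } else if (c == '\n') {
                // Pending whitespace at the end of a line is dropped.
                out.append(c);
                column = 0;
                runStart = -1;
            } else {
                if (runStart >= 0) {
                    flush(out, runStart, column, tabWidth);
                    runStart = -1;
                }
                out.append(c);
                column++;
            }
        }
        return out.toString(); // whitespace at the very end is dropped too
    }

    // Emits whitespace covering columns [from, to), using a tab only when
    // it reaches a tab stop inside the run and replaces two or more columns.
    private static void flush(StringBuilder out, int from, int to, int tabWidth) {
        int col = from;
        while (col < to) {
            int stop = col + (tabWidth - col % tabWidth);
            if (stop <= to && stop - col > 1) {
                out.append('\t');
                col = stop;
            } else {
                out.append(' ');
                col++;
            }
        }
    }

    public static void main(String[] args) {
        String withSpaces = "a   bc  def ghij    klmno   pqrstuvwxyz";
        System.out.println(entab(withSpaces, 4).replace("\t", "\\t"));
    }
}
```

Note that the single space between `def` and `ghij` stays a space in this sketch, since replacing one space with one tab would not shorten the string.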
CSR of: JDK-8210717 String::detab, String::entab (Closed)