A new API Path.getExtension() was added by JDK-8057113. This works, but the return value is nullable, which can lead to errors. This can be avoided by making an API change. Since this API was just added in JDK 20, we still have the opportunity to fix it before JDK 20 ships. Therefore, I'm filing this as a P3 bug.
The getExtension() return value comprises three cases:
1) the extension is present and non-empty, e.g. "a/b/c/foo.jpg"
2) the extension is present but empty, e.g. "a/b/c/foo."
3) the extension is absent, e.g. "a/b/c/foo"
As currently defined, the method returns the following for the different cases:
1) "jpg"
2) "" (empty string)
3) null
The proposal is to change the return values for these cases as follows:
1) ".jpg"
2) "." (string consisting of a single period)
3) "" (empty string)
Essentially, the period character "." will now be included in the return value when an extension is present.
There is ample precedent for this inclusion. A survey of a broad variety of platforms reveals quite a split between period-included and period-excluded arrangements; neither dominates. Python is an example of another platform that includes the period in its notion of a file extension. For example:
==========
>>> from os.path import splitext
>>> splitext('a/b/c/foo')[1]
''
>>> splitext('a/b/c/foo.')[1]
'.'
>>> splitext('a/b/c/foo.jpg')[1]
'.jpg'
==========
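For comparison with the Python session above, the proposed return values can be sketched in Java roughly as follows. This is only an illustration: extensionOf is a hypothetical static helper, not the JDK method, and its treatment of a file name that begins with a period (no extension) is an assumption that this report does not spell out.

```java
import java.nio.file.Path;

class ExtensionSketch {
    // Hypothetical stand-in for the proposed Path.getExtension(); never returns null.
    static String extensionOf(Path p) {
        Path name = p.getFileName();
        if (name == null) {
            return "";                       // a path with no file name has no extension
        }
        String s = name.toString();
        int dot = s.lastIndexOf('.');
        // Assumption: a leading period (as in ".profile") does not start an extension.
        return dot <= 0 ? "" : s.substring(dot);
    }

    public static void main(String[] args) {
        System.out.println("[" + extensionOf(Path.of("a/b/c/foo.jpg")) + "]"); // [.jpg]
        System.out.println("[" + extensionOf(Path.of("a/b/c/foo.")) + "]");    // [.]
        System.out.println("[" + extensionOf(Path.of("a/b/c/foo")) + "]");     // []
    }
}
```

The period is kept in the result by taking substring(dot) rather than substring(dot + 1), which is what lets the empty string stand unambiguously for "no extension".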
The advantage of making this change is that the return value of getExtension() is never null. This helps avoid NullPointerExceptions. (Using Optional was previously considered and rejected, as it proved to be quite clumsy in actual usage. Also, nothing else in nio uses Optional, so it would be an outlier.) In addition, adjusting the API to include the period in the return value makes certain use cases work more smoothly. For example, consider removing the extension from a path string:
==========
// CURRENT
String removeExtension1(Path p) {
    String s = p.toString();
    String x = p.getExtension();
    return s.substring(0, s.length() - (x == null ? 0 : 1 + x.length()));
}
// MODIFIED
String removeExtension2(Path p) {
    String s = p.toString();
    return s.substring(0, s.length() - p.getExtension().length());
}
==========
In the current API, the extension must be null-checked, and if it's non-null, the length of the substring to be removed must be incremented by one in order to include the period. The modified API avoids both of these problems.
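One consequence of the period-included convention is that the stripped path string and the extension concatenate back to the original string in all three cases. The following is a minimal sketch of that property, again using a hypothetical extensionOf helper in place of the proposed method (its leading-period behavior is an assumption).

```java
import java.nio.file.Path;

class RemoveExtensionSketch {
    // Hypothetical stand-in for the proposed, never-null Path.getExtension().
    static String extensionOf(Path p) {
        Path name = p.getFileName();
        if (name == null) {
            return "";
        }
        String s = name.toString();
        int dot = s.lastIndexOf('.');
        return dot <= 0 ? "" : s.substring(dot);   // assumption: leading period is not an extension
    }

    // Same shape as removeExtension2: no null check, no "+ 1" adjustment.
    static String removeExtension(Path p) {
        String s = p.toString();
        return s.substring(0, s.length() - extensionOf(p).length());
    }

    public static void main(String[] args) {
        for (String str : new String[] { "a/b/c/foo.jpg", "a/b/c/foo.", "a/b/c/foo" }) {
            Path p = Path.of(str);
            String stem = removeExtension(p);
            // stem + extension reproduces the original path string in every case
            System.out.println(stem + " + \"" + extensionOf(p) + "\" round-trips: "
                    + (stem + extensionOf(p)).equals(p.toString()));
        }
    }
}
```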
Another example is obtaining a list of files that have a known extension. This can be done as follows:
==========
// CURRENT
List<Path> listJPGfiles1(Path dir) throws IOException {
    try (var s = Files.list(dir)) {
        return s.filter(p -> "jpg".equalsIgnoreCase(p.getExtension()))
                .toList();
    }
}
// MODIFIED
List<Path> listJPGfiles2(Path dir) throws IOException {
    try (var s = Files.list(dir)) {
        return s.filter(p -> p.getExtension().equalsIgnoreCase(".jpg"))
                .toList();
    }
}
==========
Using the current API, one can avoid NPEs by using "Yoda conditions", that is, making the string literal "jpg" be the receiver of an equals() or equalsIgnoreCase() call. This is effective but it's easy to forget to do this, which can lead to NPEs. Even when this technique is used, it's often disliked by developers, who find it reads unnaturally.
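The hazard can be reproduced without the JDK method at all. The sketch below uses a hypothetical helper, currentExtensionOf, that mimics the current nullable return values on plain file name strings (its leading-period behavior is an assumption); the Yoda form tolerates the null, while the natural receiver order throws.

```java
class NullableExtensionSketch {
    // Hypothetical stand-in mimicking the CURRENT return values: "jpg", "", or null.
    static String currentExtensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot <= 0 ? null : fileName.substring(dot + 1);
    }

    // Yoda condition: the literal is the receiver, so a null argument is harmless.
    static boolean isJpgYoda(String fileName) {
        return "jpg".equalsIgnoreCase(currentExtensionOf(fileName));
    }

    // Natural order: the possibly-null extension is the receiver.
    static boolean isJpgNatural(String fileName) {
        return currentExtensionOf(fileName).equalsIgnoreCase("jpg");
    }

    public static void main(String[] args) {
        System.out.println(isJpgYoda("foo.JPG"));   // true
        System.out.println(isJpgYoda("foo"));       // false
        try {
            isJpgNatural("foo");
        } catch (NullPointerException e) {
            System.out.println("NPE for a file name without an extension");
        }
    }
}
```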
A CSR for this bug will be filed to cover the incremental specification changes compared to the previous CSR.
The release note JDK-8297160 will also need to be updated.
Issue links:
csr for: JDK-8298224 (fs) Re-visit Path.getExtension return value (Closed)
duplicates: JDK-8298318 (fs) APIs for handling filename extensions (Open)
relates to: JDK-8057113 (fs) Path should have a method to obtain the filename extension (Resolved)
relates to: JDK-8298303 (fs) temporarily remove Path.getExtension (Resolved)
links to: Review openjdk/jdk/11545