FULL PRODUCT VERSION :
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
MacOSX - 15.5.0 Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64 x86_64
A DESCRIPTION OF THE PROBLEM :
If a file URI contains a non-ASCII character and a Path is created from it with a call to Paths.get(uri), the non-ASCII characters are not handled correctly.
For example...
URI uri = new URI("file:///this/is/español/test.txt");
Path path = Paths.get(uri);
System.out.println("path toString() = " + path);
This prints "/this/is/espa�ol/test.txt" on Mac, but works on Windows. From that point on there is no way to get the original path. Calling...
System.out.println("path getName(2) = " + path.getName(2));
...also prints "espa�ol" and calling...
path.getName(2).toString().equals("español");
...returns true on Windows but false on Mac.
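The report does not say which encodings the Mac JVM was running with. As a rough diagnostic sketch, the following prints the settings that commonly influence how path strings are converted to bytes. The class name EncodingCheck and the helper report() are hypothetical, and sun.jnu.encoding is an implementation-specific property that may be null on some JVMs. Whether any of these values explains the behavior above is an assumption, not something the report confirms.

```java
import java.nio.charset.Charset;

public class EncodingCheck {
    // Hypothetical helper: gathers encoding-related settings into one string.
    static String report() {
        return "file.encoding = " + System.getProperty("file.encoding") + "\n"
             // Implementation-specific property; may print "null" on some JVMs.
             + "sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding") + "\n"
             + "defaultCharset = " + Charset.defaultCharset() + "\n"
             // Locale environment variables; often unset when launched from an IDE.
             + "LANG = " + System.getenv("LANG") + "\n"
             + "LC_ALL = " + System.getenv("LC_ALL");
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Running this in the same way as the failing test (same IDE or shell) on both Mac and Windows would show whether the two environments differ in these settings.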
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
String uriStr = "file:///this/is/español/test.txt";
System.out.println("URI String = " + uriStr);
URI uri = new URI(uriStr);
System.out.println("URI toString() = " + uri); // this is fine
Path path = Paths.get(uri);
// prints unknown char for ñ on Mac, fine on Windows
System.out.println("path toString() = " + path);
// prints unknown char for ñ on Mac, fine on Windows
System.out.println("path getName(2) = " + path.getName(2)); // fails on Mac
System.out.println(path.getName(2).toString().equals("español"));
// prints incorrect encoding of URI on Mac, correctly shows "español" on Windows
System.out.println("path URI toString() = " + path.toUri());
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
// these are the correct results from Windows
URI String = file:///this/is/español/test.txt
URI toString() = file:///this/is/español/test.txt
path toString() = /this/is/español/test.txt
path getName(2) = español
true
path URI toString() = file:///this/is/español/test.txt
ACTUAL -
URI String = file:///this/is/español/test.txt
URI toString() = file:///this/is/español/test.txt
path toString() = /this/is/espa�ol/test.txt
path getName(2) = espa�ol
false
path URI toString() = file:///this/is/espa%F1ol/test.txt
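The %F1 in the last line of the actual output is worth a closer look. The sketch below percent-encodes ñ (U+00F1) in two charsets to show that %F1 is the single-byte ISO-8859-1 form, whereas a UTF-8 encoding would give %C3%B1. The class PercentEncodeDemo and the helper percentEncode are hypothetical and written only for illustration; they are not part of the JDK and do not show how Path.toUri() is implemented.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PercentEncodeDemo {
    // Hypothetical helper: percent-encodes every byte of s in the given charset.
    static String percentEncode(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            // Mask to an unsigned value so bytes >= 0x80 format as two hex digits.
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String enye = "\u00F1"; // ñ
        System.out.println(percentEncode(enye, StandardCharsets.ISO_8859_1)); // prints %F1
        System.out.println(percentEncode(enye, StandardCharsets.UTF_8));      // prints %C3%B1
    }
}
```

This suggests, without proving it, that the path bytes on the Mac were produced with a single-byte Latin-1 style encoding instead of UTF-8.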
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.junit.Test; // JUnit 4 assumed; the report does not name the test framework

public class URI2PathTest {
    @Test
    public void showURI2Path2StringResults() throws URISyntaxException
    {
        String uriStr = "file:///this/is/español/test.txt";
        System.out.println("URI String = " + uriStr);
        URI uri = new URI(uriStr);
        System.out.println("URI toString() = " + uri); // this is fine
        Path path = Paths.get(uri);
        // prints unknown char for ñ on Mac, fine on Windows
        System.out.println("path toString() = " + path);
        // prints unknown char for ñ on Mac, fine on Windows
        System.out.println("path getName(2) = " + path.getName(2)); // fails on Mac
        // prints false on Mac, true on Windows
        System.out.println(path.getName(2).toString().equals("español"));
        // prints incorrect encoding of URI on Mac, correctly shows "español" on Windows
        System.out.println("path URI toString() = " + path.toUri());
    }
}
---------- END SOURCE ----------