FULL PRODUCT VERSION :
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Darwin Dans-Mac-mini.local 13.4.0 Darwin Kernel Version 13.4.0: Wed Mar 18 16:20:14 PDT 2015; root:xnu-2422.115.14~1/RELEASE_X86_64 x86_64
EXTRA RELEVANT SYSTEM CONFIGURATION :
Connected to a NAS over Samba
A DESCRIPTION OF THE PROBLEM :
When listFiles() lists a directory, a file whose name is stored using NFD Unicode normalisation is reported with its name converted to NFC instead.
On non-HFS filesystems (such as one on network attached storage), the NFC name does not match the stored name, so the returned files report exists() as false and cannot be opened for reading.
Here's a particular file that I am trying to use:
$ ls /Volumes/music.withoutart/2422/Johann_Sebastian_Bach/Orgelwerke_\(Karl_Richter\)_\(cd_3\)/06_Choral\,_BWV_768_Sei_Gegrüßet\,_Jesu_Gütig.flac | xxd
0000000: 2f56 6f6c 756d 6573 2f6d 7573 6963 2e77 /Volumes/music.w
0000010: 6974 686f 7574 6172 742f 3234 3232 2f4a ithoutart/2422/J
0000020: 6f68 616e 6e5f 5365 6261 7374 6961 6e5f ohann_Sebastian_
0000030: 4261 6368 2f4f 7267 656c 7765 726b 655f Bach/Orgelwerke_
0000040: 284b 6172 6c5f 5269 6368 7465 7229 5f28 (Karl_Richter)_(
0000050: 6364 5f33 292f 3036 5f43 686f 7261 6c2c cd_3)/06_Choral,
0000060: 5f42 5756 5f37 3638 5f53 6569 5f47 6567 _BWV_768_Sei_Geg
0000070: 7275 cc88 c39f 6574 2c5f 4a65 7375 5f47 ru....et,_Jesu_G
0000080: 75cc 8874 6967 2e66 6c61 630a u..tig.flac.
Note the NFD version of ü I have used in the ls command. xxd shows it as "75 cc88" (u followed by the combining diaeresis U+0308), followed by "c39f" (ß).
Now I copy and paste what I receive from listFiles():
$ printf "06-Choral,_BWV_768_Sei_Gegrüßet,_Jesu_Gütig.flac" | xxd
0000000: 3036 2d43 686f 7261 6c2c 5f42 5756 5f37 06-Choral,_BWV_7
0000010: 3638 5f53 6569 5f47 6567 72c3 bcc3 9f65 68_Sei_Gegr....e
0000020: 742c 5f4a 6573 755f 47c3 bc74 6967 2e66 t,_Jesu_G..tig.f
0000030: 6c61 63 lac
Here, ü is encoded as "c3bc" (the precomposed NFC form, U+00FC), followed by "c39f" (ß). This name does not match the NFD name stored on the share, so the file cannot be read.
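The two byte sequences above can be reproduced without any filesystem involved. The following is a minimal sketch using java.text.Normalizer; the class name NormalizationBytes and the helper utf8Hex are made up for illustration and are not part of the original report.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical helper class, for illustration only.
public class NormalizationBytes {
    // Hex-encode the UTF-8 bytes of a string, e.g. "u" -> "75".
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String composed = "\u00fc\u00df"; // "üß" with a precomposed ü
        String nfd = Normalizer.normalize(composed, Normalizer.Form.NFD);
        String nfc = Normalizer.normalize(nfd, Normalizer.Form.NFC);
        System.out.println("NFD: " + utf8Hex(nfd)); // 75cc88c39f
        System.out.println("NFC: " + utf8Hex(nfc)); // c3bcc39f
        // The two forms render identically but are different strings.
        System.out.println("Equal as Java strings: " + nfd.equals(nfc)); // false
    }
}
```

The NFD output matches the "75 cc88 c39f" bytes in the ls dump, and the NFC output matches the "c3bc c39f" bytes in the name returned by listFiles().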
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1) Create a file with an NFD name (hopefully this POSTs correctly) on a drive connected via network storage (e.g. Samba or NFS):
mkdir testDir
touch testDir/ü
Put testDir inside an "nfdtest" folder and share it via Samba.
2) Run the following simple program to see it [not] work:
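import java.io.File;

public class NfdTest {
    public static void main(String[] args) {
        // Directory on the mounted share that contains the NFD-named file.
        File testDir = new File("/Volumes/nfdtest/testDir");
        // Take the first entry exactly as listFiles() reports it.
        File nfdEncodedFile = testDir.listFiles()[0];
        System.out.println("Exists: " + nfdEncodedFile.exists());
    }
}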
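To see which normalisation form each listed name is actually in, a diagnostic variant of the program may help. This is only a sketch: the class name NfdListing and the helper form are hypothetical, and the default path is the one from the steps above.

```java
import java.io.File;
import java.text.Normalizer;

// Hypothetical diagnostic, not part of the original test case.
public class NfdListing {
    // Report which Unicode normalization form(s) a name already satisfies.
    static String form(String name) {
        boolean nfc = Normalizer.isNormalized(name, Normalizer.Form.NFC);
        boolean nfd = Normalizer.isNormalized(name, Normalizer.Form.NFD);
        if (nfc && nfd) return "NFC+NFD"; // e.g. plain ASCII names
        if (nfc) return "NFC";
        if (nfd) return "NFD";
        return "neither";
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : "/Volumes/nfdtest/testDir");
        File[] entries = dir.listFiles();
        if (entries == null) {
            System.out.println("Cannot list " + dir);
            return;
        }
        for (File f : entries) {
            System.out.println(f.getName() + " form=" + form(f.getName())
                    + " exists=" + f.exists());
        }
    }
}
```

On the setup described in this report, the expectation would be an entry with form=NFC and exists=false, but that has to be confirmed on the affected machine.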
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The program should output "Exists: true".
ACTUAL -
The program outputs "Exists: false".
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
As above.
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
None that I have found.
The java.nio.file (Path) API seems to have the same problem.
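The java.nio.file observation can be checked with a sketch like the one below. It is a diagnostic, not a workaround; the class name NioNfdCheck and the method missingEntries are hypothetical, and the default path is the one from the reproduction steps.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic for the java.nio.file side of the problem.
public class NioNfdCheck {
    // Return the entries that the directory stream reports
    // but that Files.exists() then fails to find.
    static List<Path> missingEntries(Path dir) throws IOException {
        List<Path> missing = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (!Files.exists(p)) {
                    missing.add(p);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : "/Volumes/nfdtest/testDir");
        for (Path p : missingEntries(dir)) {
            System.out.println("Listed but not found: " + p);
        }
    }
}
```

A non-empty result on the share would be consistent with the behaviour reported for listFiles(). Note that dangling symbolic links would also show up in this list, so the check is only meaningful on a directory without them.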
RELATES TO :
JDK-8289689 (fs) Re-examine the need for normalization to Unicode Normalization Format D (macOS) - Resolved