I've noticed that some servers of mine only work properly when started in a certain way.
When the server misbehaves, paths containing umlauts (or other non-ASCII characters) cannot be converted to a java.nio.file.Path. This can be tested with the following snippet:
Reproduce:
new java.io.File("/folder/ümläüuts").toPath()
It throws the following exception:
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /folder/??ml????uts
at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:121)
at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:68)
at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279)
at java.base/java.io.File.toPath(File.java:2387)
at com.jpro.internal.server.JProInitializer$.$anonfun$initialize$4(JProInitializer.scala:120)
at com.jpro.internal.server.JProInitializer$.execute(JProInitializer.scala:70)
at com.jpro.internal.server.JProInitializer$.initialize(JProInitializer.scala:118)
at com.jpro.internal.server.JProInitializer$.init(JProInitializer.scala:32)
at com.jpro.internal.server.JProStarter$.$anonfun$startFromBoot$2(JProStarter.scala:34)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
Effects:
- Filenames containing non-ASCII characters cannot be handled
- File upload on the server fails in production (but works during development on Mac)
- Other unexpected effects
How does it happen?
This happens because of the following environment variable:
LC_CTYPE=UTF-8
This value gets "inherited" from the Mac Terminal when connecting via SSH, but I think it should be "C.UTF-8" on the server instead.
Possible workarounds are:
1. Setting to C.UTF-8
export LC_CTYPE=C.UTF-8
2. Unsetting LC_CTYPE
unset LC_CTYPE
3. Setting LC_ALL
export LC_ALL=C.UTF-8
4. SSH into the machine first and run the command in the interactive shell, instead of passing the command directly on the ssh command line.
Fails:
> ssh ..... user@machine "java <your-command>"
Works:
> ssh ..... user@machine
> > java <your-command>
What does not work:
1. Setting the JVM encoding with parameters like -Dfile.encoding=UTF-8 or -Dsun.jnu.encoding=UTF-8 doesn't make a difference.
For now, I've added a check to my product that verifies umlauts work properly, so that misconfigured environments fail hard instead of producing a nearly unnoticeable error. (Currently, only file upload fails on misconfiguration.)
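A check like the one described above could look roughly like this. This is only a minimal sketch, not the code from my product: the class name UmlautPathCheck, the method names, and the probe path are made up for illustration. It reuses the same File.toPath() call as the reproduction snippet and turns the InvalidPathException into a hard failure. The probe string uses Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.io.File;
import java.nio.file.InvalidPathException;

// Hypothetical helper; names are illustrative only.
final class UmlautPathCheck {

    /** Returns true if the JVM can convert the given path string to a Path. */
    public static boolean isPathEncodable(String path) {
        try {
            new File(path).toPath();
            return true;
        } catch (InvalidPathException e) {
            // Thrown when the path cannot be encoded with the JVM's native encoding.
            return false;
        }
    }

    /** Fails hard if a path containing umlauts cannot be handled. */
    public static void requireUmlautSupport() {
        // "/folder/ümläüuts", written with escapes to be independent of the source encoding.
        String probe = "/folder/\u00fcml\u00e4\u00fcuts";
        if (!isPathEncodable(probe)) {
            throw new IllegalStateException(
                "Paths with non-ASCII characters are not supported in this environment."
                + " LC_CTYPE=" + System.getenv("LC_CTYPE")
                + ", LC_ALL=" + System.getenv("LC_ALL"));
        }
    }

    public static void main(String[] args) {
        requireUmlautSupport();
        System.out.println("Umlaut paths are supported.");
    }
}
```

Calling requireUmlautSupport() early during startup makes a misconfigured environment visible immediately, rather than only when the first file upload fails.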
I think what happens is that the JVM internally sets an "internal encoding for native code" based on these variables.
Of course, this is caused by SSH changing env variables, but I still think it should not break the behavior of the JVM.
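To compare a working and a failing start, it may help to print the locale variables next to the encodings the JVM reports. The following is a small diagnostic sketch under a few assumptions: the class name LocaleDiagnostics and the "env." / "prop." key prefixes are made up, and sun.jnu.encoding is an internal, unsupported property that may be missing on some JVMs (it is then printed as null).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic helper; names are illustrative only.
final class LocaleDiagnostics {

    /** Collects locale-related environment variables and JVM encoding properties. */
    public static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<>();
        // Environment variables that influence the native locale.
        for (String name : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            info.put("env." + name, System.getenv(name)); // null if unset
        }
        // Encodings as seen by the running JVM.
        for (String name : new String[] {"sun.jnu.encoding", "file.encoding"}) {
            info.put("prop." + name, System.getProperty(name)); // null if absent
        }
        return info;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : collect().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Running it once through `ssh user@machine "java ..."` and once from an interactive shell should show whether the two environments really differ in these values.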