
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters when using national characters

I'm trying to create directories whose names contain national symbols such as "äöü". Unfortunately, every attempt throws this exception:

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /home/pi/myFolder/löwen
        at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
        at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
        at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
        at java.nio.file.Paths.get(Paths.java:84)
        at org.someone.something.file.PathManager.createPathIfNecessary(PathManager.java:161)
...
        at java.lang.Thread.run(Thread.java:744)

My code where it occurs looks like this:

public static void createPathIfNecessary(String directoryPath) throws IOException {
    Path path = Paths.get(directoryPath);
    // Create the directory (and any missing parents) if nothing exists at this path yet
    if (!Files.exists(path)) {
        Files.createDirectories(path);
    } else if (!Files.isDirectory(path)) {
        throw new IOException("The path " + path + " is not a directory as expected!");
    }
}

Most of the solutions I found suggest setting the locale to UTF-8, so I expected that setting the Linux locale to UTF-8 would fix this. It turned out that the locale had been UTF-8 all along, and setting it again made no difference:

$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
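
The locale reported by an interactive shell is not necessarily the one the JVM process runs with (for example, when Tomcat is started by an init script), so it can help to check from inside the JVM whether the path is encodable at all. The following is a minimal sketch, not the JDK's actual implementation: it assumes that the charset named by the sun.jnu.encoding system property is the one used for file names, and the names PathEncodingCheck, platformCharset and isEncodable are made up for illustration.

```java
import java.nio.charset.Charset;

final class PathEncodingCheck {

    /**
     * Returns the charset named by sun.jnu.encoding, falling back to the
     * default charset if the property is missing or names an unknown charset.
     * Assumption: this property reflects the encoding used for file names.
     */
    static Charset platformCharset() {
        String name = System.getProperty("sun.jnu.encoding");
        if (name != null && Charset.isSupported(name)) {
            return Charset.forName(name);
        }
        return Charset.defaultCharset();
    }

    /** True if every character of the path can be represented in the charset. */
    static boolean isEncodable(String path, Charset charset) {
        return charset.newEncoder().canEncode(path);
    }
}
```

If PathEncodingCheck.isEncodable("/home/pi/myFolder/löwen", PathEncodingCheck.platformCharset()) returns false inside the Tomcat JVM, the JVM is not running with a UTF-8 platform encoding, whatever the shell reports.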

I don't have this problem on Windows 7, where the directories are created without errors, so I'm wondering whether I need to improve the Java code to handle this situation, or change something in my Linux setup.

The Linux I'm running it on is Raspbian on a Raspberry Pi 2:

$ cat /etc/*-release

    PRETTY_NAME="Raspbian GNU/Linux 7 (wheezy)"
    NAME="Raspbian GNU/Linux"
    VERSION_ID="7"
    VERSION="7 (wheezy)"
    ID=raspbian
    ID_LIKE=debian
    ANSI_COLOR="1;31"
    HOME_URL="http://www.raspbian.org/"
    SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
    BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

I am running my application on a Tomcat 7 server (the Java version is 1.8, I believe). My setenv.sh starts with: export JAVA_OPTS="-Dfile.encoding=UTF-8 ...

Does anybody have a solution to this problem? I need to be able to use those national symbols in directory/file names...

EDIT:

After adding the extra option -Dsun.jnu.encoding=UTF-8 at the start of my setenv.sh for Tomcat and restarting, something changed.

Currently, the start of my setenv.sh looks like this:

export JAVA_OPTS="-Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8 

It seems this exception is gone and the folder with the national symbols gets created. However, the problem is not completely solved: whenever I try to create or write to files within that directory, I now get:

java.io.FileNotFoundException: /home/pi/myFolder/löwen/Lowen.tmp (No such file or directory)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:206)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:156)
        at org.someone.something.MyFileWriter.downloadFiles(MyFileWriter.java:364)
        ...
        at java.lang.Thread.run(Thread.java:744)

The code where it happens looks like this:

// output here
File myOutputFile = new File(filePath);
FileOutputStream out = (new FileOutputStream(myOutputFile));
out.write(bytes);
out.close();

It seems to fail on (new FileOutputStream(myOutputFile)); that is, when the FileOutputStream is initialized with the File object. The File's path is built from the directory path string shown in the exception above, with a file name appended.
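
For comparison, the same write can be expressed with java.nio.file, the API that createPathIfNecessary already uses. This is only a sketch under an unconfirmed assumption, namely that java.io and java.nio may pick up the file name encoding differently when sun.jnu.encoding is overridden on the command line. NioFileWriter and its write method are hypothetical names, not part of the original code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class NioFileWriter {

    /**
     * Writes the bytes to directoryPath/fileName using only java.nio.file calls,
     * so directory creation and file creation go through the same API.
     */
    static Path write(String directoryPath, String fileName, byte[] bytes) throws IOException {
        Path dir = Paths.get(directoryPath);
        // Does nothing if the directory already exists
        Files.createDirectories(dir);
        Path target = dir.resolve(fileName);
        // By default this creates the file or truncates an existing one
        return Files.write(target, bytes);
    }
}
```

If this variant succeeds where FileOutputStream fails, that would point at a mismatch between the two APIs rather than at the directory itself. Fixing the locale of the JVM process would still be the cleaner solution.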

So now the directory is created, but creating or writing anything inside it still results in the exception above, although the file name itself doesn't even contain national symbols.
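
One way to narrow this down is to look at the name under which the directory was actually created on disk. The sketch below lists the entries of a parent directory and prints every character outside printable ASCII as a hexadecimal escape, so a mangled name becomes visible even on a terminal with the wrong encoding. NameInspector, escape and listEscaped are illustrative names, and the names are shown as this JVM decodes them.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class NameInspector {

    /** Replaces each char outside printable ASCII with a four-digit hex escape. */
    static String escape(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c < 0x20 || c > 0x7e) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns the escaped names of all entries directly inside the parent directory. */
    static List<String> listEscaped(Path parent) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(parent)) {
            for (Path entry : entries) {
                names.add(escape(entry.getFileName().toString()));
            }
        }
        return names;
    }
}
```

Calling NameInspector.listEscaped(Paths.get("/home/pi/myFolder")) should show the directory with the code unit 00f6 for "ö" if the name was stored as intended. Anything else in that position suggests the name was encoded differently when the directory was created.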

Creating paths without national symbols, and files in them, works just as it did before the change in setenv.sh, so the problem still seems to be connected to the national symbols in the path.

Arturas M asked Aug 27 '16 20:08



2 Answers

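Set the environment variable LANG to "en_US.UTF-8" or some other "xxx.UTF-8" locale (see https://www.gnu.org/software/gettext/manual/html_node/Locale-Environment-Variables.html). The variable has to be set in the environment of the process that starts the JVM. The JNI code behind the java.io file operations shows why: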

JNIEXPORT jboolean JNICALL
Java_java_io_UnixFileSystem_createDirectory(JNIEnv *env, jobject this,
                                            jobject file)
{
    jboolean rv = JNI_FALSE;
 
    WITH_FIELD_PLATFORM_STRING(env, file, ids.path, path) {
        if (mkdir(path, 0777) == 0) {
            rv = JNI_TRUE;
        }
    } END_PLATFORM_STRING(env, path);
    return rv;
}

#define WITH_PLATFORM_STRING(env, strexp, var)                                \
    if (1) {                                                                  \
        const char *var;                                                      \
        jstring _##var##str = (strexp);                                       \
        if (_##var##str == NULL) {                                            \
            JNU_ThrowNullPointerException((env), NULL);                       \
            goto _##var##end;                                                 \
        }                                                                     \
        var = JNU_GetStringPlatformChars((env), _##var##str, NULL);           \
        if (var == NULL) goto _##var##end;

#define WITH_FIELD_PLATFORM_STRING(env, object, id, var)                      \
    WITH_PLATFORM_STRING(env,                                                 \
                         ((object == NULL)                                    \
                          ? NULL                                              \
                          : (*(env))->GetObjectField((env), (object), (id))), \
                         var)

  1. In native code, Java translates strings to the platform's local encoding in JNU_GetStringPlatformChars() (jdk/src/share/native/common/jni_util.c). The system property sun.jnu.encoding determines the platform's encoding.

  2. The value of sun.jnu.encoding is set in GetJavaProperties() (jdk/src/solaris/native/java/lang/java_props_md.c) using libc's setlocale() function, which reads the locale from the environment (LC_ALL takes precedence over LANG). A value passed on the command line with the -Dsun.jnu.encoding option is ignored.

(from https://stackoverrun.com/cn/q/3020937)
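
To see which values the JVM actually ended up with, the relevant properties and environment variables can be printed from inside the running application. This is a small diagnostic sketch rather than anything from the JDK sources quoted above; EncodingReport and collect are made-up names, and the choice of which variables to report is an assumption based on the explanation in the two points above.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

final class EncodingReport {

    /** Collects the encoding-related settings as seen from inside this JVM. */
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        report.put("file.encoding", System.getProperty("file.encoding"));
        report.put("defaultCharset", Charset.defaultCharset().name());
        for (String name : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            // A null value means the variable is not set for this process
            report.put("env " + name, System.getenv(name));
        }
        return report;
    }
}
```

Logging EncodingReport.collect() once from the web application (not from a shell) shows the environment Tomcat really runs in. If LANG and LC_ALL come back as null there, the UTF-8 locale from the login shell never reached the JVM.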

Hanson answered Nov 12 '22 05:11



If the national characters are hardcoded in your source, make sure the source file is saved in the same encoding (UTF-8). You can convert it with vim:

vim SourceClassWithHardcodedCharacters.java
:set fileencoding=utf-8<Enter>
:w<Enter>

If there is an issue, you will get a message ("unmappable character (...)").

In my opinion, the issue is caused either by (1) characters hardcoded in an incorrect encoding or (2) the encoding being lost somewhere while the path is passed to the method.
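
Both cases can be checked by dumping the code points of the string right before it is turned into a path. The snippet below is a rough sketch with a made-up helper name (CodePointDump.dump); it only shows what the string contains and does not touch the file system.

```java
import java.util.stream.Collectors;

final class CodePointDump {

    /** Formats each code point of the string as U+XXXX, separated by spaces. */
    static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }
}
```

For an intact "löwen" the dump is U+006C U+00F6 U+0077 U+0065 U+006E. If the UTF-8 bytes were misread as ISO-8859-1 somewhere on the way, U+00F6 shows up as the pair U+00C3 U+00B6 instead, which means the string was already damaged before it reached createPathIfNecessary.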

Krzysztof Kaszkowiak answered Nov 12 '22 05:11
