
OkHttp and UTF-8 character encoding

I have a question about OkHttp on Android and its support for character encoding, specifically using UTF-8 to support the Swedish characters å, ä and ö (and the capitals Å, Ä and Ö).

The app we are building uses OkHttp to make GET and POST calls to our server system. The server runs on Tomcat behind Apache. Both Apache and Tomcat are configured to use UTF-8 character encoding by default. I assume what's needed is that the HTTP requests sent from the Android app to the server carry a header containing something like "application/text; charset=utf-8".

I built this stripped-down code example to illustrate the issue. As you can see, I have included addHeader() on the request to set a header. I have also explicitly set a Charset on the RequestBody.

import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

import okhttp3.FormBody;
import okhttp3.HttpUrl;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

public static String testPost() throws IOException {
    OkHttpClient okHttpClient = new OkHttpClient();
    HttpUrl httpUrl = new HttpUrl.Builder()
            .scheme("https")
            .host("dev.ourdomainname.com")
            .addPathSegment("characterencoding")
            .build();
    // Charset used to percent-encode the form fields.
    Charset charset = StandardCharsets.UTF_8;
    RequestBody requestBody = new FormBody.Builder(charset)
            .add("text", "xxåäöÅÄÖxx")
            .build();
    Request request = new Request.Builder()
            .url(httpUrl)
            .addHeader("Content-Type", "application/json; charset=utf-8")
            .post(requestBody)
            .build();
    // Close the response so the connection is released.
    try (Response response = okHttpClient.newCall(request).execute()) {
        return "test completed";
    }
}

At the server end, I am logging the value of the parameter named text, which comes in as "xxåäö���xx", which of course is not good enough. I also have code that loops over all headers in the request and logs them. The output is shown below. Notice that the content-type header is not the one I set with addHeader(), and that it carries no charset=utf-8 parameter.

DEBUG 23 Jan 14:52:37.128 - testCharacterEncoding. text: xxåäö���xx
DEBUG 23 Jan 14:52:37.129 - Header: content-type with value: application/x-www-form-urlencoded
DEBUG 23 Jan 14:52:37.129 - Header: content-length with value: 45
DEBUG 23 Jan 14:52:37.129 - Header: host with value: dev.cqrify.com
DEBUG 23 Jan 14:52:37.129 - Header: connection with value: Keep-Alive
DEBUG 23 Jan 14:52:37.129 - Header: accept-encoding with value: gzip
DEBUG 23 Jan 14:52:37.129 - Header: user-agent with value: okhttp/3.9.1

So my question is: are we doing this the wrong way? If yes, what is the right way to do it? Worst case, this could be a bug in OkHttp, but I doubt it.

For comparison, I built a simple HTML form to make the exact same POST, and the same string sent that way comes in as "xxåäöÅÄÖxx", which is correct.

Mats Andersson asked Nov 08 '22 11:11


1 Answer

There are at least two different problems here.

1. Your Content-Type header is (correctly) ignored

The Content-Type header that you set is overridden by the content type of the request body that you pass to .post(requestBody). This is because you are using FormBody.Builder to build your POST body, and that class is meant specifically for application/x-www-form-urlencoded forms. If you want to post JSON data, you should not be using it. Instead, try this:

public static final MediaType JSON = MediaType.parse("application/json; charset=utf-8");
OkHttpClient client = new OkHttpClient();

String post(String url, String json) throws IOException {
  RequestBody body = RequestBody.create(JSON, json);
  Request request = new Request.Builder()
      .url(url)
      .post(body)
      .build();
[...]

Here's the full source code from the official OkHttp examples.

2. Non-ASCII characters are garbled

Even if you stick with the application/x-www-form-urlencoded content type, non-ASCII text should work fine. So what is happening in your case?
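Before looking for the cause, it may help to know what a correctly encoded form body looks like. The sketch below percent-encodes the string from the question as UTF-8 using only java.net.URLEncoder. It assumes that OkHttp's FormBody produces the same output for these characters, which is not verified here; FormEncodingSketch and encodeField are made-up names, not part of OkHttp.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of OkHttp.
class FormEncodingSketch {

    // Builds "name=value" with both parts percent-encoded as UTF-8.
    static String encodeField(String name, String value) {
        String utf8 = StandardCharsets.UTF_8.name();
        try {
            return URLEncoder.encode(name, utf8) + "=" + URLEncoder.encode(value, utf8);
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep this literal independent of the file encoding.
        String body = encodeField("text", "xx\u00E5\u00E4\u00F6\u00C5\u00C4\u00D6xx");
        System.out.println(body);          // text=xx%C3%A5%C3%A4%C3%B6%C3%85%C3%84%C3%96xx
        System.out.println(body.length()); // 45
    }
}
```

Each of the six letters becomes two UTF-8 bytes, so the encoded body is 45 characters long, the same number as the content-length in the log above. Comparing the raw body received by the server against this string is one way to check whether the text was already damaged when it left the client.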

I suspect an encoding problem when you compile your source code, i.e. the charset used by javac does not match the charset of your Java source file. You may want to pass -encoding utf8 (or whatever encoding your source files actually use) to javac explicitly or, even better, avoid non-ASCII characters in your source code and use Unicode escapes instead. In this case, instead of xxåäöÅÄÖxx, you would write xx\u00E5\u00E4\u00F6\u00C5\u00C4\u00D6xx.
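To make the suspected mismatch concrete, here is a minimal sketch that imitates it at run time: it takes the bytes a source file would contain in one charset and decodes them with another. This only approximates what javac does (the compiler may also emit warnings or errors for unmappable characters), and SourceEncodingSketch and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; imitates a source-file/javac charset mismatch at run time.
class SourceEncodingSketch {

    // Written with Unicode escapes, so it does not depend on the source file's encoding.
    static final String EXPECTED = "xx\u00E5\u00E4\u00F6\u00C5\u00C4\u00D6xx";

    // File saved as UTF-8 but read as ISO-8859-1: every byte becomes its own char.
    static String utf8ReadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // File saved as ISO-8859-1 but read as UTF-8: invalid byte sequences become U+FFFD.
    static String latin1ReadAsUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 10 characters grow to 16 because each accented letter is two UTF-8 bytes.
        System.out.println(utf8ReadAsLatin1(EXPECTED).length());
        // true: the decoded text contains the replacement character.
        System.out.println(latin1ReadAsUtf8(EXPECTED).indexOf('\uFFFD') >= 0);
    }
}
```

In both directions the string is already wrong before any HTTP request is built, which is why the escaped form of the literal is the safer choice.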

Grodriguez answered Nov 14 '22 20:11