
File charset changes to binary in docker container

Tags: java, docker

I have an application that listens to an external feed on an hourly basis. The feed is JSON delivered as a chunked transfer-encoded stream. The feed listener writes each chunk to a file, and once the whole stream has completed, another thread parses the file and extracts the data. The problem is that the file is now being written in binary format, even though I specify the charset when writing.

    public void writeToFile(InputStream in) {
        File feedFile = new File("/tmp/feed.json");
        try {
            // Create the file if it does not exist yet.
            FileUtils.touch(feedFile);
            // Decode the incoming bytes as UTF-8 into an in-memory buffer.
            StringWriter writer = new StringWriter();
            IOUtils.copy(in, writer, StandardCharsets.UTF_8);
            // Append the decoded text to the file, encoded as UTF-8.
            FileUtils.write(feedFile, writer.toString(), StandardCharsets.UTF_8, true);
        } catch (IOException e) {
            logger.error(Constants.FAILED_TO_WRITE_FEED_INTO_FILE, e);
        }
    }

This code works fine on Windows and Linux machines, but inside a Docker container the file is written in binary format.

The Docker container uses CentOS 7.
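One way to narrow down what "binary format" means here is to check whether the bytes in the file are valid UTF-8 at all. The following is a minimal sketch using only the JDK; the class name `Utf8Check` and its method are hypothetical helpers introduced for illustration, and the default path is simply the one used in the code above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper, not part of the original application.
public class Utf8Check {

    // Returns true only if the whole byte array decodes as UTF-8 without errors.
    public static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the path from the question; pass another path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "/tmp/feed.json");
        boolean valid = isValidUtf8(Files.readAllBytes(file));
        System.out.println(file + (valid ? " is valid UTF-8" : " is not valid UTF-8"));
    }
}
```

If the file turns out to be valid UTF-8, the "binary" label may come from the tool used to inspect it rather than from the writer; if it is not valid, the bytes arriving on the stream are probably not plain UTF-8 text in the first place.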

Brajesh Pant asked May 28 '18 10:05




1 Answer

Maybe the UTF-8 locale doesn't exist in the container?

You can see the locale configured in your running container with cat /etc/locale.conf

If it's not LANG=en_US.utf8, you can follow the instructions from this Stack Overflow post by user2915097 (it was written for Ubuntu images, so the commands may need adapting for CentOS 7):

    # Set the locale
    RUN sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen && \
        locale-gen
    ENV LANG=en_US.UTF-8
    ENV LANGUAGE=en_US:en
    ENV LC_ALL=en_US.UTF-8

Source: How to set the locale inside a Ubuntu Docker container? https://stackoverflow.com/a/28406007/3756843
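Whether the locale setting actually reaches the JVM can also be checked from Java itself. This is a small sketch, assuming you can run an extra class inside the container; `DefaultCharsetCheck` is a hypothetical name. On JVMs before Java 18 the default charset on Linux is typically derived from the locale, while newer JVMs default to UTF-8 regardless.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class: prints the encoding-related settings the JVM sees.
public class DefaultCharsetCheck {

    // Collects the JVM default charset and the locale environment variables in one line.
    public static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name()
                + ", LANG=" + System.getenv("LANG")
                + ", LC_ALL=" + System.getenv("LC_ALL");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Note that the code in the question passes StandardCharsets.UTF_8 explicitly, so the default charset should not affect that particular write; this check only tells you whether the container's locale is what you expect.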

EDIT 1:

You should read the stream through an InputStreamReader instead of using the raw InputStream because:

  • InputStream is made to handle binary data (raw bytes)
  • InputStreamReader wraps an InputStream and decodes its bytes into text using a charset

You can find more information here.
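As a rough illustration of the InputStreamReader approach, here is a sketch that uses only the JDK instead of Commons IO. The class name `FeedWriter` and the `target` parameter are assumptions made for the example; the charset is set explicitly on both the reading and the writing side, and the text is streamed in chunks rather than buffered in one String.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical rewrite of the question's writeToFile, for illustration only.
public class FeedWriter {

    // Decodes the stream as UTF-8 and appends the text to target, encoded as UTF-8.
    public static void writeToFile(InputStream in, Path target) throws IOException {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
             Writer writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8,
                     StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            char[] buffer = new char[8192];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample payload with a non-ASCII character, written to a temporary file.
        Path target = Files.createTempFile("feed", ".json");
        byte[] payload = "{\"name\":\"caf\u00e9\"}".getBytes(StandardCharsets.UTF_8);
        writeToFile(new ByteArrayInputStream(payload), target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

Because both charsets are explicit, the result should not depend on the container's locale. If the file still looks binary with this version, the incoming bytes themselves are the more likely cause.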

Paul Rey answered Oct 02 '22 13:10