
Java special chars replace

I have a text: " Csuklási roham gyötörheti a svédeket, annyit emlegetik mostanság ismét a svéd modellt Magyarországon."

In that original text there are no line breaks at all.

When I email this text (with Gmail), I get it encoded as follows:

Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable

Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket, annyit emlegetik mostans=E1g =
ism=E9t a
sv=E9d modellt Magyarorsz=E1gon. 

In HTML:

Content-Type: text/html; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable


<span class=3D"Apple-style-span" style=3D"font-family: Helvetica, Verdana, =
sans-serif; font-size: 15px; ">Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket=
, annyit emlegetik mostans=E1g ism=E9t a sv=E9d modellt Magyarorsz=E1gon.

....

When I try to parse the email body as text/plain, I cannot get rid of the = sign between the two words in "mostans=E1g = ism=E9t". Note that this character is absent at that position in the HTML-encoded message. I have no idea what that special character might be, but I need to eliminate it to get back the original text.

I tried replacing '\n', but that is not it: if I hit 'Enter' in the text, I can replace the resulting line break with whatever character I want. I also tried '\r' and '\t'.

So the question is: what am I missing? Where does that special character come from? Is it because of the charset and/or the transfer encoding? If so, what do I have to do to solve the problem and get the original text back?

Any help would be welcome.

Cheers, Balázs

Balázs Németh asked Nov 11 '10 11:11


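1 Answer

You need to use MimeUtility. The = at the end of the line is not a character of your text; it belongs to the quoted-printable transfer encoding, and decoding the body removes it. Here is an example.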

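import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.io.Writer;

import javax.mail.MessagingException;
import javax.mail.internet.MimeUtility;

public class Mime {
    public static void main(String[] args) throws MessagingException,
            IOException {
        // The file "mime" holds the quoted-printable encoded body.
        InputStream stringStream = new FileInputStream("mime");
        InputStream output = MimeUtility.decode(stringStream,
                "quoted-printable");
        System.out.println(convertStreamToString(output));
    }

    public static String convertStreamToString(InputStream is)
            throws IOException {
        /*
         * To convert the InputStream to a String we use the
         * Reader.read(char[] buffer) method. We iterate until the Reader
         * returns -1, which means there is no more data to read. We use the
         * StringWriter class to produce the string.
         */
        if (is != null) {
            Writer writer = new StringWriter();

            char[] buffer = new char[1024];
            try {
                // Use the charset declared in the Content-Type header
                // (ISO-8859-2), not ISO-8859-1.
                Reader reader = new BufferedReader(new InputStreamReader(is,
                        "ISO-8859-2"));
                int n;
                while ((n = reader.read(buffer)) != -1) {
                    writer.write(buffer, 0, n);
                }
            } finally {
                is.close();
            }
            return writer.toString();
        } else {
            return "";
        }
    }
}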

The file 'mime' contains the encoded text:

Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket, annyit emlegetik mostans=E1g =
ism=E9t a
sv=E9d modellt Magyarorsz=E1gon.
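
In quoted-printable, a = at the very end of a line is a soft line break: the encoder inserts it to keep lines under the length limit, and a decoder drops it together with the line ending that follows. As a rough sketch of that rule without the JavaMail dependency, assuming the ISO-8859-2 charset from the message header (the class QuotedPrintableSketch and its decode helper are hypothetical names, not part of any library):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuotedPrintableSketch {

    // Decodes quoted-printable text, then interprets the bytes in the given charset.
    public static String decode(String encoded, Charset charset) {
        byte[] in = encoded.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            byte b = in[i];
            if (b != '=') {
                out.write(b);                       // literal byte, copy as-is
                i++;
            } else if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 2;                             // soft line break "=\n": drop it
            } else if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 3;                             // soft line break "=\r\n": drop it
            } else if (i + 2 < in.length && isHex(in[i + 1]) && isHex(in[i + 2])) {
                // "=XX" is one byte written as two hex digits
                out.write((Character.digit(in[i + 1], 16) << 4)
                        | Character.digit(in[i + 2], 16));
                i += 3;
            } else {
                out.write(b);                       // malformed sequence, keep the '='
                i++;
            }
        }
        return new String(out.toByteArray(), charset);
    }

    private static boolean isHex(byte b) {
        return Character.digit(b, 16) >= 0;
    }

    public static void main(String[] args) {
        String encoded =
                "Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket, annyit emlegetik mostans=E1g =\n"
                + "ism=E9t a\n"
                + "sv=E9d modellt Magyarorsz=E1gon.";
        System.out.println(decode(encoded, Charset.forName("ISO-8859-2")));
    }
}
```

With this input the soft break after "mostanság" disappears, while the plain line breaks after "ismét a" are real newlines in the message body and stay in the result. If they are unwanted, they have to be replaced separately after decoding.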

UPDATE:

Using the Guava library (older releases that still provide InputSupplier):

    InputSupplier<InputStream> supplier = new InputSupplier<InputStream>() {
        @Override
        public InputStream getInput() throws IOException {
            InputStream inStream = new FileInputStream("mime");
            try {
                return MimeUtility.decode(inStream, "quoted-printable");
            } catch (MessagingException e) {
                // Propagate the failure instead of returning a null stream.
                throw new IOException(e);
            }
        }
    };
    // Use the charset declared in the Content-Type header (ISO-8859-2).
    InputSupplier<InputStreamReader> result = CharStreams
    .newReaderSupplier(supplier, Charset.forName("ISO-8859-2"));
    String ans = CharStreams.toString(result);
    System.out.println(ans);
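
On the charset part of the question: the transfer encoding only turns bytes into =XX sequences, while the charset in the Content-Type header decides which character each byte stands for. The sample text happens to use only letters (á, ö, é) that sit at the same positions in ISO-8859-1 and ISO-8859-2, but Hungarian ő and ű do not. A small sketch of the difference, assuming the JDK ships the ISO-8859-2 charset (CharsetMismatchSketch and decodeByte are hypothetical names used for illustration):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchSketch {

    // Decodes a single byte value under the given charset.
    public static String decodeByte(int value, Charset charset) {
        return new String(new byte[] { (byte) value }, charset);
    }

    // Formats the first code point of s as U+XXXX to avoid console encoding issues.
    private static String codePoint(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        Charset latin2 = Charset.forName("ISO-8859-2");

        // 0xE1 maps to the same character (á) in both charsets.
        System.out.println(codePoint(decodeByte(0xE1, latin1)) + " vs "
                + codePoint(decodeByte(0xE1, latin2)));

        // 0xF5 is õ in ISO-8859-1 but ő in ISO-8859-2.
        System.out.println(codePoint(decodeByte(0xF5, latin1)) + " vs "
                + codePoint(decodeByte(0xF5, latin2)));
    }
}
```

So the charset handed to the reader should be the one named in the message header, even if a different one appears to work for a particular text.
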
Emil answered Sep 24 '22 21:09