Is there a way to add aliases for Java's Charset names

I'm getting an exception, buried way inside a 3rd party library, with a message like this:

java.io.UnsupportedEncodingException: BIG-5

I think this is happening because Java doesn't define this name for java.nio.charset.Charset. Charset.forName("big5") is fine, but Charset.forName("big-5") throws an exception. (All these names appear to be case insensitive.)

This is different from "utf-8", which has some aliases to be more forgiving. For example, both Charset.forName("utf8") and Charset.forName("utf-8") work fine.
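To see which names a given JVM accepts, the lookups above can be checked directly. This is a minimal sketch using only java.nio.charset.Charset; the class name CharsetAliasCheck and its two helper methods are made up for illustration, and the result for "big5" and "big-5" depends on which charsets the runtime ships with.

```java
import java.nio.charset.Charset;
import java.util.Set;
import java.util.TreeSet;

public class CharsetAliasCheck {

    /** Returns the canonical charset name, or null if this JVM does not recognize the name. */
    public static String canonicalNameOrNull(String name) {
        // isSupported returns false for unknown but legally formed names instead of throwing.
        return Charset.isSupported(name) ? Charset.forName(name).name() : null;
    }

    /** Returns the aliases this JVM registers for a supported charset, sorted for stable output. */
    public static Set<String> aliasesOf(String name) {
        return new TreeSet<>(Charset.forName(name).aliases());
    }

    public static void main(String[] args) {
        for (String name : new String[] {"utf8", "utf-8", "big5", "big-5"}) {
            System.out.println(name + " -> " + canonicalNameOrNull(name));
        }
        if (Charset.isSupported("big5")) {
            System.out.println("big5 aliases: " + aliasesOf("big5"));
        }
    }
}
```

A name that prints null here is one that Charset.forName would reject on that JVM.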

Question: is there a way to add the alias so that "big-5" maps to "big5"?

Rob N, asked Nov 29 '16 22:11



1 Answer

You can try the mail.mime.contenttypehandler system property:

In some cases JavaMail is unable to process messages with an invalid Content-Type header. The header may have incorrect syntax or other problems. This property specifies the name of a class that will be used to clean up the Content-Type header value before JavaMail uses it. The class must have a method with this signature:

public static String cleanContentType(MimePart mp, String contentType)

Whenever JavaMail accesses the Content-Type header of a message, it will pass the value to this method and use the returned value instead.

An example of this is:

import java.util.Arrays;
import javax.mail.Session;
import javax.mail.internet.ContentType;
import javax.mail.internet.MimeMessage;
import javax.mail.internet.MimePart;

public class FixEncodingName {

    public static void main(String[] args) throws Exception {
        MimeMessage msg = new MimeMessage((Session) null);
        msg.setText("test", "big-5");
        msg.saveChanges();
        System.out.println(msg.getContentType());
        System.out.println(Arrays.toString(msg.getHeader("Content-Type")));
    }

    public static String cleanContentType(MimePart p, String mimeType) {
        if (mimeType != null) {
            String newContentType = mimeType;
            try {
                ContentType ct = new ContentType(mimeType);
                String cs = ct.getParameter("charset");
                if ("big-5".equalsIgnoreCase(cs)) {
                    ct.setParameter("charset", "big5");
                    newContentType = ct.toString();
                }
            } catch (Exception ignore) {
                // Header could not be parsed; fall back to a case-insensitive text replacement.
                newContentType = newContentType.replaceAll("(?i)big-5", "big5");
            }

            /*try { //Fix the header in the message.
                p.setContent(p.getContent(), newContentType);
                if (p instanceof Message) {
                    ((Message) p).saveChanges();
                }
            } catch (Exception ignore) {
            }*/
            return newContentType;
        }
        return mimeType;
    }
}

When run with -Dmail.mime.contenttypehandler=FixEncodingName, this outputs:

text/plain; charset=big5
[text/plain; charset=big-5]

jmehrens, answered Sep 28 '22 15:09
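
For the literal question of adding an alias outside JavaMail, one possible route is the JDK's java.nio.charset.spi.CharsetProvider extension point. The sketch below is not part of the answer above. The class name Big5AliasProvider is hypothetical, and it assumes the JVM already supports "Big5" and that the provider is registered on the classpath in a file named META-INF/services/java.nio.charset.spi.CharsetProvider containing the provider's class name. Whether the third-party library's lookup actually reaches installed providers would need to be verified.

```java
import java.nio.charset.Charset;
import java.nio.charset.spi.CharsetProvider;
import java.util.Collections;
import java.util.Iterator;

public class Big5AliasProvider extends CharsetProvider {

    // Hypothetical alias and target; adjust to the name the library actually passes in.
    private static final String ALIAS = "big-5";
    private static final String TARGET = "Big5";

    @Override
    public Charset charsetForName(String charsetName) {
        // Answer only for the alias; returning null leaves every other name to other providers.
        if (ALIAS.equalsIgnoreCase(charsetName) && Charset.isSupported(TARGET)) {
            return Charset.forName(TARGET);
        }
        return null;
    }

    @Override
    public Iterator<Charset> charsets() {
        // The built-in providers already list Big5, so this provider contributes no new charsets.
        return Collections.emptyIterator();
    }
}
```

The provider hands back the existing Big5 charset object instead of defining a new one, so encoding and decoding behavior is unchanged; only the extra name is accepted.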