Java emoji conversion to xml: what libraries exist?

I am transforming MIME messages to XML so that I can submit them to a mail merge service as SOAP requests, but emoji are giving me problems (the smiley 😃, for example, which I'd like to have converted to &#128515;).

I'm using XStream to handle my conversions, but it doesn't properly encode emoji and other characters that Java represents as high/low surrogate pairs (see the example test case below). It is possible that I am missing some crucial XStream configuration.

I have found this project that is based on this project which does conversions for specific Japanese cell phone providers via a hard-coded mapping, but I feel like this problem is probably solved more elegantly in existing Oracle or third-party (Apache, etc.) libraries.

From what I've read and heard NuSOAP addresses this issue for PHP but I'd like to stay in the Java/Groovy world for emoji conversion so I can use a compatible library.

What tools/approaches are you using to handle emoji conversion to XML on the JVM?

import junit.framework.TestCase;
import com.thoughtworks.xstream.XStream;

public class XStreamTest extends TestCase {
    public void testXStreamEmojiEncoding() {
        final String expected = "Open mouth smiley &#128515; and two chicken heads followed by a period &#128020;&#128020;.";
        final String original = "Open mouth smiley 😃 and two chicken heads followed by a period 🐔🐔.";

        final XStream xStream = new XStream();

        final String returned = xStream.toXML(original);

        assertEquals("<string>" + expected + "</string>", returned);
    }
}

The above test looks for an HTML decimal representation of the emoji, but I'll accept other formats that will work for MIME.

eebbesen asked Jun 16 '13 21:06



1 Answer

I recently wrote a library for this: emoji-java
Here is the kind of output you would get:

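import com.vdurmont.emoji.EmojiParser;

String str = "An 😀awesome 😃string with a few 😉emojis!";
// parseToHtmlDecimal replaces each emoji with its decimal character reference
String result = EmojiParser.parseToHtmlDecimal(str);
System.out.println(result);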
// Prints:
// "An &#128512;awesome &#128515;string with a few &#128521;emojis!"

You can either add the jar to your project or use the maven dependency:

<dependency>
  <groupId>com.vdurmont</groupId>
  <artifactId>emoji-java</artifactId>
  <version>1.0.0</version> <!-- Or whatever the version will be when you read this post -->
</dependency>
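
If adding a dependency is not an option, the decimal references the question asks for can also be produced with the JDK alone. The following is only a minimal sketch: the class name EmojiToXml and the method escapeSupplementary are hypothetical and not part of XStream or emoji-java. It escapes only code points outside the Basic Multilingual Plane (the ones Java stores as surrogate pairs) and does not escape XML metacharacters such as < or &, so it would have to run alongside normal XML escaping rather than replace it.

```java
public class EmojiToXml {

    // Hypothetical helper: replaces every supplementary code point (> U+FFFF)
    // with a decimal numeric character reference such as &#128515;.
    // All other characters are copied through unchanged.
    static String escapeSupplementary(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int codePoint = input.codePointAt(i);
            if (Character.isSupplementaryCodePoint(codePoint)) {
                out.append("&#").append(codePoint).append(';');
            } else {
                out.appendCodePoint(codePoint);
            }
            // Advance by 2 chars for a surrogate pair, 1 otherwise.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \uD83D\uDE03 is U+1F603 (smiley), \uD83D\uDC14 is U+1F414 (chicken).
        String original = "Open mouth smiley \uD83D\uDE03 and two chicken heads "
                + "followed by a period \uD83D\uDC14\uD83D\uDC14.";
        System.out.println(escapeSupplementary(original));
    }
}
```

Running main prints "Open mouth smiley &#128515; and two chicken heads followed by a period &#128020;&#128020;.", which is the string the test in the question expects. Whether the output survives XStream unchanged depends on how the serializer treats the ampersand, so the sketch is probably best applied after serialization or through a custom writer.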
Vincent Durmont answered Oct 03 '22 13:10