I'm getting emails from a client that nest a multipart/alternative message inside a multipart/mixed message. When I get the body of the message, it returns only the multipart/alternative level, when what I really want is the text/html part contained in the multipart/alternative.
I've looked through the Javadocs for javax.mail and can't find a simple way to get the body of a body part that is itself a multipart, or to skip the first multipart/mixed part and go into the multipart/alternative body to read the text/html and text/plain pieces.
The email structure looks like this:
...
Content-Type: multipart/mixed;
boundary="----=_Part_19487_1145362154.1418138792683"
------=_Part_19487_1145362154.1418138792683
Content-Type: multipart/alternative;
boundary="----=_Part_19486_1391901275.1418138792683"
------=_Part_19486_1391901275.1418138792683
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=ISO-8859-1
...
------=_Part_19486_1391901275.1418138792683
Content-Transfer-Encoding: 7bit
Content-Type: text/html; charset=ISO-8859-1
...
------=_Part_19486_1391901275.1418138792683--
------=_Part_19487_1145362154.1418138792683--
This is an outline of the code used to parse the emails:
    Message[] found = fldr.search(searchCondition);
    for (int i = 0; i < found.length; i++) {
        Message m = found[i];
        Object o = m.getContent();
        if (o instanceof Multipart) {
            log.info("**This is a Multipart Message. ");
            Multipart mp = (Multipart) o;
            log.info("The Multipart message has " + mp.getCount() + " parts.");
            for (int j = 0; j < mp.getCount(); j++) {
                BodyPart b = mp.getBodyPart(j);
                // If the content type is multipart, get the content of that part,
                // make it the new container and restart the loop in that part of the message.
                if (b.getContentType().contains("multipart")) {
                    mp = (Multipart) b.getContent();
                    j = 0;
                    continue;
                }
                log.info("This content type is " + b.getContentType());
                if (!b.getContentType().contains("text/html")) {
                    continue;
                }
                Object o2 = b.getContent();
                if (o2 instanceof String) {
                    // do things with content here
                }
            }
        }
    }
It appears to keep stopping at the second boundary and not parsing anything further. In the case of the above message it stops at boundary="----=_Part_19486_1391901275.1418138792683" and never gets to the text of the message.
In this block:
    if (b.getContentType().contains("multipart"))
    {
        mp = (Multipart) b.getContent();
        j = 0;
        continue;
    }
You set j to 0 and ask the loop to continue, hoping it will start again at zero. But the increment j++ runs before the next iteration begins, so the loop resumes at 1, not 0.
Set j to -1 to solve your issue.
    if (b.getContentType().contains("multipart"))
    {
        mp = (Multipart) b.getContent();
        j = -1;
        continue;
    }
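To see why the reset value matters, here is a minimal plain-Java sketch with no javax.mail involved. The class LoopRestartDemo and its method are hypothetical names used only for illustration, and the count variable merely stands in for mp.getCount().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it only mimics the shape of the loop in the question.
class LoopRestartDemo {

    // Returns the indices visited inside the nested container after the loop
    // switches to it, assigns resetValue to j, and executes continue.
    static List<Integer> visitedAfterRestart(int resetValue, int innerCount) {
        List<Integer> visited = new ArrayList<>();
        int count = 1;              // the outer container holds a single part
        boolean switched = false;
        for (int j = 0; j < count; j++) {
            if (!switched) {
                // Mimics "mp = (Multipart) b.getContent()": descend into the nested container.
                switched = true;
                count = innerCount;
                j = resetValue;
                continue;           // j++ still runs before the condition is checked again
            }
            visited.add(j);
        }
        return visited;
    }
}
```

Calling LoopRestartDemo.visitedAfterRestart(0, 2) returns [1], so index 0 of the nested container is skipped, while LoopRestartDemo.visitedAfterRestart(-1, 2) returns [0, 1]. Note that with either reset value, the flat loop never returns to any remaining parts of the outer container once it has descended.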
I tested your code and it failed for me as well.
In my case, b.getContentType() returns all uppercase characters (e.g. "TEXT/HTML; charset=UTF-8"), so I converted the value to lowercase and it worked.
    String contentType = b.getContentType().toLowerCase(Locale.ENGLISH);
    if (!contentType.contains("text/html")) {
        continue;
    }
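The same check can be pulled into a small helper that ignores both case and any parameters such as charset. This is only a sketch in plain Java; ContentTypeCheck and hasMediaType are hypothetical names, not part of javax.mail. In javax.mail itself, Part.isMimeType("text/html") is another option that compares only the primary type and subtype.

```java
// Hypothetical helper; operates on the raw header string returned by getContentType().
class ContentTypeCheck {

    // True when the media type of the header value equals mediaType,
    // ignoring letter case and any parameters after the first ';'.
    static boolean hasMediaType(String contentTypeHeader, String mediaType) {
        if (contentTypeHeader == null || mediaType == null) {
            return false;
        }
        int semi = contentTypeHeader.indexOf(';');
        String base = (semi >= 0)
                ? contentTypeHeader.substring(0, semi)
                : contentTypeHeader;
        return base.trim().equalsIgnoreCase(mediaType);
    }
}
```

In the loop from the question, the test would then read if (!ContentTypeCheck.hasMediaType(b.getContentType(), "text/html")) { continue; }.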