This is the useful part of code:
java.util.List<Element> elems = src.getAllElements();
Iterator it = elems.iterator();
Element el;
String key,value,date="",place="";
String [] data;
int k=0;
Segment content;
String contentstr;
String classname;
while(it.hasNext()){
el = (Element)it.next();
if(el.getName().equals("span"))
{
classname=el.getAttributeValue("class");
if(classname.equals("edit_body"))
{
//java.util.List<Element> elemsinner = el.getChildElements();
//Iterator itinner = elemsinner.iterator();
content=el.getContent();
contentstr=content.toString();
if(true)
{
System.out.println("Done!");
System.out.println(classname);
System.out.println(contentstr);
}
}
}
}
No output. But if I remove the if(classname.equals("edit_body"))
condition it does print (in one of the iterations):
Done!
edit_body
"I honestly think it is better to be a failure at something you love than to be a success at something you hate."
Can't get the bug part... help!
I am using an external java library BTW for html parsing.
BTW there are two errors at the start of the output, which is there in both the cases, with or without if condition.:
Dec 20, 2012 11:53:11 AM net.htmlparser.jericho.LoggerProviderJava$JavaLogger error SEVERE: EndTag br at (r1992,c60,p94048) not recognised as type '/normal' because its name and closing delimiter are separated by characters other than white space
Dec 20, 2012 11:53:11 AM net.htmlparser.jericho.LoggerProviderJava$JavaLogger error SEVERE: Encountered possible EndTag at (r1992,c60,p94048) whose content does not match a registered EndTagType
Hope that wont cause the error
Ok guys, Somebody explain me please! "edit_body".equals(el.getAttributeValue("class")) worked!!
I had right now the exactly same problem.
I success to solve it by using: SomeStringVar.replaceAll("\\P{Print}","");
.
This command remove all the Unicode characters in the variant (characteres that you cant see- the strings look like equal, even they not really equal).
I use this command on each variant i needed in the equalization, and it works for me as well.
Looks like you are having leading or trailing whitespaces in your classname
.
Try using this: -
if(classname.trim().equals("edit_body"))
This will trim any of those whitespaces at the ends.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With