
Unidentified whitespace character in Java

While extracting some HTML from a web page, I found elements whose text ends in an unknown whitespace character that does not match "\\s":

<span>Monday </span>

In Java, to check what this character is, I do:

String s = getTheSpanContent();
char c = s.charAt(s.length() - 1);
int i = (int) c;

and the value of i is 160.

Does anyone know what this character is, and how I can match it?

Thanks

Richard H asked Nov 09 '09 17:11



1 Answer

It's a non-breaking space (U+00A0, decimal 160). According to the Pattern Javadocs, \\s matches only [ \t\n\x0B\f\r] by default, so you'll have to explicitly add \xA0 to your regex, for example [\\s\\xA0], if you want to match it.
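As a rough sketch of that suggestion, the snippet below builds a character class that covers both the default \\s set and \xA0. The class name NbspMatch, the helper methods, and the hard-coded sample string are made up for illustration; in the question the string would come from getTheSpanContent().

```java
import java.util.regex.Pattern;

public class NbspMatch {
    // Default \s plus the non-breaking space (code point 160, hex A0), anchored at the end.
    private static final Pattern TRAILING_SPACE = Pattern.compile("[\\s\\xA0]+$");

    // Hypothetical helper: removes trailing whitespace, including non-breaking spaces.
    public static String stripTrailing(String s) {
        return TRAILING_SPACE.matcher(s).replaceAll("");
    }

    // Hypothetical helper: true if the last character is a non-breaking space.
    public static boolean endsWithNbsp(String s) {
        return !s.isEmpty() && s.charAt(s.length() - 1) == '\u00A0';
    }

    public static void main(String[] args) {
        // Stand-in for the span content from the question.
        String s = "Monday\u00A0";

        System.out.println((int) s.charAt(s.length() - 1)); // 160
        System.out.println(s.matches(".*\\s"));             // false
        System.out.println(s.matches(".*[\\s\\xA0]"));      // true
        System.out.println("[" + stripTrailing(s) + "]");   // [Monday]
    }
}
```

If the goal is only to clean up the extracted text, replacing the character before matching (for example s.replace('\u00A0', ' ')) is another option, after which the plain \\s pattern applies.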

Michael Myers answered Oct 10 '22 03:10
