
Unexpected StreamTokenizer behavior in Android

Tags:

java

android

I'm encountering a bizarre problem: the same code produces different results on desktop Java than on Android.

// `in` is an InputStream opened on the file shown below
InputStreamReader reader = new InputStreamReader(in, "UTF-8");
BufferedReader m_reader = new BufferedReader(reader);
StreamTokenizer m_tokenizer = new StreamTokenizer(m_reader);
// Read and print the first four tokens: '(' ';' FF '['
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
// Read one character directly from the underlying reader, bypassing the tokenizer
int c = m_reader.read();
System.out.println(c);
// Continue with the tokenizer
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());
m_tokenizer.nextToken();
System.out.println(m_tokenizer.toString());

Given the following input stream (read from a file):

(;FF[4]CA[UTF-8]

desktop Java prints:

Token['('], line 1
Token[';'], line 1
Token[FF], line 1
Token['['], line 1
52
Token[']'], line 1
Token[CA], line 1

as expected. But on Android I get:

Token['('], line 1
Token[';'], line 1
Token[FF], line 1
Token['['], line 1
93
Token[n=4.0], line 1
Token[CA], line 1

Why does it behave differently on Android? There, the ']' character is somehow consumed from the stream before the tokenizer gets to it. I have read both the Java and the Android documentation, and the classes appear to be identical.

My API level is set to 7. I've tried both the Android 2.1 and Android 4.0 emulators, as well as a real device, and got the same result each time.
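One way to see where the characters go is to count how many the tokenizer has pulled from the reader after each nextToken() call. The following is only a minimal sketch: `LookaheadProbe`, `CountingReader`, and `consumedAfterEachToken` are hypothetical names introduced for illustration, and the input is fed from a string rather than a file.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class LookaheadProbe {

    // Hypothetical helper: counts every character handed out by the wrapped reader.
    static class CountingReader extends FilterReader {
        int count = 0;

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = super.read();
            if (c >= 0) {
                count++;
            }
            return c;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                count += n;
            }
            return n;
        }
    }

    // Returns how many characters had been consumed after each of the first `calls` nextToken() calls.
    static int[] consumedAfterEachToken(String input, int calls) throws IOException {
        CountingReader counting = new CountingReader(new StringReader(input));
        StreamTokenizer tokenizer = new StreamTokenizer(counting);
        int[] consumed = new int[calls];
        for (int i = 0; i < calls; i++) {
            tokenizer.nextToken();
            consumed[i] = counting.count;
        }
        return consumed;
    }

    public static void main(String[] args) throws IOException {
        int[] consumed = consumedAfterEachToken("(;FF[4]CA[UTF-8]", 4);
        for (int i = 0; i < consumed.length; i++) {
            System.out.println("after nextToken() #" + (i + 1) + ": " + consumed[i] + " chars consumed");
        }
    }
}
```

On desktop Java the counts for the first four tokens are 1, 2, 5, 5: the word FF needs one character of lookahead, and the following '[' token is served from that lookahead. A platform whose tokenizer reads further ahead would report larger counts at the same steps.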

kvzrock asked Nov 05 '22 11:11


1 Answer

Basically, the Android StreamTokenizer implementation is messed up. Judging from the source code, each nextToken() call parses the character that the previous nextToken() call already read from the stream, unless it's the first character in the stream. In other words, the tokenizer always stays one character ahead of the token it returns. In my case, '[' has already been read by the 3rd nextToken(). When the 4th nextToken() is called, it reads '4' from the stream but returns '['. read() then returns ']', which is the next unread character. The 5th nextToken() returns the number 4, which the 4th call had already read, and it continues like that. So, given the current implementation, you can't mix read() and nextToken() on the same reader.
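One possible workaround is to never call read() on the underlying reader and to fetch the raw characters through the tokenizer itself. The following is a minimal sketch, not a definitive fix: `BracketValueSketch`, `parse`, `readBracketValue`, and `restoreDefaultSyntax` are hypothetical names. It assumes that a tokenizer applies its current syntax table to a character it has already read ahead, which is how desktop Java behaves; I have not verified this against every Android release.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class BracketValueSketch {

    // Hypothetical helper: call right after nextToken() has returned '['.
    // Collects the characters up to the closing ']' using only the tokenizer.
    static String readBracketValue(StreamTokenizer t) throws IOException {
        StringBuilder value = new StringBuilder();
        t.resetSyntax(); // every character below 256 becomes an "ordinary" one-char token
        int tok;
        while ((tok = t.nextToken()) != StreamTokenizer.TT_EOF && tok != ']') {
            if (tok == StreamTokenizer.TT_WORD) {
                // characters >= 256 are still grouped into words after resetSyntax()
                value.append(t.sval);
            } else {
                value.append((char) tok);
            }
        }
        restoreDefaultSyntax(t);
        return value.toString();
    }

    // Re-applies the defaults documented for the StreamTokenizer(Reader) constructor.
    static void restoreDefaultSyntax(StreamTokenizer t) {
        t.wordChars('a', 'z');
        t.wordChars('A', 'Z');
        t.wordChars(128 + 32, 255);
        t.whitespaceChars(0, ' ');
        t.commentChar('/');
        t.quoteChar('"');
        t.quoteChar('\'');
        t.parseNumbers();
    }

    // Collects "NAME=value" pairs from input shaped like (;FF[4]CA[UTF-8]
    static List<String> parse(Reader in) throws IOException {
        StreamTokenizer t = new StreamTokenizer(in);
        List<String> out = new ArrayList<String>();
        String name = null;
        int tok;
        while ((tok = t.nextToken()) != StreamTokenizer.TT_EOF) {
            if (tok == StreamTokenizer.TT_WORD) {
                name = t.sval;
            } else if (tok == '[') {
                out.add(name + "=" + readBracketValue(t));
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(parse(new StringReader("(;FF[4]CA[UTF-8]")));
    }
}
```

Because every character passes through nextToken(), any lookahead the tokenizer keeps internally no longer matters: nothing else reads from the stream behind its back.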

kvzrock answered Nov 15 '22 04:11