
How do I read input character-by-character in Java?

I am used to the C-style getchar(), but there seems to be nothing comparable in Java. I am building a lexical analyzer, and I need to read the input character by character.

I know I can use a Scanner to read a token or a line and then walk through it char by char, but that seems unwieldy for strings spanning multiple lines. Is there a way to just get the next character from the input buffer in Java, or should I just plug away with the Scanner class?

The input is a file, not the keyboard.

asked May 01 '09 15:05 by jergason



2 Answers

Use Reader.read(). A return value of -1 means end of stream; else, cast to char.

This code reads character data from a list of file arguments:

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.nio.charset.Charset;

    public class CharacterHandler {
        // Java 7 source level
        public static void main(String[] args) throws IOException {
            // replace this with a known encoding if possible
            Charset encoding = Charset.defaultCharset();
            for (String filename : args) {
                File file = new File(filename);
                handleFile(file, encoding);
            }
        }

        private static void handleFile(File file, Charset encoding)
                throws IOException {
            try (InputStream in = new FileInputStream(file);
                 Reader reader = new InputStreamReader(in, encoding);
                 // buffer for efficiency
                 Reader buffer = new BufferedReader(reader)) {
                handleCharacters(buffer);
            }
        }

        private static void handleCharacters(Reader reader)
                throws IOException {
            int r;
            // read() returns -1 at end of stream, otherwise a char value as an int
            while ((r = reader.read()) != -1) {
                char ch = (char) r;
                System.out.println("Do something with " + ch);
            }
        }
    }

The drawback of the code above is that it uses the system's default character set, so the same file can decode differently from one machine to the next. Wherever possible, prefer a known encoding (ideally a Unicode encoding, if you have a choice). See the Charset class for more. (If you feel masochistic, you can read this guide to character encoding.)
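As a minimal sketch of the "known encoding" advice, assuming the input files are UTF-8 (an assumption; substitute whatever encoding your files actually use), the same read loop can be written with Files.newBufferedReader and StandardCharsets.UTF_8. The class name Utf8CharacterHandler and the method countChars are hypothetical, chosen only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example class; not part of the original answer.
class Utf8CharacterHandler {

    // Reads the file as UTF-8 (assumed encoding) one char at a time
    // and returns how many char values were read.
    static long countChars(Path path) throws IOException {
        long count = 0;
        // Files.newBufferedReader returns an already-buffered reader
        try (BufferedReader reader =
                 Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            // read() returns -1 at end of stream
            while (reader.read() != -1) {
                count++; // a lexer would inspect the char here instead
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        for (String filename : args) {
            System.out.println(filename + ": "
                    + countChars(Paths.get(filename)) + " chars");
        }
    }
}
```

One design point to be aware of: a reader obtained from Files.newBufferedReader throws an IOException when it meets a malformed or unmappable byte sequence, so a file that is not really UTF-8 fails loudly rather than decoding to garbage.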

(One thing you might want to look out for is supplementary Unicode characters: those that require two char values to store. See the Character class for more details; this is an edge case that probably won't apply to homework.)
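If supplementary characters do matter for your input, one possible approach is to combine surrogate pairs into a single code point as you read. The following is only a sketch: CodePointReader, readCodePoint and readAll are hypothetical names, and the code assumes the Reader supports mark/reset (BufferedReader and StringReader do).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the original answer.
class CodePointReader {

    // Returns the next Unicode code point, or -1 at end of stream.
    // Assumes reader.markSupported() is true.
    static int readCodePoint(Reader reader) throws IOException {
        int first = reader.read();
        if (first == -1) {
            return -1;
        }
        char high = (char) first;
        if (Character.isHighSurrogate(high)) {
            reader.mark(1); // remember the position so we can step back
            int second = reader.read();
            if (second != -1 && Character.isLowSurrogate((char) second)) {
                // valid surrogate pair: merge into one code point
                return Character.toCodePoint(high, (char) second);
            }
            if (second != -1) {
                reader.reset(); // not a pair: leave the char for the next call
            }
        }
        // BMP character, or an unpaired surrogate returned as-is
        return first;
    }

    // Reads every code point until end of stream.
    static List<Integer> readAll(Reader reader) throws IOException {
        List<Integer> codePoints = new ArrayList<>();
        int cp;
        while ((cp = readCodePoint(reader)) != -1) {
            codePoints.add(cp);
        }
        return codePoints;
    }

    public static void main(String[] args) throws IOException {
        // 'a', U+1F600 (stored as two char values), 'b'
        Reader reader = new StringReader("a\uD83D\uDE00b");
        for (int cp : readAll(reader)) {
            System.out.println(String.format("U+%04X", cp));
        }
    }
}
```

Working with int code points instead of char values means a lexer can then use the int overloads in the Character class, such as Character.isLetter(int).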

answered Oct 11 '22 15:10 by McDowell


Combining the recommendations from others for specifying a character encoding and buffering the input, here's what I think is a pretty complete answer.

Assuming you have a File object representing the file you want to read:

    BufferedReader reader = new BufferedReader(
        new InputStreamReader(
            new FileInputStream(file),
            Charset.forName("UTF-8")));
    int c;
    while ((c = reader.read()) != -1) {
      char character = (char) c;
      // Do something with your character
    }

answered Oct 11 '22 13:10 by roryparle