How can I use Java 8 Streams with an InputStream?

I would like to wrap a java.util.stream.Stream around an InputStream to process one byte or one character at a time. I couldn't find a simple way of doing this.

Consider the following exercise: we wish to count the number of times each letter appears in a text file. We can store this in an array so that tally[0] holds the number of times a appears in the file, tally[1] the number of times b appears, and so on. Since I couldn't find a way of streaming the file directly, I did this:

 int[] tally = new int[26];
 // Files.lines decodes the file as UTF-8 and yields one String per line
 Stream<String> lines = Files.lines(Paths.get(aFile)).map(s -> s.toLowerCase());
 Consumer<String> charCount = new Consumer<String>() {
   public void accept(String t) {
      for(int i=0; i<t.length(); i++) {
         char c = t.charAt(i);
         // count only a-z; other letters (e.g. 'é') would index outside tally
         if(c >= 'a' && c <= 'z')
            tally[c - 'a']++;
      }
   }
 };
 lines.forEach(charCount);

Is there a way of accomplishing this without using the lines method? Can I process each character directly as a Stream<Byte> or Stream<Character> instead of creating a String for each line of the text file?

Can I more directly convert a java.io.InputStream into a java.util.stream.Stream?

Thorn asked May 23 '14 17:05


1 Answer

First, you have to redefine your task. You are reading characters, so what you want to convert into a Stream is a Reader, not an InputStream.

You can’t re-implement the charset conversion that happens, e.g. in an InputStreamReader, with Stream operations as there can be n:m mappings between the bytes of the InputStream and the resulting chars.
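
To make the n:m point concrete, here is a small sketch that compares the UTF-8 byte count of a few strings with their char count. The class name CharsetMappingDemo and the helper utf8Bytes are made up for this illustration; UTF-8 is chosen only as an example charset.

```java
import java.nio.charset.StandardCharsets;

class CharsetMappingDemo {
    // Hypothetical helper: number of bytes the text occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 'a', U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (an emoji, one surrogate pair)
        String[] samples = { "a", "\u00e9", "\u20ac", "\uD83D\uDE00" };
        for (String s : samples) {
            // s.length() counts UTF-16 chars, which is what Reader.read() delivers
            System.out.println(utf8Bytes(s) + " byte(s) -> " + s.length() + " char(s)");
        }
    }
}
```

The euro sign takes three bytes but is a single char, while the emoji takes four bytes and becomes two chars, so no fixed byte-to-char ratio exists that a per-byte stream operation could rely on.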

Creating a stream out of a Reader is a bit tricky. You will need an iterator to specify a method for getting an item and an end condition:

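// assumes a java.io.Reader named reader is in scope
PrimitiveIterator.OfInt it=new PrimitiveIterator.OfInt() {
    int last=-2; // -2: nothing buffered, -1: end of input, >=0: buffered char
    public int nextInt() {
      // also rejects calls made after the end of input has been reached
      if(!hasNext())
          throw new NoSuchElementException();
      try { return last; } finally { last=-2; }
    }
    public boolean hasNext() {
      if(last==-2) // read ahead only if nothing is buffered
        try { last=reader.read(); }
        catch(IOException ex) { throw new UncheckedIOException(ex); }
      return last>=0;
    }
};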

Once you have the iterator you can create a stream using the detour of a spliterator and perform your desired operation:

int[] tally = new int[26];
StreamSupport.intStream(Spliterators.spliteratorUnknownSize(
  it, Spliterator.ORDERED | Spliterator.IMMUTABLE | Spliterator.NONNULL), false)
// now you have your stream and you can operate on it:
  .map(Character::toLowerCase)
  .filter(c -> c>='a'&&c<='z')
  .map(c -> c-'a')
  .forEach(i -> tally[i]++);
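
For reference, the iterator and the pipeline can be assembled into one runnable program. This is only a sketch: the class ReaderIteratorTally, the helper methods chars and tally, and the StringReader input are assumptions made for the example, and IMMUTABLE is left out of the characteristics as a conservative choice.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

class ReaderIteratorTally {
    private static final int UNREAD = -2; // marker: no char buffered yet

    // Hypothetical helper: exposes the chars of a Reader as a sequential IntStream.
    static IntStream chars(Reader reader) {
        PrimitiveIterator.OfInt it = new PrimitiveIterator.OfInt() {
            int last = UNREAD;

            @Override
            public int nextInt() {
                if (!hasNext()) throw new NoSuchElementException();
                int ch = last;
                last = UNREAD; // consume the buffered char
                return ch;
            }

            @Override
            public boolean hasNext() {
                if (last == UNREAD) {
                    try { last = reader.read(); }
                    catch (IOException ex) { throw new UncheckedIOException(ex); }
                }
                return last >= 0; // read() returns -1 at end of input
            }
        };
        return StreamSupport.intStream(
            Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED | Spliterator.NONNULL),
            false);
    }

    // Counts occurrences of a-z (case-insensitively) in everything the reader delivers.
    static int[] tally(Reader reader) {
        int[] tally = new int[26];
        chars(reader)
            .map(Character::toLowerCase)
            .filter(c -> c >= 'a' && c <= 'z')
            .map(c -> c - 'a')
            .forEach(i -> tally[i]++);
        return tally;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tally(new StringReader("Hello, World"))));
    }
}
```

Mutating the array inside forEach is acceptable here only because the stream is sequential; a parallel stream would need a thread-safe accumulation instead.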

Note that while iterators are more familiar, implementing the new Spliterator interface directly simplifies the operation, as it doesn't require maintaining state between two methods that could be called in arbitrary order. Instead, there is just one tryAdvance method, which maps directly to a read() call:

Spliterator.OfInt sp = new Spliterators.AbstractIntSpliterator(1000L,
    Spliterator.ORDERED | Spliterator.IMMUTABLE | Spliterator.NONNULL) {
        public boolean tryAdvance(IntConsumer action) {
            int ch;
            try { ch=reader.read(); }
            catch(IOException ex) { throw new UncheckedIOException(ex); }
            if(ch<0) return false;
            action.accept(ch);
            return true;
        }
    };
StreamSupport.intStream(sp, false)
// now you have your stream and you can operate on it:
…
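
A self-contained variant of the spliterator approach might look like the sketch below. The class ReaderSpliteratorDemo, the helpers chars and countLetters, and the letter-counting pipeline are illustrative assumptions, not part of the answer. It passes Long.MAX_VALUE as the size estimate, the value the Spliterator API uses for an unknown size, instead of a fixed guess.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.IntConsumer;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

class ReaderSpliteratorDemo {
    // Hypothetical helper: one tryAdvance call corresponds to one reader.read() call.
    static IntStream chars(Reader reader) {
        Spliterator.OfInt sp = new Spliterators.AbstractIntSpliterator(
                Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
            @Override
            public boolean tryAdvance(IntConsumer action) {
                int ch;
                try { ch = reader.read(); }
                catch (IOException ex) { throw new UncheckedIOException(ex); }
                if (ch < 0) return false; // end of input: nothing passed to the action
                action.accept(ch);
                return true;
            }
        };
        return StreamSupport.intStream(sp, false);
    }

    // Arbitrary example pipeline: counts the chars that are letters.
    static long countLetters(Reader reader) {
        return chars(reader).filter(Character::isLetter).count();
    }

    public static void main(String[] args) {
        System.out.println(countLetters(new StringReader("Hello, World 42")));
    }
}
```

No lookahead field is needed here, because the end-of-input check and the delivery of the char happen in the same method call.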

However, note that if you change your mind and are willing to use Files.lines you can have a much easier life:

int[] tally = new int[26];
Files.lines(Paths.get(file))
  .flatMapToInt(CharSequence::chars)
  .map(Character::toLowerCase)
  .filter(c -> c>='a'&&c<='z')
  .map(c -> c-'a')
  .forEach(i -> tally[i]++);
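
One detail worth keeping in mind: the stream returned by Files.lines holds the file open until it is closed, so a try-with-resources block is the usual way to wrap it. The sketch below shows this with a temporary file; the class FileLetterTally, the method tally, and the sample text are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class FileLetterTally {
    // Counts occurrences of a-z (case-insensitively) in a UTF-8 text file.
    static int[] tally(Path file) throws IOException {
        int[] tally = new int[26];
        // try-with-resources closes the underlying file once the pipeline has run
        try (Stream<String> lines = Files.lines(file)) {
            lines.flatMapToInt(CharSequence::chars)
                 .map(Character::toLowerCase)
                 .filter(c -> c >= 'a' && c <= 'z')
                 .map(c -> c - 'a')
                 .forEach(i -> tally[i]++);
        }
        return tally;
    }

    public static void main(String[] args) throws IOException {
        // Temporary file used only so the example is runnable on its own
        Path tmp = Files.createTempFile("tally", ".txt");
        try {
            Files.write(tmp, "Hello\nWorld".getBytes(StandardCharsets.UTF_8));
            System.out.println(Arrays.toString(tally(tmp)));
        } finally {
            Files.delete(tmp);
        }
    }
}
```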

Holger answered Sep 21 '22 16:09