Lazily reading file paragraph by paragraph

Question

I've got some data stored in a file where each block of interest is stored in a paragraph like so:

hello
there

kind

people
of

stack
overflow

I have tried reading each paragraph with the following code, but it does not work:

paragraphs = File.open("hundreds_of_gigs").lazy.to_enum.grep(/.*\n\n/) do |p|
  puts p
end

With the regex I am trying to say: "match anything that ends with two newlines"

What am I doing wrong?

Any lazy way of solving this appreciated. The terser the method, the better.

dfherr · Accepted Answer

IO#readline("\n\n") will do what you want. File is a subclass of IO and has all its methods, even though they are not listed on the File rubydoc page.

It reads line by line, where a line ends at the given separator.
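If you want to check the inheritance claim yourself, a quick sketch like the one below should do. It only uses core reflection methods (superclass, instance_method, owner); the variable names are mine and purely illustrative.

```ruby
# File inherits from IO, so IO's instance methods are callable on File objects.
file_is_io      = File.superclass == IO
readline_owner  = File.instance_method(:readline).owner
each_line_owner = File.instance_method(:each_line).owner

puts file_is_io      # whether IO is File's direct superclass
puts readline_owner  # the class that actually defines readline
puts each_line_owner # the class that actually defines each_line
```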

E.g.:

f = File.open("your_file")
f.readline("\n\n") # => "hello\nthere\n\n"
f.readline("\n\n") # => "kind\n\n"
f.readline("\n\n") # => "people\nof\n\n"
f.readline("\n\n") # => "stack\noverflow\n\n"

Each call to readline lazily reads the next "line" (here, a paragraph), starting from the top of the file.

Or you can use IO#each_line("\n\n") to iterate over the file.
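One thing to watch for when calling readline in a loop: it raises EOFError once the file is exhausted, whereas IO#gets takes the same separator and returns nil instead. Below is a minimal sketch of a gets-based loop. The read_paragraphs helper and the sample text are my own illustration (the text mirrors the data in the question), and a Tempfile stands in for the real file.

```ruby
require "tempfile"

# Hypothetical helper: collects paragraphs from any IO-like object.
def read_paragraphs(io)
  paragraphs = []
  # gets returns nil at end of file, so the loop ends without an exception;
  # readline would raise EOFError at the same point.
  while (chunk = io.gets("\n\n"))
    paragraphs << chunk.sub(/\n+\z/, "") # drop the trailing newlines
  end
  paragraphs
end

Tempfile.create("paragraphs") do |f|
  # Assumed sample data, shaped like the file in the question.
  f.write("hello\nthere\n\nkind\n\npeople\nof\n\nstack\noverflow\n")
  f.rewind
  p read_paragraphs(f)

  begin
    f.readline("\n\n") # nothing left to read
  rescue EOFError
    puts "readline raised EOFError at end of file"
  end
end
```

Note that this collects everything into an array only to keep the demo short; for a huge file you would process each chunk inside the loop instead.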

E.g.:

File.open("your_file").each_line("\n\n") do |line|
  puts line
end

=> "hello\nthere\n\n"
=> "kind\n\n"
=> "people\nof\n\n"
=> "stack\noverflow\n\n"
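As for why the snippet in the question finds nothing: enumerating a File yields one "\n"-terminated line at a time by default, so no element ever contains two newlines for the regex to match. If you want a lazy chain (map, grep, first and so on) over paragraphs, something along these lines might work. The lazy_paragraphs helper is a name I made up, and the Tempfile just stands in for the real file.

```ruby
require "tempfile"

# Hypothetical helper: a lazy stream of paragraphs from the file at path.
def lazy_paragraphs(path)
  # Without a block, File.foreach returns an Enumerator; .lazy keeps map
  # from pulling more paragraphs than the consumer asks for.
  File.foreach(path, "\n\n").lazy.map { |para| para.sub(/\n+\z/, "") }
end

Tempfile.create("paragraphs") do |f|
  # Assumed sample data, shaped like the file in the question.
  f.write("hello\nthere\n\nkind\n\npeople\nof\n\nstack\noverflow\n")
  f.flush
  # Only the first two paragraphs are pulled through the pipeline here.
  p lazy_paragraphs(f.path).first(2)
end
```

If I remember the docs correctly, passing "" as the separator instead of "\n\n" switches to paragraph mode, which also swallows runs of more than one blank line; worth checking if your file is not perfectly regular.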

Tags: ruby, lazy-evaluation

The Unfun Cat

