
Bash: Split a file in linux in 10 pieces only by blank lines

I am currently working with some files that I need to parse with a Scala app. The problem is that the files are so large that parsing always ends with a heap-size exception (I've tried the largest heap size I can, and it still doesn't help).

Now, the files look like this:

This is
one paragraph
for Scala
to parse

This is
another paragraph
for Scala
to parse

Yet another
paragraph

And so on. Basically, I would like to take all these files and split each of them into 10 or 20 pieces, but I have to be sure that no paragraph is split in half in the results. Is there any way of doing this?

Thank you!

crscardellino asked Dec 16 '22 00:12



1 Answer

csplit file.txt '/^$/' '{*}'

csplit splits a file into pieces at each line that matches the specified pattern. By default, the pieces are written to files named xx00, xx01, and so on.

/^$/ matches empty lines.

{*} repeats the previous pattern as many times as possible (with GNU csplit), so the file is split at every empty line.
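
Note that the csplit command above produces one output file per paragraph, not a fixed number of pieces. If exactly 10 or 20 pieces are wanted, one possible approach is awk's paragraph mode (RS set to the empty string), which reads one blank-line-separated block at a time. The following is only a minimal sketch: the function name split_paragraphs and the part_ prefix are made up for illustration, and it balances the pieces by paragraph count, not by size in bytes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any standard tool):
#   split_paragraphs FILE [PIECES] [PREFIX]
# Writes PREFIX00, PREFIX01, ... and never cuts inside a paragraph.
split_paragraphs() {
  local file=$1 pieces=${2:-10} prefix=${3:-part_}
  local total

  # First pass: count paragraphs. RS="" makes awk treat each
  # blank-line-separated block as one record.
  total=$(awk 'BEGIN { RS = "" } END { print NR }' "$file")

  # Second pass: send paragraph NR to piece int((NR-1) * pieces / total).
  awk -v total="$total" -v pieces="$pieces" -v prefix="$prefix" '
    BEGIN {
      RS = ""; ORS = "\n\n"                # keep a blank line between paragraphs
      if (pieces > total) pieces = total   # no more pieces than paragraphs
    }
    {
      out = sprintf("%s%02d", prefix, int((NR - 1) * pieces / total))
      if (out != prev) {                   # moved on to the next piece
        if (prev != "") close(prev)
        prev = out
      }
      print > out
    }
  ' "$file"
}

# Example call: split_paragraphs file.txt 10 part_
```

Because awk holds only one paragraph in memory at a time, this should work on files that are too large for the Scala app's heap. Runs of several blank lines are collapsed to a single blank line in the output, and each piece ends with a blank line.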

Marco Roy answered Jan 13 '23 11:01
