I am parsing some files with a Scala app. The problem is that the files are so large that parsing always ends in a heap-space exception, even with the largest heap size I can set.
The files look like this:
This is
one paragraph
for Scala
to parse

This is
another paragraph
for Scala
to parse

Yet another
paragraph
And so on. Basically, I would like to take all these files and split each of them into 10 or 20 pieces, but I have to be sure that no paragraph is split in half in the results. Is there any way of doing this?
Thank you!
csplit file.txt '/^$/' '{*}'
csplit splits a file into pieces at each line that matches the specified pattern.
/^$/ matches empty lines.
{*} repeats the previous pattern as many times as possible.
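
Note that splitting at every empty line gives one output file per paragraph, not 10 or 20 pieces. If fewer, larger pieces are wanted, one option is awk's paragraph mode (an empty RS), which reads one whole paragraph per record. The following is only a minimal sketch, assuming paragraphs are separated by truly empty lines; the function name split_paragraphs, the "part" prefix and the figure of 1000 paragraphs per piece are made up for illustration and are not part of the original answer.

```shell
# Hypothetical helper: split FILE into pieces of at most PER_PIECE paragraphs.
# Assumes paragraphs are separated by one or more empty lines.
# Usage: split_paragraphs FILE PER_PIECE [PREFIX]
split_paragraphs() {
    file=$1
    per_piece=$2
    prefix=${3:-part}
    awk -v RS= -v ORS='\n\n' -v n="$per_piece" -v prefix="$prefix" '
        {
            # With an empty RS, each record is one paragraph, so NR counts paragraphs.
            name = sprintf("%s%03d.txt", prefix, int((NR - 1) / n))
            if (name != current) {
                # Close the finished piece so only one output file is open at a time.
                if (current != "") close(current)
                current = name
            }
            # Write the whole paragraph, followed by an empty line, to the current piece.
            print > current
        }
    ' "$file"
}

# Example (illustrative values): 1000 paragraphs per piece,
# written as part000.txt, part001.txt, ...
# split_paragraphs file.txt 1000
```

To aim for roughly 10 or 20 pieces, count the paragraphs first with awk -v RS= 'END { print NR }' file.txt and divide that number by the desired piece count to get the per-piece value. Since awk reads one paragraph at a time, the whole file never has to fit in memory.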