I have a large text file with content set up like this:
---
title: Lorim Ipsum Dolar
---
Lorim ipsum content
---
title: Excelvier whatever
---
Lorim ipsum content goes here.
I'm trying to split this file into individual files using csplit.
The individual files would have content formatted like this:
---
title: Lorim Ipsum Dolar
---
Lorim ipsum content
I was hoping to match the ---, the newline, and title with a regex like ---\ntitle
But I'm not able to select it with…
csplit -k products.txt '/---[^\n]title/' {99}
I've tried lots of variations to no avail. I keep getting "no match".
csplit reads the input file one line at a time and applies the regex to each line. It is therefore not possible to match a regex across multiple lines.
One way around this is to massage the input file first, replacing ---\ntitle: with a single-line pattern that csplit can match. For example, using sed:
sed 'N;s/---\ntitle: /===\n/' products.txt | csplit -k - '/===/' {*}
sed 'N;s/===\n/---\ntitle: /' -i xx*
This replaces ---\ntitle: with a single line ===, then has csplit split when it sees that pattern. Passing - as a file name tells csplit to read from stdin. The second sed command reverses the change.
You could use a regular expression that matches until the end of the line ($).
What do you think about:
csplit -k products.txt '/^title:/' {99}
Try using {*} instead of {99} to fix the "match not found" problem.
This might work for you:
csplit -z products.txt '/^title/-1' '{*}'
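As a rough check of the command above, the sketch below runs it on the sample from the question. It assumes GNU csplit (for -z and {*}); the scratch directory and the -s flag, which only silences the byte counts, are illustrative additions. The -1 offset moves each split point one line before the matching title line, so the preceding --- stays with its own record.

```shell
#!/bin/sh
# Work in a scratch directory so the xx* pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input copied from the question.
cat > products.txt <<'EOF'
---
title: Lorim Ipsum Dolar
---
Lorim ipsum content
---
title: Excelvier whatever
---
Lorim ipsum content goes here.
EOF

# Split one line before every line that starts with "title".
# -z removes the empty piece produced in front of the very first record;
# {*} repeats the pattern as many times as the input allows.
csplit -s -z products.txt '/^title/-1' '{*}'

# Print each piece with a separator so the boundaries are visible.
for piece in xx*; do
    echo "== $piece =="
    cat "$piece"
done
```

Each resulting file should hold exactly one record in the format shown in the question.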