Split text file in two using bash script

Tags: text, bash, split, sed, awk

I have a text file with a marker somewhere in the middle:

one
two
three
blah-blah *MARKER* blah-blah
four
five
six
...

I just need to split this file into two files, the first containing everything before MARKER and the second containing everything after MARKER. It seems it can be done in one line with awk or sed, I just can't figure out how.

I tried the easy way, using csplit, but csplit doesn't play well with Unicode text.

261

asked Sep 04 '10 22:09

Sergey Kovalev

4 Answers

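You can do it easily with awk (a multi-character RS needs GNU awk or mawk; POSIX awk only guarantees a single-character RS):

awk -v RS="MARKER" '{ print $0 > (NR ".txt") }' file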

144

answered Oct 01 '22 02:10

ghostdog74

Try this:

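awk '/MARKER/{n++}{print > ("out" n ".txt")}' final.txt

It reads input from final.txt and writes the lines before the marker to out.txt (n is still empty at that point), and the marker line plus everything after it to out1.txt. If the marker occurs again, output continues in out2.txt, and so on.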
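
If the goal is exactly two files with the marker line itself dropped, a portable variation might look like the sketch below. The file names sample.txt, before.txt and after.txt are placeholders chosen for illustration, not names from the question.

```shell
# Build a small sample file matching the one in the question.
printf '%s\n' one two three 'blah-blah *MARKER* blah-blah' four five six > sample.txt

# Lines before the first marker line go to before.txt, lines after it to after.txt.
# The first marker line is skipped; any later marker lines stay in after.txt.
awk '
    !found && /MARKER/ { found = 1; next }            # first marker line: switch files, drop the line
    { print > (found ? "after.txt" : "before.txt") }  # parentheses keep the redirect target portable
' sample.txt
```

The parenthesized redirect target avoids the ambiguity some awk implementations have with an unparenthesized concatenation after `>`.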

answered Oct 01 '22 01:10

Leniel Maccaferri

sed -n '/MARKER/q;p' inputfile > outputfile1
sed -n '/MARKER/{:a;n;p;ba}' inputfile > outputfile2

Or all in one:

sed -n -e '/MARKER/! w outputfile1' -e'/MARKER/{:a;n;w outputfile2' -e 'ba}' inputfile
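
The label-and-branch syntax above is the most fragile part across sed implementations. As a sketch under the assumption that the marker is not on the first line, the second half can also be taken with an address range. The file names sample.txt, part1.txt and part2.txt are placeholders.

```shell
# Build a small sample file matching the one in the question.
printf '%s\n' one two three 'blah-blah *MARKER* blah-blah' four five six > sample.txt

# Everything before the first marker line: quit at the marker without printing it.
sed -n '/MARKER/q;p' sample.txt > part1.txt

# Everything after the first marker line: delete from line 1 through the marker line.
# Caveat: a range starting at line 1 only looks for its end from line 2 onward,
# so a marker on the very first line would not end the range there.
sed '1,/MARKER/d' sample.txt > part2.txt
```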

answered Oct 01 '22 01:10

Dennis Williamson

The split command (the BSD/macOS version, which has the -p option) will almost do what you want:

$ split -p '\*MARKER\*' splitee
$ cat xaa
one
two
three
$ cat xab
blah-blah *MARKER* blah-blah
four
five
six
$ tail -n+2 xab
four
five
six

Perhaps it's close enough for your needs.

I have no idea if it does any better with Unicode than csplit, though.
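
Where split -p is not available, a similar result can probably be had by locating the marker's line number and cutting around it with head and tail. This is only a sketch: the output names part1 and part2 are made up for illustration, and it assumes the marker appears at least once.

```shell
# Build a small sample file matching the one in the question.
printf '%s\n' one two three 'blah-blah *MARKER* blah-blah' four five six > splitee

# -F treats the pattern as a fixed string, so the asterisks need no escaping;
# -n prefixes each matching line with its line number. Keep only the first match.
line=$(grep -n -F '*MARKER*' splitee | head -n 1 | cut -d: -f1)

# Stop early if the marker was not found.
[ -n "$line" ] || { echo "marker not found" >&2; exit 1; }

# Lines before the marker line, then lines after it.
head -n "$((line - 1))" splitee > part1
tail -n "+$((line + 1))" splitee > part2
```

Because head and tail cut on line boundaries only, the content of each line is copied through as-is.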

answered Oct 01 '22 01:10

Marcelo Cantos
