
Splitting one file into multiple files

Tags: bash, shell, sed, perl

I have a large file like the one below, and I want to split it into multiple files. Each output file should end right after an ENDMDL line. For the following file there should be three output files, named pose1.av, pose2.av and pose3.av.

MODEL        1
SML    170  O   PRO A  17      16.893   3.030   0.799  1.00  1.00           O
SML    171  OXT PRO A  17      18.167   2.722   2.597  1.00  1.00           O
TER     172      PRO A  17
ENDMDL
MODEL        2
SML      4  CG  ARG A   1      -2.171  -7.105  -4.278  1.00  1.00           C
SML      5  CD  ARG A   1      -1.851  -8.581  -4.022  1.00  1.00           C
SML    113  HD1 HIS A  12       2.465  -8.206   5.062  1.00  1.00           H
TER     114      HIS A  12
ENDMDL
MODEL        3
SML    101  N   HIS A  12       3.765  -3.995   7.233  1.00  1.00           N
SML    102  CA  HIS A  12       2.584  -4.736   6.934  1.00  1.00           C
TER     103      HIS A  12
ENDMDL
user_newbie asked Mar 27 '26 05:03


2 Answers

A rather efficient one, using bash and sed:

n=0
while IFS= read -r firstline; do
    { echo "$firstline"; sed '/^ENDMDL$/q'; } > "pose$((++n)).av"
done < file

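On each iteration, read consumes the first line of a block, and sed copies the remaining lines up to and including ENDMDL before quitting. This is much more efficient than a line-by-line Bash loop: each output file is opened only once, and most of the parsing is done by sed rather than by bash.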
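For comparison, the same "open each output file once" idea can be sketched as a single awk process. This is an alternative technique, not the loop above. The scratch directory and the shortened sample input are assumptions made so the example runs on its own.

```shell
# Work in a scratch directory (assumed; any empty directory would do).
workdir=$(mktemp -d)
cd "$workdir"

# Shortened stand-in for the real input file.
cat > file <<'EOF'
MODEL        1
SML    170  O   PRO A  17      16.893   3.030   0.799  1.00  1.00           O
TER     172      PRO A  17
ENDMDL
MODEL        2
SML      4  CG  ARG A   1      -2.171  -7.105  -4.278  1.00  1.00           C
TER     114      HIS A  12
ENDMDL
MODEL        3
SML    101  N   HIS A  12       3.765  -3.995   7.233  1.00  1.00           N
TER     103      HIS A  12
ENDMDL
EOF

# n counts completed blocks, so the current block is written to pose(n+1).av.
# After an ENDMDL line the current output file is closed and n is advanced.
awk '
    { out = sprintf("pose%d.av", n + 1); print > out }
    /^ENDMDL$/ { close(out); n++ }
' file

ls pose*.av
```

Closing each file as soon as its block ends keeps the number of open files at one, which matters when the input contains many models.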

gniourf_gniourf answered Mar 29 '26 19:03


csplit can do this out of the box:

csplit -z -s -f pose -b "%01d.av" file '/^ENDMDL$/+1' '{*}'
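One detail worth knowing: csplit numbers its output files from 0, so the command above yields pose0.av, pose1.av and pose2.av rather than pose1.av to pose3.av. A minimal sketch of shifting the names up by one follows. It assumes GNU csplit (the -z option and the '{*}' repeat count are GNU extensions), and the scratch directory and shortened sample input are assumptions for illustration.

```shell
# Work in a scratch directory (assumed; any empty directory would do).
workdir=$(mktemp -d)
cd "$workdir"

# Shortened stand-in for the real input file.
cat > file <<'EOF'
MODEL        1
TER     172      PRO A  17
ENDMDL
MODEL        2
TER     114      HIS A  12
ENDMDL
MODEL        3
TER     103      HIS A  12
ENDMDL
EOF

# Same command as in the answer; produces pose0.av, pose1.av, pose2.av.
csplit -z -s -f pose -b "%01d.av" file '/^ENDMDL$/+1' '{*}'

# Rename highest index first so no existing file is overwritten.
set -- pose*.av      # positional parameters now hold the pieces
i=$#                 # number of pieces
while [ "$i" -gt 0 ]; do
    mv "pose$((i - 1)).av" "pose$i.av"
    i=$((i - 1))
done

ls pose*.av
```

Renaming from the top down is the important part: going upward from pose0.av would overwrite pose1.av before it had been moved.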
iruvar answered Mar 29 '26 21:03