I have a large file, around 60GB.
I need to extract n lines from the middle of the file. I am currently using a command with head and tail like:
tail -m file |head -n >output.txt
where m and n are numbers.
The file is a set of records with comma-separated columns, structured as shown below. Lines vary in length (say, up to 5000 characters).
col1,col2,col3,col4...col10
Is there any other way to take n middle lines in less time? The current command takes a long time to execute.
With sed you can at least remove the pipeline:
sed -n '600000,700000p' file > output.txt
will print lines 600000 through 700000.
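On a file this large it may also help to make sed stop as soon as the last wanted line has been printed, since otherwise it keeps reading to the end of the file. A minimal sketch, using a throwaway file named sample.txt (an illustrative name) and small line numbers in place of the real ones:

```shell
# Build a small sample file (hypothetical name) so the command can be tried safely.
seq 1 100 > sample.txt

# Print lines 40 through 50, then quit at line 50 so the rest of the file is never read.
sed -n '40,50p;50q' sample.txt > output.txt
```

For the original range, the same pattern would be '600000,700000p;700000q'.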
awk 'FNR>=n && FNR<=m'
followed by the name of the file, with n and m replaced by the first and last line numbers you want.
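As a sketch of how the awk approach might look with the bounds passed in as variables: the names first and last and the file sample.txt are illustrative, not from the original answer. The extra exit rule is an addition here, so that awk stops reading once it is past the last wanted line.

```shell
# Build a small sample file (hypothetical name) to try the command on.
seq 1 100 > sample.txt

# Illustrative bounds; substitute the real first and last line numbers.
first=40
last=50

# Stop as soon as we are past the last wanted line; otherwise print lines from 'first' onward.
awk -v first="$first" -v last="$last" 'FNR > last { exit } FNR >= first' sample.txt > output.txt
```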
It might be more efficient to use the split utility, because with tail and head in a pipe you scan some parts of the file twice.
split -l <k> <file> <prefix>
where k is the number of lines you want in each output file, and the (optional) prefix is prepended to each output file name.
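A small illustration of what split produces, assuming a throwaway file sample.txt and the prefix part_ (both illustrative names). Note that split writes every chunk to disk, so on the real file it would copy all 60GB.

```shell
# Build a small sample file (hypothetical name) to try the command on.
seq 1 100 > sample.txt

# Cut it into pieces of 25 lines each: part_aa, part_ab, part_ac, part_ad.
split -l 25 sample.txt part_

# The second piece, part_ab, holds lines 26 through 50 of the original.
head -n 1 part_ab
```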