I have a file which is ~50,000 lines long, and I need to retrieve specific lines from it. I have tried the following command:
sed -n 'Np;Np;Np' inputFile.txt > outputFile.txt
('N' being the specific line numbers I want to extract)
This works fine, but the command extracts the lines in file order (i.e. it RE-ORDERS my selection). For example, if I try:
sed -n '200p;33p;40000p' inputFile.txt > outputFile.txt
I get a text file with the lines ordered as 33, 200, 40000 (which doesn't work for my purpose). Is there a way to maintain the order in which the lines appear in the command?
You have to hold on to line 33 until after you've seen line 200:
sed -n '33h; 200{p; g; p}; 40000p' file
See the manual for further explanation: https://www.gnu.org/software/sed/manual/html_node/Other-Commands.html
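Spelled out with comments (a sketch equivalent to the one-liner above; GNU sed accepts comment lines inside a script), the hold-space logic reads:

sed -n '
# stash line 33 in the hold space instead of printing it
33h
# at line 200: print it, then fetch the held line 33 and print that too
200{p; g; p}
# line 40000 already comes after the others, so just print it
40000p
' file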
awk might be more readable:
awk '
NR == 33 {line33 = $0}
NR == 200 {print; print line33}
NR == 40000 {print}
' file
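For example, against the seq 50000 test file used in the timings below, the requested order is preserved:

$ seq 50000 > inputFile.txt
$ awk 'NR == 33 {line33 = $0}
       NR == 200 {print; print line33}
       NR == 40000 {print}' inputFile.txt
200
33
40000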
If you have an arbitrary number of lines to print in a specific order, you can generalize this:
awk -v line_order="11 3 5 1" '
BEGIN {
n = split(line_order, inorder)
for (i=1; i<=n; i++) linenums[inorder[i]]
}
NR in linenums {cache[NR]=$0}
END {for (i=1; i<=n; i++) print cache[inorder[i]]}
' file
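For example, with a 20-line input the output follows the order given in line_order rather than the file order:

$ seq 20 | awk -v line_order="11 3 5 1" '
BEGIN {
n = split(line_order, inorder)
for (i=1; i<=n; i++) linenums[inorder[i]]
}
NR in linenums {cache[NR]=$0}
END {for (i=1; i<=n; i++) print cache[inorder[i]]}'
11
3
5
1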
With perl, save the input lines in a hash variable with the line number as key:
$ seq 12 20 | perl -nle '
@l = (5,2,3,1);
$a{$.} = $_ if( grep { $_ == $. } @l );
END { print $a{$_} foreach @l } '
16
13
14
12
$. is the line number, and grep { $_ == $. } @l checks whether that line number is present in the array @l, which contains the desired lines in the required order.
As a one-liner, with the @l declaration moved inside a BEGIN block to avoid re-initializing it on every iteration, and with a guard to ensure no blank lines are printed if a line number is out of range:
$ seq 50000 > inputFile.txt
$ perl -nle 'BEGIN{@l=(200,33,40000)} $a{$.}=$_ if(grep {$_ == $.} @l); END { $a{$_} and print $a{$_} foreach (@l) }' inputFile.txt > outputFile.txt
$ cat outputFile.txt
200
33
40000
For small enough input, you can save all the lines in an array and print the required indexes. Note the $l[0]=0 padding, which compensates for array indexes starting at 0:
$ seq 50000 | perl -e '$l[0]=0; push @l,<>; print @l[200,33,40000]'
200
33
40000
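The same whole-file-in-memory idea can be sketched in plain bash (assuming bash 4+ for mapfile; bash arrays are 0-indexed, so line i is element i-1):

$ mapfile -t lines < inputFile.txt   # read all lines into an array
$ for i in 200 33 40000; do printf '%s\n' "${lines[i-1]}"; done
200
33
40000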
Solution with a head and tail combo:
$ for i in 200 33 40000; do head -"${i}" inputFile.txt | tail -1 ; done
200
33
40000
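An alternative with the same shape, using the POSIX tail -n +N form (start output at line N) so the pipe does not have to carry the first i lines:

$ for i in 200 33 40000; do tail -n "+${i}" inputFile.txt | head -1 ; done
200
33
40000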
Performance comparison, using an input file created with seq 50000 > inputFile.txt:
$ time perl -nle 'BEGIN{@l=(200,33,40000)} $a{$.}=$_ if(grep {$_ == $.} @l); END { $a{$_} and print $a{$_} foreach (@l) }' inputFile.txt > outputFile.txt
real 0m0.044s
user 0m0.036s
sys 0m0.000s
$ time awk -v line_order="200 33 40000" '
BEGIN {
n = split(line_order, inorder)
for (i=1; i<=n; i++) linenums[inorder[i]]
}
NR in linenums {cache[NR]=$0}
END {for (i=1; i<=n; i++) print cache[inorder[i]]}
' inputFile.txt > outputFile.txt
real 0m0.019s
user 0m0.016s
sys 0m0.000s
$ time for i in 200 33 40000; do sed -n "${i}{p;q}" inputFile.txt ; done > outputFile.txt
real 0m0.011s
user 0m0.004s
sys 0m0.000s
$ time sed -n '33h; 200{p; g; p}; 40000p' inputFile.txt > outputFile.txt
real 0m0.009s
user 0m0.008s
sys 0m0.000s
$ time for i in 200 33 40000; do head -"${i}" inputFile.txt | tail -1 ; done > outputFile.txt
real 0m0.007s
user 0m0.000s
sys 0m0.000s
Can you also use other bash commands? In that case this works:
for i in 200 33 40000; do
sed -n "${i}p" inputFile.txt
done > outputFile.txt
This is probably slower than handling everything in a single sed invocation, since the file is read once per requested line, but it is more practical.
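If you stay with the loop, letting sed quit as soon as it has printed (the "${i}{p;q}" form already used in the timing runs above) avoids scanning the rest of the file on every pass:

for i in 200 33 40000; do
    sed -n "${i}{p;q}" inputFile.txt
done > outputFile.txt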