
How to delete all lines before the first and after the last occurrence of a string?

Tags: regex, grep, sed, awk

cat grab.txt

My Dashboard
Fnfjfjf. random test
00:50

1:01:56
My Notes
No data found.

                                
Change Language                                                                                                                  + English                                                          

Submit


Estimation of Working Capital Lecture 1

Estimation of Working Capital Lecture 2

Estimation of Working Capital Lecture 3

Money Market Lecture 254

Money Market Lecture 255

Money Market Lecture 256

International Trade Lecture 257

International Trade Lecture 258

International Trade Lecture 259
Terms And Conditions
84749473837373
Random text fifjfofifofjfkfkf

I need to filter this text by doing the following:

  1. Delete all lines before the first occurrence of the word Lecture
  2. Delete all lines after the last occurrence of the word Lecture
  3. Remove all empty lines

Expected output

Estimation of Working Capital Lecture 1
Estimation of Working Capital Lecture 2
Estimation of Working Capital Lecture 3
Money Market Lecture 254
Money Market Lecture 255
Money Market Lecture 256
International Trade Lecture 257
International Trade Lecture 258
International Trade Lecture 259

What have I tried so far?

cat grab.txt | sed -r '/^\s*$/d; /Lecture/,$!d'

After some searching and trial and error, I can remove the empty lines and all lines before the first occurrence, but I cannot remove the lines after the last occurrence.

Note: even though my existing command uses sed, it's fine if the answer uses awk, perl or grep.

Sachin asked Jun 21 '20 02:06



1 Answer

Could you please try the following? Written and tested on the shown samples with GNU awk.

awk '
/Lecture/{
  found=1
}
found && NF{
  val=(val?val ORS:"")$0
}
END{
  if(val){
    match(val,/.*Lecture [0-9]+/)
    print substr(val,RSTART,RLENGTH)
  }
}'  Input_file

Explanation: a line-by-line explanation of the above.

awk '                                        ##Start the awk program.
/Lecture/{                                   ##If the line contains the keyword Lecture, do the following.
  found=1                                    ##Set found to 1.
}
found && NF{                                 ##If found is set and the line has at least one field (is not blank), do the following.
  val=(val?val ORS:"")$0                     ##Append the current line to val, separating lines with ORS.
}
END{                                         ##Start the END block, which runs after all input is read.
  if(val){                                   ##If val is set, do the following.
    match(val,/.*Lecture [0-9]+/)            ##Greedily match from the start of val through the last "Lecture <digits>".
    print substr(val,RSTART,RLENGTH)         ##Print only the matched part of val.
  }
}' Input_file                                ##Name of the input file.
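
The program above collects every kept line in val before printing. If that is a concern for large files, a two-pass variant may be worth trying: read the file once to record the line numbers of the first and last match, then read it again and print only the non-blank lines in that range. This is only a sketch and not the method shown above; the function name trim_to_lectures and the word variable are made up for illustration, and it assumes the input is a regular file that can be read twice (not a pipe).

```shell
# Hypothetical helper: trim_to_lectures FILE [WORD]
# Prints the non-blank lines from the first line containing WORD
# through the last line containing WORD (WORD defaults to "Lecture").
trim_to_lectures() {
  awk -v word="${2:-Lecture}" '
    # First pass (NR == FNR only while reading the first file argument):
    # remember the line numbers of the first and last matching lines.
    NR == FNR {
      if (index($0, word)) {
        if (!first) first = FNR
        last = FNR
      }
      next
    }
    # Second pass: print lines inside [first, last] that have at least one field.
    FNR >= first && FNR <= last && NF
  ' "$1" "$1"
}
```

Calling it as trim_to_lectures grab.txt should give the same nine lines as the expected output in the question. Because index() does a plain substring search, the last line only needs to contain the word; it does not have to be followed by digits.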
RavinderSingh13 answered Nov 12 '22 12:11