
Save modifications in place with NON GNU awk

I came across a question (on SO itself) where the OP needs to edit Input_file(s) and save the changes back into the same file(s).

I know that for a single Input_file we could do the following:

awk '{print "test here..new line for saving.."}' Input_file > temp && mv temp Input_file

Now let's say we need to make the same kind of change to several files of one type (assume .txt here).

What I have tried/thought for this problem: looping over the .txt files and calling awk once per file is painful and NOT recommended, since it starts a separate awk process for every file and gets slower as the number of files grows.
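For comparison, the per-file loop described above might look like the sketch below. It is only a baseline, not the approach this thread is after: the scratch directory, the sample files and the temporary name `$f.tmp.$$` are assumptions made for illustration.

```shell
# Scratch directory holding copies of the sample files from the question.
work=$(mktemp -d) && cd "$work" || exit 1
for n in 1 2 3; do
  printf 'onetwo three\ntets testtest\n' > "test$n.txt"
done

# Baseline: one awk process (and one temporary file) per input file.
for f in *.txt; do
  [ -f "$f" ] || continue            # the glob matched nothing
  tmp="$f.tmp.$$"                    # hypothetical temporary name
  if awk '{ print FNR }' "$f" > "$tmp"; then
    mv "$tmp" "$f"                   # replace the original only on success
  else
    rm -f "$tmp"
  fi
done

cat test1.txt
```

Each .txt file should then hold its own line numbers, but awk is started once per file, which is the cost the question wants to avoid.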

So how can we perform an in-place edit of multiple files with a NON GNU awk, which does not support the inplace option? I have also gone through the thread "Save modifications in place with awk", but it has little on NON GNU awk or on changing multiple files in place from within awk itself.

NOTE: I am adding the bash tag because my answer uses shell commands to rename the temporary files to their actual Input_file names.



EDIT: As per Ed sir's comment, adding sample input and expected output here, though the code in this thread could be used for generic in-place editing too.

Sample Input_file(s):

cat test1.txt
onetwo three
tets testtest

cat test2.txt
onetwo three
tets testtest

cat test3.txt
onetwo three
tets testtest

Sample of expected output:

cat test1.txt
1
2

cat test2.txt
1
2

cat test3.txt
1
2

RavinderSingh13 asked Feb 04 '23 16:02


1 Answer

Since the main aim of this thread is how to do an in-place SAVE in NON GNU awk, I am posting its template first, which should help with any kind of requirement: add the FNR==1 block and the END block to your code, keep your main BLOCK as per your requirement, and the files are then edited in place:

NOTE: The following writes all of its output to the temporary out file, so if you want to print anything to standard output, use a print... statement without > (out).

Generic Template:

awk -v out_file="out" '
FNR==1{
  close(out)
  out=out_file count++
  rename=(rename?rename ORS:"") "mv \047" out "\047 \047" FILENAME "\047"
}
{
  .....your main block code.....
}
END{
  if(rename){
    system(rename)
  }
}
' *.txt
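The template wraps each name in single quotes (\047), so a file name that itself contains a single quote would break the generated mv command. A possible variation, sketched below, quotes the names through a small helper. The function name `shquote`, the scratch directory and the sample file names are assumptions for illustration, not part of the original answer.

```shell
# Scratch directory with one awkward file name and one ordinary one.
work=$(mktemp -d) && cd "$work" || exit 1
printf 'onetwo three\ntets testtest\n' > "it's a test.txt"
printf 'onetwo three\ntets testtest\n' > "test1.txt"

awk -v out_file="out" '
# Hypothetical helper: wrap s in single quotes for sh. Each embedded
# single quote closes the quoting, adds an escaped quote, and reopens it.
function shquote(s,    n, parts, i, q) {
  n = split(s, parts, "\047")
  q = "\047" parts[1]
  for (i = 2; i <= n; i++)
    q = q "\047\\\047\047" parts[i]
  return q "\047"
}
FNR==1 {
  close(out)
  out = out_file count++
  rename = (rename ? rename ORS : "") "mv " shquote(out) " " shquote(FILENAME)
}
{ print FNR > (out) }
END { if (rename) system(rename) }
' *.txt

cat "it's a test.txt"
```

The helper splits on the quote character and rejoins the pieces instead of using gsub, which sidesteps the differences in how awk implementations treat backslashes in a replacement string.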


Specific provided sample's solution:

I have come up with the following approach within awk itself; for the samples above, it solves the problem and saves the output into the Input_file(s) themselves:

awk -v out_file="out" '
FNR==1{
  close(out)
  out=out_file count++
  rename=(rename?rename ORS:"") "mv \047" out "\047 \047" FILENAME "\047"
}
{
  print FNR > (out)
}
END{
  if(rename){
    system(rename)
  }
}
' *.txt

NOTE: this is only a test of saving edited output into the Input_file(s) themselves; one could use its FNR==1 block along with its END block in their own program, and the main block should follow the requirement of the specific question.

Fair warning: since this approach creates a new temporary out file for every Input_file in the current path, make sure there is enough free space on the system. Only the main Input_file(s) remain at the end, but during the run the temporary files need extra space in the directory.
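If the extra space is a concern, one possible variation is to rename each temporary file as soon as its Input_file has been fully read, so that only one temporary file exists at any time. The sketch below assumes this is acceptable for your use case; the function name `finish`, the variable `prev` and the scratch directory are hypothetical names introduced here.

```shell
# Scratch directory holding copies of the sample files from the question.
work=$(mktemp -d) && cd "$work" || exit 1
for n in 1 2 3; do
  printf 'onetwo three\ntets testtest\n' > "test$n.txt"
done

awk -v out_file="out" '
# Move the finished temporary file over the input it was built from.
function finish() {
  if (prev != "") {
    close(out)                                   # flush before renaming
    system("mv \047" out "\047 \047" prev "\047")
  }
}
FNR==1 {
  finish()                 # the previous Input_file is complete
  out  = out_file count++
  prev = FILENAME
}
{ print FNR > (out) }
END { finish() }           # handle the last Input_file
' *.txt

ls
```

The trade-off is one mv process per file during the run instead of a single system call at the end; awk itself is still started only once.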



Following is a test of the above code.

Running the program on an example: let's assume the following are the .txt Input_file(s):

cat << EOF > test1.txt
onetwo three
tets testtest
EOF

cat << EOF > test2.txt
onetwo three
tets testtest
EOF

cat << EOF > test3.txt
onetwo three
tets testtest
EOF

Now when we run the following code:

awk -v out_file="out" '
FNR==1{
  close(out)
  out=out_file count++
  rename=(rename?rename ORS:"") "mv \047" out "\047 \047" FILENAME "\047"
}
{
  print "new_lines_here...." > (out)
}
END{
  if(rename){
    system("ls -lhtr;" rename)
  }
}
' *.txt

NOTE: I have placed ls -lhtr in the system section intentionally, to see which temporary output files it creates, because it later renames them to their actual names.

-rw-r--r-- 1 runner runner  27 Dec  9 05:33 test2.txt
-rw-r--r-- 1 runner runner  27 Dec  9 05:33 test1.txt
-rw-r--r-- 1 runner runner  27 Dec  9 05:33 test3.txt
-rw-r--r-- 1 runner runner  38 Dec  9 05:33 out2
-rw-r--r-- 1 runner runner  38 Dec  9 05:33 out1
-rw-r--r-- 1 runner runner  38 Dec  9 05:33 out0

When we run ls -lhtr after the awk script has finished, we can see only the .txt files there.

-rw-r--r-- 1 runner runner  27 Dec  9 05:33 test2.txt
-rw-r--r-- 1 runner runner  27 Dec  9 05:33 test1.txt
-rw-r--r-- 1 runner runner  27 Dec  9 05:33 test3.txt


Explanation: a detailed explanation of the above command:

awk -v out_file="out" '                                    ##Start the awk program and create a variable named out_file, whose value SHOULD BE a name prefix that matches NO file in the current directory. Temporary files are created under this name and later renamed to the actual files.
FNR==1{                                                    ##If this is the very first line of the current Input_file, do the following.
  close(out)                                               ##Close the previous temporary file; output goes to one temp file per Input_file, so closing avoids a "too many open files" error.
  out=out_file count++                                     ##Set out to the value of out_file (defined in the awk -v section) followed by count, which is incremented by 1 each time this block runs.
  rename=(rename?rename ORS:"") "mv \047" out "\047 \047" FILENAME "\047"     ##Append an mv command to the rename variable (\047 is a single quote); the collected commands are executed in the END section, once all Input_file(s) are processed.
}                                                          ##Close the BLOCK for the FNR==1 condition.
{                                                          ##Start the main BLOCK.
  print "new_lines_here...." > (out)                       ##Print to the out file in this example.
}                                                          ##Close the main BLOCK.
END{                                                       ##Start the END block of this program.
  if(rename){                                              ##If the rename variable is NOT NULL, do the following.
    system(rename)                                         ##Use system to run the contents of the rename variable, which executes the mv commands that rename out0, out1 etc. to the Input_file names.
  }
}                                                          ##Close the END block of this program.
' *.txt                                                    ##Pass the Input_file(s) by their extension here.
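The explanation notes that out_file should not match any file already in the directory. One way to make a clash unlikely, shown here only as a sketch, is to build the prefix from the shell's process ID and refuse to run if anything already matches it. The prefix `.inplace.$$.` and the scratch directory are assumptions for illustration.

```shell
# Scratch directory holding copies of the sample files from the question.
work=$(mktemp -d) && cd "$work" || exit 1
for n in 1 2 3; do
  printf 'onetwo three\ntets testtest\n' > "test$n.txt"
done

prefix=".inplace.$$."                # hypothetical prefix; $$ is the shell PID
for existing in "$prefix"*; do
  if [ -e "$existing" ]; then        # false when the glob matched nothing
    echo "refusing to run: $existing already exists" >&2
    exit 1
  fi
done

awk -v out_file="$prefix" '
FNR==1 {
  close(out)
  out = out_file count++
  rename = (rename ? rename ORS : "") "mv \047" out "\047 \047" FILENAME "\047"
}
{ print "new_lines_here...." > (out) }
END { if (rename) system(rename) }
' *.txt

cat test1.txt
```

The awk program is the same as in the answer; only the value passed through -v changes.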

RavinderSingh13 answered Feb 06 '23 15:02