
Replacing a String Pattern with another sequence in unix

Tags: unix, awk

I want to replace the string TaskID_1 with the first value of a sequence starting at 1001. TaskID_1 can appear on any number of lines in my input file. Similarly, I need to replace all occurrences of TaskID_2 in my input file with the next sequence value, 1002.

Input file:

12345|45345|TaskID_1|dksj|kdjfdsjf|12
1245|425345|TaskID_1|dksj|kdjfdsjf|12
1234|25345|TaskID_2|dksj|kdjfdsjf|12
123425|65345|TaskID_2|dksj|kdjfdsjf|12
123425|15325|TaskID_1|dksj|kdjfdsjf|12
11345|55315|TaskID_2|dksj|kdjfdsjf|12
6345|15345|TaskID_3|dksj|kdjfdsjf|12
72345|25345|TaskID_4|dksj|kdjfdsjf|12
9345|411345|TaskID_3|dksj|kdjfdsjf|12

The output file should look like:

12345|45345|1001|dksj|kdjfdsjf|12
1245|425345|1001|dksj|kdjfdsjf|12
1234|25345|1002|dksj|kdjfdsjf|12
123425|65345|1002|dksj|kdjfdsjf|12
123425|15325|1001|dksj|kdjfdsjf|12
11345|55315|1002|dksj|kdjfdsjf|12
6345|15345|1003|dksj|kdjfdsjf|12
72345|25345|1004|dksj|kdjfdsjf|12
9345|411345|1003|dksj|kdjfdsjf|12
Asked by Ramkumar, Dec 19 '12 12:12



1 Answer

Here's one way using awk:

awk 'BEGIN { FS=OFS="|" } { $3=1000 + NR }1' file

Or less verbosely:

awk -F '|' '{ $3=1000 + NR }1' OFS='|' file

Results:

12345|45345|1001|dksj|kdjfdsjf|12
1245|425345|1002|dksj|kdjfdsjf|12
1234|25345|1003|dksj|kdjfdsjf|12
123425|65345|1004|dksj|kdjfdsjf|12
123425|15325|1005|dksj|kdjfdsjf|12
11345|55315|1006|dksj|kdjfdsjf|12
6345|15345|1007|dksj|kdjfdsjf|12
72345|25345|1008|dksj|kdjfdsjf|12
9345|411345|1009|dksj|kdjfdsjf|12

In the first example, the input field separator (FS) and output field separator (OFS) are both set to a single pipe character. This is done in the BEGIN block so that it runs only once rather than on every line of input. We then set the third field to 1000 plus an incrementing number. We could use a counter such as ++i for this, but NR (the number of the current record, i.e. the line number) already provides it and avoids creating an extra variable. The trailing 1 is a pattern that is always true; with no action attached, awk applies its default action, which is to print the record. A more verbose solution would look like:

awk 'BEGIN { FS=OFS="|" } { $3=1000 + NR; print }' file
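For comparison, the ++i alternative mentioned above might look like the sketch below. The wrapper function name number_with_counter and the two sample lines are made up for illustration; only the awk program itself mirrors the answer.

```shell
# Hypothetical wrapper: number field 3 with an explicit counter instead of NR.
number_with_counter() {
    # i starts out empty (treated as 0), so the first record gets 1001.
    awk 'BEGIN { FS = OFS = "|" } { $3 = 1000 + (++i) } 1' "$@"
}

# Illustrative input, not taken from the question.
printf '%s\n' 'a|b|TaskID_1|c' 'd|e|TaskID_2|f' | number_with_counter
# prints:
# a|b|1001|c
# d|e|1002|f
```

For a single input file this behaves the same as the NR version; the counter is simply spelled out.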

EDIT:

Using the updated data file, where the number after the underscore already gives the position in the sequence, try:

awk 'BEGIN { FS=OFS="|" } { sub(/.*_/,"",$3); $3+=1000 }1' file

Results:

12345|45345|1001|dksj|kdjfdsjf|12
1245|425345|1001|dksj|kdjfdsjf|12
1234|25345|1002|dksj|kdjfdsjf|12
123425|65345|1002|dksj|kdjfdsjf|12
123425|15325|1001|dksj|kdjfdsjf|12
11345|55315|1002|dksj|kdjfdsjf|12
6345|15345|1003|dksj|kdjfdsjf|12
72345|25345|1004|dksj|kdjfdsjf|12
9345|411345|1003|dksj|kdjfdsjf|12
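The sub() command above relies on the numeric suffix of each ID matching the desired offset (TaskID_1 becomes 1001). If the IDs were arbitrary strings, one possible approach would be to hand out numbers in order of first appearance using an awk array. The following is a minimal sketch under that assumption; the function name renumber_tasks and the variables id and n are illustrative and not part of the original answer.

```shell
# Hypothetical helper: map each distinct value of field 3 to 1001, 1002, ...
# in order of first appearance, then substitute it.
renumber_tasks() {
    awk 'BEGIN { FS = OFS = "|" }
         !($3 in id) { id[$3] = 1000 + (++n) }  # first sighting: assign next number
         { $3 = id[$3] }                        # replace the ID with its number
         1' "$@"                                # always-true pattern: print the record
}

# Illustrative input with non-numeric IDs (an assumption, not from the question).
printf '%s\n' 'x|y|alpha|z' 'x|y|beta|z' 'x|y|alpha|z' | renumber_tasks
# prints:
# x|y|1001|z
# x|y|1002|z
# x|y|1001|z
```

Run as renumber_tasks file on the question's input, this should give the same result as the sub() version, because TaskID_1 through TaskID_4 first appear in ascending order.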
Answered by Steve, Nov 10 '22 00:11