I have 2 text files. File1 has about 1,000 lines and File2 has 20,000 lines. An extract of File1 is as follows:
Thrust
Alien Breed Special Edition '92
amidar
mario
mspacman
Bubble Bobble (Japan)
An extract of File2 is as follows:
005;005;Arcade-Vertical;;;;;;;;;;;;;;
Alien Breed Special Edition '92;Alien Breed Special Edition '92;Amiga;;1992;Team 17;Action / Shooter;;;;;;;;;;
Alien 8 (Japan);Alien 8 (Japan);msx;;1987;Nippon Dexter Co., Ltd.;Action;1;;;;;;;;;
amidar;amidar;Arcade-Vertical;;;;;;;;;;;;;;
Bubble Bobble (Japan);Bubble Bobble (Japan);msx2;;;;;;;;;;;;;;
Buffy the Vampire Slayer - Wrath of the Darkhul King (USA, Europe);Buffy the Vampire Slayer - Wrath of the Darkhul King (USA, Europe);Nintendo Game Boy Advance;;2003;THQ;Action;;;;;;;;;;
mario;mario;FBA;;;;;;;;;;;;;;
mspacman;mspacman;Arcade-Vertical;;;;;;;;;;;;;;
Thrust;Thrust;BBC Micro;;;;;;;;;;;;;;
Thunder Blade (1988)(U.S. Gold)[128K];Thunder Blade (1988)(U.S. Gold)[128K];ZX Spectrum;;;;;;;;;;;;;;
Thunder Mario v0.1 (SMB1 Hack);Thunder Mario v0.1 (SMB1 Hack);Nintendo NES Hacks 2;;;;;;;;;;;;;;
In File3 (the output file), using grep, sed, awk or a bash script, I would like to achieve the following output:
Thrust;Thrust;BBC Micro;;;;;;;;;;;;;;
Alien Breed Special Edition '92;Alien Breed Special Edition '92;Amiga;;1992;Team 17;Action / Shooter;;;;;;;;;;
amidar;amidar;Arcade-Vertical;;;;;;;;;;;;;;
mario;mario;FBA;;;;;;;;;;;;;;
mspacman;mspacman;Arcade-Vertical;;;;;;;;;;;;;;
Bubble Bobble (Japan);Bubble Bobble (Japan);msx2;;;;;;;;;;;;;;
When I used Grep, for example, to produce File3, I found that it automatically sorts the file contents. I would like to keep the same order as for File1.
An example of code I have used which winds up sorting File3 (which I don't want) is as follows:
grep -F -w -f /home/pi/.attract/stats/File1.txt /home/pi/.attract/stats/File2.txt > /home/pi/.attract/stats/File3.txt
the grep Command in Bash For searching a specific keyword, phrase, or pattern in a file, there is a specialized command in a Bash script, grep . We can use this command to display the lines before and after the keyword matched in a specific file. This command uses flags like -A , -B , and -C .
Grep, sed, and AWK are all standard Linux tools that are able to process text. Each of these tools can read text files line-by-line and use regular expressions to perform operations on specific parts of the file. However, each tool differs in complexity and what can be accomplished.
Tools like sed (stream editor) and grep (global regular expression print) are powerful ways to save time and make your work faster. Before diving deep into the use cases, I would like to briefly explain regular expressions (regexes), which are necessary for the text manipulation that we will do later.
awk – this command is a useful and powerful command used for pattern matching as well as for text processing. This command will display only the third column from the long listing of files and directories. sed – this is a powerful command for editing a 'stream' of text.
Using awk:
$ awk -F\; 'NR==FNR{a[$1]=$0;next}$1 in a{print a[$1]}' file2 file1
Output:
Thrust;Thrust;BBC Micro;;;;;;;;;;;;;;
Alien Breed Special Edition '92;Alien Breed Special Edition '92;Amiga;;1992;Team 17;Action / Shooter;;;;;;;;;;
amidar;amidar;Arcade-Vertical;;;;;;;;;;;;;;
mario;mario;FBA;;;;;;;;;;;;;;
mspacman;mspacman;Arcade-Vertical;;;;;;;;;;;;;;
Bubble Bobble (Japan);Bubble Bobble (Japan);msx2;;;;;;;;;;;;;;
Explained:
awk -F\; '
NR==FNR { # process file2
a[$1]=$0 # hash record to a, use $1 as key
next # process next record
}
($1 in a) { # if file1 entry is found in hash a
print a[$1] # output it
}' file2 file1 # mind the order. this way file1 dictates the output order
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With