I'm trying to remove duplicate lines from a file and update the file in place. Redirecting the output straight back to the same file does not work:
awk '!seen[$0]++' .gitignore > .gitignore
For some reason I have to write the output to a new file and then replace the original:
awk '!seen[$0]++' .gitignore > .gitignore_new && mv .gitignore_new .gitignore
Is this the only way?
Redirecting to the same output file as input file like:
awk '!seen[$0]++' .gitignore > .gitignore
will leave you with an empty file. With the > operator, the shell opens and truncates the file before the command is executed, so all your data is gone before awk reads a single line.
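The truncation is easy to reproduce safely. The following is a minimal sketch that works on a throwaway file in a scratch directory; the names demo_dir, sample.txt and truncated_size are made up for the illustration.

```shell
# Work in a scratch directory so no real file is touched.
demo_dir=$(mktemp -d)
printf 'a\nb\na\n' > "$demo_dir/sample.txt"

# The shell truncates sample.txt before awk starts, so awk reads nothing.
awk '!seen[$0]++' "$demo_dir/sample.txt" > "$demo_dir/sample.txt"

# Count the bytes left in the file after the command.
truncated_size=$(wc -c < "$demo_dir/sample.txt" | tr -d ' ')
echo "bytes left: $truncated_size"
```

The file had six bytes before the command and none afterwards, even though awk itself ran without error.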
With newer versions of GNU awk (4.1.0 and later) you can use the -i inplace option to edit the file in place:
awk -i inplace '!seen[$0]++' .gitignore
If you don't have a recent version of GNU awk, you'll need to create a temporary file:
awk '!seen[$0]++' .gitignore > .gitignore.tmp
mv .gitignore.tmp .gitignore
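If you go the temporary-file route often, it may be worth wrapping it in a small function. This is only a sketch: dedupe_in_place is a hypothetical helper name, and it assumes mktemp is available. It creates the temporary file next to the target so that mv does not have to cross filesystems, and it only replaces the original if awk succeeded.

```shell
# dedupe_in_place is a hypothetical helper name, not a standard command.
dedupe_in_place() {
    target=$1
    # Create the temp file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "$(dirname "$target")/dedupe.XXXXXX") || return 1
    if awk '!seen[$0]++' "$target" > "$tmp"; then
        # Only replace the original once awk has finished successfully.
        mv "$tmp" "$target"
    else
        # Leave the original untouched and clean up on failure.
        rm -f "$tmp"
        return 1
    fi
}

# Usage: dedupe_in_place .gitignore
```

One caveat: the file created by mktemp has restrictive permissions, and the replaced file inherits them, so you may want to restore the original mode afterwards.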
Another alternative is to use the sponge program from moreutils:
awk '!seen[$0]++' .gitignore | sponge .gitignore
sponge soaks up all of its standard input and only opens the output file after that. This effectively keeps the input file intact until awk has finished reading it.
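If moreutils is not installed, the same idea can be approximated in plain shell. The sketch below is not how sponge is implemented; soak is a hypothetical function name, and it simply buffers standard input in a temporary file before touching the target.

```shell
# soak is a hypothetical stand-in for sponge, not part of moreutils.
soak() {
    buffer=$(mktemp) || return 1
    # Read all of standard input until EOF; the target is not opened yet.
    cat > "$buffer"
    # Only now open (and truncate) the target, then write the buffered data.
    cat "$buffer" > "$1"
    rm -f "$buffer"
}

# Usage: awk '!seen[$0]++' .gitignore | soak .gitignore
```

Because the target is overwritten rather than replaced, it keeps its inode and permissions, but the write is not atomic: an interruption during the final cat can leave a partial file.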
Thomas, I believe the problem is that you are reading from the file and writing to it in the same command. This is why you must write to a temporary file first.
The > does overwrite, so you are using the correct redirect operator.
> - Redirect output from a command to a file on disk. Note: if the file already exists, it will be erased and overwritten without warning, so be careful.
Example: ps -ax > processes.txt
This uses the ps command to get a list of processes running on the system and stores the output in a file named processes.txt.