
Shell script - search and replace text in multiple files using a list of strings

Tags: replace, unix

I have a file "changesDictionary.txt" containing (a variable number of) pairs of key-value strings.

e.g.

"textToSearchFor" = "theReplacementText"

(The format of the dictionary is unimportant, and can be changed as required.)

I need to iterate through the contents of a given directory, including sub-directories. For each file encountered with the extension ".txt", we search for each of the keys in changesDictionary.txt, replacing each found instance with the replacement string value.

i.e. a search and replace over multiple files, but using a list of search/replace terms rather than a single search/replace term.

How could I do this? (I have studied single search/replace examples, but do not understand how to do multiple searches within a file.)

The implementation (bash, perl, whatever) is not important as long as I can run it from the command line in Mac OS X. Thanks for any help.

SirRatty asked Mar 16 '09 00:03


1 Answer

I'd convert your changesDictionary.txt file to a sed script, with... sed:

$ sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/' \
      changesDictionary.txt  > changesDictionary.sed

Note that any characters in your dictionary that are special in regular expressions or in sed's s command (such as ., *, [, \, / and &) will be interpreted by sed rather than matched literally. So either your dictionary must contain only the simplest search-and-replace pairs, or you'll need to maintain the sed file by hand with valid expressions. Unfortunately, sed has no easy way to turn off regular expressions and match plain strings, or to quote your searches and replacements as "literals".
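If maintaining the sed file by hand is not appealing, the escaping can be scripted. The following is a minimal sketch, not a definitive implementation: dict_to_sed is a hypothetical helper name, and it assumes one "key" = "value" pair per line with no '" = "' sequence inside a key. It backslash-escapes the characters that are special in a basic regular expression (for the key) and in the replacement part of s/// (for the value).

```shell
# dict_to_sed DICT
# Hypothetical helper: prints one literal s/key/value/g command for each
# '"key" = "value"' line of DICT. Lines that do not match are skipped.
dict_to_sed() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Split the line into its key and value parts.
        key=$(printf '%s\n' "$line" | sed -n 's|^"\(.*\)" = ".*"$|\1|p')
        val=$(printf '%s\n' "$line" | sed -n 's|^".*" = "\(.*\)"$|\1|p')
        [ -n "$key" ] || continue

        # Key: escape [ \ / . * ^ $ so the regex matches them literally.
        key=$(printf '%s\n' "$key" | sed 's|[[\\/.*^$]|\\&|g')
        # Value: escape \ / & which are special in the replacement text.
        val=$(printf '%s\n' "$val" | sed 's|[\\/&]|\\&|g')

        printf 's/%s/%s/g\n' "$key" "$val"
    done < "$1"
}
```

It would be used as `dict_to_sed changesDictionary.txt > changesDictionary.sed`. The sketch runs sed several times per dictionary line, which is slow for very large dictionaries but keeps each step easy to check.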

With the resulting sed script, use find and xargs -- rather than find -exec -- to convert your files with the sed script as quickly as possible, by processing them more than one at a time.

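$ find somedir -type f -name '*.txt' -print0 \
   | xargs -0 sed -i~ -f changesDictionary.sed

Note, the -i option of sed edits files "in-place", so be sure to make backups for safety. The -i~ form used here keeps a tilde-suffixed backup of each file, and works with both GNU sed and the BSD sed shipped with Mac OS X. A bare -i, with no suffix, works only with GNU sed; BSD sed needs an explicit empty suffix (-i '') to skip the backup.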
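Before editing anything in place, it can help to look at what the script would change. Here is a rough sketch of a dry run, assuming diff is available and accepts - for standard input; preview_changes is a hypothetical helper name, not part of sed or find.

```shell
# preview_changes SCRIPT DIR
# Hypothetical helper: prints a unified diff of what the sed script SCRIPT
# would change in each .txt file under DIR, without modifying any file.
preview_changes() {
    find "$2" -type f -name '*.txt' -exec sh -c '
        script=$1
        shift
        for f in "$@"; do
            # diff exits with status 1 when it finds differences,
            # which is the expected case here, so ignore the status.
            sed -f "$script" "$f" | diff -u "$f" - || :
        done
    ' sh "$1" {} +
}
```

For example, `preview_changes changesDictionary.sed somedir` prints nothing for files the script leaves untouched.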

A final note: a list of search-and-replace pairs can have unintended consequences. Will any of your search strings be substrings of other search strings? Here's an example.

$ cat changesDictionary.txt
"fix" = "broken"
"fixThat" = "Fixed"
$ sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/' changesDictionary.txt  \
   | tee changesDictionary.sed
s/fix/broken/g
s/fixThat/Fixed/g
$ mkdir subdir
$ echo fixThat > subdir/target.txt
$ find subdir -type f -name '*.txt' -print0 \
   | xargs -0 sed -i -f changesDictionary.sed
$ cat subdir/target.txt
brokenThat

Should "fixThat" have become "Fixed" or "brokenThat"? Order matters in a sed script: commands run in the order listed, so "fix" was replaced before "fixThat" could match. Similarly, replaced text can itself be replaced again: text changed from "a" to "b" may be changed by a later search-and-replace from "b" to "c".

Perhaps you've already considered both of these, but I mention them because I've tried this myself before and didn't think of them. I don't know of any tool that simply does the right thing when making multiple replacements at once, so you need to program that behavior yourself.
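For the substring case, one possible approach is to emit the commands for longer keys first, so that "fixThat" is handled before "fix". This is only a sketch under the same dictionary format as above; dict_to_sed_longest_first is a hypothetical helper name, and it does nothing about the second problem (a replacement being replaced again by a later command).

```shell
# dict_to_sed_longest_first DICT
# Hypothetical helper: converts DICT to sed commands, ordered so that
# longer search keys come before shorter ones.
dict_to_sed_longest_first() {
    # Prefix each line with the length of its key part (plus the opening
    # quote, a constant offset), sort by that number in descending order,
    # drop the prefix again, then convert each pair to an s///g command.
    awk -F'" = "' '{ print length($1) "\t" $0 }' "$1" |
        sort -rn |
        cut -f2- |
        sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/'
}
```

With the two-line dictionary from the example above, this prints s/fixThat/Fixed/g ahead of s/fix/broken/g.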

ashawley answered Sep 23 '22 19:09