I have a relatively large CSV/text data file (33 MB) in which I need to globally replace the delimiter character. (The reason is that there doesn't seem to be a way to get SQL Server to escape or otherwise handle double quotes in the data during a table export, but that's another story...)
I managed to do the search and replace in TextMate on a smaller file, but it chokes on this larger one.
It seems like command-line grep may be the answer, but I can't quite grasp the syntax, along the lines of:
grep -rl OLDSTRING . | xargs perl -pi~ -e 's/OLDSTRING/NEWSTRING/'
So in my case I'm searching for the '^' (caret) character and replacing with '"' (double-quote).
grep -rl " grep_test.txt | xargs perl -pi~ -e 's/"/^'
That doesn't work, and I'm assuming it has to do with escaping the double quote or something, but I'm pretty lost. Can anyone help?
(I suppose if anyone knows how to get SQL Server 2005 to handle double quotes in a text column during export to CSV, that would really solve the core issue.)
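Your Perl substitution is wrong: the pattern and the replacement are swapped, and the closing / is missing. Try:
grep -rl '\^' . | xargs perl -pi~ -e 's/\^/"/g'
Explanation:
grep : command to find matches
-r : search recursively
-l : print only the names of the files that contain a match
'\^' : the caret is escaped because it is a regex metacharacter (the start-of-line anchor); the single quotes stop the shell from removing the backslash
. : search in the current working directory
perl : used here to do the in-place replacement
-i~ : edit the files in place and keep a backup of each with the extension ~
-p : loop over the input and print each line after the replacement
-e : the one-line program to run
\^ : the caret is escaped here too, because it is a regex metacharacter meaning the start anchor
/g : replace every occurrence on a line, not just the first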
sed -i.bak 's/\^/"/g' mylargefile.csv
Update: you can also use Perl, as rein has suggested:
perl -i.bak -pe 's/\^/"/g' mylargefile.csv
But on big files, sed may run a bit faster than Perl, as my timings on a file of about 6 million lines show:
$ tail -4 file
this is a line with ^
this is a line with ^
this is a line with ^
$ wc -l<file
6136650
$ time sed 's/\^/"/g' file >/dev/null
real 0m14.210s
user 0m12.986s
sys 0m0.323s
$ time perl -pe 's/\^/"/g' file >/dev/null
real 0m23.993s
user 0m22.608s
sys 0m0.630s
$ time sed 's/\^/"/g' file >/dev/null
real 0m13.598s
user 0m12.680s
sys 0m0.362s
$ time perl -pe 's/\^/"/g' file >/dev/null
real 0m23.690s
user 0m22.502s
sys 0m0.393s
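Since the job is a one-character swap, tr is another option worth trying. This is not something the answers above use, and it was not part of the timings, so treat it as a sketch only. The helper name swap_delimiter and the sample file names are made up for illustration. It writes to a new file rather than editing in place, then checks that no carets remain.

```shell
#!/bin/sh
# swap_delimiter is a hypothetical helper: copy IN to OUT with ^ turned into ".
swap_delimiter() {
    in=$1
    out=$2
    # tr maps single characters one-to-one; there is no regex involved,
    # so the caret needs no backslash here.
    tr '^' '"' < "$in" > "$out"
}

# Demo on throwaway data.
demo=$(mktemp -d)
printf 'a^b^c\n1^2^3\n' > "$demo/in.csv"
swap_delimiter "$demo/in.csv" "$demo/out.csv"

# Sanity check: delete everything except carets and count what is left.
carets_left=$(tr -cd '^' < "$demo/out.csv" | wc -c | tr -d ' ')
echo "carets left: $carets_left"
```

Because the input file is never modified, the original serves as its own backup until the output has been checked.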