Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find and Replace with Spaces using Sed Mac Terminal

I have a .CSV file with over 500,000 lines that I need to:

  1. find all 'space double quote space' sequences and replace with nothing
  2. find all 'space double quote' sequences and replace with nothing
  3. find all double quotes and replace with nothing

Example of .CSV line:

"DISH Hartford & New Haven  (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"

** Required output**

DISH Hartford & New Haven  (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County

I need to remove all double quotes (") and spaces in front of and behind the commas (,).

I've tried

$ cd /Users/Leonna/Downloads/
$ cat bs-B2Bformat.csv | sed s/ " //g

This gives me the 'command incomplete' greater than prompt, so I then tried:

$ cat bs-B2Bformat.csv | sed s/ " //g
sed: 1: "s/": unterminated substitute pattern
$ cat bs-B2Bformat.csv |sed s/ \" //g
sed: 1: "s/": unterminated substitute pattern
$

There are too many lines for me to edit in Excel (Excel won't load all the lines) or even a text editor. How can I fix this?

like image 282
Leonna Sylvester Avatar asked Sep 17 '13 02:09

Leonna Sylvester


1 Answers

Quoted from here:

For POSIX compliance, use the character class [[:space:]] instead of \s, since the latter is a GNU sed extension.

Based on that, I would suggest the following, which, as Jonathan Leffler pointed out, is portable across GNU and BSD implementations.

sed -E 's/[[:space:]]?"[[:space:]]?//g' <path/to/file>

The -E flag enables extended regular expressions on BSD implementations. On GNU sed it is undocumented, but as discussed here, it enables compatibility with the BSD standard.

Quoted from the manual for BSD sed:

-E Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's).

Applying the above command on a file containing the following single line

"DISH Hartford & New Haven (Hartford)", "206", "FBNHD", " 06028", " East Windsor Hill", "CT", "Hartford County"

it yields

DISH Hartford & New Haven (Hartford),206,FBNHD,06028,East Windsor Hill,CT,Hartford County

like image 144
brunocodutra Avatar answered Sep 24 '22 22:09

brunocodutra