Removing spaces from the fields in a pipe-delimited file using shell script

Tags: shell, sed

I am new to UNIX Shell scripting.

I need help removing leading and trailing spaces from the fields while retaining the spaces between words.

Please have a look at the data sample and the desired result below to understand my problem.

Data Sample :

1-B48980007       |82984788|317      |ALQ|     |4423271    |              0|  |

I0000000000000000000245729|28887957|IL FR    |   |     |00000000573|              0|  |

I0000000000000000000245715|13822348|RPVIPPR  |   |     |00000000298|              0|  |

I0000000000000000000245721|15348717|AN BV    |   |     |00000001526|              0|  |

Desired Result:

1-B48980007|82984788|317|ALQ||4423271|0||

I0000000000000000000245729|28887957|IL FR|||00000000573|0||

I0000000000000000000245715|13822348|RPVIPPR|||00000000298|0||

I0000000000000000000245721|15348717|AN BV|||00000001526|0||

The pipe character ('|') is the delimiter in my file. I need to remove the spaces before and after each pipe but retain the spaces between words, for example "IL FR" and "AN BV".

I tried the following command:

sed 's/ *\|/\|/g' file_name > testOP

But it produces the output below:

1-B48980007     |82984788|317|ALQ||4423271|           0||

I0000000000000000000245729|28887957|IL FR|  ||00000000573|            0||

I0000000000000000000245715|13822348|RPVIPPR|    ||00000000298|            0||

I0000000000000000000245721|15348717|AN BV|  ||00000001526|            0||

Any help is greatly appreciated.

Thanks, Savitha

Savitha asked Nov 04 '11 05:11


3 Answers

Using:

sed -e 's/ *| */|/g' file_name

gives the desired result:

1-B48980007|82984788|317|ALQ||4423271|0||

I0000000000000000000245729|28887957|IL FR|||00000000573|0||

I0000000000000000000245715|13822348|RPVIPPR|||00000000298|0||

I0000000000000000000245721|15348717|AN BV|||00000001526|0||

Note that this approach removes only space characters. To strip all whitespace, tab characters must be handled as well. With any POSIX-compliant implementation of sed, you could do this:

sed -e 's/[[:space:]]*|[[:space:]]*/|/g' file_name

Or, with GNU extensions to the regex:

sed -e 's/\s*|\s*/|/g' file_name
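
One point the commands above leave open: they only strip whitespace that touches a pipe, so whitespace before the first field or after the last field (on a line that does not end in `|`) would survive. A minimal sketch of one way to cover those edges follows, assuming a POSIX sed; the function name `trim_fields` is a hypothetical helper, not part of the original answer.

```shell
# trim_fields: hypothetical helper name, introduced only for illustration.
# Reads the named files (or standard input) and writes trimmed lines.
trim_fields() {
    # 1st expression: drop whitespace on both sides of every pipe
    # 2nd expression: drop whitespace at the start of the line
    # 3rd expression: drop whitespace at the end of the line
    sed -e 's/[[:space:]]*|[[:space:]]*/|/g' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' "$@"
}

# Example: spaces and a tab around the fields, plus padding at both ends
printf '  IL FR \t| 0 |  x  \n' | trim_fields
# prints: IL FR|0|x
```

Spaces inside a field, such as the one in "IL FR", are untouched because none of the three expressions can match in the middle of a field.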
Michael J. Barber answered Oct 08 '22 05:10


This might work:

sed 's/\s*|\s*/|/g' input_file

EDIT: removed unnecessary parens and alternation
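
Since `\s` is a GNU extension, it may not be recognized by other sed implementations. One possible alternative, sketched here under the assumption that a POSIX awk is available (on some older systems that means `nawk` rather than `awk`), is to trim each field separately. The function name `trim_fields_awk` is hypothetical and used only for illustration.

```shell
# trim_fields_awk: hypothetical helper name, not from the original answer.
trim_fields_awk() {
    awk 'BEGIN { FS = OFS = "|" }   # split and rejoin records on the pipe
         {
             # strip leading and trailing spaces/tabs from every field
             for (i = 1; i <= NF; i++)
                 gsub(/^[ \t]+|[ \t]+$/, "", $i)
             print
         }' "$@"
}

# Example with the second line of the data sample from the question
printf 'I0000000000000000000245729|28887957|IL FR    |   |     |00000000573|              0|  |\n' | trim_fields_awk
# prints: I0000000000000000000245729|28887957|IL FR|||00000000573|0||
```

Because each field is handled on its own, this also trims whitespace at the very start and end of a line, which a substitution anchored on the pipe does not.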

potong answered Oct 08 '22 04:10


I resolved the issue with the sed statement below:

sed -e 's/ *\|/\|/g' -e 's/press_tab_key_here*\|press_tab_key_here*/\|/g' -e 's/\| */\|/g' file_name

To remove the tabs, I had to type a literal tab character by pressing the Tab key where press_tab_key_here appears; '\t' didn't work in my case.
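
Typing a literal tab is easy to get wrong when a script is copied or edited later. A possible workaround, sketched here on the assumption that the shell provides the standard `printf` utility, is to let `printf` produce the tab and splice it into the sed expression. The variable `tab` and the function name `strip_around_pipes` are hypothetical names chosen for this example.

```shell
# Produce a literal tab without typing one; command substitution only
# strips trailing newlines, so the tab itself is preserved.
tab=$(printf '\t')

# strip_around_pipes: hypothetical helper name for illustration.
strip_around_pipes() {
    # Double quotes let the shell expand $tab inside the bracket expression,
    # so sed sees a class containing one space and one tab.
    sed "s/[ $tab]*|[ $tab]*/|/g" "$@"
}

# Example: a mix of spaces and tabs around the pipes
printf 'AN BV \t|\t  0|  |\n' | strip_around_pipes
# prints: AN BV|0||
```

This uses an unescaped `|`, which basic regular expressions treat as an ordinary character, so it does not depend on how a particular sed interprets `\|`.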

Thanks Michael, Potong and Triplee for all the help and support. :)

Savitha answered Oct 08 '22 03:10