Replace groups of text all together with gVim

Tags: regex, vim

Consider the following data:

Class   Gender  Condition   Tenis
A   Male    Fail Fail   33
A   Female  Fail NotFail    23
S  Male     Yellow     14
BC  Male    Happy Elephant  44

I have whitespace-separated data with inconsistent column spacing (it varies between tabs and spaces).

In one specific column I have compound words from which I would like to remove the space. In the example above, I would like to replace "Fail " with "Fail_" and "Happy " with "Happy_".

The result would be the following:

Class   Gender  Condition   Tenis
A   Male    Fail_Fail   33
A   Female  Fail_NotFail    23
S  Male     Yellow     14
BC  Male    Happy_Elephant  44

I already managed to do that in two steps:

:%s/Fail /Fail_/g
:%s/Happy /Happy_/g

Question: As I'm very new to gVim, I am trying to do these replacements in a single command, but I could not find out how.

After this step, I will turn each run of whitespace into a comma with the following:

:%s/\s\+/,/g

And get the final result:

Class,Gender,Condition,Tenis
A,Male,Fail_Fail,33
A,Female,Fail_NotFail,23
S,Male,Yellow,14
BC,Male,Happy_Elephant,44

On SO, I searched for [vim] :%s two is:question and some variations, but I could not find a related thread, so I guess I am lacking the correct terminology.


Edit: This is the actual data (with more than 1 million rows). The problem starts in the 12th column (e.g. "Fail Planting" should be "Fail_Planting").

SP1     51F001      3   1   1   2   3   2001    52  52  H   Normal          17,20000076 23,39999962 NULL    NULL
SP1     51F001      3   1   1   2   3   2001    53  53  F   Fail Planting   0   0   NULL    NULL
SP1     51F001      3   1   1   2   3   2001    54  54  N   Normal          13,89999962 0   NULL    NULL
Andre Silva asked Dec 01 '22 01:12


2 Answers

You can use an expression on the right-hand side of the substitution.

:%s/\(Fail\|Happy\) \ze\S\|\s\+/\= submatch(0) =~# '^\s\+$' ? ',' : submatch(1).'_'/g

This matches either Fail or Happy followed by a space, or a run of whitespace, and then checks whether the matched text is entirely whitespace. If it is, it is replaced by a comma; if it is not, the captured word is kept and an underscore is appended. submatch(0) is the whole match and submatch(1) is the first capture group. The \ze\S requires a non-blank character after the space, so a word that ends its column (the second Fail in "Fail Fail   33") does not pick up a trailing underscore.

Take a look at :h sub-replace-expression. If you want to do something very complex, you can define a function.


Very magic version

:%s/\v(Fail|Happy) \ze\S|\s+/\= submatch(0) =~# '^\v\s+$' ? ',' : submatch(1).'_'/g
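
As a rough illustration of the "define a function" route, the replacement logic could be moved into a small helper and called from the substitution. This is only a sketch: the function name JoinOrComma and its argument names are made up for this example, and the word list (Fail, Happy) is taken from the question.

```vim
" Hypothetical helper (name and arguments are illustrative, not a built-in).
" whole: the full match, i.e. submatch(0)
" word:  the first capture group, i.e. submatch(1)
function! JoinOrComma(whole, word) abort
  " A whitespace-only match is a column gap: emit a comma.
  if a:whole =~# '^\s\+$'
    return ','
  endif
  " Otherwise the match is '<word> ': keep the word, swap the space for '_'.
  return a:word . '_'
endfunction

" \ze\S: only treat '<word> ' as a compound when a non-blank follows the space.
:%s/\(Fail\|Happy\) \ze\S\|\s\+/\=JoinOrComma(submatch(0), submatch(1))/g
```

The function has to be defined before the substitution runs, for example by putting both in a file and loading it with :source. Keeping the decision in a function makes it easier to add more cases later than extending the one-line conditional.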
FDinoff answered Dec 02 '22 13:12


You have all the parts; you just need to combine them with |. Example:

:%s/\>\s\</_/g|%s/\s\+/,/g

I am using \> and \< to find words that have exactly one whitespace character between them, so that it can be replaced with _.

For more help see:

:h /\>
:h :range
:h :bar
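
One thing to watch with \>\s\<: it joins any two words separated by a single blank, including two columns that happen to have only one blank between them (the two numeric columns in the question's actual data appear to be spaced that way). A possible narrower variant, offered as a sketch and assuming the compound words consist of letters joined by one literal space, is:

```vim
" Sketch: join only letter-space-letter pairs, then turn whitespace runs into commas.
" \a\zs \ze\a  matches one literal space with a letter on each side
" e flag:      do not report an error if the pattern is not found
:%s/\a\zs \ze\a/_/ge | %s/\s\+/,/g
```

On the sample line "BC  Male    Happy Elephant  44" this gives "BC,Male,Happy_Elephant,44". Two adjacent text columns separated by a single space would still be joined, so it is worth checking a few rows before running it over the whole file.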
Peter Rincker answered Dec 02 '22 15:12