I have comma-separated tokens that I need to convert into columns padded with spaces. I wanted to use a regular expression in Notepad++ but ran into a problem.
My input is:
aaaaa, bbb ,cccc, hhhh, fff,t
I would like to get as result:
aaaaa     bbb       cccc     hhhh      fff       t
Each token should occupy exactly 10 characters.
My problem is: how do I make each output field exactly 10 characters wide?
A normal “Find and Replace” can't do that, but it's possible with “Regular Expressions”. In Notepad++ press Ctrl+H to open the “Replace” dialog. Under Search Mode, choose “Regular expression” and then check the “. matches newline” checkbox.
If your data contains unnecessary commas, you can remove them with functions such as TRIM, SUBSTITUTE, FIND, LEN, and REPLACE, or with Find & Replace (Ctrl+H). There are several methods to choose from.
I see this as a two-step process. Step one: replace all the commas with 10 spaces. Step two: capture 10 characters plus any trailing spaces, and replace them with just the 10 captured characters.
Step one, find: ,\s*|\s*$

Replace with: __________ (these are underscores, shown for visibility; you should really use ten or more spaces).
Live Demo: https://regex101.com/r/mR1eS9/1
Sample Text
aaaaa, bbb ,cccc, hhhh, fff,t
After Replacement
aaaaa          bbb           cccc          hhhh          fff          t                    
123456789,123456789,123456789,123456789,123456789,123456789,123456789,123456789
Note: I inserted the number line here to help illustrate the number and position of characters
Step two, find: (.{10})[^\S\n\r]*

Replace with: $1
Live Demo: https://regex101.com/r/uL8oO7/2
Sample Text
Because this is step two, the sample text is the output from step one above
aaaaa          bbb           cccc          hhhh          fff          t                    
After Replacement
aaaaa     bbb       cccc      hhhh      fff       t         
123456789,123456789,123456789,123456789,123456789,123456789,123456789,123456789
Note: I inserted the number line here to help illustrate the number and position of characters
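For anyone who wants to try the two steps outside Notepad++, here is a minimal Python sketch of the same idea. The function name pad_columns and the width parameter are made up for illustration, and the sketch assumes no token is longer than 10 characters (longer tokens would be cut across column boundaries in step two).

```python
import re

def pad_columns(line, width=10):
    """Hypothetical helper: applies the two replacements described above."""
    # Step one: replace every comma (plus any whitespace after it), and the
    # end of the line, with `width` spaces.
    step_one = re.sub(r",\s*|\s*$", " " * width, line)
    # Step two: capture `width` characters, swallow any further spaces or
    # tabs (but not line breaks), and keep only the captured characters.
    step_two = re.sub(r"(.{%d})[^\S\n\r]*" % width, r"\1", step_one)
    return step_two

result = pad_columns("aaaaa, bbb ,cccc, hhhh, fff,t")
print(repr(result))
```

Python writes the back-reference as \1 where Notepad++ uses $1; the search patterns themselves are unchanged.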
The regex computation model is so simple that it cannot count. However, when there are only nine possible non-empty matches, you can run nine separate global replacements to cover all possibilities (underscores _ are used in place of spaces for clarity):
Search           Replacement
---------------  -----------
(?<=\b\S{9}),\s  _
(?<=\b\S{8}),\s  __
(?<=\b\S{7}),\s  ___
(?<=\b\S{6}),\s  ____
...
(?<=\b\S{1}),\s  _________
Each replacement operation matches a comma-space pair that follows x non-space characters, and replaces it with 10-x spaces.
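The nine passes can also be generated in a loop instead of typed by hand. The following is only a sketch in Python: the function name pad_with_nine_passes is hypothetical, and it assumes the input has already been normalized so that every separator is exactly a comma followed by one space. Because the patterns only match a comma-space pair, the last token is left unpadded.

```python
import re

def pad_with_nine_passes(line, width=10):
    """Hypothetical helper: one global replacement per token length x."""
    for x in range(width - 1, 0, -1):
        # The lookbehind requires x non-space characters, starting at a word
        # boundary, immediately before the comma. For tokens made of word
        # characters this matches only tokens of length exactly x.
        pattern = r"(?<=\b\S{%d}),\s" % x
        # Replace the comma and the space with the 10-x spaces still missing.
        line = re.sub(pattern, " " * (width - x), line)
    return line

result = pad_with_nine_passes("aaaaa, bbb, cccc, hhhh, fff, t")
print(repr(result))
```

The order of the passes does not matter here, since each replacement inserts only spaces and so cannot create a new comma-space match.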
Perhaps a solution in a programming language would be easier to read and understand.
Code samples for PHP and Python follow (they can easily be adapted to other languages as well):
<?php
$string = "aaaaa, bbb ,cccc, hhhh, fff,t";
$regex = '~\s*(\w+)\s*(?:,|$)~';
# optional leading spaces, then word characters (captured),
# then optional spaces and a comma or the end of the string
$string = preg_replace_callback(
    $regex,
    function ($match) {
        # left-justify the captured token in a 10-character field
        return str_pad($match[1], 10);
    },
    $string
);
echo $string;
# aaaaa     bbb       cccc      hhhh      fff       t         
?>
See a demo on ideone.com.
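import re
string = "aaaaa, bbb ,cccc, hhhh, fff,t"
def repl(match):
    # left-justify the captured token in a 10-character field
    return match.group(1).ljust(10)
# optional leading spaces, then word characters (captured),
# then optional spaces and a comma or the end of the string
rx = r'\s*(\w+)\s*(?:,|$)'
string = re.sub(rx, repl, string)
print(string)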
A demo on ideone.com as well.