
What is the most readable regex to extract a second word with no trailing spaces from comma-separated string?

I have an array of strings of the form:

@source = (
     "something,something2,third"
    ,"something,something3   ,third"
    ,"something,something4"
    ,"something,something 5" # Note the space in the middle of the word
);

I need a regex which will extract the second of the comma separated words, BUT without the trailing spaces, putting those second words in an array.

@expected_result = ("something2","something3","something4","something 5");

What is the most readable way of achieving this?

I have 3 possibilities, none of which seems optimal readability-wise:

  1. Pure regex and then capture $1

    @result = map { (/[^,]+,([^,]*[^, ]) *(,|$)/ )[0] } @source;
    
  2. Split on commas (this is NOT a CSV so no parsing needed), then trim:

    @result = map { my @s = split(","); $s[1] =~ s/ *$//; $s[1] } @source;
    
  3. Put split and trim into nested maps

    @result = map { s/ *$//; $_ } map { (split(","))[1] } @source;
    

Which one of these is better? Any other even more readable alternative I'm not thinking of?

DVK asked Dec 17 '22 02:12


1 Answer

Of those possibilities, I think #2 is the clearest, though I think I'd adjust it slightly to include the spaces in the split:

@result = map { my @s = split(/ *(?:,|$)/); $s[1] } @source;

(For that matter, I might actually write /[ ]*(?:,|$)/, with a no-op character class, just so it's a bit more visible what the * is quantifying.)
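As a quick check, here is a minimal, self-contained sketch that runs the split-based one-liner (in its /[ ]*/ spelling) against the sample data from the question. The `use strict`/`use warnings` lines and the `join` used for printing are additions for illustration only, not part of the original snippet.

```perl
use strict;
use warnings;

# Sample data from the question.
my @source = (
    "something,something2,third",
    "something,something3   ,third",
    "something,something4",
    "something,something 5",    # space in the middle of the word
);

# Split on optional spaces followed by a comma or end of string,
# then keep the second field of each line.
my @result = map { my @s = split(/[ ]*(?:,|$)/); $s[1] } @source;

print join("|", @result), "\n";
# prints: something2|something3|something4|something 5
```

The inner space in "something 5" survives because the spaces are only consumed when a comma or the end of the string follows them.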

Edited to add: Whoops, I had a stupid mistake before, where this wouldn't remove the trailing space from something like "foo, bar ". Now that I've fixed that mistake, the result isn't so nice and simple, and I'm no longer sure if I recommend the above!
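To illustrate the case described above, the sketch below compares two patterns on "foo, bar ". It assumes the earlier version split on / *,/ alone; that version is not shown in the answer, so treat the first pattern as a guess at what it looked like.

```perl
use strict;
use warnings;

my $line = "foo, bar ";    # second field is last and ends with a space

# Assumed earlier form: only spaces before a comma are consumed,
# so the space at the end of the string stays in the field.
my @comma_only = split(/ *,/, $line);

# Current form: spaces before a comma OR before end of string are consumed.
my @comma_or_end = split(/ *(?:,|$)/, $line);

print "[$comma_only[1]]\n";      # prints: [ bar ]
print "[$comma_or_end[1]]\n";    # prints: [ bar]
```

Note that neither pattern touches the leading space after the comma; the question only asks for trailing spaces to be removed.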

ruakh answered May 11 '23 12:05