
Using LINQ to remove vowels from a string

Tags: c#, lambda

I want to remove the vowels from each string in a string array. I did it with foreach loops, but now I want to do it using LINQ or a lambda expression.

I have tried the following LINQ code:

string[] strArray = new string[] { "cello", "guitar", "violin"};
string[] vowels = new string[] { "a", "e", "i", "o", "u" };

var vNovowels = from vitem in strArray
                from vowel in vowels
                where vitem.Contains(vowel)
                select vitem.Replace(vowel, "");

foreach (var item in vNovowels)
{
    Console.WriteLine(item); 
}

But I am not getting the expected result.

The output I get with the above query is:

cllo
cell
guitr
gutar
gitar
voln
vilin

Desired output:

cll
gtr
vln

ghargedeepak asked Feb 20 '14 09:02


3 Answers

You can accomplish this very efficiently using regular expressions to match all vowels and replace them with empty strings:

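using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var strArray = new List<string> { "cello", "guitar", "violin" };
var pattern = @"[aeiou]";
// Replace every vowel (upper or lower case) in each word with an empty string.
var noVowels = strArray.Select(item =>
                  Regex.Replace(item, pattern, "", RegexOptions.IgnoreCase));
foreach (var item in noVowels) {
    Console.WriteLine(item);
}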

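This produces the output that you are looking for.

Your original attempt did not work because the two from clauses pair every word with every vowel: each word is emitted once per vowel that it contains, with only that single vowel removed, instead of once with all vowels removed.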
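
To make the difference concrete, here is a minimal sketch that puts the question's query next to a per-word variant. The names VowelDemo, PerPair and PerWord are made up for this illustration, and using Aggregate to chain the replacements is just one possible LINQ-only fix, not something the question or the answers prescribe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, introduced only for this illustration.
public static class VowelDemo
{
    private static readonly string[] Vowels = { "a", "e", "i", "o", "u" };

    // The question's query: the two from clauses form a cross join, so every
    // matching (word, vowel) pair yields its own result with only that one
    // vowel removed.
    public static List<string> PerPair(IEnumerable<string> words) =>
        (from word in words
         from vowel in Vowels
         where word.Contains(vowel)
         select word.Replace(vowel, "")).ToList();

    // One result per word: Aggregate folds all vowels over the same word, so
    // each Replace works on the output of the previous one.
    public static List<string> PerWord(IEnumerable<string> words) =>
        words.Select(word =>
                 Vowels.Aggregate(word, (acc, vowel) => acc.Replace(vowel, "")))
             .ToList();

    public static void Main()
    {
        var words = new[] { "cello", "guitar", "violin" };
        Console.WriteLine(string.Join(", ", PerPair(words))); // seven partial results
        Console.WriteLine(string.Join(", ", PerWord(words))); // cll, gtr, vln
    }
}
```

PerPair reproduces the seven lines from the question, while PerWord returns exactly one string per input word.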

Update: I did some basic benchmarking of this solution versus Matthias' HashSet<char>-based solution (benchmark code here), including both compiled and non-compiled versions of the regex. I ran it against an array of 2582 lorem-ipsum words, iterating 10 million times over the set (so about 25 billion words in total), in LINQPad, and took the average of 3 runs:

                  Init Each Time              Init One Time
                avg ms      % diff          avg ms     % diff
Regex            586          +1%            586          -
Regex Compiled   581          -              593         +1%
HashSet         2550        +339%            641        +10%

It turns out that if you initialize the HashSet and the pattern string only once, the two approaches perform very similarly. Regex beats HashSet, but only barely (80 ms faster over 25 billion words), and the compiled and non-compiled regex perform almost identically. However, if you initialize the HashSet every single time you run it, the HashSet approach becomes far slower.

The takeaway is that if you want to use the HashSet approach, be sure to initialize your HashSet only once per set of chars that you want to exclude.
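
As a rough sketch of that takeaway (this is not the benchmark code, and VowelFilter, Strip and StripRebuildingSet are hypothetical names), the set can be held in a static readonly field so that it is built once and shared by every call:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, introduced only for this illustration.
public static class VowelFilter
{
    // Built once, when the type is first used, and reused by every call.
    private static readonly HashSet<char> Vowels = new HashSet<char>("aeiou");

    public static string Strip(string word) =>
        new string(word.Where(c => !Vowels.Contains(c)).ToArray());

    // For contrast: allocates and fills a new set on every call, which is
    // the pattern the text above warns against.
    public static string StripRebuildingSet(string word)
    {
        var vowels = new HashSet<char>("aeiou");
        return new string(word.Where(c => !vowels.Contains(c)).ToArray());
    }

    public static void Main()
    {
        foreach (var word in new[] { "cello", "guitar", "violin" })
        {
            Console.WriteLine(Strip(word));
        }
    }
}
```

Both methods return the same strings; they differ only in how often the set is constructed. Note that this set is lowercase only, so unlike the regex with RegexOptions.IgnoreCase it leaves uppercase vowels in place.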

Yaakov Ellis answered Oct 26 '22 05:10


Although Yaakov's regex solution is much better in terms of elegance and efficiency, you can use Where for the sake of learning:

string[] strArray = new string[] { "cello", "guitar", "violin" };
var vowels = new HashSet<char>("aeiou"); // or: { 'a', 'e', 'i', 'o', 'u' };

var vNovowels2 = from vitem in strArray
                 select new string(vitem.Where(c => !vowels.Contains(c)).ToArray());

foreach (var item in vNovowels2)
{
    Console.WriteLine(item);
}

Matthias Meid answered Oct 26 '22 05:10


Regex.Replace is the best way to do this.

string[] strArray = new string[] { "cello", "guitar", "violin" };

// A character class matches any vowel at any position in the word.
var rx = new Regex("[aeiou]", RegexOptions.IgnoreCase);

var vNovowels = from vitem in strArray
                select rx.Replace(vitem, string.Empty);

foreach (var item in vNovowels)
{
    Console.WriteLine(item);
}

bskehil answered Oct 26 '22 03:10