string[] lines3 = new string[100];
List<string> lines2 = new List<string>();
lines3 = Regex.Split(s1, @"\s*,\s*");
if (!lines2.Contains(lines3.ToString()))
{
lines2.AddRange(lines3.Distinct().ToArray());
}
I have checked all the spaces etc., but I still get duplicate values in my lines2 list.
I need to remove the duplicate values at this point in the code.
If you don't want duplicates, use a Set instead of a List. In Java, you can convert a List to a Set with the following code:
// list is some List of Strings
Set<String> s = new HashSet<String>(list);
If really necessary, you can use the same construction to convert a Set back into a List. In C#, the equivalent type is HashSet<string>.
In C#, collections such as ArrayList and List<T> simply add values without checking for duplicates. To avoid storing duplicate data, .NET provides set collections such as HashSet<T>, a collection type that holds only distinct items.
What are duplicates in a list? If an integer, string, or any other item appears in a list more than once, it is a duplicate.
You can use Enumerable.Except to get the distinct items from lines3 that are not in lines2:
lines2.AddRange(lines3.Except(lines2));
If lines2 already contains all the items from lines3, nothing will be added. Internally, Except uses a Set<string> to collect the distinct items of the second sequence and to check the items of the first sequence against them, so it's pretty fast.
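A minimal sketch of the Except approach, in case it helps to see it run. The class name ExceptDemo, the helper AddMissing, and the sample values are made up for illustration; the ToList() call is an extra precaution so the difference is fully computed before the target list is modified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Hypothetical helper: appends to target only the items of source
    // that target does not already contain. Except also drops
    // duplicates that occur within source itself.
    public static void AddMissing(List<string> target, IEnumerable<string> source)
    {
        // Materialize the difference first, then append it.
        List<string> missing = source.Except(target).ToList();
        target.AddRange(missing);
    }

    public static void Main()
    {
        var lines2 = new List<string> { "a", "b" };
        var lines3 = new[] { "b", "c", "c", "d" };

        AddMissing(lines2, lines3);

        Console.WriteLine(string.Join(",", lines2)); // prints a,b,c,d
    }
}
```

Calling AddMissing a second time with the same input adds nothing, because every item is already present.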
This check:
if (!lines2.Contains(lines3.ToString()))
is invalid. It tests whether lines2 contains the string "System.String[]", because that is what lines3.ToString() returns. You need to check whether each item from lines3 exists in lines2 or not.
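To see why that check never does what was intended, here is a small sketch. The class name ToStringDemo, the method names, and the sample values are hypothetical; the only thing it relies on is that calling ToString() on an array returns the type name.

```csharp
using System;
using System.Collections.Generic;

public static class ToStringDemo
{
    // Returns whatever ToString() yields for a string array:
    // the type name, not the elements.
    public static string Describe(string[] items)
    {
        return items.ToString();
    }

    // Mirrors the faulty test from the question.
    public static bool FaultyCheck(List<string> list, string[] items)
    {
        return !list.Contains(items.ToString());
    }

    public static void Main()
    {
        var lines3 = new[] { "a", "b" };
        var lines2 = new List<string> { "a", "b" };

        Console.WriteLine(Describe(lines3));            // prints System.String[]
        // True even though lines2 already holds every item of lines3.
        Console.WriteLine(FaultyCheck(lines2, lines3)); // prints True
    }
}
```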
You can iterate over each item in lines3, check whether it exists in lines2, and add it if it does not. Something like:
foreach (string str in lines3)
{
    if (!lines2.Contains(str))
        lines2.Add(str);
}
Or, if lines2 is an empty list, you can simply add the distinct values of lines3 to it:
lines2.AddRange(lines3.Distinct());
Then lines2 will contain only distinct values.
Use a HashSet<string> instead of a List<string>. Lookups are faster, and you don't need to check for existing items yourself; the collection manages that for you. That is the difference between a list and a set. For example:
HashSet<string> set = new HashSet<string>();
set.Add("a");
set.Add("a");
set.Add("b");
set.Add("c");
set.Add("b");
set.Add("c");
set.Add("a");
set.Add("d");
set.Add("e");
set.Add("e");
var total = set.Count;
total is 5, and the values are a, b, c, d, e.
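Applied to the question's scenario, the same idea might look like the sketch below. The class name SplitToSetDemo, the method SplitDistinct, and the sample input are assumptions for illustration; the regular expression is the one from the question. Note that HashSet<string> does not promise any particular enumeration order, so the example only checks the count and membership.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SplitToSetDemo
{
    // Hypothetical helper: splits a comma-separated string and keeps
    // each value only once by loading the pieces into a HashSet.
    public static HashSet<string> SplitDistinct(string s1)
    {
        string[] parts = Regex.Split(s1, @"\s*,\s*");
        return new HashSet<string>(parts);
    }

    public static void Main()
    {
        HashSet<string> values = SplitDistinct("a , b,a, c ,b");

        Console.WriteLine(values.Count);         // prints 3
        Console.WriteLine(values.Contains("c")); // prints True
    }
}
```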
List<T> does not give you this natively. You can get the same behavior, but you have to provide the check yourself, for example with this extension method:
public static class CollectionExtensions
{
    public static void AddItem<T>(this List<T> list, T item)
    {
        if (!list.Contains(item))
        {
            list.Add(item);
        }
    }
}
and use it:
var list = new List<int>();
list.AddItem(1);
list.AddItem(2);
list.AddItem(3);
list.AddItem(2); // already present, so it is not added again
list.AddItem(4);
list.AddItem(5);
If you don't want duplicates in a list, use a HashSet. That way it will be clear to anyone else reading your code what your intention was, and you'll have less code to write, since HashSet already handles what you are trying to do.
You could use a simple Union + Distinct:
var lines = lines2.Union(lines3).Distinct();
That combines the items of both lists and returns all the unique strings as a new sequence; lines2 itself is not modified. Union already removes duplicates, so the extra Distinct call is redundant but harmless. It builds a new sequence each time, which may be wasteful with large lists, but it's simple.
Reference: http://msdn.microsoft.com/en-us/library/bb341731.aspx
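A short sketch of what Union returns, under made-up sample data. The class name UnionDemo and the method Merge are hypothetical. It shows that the result is a new sequence of unique items and that the original list is left unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionDemo
{
    // Hypothetical helper: returns the unique items of both inputs
    // as a new list, leaving the inputs untouched.
    public static List<string> Merge(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Union(second).ToList();
    }

    public static void Main()
    {
        var lines2 = new List<string> { "a", "b", "b" };
        var lines3 = new[] { "b", "c", "c" };

        List<string> lines = Merge(lines2, lines3);

        Console.WriteLine(string.Join(",", lines)); // prints a,b,c
        Console.WriteLine(lines2.Count);            // prints 3, lines2 is unchanged
    }
}
```

If lines2 itself has to end up with the merged values, assign the result back to it.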
If your check had worked, it would have added either all the items or none at all. However, calling the ToString method on an array returns the name of the data type, not the contents of the array, and the Contains method can only look for a single item, not a collection of items, anyway.
You have to check each string in the array:
string[] lines3;
List<string> lines2 = new List<string>();
lines3 = Regex.Split(s1, @"\s*,\s*");
foreach (string s in lines3) {
    if (!lines2.Contains(s)) {
        lines2.Add(s);
    }
}
However, if you start with an empty list, you can use the Distinct method to remove the duplicates, and you only need a single line of code:
List<string> lines2 = Regex.Split(s1, @"\s*,\s*").Distinct().ToList();
If you want to save distinct values into a collection, you could try the HashSet<T> class. It automatically ignores duplicate values and saves you coding time. :)
Use a HashSet; it's better. Take a look here: http://www.dotnetperls.com/hashset
Use a HashSet along with your List:
List<string> myList = new List<string>();
HashSet<string> myHashSet = new HashSet<string>();
public void addToList(string s) {
    if (myHashSet.Add(s)) {
        myList.Add(s);
    }
}
myHashSet.Add(s) returns true if s was not already in the set, so each string is added to myList only once and the list keeps its insertion order.