I'm writing a program that simply reads two different .csv files containing the following information:
file 1      file 2
AA,2.34     BA,6.45
AB,1.46     BB,5.45
AC,9.69     BC,6.21
AD,3.6      AC,7.56
Where the first column is a string and the second is a double.
So far I have had no difficulty reading those files and placing the values into lists:
firstFile = new List<KeyValuePair<string, double>>();
secondFile = new List<KeyValuePair<string, double>>();
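I'm trying to instruct my program to take each value from the first column of the first file (AA, for example), look for a match in the first column of the second file, then compare the corresponding second values (double in this case), and if in this case a match is found as well, add the entire row to a separate List. Something similar to the pseudo-code below: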
for(var i=0;i<firstFile.Count;i++)
{
    firstFile.Column[0].value[i].SearchMatchesInAnotherFile(secondFile.Column[0].values.All);
    if(MatchFound)
    {
        CompareCorrespondingDoubles();
        if(true)
        {
            AddFirstValueToList();
        }
    }
}
Instead of List, I tried to use Dictionary, but this data structure is not sorted and there is no way to access a key by index.
I'm not asking for exact code; rather, the question is:
What would you suggest to use as an appropriate data structure for this program so that I can investigate myself further?
KeyValuePair is really intended for use with Dictionary. I suggest creating your own custom type:
public class MyRow
{
    public string StringValue { get; set; }
    public double DoubleValue { get; set; }

    // Two rows are equal when both the string and the double match.
    public override bool Equals(object o)
    {
        MyRow r = o as MyRow;
        if (ReferenceEquals(r, null)) return false;
        return r.StringValue == this.StringValue && r.DoubleValue == this.DoubleValue;
    }

    // Must be consistent with Equals: equal rows produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked { return (StringValue?.GetHashCode() ?? 0) ^ DoubleValue.GetHashCode(); }
    }
}
And store the files in lists of this type:
List<MyRow> firstFile = ...
List<MyRow> secondFile = ...
Then you can determine the intersection (all elements that occur in both lists) via LINQ's Intersect method:
var result = firstFile.Intersect(secondFile).ToList();
It's necessary to override Equals and GetHashCode, because otherwise Intersect would only make a reference comparison. Alternatively, you could implement your own IEqualityComparer<MyRow> that does the comparison and pass it to the appropriate Intersect overload.
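As a rough sketch of the comparer alternative, the snippet below keeps the row type free of any Equals/GetHashCode overrides and moves the comparison into a separate class. The names PlainRow, RowComparer, ComparerDemo and CommonRows are made up for this illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for this sketch; it deliberately has no Equals/GetHashCode overrides.
public sealed class PlainRow
{
    public string StringValue { get; set; }
    public double DoubleValue { get; set; }
}

// Hypothetical comparer: two rows are equal when both the string and the double match exactly.
public sealed class RowComparer : IEqualityComparer<PlainRow>
{
    public bool Equals(PlainRow a, PlainRow b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.StringValue == b.StringValue && a.DoubleValue == b.DoubleValue;
    }

    public int GetHashCode(PlainRow r)
    {
        if (r is null) return 0;
        unchecked
        {
            // Combine both fields so that rows equal under Equals share a hash code.
            return ((r.StringValue?.GetHashCode() ?? 0) * 397) ^ r.DoubleValue.GetHashCode();
        }
    }
}

public static class ComparerDemo
{
    // Returns the rows that occur in both sequences according to RowComparer.
    public static List<PlainRow> CommonRows(IEnumerable<PlainRow> first, IEnumerable<PlainRow> second)
    {
        return first.Intersect(second, new RowComparer()).ToList();
    }
}
```

Calling ComparerDemo.CommonRows(firstFile, secondFile) would then play the same role as the Intersect call above. Keeping the comparison outside the row type can be handy when different parts of a program need different notions of equality.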
But if you can ensure that the keys (the string values) are unique, you can also use a Dictionary:
Dictionary<string, double> firstFile = ...
Dictionary<string, double> secondFile = ...
And in this case use this LINQ statement:
var result = firstFile
    .Where(x => secondFile.TryGetValue(x.Key, out var value) && value == x.Value)
    .ToDictionary(x => x.Key, x => x.Value);
Each TryGetValue call is a hash lookup, so this runs in roughly O(m+n), including building the dictionaries (m and n being the row counts of the two files). Intersect is hash-based as well, so with a sensible GetHashCode the upper solution scales similarly; only searching the whole second list for every row of the first would be O(m*n).
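To make the dictionary approach concrete end to end, here is a minimal sketch that parses "KEY,VALUE" lines and keeps the rows whose key and value both match. The names CsvMatchDemo, Parse and Matches are hypothetical, and the parsing assumes well-formed lines, unique keys per file, and a dot as the decimal separator.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class CsvMatchDemo
{
    // Parses lines of the form "KEY,VALUE" into a dictionary.
    // Assumption: keys are unique within one file; a duplicate key makes Add throw.
    public static Dictionary<string, double> Parse(IEnumerable<string> lines)
    {
        var rows = new Dictionary<string, double>();
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue; // skip blank lines
            var parts = line.Split(',');
            rows.Add(parts[0].Trim(), double.Parse(parts[1], CultureInfo.InvariantCulture));
        }
        return rows;
    }

    // Keeps the rows of `first` whose key exists in `second` with an exactly equal value.
    public static Dictionary<string, double> Matches(
        Dictionary<string, double> first, Dictionary<string, double> second)
    {
        return first
            .Where(x => second.TryGetValue(x.Key, out var other) && other == x.Value)
            .ToDictionary(x => x.Key, x => x.Value);
    }
}
```

With the sample data from the question, the result is empty: AC is the only key present in both files, and its values differ (9.69 versus 7.56). In a real program the lines could come from something like File.ReadLines(path), with path being whatever location the files actually have.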