I have a CSV file with over 1 million rows of data. I am planning to read them in parallel to improve efficiency. Can I do something like the following, or is there a more efficient method?
namespace ParallelData
{
    public partial class ParallelData : Form
    {
        public ParallelData()
        {
            InitializeComponent();
        }

        private static readonly char[] Separators = { ',', ' ' };

        private static void ProcessFile()
        {
            var lines = File.ReadLines("BigData.csv");
            var numbers = ProcessRawNumbers(lines);
            var rowTotal = new List<double>();
            var totalElements = 0;
            foreach (var values in numbers)
            {
                var sumOfRow = values.Sum();
                rowTotal.Add(sumOfRow);
                totalElements += values.Count;
            }
            MessageBox.Show(totalElements.ToString());
        }

        private static List<List<double>> ProcessRawNumbers(IEnumerable<string> lines)
        {
            var numbers = new List<List<double>>();
            /*System.Threading.Tasks.*/
            Parallel.ForEach(lines, line =>
            {
                lock (numbers)
                {
                    numbers.Add(ProcessLine(line));
                }
            });
            return numbers;
        }

        private static List<double> ProcessLine(string line)
        {
            var list = new List<double>();
            foreach (var s in line.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
            {
                double i;
                if (Double.TryParse(s, out i))
                {
                    list.Add(i);
                }
            }
            return list;
        }

        private void button2_Click(object sender, EventArgs e)
        {
            ProcessFile();
        }
    }
}
I'm not sure it's a good idea. Depending on your hardware, the disk read speed, not the CPU, will likely be the bottleneck.
Another point: if your storage hardware is a magnetic hard disk, then the disk read speed depends strongly on how the file is physically stored on the disk; if the file is not fragmented (i.e. all file chunks are stored sequentially on the disk), you'll get better performance if you read line by line sequentially.
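One way to check where the time actually goes on your machine is to time the read and the parsing separately. The following is only a rough sketch: the class and method names (ReadVsParseTiming, Measure) are made up for illustration, the parsing step is a simplified stand-in for the real one, and a second run may look much faster simply because the operating system has cached the file.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ReadVsParseTiming
{
    // Hypothetical helper: times a sequential read of the whole file,
    // then times a simple single-threaded split of every line.
    public static (TimeSpan Read, TimeSpan Parse, long Fields) Measure(string path)
    {
        var readTimer = Stopwatch.StartNew();
        string[] lines = File.ReadAllLines(path);
        readTimer.Stop();

        var parseTimer = Stopwatch.StartNew();
        long fields = 0;
        foreach (var line in lines)
        {
            // Stand-in for the real parsing work: just count the fields.
            fields += line.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries).Length;
        }
        parseTimer.Stop();

        return (readTimer.Elapsed, parseTimer.Elapsed, fields);
    }
}
```

If the read time dominates, parallelizing the parsing is unlikely to help much; if the parse time dominates, it may be worth it.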
One solution would be to read the whole file at once (if you have enough memory; for 1 million rows it should be OK) using File.ReadAllLines, store all the lines in a string array, then process them (i.e. parse using string.Split, etc.) in your Parallel.ForEach, if the row order is not important.
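A minimal sketch of that approach might look like the following. The names CsvSketch, ParseLine and ParseAll are made up for illustration, and parsing with the invariant culture is my assumption (the question's code uses the current culture). Instead of locking a shared list, this variant uses Parallel.For and writes each result into its own slot of a pre-sized array, so no lock is needed and the row order happens to be preserved as well.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class CsvSketch
{
    private static readonly char[] Separators = { ',', ' ' };

    // Parses one line into its numeric fields; non-numeric fields are skipped.
    public static List<double> ParseLine(string line)
    {
        var values = new List<double>();
        foreach (var field in line.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // Assumption: numbers use '.' as the decimal separator.
            if (double.TryParse(field, NumberStyles.Float, CultureInfo.InvariantCulture, out var value))
            {
                values.Add(value);
            }
        }
        return values;
    }

    // Parses all lines in parallel. Each iteration writes only to rows[i],
    // so the iterations never touch the same element and no lock is needed.
    public static List<double>[] ParseAll(string[] lines)
    {
        var rows = new List<double>[lines.Length];
        Parallel.For(0, lines.Length, i =>
        {
            rows[i] = ParseLine(lines[i]);
        });
        return rows;
    }

    public static void Main()
    {
        // One sequential read of the whole file, then CPU-bound work in parallel.
        string[] lines = File.ReadAllLines("BigData.csv");
        List<double>[] rows = ParseAll(lines);
        Console.WriteLine(rows.Sum(r => r.Count));
    }
}
```

Note that in the question's code the call to ProcessLine sits inside the lock, so only one line is parsed at a time; keeping the parsing outside any lock, as here, is what lets the work actually run in parallel.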