Charting massive amounts of data

Tags:

We are currently using ZedGraph to draw a line chart of some data. The input data comes from a file of arbitrary size, therefore, we do not know what the maximum number of datapoints in advance. However, by opening the file and reading the header, we can find out how many data points are in the file.

The file format is essentially [time (double), value (double)]. However, the entries are not uniform in the time axis. There may not be any points between say t = 0 sec and t = 10 sec, but there might be 100K entires between t = 10 sec and t = 11 sec, and so on.

As an example, our test dataset file is ~2.6 GB and it has 324M points. We'd like to show the entire graph to the user and let her navigate through the chart. However, loading up 324M points to ZedGraph not only is impossible (we're on a 32-bit machine), but also not useful since there is no point of having so many points on the screen.

Using the FilteredPointList feature of ZedGraph also appears to be out of question, since that requires loading the entire data first and then performing filtering on that data.

So, unless we're missing anything, it appears that our only solution is to -somehow- decimate the data, however as we keep working on it, we're running into a lot of problems:

1- How do we decimate data that is not arriving uniformly in time?

2- Since the entire data can't be loaded into memory, any algorithm needs to work on the disk and so needs to be designed carefully.

3- How do we handle zooming in and out, especially, when the data is not uniform on the x-axis.

If data was uniform, upon initial load of the graph, we could Seek() by predefined amount of entries in the file, and choose every N other samples and feed it to ZedGraph. However, since the data is not uniform, we have to be more intelligent in choosing the samples to display, and we can't come up with any intelligent algorithm that would not have to read the entire file.

I apologize since the question does not have razor-sharp specificity, but I hope I could explain the nature and scope of our problem.

We're on Windows 32-bit, .NET 4.0.

556

asked Jan 27 '11 15:01

SomethingBetter

1 Answers

I've needed this before, and it's not easy to do. I ended up writing my own graph component because of this requirement. It turned out better in the end because I put in all the features we needed.

Basically, you need to get the range of data (min and max possible/needed index values), subdivide it into segments (let's say 100 segments), and then determine a value for each segment by some algorithm (average value, median value, etc.). Then you plot based on those summarized 100 elements. This is much faster than trying to plot millions of points :-).

So what I am saying is similar to what you are saying. You mention you do not want to plot every X element because there might be a long stretch of time (index values on the x-axis) between elements. What I am saying is that for each subdivision of data determine what is the best value, and take that as the data point. My method is index value-based, so in your example of no data between the 0 sec and 10-sec index values I would still put data points there, they would just have the same values among themselves. The point is to summarize the data before you plot it. Think through your algorithms to do that carefully, there are lots of ways to do so, choose the one that works for your application. You might get away with not writing your own graph component and just write the data summarization algorithm.

128

answered Oct 02 '22 18:10

Gabriel Magana

Related questions
                            
                                Order of regular expression operator (..|.. ... ..|..)
                            
                                Authentication and Authorization without Entity Framework in ASP.NET 5 MVC 6
                            
                                Difference between decimal.Round and Math.Round
                            
                                Is there a TypeScript equivalent to C#'s attributes
                            
                                How to draw with .NET Core?
                            
                                Where are Entity Framework Tools in EF6 and Visual Studio 2017?
                            
                                HttpWebRequest in .NET Core 2.0 throwing 302 Found Exception
                            
                                How can I access the browsers localStorage in Blazor?
                            
                                How to implement drag and drop in Blazor?
                            
                                SelectNodes not working on stackoverflow feed
                            
                                Singleton with finalizer but not IDisposable
                            
                                Fastest serializer and deserializer with lowest memory footprint in C#?
                            
                                Batch multiple select statements when calling Oracle from ADO.NET
                            
                                C#/WPF: Make a GridViewColumn Visible=false?
                            
                                Read from a growing file in C#?
                            
                                How can I find out a COM port number of a bluetooth device in c#?
                            
                                Sending mhtml emails - C#
                            
                                How to set a proxy for Webbrowser Control without effecting the SYSTEM/IE proxy
                            
                                F# yield! operator - Implementation and possible C# equivalents
                            
                                How to use custom Controls in WPF

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Charting massive amounts of data

Tags:

c#

.net

large-files

charts

zedgraph

SomethingBetter

People also ask

1 Answers

Gabriel Magana

Recent Activity

Donate For Us