
Architecture behind TeaFiles and the TeaHouse charting library?

Tags: c#, time-series

I came across an open-source .NET library, TeaFiles.Net, which handles time series storage and retrieval. The proprietary product TeaHouse can chart such time series. Is TeaHouse also available as source code, either as open source or under a paid license? I am mainly interested in the technique that lets it load only the data points visible in the current chart view, and in how to implement something similar.

I want to build something comparable, so I would like to know whether anyone has come across similar technology or knows whether the paid TeaHouse license includes the source code.

Matt asked Oct 03 '22 20:10


1 Answer

I am currently developing a trending solution based on the ZedGraph library, and I am using TeaFiles to cache a huge amount of data that comes from a database.

I don't know exactly what technology is behind the TeaHouse solution, but I use the following approach to display only the points that fall between two dates, out of a huge amount of data stored in a TeaFile.

The ZedGraph library has a FilteredPointList class that decimates data points automatically. Its SetBounds method lets you choose the range of dates to display and the maximum number of points to show. Normally, that maximum corresponds to the actual width of your view.
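To make the decimation idea concrete, here is a minimal sketch of the index mapping that this kind of filtering relies on. It is not ZedGraph code: the class and method names (DecimationSketch, MapIndex, VisibleCount) are made up for illustration, and it assumes the visible range has already been resolved to a first and last source index.

```csharp
using System;

public static class DecimationSketch
{
    // Maps a display index (0 .. maxPts - 1) to a source index in [first, last].
    // If the visible range holds more points than maxPts, points are skipped at
    // a uniform stride; otherwise the index is only offset by 'first'.
    public static int MapIndex(int displayIndex, int first, int last, int maxPts)
    {
        int nPts = last - first + 1;
        if (nPts > maxPts)
            return first + (int)((double)displayIndex * nPts / maxPts);
        return first + displayIndex;
    }

    // Number of points actually handed to the chart for the visible range.
    public static int VisibleCount(int first, int last, int maxPts)
    {
        return Math.Min(last - first + 1, maxPts);
    }
}
```

For example, with 1000 visible points (indices 100 to 1099) and room for 10, display index 9 maps to source index 1000, so the chart only ever touches 10 records.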

The FilteredPointList class (original source code) uses two arrays of double that contain the X and Y data. It is easy to adapt this class into a TeaFilePointList by replacing the arrays with a TeaFile<T> object, where T is a structure that contains a timestamp and a double value.

The implementation is not optimal, but it is where I started. I may update this code later to use the memory-mapped file feature of TeaFiles, which should make it much faster.
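As a rough idea of what memory-mapped access looks like, here is a sketch that uses only the standard System.IO.MemoryMappedFiles API. It does not use the TeaFiles API or file format: the 16-byte record layout (8-byte ticks followed by an 8-byte double, no header) and the names MappedSeriesSketch, Write and Read are assumptions made for this example.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

public static class MappedSeriesSketch
{
    // Hypothetical fixed-size record: 8-byte ticks followed by an 8-byte double.
    const int RecordSize = 16;

    // Writes the series as raw records so it can be mapped afterwards.
    public static void Write(string path, long[] ticks, double[] values)
    {
        using (var writer = new BinaryWriter(File.Create(path)))
        {
            for (int i = 0; i < ticks.Length; i++)
            {
                writer.Write(ticks[i]);
                writer.Write(values[i]);
            }
        }
    }

    // Reads record 'index' straight from the mapped view, without a stream seek.
    public static (long Ticks, double Value) Read(MemoryMappedViewAccessor view, int index)
    {
        long offset = (long)index * RecordSize;
        return (view.ReadInt64(offset), view.ReadDouble(offset + 8));
    }
}
```

The file would be opened once with MemoryMappedFile.CreateFromFile(path, FileMode.Open) and a view from CreateViewAccessor(), then Read can be called for each index the chart asks for. Random access by index is what the decimating indexer needs, which is why this layout suits it.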

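// Requires references to the ZedGraph and TeaFiles.Net libraries
using System;
using System.IO;
using TeaTime;
using ZedGraph;

public class TeaFilePointList : IPointList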
{
    TeaFile<point> tf;

    private int _maxPts = -1;
    private int _minBoundIndex = -1;
    private int _maxBoundIndex = -1;

    struct point
    {
        public TeaTime.Time x;
        public double y;
    }

    public TeaFilePointList(DateTime[] x, double[] y)
    {
        tf = TeaFile<point>.Create(Path.GetRandomFileName() + ".tea");
        for (var i = 0; i < x.Length; i++)
            tf.Write(new point() { x = x[i], y = y[i] });
    }

    public void SetBounds(double min, double max, int maxPts)
    {
        _maxPts = maxPts;

        // find the index of the start and end of the bounded range

        var xmin = (DateTime)new XDate(min);
        var xmax = (DateTime)new XDate(max);

        int first = tf.BinarySearch(xmin, item => (DateTime)item.x);
        int last = tf.BinarySearch(xmax, item => (DateTime)item.x);

        // Make sure the bounded indices are legitimate
        // if BinarySearch() doesn't find the value, it returns the bitwise
        // complement of the index of the 1st element larger than the sought value

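        if (first < 0)
        {
            // ~first is the index of the first element after xmin;
            // ~(first + 1) steps back to the last element before xmin.
            // A result of -1 means xmin precedes every element, so start at 0.
            if (first == -1)
                first = 0;
            else
                first = ~(first + 1);
        }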

        if (last < 0)
            last = ~last;

        _minBoundIndex = first;
        _maxBoundIndex = last;
    }

    public int Count
    {
        get
        {
            int arraySize = (int)tf.Count;

            // Is the filter active?
            if (_minBoundIndex >= 0 && _maxBoundIndex >= 0 && _maxPts > 0)
            {
                // get the number of points within the filter bounds
                int boundSize = _maxBoundIndex - _minBoundIndex + 1;

                // limit the point count to the filter bounds
                if (boundSize < arraySize)
                    arraySize = boundSize;

                // limit the point count to the declared max points
                if (arraySize > _maxPts)
                    arraySize = _maxPts;
            }

            return arraySize;
        }
    }

    public PointPair this[int index]
    {
        get
        {
            // same test as in Count; _maxPts must be > 0 because it is used as a divisor below
            if (_minBoundIndex >= 0 && _maxBoundIndex >= 0 && _maxPts > 0)
            {
                // get number of points in bounded range
                int nPts = _maxBoundIndex - _minBoundIndex + 1;

                if (nPts > _maxPts)
                {
                    // if we're skipping points, then calculate the new index
                    index = _minBoundIndex + (int)((double)index * (double)nPts / (double)_maxPts);
                }
                else
                {
                    // otherwise, index is just offset by the start of the bounded range
                    index += _minBoundIndex;
                }
            }

            // read the record once; an out-of-range index yields a missing point
            double xVal = PointPair.Missing, yVal = PointPair.Missing;
            if (index >= 0 && index < tf.Count)
            {
                var item = tf.Items[index];
                xVal = new XDate(item.x);
                yVal = item.y;
            }

            return new PointPair(xVal, yVal, PointPair.Missing, null);
        }
    }

    public object Clone()
    {
        throw new NotImplementedException(); // I'm lazy...
    }

    public void Close()
    {
        // capture the file name before the TeaFile is closed and disposed
        var name = tf.Name;
        tf.Close();
        tf.Dispose();
        File.Delete(name);
    }
}

The hardest part was implementing a BinarySearch for TeaFile that quickly finds a record by its DateTime. I looked at the Array.BinarySearch implementation with a decompiler and wrote the extension method below:

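using System;
using System.Collections.Generic;
using TeaTime;

// Extension methods must live in a static class; the class name is arbitrary.
public static class TeaFileExtensions
{
    // Binary search over a TeaFile whose items are sorted ascending by the key
    // that 'indexer' extracts. Returns the index of a match, or the bitwise
    // complement of the index of the first item larger than 'target'.
    public static int BinarySearch<T, U>(this TeaFile<T> tf, U target, Func<T, U> indexer) where T : struct
    {
        var lo = 0;
        var hi = (int)tf.Count - 1;
        var comp = Comparer<U>.Default;

        while (lo <= hi)
        {
            // midpoint computed without overflowing int
            var median = lo + ((hi - lo) >> 1);
            var num = comp.Compare(indexer(tf.Items[median]), target);
            if (num == 0)
                return median;
            if (num < 0)
                lo = median + 1;
            else
                hi = median - 1;
        }

        return ~lo;
    }
}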
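The negative return value follows the same convention as Array.BinarySearch, which is what SetBounds relies on. As a small sketch of how such a result can be turned into usable bounds, the helpers below (BoundsSketch, FirstAtOrAfter and LastAtOrBefore are names invented for this example) decode it for a sorted list of unique keys.

```csharp
using System;

public static class BoundsSketch
{
    // Index of the first element >= target. Equals the collection length
    // when every element is smaller than the target.
    public static int FirstAtOrAfter(int searchResult)
    {
        return searchResult >= 0 ? searchResult : ~searchResult;
    }

    // Index of the last element <= target, or -1 when every element is larger.
    public static int LastAtOrBefore(int searchResult)
    {
        return searchResult >= 0 ? searchResult : ~searchResult - 1;
    }
}
```

With the sorted keys { 10, 20, 30, 40 }, Array.BinarySearch returns 2 for 30 and -3 (that is, ~2) for 25. FirstAtOrAfter(-3) then gives 2 and LastAtOrBefore(-3) gives 1, which are the two neighbours of the missing value.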

If ZedGraph does not fit your needs, this should at least give you the idea. The decimation algorithm used in the FilteredPointList class is pretty good and can be adapted to other charting setups.
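One possible adaptation, shown here only as a sketch, is to keep the minimum and maximum of each bucket instead of a single strided point, so that short spikes stay visible. This is a different technique from the uniform skipping used in the code above, and the names (MinMaxDecimationSketch, Select) are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class MinMaxDecimationSketch
{
    // Splits ys[first..last] into at most 'buckets' contiguous ranges and keeps
    // the indices of the minimum and maximum of each range.
    public static List<int> Select(double[] ys, int first, int last, int buckets)
    {
        var picked = new List<int>();
        int n = last - first + 1;
        if (n <= 0 || buckets <= 0)
            return picked;

        buckets = Math.Min(buckets, n); // every bucket holds at least one point
        for (int b = 0; b < buckets; b++)
        {
            int start = first + (int)((long)b * n / buckets);
            int end = first + (int)((long)(b + 1) * n / buckets); // exclusive
            int lo = start, hi = start;
            for (int i = start + 1; i < end; i++)
            {
                if (ys[i] < ys[lo]) lo = i;
                if (ys[i] > ys[hi]) hi = i;
            }

            // emit in index order so the curve is still drawn left to right
            picked.Add(Math.Min(lo, hi));
            if (lo != hi)
                picked.Add(Math.Max(lo, hi));
        }
        return picked;
    }
}
```

The cost is up to two points per bucket rather than one, so the bucket count would typically be about half the number of points the view can show.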

Larry answered Oct 11 '22 20:10