Passing in and returning custom data - are interfaces the right approach?

Tags: c#, interface, generics

I'm writing code in a C# library to do clustering on a (two-dimensional) dataset - essentially breaking the data up into groups or clusters. To be useful, the library needs to take in "generic" or "custom" data, cluster it, and return the clustered data.

To do this, I need to assume that each datum in the dataset being passed in has a 2D vector associated with it (in my case Lat, Lng - I'm working with co-ordinates).

My first thought was to use generic types, and pass in two lists, one list of the generic data (i.e. List<T>) and another of the same length specifying the 2D vectors (i.e. List<Coordinate>, where Coordinate is my class for specifying a lat, lng pair), where the lists correspond to each other by index. But this is quite tedious because it means that in the algorithm I have to keep track of these indices somehow.

My next thought was to use interfaces, where I define an interface

public interface IPoint
{
    // Read-only: the clustering code only needs to read the position.
    double Lat { get; }
    double Lng { get; }
}

and ensure that the data that I pass in implements this interface (i.e. I can assume that each datum passed in has a Lat and a Lng).

But this isn't really working out for me either. I'm using my C# library to cluster stops in a transit network (in a different project). The class is called Stop, and this class is also from an external library, so I can't implement the interface for that class.

What I did then was inherit from Stop, creating a class called ClusterableStop, which looks like this:

public class ClusterableStop : GTFS.Entities.Stop, IPoint
{
    public ClusterableStop(Stop stop)
    {
        Id = stop.Id;
        Code = stop.Code;
        Name = stop.Name;
        Description = stop.Description;
        Latitude = stop.Latitude;
        Longitude = stop.Longitude;
        Zone = stop.Zone;
        Url = stop.Url;
        LocationType = stop.LocationType;
        ParentStation = stop.ParentStation;
        Timezone = stop.Timezone;
        WheelchairBoarding = stop.WheelchairBoarding;
    }
    public double Lat
    {
        get
        {
            return this.Latitude;
        }
    }

    public double Lng
    {
        get
        {
            return this.Longitude;
        }
    }
}

which, as you can see, implements the IPoint interface. Now I use the constructor for ClusterableStop to first convert all Stops in the dataset to ClusterableStops, then run the algorithm and get the result back as ClusterableStops.

This isn't really what I want, because I want to do things to the Stops based on which cluster they fall in. I can't do that directly, because I've actually instantiated new objects, namely ClusterableStops, rather than working with the original Stops.

I can still achieve what I want, because I can, for example, retrieve the original objects by Id. But surely there is a much more elegant way to accomplish all of this? Is this the right way to be using interfaces? It seemed like such a simple idea - passing in and getting back custom data - but it turned out to be so complicated.


asked Jan 05 '17 13:01

Chris Marais

2 Answers

Since all you need is to associate a (latitude, longitude) pair with each element of the dataset, you could write a method that takes a delegate which produces the associated position for each datum, like this:

ClusterList Cluster<T>(IList<T> data, Func<int,Coordinate> getCoordinate) {
    for (int i = 0 ; i != data.Count ; i++) {
        T item = data[i];
        // Ask the caller for the coordinate that belongs to index i.
        Coordinate coord = getCoordinate(i);
        ...
    }
}

It is now up to the caller to decide how Coordinate is paired with each element of data.
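As a rough illustration of the index-based variant, the caller might keep two parallel lists and pass a lambda that indexes into the second one. This is only a sketch: the `Coordinate` class, the `PairUp` helper, and the sample data are hypothetical stand-ins, and the actual clustering step is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Coordinate class mentioned in the question.
public sealed class Coordinate
{
    public Coordinate(double lat, double lng) { Lat = lat; Lng = lng; }
    public double Lat { get; }
    public double Lng { get; }
}

public static class IndexBasedDemo
{
    // Walks the data and asks the delegate for the coordinate at each index.
    // A real Cluster<T> method would group the items here instead of
    // just collecting the (item, coordinate) pairs.
    public static List<KeyValuePair<T, Coordinate>> PairUp<T>(
        IList<T> data, Func<int, Coordinate> getCoordinate)
    {
        var pairs = new List<KeyValuePair<T, Coordinate>>();
        for (int i = 0; i < data.Count; i++)
        {
            pairs.Add(new KeyValuePair<T, Coordinate>(data[i], getCoordinate(i)));
        }
        return pairs;
    }

    public static List<KeyValuePair<string, Coordinate>> Run()
    {
        // Two parallel lists (made-up data): names[i] is located at coords[i].
        var names = new List<string> { "Central", "Harbour" };
        var coords = new List<Coordinate>
        {
            new Coordinate(-33.92, 18.42),
            new Coordinate(-33.90, 18.43)
        };

        // The caller, not the library, decides how an index maps to a coordinate.
        return PairUp(names, i => coords[i]);
    }
}
```

The library side never touches the second list; it only sees the delegate, so the bookkeeping of indices stays with the caller.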

Note that the association by list position is not the only option available to you. Another option is to pass a delegate that takes the item, and returns its coordinate:

ClusterList Cluster<T>(IEnumerable<T> data, Func<T,Coordinate> getCoordinate) {
    foreach (var item in data) {
        // Ask the caller for the coordinate that belongs to this item.
        Coordinate coord = getCoordinate(item);
        ...
    }
}

Although this approach is generally cleaner than the index-based one, it has a cost when the coordinates are not available on the object itself: the caller must keep some sort of associative container keyed on T, so T must either work well as a key in hash-based containers (sensible Equals and GetHashCode) or implement IComparable<T> for use in sorted containers. The first approach places no restrictions on T.
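To make that trade-off concrete, here is a hedged sketch of the item-based delegate when the positions live outside the objects. The caller keeps a `Dictionary<Depot, Coordinate>` and passes its lookup as the delegate. `Depot`, `Coordinate`, and `CollectCoordinates` are hypothetical names used only for illustration; `Depot` relies on default reference equality, which is enough for a dictionary key as long as the same instances are used for both the dictionary and the call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Coordinate class mentioned in the question.
public sealed class Coordinate
{
    public Coordinate(double lat, double lng) { Lat = lat; Lng = lng; }
    public double Lat { get; }
    public double Lng { get; }
}

// Hypothetical item type that carries no position of its own.
public sealed class Depot
{
    public Depot(string name) { Name = name; }
    public string Name { get; }
}

public static class LookupDemo
{
    // Item-based delegate: the method never sees how the lookup works.
    public static List<Coordinate> CollectCoordinates<T>(
        IEnumerable<T> data, Func<T, Coordinate> getCoordinate)
    {
        var result = new List<Coordinate>();
        foreach (var item in data)
        {
            result.Add(getCoordinate(item));
        }
        return result;
    }

    public static List<Coordinate> Run()
    {
        var north = new Depot("North");
        var south = new Depot("South");

        // The association is kept by the caller, keyed by object reference.
        var positions = new Dictionary<Depot, Coordinate>
        {
            { north, new Coordinate(-25.7, 28.2) },
            { south, new Coordinate(-26.2, 28.0) }
        };

        return CollectCoordinates(new[] { north, south }, d => positions[d]);
    }
}
```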

In your case, the second approach is preferable:

var clustered = Cluster(
    myListOfStops,
    stop => new Coordinate(stop.Latitude, stop.Longitude)
);
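For completeness, a self-contained sketch of how the item-based version could hand back the original objects. It swaps in a deliberately simple grid-bucketing rule in place of a real clustering algorithm, and it returns `List<List<T>>` because the shape of `ClusterList` is not shown above. The reduced `Stop` class, the `cellSize` parameter, and the `GridClusterer` name are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Coordinate class mentioned in the question.
public sealed class Coordinate
{
    public Coordinate(double lat, double lng) { Lat = lat; Lng = lng; }
    public double Lat { get; }
    public double Lng { get; }
}

// Hypothetical stand-in for GTFS.Entities.Stop, reduced to three members.
public sealed class Stop
{
    public string Id { get; set; }
    public double Latitude { get; set; }
    public double Longitude { get; set; }
}

public static class GridClusterer
{
    // Toy rule, not a real clustering algorithm: items whose coordinates
    // fall in the same grid cell of size cellSize (in degrees) are grouped.
    public static List<List<T>> Cluster<T>(
        IEnumerable<T> data, Func<T, Coordinate> getCoordinate, double cellSize)
    {
        return data
            .GroupBy(item =>
            {
                Coordinate c = getCoordinate(item);
                return Tuple.Create(
                    (long)Math.Floor(c.Lat / cellSize),
                    (long)Math.Floor(c.Lng / cellSize));
            })
            .Select(group => group.ToList())
            .ToList();
    }
}

public static class GridClustererDemo
{
    public static bool Run()
    {
        var a = new Stop { Id = "A", Latitude = 10.01, Longitude = 20.01 };
        var b = new Stop { Id = "B", Latitude = 10.02, Longitude = 20.02 };
        var c = new Stop { Id = "C", Latitude = 50.00, Longitude = 60.00 };

        var clusters = GridClusterer.Cluster(
            new[] { a, b, c },
            stop => new Coordinate(stop.Latitude, stop.Longitude),
            0.1);

        // A and B share a cell, C is alone, and the clusters contain the
        // very same Stop instances that were passed in (no copies).
        return clusters.Count == 2 && ReferenceEquals(clusters[0][0], a);
    }
}
```

Because the method is generic in `T` and only reads positions through the delegate, no wrapper class or interface on `Stop` is needed, and whatever is done to the clustered objects is done to the original stops.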


answered Sep 26 '22 01:09

Sergey Kalinichenko

Have you considered using tuples to do the work? Sometimes this is a useful way of associating two classes without creating a whole new class. You can create a list of tuples:

List<Tuple<Point, Stop>>

where Point is the thing you cluster on.
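A minimal sketch of the tuple idea, assuming a simple `Point` class with `Lat` and `Lng` and a reduced `Stop` class. Both are hypothetical stand-ins, as is the `TupleDemo.Pair` helper; the real `Stop` comes from the external library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical "thing you cluster on".
public sealed class Point
{
    public Point(double lat, double lng) { Lat = lat; Lng = lng; }
    public double Lat { get; }
    public double Lng { get; }
}

// Hypothetical stand-in for GTFS.Entities.Stop, reduced to three members.
public sealed class Stop
{
    public string Id { get; set; }
    public double Latitude { get; set; }
    public double Longitude { get; set; }
}

public static class TupleDemo
{
    // Pairs each stop with its point. Item1 is what the clustering code
    // reads; Item2 is the original stop, carried along untouched.
    public static List<Tuple<Point, Stop>> Pair(IEnumerable<Stop> stops)
    {
        return stops
            .Select(s => Tuple.Create(new Point(s.Latitude, s.Longitude), s))
            .ToList();
    }
}
```

After clustering on `Item1`, the original objects can be read back from `Item2` of each tuple, so no lookup by Id is required.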

answered Sep 27 '22 01:09

08Dc91wk
