Representing Ternary Plot Data For Lookups

Tags:

I am trying to represent the data on this ternary graph for lookup. I would like to be able to feed all three values into a function and get back which area category these values fall into. The approach I am looking at is translating each area into a polygon using Cartesian coordinates. Then translating the supplied values into a point and checking if that point lies inside which polygon. I am pretty sure that will work but it just seems like an overkill for a simple lookup?

I am asking if there is a way to represent the data on this graph in an array or an object so that a simple lookup would be enough?

295

asked Sep 20 '12 19:09

Lukasz

2 Answers

You could translate the above into the Cartesian coordinate system, but there I think you're missing the important aspect of what it would take to actually do this.

In the above diagram, you are looking at the polygons as two-dimensional structures, but you have a three-dimensional point. It's not impossible, after all, we see the triangle on a two-dimensional plane above, but the translation into Cartesian space isn't exactly straightforward.

Rather, this looks very much like a graphical representation of a decision tree. It looks like readings for the amount of clay, silt and sand were taken to develop this representation.

Basically, you have a tuple of three values, and you want to get a classification depending on those values.

In the case above, you can model the decision tree for the classification "silt" as:

// These are approximate based on above.  Each branch of the
// tree is evaluated on one value out of all the values.
if (silt >= .8)
{
    // True case.  Check sand content.
    if (sand >= .20)
    {
        // Something else, other branches.
    }
    else
    {
        // False case, can possibly be silt.
        if (clay >= .125)
        {
            // Something else, other branches.
        }
        else
        {
            // Leaf, this is a classification.
            // Can return more strongly typed classification if you want.
            return "silt";
        }            
    }

}
else
{
    // Something else, other branches.
}

Each if/else statement represents a branch in the decision tree. At each branch, you'll want to evaluate value of the variable that at that point in the branch gives you the most information gain (pretty much what it sounds like, how many classifications can you make off that split?) which is based on the entropy (or uncertainty) that making a decision on that variable has.

The trees can be generated automatically or, you can manually code it yourself. While the latter is possible, I strongly suggest an automated/code-based method to develop it. I'd strongly suggest taking a look at Accord.NET (which requires AForge.NET, both are excellent). To get you started, you should take a look at this blog post showing how to create decision trees using Accord.NET.

Whichever way you go, you'll ultimately get a function that takes the three values (clay, silt, and sand) and returns your classification, traversing the tree depending on the values of each.

Note that you aren't necessarily going to have a one-to-one set of criteria (branches) that will map to each classification (as seen in the first code sample). Depending on the number of vertices on the polygons in the example above, you will need additional branches to handle those situations.

If you have the original sample data, then you just have to run the sample data through your decision tree builder and it should create a decision tree like the one above.

If you don't have the original sample data, you can create it using the vertices from above and classifying those. For example:

silt    sand    clay    classification
----    ----    ----    --------------
   0      50     100    clay (top point)
 100       0      50    silt (right bottom point)
  50     100       0    sand (left bottom point)
  15      45      40    sandy clay OR clay cloam OR clay (depending on splits)
...

Regarding the last row (and subsequent ones), the decision tree will set a boundary based on those values, and because it's continuous it will usually make a decision based on all values greater than or equal to that value.

179

answered Oct 05 '22 22:10

casperOne

The way to do this is as you have suggested, using the 'point in polygon approach', I mean, when you do this visually, that is exactly what you are doing in your head. The problem with the accepted answer, is that whilst in this case (and simple cases) you might be able to build a decision tree, for a situation where there are many more categories, this process would become inextricably more complicated IMHO. Perhaps categories overlap, perhaps categories are only a single point, the process that I propose here, these artefacts would have no consequence to the result.

I have already covered HERE, the production of the USDA soil classification diagram, the result I attach below, based off the dataset provided in the ggtern package.

enter image description here

But what may not be so clear, is that the ggtern package has some functions that make this a little less cumbersome than it has to be. Specifically, there are some internal functions in the ggtern package (routinely used in the backend) to make the transformations necessary in order to evaluate the point in polygon truth table for each point against the reference categories.

Such approach is fairly straight forward with the ddply function from the plyr package, and, the point.in.polygon function, from the sp package.

First let us load the necessary packages, and load the USDA data from ggtern. Let us also create some sample data, testing this process for a point lying on a vertice, and a point lying in the centre of a classification region!.

library(ggtern)
library(sp)
library(plyr)

#The Main Data to lookup against
data(USDA)

#The sample Data (Try a point at a vertice, and a point in the middle...)
testData = rbind(data.frame(Clay=.4,Sand=.2,Silt=.4), #Vertice point
                 data.frame(Clay=1,Sand=1,Silt=1)/3)  #Simple middle point

I then suggest to use the internal function, transform_tern_to_cart(...), to convert both datasets to cartesian coordinates.

#Do the transformation to cartesian
USDA[,c("x","y")]     = transform_tern_to_cart(USDA$Clay,USDA$Sand,USDA$Silt)
testData[,c("x","y")] = transform_tern_to_cart(testData$Clay,testData$Sand,testData$Silt)

By using a combination of ddply(...) and apply(...), we can then test each point in the test set, against each category in the reference set via use of the point.in.polygon(...) function.

#Create a function to do the lookup
lookup <- function(data=testData,lookupdata=USDA,groupedby="Label"){  
  if(!groupedby %in% colnames(lookupdata))
    stop("Groupedby value is not a column of the lookupdata")

  #For each row in the data
  outer = apply(data[,c("x","y")],1,function(row){

    #for each groupedby in the lookupdata
    inner = ddply(lookupdata,groupedby,function(df){
      if(point.in.polygon(row[1],row[2],df$x,df$y) > 0) #Is in polygon?
        return(df) #Return a valid dataframe
      else
        return(NULL) #Return nothing
    })

    #Extract the groupedby data from the table
    inner = unique(inner[,which(colnames(inner) == groupedby)])

    #Join together in csv string and return to 'outer'
    return(paste(as.character(inner),collapse=","))
  })

  #Combine with the original data and return
  return(cbind(data,Lookups=outer))
}

Which can then be called in the following manner:

#Execute
lookup()

You will notice that the first point satisfies four (4) categories, and the second, only one (1), as to be expected.

answered Oct 05 '22 22:10

Nicholas Hamilton

Related questions
                            
                                Decimal Ternary Not Working
                            
                                Precision of double after decimal point
                            
                                Cannot implicitly convert MyType<Foo> to MyType<IFoo>
                            
                                Client-side certificate in a Metro app for Windows Azure Service Management
                            
                                Use of BAL in 3 tier architecture?How to call methods from DAL to BAL
                            
                                Throwing HttpResponseException in WebAPI action method returning empty 200 response
                            
                                How should the '\t' character be handled within XML attribute values?
                            
                                Function to clone an arbitrary object
                            
                                Can anyone recommend a way to check if a class can be serialized as XML?
                            
                                How to prevent an infinite loop in ASP.net
                            
                                Call C# webservices from javascript and consume it (json format) [closed]
                            
                                Asynchronous command execution with user confirmation
                            
                                Using polymorphism for collision detection in an elegant way
                            
                                why a # character is added to the url?
                            
                                linq nested list contains
                            
                                Conditional WHERE in LINQ
                            
                                call was rejected by callee
                            
                                Deserializing a List of Objects that contain a Dictionary
                            
                                Will a serverside redirect (HttpResponse.Redirect) hit the load balancer again?
                            
                                How can I use Generics to create a way of making an IEnumerable from an enum?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With