
Getting statistics for nodes from a regression tree in the party package

Tags: r, tree, party

I am using the party package in R.

I would like to get summary statistics (mean, median, etc.) for individual nodes of the resulting tree, but I cannot see how to do this. For example:

library(party)

# Drop rows with a missing response before fitting
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))
airct
plot(airct)

results in a tree with 4 terminal nodes. How would I get the mean airquality for each of those nodes?

Peter Flom asked Dec 15 '22 15:12


1 Answer

I can't tell which variable you mean by airquality, but here is how to customize the tree plot so that each inner node shows its split variable and prediction:

# Panel function for inner nodes: draw a circle labelled with the
# split variable and the node's prediction.
innerWeights <- function(node) {
  grid.circle(gp = gpar(fill = "white", col = 1))
  mainlab <- node$psplit$variableName
  label <- paste(mainlab,
                 paste0("prediction=", round(node$prediction, 2)),
                 sep = "\n")
  grid.text(label = label, gp = gpar(col = "red"))
}

plot(airct, inner_panel = innerWeights)


Edit: to show statistics by node, draw them as a table inside each inner node:

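library(gridExtra)  # grid.table()
library(plyr)       # round_any()

# Panel function for inner nodes: show the node's test statistics
# (one per covariate) as a one-row table.
innerWeights <- function(node) {
  dat <- round_any(node$criterion$statistic, 0.01)
  grid.table(t(dat))
}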
plot(airct, inner_panel = innerWeights)
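
If you want the numbers themselves rather than a plot, here is a minimal sketch. It assumes the quantity of interest is the response, Ozone, and it relies on party's where() to return the terminal node ID of each observation in the learning sample. The names node_id, node_n, node_mean and node_median are only illustrative.

```r
library(party)

airq  <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))

# Terminal node ID for every row of airq
node_id <- where(airct)

# Summaries of the response within each terminal node
node_n      <- table(node_id)                        # observations per node
node_mean   <- tapply(airq$Ozone, node_id, mean)     # mean Ozone per node
node_median <- tapply(airq$Ozone, node_id, median)   # median Ozone per node

print(data.frame(node   = names(node_mean),
                 n      = as.vector(node_n),
                 mean   = as.vector(node_mean),
                 median = as.vector(node_median)))
```

The same tapply() pattern should work for any other column of airq (for example Temp or Wind) if that is the variable you are after.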


agstudy answered Jan 12 '23 10:01