
How to set costs matrix for C5.0 Package in R?

I have searched the web extensively but cannot find a useful description of the 'costs' parameter of the C5.0 function in R. The C5.0 R manual only says "a matrix of costs associated with the possible errors. The matrix should have C columns and rows where C is the number of class levels". It does not say whether the rows or the columns hold the result predicted by the model.

Can anyone help?

bourneli asked Feb 15 '23 14:02



1 Answer

Here is a quote from the help page of C5.0 (version 0.1.0-15):

The cost matrix should by CxC, where C is the number of classes. Diagonal elements are ignored. Columns should correspond to the true classes and rows are the predicted classes. For example, if C = 3 with classes Red, Blue and Green (in that order), a value of 5 in the (2,3) element of the matrix would indicate that the cost of predicting a Green sample as Blue is five times the usual value (of one).
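The single-entry example in that quote can be written out as a small sketch. The names `classes` and `costs`, and the `predicted`/`true` labels on the dimnames, are illustrative assumptions and do not come from the help page:

```r
# Classes in the order used by the help-page example.
classes <- c("Red", "Blue", "Green")

# Start from the usual cost of 1 for every cell (the diagonal is ignored).
# The dimnames labels "predicted" and "true" are only there for readability.
costs <- matrix(1, nrow = 3, ncol = 3,
                dimnames = list(predicted = classes, true = classes))

# Element (2, 3): row 2 = predicted Blue, column 3 = true Green.
costs[2, 3] <- 5

costs["Blue", "Green"]   # 5
```

Indexing by name makes the orientation explicit: the first index is the predicted class, the second is the true class.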

Following the example in the help page, this would be a cost matrix:

cost.matrix <- matrix(c(
  NA, 2, 4,
  3, NA, 5,
  7, 1, NA
), 3, 3, byrow=TRUE)

rownames(cost.matrix) <- colnames(cost.matrix) <- c("Red", "Blue", "Green")

cost.matrix

      Red Blue Green
Red    NA    2     4
Blue    3   NA     5
Green   7    1    NA

This would mean the following:

  • Predicting a red sample as blue costs 3 times the usual value (one)
  • Predicting a red sample as green costs 7 times the usual value
  • Predicting a blue sample as red costs 2 times the usual value
  • Predicting a blue sample as green costs 1 times the usual value (no extra penalty)
  • Predicting a green sample as red costs 4 times the usual value
  • Predicting a green sample as blue costs 5 times the usual value
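The readings above can be checked by indexing the matrix as [predicted, true]. The following is a minimal sketch: `error_cost` is a hypothetical helper, and the toy data frame is made up purely to show how such a matrix might be passed to the `costs` argument. The model fit only runs if the C50 package is installed, and replacing the `NA` diagonal with 0 is a cautious assumption rather than a documented requirement.

```r
classes <- c("Red", "Blue", "Green")
cost.matrix <- matrix(c(
  NA, 2, 4,
   3, NA, 5,
   7, 1, NA
), 3, 3, byrow = TRUE, dimnames = list(classes, classes))

# Hypothetical helper: rows are predicted classes, columns are true classes.
error_cost <- function(costs, true, predicted) costs[predicted, true]

error_cost(cost.matrix, true = "Red",   predicted = "Blue")   # 3
error_cost(cost.matrix, true = "Green", predicted = "Blue")   # 5

# Optional: fit a cost-sensitive tree on made-up data if C50 is available.
if (requireNamespace("C50", quietly = TRUE)) {
  set.seed(1)
  toy <- data.frame(x1 = rnorm(150), x2 = rnorm(150))
  y <- factor(sample(classes, 150, replace = TRUE), levels = classes)

  # The diagonal is ignored; 0 is used here only to avoid passing NA values.
  diag(cost.matrix) <- 0

  fit <- C50::C5.0(x = toy, y = y, costs = cost.matrix)
  print(table(predicted = predict(fit, toy), true = y))
}
```

The factor levels of `y` are set in the same order as the row and column names of the matrix, so the class names line up with the cost entries.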
COOLSerdash answered Feb 18 '23 12:02
