
How to use a trained binary file in golearn

Tags: go

I have this training data in a CSV file:

RoomType, HasWater
bathroom, true
living_room, false
storage, false
kitchen, true
...

I am using golearn to train on this data with a decision tree (I am not sure whether a decision tree is the right choice):

func TrainData() {
...
    //Do a training-test split
    trainData, testData := base.InstancesTrainTestSplit(rawData, 0.50)
    tree := trees.NewID3DecisionTree(0.6)
...
    tree.Save("decissiontree.h")
}

So I have saved the result to this binary file.

I want to be able to query the saved model with a room type and get the probability that the room has water.

func HasWater(roomType string) float64 {
    tree := trees.NewID3DecisionTree(0.6)
    tree.Load("decissiontree.h")
    // What next??
}

I can't find any examples in golearn that show how to actually use a trained binary file. I guess I should load the file, but then what?

Sorry for the basic question. I am totally new to ML (and Go).

Joe asked Feb 11 '21 05:02


1 Answer

You have a training.csv that you used both to train the model and to test whether it works as expected. Once it did, you saved the model for later use.

training.csv

RoomType, HasWater
bathroom, true
living_room, false
storage, false
kitchen, true
...

// Read the training.csv file. The second argument says whether the file has a
// header row; with false, the first line is parsed as data.
rawData, err := base.ParseCSVToInstances("training.csv", false)
if err != nil {
    panic(err)
}
// Split into training and test data
trainData, testData := base.InstancesTrainTestSplit(rawData, 0.50)
// Use trainData to train the model
...
// Check that the model works as expected
predictions, err := tree.Predict(testData)
...
// Save the model for later use
tree.Save("decissiontree.h")

Now the model is trained and working as expected. To check whether it also works on unknown data, different from the training and test data, load the previously saved binary and use it to predict on that data.

unknown.csv

RoomType, HasWater
ocean, true
truck, false
car, false
lake, true
...
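
// Load the previously saved model
tree := &trees.DecisionTreeNode{}
if err := tree.Load("decissiontree.h"); err != nil {
  panic(err)
}

unknownTestData, err := base.ParseCSVToInstances("unknown.csv", false)
if err != nil {
  panic(err)
}

predictions, err := tree.Predict(unknownTestData)
if err != nil {
  panic(err)
}
fmt.Println(predictions)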
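
Since unknown.csv also carries the true HasWater column, the predicted labels can be compared against it. The following is a minimal plain-Go sketch of that comparison. It does not use golearn: the helper name accuracy and the hard-coded label slices are assumptions made for illustration, and in practice the labels would be read out of the prediction result.

```go
package main

import (
	"errors"
	"fmt"
)

// accuracy returns the fraction of positions where predicted and actual agree.
// Hypothetical helper for illustration, not part of golearn.
func accuracy(predicted, actual []string) (float64, error) {
	if len(predicted) != len(actual) {
		return 0, errors.New("predicted and actual have different lengths")
	}
	if len(actual) == 0 {
		return 0, errors.New("no rows to compare")
	}
	correct := 0
	for i := range actual {
		if predicted[i] == actual[i] {
			correct++
		}
	}
	return float64(correct) / float64(len(actual)), nil
}

func main() {
	// Illustrative labels only: actual mirrors the four unknown.csv rows
	// shown above; predicted is made up so that one row disagrees.
	actual := []string{"true", "false", "false", "true"}
	predicted := []string{"true", "false", "true", "true"}

	acc, err := accuracy(predicted, actual)
	if err != nil {
		panic(err)
	}
	fmt.Printf("accuracy: %.2f\n", acc)
}
```

A low value here on rows the model has never seen is a hint that the model memorized the training rows rather than learning anything that generalizes.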

Example of creating the data yourself and passing it to the model for prediction:

// Create a new, empty DenseInstances
newInst := base.NewDenseInstances()

// Create some Attributes 
attrs := make([]base.Attribute, 2)
attrs[0] = new(base.CategoricalAttribute)
attrs[0].SetName("0")
attrs[1] = new(base.CategoricalAttribute)
attrs[1].SetName("1")

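// Add the attributes
newSpecs := make([]base.AttributeSpec, len(attrs))
newSpecs[0] = newInst.AddAttribute(attrs[0])
newSpecs[1] = newInst.AddAttribute(attrs[1])

// Mark the last column as the class attribute, as ParseCSVToInstances does
newInst.AddClassAttribute(attrs[1])

// Allocate four rows
newInst.Extend(4)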

newInst.Set(newSpecs[0], 0, newSpecs[0].GetAttribute().GetSysValFromString(strings.TrimSpace("RoomType")))
newInst.Set(newSpecs[1], 0, newSpecs[1].GetAttribute().GetSysValFromString(strings.TrimSpace("HasWater")))

newInst.Set(newSpecs[0], 1, newSpecs[0].GetAttribute().GetSysValFromString(strings.TrimSpace("ocean")))
newInst.Set(newSpecs[1], 1, newSpecs[1].GetAttribute().GetSysValFromString(strings.TrimSpace("true")))
newInst.Set(newSpecs[0], 2, newSpecs[0].GetAttribute().GetSysValFromString(strings.TrimSpace("truck")))
newInst.Set(newSpecs[1], 2, newSpecs[1].GetAttribute().GetSysValFromString(strings.TrimSpace("false")))
newInst.Set(newSpecs[0], 3, newSpecs[0].GetAttribute().GetSysValFromString(strings.TrimSpace("lake")))
newInst.Set(newSpecs[1], 3, newSpecs[1].GetAttribute().GetSysValFromString(strings.TrimSpace("true")))

predictions, err := tree.Predict(newInst)

Here I only used CategoricalAttribute, but golearn offers other attribute types as well, such as FloatAttribute and BinaryAttribute.

Note: unknown.csv contains data that appears in neither the training nor the test data, so it can be used to check how well the model performs on unseen data.

Chandan answered Oct 18 '22 21:10