I'm using ML.NET 0.7 and have a MulticlassClassification model with the following result class:
    public class TestClassOut
    {
        public string Id { get; set; }
        public float[] Score { get; set; }
        public string PredictedLabel { get; set; }
    }
I'd like to know the scores and their corresponding labels from the Score property. It feels like I should be able to make the property a Tuple<string,float> or similar to get the label that each score represents.
I understand that there was a method for this in v0.5:
    model.TryGetScoreLabelNames(out scoreLabels);
But I can't seem to find the equivalent in v0.7.
Can this be done? If so, how?
This is probably not the answer you're looking for, but I ended up copying the code from TryGetScoreLabelNames (it's in the Legacy namespace as of 0.7) and tweaking it to use the schema from my input data. The dataView below is an IDataView I created from my prediction input data so I could get the schema off of it.
    // Note: model and dataView come from the enclosing scope (see above).
    public bool TryGetScoreLabelNames(out string[] names, string scoreColumnName = DefaultColumnNames.Score)
    {
        names = null;
        Schema outputSchema = model.GetOutputSchema(dataView.Schema);

        // Locate the score column in the model's output schema.
        int col = -1;
        if (!outputSchema.TryGetColumnIndex(scoreColumnName, out col))
            return false;

        // The score vector must carry one slot name per class.
        int valueCount = outputSchema.GetColumnType(col).ValueCount;
        if (!outputSchema.HasSlotNames(col, valueCount))
            return false;

        VBuffer<ReadOnlyMemory<char>> vbuffer = new VBuffer<ReadOnlyMemory<char>>();
        outputSchema.GetMetadata<VBuffer<ReadOnlyMemory<char>>>("SlotNames", col, ref vbuffer);
        if (vbuffer.Length != valueCount)
            return false;

        // Copy the slot names out in the same order as the Score vector.
        names = new string[valueCount];
        int num = 0;
        foreach (ReadOnlyMemory<char> denseValue in vbuffer.DenseValues())
            names[num++] = denseValue.ToString();
        return true;
    }
I also asked this question in the ML.NET Gitter room (https://gitter.im/dotnet/mlnet) and got this response from Zruty0:
"my best suggestion is to convert labels to 0..(N-1) beforehand, then train, and then inspect the resulting 'Score' column. It'll be a vector of size N, with per-class scores. PredictedLabel is actually just argmax(Score), and you can get the 2nd and other candidates by sorting Score"
If you have a static set of classes this might be a better option, but my situation has an ever-growing set of classes.
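To make the "sort Score" suggestion concrete, here is a minimal sketch in plain C# with no ML.NET calls. The class and method names (ScoreRanking, RankClasses) are hypothetical, and it assumes labels were mapped to 0..(N-1) before training, so that index i of the score vector corresponds to class i.

```csharp
using System;
using System.Linq;

public static class ScoreRanking
{
    // Hypothetical helper: returns class indices ordered from the
    // highest score to the lowest. Assumes index i of the score vector
    // corresponds to class i (labels mapped to 0..(N-1) before training).
    public static int[] RankClasses(float[] scores)
    {
        return Enumerable.Range(0, scores.Length)
            .OrderByDescending(i => scores[i])
            .ToArray();
    }
}
```

With this, RankClasses(prediction.Score)[0] would be the argmax (the predicted class index), and the following entries would be the runner-up candidates in order.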
This was asked a while ago, but I think that this is still a very relevant question that surprisingly has not got a lot of traction and isn't mentioned (as of the time of writing) in any of the Microsoft ML.NET tutorials. The sample code above needs a bit of tweaking to get it to work with v1.5 (preview), so I thought I'd post how I got it working for anyone else who stumbles across this.
In ConsumeModel.cs (assuming you're using the Model Builder in Visual Studio):
    ...
    // Use model to make prediction on input data
    ModelOutput result = predEngine.Predict(input);

    // Read the label names stored as key values on the "label" column
    var labelNames = new List<string>();
    var column = predEngine.OutputSchema.GetColumnOrNull("label");
    if (column.HasValue)
    {
        VBuffer<ReadOnlyMemory<char>> vbuffer = new VBuffer<ReadOnlyMemory<char>>();
        column.Value.GetKeyValues(ref vbuffer);
        foreach (ReadOnlyMemory<char> denseValue in vbuffer.DenseValues())
            labelNames.Add(denseValue.ToString());
    }
    ...
The end result is that labelNames is now a parallel collection to result.Score. Just keep in mind that changes to the generated files could get overwritten if you rebuild the model using Model Builder.
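Once the label names and the scores are parallel collections, they can be combined into the (label, score) pairs the question asked for. The following is only a sketch in plain C#: the LabelScores class and its Pair method are hypothetical names, not part of ML.NET, and it assumes both collections have the same length and ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LabelScores
{
    // Hypothetical helper: pairs each label name with its score and
    // returns the pairs ordered from the highest score to the lowest.
    // Assumes labelNames[i] is the label for scores[i].
    public static List<(string Label, float Score)> Pair(
        IReadOnlyList<string> labelNames, IReadOnlyList<float> scores)
    {
        if (labelNames.Count != scores.Count)
            throw new ArgumentException("labelNames and scores must have the same length.");

        return labelNames
            .Zip(scores, (label, score) => (Label: label, Score: score))
            .OrderByDescending(p => p.Score)
            .ToList();
    }
}
```

For example, LabelScores.Pair(labelNames, result.Score) would give a list whose first element holds the predicted label and its score, followed by the remaining candidates.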