
How to fix "column has values of Type which is not the same as an earlier observed Type"

Tags: c#, ml.net

I am building a machine learning model. I want to read values from a text file and process them with a CustomMapping, but the program throws a System.InvalidOperationException when the CustomMapping runs.

I have narrowed the cause down to my CustomMapping function. The text file I am reading from has no empty values, and I have double-checked that every variable is declared with the correct type. My hunch is that the custom mapping is interpreting 1s and 0s as boolean values rather than floats, although I see no reason for it to do that.

Apologies for the massive code dump. The question is about type issues, so I figured it was important to show everything.

My pipeline:

var pipeline = context.Transforms.CustomMapping<ProfileInput, ProfileProcess>(ProfileMapping.Transform, nameof(ProfileMapping))
.Append(context.Transforms.Concatenate("Features", "isBanned", "profileVisibility", "profileConfigured", "lastLogOff", "commentPermission", "timeCreated", "friendCount", "gameBannedFriendsCount", "vacBannedFriendsCount", "gameBannedFriendsPercent", "vacBannedFriendsPercent"));

My CustomMapping:

public static void Transform(ProfileInput input, ProfileProcess output)
{
  if (input.numberGameBans > 0 || input.numberVacBans > 0)
    output.isBanned = false;

  output.gameBannedFriendsPercent = input.gameBannedFriendsCount / input.friendCount;
  output.vacBannedFriendsPercent = input.vacBannedFriendsCount / input.friendCount;
  output.profileVisibility = input.profileVisibility;
  output.profileConfigured = input.profileConfigured;
  output.lastLogOff = input.lastLogOff;
  output.commentPermission =  input.commentPermission;
  output.timeCreated = input.timeCreated;
  output.friendCount = input.friendCount;
  output.gameBannedFriendsCount = input.gameBannedFriendsCount;
  output.vacBannedFriendsCount = input.vacBannedFriendsCount;
}

ProfileInput :

public class ProfileInput
{
  [LoadColumn(0)]
  public bool commentPermission;
  [LoadColumn(1)]
  public float lastLogOff;
  [LoadColumn(2)]
  public bool profileConfigured;
  [LoadColumn(3)]
  public float profileVisibility;
  [LoadColumn(4)]
  public float timeCreated;
  [LoadColumn(5)]
  public float numberVacBans;
  [LoadColumn(6)]
  public float numberGameBans;
  [LoadColumn(7)]
  public float vacBannedFriendsCount;
  [LoadColumn(8)]
  public float gameBannedFriendsCount;
  [LoadColumn(9)]
  public float friendCount;
}

ProfileProcess :

public class ProfileProcess
{
  public bool isBanned;
  public float profileVisibility;
  public bool profileConfigured;
  public float lastLogOff;
  public bool commentPermission;
  public float timeCreated;
  public float friendCount;
  public float gameBannedFriendsCount;
  public float vacBannedFriendsCount;
  public float gameBannedFriendsPercent;
  public float vacBannedFriendsPercent;
}

When running pipeline.Fit() I get the following exception:

System.InvalidOperationException: 'Column 'profileVisibility' has values of R4which is not the same as earlier observed type of Bool.'

I expected the call to complete without throwing and to return a TransformerChain model. I understand that the pipeline has no trainer yet, so the model would not be useful as it stands.

MouseAndKeyboard asked Dec 17 '22 18:12


2 Answers

context.Transforms.Concatenate only concatenates columns of the same type. The type is defined by the first input column, which in your case is "isBanned". Since that is a bool, Concatenate expects every following column to be a bool too, and it fails on the first float column it meets, "profileVisibility".

If you are going to concatenate the columns without any other pre-processing, you can load them directly as floats (0/1) rather than as booleans.
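A minimal sketch of that change, under a few assumptions that are not confirmed by the question: the file stores the boolean columns as 0/1, isBanned is meant to be 1 when any ban exists (the original mapping assigns false in that branch), and a profile with no friends should report 0 rather than NaN for the percentages. The [LoadColumn] attributes are left out here only so the snippet compiles without ML.NET; in the real project they would stay as in the question.

```csharp
// Every field that ends up in Concatenate is a float, so the column types match.
public class ProfileInput
{
    public float commentPermission;   // was bool; assumed to be stored as 0/1
    public float lastLogOff;
    public float profileConfigured;   // was bool; assumed to be stored as 0/1
    public float profileVisibility;
    public float timeCreated;
    public float numberVacBans;
    public float numberGameBans;
    public float vacBannedFriendsCount;
    public float gameBannedFriendsCount;
    public float friendCount;
}

public class ProfileProcess
{
    public float isBanned;            // was bool
    public float profileVisibility;
    public float profileConfigured;   // was bool
    public float lastLogOff;
    public float commentPermission;   // was bool
    public float timeCreated;
    public float friendCount;
    public float gameBannedFriendsCount;
    public float vacBannedFriendsCount;
    public float gameBannedFriendsPercent;
    public float vacBannedFriendsPercent;
}

public static class ProfileMapping
{
    public static void Transform(ProfileInput input, ProfileProcess output)
    {
        // Assumption: any game or VAC ban marks the profile as banned (1), otherwise 0.
        output.isBanned = (input.numberGameBans > 0 || input.numberVacBans > 0) ? 1f : 0f;

        // Assumption: avoid 0/0 (NaN) for profiles without friends by reporting 0.
        bool hasFriends = input.friendCount > 0;
        output.gameBannedFriendsPercent = hasFriends ? input.gameBannedFriendsCount / input.friendCount : 0f;
        output.vacBannedFriendsPercent = hasFriends ? input.vacBannedFriendsCount / input.friendCount : 0f;

        // The remaining values are copied through unchanged.
        output.profileVisibility = input.profileVisibility;
        output.profileConfigured = input.profileConfigured;
        output.lastLogOff = input.lastLogOff;
        output.commentPermission = input.commentPermission;
        output.timeCreated = input.timeCreated;
        output.friendCount = input.friendCount;
        output.gameBannedFriendsCount = input.gameBannedFriendsCount;
        output.vacBannedFriendsCount = input.vacBannedFriendsCount;
    }
}
```

With these types the Concatenate call from the question should see float columns only, so the "earlier observed type of Bool" check has nothing to trip on.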

amy8374 answered Apr 12 '23 22:04


All you need to do is one-hot encode your non-float columns before concatenating them:

.Append(context.Transforms.Categorical.OneHotEncoding(outputColumnName: "isBannedEncoded", inputColumnName: "isBanned"))
.Append(context.Transforms.Categorical.OneHotEncoding(outputColumnName: "profileConfiguredEncoded", inputColumnName: "profileConfigured"))
.Append(context.Transforms.Categorical.OneHotEncoding(outputColumnName: "commentPermissionEncoded", inputColumnName: "commentPermission"))

.Append(context.Transforms.Concatenate("Features", "isBannedEncoded", "profileVisibility", "profileConfiguredEncoded", "lastLogOff", "commentPermissionEncoded", "timeCreated", "friendCount", "gameBannedFriendsCount", "vacBannedFriendsCount", "gameBannedFriendsPercent", "vacBannedFriendsPercent"));
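If the bool fields stay as they are in the question, a possible alternative to one-hot encoding is to cast them to floats with Conversion.ConvertType before concatenating. The sketch below is not part of either answer; the FeatureSteps class and the "...Float" column names are hypothetical, and it assumes the Microsoft.ML package is referenced.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

public static class FeatureSteps
{
    // Hypothetical helper: builds only the steps that come after the CustomMapping.
    public static IEstimator<ITransformer> Build(MLContext context)
    {
        // Each bool column is converted to a Single (float) column under a new name.
        var boolColumns = new[]
        {
            new InputOutputColumnPair("isBannedFloat", "isBanned"),
            new InputOutputColumnPair("profileConfiguredFloat", "profileConfigured"),
            new InputOutputColumnPair("commentPermissionFloat", "commentPermission"),
        };

        // Every column passed to Concatenate is now a float, so the types agree.
        return context.Transforms.Conversion.ConvertType(boolColumns, DataKind.Single)
            .Append(context.Transforms.Concatenate("Features",
                "isBannedFloat", "profileVisibility", "profileConfiguredFloat",
                "lastLogOff", "commentPermissionFloat", "timeCreated", "friendCount",
                "gameBannedFriendsCount", "vacBannedFriendsCount",
                "gameBannedFriendsPercent", "vacBannedFriendsPercent"));
    }
}
```

It would be appended to the CustomMapping estimator in place of the original Concatenate step, for example with .Append(FeatureSteps.Build(context)). The practical difference is that a converted column stays a single 0/1 value, whereas one-hot encoding typically produces one slot per distinct value.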

Hope that helps

Judson answered Apr 12 '23 23:04