I was playing with Mahout and found that FileDataModel accepts data in the format
userId,itemId,pref (long,long,double).
I have some data which is of the format
String,long,double
What is the best/easiest method to work with this dataset on Mahout?
One way to do this is to extend FileDataModel. You'll need to override the readUserIDFromString(String value) method so that it uses some kind of resolver to do the conversion. You can use one of the implementations of IDMigrator, as Sean suggests.
For example, assuming you have an initialized MemoryIDMigrator, you could do this:
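    @Override
    protected long readUserIDFromString(String stringID) {
        // Hash the string ID to a long.
        long result = memoryIDMigrator.toLongID(stringID);
        // Remember the pair so the long can be mapped back to the string later.
        memoryIDMigrator.storeMapping(result, stringID);
        return result;
    }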
This way you can also use memoryIDMigrator to do the reverse mapping, from the long ID back to the original string. If you don't need that, you can skip the migrator and just hash the string the way Mahout's own implementation does (it's in AbstractIDMigrator).
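If you go the hash-only route, here is a minimal sketch of the idea in plain Java with no Mahout dependency. As far as I recall, AbstractIDMigrator takes an MD5 digest of the string and packs the first 8 bytes into a long; treat that detail as an assumption and check the source for your Mahout version. The class name StringIdHasher and the method toLongId are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Mahout: maps a String ID to a long
// by taking the first 8 bytes of its MD5 digest.
final class StringIdHasher {

    private StringIdHasher() {
    }

    static long toLongId(String stringId) {
        try {
            // A fresh MessageDigest per call keeps this thread-safe.
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(stringId.getBytes(StandardCharsets.UTF_8));
            long hash = 0L;
            // Pack the first 8 digest bytes into the long, most significant first.
            for (int i = 0; i < 8; i++) {
                hash = (hash << 8) | (digest[i] & 0xFFL);
            }
            return hash;
        } catch (NoSuchAlgorithmException e) {
            // Every standard Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

You would call StringIdHasher.toLongId(stringID) from readUserIDFromString instead of going through the migrator. The same string always gives the same long, so the IDs stay stable between runs, but there is no way back from the long to the string unless you store the mapping yourself, and two different strings could in principle hash to the same long.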