Dataset<Row> dataFrame = ... ;
StringIndexerModel labelIndexer = new StringIndexer()
.setInputCol("label")
.setOutputCol("indexedLabel")
.fit(dataFrame);
VectorIndexerModel featureIndexer = new VectorIndexer()
.setInputCol("s")
.setOutputCol("indexedFeatures")
.setMaxCategories(4)
.fit(dataFrame);
IndexToString labelConverter = new IndexToString()
.setInputCol("prediction")
.setOutputCol("predictedLabel")
.setLabels(labelIndexer.labels());
What is StringIndexer, VectorIndexer, IndexToString and what is the difference between them? How and When should I use them?
I know only about those two:
StringIndexer and VectorIndexer
StringIndexer:
VectorIndexer:
Take a look here for example: https://mingchen0919.github.io/learning-apache-spark/StringIndexer-and-VectorIndexer.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With