I am using the kmodes Python library. Can someone explain what the parameters mean?
Link: https://github.com/nicodv/kmodes#huang97
km = kmodes.KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)
I know n_clusters is the number of clusters to group the data into, but what are the other parameters?
From the source code:
Parameters
-----------
n_clusters : int, optional, default: 8
    The number of clusters to form as well as the number of
    centroids to generate.

max_iter : int, default: 300
    Maximum number of iterations of the k-modes algorithm for a
    single run.

cat_dissim : func, default: matching_dissim
    Dissimilarity function used by the algorithm for categorical variables.
    Defaults to the matching dissimilarity function.

init : {'Huang', 'Cao', 'random' or an ndarray}, default: 'Cao'
    Method for initialization:
    'Huang': Method in Huang [1997, 1998]
    'Cao': Method in Cao et al. [2009]
    'random': choose 'n_clusters' observations (rows) at random from
    data for the initial centroids.
    If an ndarray is passed, it should be of shape (n_clusters, n_features)
    and gives the initial centroids.

n_init : int, default: 10
    Number of time the k-modes algorithm will be run with different
    centroid seeds. The final results will be the best output of
    n_init consecutive runs in terms of cost.

verbose : int, optional
    Verbosity mode.
So init is just the method used to pick the initial centroids, while n_init is the number of times the algorithm will be run from different starting centroids, with the lowest-cost result of those independent runs being kept.
verbose just controls how much progress output is printed to stdout (e.g. which run and iteration the algorithm is at).
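To make the roles of init, n_init, max_iter and verbose more concrete, here is a minimal pure-Python sketch of a k-modes-style loop. It is not the kmodes library's implementation: the helper names (column_modes, single_run, fit), the seed argument and the toy data are made up for illustration, and only the 'random' initialisation is shown rather than 'Huang' or 'Cao'.

```python
import random
from collections import Counter


def matching_dissim(a, b):
    """Count the attributes on which two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Return the most frequent value of each column (the 'mode' row)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def single_run(data, n_clusters, max_iter, rng):
    """One run: 'random' init, then at most max_iter assign/update steps."""
    # init='random': pick n_clusters rows of the data as starting centroids.
    centroids = rng.sample(data, n_clusters)
    labels = None
    for _ in range(max_iter):
        # Assign every row to the centroid it differs from the least.
        new_labels = [
            min(range(n_clusters),
                key=lambda k: matching_dissim(row, centroids[k]))
            for row in data
        ]
        if new_labels == labels:  # nothing moved, so the run has converged
            break
        labels = new_labels
        # Replace each centroid by the column-wise mode of its members.
        for k in range(n_clusters):
            members = [row for row, lab in zip(data, labels) if lab == k]
            if members:
                centroids[k] = column_modes(members)
    # Cost: total dissimilarity between rows and their assigned centroid.
    cost = sum(matching_dissim(row, centroids[lab])
               for row, lab in zip(data, labels))
    return cost, labels, centroids


def fit(data, n_clusters, n_init=10, max_iter=300, verbose=0, seed=0):
    """Repeat single_run n_init times and keep the lowest-cost result."""
    rng = random.Random(seed)  # hypothetical seed, only for repeatability
    best = None
    for run in range(n_init):
        result = single_run(data, n_clusters, max_iter, rng)
        if verbose:  # verbose only changes what is printed, not the result
            print(f"run {run + 1}/{n_init}: cost = {result[0]}")
        if best is None or result[0] < best[0]:
            best = result
    return best


# Invented toy data: each row is one observation with three categorical columns.
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("green", "large", "square"),
]

cost, labels, centroids = fit(data, n_clusters=2, n_init=5, verbose=1)
print("best cost:", cost)
print("labels:", labels)
print("centroids:", centroids)
```

In this sketch, n_init=5 means five independent runs from different random starting centroids, and only the run with the lowest cost is returned. The call in the question does the same thing in spirit, except that init='Huang' chooses the starting centroids with Huang's method instead of picking rows at random.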