Will pandas dataframe object work with sklearn kmeans clustering?

dataset is a pandas DataFrame, and I am clustering it with sklearn.cluster.KMeans:

    km = KMeans(n_clusters=n_Clusters)
    km.fit(dataset)
    prediction = km.predict(dataset)

This is how I decide which entity belongs to which cluster:

    for i in range(len(prediction)):
        cluster_fit_dict[dataset.index[i]] = prediction[i]

This is how dataset looks:

    A 1 2 3 4 5 6
    B 2 3 4 5 6 7
    C 1 4 2 7 8 1
    ...

where A, B, C are the index labels.

Is this the correct way of using k-means?

668

asked Jan 19 '15 02:01

Dark Knight

1 Answer

Assuming all the values in the dataframe are numeric,

    import pandas
    import sklearn.cluster

    # Convert DataFrame to a NumPy matrix
    mat = dataset.values

    # Using sklearn
    km = sklearn.cluster.KMeans(n_clusters=5)
    km.fit(mat)

    # Get cluster assignment labels
    labels = km.labels_

    # Format results as a DataFrame (one row per index label and its cluster)
    results = pandas.DataFrame([dataset.index, labels]).T
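For what it's worth, an all-numeric DataFrame can also be passed to KMeans directly, since scikit-learn converts it to an array internally. A minimal sketch, assuming a small frame shaped like the one in the question; rows D to F, the column names f1 to f6, and the choice of two clusters are made up for illustration:

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy data shaped like the question's dataset.
# Rows D-F and the column names f1..f6 are hypothetical, added for illustration.
dataset = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6],
     [2, 3, 4, 5, 6, 7],
     [1, 4, 2, 7, 8, 1],
     [10, 11, 12, 13, 14, 15],
     [11, 12, 13, 14, 15, 16],
     [10, 13, 11, 16, 17, 10]],
    index=list("ABCDEF"),
    columns=[f"f{i}" for i in range(1, 7)],
)

# Fit and get one cluster label per row; the DataFrame is accepted as-is.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(dataset)

# Key the labels by the original index instead of looping over positions.
cluster_of = pd.Series(labels, index=dataset.index, name="cluster")
cluster_fit_dict = cluster_of.to_dict()
```

Building the Series from dataset.index keeps each label tied to its row, which replaces the explicit loop in the question.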

Alternatively, you could try KMeans++ for Pandas.
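Note that scikit-learn's own KMeans already uses k-means++ seeding by default through its init parameter, so a separate package is not strictly needed for that part. A small sketch with the setting spelled out; the two-blob data and the row labels are invented purely for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical data: two well-separated blobs of 20 points each in 3 dimensions.
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(0, 1, (20, 3)),
    rng.normal(10, 1, (20, 3)),
])
dataset = pd.DataFrame(
    data,
    index=[f"row{i}" for i in range(40)],
    columns=["x", "y", "z"],
)

# init="k-means++" is the default; it is written out here only to make it visible.
km = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0).fit(dataset)

# One cluster label per row, keyed by the original index.
results = pd.DataFrame({"cluster": km.labels_}, index=dataset.index)
```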

192

answered Sep 29 '22 12:09

user666


Tags: python, pandas, cluster-analysis, k-means, scikit-learn
