List of List to ndarray

Tags: python, multidimensional-array, numpy, scipy, k-means

I am trying to use k-means clustering in SciPy, specifically the function documented here:

http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.vq.kmeans.html#scipy.cluster.vq.kmeans

What I am trying to do is convert a list of lists such as the following:

data_without_x = [
[0, 0, 0, 0, 0, 0, 0, 20.0, 1.0, 48.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1224.0, 125.5, 3156.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 22.5, 56.0, 41.5, 85.5, 0, 0, 0, 0, 0, 0, 0, 0, 1495.0, 3496.5, 2715.0, 5566.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]

into an ndarray in order to use it with the kmeans function. When I convert the list of lists into the ndarray, I get an empty array, which invalidates the whole analysis. The length of the ndarray is variable and depends on the number of samples gathered, but I can get it easily with len(data_without_x).

Here is a snippet of the code that returns the empty array.

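import numpy as np
from scipy.cluster.vq import kmeans, whiten
# plus the project's own modules (data_preparation, result_som)

data, data_without_x = data_preparation.generate_sampled_pdf()
nodes_stats, k, list_of_list = result_som.get_number_k()

# Convert the list of lists to a 2-D array, rescale each feature, then cluster
data_array = np.array(data_without_x)
whitened = whiten(data_array)
centroids, distortion = kmeans(whitened, int(k), iter=100000)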

This is the output I get, saved to a simple log file:

___________________________
this is the data array[[ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 ..., 
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]]
___________________________
This is the whitened array[[ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 ..., 
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]
 [ nan  nan  nan ...,  nan  nan  nan]]
___________________________

Does anybody have a clue about what happens when I try to convert the list of lists into a numpy.array?

Thanks for your help.

asked Jul 03 '13 13:07

MixturaDementiae

1 Answer

That is exactly how to convert a list of lists to an ndarray in Python. Are you sure your data_without_x is filled correctly? On my machine:

>>> import numpy as np
>>> data = [[1, 2, 3, 4], [5, 6, 7, 8]]
>>> data_arr = np.array(data)
>>> data_arr
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

This is the behavior I think you're expecting.
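One way to confirm that data_without_x is filled correctly is to look at the shape, dtype, and size of the converted array rather than at its printed form. The sketch below uses a small hypothetical list called rows as a stand-in, since the real data comes from the asker's own data_preparation module.

```python
import numpy as np

# Hypothetical stand-in for data_without_x; the real list comes from
# data_preparation.generate_sampled_pdf() in the question.
rows = [
    [0, 0, 20.0, 1.0, 48.0, 0],
    [0, 22.5, 56.0, 41.5, 85.5, 0],
]

data_array = np.array(rows)

print(data_array.shape)  # (number of samples, number of features)
print(data_array.dtype)  # numeric rows should give a float dtype
print(data_array.size)   # a size of 0 would mean the array really is empty
```

If the shape has two dimensions and the size is greater than zero, the conversion itself worked and the problem is more likely in the data or in a later step.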

Looking at your input, you have a lot of zeros. Keep in mind that the printout of a large array is abbreviated and doesn't show every element, so you may just be seeing the zeros from your input. Examine a specific non-zero element to be sure.
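A minimal sketch of that check, assuming a hypothetical 1500 x 36 array that is zero everywhere except for a few entries in the middle rows (the names rows, nonzero_positions, and constant_columns are illustrative, not from the question). It counts and locates the non-zero values, and also lists the columns whose standard deviation is zero. scipy.cluster.vq.whiten rescales each column by its standard deviation, so constant columns are worth knowing about; whether they explain the nan output in the question is an assumption, not something the thread confirms.

```python
import numpy as np

# Hypothetical stand-in for data_without_x: 1500 samples x 36 features,
# all zeros except three entries in the middle rows.
rows = [[0.0] * 36 for _ in range(1500)]
rows[700][7] = 20.0
rows[700][9] = 48.0
rows[701][8] = 41.5

data_array = np.array(rows)

# Count and locate the non-zero entries instead of relying on the
# abbreviated printout.
nonzero_count = np.count_nonzero(data_array)
nonzero_positions = np.argwhere(data_array != 0)
print(nonzero_count)
print(nonzero_positions)

# Columns with zero standard deviation hold the same value in every row.
column_std = data_array.std(axis=0)
constant_columns = np.flatnonzero(column_std == 0)
print(constant_columns)
```

If nonzero_count is greater than zero, the array is not empty even though the printed summary shows only zeros.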

answered Oct 14 '22 16:10

sedavidw
