
Put a 2d Array into a Pandas Series

I have a 2D NumPy array that I would like to put in a pandas Series (not a DataFrame), so that each row of the array becomes one element of the Series:

>>> import pandas as pd
>>> import numpy as np
>>> a = np.zeros((5, 2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])

But this throws an error:

>>> s = pd.Series(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/miniconda/envs/pyspark/lib/python3.4/site-packages/pandas/core/series.py", line 227, in __init__
    raise_cast_failure=True)
  File "/miniconda/envs/pyspark/lib/python3.4/site-packages/pandas/core/series.py", line 2920, in _sanitize_array
    raise Exception('Data must be 1-dimensional')
Exception: Data must be 1-dimensional
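For reference, the failure can be reproduced in a short script. This is only a minimal sketch: it catches the broad `Exception` class because the traceback above shows a bare `Exception`, and other pandas versions are assumed to raise a subclass of it (such as `ValueError`) for the same input.

```python
import numpy as np
import pandas as pd

a = np.zeros((5, 2))

# pd.Series only accepts 1-dimensional data, so a (5, 2) array is rejected.
try:
    pd.Series(a)
    raised = False
except Exception as exc:  # broad on purpose; the exact class may vary by version
    raised = True
    print(type(exc).__name__, exc)
```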

It is possible with a hack:

>>> s = pd.Series(map(lambda x:[x], a)).apply(lambda x:x[0])
>>> s
0    [0.0, 0.0]
1    [0.0, 0.0]
2    [0.0, 0.0]
3    [0.0, 0.0]
4    [0.0, 0.0]

Is there a better way?

zemekeneng asked Aug 09 '16 00:08



2 Answers

You can use the numpy.ndarray.tolist method, like so:

>>> a = np.zeros((5,2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
>>> a.tolist()
[[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
>>> pd.Series(a.tolist())
0    [0.0, 0.0]
1    [0.0, 0.0]
2    [0.0, 0.0]
3    [0.0, 0.0]
4    [0.0, 0.0]
dtype: object
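The same steps as a standalone script, as a rough sketch. The variable names `s_lists` and `round_trip` are illustrative only; the round trip at the end is an added check, not part of the original answer.

```python
import numpy as np
import pandas as pd

a = np.zeros((5, 2))

# a.tolist() converts the array to nested Python lists, so the Series
# holds one list per row and has object dtype.
s_lists = pd.Series(a.tolist())
print(s_lists.dtype)
print(type(s_lists.iloc[0]))

# The 2D array can be rebuilt from the Series when needed.
round_trip = np.array(s_lists.tolist())
print(round_trip.shape)
```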

EDIT:

A potentially faster way to get a similar result is pd.Series(list(a)). This makes a Series of NumPy arrays instead of Python lists, so it should be faster than a.tolist(), which has to convert every row into a Python list.
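A small sketch of how the two constructions differ in what each element is. The names `s_arrays` and `s_lists` are illustrative, and the array contents are arbitrary sample values.

```python
import numpy as np
import pandas as pd

a = np.arange(10, dtype=float).reshape(5, 2)

# list(a) iterates over the first axis and yields one 1D ndarray per row.
s_arrays = pd.Series(list(a))
# a.tolist() yields one plain Python list per row.
s_lists = pd.Series(a.tolist())

print(type(s_arrays.iloc[0]))
print(type(s_lists.iloc[0]))

# Both Series have object dtype, and both can be stacked back into 2D.
restored = np.stack(s_arrays.tolist())
print(restored.shape)
```

Which element type is preferable depends on what is done with the Series afterwards: ndarray elements keep NumPy operations available per row, while list elements are plain Python objects.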

bpachev answered Oct 30 '22 13:10


In my tests,

pd.Series(list(a))

is consistently slower than

pd.Series(a.tolist())

Tested with arrays of 500,000 to 20,000,000 rows, built like this:

a = np.ones((500000,2))

Timings shown for 1,000,000 rows only:

%timeit pd.Series(list(a))
1 loop, best of 3: 301 ms per loop

%timeit pd.Series(a.tolist())
1 loop, best of 3: 261 ms per loop
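Outside IPython, a comparison like the one above can be approximated with the standard `timeit` module. This is a rough sketch, not the original benchmark: `n_rows` is a hypothetical size chosen to keep the run short, and the results will vary with hardware and with NumPy and pandas versions.

```python
import timeit

import numpy as np
import pandas as pd

n_rows = 100_000  # hypothetical size; increase to approach the tests above
a = np.ones((n_rows, 2))

# Best of 3 single runs for each construction, in seconds.
t_list = min(timeit.repeat(lambda: pd.Series(list(a)), number=1, repeat=3))
t_tolist = min(timeit.repeat(lambda: pd.Series(a.tolist()), number=1, repeat=3))

print(f"pd.Series(list(a)):    {t_list * 1000:.1f} ms")
print(f"pd.Series(a.tolist()): {t_tolist * 1000:.1f} ms")
```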

Merlin answered Oct 30 '22 12:10