
Turning a two-dimensional array into a two-column DataFrame in pandas

Tags:

python

pandas

If I have the following, how do I make pd.DataFrame() turn this array into a DataFrame with two columns? What's the most efficient way? My current approach involves copying each element into a Series and building DataFrames out of those.

From this:

([[u'294 (24%) L', u'294 (26%) R'],
  [u'981 (71%) L', u'981 (82%) R'],])

to

x    y
294  294
981  981

rather than

x
[u'294 (24%) L', u'294 (26%) R']

This is my current approach. I'm looking for something more efficient:

numL = pd.Series(numlist).map(lambda x: x[0])
numR = pd.Series(numlist).map(lambda x: x[1])

nL = pd.DataFrame(numL, columns=['left_num'])
nR = pd.DataFrame(numR, columns=['right_num'])

nLR = nL.join(nR)

nLR

UPDATE

I noticed that my error simply comes down to the difference between calling pd.DataFrame() on a list and calling it on a Series. When you create a DataFrame out of a Series of lists, it keeps the items of each list together in the same column. Not so with a list of lists. Passing the list directly solved my problem in the most efficient way.
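A minimal sketch of the difference described in the update, assuming `numlist` is the list of lists shown in the question. The variable names `from_list` and `from_series` are only illustrative:

```python
import pandas as pd

numlist = [[u'294 (24%) L', u'294 (26%) R'],
           [u'981 (71%) L', u'981 (82%) R']]

# A list of lists: each inner list becomes one row, so there are two columns.
from_list = pd.DataFrame(numlist, columns=['left_num', 'right_num'])

# A Series whose elements are lists: each list stays a single object,
# so the resulting DataFrame has only one column.
from_series = pd.DataFrame(pd.Series(numlist))

print(from_list.shape)    # two columns
print(from_series.shape)  # one column
```

So when the data is already a plain list of lists, passing it straight to pd.DataFrame() should avoid the intermediate Series and the join.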

user3314418 asked May 05 '14 21:05




1 Answer

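import pandas as pd

data = [[u'294 (24%) L', u'294 (26%) R'],
        [u'981 (71%) L', u'981 (82%) R']]

# Keep only the leading number of each string and convert it to int.
clean_data = [[int(item.split()[0]) for item in row] for row in data]
# clean_data: [[294, 294], [981, 981]]

# Each inner list becomes a row; list('xy') names the columns 'x' and 'y'.
pd.DataFrame(clean_data, columns=list('xy'))
#      x    y
# 0  294  294
# 1  981  981
#
# [2 rows x 2 columns]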
unutbu answered Oct 07 '22 13:10