
How to convert a list into a pandas DataFrame

Tags:

python

pandas

I have the following code:

rows =[]
for dt in new_info:
    x =  dt['state']
    est = dt['estimates']

    col_R = [val['choice'] for val in est if val['party'] == 'Rep']
    col_D = [val['choice'] for val in est if val['party'] == 'Dem']

    incumb = [val['party'] for val in est if val['incumbent'] == True ]

    rows.append((x, col_R, col_D, incumb))

Now I want to convert my rows list into a pandas DataFrame. The list has 32 entries, each a tuple holding the state followed by three lists (the screenshot of its structure was not preserved).

When I convert this into a pandas DataFrame, the entries in the columns come out as lists:

pd.DataFrame(rows, columns=["State", "R", "D", "incumbent"])  


But I want each cell of the DataFrame to hold a plain value rather than a list (the screenshot of the desired output was not preserved).

The new_info variable is a sequence of dicts, each with a 'state' key and an 'estimates' list (the screenshot of it was not preserved).

Elizabeth Susan Joseph asked Jan 30 '15 01:01




1 Answer

Since you mind the objects in the columns being lists, I would use a generator to remove the lists wrapping your items:

import pandas as pd
import numpy as np
rows = [(u'KY', [u'McConnell'], [u'Grimes'], [u'Rep']),
        (u'AR', [u'Cotton'], [u'Pryor'], [u'Dem']),
        (u'MI', [u'Land'], [u'Peters'], [])]

def get(r, nth):
    '''helper function to retrieve item from nth list in row r'''
    return r[nth][0] if r[nth] else np.nan

def remove_list_items(list_of_records):
    for r in list_of_records:
        yield r[0], get(r, 1), get(r, 2), get(r, 3)

The generator behaves like the function below, but instead of materializing an intermediate list in memory, it hands each row to the consumer of the rows as it is requested:

def remove_list_items(list_of_records):
    result = []
    for r in list_of_records:
        result.append((r[0], get(r, 1), get(r, 2), get(r, 3)))
    return result
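To make the difference concrete, here is a minimal sketch of how the generator version produces rows on demand. It reuses `get` and `remove_list_items` from above with a shortened `rows` list; the variable names `gen`, `first`, `remaining` and `exhausted` are only for illustration.

```python
import numpy as np

rows = [(u'KY', [u'McConnell'], [u'Grimes'], [u'Rep']),
        (u'MI', [u'Land'], [u'Peters'], [])]

def get(r, nth):
    '''helper function to retrieve item from nth list in row r'''
    return r[nth][0] if r[nth] else np.nan

def remove_list_items(list_of_records):
    for r in list_of_records:
        yield r[0], get(r, 1), get(r, 2), get(r, 3)

gen = remove_list_items(rows)  # no row has been computed yet
first = next(gen)              # the first row is produced only now
remaining = list(gen)          # draining the generator yields the rest
exhausted = list(gen)          # empty: a generator can be consumed only once
```

One consequence worth keeping in mind: because a generator is single-use, call `remove_list_items(rows)` again if you need to iterate a second time.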

Then build your DataFrame by passing your data through the generator (or the list version, if you prefer):

>>> df = pd.DataFrame.from_records(
...     remove_list_items(rows),
...     columns=["State", "R", "D", "incumbent"])
>>> df
  State          R       D incumbent
0    KY  McConnell  Grimes       Rep
1    AR     Cotton   Pryor       Dem
2    MI       Land  Peters       NaN

Or you could use a list comprehension or a generator expression (shown) to do essentially the same thing:

>>> df = pd.DataFrame.from_records(
...     ((r[0], get(r, 1), get(r, 2), get(r, 3)) for r in rows),
...     columns=["State", "R", "D", "incumbent"])
Russia Must Remove Putin answered Sep 28 '22 05:09