I have the following code:
rows = []
for dt in new_info:
    x = dt['state']
    est = dt['estimates']
    col_R = [val['choice'] for val in est if val['party'] == 'Rep']
    col_D = [val['choice'] for val in est if val['party'] == 'Dem']
    incumb = [val['party'] for val in est if val['incumbent'] == True]
    rows.append((x, col_R, col_D, incumb))
Now I want to convert my rows list into a pandas data frame. The structure of my rows list is shown below, and the list has 32 entries.
When I convert this into a pandas data frame, every entry in the data frame is a list:
pd.DataFrame(rows, columns=["State", "R", "D", "incumbent"])
But I want my data frame like this
The new_info variable looks like this
A pandas DataFrame can be created from a list of lists by passing the list to the pandas.DataFrame() constructor. Each inner list becomes one row, and the DataFrame presents the data in tabular form, with rows and columns.
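As a minimal sketch of the point above (the sample values are borrowed from the rows used later in this answer, and the variable names data, nested, and df_nested are only illustrative), this shows a list of rows holding plain scalars, and then what happens when a cell is itself a list, which is the situation in the question:

```python
import pandas as pd

# Each inner list is one row; every value is a plain scalar.
data = [["KY", "McConnell", "Grimes"],
        ["AR", "Cotton", "Pryor"]]
df = pd.DataFrame(data, columns=["State", "R", "D"])
print(df)

# If a cell is itself a list, pandas stores the list object unchanged,
# so the column ends up holding lists rather than strings.
nested = [("KY", ["McConnell"], ["Grimes"])]
df_nested = pd.DataFrame(nested, columns=["State", "R", "D"])
print(type(df_nested.loc[0, "R"]))
```

The second DataFrame is why the wrapping lists have to be removed before (or while) the DataFrame is built.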
Since you mind the objects in the columns being lists, I would use a generator to remove the lists wrapping your items:
import pandas as pd
import numpy as np
rows = [(u'KY', [u'McConnell'], [u'Grimes'], [u'Rep']),
        (u'AR', [u'Cotton'], [u'Pryor'], [u'Dem']),
        (u'MI', [u'Land'], [u'Peters'], [])]
def get(r, nth):
    '''helper function to retrieve item from nth list in row r'''
    return r[nth][0] if r[nth] else np.nan
def remove_list_items(list_of_records):
    for r in list_of_records:
        yield r[0], get(r, 1), get(r, 2), get(r, 3)
The generator behaves like the function below, but instead of needlessly materializing an intermediate list in memory, it passes each row directly to whatever consumes the rows:
def remove_list_items(list_of_records):
    result = []
    for r in list_of_records:
        result.append((r[0], get(r, 1), get(r, 2), get(r, 3)))
    return result
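To illustrate the difference between the two versions, here is a small sketch. The names unwrap_lazy and unwrap_eager are hypothetical stand-ins for the generator and list versions above, simplified to three-field rows in which every list is non-empty:

```python
def unwrap_lazy(records):
    """Generator version: yields one unwrapped row at a time."""
    for state, rep, dem in records:
        yield state, rep[0], dem[0]

def unwrap_eager(records):
    """List version: builds every row before returning."""
    return [(state, rep[0], dem[0]) for state, rep, dem in records]

rows = [("KY", ["McConnell"], ["Grimes"]),
        ("AR", ["Cotton"], ["Pryor"])]

lazy = unwrap_lazy(rows)   # no rows have been processed yet
first = next(lazy)         # processes only the first row
rest = list(lazy)          # consumes the remaining rows
eager = unwrap_eager(rows)

print(first)
print([first] + rest == eager)
```

Both versions produce the same rows; the generator just hands them over one at a time, and it can only be consumed once.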
Then build your DataFrame by passing your data through the remove_list_items generator (or the list version, if you wish):
>>> df = pd.DataFrame.from_records(
...     remove_list_items(rows),
...     columns=["State", "R", "D", "incumbent"])
>>> df
  State          R       D incumbent
0    KY  McConnell  Grimes       Rep
1    AR     Cotton   Pryor       Dem
2    MI       Land  Peters       NaN
Or you could use a list comprehension or a generator expression (shown) to do essentially the same:
>>> df = pd.DataFrame.from_records(
...     ((r[0], get(r, 1), get(r, 2), get(r, 3)) for r in rows),
...     columns=["State", "R", "D", "incumbent"])
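Another option, not part of the approach above, is to avoid creating the wrapping lists in the first place by taking only the first match inside the original loop. The sketch below assumes new_info has the shape implied by the question's loop (a list of dicts with 'state' and 'estimates' keys); the sample data and the first_or_nan helper are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical sample shaped like the question's new_info.
new_info = [
    {"state": "KY", "estimates": [
        {"choice": "McConnell", "party": "Rep", "incumbent": True},
        {"choice": "Grimes", "party": "Dem", "incumbent": False}]},
    {"state": "MI", "estimates": [
        {"choice": "Land", "party": "Rep", "incumbent": False},
        {"choice": "Peters", "party": "Dem", "incumbent": False}]},
]

def first_or_nan(values):
    """Return the first item of an iterable, or NaN if it is empty."""
    return next(iter(values), np.nan)

rows = []
for dt in new_info:
    est = dt["estimates"]
    rows.append((
        dt["state"],
        first_or_nan(v["choice"] for v in est if v["party"] == "Rep"),
        first_or_nan(v["choice"] for v in est if v["party"] == "Dem"),
        first_or_nan(v["party"] for v in est if v["incumbent"]),
    ))

df = pd.DataFrame(rows, columns=["State", "R", "D", "incumbent"])
print(df)
```

Like the get helper above, this keeps only the first match per column, so it is only appropriate if each state has at most one candidate per party and at most one incumbent.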