
Converting dictionary with values in List to Pandas DataFrame

I have a dictionary with city names as keys, where each city maps to a list of dates. For example:

{
'A':['2017-01-02','2017-01-03'],
'B':['2017-02-02','2017-02-03','2017-02-04','2017-02-05'],
'C':['2016-02-02']
}

I want to convert this into the following DataFrame with two columns:

City_Name  Date
A          2017-01-02
A          2017-01-03
B          2017-02-02
B          2017-02-03
B          2017-02-04
B          2017-02-05
C          2016-02-02
Nazir Ahmed asked Nov 09 '17 14:11


People also ask

How do I make a pandas DataFrame from a list of dictionaries?

Pass the list of dictionaries directly to pd.DataFrame(); each dictionary becomes one row. pd.DataFrame.from_dict(), by contrast, is documented for constructing a DataFrame from a dict of array-likes or dicts.
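
As a minimal sketch of building a DataFrame from a list of dictionaries, the list can be handed straight to the constructor. The records below are hypothetical sample data, not taken from the question.

```python
import pandas as pd

# Hypothetical sample records; each dictionary becomes one row.
records = [
    {"City_Name": "A", "Date": "2017-01-02"},
    {"City_Name": "B", "Date": "2017-02-02"},
]

# The dictionary keys become the column names.
df_records = pd.DataFrame(records)
print(df_records)
```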

Can we create DataFrame from dictionary of lists?

Yes. pd.DataFrame() accepts a dictionary of lists and turns each key into a column, provided all the lists have the same length. Lists of unequal length, as in the question above, need to be reshaped first.
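
The sketch below illustrates how the constructor treats a dictionary of lists: equal-length lists become columns, while lists of different lengths raise a ValueError. The variable names and the shortened sample data are assumptions made for illustration.

```python
import pandas as pd

# Equal-length lists: each key becomes a column.
equal = {"City_Name": ["A", "A"], "Date": ["2017-01-02", "2017-01-03"]}
df_equal = pd.DataFrame(equal)
print(df_equal)

# Lists of different lengths cannot be aligned into columns directly.
ragged = {"A": ["2017-01-02", "2017-01-03"], "C": ["2016-02-02"]}
try:
    pd.DataFrame(ragged)
    ragged_failed = False
except ValueError as exc:
    ragged_failed = True
    print("ValueError:", exc)
```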

Can a dictionary of dictionaries be used to create a pandas DataFrame?

Yes. pd.DataFrame() accepts a dictionary of dictionaries: the outer keys become the columns and the inner keys become the index.
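
A small sketch of the dictionary-of-dictionaries case, using hypothetical data: the outer keys end up as columns and the inner keys as the row index.

```python
import pandas as pd

# Hypothetical nested data: outer keys -> columns, inner keys -> index labels.
nested = {
    "low": {"A": 1, "B": 2},
    "high": {"A": 10, "B": 20},
}

df_nested = pd.DataFrame(nested)
print(df_nested)
```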


2 Answers

Alternatively, use melt. Wrapping each list in a Series pads the shorter lists with NaN, which dropna() then removes:

pd.DataFrame({k: pd.Series(v) for k, v in d.items()}).melt().dropna()
Out[51]: 
  variable       value
0        A  2017-01-02
1        A  2017-01-03
4        B  2017-02-02
5        B  2017-02-03
6        B  2017-02-04
7        B  2017-02-05
8        C  2016-02-02
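
The melt output above uses the default column names variable and value. As a hedged sketch, the var_name and value_name parameters of melt can produce the City_Name and Date columns the question asks for; the trailing reset_index(drop=True) is an optional addition that renumbers the rows after dropna().

```python
import pandas as pd

d = {
    "A": ["2017-01-02", "2017-01-03"],
    "B": ["2017-02-02", "2017-02-03", "2017-02-04", "2017-02-05"],
    "C": ["2016-02-02"],
}

# Wrapping each list in a Series lets pandas pad the shorter columns with NaN.
wide = pd.DataFrame({k: pd.Series(v) for k, v in d.items()})

# Name the melted columns directly, drop the NaN padding, renumber the rows.
tidy = (
    wide.melt(var_name="City_Name", value_name="Date")
    .dropna()
    .reset_index(drop=True)
)
print(tidy)
```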

A way inspired by piR (note that variable here holds the position within each list, not the city name):

pd.Series(d).apply(pd.Series).melt().dropna()
Out[142]: 
    variable       value
0          0  2017-01-02
1          0  2017-02-02
2          0  2016-02-02
3          1  2017-01-03
4          1  2017-02-03
7          2  2017-02-04
10         3  2017-02-05
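
Since the output above loses the city names, here is a sketch of a related Series-based approach that keeps them. It swaps in Series.explode, which is not part of the original answer and assumes pandas 0.25 or later.

```python
import pandas as pd

d = {
    "A": ["2017-01-02", "2017-01-03"],
    "B": ["2017-02-02", "2017-02-03", "2017-02-04", "2017-02-05"],
    "C": ["2016-02-02"],
}

# pd.Series(d) holds one list per city; explode() gives each date its own row
# and repeats the city label in the index.
exploded = (
    pd.Series(d)
    .explode()
    .rename_axis("City_Name")   # name the index so it becomes a named column
    .reset_index(name="Date")   # move the index into a column, name the values
)
print(exploded)
```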
BENY answered Sep 30 '22 10:09


Use numpy.repeat to repeat each key once per item in its list:

import numpy as np
import pandas as pd

# length of each list
a = [len(x) for x in d.values()]
# flatten the lists of dates
b = [i for s in d.values() for i in s]
df = pd.DataFrame({'City_Name': np.repeat(list(d.keys()), a), 'Date': b})
print(df)

  City_Name        Date
0         C  2016-02-02
1         B  2017-02-02
2         B  2017-02-03
3         B  2017-02-04
4         B  2017-02-05
5         A  2017-01-02
6         A  2017-01-03

Another solution, similar to Danh Pham's (credit to him):

df = pd.DataFrame([(i, day) for i,j in d.items() for day in j], 
                  columns=['City_Name','Date'])
print(df)

  City_Name        Date
0         C  2016-02-02
1         B  2017-02-02
2         B  2017-02-03
3         B  2017-02-04
4         B  2017-02-05
5         A  2017-01-02
6         A  2017-01-03
jezrael answered Sep 30 '22 10:09