I have a DataFrame with a multiindex in the columns and would like to use dictionaries to append new rows.
Let's say that each row in the DataFrame is a city. The columns contains "distance" and "vehicle". And each cell would be the percentage of the population that chooses this vehicle for this distance.
I'm constructing an index like this:
index_tuples=[]
for distance in ["near", "far"]:
for vehicle in ["bike", "car"]:
index_tuples.append([distance, vehicle])
index = pd.MultiIndex.from_tuples(index_tuples, names=["distance", "vehicle"])
Then I'm creating a dataframe:
dataframe = pd.DataFrame(index=["city"], columns = index)
The structure of the dataframe looks good. Although pandas has added Nans as default values ?
Now I would like to set up a dictionary for the new city and add it:
my_home_city = {"near":{"bike":1, "car":0},"far":{"bike":0, "car":1}}
dataframe["my_home_city"] = my_home_city
But this fails:
ValueError: Length of values does not match length of index
Here is the complete error message (pastebin)
UPDATE:
Thank you for all the good answers. I'm afraid I've oversimplified the problem in my example. Actually my index is nested with 3 levels (and it could become more).
So I've accepted the universal answer of converting my dictionary into a list of tuples. This might not be as clean as the other approaches but works for any multiindex setup.
Multi index is a list
of tuple
, we just need to modify your dict
,then we could directly assign the value
d = {(x,y):my_home_city[x][y] for x in my_home_city for y in my_home_city[x]}
df.loc['my_home_city',:]=d
df
Out[994]:
distance near far
vehicle bike car bike car
city NaN NaN NaN NaN
my_home_city 1 0 0 1
More Info
d
Out[995]:
{('far', 'bike'): 0,
('far', 'car'): 1,
('near', 'bike'): 1,
('near', 'car'): 0}
df.columns.values
Out[996]: array([('near', 'bike'), ('near', 'car'), ('far', 'bike'), ('far', 'car')], dtype=object)
You can append to you dataframe like this:
my_home_city = {"near":{"bike":1, "car":0},"far":{"bike":0, "car":1}}
dataframe.append(pd.DataFrame.from_dict(my_home_city).unstack().rename('my_home_city'))
Output:
distance near far
vehicle bike car bike car
city NaN NaN NaN NaN
my_home_city 1 0 0 1
The trick is to create the dataframe row with from_dict
then unstack
to get structure of your original dataframe with multiindex columns then rename
to get index and append
.
Or if you don't want to create the empty dataframe first you can use this method to create the dataframe with the new data.
pd.DataFrame.from_dict(my_home_city).unstack().rename('my_home_city').to_frame().T
Output:
far near
bike car bike car
my_home_city 0 1 1 0
Explained:
pd.DataFrame.from_dict(my_home_city)
far near
bike 0 1
car 1 0
Now, let's unstack to create multiindex and get to that new dataframe into the structure of the original dataframe.
pd.DataFrame.from_dict(my_home_city).unstack()
far bike 0
car 1
near bike 1
car 0
dtype: int64
We use rename to give that series a name which becomes the index label of that dataframe row when appended to the original dataframe.
far bike 0
car 1
near bike 1
car 0
Name: my_home_city, dtype: int64
Now if you converted that series to a frame and transposed it would look very much like a new row, however, there is no need to do this because, Pandas does intrinsic data alignment, so appending this series to the dataframe will auto-align and add the new dataframe record.
dataframe.append(pd.DataFrame.from_dict(my_home_city).unstack().rename('my_home_city'))
distance near far
vehicle bike car bike car
city NaN NaN NaN NaN
my_home_city 1 0 0 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With