I am using pandas v23.4 and python 3.7.0. I have two tables which have identical shapes and columns. I'd like to set an index-selected subset of one table with the same index-selected subset of another table.
It fails to work sometimes with a string index, though I'm not sure it has to do with it being a string index. Commenting out one unused column of the dataframe causes it to work.
The traceback below shows that it's somewhere in the indexing that it gets confused in figuring out if the target and source are the same length. The code is lengthy and a bit convoluted.
$ cat foo.py
import numpy as np
import pandas as pd
m = np.array([1., 2., 1., 3., 5., 5., 6., 2., 2., 1., 7., 2.,
5., 4., 2., 5., 5., 5., 3., 8., 7., 2., 7., 6.], )
dma_l = [501, 501, 501, 501, 501, 501, 501, 501, 501, 501, 501, 501,
502, 502, 502, 502, 502, 502, 502, 502, 502, 502, 502, 502]
size_l = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4,
1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
age_l = ['20-25', '30-35', '40-45',
'20-25', '30-35', '40-45',
'20-25', '30-35', '40-45',
'20-25', '30-35', '40-45',
'20-25', '30-35', '40-45',
'20-25', '30-35', '40-45',
'20-25', '30-35', '40-45',
'20-25', '30-35', '40-45']
df = pd.DataFrame()
df['dma'] = dma_l # <-- comment out this line and it works
df['size'] = size_l
df['age'] = age_l
df['total'] = m
df2 = df.copy() # Make a second dataframe with the same shape.
# Works with an integer index.
df.set_index('size', inplace=True)
df2.set_index('size', inplace=True)
df.loc[(1,), 'total'] = df2.loc[(1,), 'total']
# Does not work with my string index. Removing the dma column
# causes it to work again.
df.set_index('age', inplace=True)
df2.set_index('age', inplace=True)
df.loc[('20-25',), 'total'] = df2.loc[('20-25',), 'total']
$ python foo.py
Traceback (most recent call last):
File "...", line 34, in <module>
df.loc[('20-25',), 'total'] = df2.loc[('20-25',), 'total']
File ".../lib/python3.7/site-packages/pandas/core/indexing.py", line 189, in __setitem__
self._setitem_with_indexer(indexer, value)
File ".../lib/python3.7/site-packages/pandas/core/indexing.py", line 606, in _setitem_with_indexer
raise ValueError('Must have equal len keys and value '
ValueError: Must have equal len keys and value when setting with an iterable
All solutions working if same number of rows and same index values in both DataFrames.
Problem is duplicated index values, solution is create boolean mask if want working with one value of index:
df2['total'] *= 10
df.loc[df.index == 1, 'total'] = df2.loc[1, 'total']
print (df)
dma age total
size
1 501 20-25 10.0
1 501 30-35 20.0
1 501 40-45 10.0
2 501 20-25 3.0
2 501 30-35 5.0
2 501 40-45 5.0
3 501 20-25 6.0
3 501 30-35 2.0
3 501 40-45 2.0
4 501 20-25 1.0
4 501 30-35 7.0
4 501 40-45 2.0
1 502 20-25 50.0
1 502 30-35 40.0
1 502 40-45 20.0
2 502 20-25 5.0
2 502 30-35 5.0
2 502 40-45 5.0
3 502 20-25 3.0
3 502 30-35 8.0
3 502 40-45 7.0
4 502 20-25 2.0
4 502 30-35 7.0
4 502 40-45 6.0
Or create new column by assign for more general solution:
df2['total'] *= 10
df['total1'] = df2['total']
#working with one DataFrame
df.loc[[1, 4], 'total'] = df.loc[[1, 4], 'total1']
print (df)
dma age total total1
size
1 501 20-25 10.0 10.0
1 501 30-35 20.0 20.0
1 501 40-45 10.0 10.0
2 501 20-25 3.0 30.0
2 501 30-35 5.0 50.0
2 501 40-45 5.0 50.0
3 501 20-25 6.0 60.0
3 501 30-35 2.0 20.0
3 501 40-45 2.0 20.0
4 501 20-25 10.0 10.0
4 501 30-35 70.0 70.0
4 501 40-45 20.0 20.0
1 502 20-25 50.0 50.0
1 502 30-35 40.0 40.0
1 502 40-45 20.0 20.0
2 502 20-25 5.0 50.0
2 502 30-35 5.0 50.0
2 502 40-45 5.0 50.0
3 502 20-25 3.0 30.0
3 502 30-35 8.0 80.0
3 502 40-45 7.0 70.0
4 502 20-25 20.0 20.0
4 502 30-35 70.0 70.0
4 502 40-45 60.0 60.0
Another solution is create mask and filter in both DataFrames:
df2['total'] *= 10
mask = df.index.isin([1,4])
df.loc[mask, 'total'] = df2.loc[mask, 'total']
print (df)
dma age total
size
1 501 20-25 10.0
1 501 30-35 20.0
1 501 40-45 10.0
2 501 20-25 3.0
2 501 30-35 5.0
2 501 40-45 5.0
3 501 20-25 6.0
3 501 30-35 2.0
3 501 40-45 2.0
4 501 20-25 10.0
4 501 30-35 70.0
4 501 40-45 20.0
1 502 20-25 50.0
1 502 30-35 40.0
1 502 40-45 20.0
2 502 20-25 5.0
2 502 30-35 5.0
2 502 40-45 5.0
3 502 20-25 3.0
3 502 30-35 8.0
3 502 40-45 7.0
4 502 20-25 20.0
4 502 30-35 70.0
4 502 40-45 60.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With