Starting from a dataframe like the below (simplified example of my real case):
import pandas as pd
df = pd.DataFrame({
'a': [1.0, 1.1, 1.0, 4.2, 5.1],
'b': [5.0, 4.2, 3.1, 3.2, 4.1],
'c': [3.9, 2.0, 4.2, 3.8, 6.7],
'd': [3.1, 2.1, 1.2, 1.0, 1.0]
})
And then taking a dictionary containing some multipliers I want to multiply certain columns in the dataframe by:
dct = {
"b": 0.01,
"d": 0.001
}
i.e. I want to check if each column in the dataframe is in my dictionary, and if it does exist as a key, then multiply that column of the dataframe by the value in the dictionary. In this example, I would want to multiply column 'b' by 0.01 and column 'd' by 0.001. I would end up with:
'a': [1.0, 1.1, 1.0, 4.2, 5.1],
'b': [0.05, 0.042, 0.031, 0.032, 0.041],
'c': [3.9, 2.0, 4.2, 3.8, 6.7],
'd': [0.0031, 0.0021, 0.0012, 0.001, 0.001]
In my real example, the dataframe is a cleaned-up set of data read in from Excel, and the dictionary of multipliers is read in from a config file, to allow users to specify which columns need converting from whatever is in Excel to the desired/expected units of measure (e.g. converting 'g/h' in the raw data to 'kg/h' in the dataframe).
What are some good, clear ways of achieving this intent, even if I have to restructure the implementation a bit?
Try:
df[list(dct)] *= dct.values()
print(df)
Prints:
     a      b    c       d
0  1.0  0.050  3.9  0.0031
1  1.1  0.042  2.0  0.0021
2  1.0  0.031  4.2  0.0012
3  4.2  0.032  3.8  0.0010
4  5.1  0.041  6.7  0.0010
If dct may contain keys that are not columns of the dataframe, filter it down to the matching keys first:
tmp = {k: dct[k] for k in dct.keys() & df.columns}
df[list(tmp)] *= tmp.values()
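Another option, if you would rather not modify the dataframe in place, is to turn the dictionary into a Series and let pandas align it with the columns by label. This is only a sketch: the helper name scale_columns and the extra key "z" are made up for illustration, and it assumes every column you want to scale is numeric.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1.0, 1.1, 1.0, 4.2, 5.1],
    'b': [5.0, 4.2, 3.1, 3.2, 4.1],
    'c': [3.9, 2.0, 4.2, 3.8, 6.7],
    'd': [3.1, 2.1, 1.2, 1.0, 1.0]
})

# "z" is a hypothetical key with no matching column, to show it is ignored
dct = {"b": 0.01, "d": 0.001, "z": 100.0}


def scale_columns(frame, multipliers):
    """Return a copy of frame with the listed columns scaled (hypothetical helper)."""
    # Build one factor per column: columns missing from the dict get 1.0,
    # and dict keys that are not columns are dropped by reindex.
    factors = pd.Series(multipliers, dtype=float).reindex(frame.columns, fill_value=1.0)
    # Multiply column-wise, matching factors to columns by label.
    return frame.mul(factors, axis="columns")


result = scale_columns(df, dct)
print(result)
```

Because the factors are matched by column name rather than by position, the order of the keys in the config file does not matter, and the original df is left untouched.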