I created a dataframe df as below:
import pandas as pd
Type = ['A', 'B', 'C', 'D']
Size = [72, 23, 66, 12]
df = pd.DataFrame({'Type': Type, 'Size': Size})
I can extract any existing column using:
df_count = df['Size']
However, if the data frame is large, I may not know whether a given column exists in df. If I request a column that is missing, e.g. df['Shape'], as below:
df_null = df['Shape']
It raises a KeyError. Instead, I want df_null to be an empty column named "Shape".
Use DataFrame.get in a pattern similar to:
In [3]: df.get('Size', pd.Series(index=df.index, name='Size'))
Out[3]:
0 72
1 23
2 66
3 12
Name: Size, dtype: int64
In [4]: df.get('Shape', pd.Series(index=df.index, name='Shape'))
Out[4]:
0 NaN
1 NaN
2 NaN
3 NaN
Name: Shape, dtype: float64
Or generalize by creating a function to abstract this:
In [5]: get_column = lambda df, col: df.get(col, pd.Series(index=df.index, name=col))
In [6]: get_column(df, 'Size')
Out[6]:
0 72
1 23
2 66
3 12
Name: Size, dtype: int64
In [7]: get_column(df, 'Shape')
Out[7]:
0 NaN
1 NaN
2 NaN
3 NaN
Name: Shape, dtype: float64
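The lambda above can also be written as a regular function. The following is only a sketch: the name get_column_or_nan and the float64 default are assumptions made for illustration, and the dtype is passed explicitly so the fallback does not rely on whatever default pandas picks for a data-less Series.

```python
import pandas as pd

def get_column_or_nan(df, col, dtype="float64"):
    """Return df[col] if present, else an all-NaN Series aligned to df.index."""
    if col in df.columns:
        return df[col]
    # Fallback: same index as df, named after the requested column.
    return pd.Series(index=df.index, name=col, dtype=dtype)

df = pd.DataFrame({"Type": ["A", "B", "C", "D"], "Size": [72, 23, 66, 12]})
size = get_column_or_nan(df, "Size")    # existing column, returned as-is
shape = get_column_or_nan(df, "Shape")  # missing column, NaN placeholder
```

Unlike the df.get version, this only builds the placeholder Series when the column is actually missing.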
Another alternative could be to use reindex and squeeze:
In [8]: df.reindex(columns=['Size']).squeeze()
Out[8]:
0 72
1 23
2 66
3 12
Name: Size, dtype: int64
In [9]: df.reindex(columns=['Shape']).squeeze()
Out[9]:
0 NaN
1 NaN
2 NaN
3 NaN
Name: Shape, dtype: float64
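One caveat with the reindex and squeeze approach: squeeze() with no arguments collapses every length-1 axis, so a frame with a single row comes back as a scalar rather than a Series. A minimal sketch of the difference, using a hypothetical one-row frame and squeeze(axis="columns") to drop only the column axis:

```python
import pandas as pd

one_row = pd.DataFrame({"Type": ["A"], "Size": [72]})

# Plain squeeze() collapses both axes, so a 1x1 frame becomes a scalar.
as_scalar = one_row.reindex(columns=["Size"]).squeeze()

# squeeze(axis="columns") only drops the column axis and keeps a Series.
as_series = one_row.reindex(columns=["Size"]).squeeze(axis="columns")
missing = one_row.reindex(columns=["Shape"]).squeeze(axis="columns")
```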
IIUC, try this
col = 'Shape'
df_null = pd.Series(dtype='float64') if col not in df.columns else df[col]
Output:
Series([], dtype: float64)
OR
col = 'Size'
df_null = pd.Series(dtype='float64') if col not in df.columns else df[col]
Output:
0 72
1 23
2 66
3 12
Name: Size, dtype: int64
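Note that a bare pd.Series() carries no name, while the question asked for an empty column named "Shape". A possible variant, offered as a sketch rather than the answerer's method, passes name and dtype to the fallback:

```python
import pandas as pd

df = pd.DataFrame({"Type": ["A", "B", "C", "D"], "Size": [72, 23, 66, 12]})

col = "Shape"
# Zero-length fallback that still carries the requested column name.
# dtype is explicit because the default for an empty Series varies by pandas version.
df_null = df[col] if col in df.columns else pd.Series(name=col, dtype="float64")
```

The result here has length 0; if a placeholder aligned with the rows of df is needed, pass index=df.index to the fallback as well.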