Just getting into Python, so hopefully I'm not asking a stupid question here...
So I have a pandas dataframe named "df_complete" with, let's say, 100 rows, containing columns named "type", "writer", "status", "col a", and "col c". I want to create a new dataframe named "temp_df" based on conditions applied to the values in "df_complete".
temp_df = pandas.DataFrame()
if ((df_complete['type'] == 'NDD') & (df_complete['writer'] == 'Mary') & (df_complete['status'] != '7')):
    temp_df['col A'] = df_complete['col a']
    temp_df['col B'] = 'good'
    temp_df['col C'] = df_complete['col c']
However, when I do this, I get the following error message:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I read this thread and changed my "and" to "&": Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
I also read this thread and put everything in parentheses: comparing dtyped [float64] array with a scalar of type [bool] in Pandas DataFrame
But the error is still present. What is causing this, and how can I fix it?
** Follow-up question ** Also, how can I obtain the index values of the rows that met the condition?
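The error is raised because if needs a single True or False, but the combined condition is a boolean Series with one value per row. I think you need boolean indexing with loc, selecting only the columns col a and col c: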
temp_df = df_complete.loc[(df_complete['type'] == 'NDD') &
                          (df_complete['writer'] == 'Mary') &
                          (df_complete['status'] != '7'), ['col a', 'col c']]
# rename columns
temp_df = temp_df.rename(columns={'col a': 'col A', 'col c': 'col C'})
# add new column
temp_df['col B'] = 'good'
# reorder columns
temp_df = temp_df[['col A', 'col B', 'col C']]
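To see why the original if statement fails, here is a minimal sketch. The two-row frame and the names mask, error_message, any_match and all_match are made up for illustration and are not from the question. Combining the comparisons with & yields one boolean per row, and Python cannot collapse that into a single truth value unless you reduce it yourself with something like any() or all().

```python
import pandas as pd

# Hypothetical two-row frame, only to reproduce the error
df = pd.DataFrame({'type': ['NDD', 'NT'],
                   'writer': ['Mary', 'John'],
                   'status': ['4', '7']})

# Combining comparisons with & gives a boolean Series, one value per row
mask = (df['type'] == 'NDD') & (df['writer'] == 'Mary') & (df['status'] != '7')

try:
    if mask:  # same problem as in the question
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Reductions return a single boolean that an if statement can use
any_match = bool(mask.any())  # is at least one row a match?
all_match = bool(mask.all())  # are all rows a match?
```

Even with any() or all(), the if version would copy whole columns rather than only the matching rows, which is why filtering with the mask inside loc is probably the better fit here.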
Sample:
import pandas as pd

df_complete = pd.DataFrame({'type': ['NDD', 'NDD', 'NT'],
                            'writer': ['Mary', 'Mary', 'John'],
                            'status': ['4', '5', '6'],
                            'col a': [1, 3, 5],
                            'col b': [5, 3, 6],
                            'col c': [7, 4, 3]}, index=[3, 4, 5])
print(df_complete)
   col a  col b  col c status type writer
3      1      5      7      4  NDD   Mary
4      3      3      4      5  NDD   Mary
5      5      6      3      6   NT   John
temp_df = df_complete.loc[(df_complete['type'] == 'NDD') &
                          (df_complete['writer'] == 'Mary') &
                          (df_complete['status'] != '7'), ['col a', 'col c']]
print(temp_df)
   col a  col c
3      1      7
4      3      4
temp_df = temp_df.rename(columns={'col a': 'col A', 'col c': 'col C'})
# add new column
temp_df['col B'] = 'good'
# reorder columns
temp_df = temp_df[['col A', 'col B', 'col C']]
print(temp_df)
   col A col B  col C
3      1  good      7
4      3  good      4
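For the follow-up about the index values, one possible approach is to keep the condition in its own variable and use it to filter the index. This is a sketch reusing the sample data above. The names mask, matching_index and index_list are only illustrative.

```python
import pandas as pd

df_complete = pd.DataFrame({'type': ['NDD', 'NDD', 'NT'],
                            'writer': ['Mary', 'Mary', 'John'],
                            'status': ['4', '5', '6'],
                            'col a': [1, 3, 5],
                            'col b': [5, 3, 6],
                            'col c': [7, 4, 3]}, index=[3, 4, 5])

# Same condition as above, stored so it can be reused
mask = ((df_complete['type'] == 'NDD') &
        (df_complete['writer'] == 'Mary') &
        (df_complete['status'] != '7'))

# Index labels of the rows where the condition is True
matching_index = df_complete.index[mask]
index_list = matching_index.tolist()
print(index_list)  # [3, 4]
```

Since loc keeps the original index labels, temp_df.index from the filtered frame above should give the same values.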