Remove rows where two columns have the same value in pandas

Tags:

pandas

Input:

    S   T   W      U
0   A   A   1   Undirected
1   A   B   0   Undirected
2   A   C   1   Undirected
3   B   A   0   Undirected
4   B   B   1   Undirected
5   B   C   1   Undirected
6   C   A   1   Undirected
7   C   B   1   Undirected
8   C   C   1   Undirected

Output:

    S   T   W      U
1   A   B   0   Undirected
2   A   C   1   Undirected
3   B   A   0   Undirected
5   B   C   1   Undirected
6   C   A   1   Undirected
7   C   B   1   Undirected

In rows 0, 4 and 8, columns S and T hold the same value. I want to drop these rows.

Trying:

I tried df.drop_duplicates(['S','T']), but it did not work. How can I get the output above?


asked May 13 '17 09:05

Jack

1 Answer

You need boolean indexing:

print (df['S'] != df['T'])
0    False
1     True
2     True
3     True
4    False
5     True
6     True
7     True
8    False
dtype: bool

df = df[df['S'] != df['T']]
print (df)
   S  T  W           U
1  A  B  0  Undirected
2  A  C  1  Undirected
3  B  A  0  Undirected
5  B  C  1  Undirected
6  C  A  1  Undirected
7  C  B  1  Undirected

Or query:

df = df.query("S != T")
print (df)
   S  T  W           U
1  A  B  0  Undirected
2  A  C  1  Undirected
3  B  A  0  Undirected
5  B  C  1  Undirected
6  C  A  1  Undirected
7  C  B  1  Undirected
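
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the Input table in the question, and the names `deduped`, `mask`, `by_mask` and `by_query` are only illustrative. It also shows why `drop_duplicates` does not help here: it compares rows with each other, whereas the goal is to compare two columns within the same row.

```python
import pandas as pd

# Rebuild the question's input (values copied from the Input table).
df = pd.DataFrame({
    "S": list("AAABBBCCC"),
    "T": list("ABCABCABC"),
    "W": [1, 0, 1, 0, 1, 1, 1, 1, 1],
    "U": ["Undirected"] * 9,
})

# drop_duplicates removes rows whose (S, T) pair repeats an earlier row.
# Every pair in this data is unique, so nothing is dropped.
deduped = df.drop_duplicates(["S", "T"])
print(len(deduped))

# Boolean indexing: keep rows where S and T differ within the same row.
mask = df["S"] != df["T"]
by_mask = df[mask]

# Equivalent filter written as a query expression.
by_query = df.query("S != T")

print(by_mask)
```

Both filters keep the original index labels; call `reset_index(drop=True)` on the result if a fresh 0..n-1 index is wanted.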


answered Oct 06 '22 08:10

jezrael
