Delete data frame column if column name ends with some string, Python 3.6 [duplicate]

I have a dataframe with the following columns:

'SectorName', 'Sector', 'ItemName', 'Item', 'Counterpart SectorName', 'Counterpart Sector', 'Stocks and TransactionsName', 'Stocks and Transactions', 'Units', 'Scale', 'Frequency', 'Date', 'Value'

How can I delete the columns from df whose names end with Name?

Learnings asked Sep 21 '17 14:09



1 Answer

You can keep only the columns you want by building a boolean mask with str.endswith, inverting it with ~, and passing it to loc. str.contains with $ (which anchors the match to the end of the string) also works:

import pandas as pd

cols = ['SectorName', 'Name Sector', 'ItemName', 'Item', 'Counterpart SectorName']
df = pd.DataFrame([range(5)], columns = cols)
print (df)
   SectorName  Name Sector  ItemName  Item  Counterpart SectorName
0           0            1         2     3                       4

print (~df.columns.str.endswith('Name'))
[False  True False  True False]

df1 = df.loc[:, ~df.columns.str.endswith('Name')]

df1 = df.loc[:, ~df.columns.str.contains('Name$')]
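As a quick check of the two one-liners above, the following minimal sketch builds both masks on the same sample frame and compares them. The variable names mask_endswith, mask_regex, same and kept are made up for illustration; the only library behaviour relied on is that str.contains interprets its pattern as a regular expression by default.

```python
import pandas as pd

cols = ['SectorName', 'Name Sector', 'ItemName', 'Item', 'Counterpart SectorName']
df = pd.DataFrame([range(5)], columns=cols)

# True for columns to keep, i.e. names that do NOT end with 'Name'
mask_endswith = ~df.columns.str.endswith('Name')

# Same idea via a regex: '$' anchors 'Name' to the end of the string
mask_regex = ~df.columns.str.contains('Name$')

# Both masks should agree element by element on this sample
same = (mask_endswith == mask_regex).all()
kept = list(df.loc[:, mask_endswith].columns)
print(same, kept)
```

Note that 'Name Sector' is kept by both masks, because 'Name' appears at the start of that label rather than at the end.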

Or filter the column names first:

print (df.columns[~df.columns.str.endswith('Name')])
Index(['Name Sector', 'Item'], dtype='object')

df1 = df[df.columns[~df.columns.str.endswith('Name')]]

print (df1)
   Name Sector  Item
0            1     3
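For comparison, here is a hedged sketch of another route, not taken from the answer above: collect the unwanted names with plain Python str.endswith and pass them to DataFrame.drop. It uses the column names listed in the question with placeholder data (the real values are not shown there), and it assumes every column label is a string and that the installed pandas supports the columns= keyword of drop.

```python
import pandas as pd

# Column names as listed in the question; the data row is a placeholder
cols = ['SectorName', 'Sector', 'ItemName', 'Item', 'Counterpart SectorName',
        'Counterpart Sector', 'Stocks and TransactionsName',
        'Stocks and Transactions', 'Units', 'Scale', 'Frequency', 'Date', 'Value']
df = pd.DataFrame([range(len(cols))], columns=cols)

# Names to remove: every label ending in 'Name' (assumes string labels)
to_drop = [c for c in df.columns if c.endswith('Name')]

# drop returns a new frame; df itself is left unchanged
df1 = df.drop(columns=to_drop)
print(list(df1.columns))
```

Compared with the mask-based versions, this makes the list of removed columns explicit, which can be handy for logging or reviewing what was dropped.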
jezrael answered Oct 20 '22 05:10