I have a large number of columns in a PySpark DataFrame, say 200. I want to select all of the columns except 3-4 of them. How do I select these columns without having to manually type the names of every column I want to keep?
In the end, I settled on the following:
Drop:
df.drop('column_1', 'column_2', 'column_3')
Select:
df.select([c for c in df.columns if c not in {'column_1', 'column_2', 'column_3'}])
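For completeness, here is a minimal, self-contained sketch showing both approaches end to end. The SparkSession setup and the sample column names (id, column_1, column_2, column_3) are illustrative assumptions, not part of the original question:

from pyspark.sql import SparkSession

# Illustrative local session and toy data; substitute your own DataFrame.
spark = SparkSession.builder.master("local[*]").appName("exclude-columns-demo").getOrCreate()

df = spark.createDataFrame(
    [(1, "a", 2.0, True), (2, "b", 3.0, False)],
    ["id", "column_1", "column_2", "column_3"],
)

# drop() takes column names as positional arguments, silently skips names
# that do not exist, and returns a new DataFrame.
kept = df.drop("column_1", "column_2", "column_3")
print(kept.columns)  # ['id']

# Equivalent select(): build the keep-list from df.columns, excluding a set.
excluded = {"column_1", "column_2", "column_3"}
kept2 = df.select([c for c in df.columns if c not in excluded])
print(kept2.columns)  # ['id']

spark.stop()

Note that both calls are transformations that return a new DataFrame; the original df is left unchanged. drop() is the more concise option when you know the exact names to remove, while the select() list comprehension is handy if you need extra filtering logic (e.g. excluding by prefix or pattern).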