I have a dynamic list that is created based on the value of n.
n = 3
drop_lst = ['a' + str(i) for i in range(n)]
df.drop(drop_lst)
But the above is not working.
Note:
My use case requires a dynamic list.
If I pass the column names directly, without a list, it works:
df.drop('a0','a1','a2')
How do I make the drop function work with a list?
Spark 2.2 doesn't seem to have this capability. Is there a way to make it work without using select()?
The Spark DataFrame provides the drop() method to drop a column or field from a DataFrame or Dataset. The drop() method can also remove multiple columns at once from a Spark DataFrame or Dataset.
Drop multiple columns in PySpark, method 2: use the drop() function. The names of the columns to drop are stored in a list named "columns_to_drop". This list is passed to the drop() function, unpacked with the * operator because drop() takes column names as separate arguments.
Note that the following describes pandas, not Spark: the pandas DataFrame.drop() method removes one or more columns from the DataFrame. By default it does not modify the existing DataFrame; instead it returns a new DataFrame without the specified columns. To remove columns from the existing pandas DataFrame object, use the inplace=True parameter. Spark's drop() has no such parameter and always returns a new DataFrame.
You can use the * operator to pass the contents of your list as arguments to drop():
df.drop(*drop_lst)
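To see why the * makes the difference, here is a minimal sketch in plain Python. The drop function below is a hypothetical stand-in for any method that takes column names as separate arguments; it is not Spark code and only records what it was called with.

```python
# Hypothetical stand-in for a varargs method such as df.drop(*cols).
# It is NOT Spark code; it simply returns the arguments it received.
def drop(*cols):
    return cols

n = 3
drop_lst = ['a' + str(i) for i in range(n)]

# Without *, the whole list arrives as a single argument.
as_one_arg = drop(drop_lst)

# With *, each list element arrives as its own argument,
# exactly as if the names had been typed out by hand.
unpacked = drop(*drop_lst)

print(as_one_arg)
print(unpacked)
```

So df.drop(*drop_lst) is the same call as df.drop('a0','a1','a2'), whatever the value of n.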
You can also give the column names as comma-separated arguments, e.g.:
df.drop("col1","col11","col21")
This is how to drop a specified number of consecutive columns in Scala:
val ll = dfwide.schema.names.slice(1,5)
dfwide.drop(ll:_*).show
slice takes two parameters: a start index (inclusive) and an end index (exclusive).
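For the Python side of the question, a rough counterpart uses list slicing, which follows the same start-inclusive, end-exclusive convention. The sketch below uses a plain list of illustrative column names (an assumption, not taken from the question's data) so that it runs without Spark.

```python
# Illustrative column names; in PySpark the list would come from df.columns.
names = ['id', 'a0', 'a1', 'a2', 'a3', 'a4']

# Indices 1, 2, 3 and 4 are selected; index 5 is excluded.
to_drop = names[1:5]

# The columns that would be left after dropping the slice.
remaining = [c for c in names if c not in to_drop]

print(to_drop)
print(remaining)
```

With a PySpark DataFrame, the same idea would presumably be written as df.drop(*df.columns[1:5]).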