 

How to count the number of columns in a Spark DataFrame?

I have this DataFrame in Spark and I want to count the number of columns in it. I know how to count the number of rows in a column, but I want to count the number of columns.

// Requires the implicits from an active SparkSession (imported automatically in spark-shell)
import spark.implicits._

val df1 = Seq(
    ("spark", "scala", "2015-10-14", 10, "rahul"),
    ("spark", "scala", "2015-10-15", 11, "abhishek"),
    ("spark", "scala", "2015-10-16", 12, "Jay"),
    ("spark", "scala", null, 13, "Kiran"))
  .toDF("bu_name", "client_name", "date", "patient_id", "patient_name")
df1.show

Can anybody tell me how to count the number of columns in this DataFrame? I am using Scala.

Rahul Pandey asked Jul 27 '18 at 08:07

People also ask

How do you use count() in Spark?

Using the count() function we can get the number of rows in a column, and finally we can use the collect() method to retrieve that count, where df is the input PySpark DataFrame and column_name is the column whose rows are being counted.
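The wording above targets PySpark; since the question is in Scala, here is a minimal Scala sketch of the same idea, reusing the df1 from the question:

import org.apache.spark.sql.functions.count

// The count() aggregate skips nulls, so counting the "date" column
// of df1 yields 3 (one of the four rows has a null date).
val dateCount = df1.select(count("date")).collect()(0).getLong(0)

Note that this per-column aggregate differs from df.count(), which counts whole rows regardless of nulls.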

How do you count the number of records in PySpark?

To get the number of rows from a PySpark DataFrame, use the count() function. It returns the total number of rows in the DataFrame.
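The call is identical in Scala; a one-line sketch using the df1 from the question:

// count() is an action: it runs a job and returns the row count as a Long
val rowCount = df1.count()  // 4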

How do I find the size of my Spark DataFrame?

Similar to pandas, you can get the size and shape of a PySpark (Spark with Python) DataFrame by running the count() action to get the number of rows and len(df.columns) to get the number of columns.
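Scala has no len(), but a pandas-style (rows, columns) shape can be sketched like this with the df1 from the question:

// Pair of (number of rows, number of columns), analogous to pandas' df.shape
val shape = (df1.count(), df1.columns.length)  // (4, 5)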


1 Answer

To count the number of columns, simply do:

df1.columns.size
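columns returns an Array[String], so size and length are interchangeable; with the df1 from the question:

df1.columns        // Array(bu_name, client_name, date, patient_id, patient_name)
df1.columns.size   // 5
df1.columns.length // 5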
Shaido answered Sep 19 '22 at 14:09