I'm new to Scala programming and this is my question: how do I count the number of strings in each row? My DataFrame is composed of a single column of type Array[String].
friendsDF: org.apache.spark.sql.DataFrame = [friends: array<string>]
If you are using Spark SQL, you can also use the size() function, which returns the number of elements in an array or map column.
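As a rough illustration of the Spark SQL route, the sketch below registers the DataFrame as a temporary view and calls size() from a SQL query. The object name FriendsCount, the view name friends_table, the sample rows, and the local master setting are all assumptions made for this example; they do not come from the question.

```scala
import org.apache.spark.sql.SparkSession

object FriendsCount {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed setup)
    val spark = SparkSession.builder()
      .appName("friends-count")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with a single array<string> column
    val friendsDF = Seq(Seq("a", "b", "c"), Seq("a")).toDF("friends")

    // Expose the DataFrame to SQL under an assumed view name
    friendsDF.createOrReplaceTempView("friends_table")

    // size(friends) yields the number of elements in each row's array
    spark.sql(
      "SELECT friends, size(friends) AS no_of_friends FROM friends_table"
    ).show()

    spark.stop()
  }
}
```

In spark-shell the session already exists as spark, so only the lines from the sample data onward would be needed.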
You can use the size function:
    // Already in scope in spark-shell; needed in a standalone application
    import org.apache.spark.sql.functions.size

    val df = Seq((Array("a","b","c"), 2), (Array("a"), 4)).toDF("friends", "id")
    // df: org.apache.spark.sql.DataFrame = [friends: array<string>, id: int]

    df.select(size($"friends").as("no_of_friends")).show
    +-------------+
    |no_of_friends|
    +-------------+
    |            3|
    |            1|
    +-------------+
To add as a new column:
    df.withColumn("no_of_friends", size($"friends")).show
    +---------+---+-------------+
    |  friends| id|no_of_friends|
    +---------+---+-------------+
    |[a, b, c]|  2|            3|
    |      [a]|  4|            1|
    +---------+---+-------------+