I have a Spark dataframe with a column (assigned_products) of type string that contains values such as the following:
"POWER BI PRO+Power BI (free)+AUDIO CONFERENCING+OFFICE 365 ENTERPRISE E5 WITHOUT AUDIO CONFERENCING"
I would like to count the occurrences of + in the string and return that value in a new column.
I tried the following, but I keep getting errors.
from pyspark.sql.functions import col
DF.withColumn('Number_Products_Assigned', col("assigned_products").count("+"))
I'm running my code in Azure Databricks on a cluster running Apache Spark 2.3.1.
The contains() method checks whether a DataFrame column string contains a string specified as an argument (matches on part of the string). Returns true if the string exists and false if not.
In PySpark, you can use distinct().count() on a DataFrame or the countDistinct() SQL function to get a distinct count. distinct() eliminates duplicate records (matching on all columns of a Row) from the DataFrame, and count() returns the number of records in the DataFrame.
To get the number of rows in a PySpark DataFrame, use the count() function. It returns the total number of rows, and because it is an action, calling it triggers execution of all pending transformations on that DataFrame.
When no replacement argument is given, replace removes every occurrence of the substring (it substitutes an empty string, not null). So we can count the occurrences by comparing the lengths before and after the replacement as follows:
SELECT length(x) - length(replace(x,'+')) as substring_count
FROM (select 'abc+def+ghi++aaa' as x) -- Sample data
Output:
substring_count
---------------
4
import pyspark.sql.functions as F
df1 = spark.sql("select 'abc+def+ghi++aaa' as x")  # Sample data
# '+' is a regex metacharacter, so it is escaped; the raw string keeps the backslash intact
df1.withColumn('substring_count',
    F.length(F.col('x'))
    - F.length(F.regexp_replace(F.col('x'), r'\+', ''))
).show()
Output:
+----------------+---------------+
|               x|substring_count|
+----------------+---------------+
|abc+def+ghi++aaa|              4|
+----------------+---------------+
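If you want to sanity-check the length-difference trick outside Spark, here is a minimal plain-Python sketch. The helper name count_substring is made up for illustration and is not part of any Spark API. It mirrors the same arithmetic and divides by the substring length, so it should also work for multi-character substrings, counting non-overlapping matches.

```python
def count_substring(text, sub="+"):
    """Count non-overlapping occurrences of `sub` using the length difference.

    Hypothetical helper for illustration only; it mirrors the
    length(x) - length(replace(x, sub)) idea used in the answer.
    """
    if not sub:
        raise ValueError("sub must be a non-empty string")
    removed = text.replace(sub, "")  # drop every occurrence of sub
    # Each removed occurrence shortens the string by len(sub) characters.
    return (len(text) - len(removed)) // len(sub)


sample = "abc+def+ghi++aaa"
products = (
    "POWER BI PRO+Power BI (free)+AUDIO CONFERENCING"
    "+OFFICE 365 ENTERPRISE E5 WITHOUT AUDIO CONFERENCING"
)

print(count_substring(sample))    # 4
print(count_substring(products))  # 3
```

Note that for the assigned_products value from the question this gives 3 separators, so the number of products in that string is one more than the count (4).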