Consider the following dataframe:
+---+------------------+
| id|            values|
+---+------------------+
| 39| a,a,b,b,c,c,c,c,d|
|520|             a,b,c|
|832|               a,a|
+---+------------------+
I want to convert it into the following DataFrame:
+---+------------------------------+
| id|                        values|
+---+------------------------------+
| 39| {"a": 2,"b": 2,"c": 4,"d": 1}|
|520|        {"a": 1,"b": 1,"c": 1}|
|832|                      {"a": 2}|
+---+------------------------------+
I tried two approaches:
Converting the DataFrame to an RDD, then mapping the values column to a frequency-counter function. But I get errors when converting the RDD back to a DataFrame.
Using a udf to do essentially the same thing as above.
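For reference, the per-row frequency counting itself (independent of Spark) can be sketched with `collections.Counter`; the hypothetical `count_values` helper below is just the logic I'm trying to apply to each row:

```python
from collections import Counter

def count_values(s):
    """Count occurrences of each comma-separated token in s."""
    return dict(Counter(s.split(',')))

count_values('a,a,b,b,c,c,c,c,d')  # {'a': 2, 'b': 2, 'c': 4, 'd': 1}
```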
The reason I want a dictionary column is so I can load it as JSON in one of my Python applications.
You can do this with a udf that returns a MapType column.
from pyspark.sql.functions import udf
from pyspark.sql.types import MapType, StringType, IntegerType
from collections import Counter

# Split each string on commas and count token frequencies into a map column
my_udf = udf(lambda s: dict(Counter(s.split(','))), MapType(StringType(), IntegerType()))
df = df.withColumn('values', my_udf('values'))
df.collect()
df.collect()
[Row(id=39, values={u'a': 2, u'c': 4, u'b': 2, u'd': 1}),
Row(id=520, values={u'a': 1, u'c': 1, u'b': 1}),
Row(id=832, values={u'a': 2})]
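Since each collected row's values field is then a plain Python dict, it serializes straight to JSON for the downstream application. A minimal sketch, assuming the collect() output above has been turned into dicts:

```python
import json

# Hypothetical rows mirroring the collect() output above
rows = [
    {'id': 39, 'values': {'a': 2, 'b': 2, 'c': 4, 'd': 1}},
    {'id': 520, 'values': {'a': 1, 'b': 1, 'c': 1}},
    {'id': 832, 'values': {'a': 2}},
]

payload = json.dumps(rows)
# Round-tripping through JSON preserves the maps
assert json.loads(payload) == rows
```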