A PySpark DataFrame column whose name contains a dot (e.g. "id.orig_h") cannot be
used in groupBy unless it is first renamed with withColumnRenamed.
Is there a workaround? Quoting the name with backticks, as in "`a.b`",
doesn't seem to solve it.
In my pyspark shell, both of the following snippets work:
from pyspark.sql.functions import col
myCol = col("`id.orig_h`")
result = df.groupBy(myCol).agg(...)
and
myCol = df["`id.orig_h`"]
result = df.groupBy(myCol).agg(...)
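If the column name arrives as a variable rather than a literal, the backtick quoting can be wrapped in a small helper. This is only a sketch: quote_column and group_by_dotted are hypothetical names, not part of PySpark, and the doubling of embedded backticks follows Spark SQL's quoted-identifier rule. The df argument is assumed to be an existing pyspark.sql.DataFrame.

```python
def quote_column(name: str) -> str:
    """Wrap a column name in backticks so Spark reads dots as literal characters.

    Hypothetical helper, not a PySpark API. A backtick inside the name is
    doubled, which is how Spark SQL escapes it in a quoted identifier.
    """
    return "`" + name.replace("`", "``") + "`"


def group_by_dotted(df, column_name: str):
    """Group df (assumed to be a pyspark.sql.DataFrame) by a possibly dotted
    column name and count the rows in each group."""
    # Indexing the DataFrame with the quoted name returns a Column object,
    # the same way df["`id.orig_h`"] does in the snippet above.
    return df.groupBy(df[quote_column(column_name)]).count()
```

For example, group_by_dotted(df, "id.orig_h") builds the same grouping as the second snippet, with count() standing in for whatever aggregation is needed.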
I hope it helps.