I have the following data:
+---------------+-----------+-------------+-----+------+
|   time_stamp_0|sender_ip_1|receiver_ip_2|count|attack|
+---------------+-----------+-------------+-----+------+
|06:10:55.881073|   10.0.0.3|     10.0.0.1|    1|     0|
|06:10:55.881095|   10.0.0.3|     10.0.0.1|    2|     0|
|06:10:55.881114|   10.0.0.3|     10.0.0.1|    3|     0|
|06:10:55.881133|   10.0.0.3|     10.0.0.1|    4|     0|
|06:10:55.881152|   10.0.0.3|     10.0.0.1|    5|     0|
|06:10:55.881172|   10.0.0.3|     10.0.0.1|    6|     0|
|06:10:55.881191|   10.0.0.3|     10.0.0.1|    7|     0|
|06:10:55.881210|   10.0.0.3|     10.0.0.1|    8|     0|
+---------------+-----------+-------------+-----+------+
I need to compare each value in the count column against the population standard deviation of that same column in my DataFrame. Here is my code:
val std_dev=Dataframe_addcount.agg(stddev_pop($"count"))
val final_add_count_attack = Dataframe_addcount.withColumn("attack", when($"count" > std_dev , 0).otherwise(1))
However, I get the following error:
Unsupported literal type class org.apache.spark.sql.Dataset [stddev_pop(count): double]
Could you help me? Thanks a lot.
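The comparison $"count" > std_dev needs a Column or a plain value on the right-hand side, but std_dev is a DataFrame (the result of agg), not a number.
Extract the value from that DataFrame first, then use it in the comparison: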
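// std_dev has exactly one row and one column; read it as a Double
val stdDevValue = std_dev.head().getDouble(0)
// Compare against the extracted number, not the DataFrame
val final_add_count_attack = Dataframe_addcount.withColumn("attack", when($"count" > lit(stdDevValue), lit(0)).otherwise(lit(1)))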
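If you would rather not pull the value back to the driver, one alternative is to cross join the one-row aggregate onto the original DataFrame and compare the two columns directly. This is only a sketch: the object name StdDevFlag, the helper flagByStdDev, the temporary column count_stddev, and the local SparkSession are all made up for illustration, and it assumes Spark 2.1 or later (for crossJoin). It keeps the same 0/1 rule as the code in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, stddev_pop, when}

object StdDevFlag {
  // Hypothetical helper: adds an "attack" column using the same rule as the question
  // (0 when count is above the population standard deviation, 1 otherwise).
  def flagByStdDev(df: DataFrame): DataFrame = {
    // One-row DataFrame holding the population standard deviation of "count"
    val stats = df.agg(stddev_pop(col("count")).as("count_stddev"))

    df.drop("attack")          // drop any existing attack column (no-op if absent)
      .crossJoin(stats)        // attach the single stddev value to every row
      .withColumn("attack", when(col("count") > col("count_stddev"), 0).otherwise(1))
      .drop("count_stddev")    // remove the temporary helper column
  }

  def main(args: Array[String]): Unit = {
    // Local session purely for demonstration
    val spark = SparkSession.builder().master("local[*]").appName("stddev-flag").getOrCreate()
    import spark.implicits._

    // Sample rows shaped like the data in the question (timestamps omitted)
    val df = (1 to 8).map(i => ("10.0.0.3", "10.0.0.1", i))
      .toDF("sender_ip_1", "receiver_ip_2", "count")

    flagByStdDev(df).show()
    spark.stop()
  }
}
```

Because the aggregate has a single row, the cross join does not multiply the number of rows; the whole computation stays inside one Spark plan instead of being split by a call to head().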