I am trying to concatenate all the values in a column into a single string of comma-separated values. To do that in Scala, I wrote the following code:
val pushLogIds = incLogIdDf.select($"interface_log_id").collect().map(_.getInt(0).toString).mkString(",")
I am new to Python, and after selecting the values in the column I cannot work out how to concatenate all of them into a single string once they are collected.
final_log_id_list = logidf.select("interface_log_id").collect()
Ex:
interface_log_id
----------------
1
2
3
4
Output: a variable of String containing '1,2,3,4'
Could anyone let me know how to concatenate all the column values of a dataframe into a single string of comma-separated values?
To convert a column to a single string, you can first collect the column into a list using collect_list, then join the elements with "," using concat_ws, and finally get the result as a scalar using first:
from pyspark.sql import functions as F
df.agg(F.concat_ws(",", F.collect_list(F.col("interface_log_id")))).first()[0]
#'1,2,3,4'
Another way is to use collect_list and then Python's ','.join with map(str, ...), which converts numeric values to strings before joining:
','.join(map(str,df.agg(F.collect_list(F.col("A"))).first()[0]))
Adding benchmarks:
%timeit ','.join(map(str,df.agg(F.collect_list(F.col("A"))).first()[0]))
#9.38 s ± 133 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit df.agg(F.concat_ws(",",F.collect_list(F.col("A")))).first()[0]
#9.46 s ± 246 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
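For completeness, the Scala one-liner from the question can also be translated almost literally: collect the rows, take the first field of each, convert it to a string, and join. The sketch below is only an illustration; join_column is a hypothetical helper name (not part of PySpark), and plain tuples stand in for the Row objects that collect() returns so that the snippet runs without a Spark session.

```python
def join_column(rows, index=0, sep=","):
    """Join one field from each row-like object into a single string.

    `rows` can be any sequence of indexable rows; PySpark's Row objects
    are tuples, so row[0] works on them as well.
    """
    return sep.join(str(row[index]) for row in rows)


# With PySpark, the call would look like this (assumes the question's
# `logidf` DataFrame exists):
#   push_log_ids = join_column(logidf.select("interface_log_id").collect())

# Assumption: plain tuples stand in for pyspark.sql.Row objects here.
sample_rows = [(1,), (2,), (3,), (4,)]
push_log_ids = join_column(sample_rows)
print(push_log_ids)  # 1,2,3,4
```

Like the Scala version, this pulls every value to the driver with collect(), so it is probably only suitable when the column comfortably fits in driver memory.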