I have an RDD[org.joda.time.DateTime] and would like to sort its records by date in Scala.
Input (sample data after calling collect()):
res41: Array[org.joda.time.DateTime] = Array(2016-10-19T05:19:07.572Z, 2016-10-12T00:31:07.572Z, 2016-10-18T19:43:07.572Z)
Expected output:
2016-10-12T00:31:07.572Z
2016-10-18T19:43:07.572Z
2016-10-19T05:19:07.572Z
I have googled and checked the following link, but could not understand it:
How to define an Ordering in Scala?
Any help?
If you collect the records of your RDD, then you can apply the following sorting:
array.sortBy(_.getMillis)
Alternatively, if your RDD is large and you do not want to collect it to the driver, sort the RDD itself:
rdd.sortBy(_.getMillis)
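As a minimal sketch of the RDD variant, the example below sorts the question's three timestamps without collecting them first. It assumes Spark and Joda-Time are on the classpath and runs against a local master; the object name SortRddByDate and the helper sortByDate are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.joda.time.DateTime

object SortRddByDate {
  // Hypothetical helper: sorts on the epoch-millisecond key, so Spark only
  // needs the built-in Ordering[Long] rather than an Ordering[DateTime].
  def sortByDate(rdd: RDD[DateTime], ascending: Boolean = true): RDD[DateTime] =
    rdd.sortBy(_.getMillis, ascending)

  def main(args: Array[String]): Unit = {
    // Local master is an assumption made only to keep the sketch runnable.
    val conf = new SparkConf().setAppName("sort-dates").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(
        DateTime.parse("2016-10-19T05:19:07.572Z"),
        DateTime.parse("2016-10-12T00:31:07.572Z"),
        DateTime.parse("2016-10-18T19:43:07.572Z")
      ))
      // collect() is used here only to print the small sample.
      sortByDate(rdd).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Passing ascending = false to the helper gives newest-first order instead.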
You can define an implicit ordering for org.joda.time.DateTime like so:
implicit def ord: Ordering[DateTime] = Ordering.by(_.getMillis)
This compares DateTime values by their epoch-millisecond timestamps.
You can then either make sure the implicit is in scope or pass it explicitly:
arr.sorted(ord)
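To show both ways of using such an ordering, here is a small self-contained sketch. It assumes Joda-Time is on the classpath; the names SortDateTimes, dateTimeOrdering, sortDates and sortDatesDescending are illustrative and not part of any library.

```scala
import org.joda.time.DateTime

object SortDateTimes {
  // Orders DateTime values by epoch milliseconds. The type parameters are
  // spelled out so the definition does not rely on type inference.
  implicit val dateTimeOrdering: Ordering[DateTime] =
    Ordering.by[DateTime, Long](_.getMillis)

  // Implicit use: sorted picks up dateTimeOrdering from the enclosing scope.
  def sortDates(xs: Seq[DateTime]): Seq[DateTime] = xs.sorted

  // Explicit use: pass an ordering by hand, here reversed for newest-first.
  def sortDatesDescending(xs: Seq[DateTime]): Seq[DateTime] =
    xs.sorted(dateTimeOrdering.reverse)

  def main(args: Array[String]): Unit = {
    val arr = Seq(
      DateTime.parse("2016-10-19T05:19:07.572Z"),
      DateTime.parse("2016-10-12T00:31:07.572Z"),
      DateTime.parse("2016-10-18T19:43:07.572Z")
    )
    sortDates(arr).foreach(println)
  }
}
```

Outside the object, importing SortDateTimes.dateTimeOrdering brings the implicit into scope so that a plain arr.sorted compiles.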