Does one batch interval of data generate one and only one RDD in a DStream, regardless of how much data it contains?
It's quite late to reply to this thread, but it's still worth adding a few points. The number of RDDs per batch depends on how many receivers your application has: each receiver-based input DStream produces its own RDD for every batch interval, so with multiple receivers a batch is backed by multiple RDDs. But if you have only one receiver, or a receiver-less source such as the Kafka direct stream, each batch interval produces only one RDD.
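To make that concrete, here is a minimal Scala sketch (hostnames, port, app name, and batch interval are placeholders I made up for illustration): a DStream built by unioning two socket receivers is backed by one block RDD per receiver per batch, whereas a single receiver or a Kafka direct stream gives exactly one RDD per batch.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object ReceiverCountSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("receiver-count-sketch").setMaster("local[4]")
        val ssc  = new StreamingContext(conf, Seconds(5)) // 5-second batch interval

        // Two receiver-based input DStreams -> two RDDs produced per batch interval,
        // one by each receiver (hostnames/ports are placeholders).
        val streamA = ssc.socketTextStream("hostA", 9999)
        val streamB = ssc.socketTextStream("hostB", 9999)

        // union() combines them; each batch of the combined DStream is a UnionRDD
        // over the per-receiver block RDDs for that interval.
        val combined = streamA.union(streamB)

        combined.foreachRDD { rdd =>
          // With a single receiver (or the receiver-less Kafka direct stream) this
          // callback would see exactly one RDD per batch instead of a union.
          println(s"Batch RDD: ${rdd.getClass.getSimpleName}, ${rdd.getNumPartitions} partitions")
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }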