PySpark: do parallelize and collect preserve order?

In Spark with Python, do the sc.parallelize() and collect() operations preserve order? For example, if I have a list of elements x, will sc.parallelize(x).collect() return a list with the elements in exactly the same order as x?

charmander123 asked Nov 08 '22 07:11


1 Answer

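Yes. Both parallelize and collect preserve order, so sc.parallelize(x).collect() returns the elements in the same order as x. Many other Spark methods don't: operations that shuffle data between partitions (for example repartition, distinct, or groupByKey) make no guarantee about the order of the result.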
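To see why the round trip keeps its order, here is a minimal pure-Python sketch of the idea. It is a simplified model and not Spark's actual implementation: it assumes parallelize cuts the list into contiguous slices and collect concatenates the partitions by ascending partition index. The helpers slice_into_partitions and collect_in_partition_order are hypothetical names made up for this illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def slice_into_partitions(data: Sequence[T], num_partitions: int) -> List[List[T]]:
    """Hypothetical helper: split data into contiguous, roughly equal slices.

    Assumption: this mimics how parallelize distributes a local list.
    """
    n = len(data)
    return [
        list(data[(i * n) // num_partitions:((i + 1) * n) // num_partitions])
        for i in range(num_partitions)
    ]


def collect_in_partition_order(partitions: List[List[T]]) -> List[T]:
    """Hypothetical helper: concatenate partitions by ascending index.

    Assumption: this mimics how collect assembles its result on the driver.
    """
    result: List[T] = []
    for part in partitions:
        result.extend(part)
    return result


x = ["d", "a", "c", "b", "e", "f", "g"]
partitions = slice_into_partitions(x, 3)
roundtrip = collect_in_partition_order(partitions)

# Each slice is contiguous, so joining them in index order rebuilds x.
print(partitions)
print(roundtrip == x)
```

Under this model, order survives because no element ever crosses a slice boundary. A shuffle breaks exactly that property, since it moves elements between partitions by key or hash rather than by position.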

user6022341 answered Nov 15 '22 13:11