If I have an RDD of key-value pairs and I want to get only the keys, what is the most efficient way of doing it?


Spark getting keys from key-value RDD

1 Answer

It is very simple: <code>yourRDD.keys()</code>

Similarly, you can get an RDD of the values with <code>yourRDD.values()</code>

For this and other RDD transformations and actions see examples here
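As a minimal sketch of the two calls above, assuming PySpark and a local Spark installation: the helper names `split_pairs`, `keys_via_map`, and `demo`, and the sample data, are made up for illustration. In PySpark, `keys()` amounts to a map over the first element of each tuple, so it is a narrow transformation that should not trigger a shuffle.

```python
def split_pairs(pair_rdd):
    """Return (keys, values) as two RDDs derived from an RDD of (key, value) tuples."""
    # Both calls are lazy transformations; nothing is computed until an action runs.
    return pair_rdd.keys(), pair_rdd.values()


def keys_via_map(pair_rdd):
    """Spelled-out equivalent of keys(): keep the first element of each pair."""
    return pair_rdd.map(lambda kv: kv[0])


def demo():
    """Run the helpers on a small, made-up pair RDD (requires pyspark)."""
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])

    keys, values = split_pairs(pairs)
    print(keys.collect())                 # ['a', 'b', 'a']
    print(values.collect())               # [1, 2, 3]
    print(keys_via_map(pairs).collect())  # ['a', 'b', 'a']
```

Call `demo()` from a PySpark shell or script to try it. Note that `keys()` keeps duplicates; if only the unique keys are needed, `keys().distinct()` would do that, at the cost of a shuffle.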

151

answered Sep 22 '22 00:09

lanenok

Tags:

apache-spark

Ammar
