Let us say I have the following two RDDs, with the following key-value pairs.
rdd1 = [ (key1, [value1, value2]), (key2, [value3, value4]) ]
and
rdd2 = [ (key1, [value5, value6]), (key2, [value7]) ]
Now, I want to join them by key, so that, for example, the following is returned:
ret = [ (key1, [value1, value2, value5, value6]), (key2, [value3, value4, value7]) ]
How can I do this in Spark, using Python or Scala? One way is to use join, but join would nest the two value lists inside a tuple, whereas I want only one flat list of values per key.
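For illustration, here is roughly what a plain join produces on these RDDs (a Scala sketch; the SparkContext sc and the string keys/values are stand-ins for the example above):

val rdd1 = sc.parallelize(Seq(("key1", List("value1", "value2")), ("key2", List("value3", "value4"))))
val rdd2 = sc.parallelize(Seq(("key1", List("value5", "value6")), ("key2", List("value7"))))

rdd1.join(rdd2).collect()
// => Array((key1, (List(value1, value2), List(value5, value6))),
//          (key2, (List(value3, value4), List(value7))))

Each value is a tuple of two lists rather than one merged list, which is exactly the shape the question wants to avoid.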
For context: pair RDDs in Spark have a reduceByKey() method that can aggregate data separately for each key, and a join() method that can merge two RDDs together by grouping elements with the same key.
To explain the join() operation: it performs a hash join across the cluster, joining two datasets. When called on datasets of type (K, V) and (K, W), it returns a dataset of (K, (V, W)) pairs with all pairs of elements for each key. Outer joins are supported through leftOuterJoin(), rightOuterJoin(), and fullOuterJoin().
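As a quick illustration of the outer variants (same hypothetical RDDs as in the sketch above): leftOuterJoin wraps the right-hand value in an Option, so keys missing from the second RDD still appear.

rdd1.leftOuterJoin(rdd2).collect()
// => Array((key1, (List(value1, value2), Some(List(value5, value6)))),
//          (key2, (List(value3, value4), Some(List(value7)))))
// A key present only in rdd1 would show up as (key, (leftValue, None)).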
I would union the two RDDs and then do a reduceByKey to merge the values.
(rdd1 union rdd2).reduceByKey(_ ++ _)
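Spelled out end to end (a sketch under the same assumptions as above, with the values held as Scala Lists so that ++ concatenates them):

(rdd1 union rdd2).reduceByKey(_ ++ _).collect()
// => Array((key1, List(value1, value2, value5, value6)),
//          (key2, List(value3, value4, value7)))

Note the design trade-off: union itself is cheap (no shuffle), and the single shuffle happens in reduceByKey, which combines each key's lists before the final result. The same union-then-reduceByKey pattern carries over directly to PySpark with a list-concatenating lambda.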