I am doing join of 2 data frames and select all columns of left frame for example:
val join_df = first_df.join(second_df, first_df("id") === second_df("id") , "left_outer")
in above I want to do select first_df.* .How can I select all columns of one frame in join ?
With alias:
first_df.alias("fst").join(second_df, Seq("id"), "left_outer").select("fst.*")
We can also do it with leftsemi join. leftsemi join will select the data from left side dataframe from a joined dataframe.
Here we join two dataframes df1 and df2 based on column col1.
df1.join(df2, df1.col("col1").equalTo(df2.col("col1")), "leftsemi")
Suppose you:
Then, you might do the following:
val selectColumns = df1.columns.map(df1(_)) ++ Array(df2("field1"), df2("field2"))
df1.join(df2, df1("key") === df2("key")).select(selectColumns:_*)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With