I have the following two data frames in pandas:
movies
+---+------------------------------+--------------+-----------+
| | movie title | genre | tconst |
+---+------------------------------+--------------+-----------+
| 0 | Edison Kinetoscopic Record | Documentary | tt0000008 |
+---+------------------------------+--------------+-----------+
| 1 | La sortie des usines Lumière | Documentary | tt0000010 |
+---+------------------------------+--------------+-----------+
| 2 | The Arrival of a Train | Documentary | tt0000012 |
+---+------------------------------+--------------+-----------+
| 3 | The Oxford and Cambridge | NaN | tt0000025 |
+---+------------------------------+--------------+-----------+
| 4 | Le manoir du diable | Short|Horror | tt0000091 |
+---+------------------------------+--------------+-----------+
and crew
+---+-----------+-----------+---------+------+
| | tconst | directors | writers | year |
+---+-----------+-----------+---------+------+
| 0 | tt0000001 | nm0005690 | \N | 2001 |
+---+-----------+-----------+---------+------+
| 1 | tt0000002 | nm0721526 | \N | 2002 |
+---+-----------+-----------+---------+------+
| 2 | tt0000003 | nm0721526 | \N | 2003 |
+---+-----------+-----------+---------+------+
| 3 | tt0000004 | nm0721526 | \N | 2004 |
+---+-----------+-----------+---------+------+
| 4 | tt0000005 | nm0005690 | \N | 2005 |
+---+-----------+-----------+---------+------+
How do I create a new data frame that joins only the directors and year columns from crew onto the movies data frame (using the tconst column)?
Try:
pd.merge(movies, crew[["tconst", "directors", "year"]], on="tconst", how="left")
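The on parameter tells merge which key column to join on (here tconst), and the how parameter controls what happens to rows whose key does not appear in both DataFrames. With how="left", every row of movies is kept, and directors and year are NaN wherever crew has no matching tconst. Selecting crew[["tconst", "directors", "year"]] before merging is what keeps the writers column out of the result.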
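To see the effect of how, here is a minimal sketch. The movies rows are taken from the question; the crew rows are hypothetical (the tconst/year pairings are made up so that some keys match and some do not), since the five-row samples shown in the question share no tconst values at all.

```python
import pandas as pd

# Three movies rows copied from the question.
movies = pd.DataFrame({
    "movie title": [
        "Edison Kinetoscopic Record",
        "La sortie des usines Lumière",
        "The Arrival of a Train",
    ],
    "genre": ["Documentary", "Documentary", "Documentary"],
    "tconst": ["tt0000008", "tt0000010", "tt0000012"],
})

# Hypothetical crew rows (assumption): two keys match movies, one does not.
crew = pd.DataFrame({
    "tconst": ["tt0000001", "tt0000008", "tt0000010"],
    "directors": ["nm0005690", "nm0005690", "nm0721526"],
    "writers": ["\\N", "\\N", "\\N"],
    "year": [2001, 2002, 2003],
})

# Keep only the key plus the columns we want to bring over.
crew_subset = crew[["tconst", "directors", "year"]]

# Left join: all movies rows survive; unmatched ones get NaN.
merged = pd.merge(movies, crew_subset, on="tconst", how="left")

# Inner join: only movies with a matching tconst in crew survive.
matched_only = pd.merge(movies, crew_subset, on="tconst", how="inner")

print(merged)
print(matched_only)
```

In this sketch, merged has three rows (the last with NaN in directors and year) and matched_only has two. Note that year comes back as a float column in the left join, because an integer column cannot hold NaN by default.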