
How do I drop the right index on a left merge from the results?

Tags: merge, pandas

In the example below, the merge runs correctly, but how do I keep the right-hand key column (cities) from also appearing in the result? Do I have to add a separate line of code:

df_merge = df_merge.drop(columns='cities')  

Can't I choose which columns I want to merge into the left dataset? What if df2 had 30 columns and I only want 10 of them?

import pandas as pd

df1 = pd.DataFrame({
    "city": ['new york','chicago', 'orlando','ottawa'],
    "humidity": [35,69,79,99]
})


df2 = pd.DataFrame({
    "cities": ['new york', 'chicago', 'toronto'],
    "temp": [1, 6, -35]
})

df_merge = df1.merge(df2, left_on='city', right_on='cities', how='left')
print(df_merge)

**output**

       city  humidity    cities  temp
0  new york        35  new york   1.0
1   chicago        69   chicago   6.0
2   orlando        79       NaN   NaN
3    ottawa        99       NaN   NaN
caddie, asked May 16 '18 14:05


1 Answer

merge

Rename the right-hand key column first so that both frames share the same key name. The merge then keeps a single city column:

df1.merge(df2.rename(columns={'cities': 'city'}), 'left')

       city  humidity  temp
0  new york        35   1.0
1   chicago        69   6.0
2   orlando        79   NaN
3    ottawa        99   NaN

If you need to explicitly state what you're merging on:

df1.merge(df2.rename(columns={'cities': 'city'}), how='left', on='city')
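On the second part of the question (df2 has many columns and only some are wanted), one option is to subset df2 before merging. This is a minimal sketch, not the only way to do it. The wind column and the wanted list are hypothetical, added only so there is something to leave out.

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ["new york", "chicago", "orlando", "ottawa"],
    "humidity": [35, 69, 79, 99],
})

# 'wind' is a hypothetical extra column, present only to illustrate subsetting
df2 = pd.DataFrame({
    "cities": ["new york", "chicago", "toronto"],
    "temp": [1, 6, -35],
    "wind": [10, 20, 30],
})

# The key column plus whichever right-hand columns should be brought over
wanted = ["cities", "temp"]

df_merge = df1.merge(
    df2[wanted].rename(columns={"cities": "city"}),
    how="left",
    on="city",
)
print(df_merge)
```

Because the subset and the rename happen inside the merge call, no separate drop line is needed afterwards, and wind never reaches the result.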

join

Set the index of the right side to its key column first, then join on the left key column.
how='left' is the default for join.

df1.join(df2.set_index('cities'), 'city')

       city  humidity  temp
0  new york        35   1.0
1   chicago        69   6.0
2   orlando        79   NaN
3    ottawa        99   NaN

map

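Make a dictionary from df2 and map it onto the city column. dict(df2.values) works here only because df2 has exactly two columns (key, then value).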

df1.assign(temp=df1.city.map(dict(df2.values)))

       city  humidity  temp
0  new york        35   1.0
1   chicago        69   6.0
2   orlando        79   NaN
3    ottawa        99   NaN

Less cute, more explicit

df1.assign(temp=df1.city.map(dict(df2.set_index('cities').temp)))
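As a quick sanity check, the three approaches can be compared directly. This sketch uses the question's data and pandas.testing.assert_frame_equal, which raises if the frames differ. The names via_merge, via_join and via_map are just illustrative.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

df1 = pd.DataFrame({
    "city": ["new york", "chicago", "orlando", "ottawa"],
    "humidity": [35, 69, 79, 99],
})
df2 = pd.DataFrame({
    "cities": ["new york", "chicago", "toronto"],
    "temp": [1, 6, -35],
})

# merge: rename the right key so both sides share the name 'city'
via_merge = df1.merge(df2.rename(columns={"cities": "city"}), how="left", on="city")

# join: match the left 'city' column against the right index
via_join = df1.join(df2.set_index("cities"), on="city")

# map: look up each city in a {city: temp} dictionary
via_map = df1.assign(temp=df1["city"].map(dict(df2.set_index("cities")["temp"])))

# Each call raises AssertionError if the two frames are not equal
assert_frame_equal(via_merge, via_join)
assert_frame_equal(via_merge, via_map)
```

The map variant only brings over one column at a time and assumes the keys in df2 are unique, so merge or join is usually the better fit once several columns are involved.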
piRSquared, answered Jan 01 '23 20:01