I have two separate dataframes that share a project number. In <code>type_df</code>, the project number is the index. In <code>time_df</code>, the project number is a column. I would like to count the rows in <code>time_df</code> whose project has a <code>Project Type</code> of <code>Type 2</code>. I am trying to do this with <code>pandas.merge()</code>. It works great when both keys are columns, but not when one of them is an index. I'm not sure how to reference the index, or whether <code>merge</code> is even the right way to do this.
<pre class="prettyprint"><code>import pandas as pd

type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12], ['Project2', 41]],
                       columns=['Project', 'Time'])
merged = pd.merge(time_df, type_df, on=[index, 'Project'])
print merged[merged['Project Type'] == 'Type 2']['Project Type'].count()
</code></pre>
Error:
<blockquote> Name 'Index' is not defined. </blockquote>
Desired Output:
<pre class="prettyprint"><code>2
</code></pre>

To use an index in your merge, pass <code>left_index=True</code> or <code>right_index=True</code> for the frame whose index holds the key, and name the key column of the other frame with <code>right_on</code> or <code>left_on</code>. For you it should look something like this:
<pre class="prettyprint"><code>merged = pd.merge(type_df, time_df, left_index=True, right_on='Project')
</code></pre>
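For completeness, here is a minimal end-to-end sketch using the frames from the question. The name `type2_count` is only illustrative, and `print()` is written in its function form so the snippet also runs on Python 3.

```python
import pandas as pd

type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12], ['Project2', 41]],
                       columns=['Project', 'Time'])

# Match type_df's index against the 'Project' column of time_df.
merged = pd.merge(type_df, time_df, left_index=True, right_on='Project')

# Count the merged rows whose project is of 'Type 2'.
type2_count = int((merged['Project Type'] == 'Type 2').sum())
print(type2_count)  # 2
```

Summing a boolean mask counts the matching rows directly, which avoids building a filtered frame just to call `count()` on it.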

Using Merge on a column and Index in Pandas

How do I merge index columns in pandas?

The merge() function merges DataFrame or named Series objects with a database-style join. The join is done on columns or indexes. If joining columns on columns, the DataFrame indexes are ignored. Otherwise, if joining indexes on indexes or indexes on a column or columns, the index is passed on to the result.
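A small sketch of that difference, using made-up frames (`left`, `right`, and their values are purely illustrative): a column-on-column merge throws away the original index labels, while an index-on-index merge keeps the shared labels.

```python
import pandas as pd

left = pd.DataFrame({'key': ['a', 'b'], 'x': [1, 2]}, index=[10, 20])
right = pd.DataFrame({'key': ['b', 'a'], 'y': [3, 4]}, index=[30, 40])

# Column-on-column merge: the original index labels (10, 20, 30, 40) are
# discarded and the result gets a fresh default integer index.
by_column = pd.merge(left, right, on='key')
print(by_column.index.tolist())

# Index-on-index merge: the shared index labels are kept in the result.
by_index = pd.merge(left.set_index('key'), right.set_index('key'),
                    left_index=True, right_index=True)
print(by_index.index.tolist())
```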

How do I merge two DataFrames with the same index?

You can use pandas.merge() to merge DataFrames by matching their indexes. When merging two DataFrames on the index, both the left_index and right_index parameters of merge() should be set to True.
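As a rough illustration, assume a second frame that is also indexed by project name. The `budget_df` frame and its numbers are hypothetical and not part of the question.

```python
import pandas as pd

type_df = pd.DataFrame({'Project Type': ['Type 1', 'Type 2']},
                       index=['Project2', 'Project1'])
# Hypothetical second frame, also keyed by project name in its index.
budget_df = pd.DataFrame({'Budget': [100, 250]},
                         index=['Project1', 'Project3'])

# Both frames carry the key in their index, so both flags are set to True.
merged = pd.merge(type_df, budget_df, left_index=True, right_index=True)
print(merged)
```

Because merge defaults to `how='inner'`, only the label present in both indexes (`Project1`) survives; passing `how='outer'` would keep all three projects.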

Can you merge columns in pandas?

To merge two pandas DataFrames on multiple columns, pass a list of column names to the on parameter of pandas.merge(). merge() is considered versatile and flexible, and the same operation is also available as the DataFrame.merge() method.
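A short sketch with invented data (`sales`, `targets`, and their columns are assumptions for illustration only) showing a two-column key in both the function and the method form:

```python
import pandas as pd

sales = pd.DataFrame({'region': ['N', 'N', 'S'],
                      'year': [2020, 2021, 2020],
                      'sales': [5, 7, 3]})
targets = pd.DataFrame({'region': ['N', 'S', 'S'],
                        'year': [2021, 2020, 2021],
                        'target': [6, 4, 8]})

# Rows match only when both 'region' and 'year' agree.
via_function = pd.merge(sales, targets, on=['region', 'year'])

# The DataFrame method form produces the same result.
via_method = sales.merge(targets, on=['region', 'year'])
print(via_function)
```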

What is the difference between merge join and concatenate in pandas?

The main difference between merge and concat is that merge allows you to perform a more structured "join" of tables, whereas concat is broader and less structured.
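To make the contrast concrete, here is a small sketch on two toy frames (`a` and `b` are made up for this example). merge lines rows up by key value, while concat simply stacks the frames.

```python
import pandas as pd

a = pd.DataFrame({'key': ['k1', 'k2'], 'x': [1, 2]})
b = pd.DataFrame({'key': ['k2', 'k3'], 'y': [3, 4]})

# merge aligns rows by key value: only 'k2' appears in both frames.
merged = pd.merge(a, b, on='key')

# concat stacks the frames on top of each other; there is no key matching,
# and cells with no source value are filled with NaN.
stacked = pd.concat([a, b], ignore_index=True)

print(merged)
print(stacked)
```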

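Another solution is to use DataFrame.join. Its on parameter names a column in the calling frame, which is matched against the index of the other frame, so time_df has to be the caller:

df3 = time_df.join(type_df, on='Project')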
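A runnable sketch of the join approach on the question's data, with the frame that holds the 'Project' column as the caller. The names `joined` and `counts` are illustrative.

```python
import pandas as pd

type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12], ['Project2', 41]],
                       columns=['Project', 'Time'])

# join matches the caller's 'Project' column against the index of type_df.
# DataFrame.join defaults to a left join, so every time_df row is kept.
joined = time_df.join(type_df, on='Project')

# Tally the rows per project type and pick out 'Type 2'.
counts = joined['Project Type'].value_counts()
print(counts['Type 2'])  # 2
```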

In pandas 0.23.0 and later, the on, left_on, and right_on parameters may refer to either column names or index level names:

    left_index = pd.Index(['K0', 'K0', 'K1', 'K2'], name='key1')
    left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                         'B': ['B0', 'B1', 'B2', 'B3'],
                         'key2': ['K0', 'K1', 'K0', 'K1']},
                        index=left_index)

    right_index = pd.Index(['K0', 'K1', 'K2', 'K2'], name='key1')
    right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                          'D': ['D0', 'D1', 'D2', 'D3'],
                          'key2': ['K0', 'K0', 'K0', 'K1']},
                         index=right_index)

    print (left)
           A   B key2
    key1
    K0    A0  B0   K0
    K0    A1  B1   K1
    K1    A2  B2   K0
    K2    A3  B3   K1

    print (right)
           C   D key2
    key1
    K0    C0  D0   K0
    K1    C1  D1   K0
    K2    C2  D2   K0
    K2    C3  D3   K1

    df = left.merge(right, on=['key1', 'key2'])
    print (df)
           A   B key2   C   D
    key1
    K0    A0  B0   K0  C0  D0
    K1    A2  B2   K0  C1  D1
    K2    A3  B3   K1  C3  D3
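Applied to the frames from the question, this could look like the sketch below. It assumes pandas 0.23.0 or later and gives type_df's index the name 'Project' first; `named_type_df` and `type2_count` are illustrative names.

```python
import pandas as pd

type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12], ['Project2', 41]],
                       columns=['Project', 'Time'])

# Give the index a name so that merge can refer to it like a column.
named_type_df = type_df.rename_axis('Project')

# 'Project' is a column in time_df and an index level in named_type_df.
merged = pd.merge(time_df, named_type_df, on='Project')

type2_count = int((merged['Project Type'] == 'Type 2').sum())
print(type2_count)  # 2
```

Naming the index once keeps the merge call symmetric, at the cost of one extra step compared with `left_index=True, right_on='Project'`.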

Tags: python, merge, pandas, python-2.7

Asked by user2242044. 2 answers, by maxymoo and jezrael.
