dataframe, set index from list

Tags: python, pandas

Is it possible, when creating a DataFrame from a list, to set the index to one of the values?

import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

  First Second
0     a     a1
1     b     b1

And how I'd like it to look:

  First Second
a     a     a1
b     b     b1

vandelay asked Aug 24 '16 20:08




3 Answers

Convert the column to a list before assigning it to the index:

df.index = list(df["First"])
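
Since the question asks about doing this at creation time, another option is the constructor's index argument. A minimal sketch, assuming the first element of each row should become the row label; tmp and the column names come from the question, while the list comprehension is only an illustration and not part of this answer:

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]

# Build the row labels from the first element of each inner list
# and pass them to the constructor through its `index` parameter.
labels = [row[0] for row in tmp]
df = pd.DataFrame(tmp, columns=["First", "Second"], index=labels)

# Same result via the approach above: assign the column values afterwards.
df2 = pd.DataFrame(tmp, columns=["First", "Second"])
df2.index = list(df2["First"])

print(df)
print(df.equals(df2))
```

Both routes keep First as a regular column, which matches the layout requested in the question.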

Aman Singh Kamboj answered Oct 13 '22 00:10


>>> pd.DataFrame(tmp, columns=["First", "Second"]).set_index('First', drop=False)
      First Second
First             
a         a     a1
b         b     b1
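
The drop=False argument is what keeps First as a regular column. A short sketch contrasting it with the default (drop=True), using the question's data; the rename_axis(None) step is an optional extra that removes the "First" header row above the index:

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

# Default (drop=True): "First" moves into the index and leaves the columns.
moved = df.set_index("First")

# drop=False: "First" becomes the index and also stays as a column.
kept = df.set_index("First", drop=False)

# Optional: clear the index name so no extra header row is printed.
unnamed = kept.rename_axis(None)

print(list(moved.columns))
print(list(kept.columns))
print(unnamed)
```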

Alexander answered Oct 13 '22 02:10


set_axis

To set arbitrary values as the index, best practice is to use set_axis:

df = df.set_axis(['idx1', 'idx2'])

#       First  Second
# idx1      a      a1
# idx2      b      b1
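
For the question's case, the labels passed to set_axis can come from the First column. A minimal sketch, assuming the question's tmp data; axis=0 (the rows) is the default and is spelled out only for clarity:

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

# set_axis returns a new DataFrame with the given row labels;
# the original df keeps its default integer index.
labeled = df.set_axis(df["First"].tolist(), axis=0)

print(labeled)
print(df.index.tolist())
```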

set_index (list vs array)

It's also possible to pass arbitrary values to set_index, but note the difference between passing a list vs array:

  • list: set_index assigns these columns as the index:

    df.set_index(['First', 'First'])
    
    #              Second
    # First First        
    # a     a          a1
    # b     b          b1
    
  • array (Series/Index/ndarray): set_index assigns these values as the index:

    df = df.set_index(pd.Series(['First', 'First']))
    
    #        First  Second
    # First      a      a1
    # First      b      b1
    

    Note that passing arrays to set_index has been contentious among the pandas developers and may be deprecated in the future.
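
The list vs array difference can be checked directly. A sketch assuming the question's data; the labels 'x' and 'y' are made up for illustration:

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

# A list is read as column labels: both columns move into a MultiIndex.
by_columns = df.set_index(["First", "Second"])

# An Index (or Series/ndarray) is read as the new index values themselves.
by_values = df.set_index(pd.Index(["x", "y"]))

# A plain list of strings that are not column names is rejected.
try:
    df.set_index(["x", "y"])
    list_error = None
except KeyError as exc:
    list_error = exc

print(by_columns.index.nlevels)
print(by_values.index.tolist())
print(type(list_error).__name__)
```

So a plain Python list of new labels has to be wrapped (for example in pd.Index) before set_index will treat it as values, whereas set_axis accepts the list as-is.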


Why not just modify df.index directly?

Directly modifying attributes is fine and is used often. Both approaches validate the length of the new labels, so error checking does not separate them:

    df = df.set_axis(['idx1', 'idx2', 'idx3'])

    # ValueError: Length mismatch: Expected axis has 2 elements, new values have 3 elements

    df.index = ['idx1', 'idx2', 'idx3']

    # ValueError: Length mismatch: Expected axis has 2 elements, new values have 3 elements

Using methods still has an advantage:
    
  • Methods can be chained, e.g.:

    df.some_method().set_axis(['idx1', 'idx2']).another_method()
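
As a concrete version of the placeholder chain above, here is a sketch in which sort_values and rename stand in for some_method and another_method (both are arbitrary choices for illustration):

```python
import pandas as pd

tmp = [['a', 'a1'], ['b', 'b1']]
df = pd.DataFrame(tmp, columns=["First", "Second"])

# One expression: sort descending, relabel the rows, lower-case the column names.
result = (
    df.sort_values("Second", ascending=False)
      .set_axis(["idx1", "idx2"])
      .rename(columns=str.lower)
)

print(result)
```

The same pipeline with attribute assignment would need an intermediate variable, because df.index = ... is a statement and cannot appear in the middle of an expression.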
    

tdy answered Oct 13 '22 00:10