IPython Notebook and Pandas autocomplete

I noticed that if I type df.column_name, I can autocomplete column_name with Tab in the IPython notebook.

Now, the proper syntax for doing something to a column would be df['column_name'], where I am unable to autocomplete (I assume because it is a string?). Is there any other notation or way to simplify typing out column names? I am essentially looking for a solution that would let me tab-autocomplete the column name within df['column_name'].

metersk asked Jan 31 '14 00:01



1 Answer

I've found the following method useful. It creates a namedtuple whose fields are the names of all the variables (columns) in the data frame, each field holding its own name as a string.

For example, consider the following data frame containing 2 variables called "variable_1" and "variable_2":

from collections import namedtuple
from pandas import DataFrame
import numpy as np

df = DataFrame({'variable_1': np.arange(5), 'variable_2': np.arange(5)})

The following code creates a namedtuple called "var":

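def ntuples():
    # Collect the column names of the global data frame `df`
    list_of_names = list(df.columns)
    # Map each name to itself so every field holds its own name as a string
    list_of_names_dict = {x: x for x in list_of_names}

    # Build a namedtuple type whose fields are the column names
    Varnames = namedtuple('Varnames', list_of_names)
    return Varnames(**list_of_names_dict)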

var = ntuples()

In a notebook, when I write var. and press Tab, the names of all the variables in the dataframe df will be displayed. Writing var.variable_1 is equivalent to writing 'variable_1'. So the following would work: df[var.variable_1].
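A caveat on the namedtuple approach: namedtuple field names must be valid Python identifiers, so a column such as 'total sales' makes namedtuple raise a ValueError. Below is a minimal sketch of one possible workaround, not part of the original method. The helper name column_names and the 'c_' prefix are hypothetical choices, and a plain list stands in for df.columns so the example runs without pandas.

```python
import keyword
import re
from types import SimpleNamespace


def column_names(columns):
    """Return an object whose attributes hold the original column names.

    Hypothetical helper for illustration; not part of pandas or IPython.
    """
    mapping = {}
    for name in columns:
        # Replace every character that cannot appear in an identifier
        attr = re.sub(r'\W', '_', str(name))
        # Identifiers cannot start with a digit or be a Python keyword
        if attr[:1].isdigit() or keyword.iskeyword(attr):
            attr = 'c_' + attr
        # The attribute value is the untouched original name
        mapping[attr] = name
    return SimpleNamespace(**mapping)


# A plain list stands in for df.columns here
cols = column_names(['variable_1', 'total sales', '2014'])
print(cols.total_sales)  # the original name, space included
```

With a real data frame this would be used as df[cols.total_sales]. Note that two columns whose sanitized names collide (for example 'a b' and 'a_b') would overwrite each other in this sketch.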

The reason I define a function to do it is that you will often add new variables to a data frame. To pick up the new variables in the namedtuple "var", call the function again and reassign the result, var = ntuples(), and you are good to go.
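The refresh step can be sketched as follows. This is an illustration under assumptions: make_var is a hypothetical variant of ntuples() that takes the names as an argument, and a plain list stands in for df.columns so the snippet runs without pandas. It shows that the namedtuple is a snapshot and has to be rebuilt and rebound after a column is added.

```python
from collections import namedtuple


def make_var(columns):
    # Hypothetical helper: like ntuples(), but takes the names as an argument
    names = [str(c) for c in columns]
    Varnames = namedtuple('Varnames', names)
    # Each field holds its own name as a string
    return Varnames(*names)


columns = ['variable_1', 'variable_2']  # stands in for df.columns
var = make_var(columns)

columns.append('variable_3')            # e.g. after df['variable_3'] = ...
print(hasattr(var, 'variable_3'))       # False: the old var is a snapshot

var = make_var(columns)                 # rebuild and rebind
print(var.variable_3)                   # variable_3
```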

Maturin answered Sep 24 '22 15:09
