IPython Notebook and Pandas autocomplete

Question 1

I've found the following method useful. It creates a namedtuple whose fields are the names of all the variables in the data frame, each holding its own name as a string.

For example, consider the following data frame containing 2 variables called "variable_1" and "variable_2":

from collections import namedtuple
from pandas import DataFrame
import numpy as np

df = DataFrame({'variable_1':np.arange(5),'variable_2':np.arange(5)})

The following code creates a namedtuple called "var":

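def ntuples():
    # Column names of the global data frame `df`; each one must be a
    # valid Python identifier to be usable as a namedtuple field.
    list_of_names = list(df.columns)
    # Map every name to itself, so that var.variable_1 == 'variable_1'.
    list_of_names_dict = {x: x for x in list_of_names}

    Varnames = namedtuple('Varnames', list_of_names)
    return Varnames(**list_of_names_dict)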

var = ntuples()

In a notebook, when I type var. and press Tab, the names of all the variables in the data frame df are displayed. Writing var.variable_1 is equivalent to writing 'variable_1', so the following works: df[var.variable_1].

The reason I define a function to do it is that you will often add new variables to a data frame. A namedtuple is immutable, so "var" does not pick up new columns on its own. To refresh it, call the function again and reassign the result, var = ntuples(), and you are good to go.
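If some column names are not valid Python identifiers (for example, they contain spaces), the namedtuple call above raises a ValueError. Here is a minimal sketch of one way around that, under my own assumptions: the helper name column_names and the sample column 'variable 2' are made up for illustration, and it relies on the rename=True option of namedtuple, which replaces invalid field names with positional ones such as _1.

```python
from collections import namedtuple

import pandas as pd


def column_names(frame):
    """Return a namedtuple whose values are the column names of `frame`.

    Hypothetical helper, not part of pandas. Takes the data frame as an
    argument instead of reading a global variable.
    """
    fields = [str(c) for c in frame.columns]
    # rename=True swaps invalid field names (spaces, keywords, duplicates)
    # for positional names: _0, _1, ...
    Varnames = namedtuple('Varnames', fields, rename=True)
    # Pass the values positionally, since renamed fields no longer match
    # the original column names.
    return Varnames(*frame.columns)


# Illustrative data: the second column name contains a space on purpose.
df = pd.DataFrame({'variable_1': range(5), 'variable 2': range(5)})
var = column_names(df)

first = df[var.variable_1]   # valid name, reachable by attribute
second = df[var._1]          # 'variable 2' is reachable as field _1
```

The trade-off is that renamed fields lose their descriptive name in tab completion, so this only helps when most columns already have identifier-friendly names.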

Question 2

I'm not sure how your data is set up, but when I import a csv/txt file, I specify the names of the columns in a list, such as:

names = ['col_1', 'col_2', 'col_3']

and so on, and then import my file like this:

import pandas as pd
# header=0 replaces the file's own header row with `names`; use header=None if the file has no header row
data = pd.read_csv('./some_file.txt', header=0, delimiter='\t', names=names)

You could then use tab completion like this:

new_thing = data[names[1]]

Here you would hit Tab as you start to type "names", and then all you have to do is pick the index of the name you want. I'm not sure this is any more efficient than simply typing out the column name.
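To make this concrete, here is a small self-contained sketch of the same approach. The in-memory text stands in for ./some_file.txt and its contents are made up; I'm also assuming the file has a header row that should be replaced by names.

```python
import io

import pandas as pd

names = ['col_1', 'col_2', 'col_3']

# Hypothetical tab-separated content standing in for ./some_file.txt.
raw = io.StringIO("a\tb\tc\n1\t2\t3\n4\t5\t6\n")

# header=0 discards the file's own header row ("a b c") in favor of `names`.
data = pd.read_csv(raw, header=0, sep='\t', names=names)

# Tab-complete "names", then pick the column by its position in the list.
new_thing = data[names[1]]
```

Note that names[1] is only as readable as your memory of the column order, which is the main drawback compared with typing 'col_2' directly.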