Question

I noticed that if I type df.column_name, I can autocomplete column_name with Tab in an IPython notebook.

Now, the proper syntax for doing something to a column would be df['column_name'], where I am unable to autocomplete (I am assuming because it is a string?). Is there any other notation or way to simplify typing out column names? I am essentially looking for a solution that would allow me to tab-autocomplete the column name within df['column_name'].

Solution

I've found the following method to be useful to me. It basically creates a namedtuple containing the names of all the variables in the data frame as strings.

For example, consider the following data frame containing 2 variables called "variable_1" and "variable_2":

from collections import namedtuple
from pandas import DataFrame
import numpy as np

df = DataFrame({'variable_1':np.arange(5),'variable_2':np.arange(5)})

The following code creates a namedtuple called "var":

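def ntuples():
    # Column names of the global data frame df
    list_of_names = df.columns.values
    # Map each name to itself so that var.variable_1 == 'variable_1'
    list_of_names_dict = {x: x for x in list_of_names}

    # Field names must be valid Python identifiers for namedtuple to accept them
    Varnames = namedtuple('Varnames', list_of_names)
    return Varnames(**list_of_names_dict)

var = ntuples()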

In a notebook, when I write var. and press Tab, the names of all the variables in the dataframe df will be displayed. Writing var.variable_1 is equivalent to writing 'variable_1'. So the following would work: df[var.variable_1].

The reason I define a function to do it is that you will often add new variables to a data frame. To pick up the new variables in your namedtuple "var", simply call the function again and reassign the result, var = ntuples(), and you are good to go.
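
A variation on the function above, offered only as a sketch: it takes the data frame as an argument instead of relying on the global df, and it skips column names that namedtuple would reject (names with spaces, Python keywords, names starting with an underscore, duplicates). The name column_names and the filtering rule are assumptions for illustration, not part of the original answer.

```python
from collections import namedtuple
import keyword

import pandas as pd


def column_names(frame):
    """Return a namedtuple mapping each usable column name to itself."""
    usable = [
        c for c in dict.fromkeys(frame.columns)  # drop duplicates, keep order
        if isinstance(c, str)
        and c.isidentifier()            # rules out names such as 'variable 2'
        and not keyword.iskeyword(c)    # rules out names such as 'class'
        and not c.startswith('_')       # namedtuple forbids a leading underscore
    ]
    Varnames = namedtuple('Varnames', usable)
    return Varnames(*usable)


df = pd.DataFrame({'variable_1': range(5), 'variable 2': range(5)})
var = column_names(df)

# After adding a column, rebuild the namedtuple so the new name shows up.
df['variable_3'] = df['variable_1'] * 2
var = column_names(df)
doubled = df[var.variable_3]
```

Columns that are filtered out, such as 'variable 2' here, still have to be typed as plain strings.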

Other tips

I'm not sure how your data is set up, but when I import a csv/txt file, I specify the names of the columns in a list, such as...

names = ['col_1', 'col_2', 'col_3']

etc... and then import my file as such...

import pandas as pd
# header=0 replaces the file's own header row with names; use header=None if the file has no header row
data = pd.read_csv('./some_file.txt', header=0, delimiter='\t', names=names)

You could then do tab completion like...

new_thing = data[names[1]]

where you would be hitting Tab as you started to type "names", and then all you would have to do is specify which item of "names" you wanted. I'm not sure if this is any more efficient than simply typing out the word.
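
To try the approach above without a file on disk, here is a minimal sketch that reads from an in-memory buffer. The file contents are made up for illustration, and it assumes the file has its own header row, which header=0 discards in favor of the names list.

```python
import io

import pandas as pd

names = ['col_1', 'col_2', 'col_3']

# Made-up, tab-separated stand-in for './some_file.txt', with a header row.
raw = io.StringIO("a\tb\tc\n1\t2\t3\n4\t5\t6\n")

# header=0 treats the first line as a header and replaces it with names.
data = pd.read_csv(raw, header=0, delimiter='\t', names=names)

# "names" can be tab-completed; the index picks the column.
new_thing = data[names[1]]
```

The drawback is that names[1] says nothing about which column it is, so this trades typing for readability.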

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow