How do I use dictionary keys and values to rename columns in a pandas DataFrame?

I am building functions to help me load data from the web. The problem I am trying to solve is that column names differ depending on the source. For example, Yahoo Finance column headings look like this: Open, High, Low, Close, Volume, Adj Close. Quandl.com data sets have DATE, VALUE, date, value, etc. The mix of upper and lower case throws everything off, and Value and Adj. Close for the most part mean the same thing. I want to map columns with different names but the same meaning to a single name. For example, Adj. Close and value both become AC; Open, OPEN, and open all become O.

So I have a CSV file ("Functions//ColumnNameChanges.txt") that stores the dict() keys and values for the column names:

Date,D
Open,O
High,H

and then I wrote this function to populate my dictionary

def DictKeyValuesFromText ():

    Dictionary = {}
    TextFileName = "Functions//ColumnNameChanges.txt"
    with open(TextFileName,'r') as f:
        for line in f:
            x = line.find(",")
            y = line.find("/")
            k = line[0:x]
            v = line[x+1:y]

            Dictionary[k] = v
    return Dictionary

This is the output of print(DictKeyValuesFromText())

{'': '', 'Date': 'D', 'High': 'H', 'Open': 'O'}

The next function is where my problem is:

def ChangeColumnNames(DataFrameFileLocation):
    x = DictKeyValuesFromText()
    df = pd.read_csv(DataFrameFileLocation)
    for y in df.columns:
        if y not in x.keys():
            i = input("The column " +  y +  " is not in the list, give a name:")
            df.rename(columns={y:i}) 
        else:
            df.rename(columns={y:x[y]})

    return df

df.rename is not working. This is the output I get from print(ChangeColumnNames("Tvix_data.csv")):

The column Low is not in the list, give a name:L
The column Close is not in the list, give a name:C
The column Volume is not in the list, give a name:V
The column Adj Close is not in the list, give a name:AC
            Date        Open        High         Low       Close    Volume  \
0     2010-11-30  106.269997  112.349997  104.389997  112.349997         0
1     2010-12-01   99.979997  100.689997   98.799998  100.689997         0
2     2010-12-02   98.309998   98.309998   86.499998   86.589998         0

The column names should be D, O, H, L, C, V. I am missing something; any help would be appreciated.

asked Jan 21 '17 18:01 by ZacAttack


1 Answer

df.rename works just fine, but it does not operate in place by default: it returns a new DataFrame and leaves the original unchanged. Either re-assign its return value or pass inplace=True. It expects a dictionary with old names as keys and new names as values.

df = df.rename(columns = {'col_a': 'COL_A', 'col_b': 'COL_B'})

or

df.rename(columns = {'col_a': 'COL_A', 'col_b': 'COL_B'}, inplace=True)
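Applied to the loop in the question, the fix might look like the sketch below. It is only an illustration, not the asker's final code: the function name rename_columns, the ask parameter, and the small in-memory DataFrame standing in for Tvix_data.csv are all assumptions made for the example. It collects every old-to-new name pair first and then calls rename once, returning the result.

```python
import pandas as pd


def rename_columns(df, mapping, ask=input):
    """Return a new DataFrame whose columns are renamed via `mapping`.

    Columns missing from `mapping` are resolved by calling `ask(prompt)`.
    (`rename_columns` and `ask` are hypothetical names for this sketch.)
    """
    new_names = {}
    for col in df.columns:
        if col in mapping:
            new_names[col] = mapping[col]
        else:
            new_names[col] = ask("The column " + col + " is not in the list, give a name:")
    # rename returns a new DataFrame, so its result must be returned or re-assigned
    return df.rename(columns=new_names)


# Small in-memory frame standing in for the CSV file from the question
df = pd.DataFrame(
    {"Date": ["2010-11-30"], "Open": [106.27], "High": [112.35], "Low": [104.39]}
)
mapping = {"Date": "D", "Open": "O", "High": "H"}

# A lambda replaces interactive input so the example runs unattended
renamed = rename_columns(df, mapping, ask=lambda prompt: "L")

print(list(renamed.columns))  # ['D', 'O', 'H', 'L']
print(list(df.columns))       # ['Date', 'Open', 'High', 'Low'] (original untouched)
```

Building the mapping first and renaming once avoids creating a throwaway DataFrame per column, and passing the prompt function as a parameter keeps the interactive part easy to replace in tests.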

answered Oct 23 '22 18:10 by DeepSpace