
How to correctly read csv in Pandas while changing the names of the columns


An absolutely basic read_csv question.

I have data that looks like the following in a csv file -

Date,Open Price,High Price,Low Price,Close Price,WAP,No.of Shares,No. of Trades,Total Turnover (Rs.),Deliverable Quantity,% Deli. Qty to Traded Qty,Spread High-Low,Spread Close-Open
28-February-2015,2270.00,2310.00,2258.00,2294.85,2279.192067772602217319,73422,8043,167342840.00,11556,15.74,52.00,24.85
27-February-2015,2267.25,2280.85,2258.00,2266.35,2269.239841485775122730,50721,4938,115098114.00,12297,24.24,22.85,-0.90
26-February-2015,2314.90,2314.90,2250.00,2259.50,2277.198324862194860047,69845,8403,159050917.00,22046,31.56,64.90,-55.40
25-February-2015,2290.00,2332.00,2278.35,2318.05,2315.100614216488163214,161995,10174,375034724.00,102972,63.56,53.65,28.05
24-February-2015,2276.05,2295.00,2258.00,2278.15,2281.058946240263344242,52251,7726,119187611.00,13292,25.44,37.00,2.10
23-February-2015,2303.95,2311.00,2253.25,2270.70,2281.912259219760108491,75951,7344,173313518.00,24969,32.88,57.75,-33.25
20-February-2015,2324.00,2335.20,2277.00,2284.30,2301.631421152326354478,79717,10233,183479152.00,23045,28.91,58.20,-39.70
19-February-2015,2304.00,2333.90,2292.00,2326.60,2321.485466301625211160,85835,8847,199264705.00,29728,34.63,41.90,22.60
18-February-2015,2284.00,2305.00,2261.10,2295.75,2282.060986778089405300,69884,6639,159479550.00,26665,38.16,43.90,11.75
16-February-2015,2281.00,2305.85,2266.00,2278.50,2284.961866239581019628,85541,10149,195457923.00,22164,25.91,39.85,-2.50
13-February-2015,2311.00,2324.90,2286.95,2296.40,2311.371235111317676864,109731,5570,253629077.00,69039,62.92,37.95,-14.60
12-February-2015,2280.00,2322.85,2275.00,2315.45,2301.372038211769425569,79766,9095,183571242.00,33981,42.60,47.85,35.45
11-February-2015,2275.00,2295.00,2258.25,2287.20,2279.587966250020639664,60563,7467,138058686.00,20058,33.12,36.75,12.20
10-February-2015,2244.90,2297.40,2225.00,2280.30,2269.562228214830293104,141656,13026,321497107.00,55577,39.23,72.40,35.40

--

I am trying to read this data into a pandas DataFrame using the following variations of read_csv. I am only interested in two columns.

z = pd.read_csv('file.csv', parse_dates=True, index_col="Date", usecols=["Date", "Open Price", "Close Price"], names=["Date", "O", "C"], header=0)

What I get is

              O    C
Date                
2015-02-28  NaN  NaN
2015-02-27  NaN  NaN
2015-02-26  NaN  NaN
2015-02-25  NaN  NaN
2015-02-24  NaN  NaN

Or 
z = pd.read_csv('file.csv', parse_dates=True, index_col="Date", usecols=["Date", "Open", "Close"], names=["Date", "Open Price", "Close Price"], header=0)

The result is -

           Open Price Close Price
Date                             
2015-02-28        NaN         NaN
2015-02-27        NaN         NaN
2015-02-26        NaN         NaN
2015-02-25        NaN         NaN

Am I missing something fundamental or is there an issue with read_csv of pandas 0.13.1 - my version on Debian Wheezy?

asked Apr 04 '15 03:04 by gabhijit




1 Answer

You are right, something is odd with the names argument. It seems that you cannot combine it with usecols the way you tried: either you supply a name for every column of the CSV file, or you do not pass names at all. So it seems that you cannot set names when you are reading only a subset of the columns (usecols).

From the read_csv documentation:

names : array-like. List of column names to use. If file contains no header row, then you should explicitly pass header=None.
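To illustrate the "name every column" option, here is a minimal sketch. It assumes a reasonably recent pandas (behaviour on 0.13.1 may differ), uses two rows from the question trimmed to the first five columns, and the short names "H" and "L" are hypothetical, added only so that the list covers every column. With header=0 the file's own header row is discarded and replaced by names, so usecols has to refer to the new names.

```python
import pandas as pd
from io import StringIO

# Two rows from the question, trimmed to the first five columns for brevity.
sample = """Date,Open Price,High Price,Low Price,Close Price
28-February-2015,2270.00,2310.00,2258.00,2294.85
27-February-2015,2267.25,2280.85,2258.00,2266.35
"""

# One name per column in the file. "H" and "L" are hypothetical short names.
all_names = ["Date", "O", "H", "L", "C"]

renamed = pd.read_csv(
    StringIO(sample),
    header=0,                    # discard the header row found in the file
    names=all_names,             # and use these names instead
    usecols=["Date", "O", "C"],  # select columns by their NEW names
)

print(renamed)
```

The drawback is that the names list has to stay in step with the file layout, which is why renaming after the read is often simpler.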

You might already know this, but you can also rename the columns after reading the file.

import pandas as pd
from io import StringIO  # Python 3; on Python 2 this was `from StringIO import StringIO`

csv = r"""Date,Open Price,High Price,Low Price,Close Price,WAP,No.of Shares,No. of Trades,Total Turnover (Rs.),Deliverable Quantity,% Deli. Qty to Traded Qty,Spread High-Low,Spread Close-Open
28-February-2015,2270.00,2310.00,2258.00,2294.85,2279.192067772602217319,73422,8043,167342840.00,11556,15.74,52.00,24.85
27-February-2015,2267.25,2280.85,2258.00,2266.35,2269.239841485775122730,50721,4938,115098114.00,12297,24.24,22.85,-0.90
26-February-2015,2314.90,2314.90,2250.00,2259.50,2277.198324862194860047,69845,8403,159050917.00,22046,31.56,64.90,-55.40
25-February-2015,2290.00,2332.00,2278.35,2318.05,2315.100614216488163214,161995,10174,375034724.00,102972,63.56,53.65,28.05
24-February-2015,2276.05,2295.00,2258.00,2278.15,2281.058946240263344242,52251,7726,119187611.00,13292,25.44,37.00,2.10
23-February-2015,2303.95,2311.00,2253.25,2270.70,2281.912259219760108491,75951,7344,173313518.00,24969,32.88,57.75,-33.25
20-February-2015,2324.00,2335.20,2277.00,2284.30,2301.631421152326354478,79717,10233,183479152.00,23045,28.91,58.20,-39.70
19-February-2015,2304.00,2333.90,2292.00,2326.60,2321.485466301625211160,85835,8847,199264705.00,29728,34.63,41.90,22.60
18-February-2015,2284.00,2305.00,2261.10,2295.75,2282.060986778089405300,69884,6639,159479550.00,26665,38.16,43.90,11.75
16-February-2015,2281.00,2305.85,2266.00,2278.50,2284.961866239581019628,85541,10149,195457923.00,22164,25.91,39.85,-2.50
13-February-2015,2311.00,2324.90,2286.95,2296.40,2311.371235111317676864,109731,5570,253629077.00,69039,62.92,37.95,-14.60
12-February-2015,2280.00,2322.85,2275.00,2315.45,2301.372038211769425569,79766,9095,183571242.00,33981,42.60,47.85,35.45
11-February-2015,2275.00,2295.00,2258.25,2287.20,2279.587966250020639664,60563,7467,138058686.00,20058,33.12,36.75,12.20
10-February-2015,2244.90,2297.40,2225.00,2280.30,2269.562228214830293104,141656,13026,321497107.00,55577,39.23,72.40,35.40"""

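# Select the wanted columns by their original header names.
df = pd.read_csv(StringIO(csv),
                 usecols=["Date", "Open Price", "Close Price"],
                 header=0)

# Rename the columns once the frame is loaded.
df.columns = ['Date', 'O', 'C']

print(df)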

output:

                Date        O        C
0   28-February-2015  2270.00  2294.85
1   27-February-2015  2267.25  2266.35
2   26-February-2015  2314.90  2259.50
3   25-February-2015  2290.00  2318.05
4   24-February-2015  2276.05  2278.15
5   23-February-2015  2303.95  2270.70
6   20-February-2015  2324.00  2284.30
7   19-February-2015  2304.00  2326.60
8   18-February-2015  2284.00  2295.75
9   16-February-2015  2281.00  2278.50
10  13-February-2015  2311.00  2296.40
11  12-February-2015  2280.00  2315.45
12  11-February-2015  2275.00  2287.20
13  10-February-2015  2244.90  2280.30
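If you also want the parsed dates as the index, as in the question, something along these lines might work. This is only a sketch: it assumes a recent pandas, an English locale for the month names, and the date format "%d-%B-%Y" inferred from the sample rows. It also renames through a mapping instead of assigning df.columns, so the result does not depend on column order. The sample is three rows from the question trimmed to the first five columns.

```python
import pandas as pd
from io import StringIO

# Three rows from the question, trimmed to the first five columns for brevity.
sample = """Date,Open Price,High Price,Low Price,Close Price
28-February-2015,2270.00,2310.00,2258.00,2294.85
27-February-2015,2267.25,2280.85,2258.00,2266.35
26-February-2015,2314.90,2314.90,2250.00,2259.50
"""

prices = pd.read_csv(
    StringIO(sample),
    usecols=["Date", "Open Price", "Close Price"],  # original header names
    header=0,
)

# Map old name -> new name; columns not listed keep their name.
prices = prices.rename(columns={"Open Price": "O", "Close Price": "C"})

# Parse the dates explicitly (assumed format, e.g. "28-February-2015"),
# then use them as the index.
prices["Date"] = pd.to_datetime(prices["Date"], format="%d-%B-%Y")
prices = prices.set_index("Date")

print(prices)
```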
answered Oct 16 '22 04:10 by Papouche Guinslyzinho