Q1: I have a strange issue that I can't figure out.
I'm parsing a CSV file with NumPy. A portion of the file (which contains 253 rows and 4 columns) is shown below:
Code Date NetPrice Gain
MICRO US 01/05/2012 613.98 0
MICRO US 01/06/2012 622.75 1.09342432
MICRO US 01/07/2012 690.99 -0.44342342
MICRO US 01/08/2012 611.26 -3.242423423
I'm parsing through the CSV File using the code below:
micro_info = np.genfromtxt('MICRO.csv', delimiter=',', dtype=None, names=True)
However, when I run the code below, the first line prints (253,), while the second line prints the full contents of the CSV file, with all 253 rows and 4 columns. I don't understand why this is so.
print micro_info.shape
print micro_info
Q2: Does what I am doing below make sense?
I'm essentially looking to convert the Dates to floats so that I can use Matplotlib to plot the NetPrice values of MICRO US against each Date. For this I use the code below:
convertingdates = strpdate2num(micro_info[1:,2])
datesasfloat = {1: convertingdates}
micro_info = np.genfromtxt('MICRO.csv', delimiter=',', dtype=None, converters = datesasfloat, names=True)
I will then access the Dates and NetPrice as required.
Thank You
With your sample text, this works:
In [314]: dconverter=pylab.strpdate2num('%M/%S/%Y')
In [316]: names='code us Date NetPrice Gain'.split()
In [317]: data=np.genfromtxt(ss,skip_header=1,dtype=None,
              converters={'Date':dconverter},names=names)
In [318]: data.shape
Out[318]: (4,)
In [319]: data['Date']
Out[319]:
array([ 734503.00075231, 734503.00076389, 734503.00077546,
734503.00078704])
In [320]: data['NetPrice']
Out[320]: array([ 613.98, 622.75, 690.99, 611.26])
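It uses the default whitespace delimiter. Because that splits 'MICRO US' into two fields, I used a custom names list rather than the header line. I also changed your use of strpdate2num: it takes a format string and returns a converter function, so it should not be called on the column data. Note that the '%M/%S/%Y' format above reads the first two numbers as minutes and seconds, which is why the Date values differ only in their fractional part; '%m/%d/%Y' is the correct format for these dates.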
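For anyone on Python 3, here is a minimal sketch of the same whitespace-delimited parse, leaving the date converter out. The `sample` text is a stand-in for `ss` (an assumption, since the original `ss` isn't shown). It also speaks to Q1: with named fields, genfromtxt returns a 1-D structured array with one record per row, so the shape is (n_rows,) and the columns are reached by field name rather than through a second axis.

```python
from io import StringIO

import numpy as np

# Stand-in for the sample text (hypothetical; the original `ss` is not shown)
sample = StringIO(
    "Code Date NetPrice Gain\n"
    "MICRO US 01/05/2012 613.98 0\n"
    "MICRO US 01/06/2012 622.75 1.09342432\n"
    "MICRO US 01/07/2012 690.99 -0.44342342\n"
    "MICRO US 01/08/2012 611.26 -3.242423423\n"
)

# 'MICRO US' splits into two fields on whitespace, so supply five names
# and skip the four-name header line.
names = "code us Date NetPrice Gain".split()
data = np.genfromtxt(sample, skip_header=1, dtype=None,
                     names=names, encoding="utf-8")

print(data.shape)        # (4,): one record per row, not (4, 5)
print(data.dtype.names)  # the columns live here, as named fields
print(data["NetPrice"])  # access a column by field name
```

The same reasoning explains the (253,) in the question: 253 records, each holding the 4 named fields.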
If the file were comma-delimited, this would work (using a corrected date format, '%m/%d/%Y'):
In [410]: dconverter=pylab.strpdate2num('%m/%d/%Y')
In [412]: data=np.genfromtxt(ss,names=True,delimiter=',',dtype=None,
              autostrip=True,converters={'Date':dconverter})
In [413]: data
Out[413]:
array([('MICRO US', 734507.0, 613.98, 0.0),
('MICRO US', 734508.0, 622.75, 1.09342432),
('MICRO US', 734509.0, 690.99, -0.44342342),
('MICRO US', 734510.0, 611.26, -3.242423423)],
dtype=[('Code', 'S8'), ('Date', 'O'), ('NetPrice', '<f8'), ('Gain', '<f8')])
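As far as I know, strpdate2num was deprecated and later removed from Matplotlib, so on a current install the converter has to be written by hand. Below is a sketch of one way to do that on Python 3. The helper name `date_to_ordinal` and the in-memory `csv_text` are my own assumptions, standing in for a comma-delimited MICRO.csv. It uses `datetime.toordinal()`, which gives the same day numbers as the output above.

```python
from datetime import datetime
from io import StringIO

import numpy as np

# Hypothetical stand-in for a comma-delimited MICRO.csv
csv_text = StringIO(
    "Code,Date,NetPrice,Gain\n"
    "MICRO US,01/05/2012,613.98,0\n"
    "MICRO US,01/06/2012,622.75,1.09342432\n"
    "MICRO US,01/07/2012,690.99,-0.44342342\n"
    "MICRO US,01/08/2012,611.26,-3.242423423\n"
)


def date_to_ordinal(value):
    """Convert an 'mm/dd/YYYY' string to a day number as a float."""
    if isinstance(value, bytes):  # some NumPy versions pass bytes to converters
        value = value.decode()
    return float(datetime.strptime(value.strip(), "%m/%d/%Y").toordinal())


data = np.genfromtxt(csv_text, delimiter=",", names=True, dtype=None,
                     encoding="utf-8", autostrip=True,
                     converters={"Date": date_to_ordinal})

# The converted column may come back with object dtype (as in the output
# above), so cast it explicitly before doing arithmetic or plotting.
dates = data["Date"].astype(float)
print(dates)
print(data["NetPrice"])
```

One caveat: newer Matplotlib releases count dates from a different epoch by default, so for plotting it is probably safer to convert with `matplotlib.dates.date2num`, or to pass datetime objects to the plot call directly, than to rely on raw ordinals.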
Another way to handle the delimiter is to give a list of field widths, so that genfromtxt treats the input as fixed-width columns. For some reason this required an explicit dtype.
dt=np.dtype([('Code', 'S8'), ('Date', 'O'), ('NetPrice', '<f8'), ('Gain', '<f8')])
data=np.genfromtxt(ss, names=True, delimiter=[15,10,11,12],
                   converters={'Date':dconverter}, dtype=dt)