load csv into 2D matrix with numpy for plotting

Given this CSV file:

    "A","B","C","D","E","F","timestamp"
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
    611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12

I simply want to load it as a matrix/ndarray with 3 rows and 7 columns. However, for some reason, all I can get out of numpy is an ndarray with 3 rows (one per line) and no columns.

    r = np.genfromtxt(fname,delimiter=',',dtype=None, names=True)
    print r
    print r.shape

    [ (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291111964948.0)
     (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291113113366.0)
     (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291120650486.0)]
    (3,)

I can manually iterate and hack it into the shape I want, but this seems silly. I just want to load it as a proper matrix so I can slice it across different dimensions and plot it, just like in matlab.

417

asked Nov 30 '10 15:11

dgorissen

2 Answers

Pure numpy

numpy.loadtxt(open("test.csv", "rb"), delimiter=",", skiprows=1)

Check out the loadtxt documentation.
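As a minimal sketch of the loadtxt call above, assuming the question's rows are held in an in-memory string instead of test.csv (the csv_text name and the io.StringIO stand-in are illustrative, not part of the original answer):

```python
import io
import numpy as np

# Stand-in for test.csv, using the rows from the question (assumption: in-memory text).
csv_text = '''"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
'''

# skiprows=1 drops the header line so every remaining field parses as a float.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

print(data.shape)           # rows x columns of the loaded matrix
first_column = data[:, 0]   # column "A" across all rows
timestamps = data[:, -1]    # last column across all rows
```

Because the result is an ordinary 2D float array, row and column slices such as data[0, :] or data[:, 2] should work the way they do in matlab.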

You can also use python's csv module:

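    import csv
    import numpy

    # csv.reader needs a text-mode file in Python 3 (binary "rb" only worked in Python 2)
    with open("test.csv", newline="") as f:
        reader = csv.reader(f, delimiter=",")
        next(reader)  # skip the header row so only numeric rows remain
        x = list(reader)
    result = numpy.array(x).astype("float")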

You will have to convert it to your favorite numeric type. I guess you can write the whole thing in one line:

    # [1:] drops the header row; text mode is required by csv.reader in Python 3
    result = numpy.array(list(csv.reader(open("test.csv", newline=""), delimiter=","))[1:]).astype("float")

Added Hint:

You could also use pandas.io.parsers.read_csv (exposed as pandas.read_csv) and take the NumPy array behind the resulting DataFrame, which can be faster on large files.
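A rough sketch of the pandas route, assuming pandas is installed and again using an in-memory string in place of test.csv (csv_text and frame are illustrative names):

```python
import io
import pandas as pd

# Stand-in for test.csv, using the rows from the question (assumption: in-memory text).
csv_text = '''"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
'''

# read_csv treats the first line as column labels by default.
frame = pd.read_csv(io.StringIO(csv_text))

# to_numpy() returns the values as a 2D ndarray; all columns here are floats.
data = frame.to_numpy()

print(data.shape)
print(list(frame.columns))
```

The column labels stay available on frame, so frame["timestamp"] and data[:, -1] refer to the same values.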

119

answered Sep 30 '22 08:09

Kaveh_kh

I think passing dtype=None when there is a header row of names is confusing the routine. Try:

    >>> r = np.genfromtxt(fname, delimiter=',', names=True)
    >>> r
    array([[  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
              8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
              1.29111196e+12],
           [  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
              8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
              1.29111311e+12],
           [  6.11882430e+02,   9.08956010e+03,   5.13300000e+03,
              8.64075140e+02,   1.71537476e+03,   7.65227770e+02,
              1.29112065e+12]])
    >>> r[:,0]    # Slice 0'th column
    array([ 611.88243,  611.88243,  611.88243])
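One caveat, as far as I can tell: in recent NumPy releases names=True makes genfromtxt return a structured array (one record per row, shape (3,)) even without dtype=None, so the 2D output above may not reproduce. A hedged sketch of two ways to get a plain matrix instead, using skip_header=1 or numpy.lib.recfunctions.structured_to_unstructured, with an in-memory string standing in for fname (csv_text is an illustrative name):

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Stand-in for the file behind fname, using the rows from the question.
csv_text = '''"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
'''

# names=True: one record per row, fields addressed by name, so the shape is 1D.
records = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# skip_header=1: the header is discarded and the result is a homogeneous 2D float array.
matrix = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# An already-loaded structured array can be converted without re-reading the file.
converted = rfn.structured_to_unstructured(records)

print(records.shape, matrix.shape, converted.shape)
column_zero = matrix[:, 0]   # slice the 0'th column, as in the answer above
```

The skip_header=1 form is the simpler choice when the column names are not needed; the conversion helper is useful when the named fields are wanted first and the matrix only later.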

answered Sep 30 '22 09:09

mtrw


Tags: python, arrays, csv, numpy, reshape
