Variable Number of Columns in genfromtxt() in Python?

Question

I have a .txt file whose rows have different lengths. Each row is a series of points representing one trajectory. Since every trajectory has its own length, the number of columns varies from one row to another.

AFAIK, the genfromtxt() function in NumPy requires every row to have the same number of columns.

>>> import numpy as np
>>> 
>>> data=np.genfromtxt('deer_1995.txt', skip_header=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\numpy\lib\npyio.py", line 1638, in genfromtxt
    raise ValueError(errmsg)
ValueError: Some errors were detected !
    Line #4 (got 2352 columns instead of 1824)
    Line #5 (got 2182 columns instead of 1824)
    Line #6 (got 1412 columns instead of 1824)
    Line #7 (got 1650 columns instead of 1824)
    Line #8 (got 1688 columns instead of 1824)
    Line #9 (got 1500 columns instead of 1824)
    Line #10 (got 1208 columns instead of 1824)

It can also fill in missing values with the filling_values argument. However, I think that introduces unnecessary trouble, which I wish to avoid.

So what is the best (Pythonic) way of simply importing this data set without filling in the "missing values"?

lucasg · Accepted Answer

numpy.genfromtxt does not handle variable-length rows, because NumPy only works with arrays and matrices, which have fixed row and column sizes.

You need to parse your data manually. For example:

The data (CSV-like, semicolon-separated):

0.613 ;  5.919 
0.615 ;  5.349
0.615 ;  5.413
0.617 ;  6.674
0.617 ;  6.616
0.63 ;   7.418
0.642 ;  7.809 ; 5.919
0.648 ;  8.04
0.673 ;  8.789
0.695 ;  9.45
0.712 ;  9.825
0.734 ;  10.265
0.748 ;  10.516
0.764 ;  10.782
0.775 ;  10.979
0.783 ;  11.1
0.808 ;  11.479
0.849 ;  11.951
0.899 ;  12.295
0.951 ;  12.537
0.972 ;  12.675
1.038 ;  12.937
1.098 ;  13.173
1.162 ;  13.464
1.228 ;  13.789
1.294 ;  14.126
1.363 ;  14.518
1.441 ;  14.969
1.545 ;  15.538
1.64 ;   16.071
1.765 ;  16.7
1.904 ;  17.484
2.027 ;  18.36
2.123 ;  19.235
2.149 ;  19.655
2.172 ;  20.096
2.198 ;  20.528
2.221 ;  20.945
2.265 ;  21.352
2.312 ;  21.76
2.365 ;  22.228
2.401 ;  22.836
2.477 ;  23.804

The parser:

import csv

data = []
with open('i.csv', 'r') as datafile:
    # The fields are separated by semicolons, so tell csv.reader to split on ';'
    datareader = csv.reader(datafile, delimiter=';')
    for row in datareader:
        # Cast every element to a float; float() ignores surrounding spaces
        data.append([float(elem) for elem in row])

print(data)
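
The file in the question appears to be whitespace-delimited with two header lines (it is read with skip_header=2 and the default delimiter), so the same idea can be sketched without the csv module. The function name read_ragged and its parameters are hypothetical, chosen only for illustration; this is a minimal sketch under those assumptions, not a drop-in replacement for genfromtxt.

```python
def read_ragged(path, skip_header=0, delimiter=None):
    """Read a text file whose rows have different lengths.

    Returns a list of lists of floats, one inner list per non-empty line.
    delimiter=None splits on any run of whitespace (the str.split default).
    read_ragged is a hypothetical helper, not part of NumPy.
    """
    rows = []
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number < skip_header:
                continue  # skip header lines, similar to genfromtxt's skip_header
            fields = line.split(delimiter)
            # Drop empty fields (e.g. after a trailing delimiter) and convert the rest
            values = [float(field) for field in fields if field.strip()]
            if values:  # ignore blank lines
                rows.append(values)
    return rows

# Example call, assuming the file from the question exists:
# trajectories = read_ragged('deer_1995.txt', skip_header=2)
```

The result stays a plain list of lists, so no padding is needed. If NumPy operations are wanted per trajectory, each row can be converted on its own with np.asarray(row).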

Tags:

python

Sibbs Gambling
