
Read an ASCII file into a NumPy array

I have an ASCII file that I want to read into a NumPy array. With numpy.genfromtxt, the first number in the file comes back as 'NaN'. I then tried reading the file manually:

lines = file('myfile.asc').readlines()
X     = []
for line in lines:
    s = str.split(line)
    X.append([float(s[i]) for i in range(len(s))])

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
ValueError: could not convert string to float: 15.514

When I print the first line of the file after splitting, it looks like this:

>>> s
['\xef\xbb\xbf15.514', '15.433', '15.224', '14.998', '14.792', '15.564', '15.386', '15.293', '15.305', '15.132', '15.073', '15.005', '14.929', '14.823', '14.766', '14.768', '14.789']

How can I read such a file into a NumPy array without errors and without assuming the number of rows and columns in advance?

Dalek asked Mar 18 '23 11:03


2 Answers

Building on @falsetru's answer, here is a solution that uses NumPy's own file-reading capabilities:

import numpy as np
import codecs

with codecs.open('myfile.asc', encoding='utf-8-sig') as f:
    X = np.loadtxt(f)

This opens the file with the correct encoding and passes the open file object to NumPy. np.loadtxt accepts this kind of handle (just as it accepts handles from open()) and works seamlessly, as in every other case.
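
On Python 3 with NumPy 1.14 or later, np.loadtxt takes an encoding argument itself, so the codecs wrapper may not be needed. The following is a minimal sketch under that assumption. The helper name load_bom_table and the two-row sample file are made up for illustration and are not part of the original question.

```python
import os
import tempfile

import numpy as np


def load_bom_table(path):
    """Load a whitespace-separated numeric file that may start with a UTF-8 BOM."""
    # 'utf-8-sig' drops a leading BOM if present and otherwise acts like 'utf-8'.
    return np.loadtxt(path, encoding='utf-8-sig')


# Hypothetical sample data: two rows of three values, preceded by a BOM.
sample = b'\xef\xbb\xbf15.514 15.433 15.224\n14.998 14.792 15.564\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'myfile.asc')
    with open(path, 'wb') as f:
        f.write(sample)
    X = load_bom_table(path)

print(X.shape)
```

The number of rows and columns is inferred from the file, so nothing about its shape has to be known beforehand. numpy.genfromtxt accepts the same encoding argument.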

sebix answered Apr 01 '23 04:04


The file is encoded as UTF-8 with a BOM (byte order mark). Use codecs.open with the utf-8-sig encoding to handle it correctly; that encoding strips the leading BOM (\xef\xbb\xbf).

import codecs

X = []
with codecs.open('myfile.asc', encoding='utf-8-sig') as f:
    for line in f:
        s = line.split()
        X.append([float(s[i]) for i in range(len(s))])

UPDATE: You don't need to use an index at all:

with codecs.open('myfile.asc', encoding='utf-8-sig') as f:
    X = [[float(x) for x in line.split()] for line in f]

BTW, instead of calling the unbound method str.split(line), use line.split() unless you have a special reason not to.
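
On Python 3, the built-in open() accepts the same utf-8-sig encoding, so codecs is optional there. Here is a rough sketch, assuming Python 3. The helpers has_utf8_bom and read_rows, and the sample bytes, are illustrative names and data, not something from the question.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, 'rb') as f:
        return f.read(3) == codecs.BOM_UTF8


def read_rows(path):
    """Parse whitespace-separated floats, one list per non-empty line."""
    # 'utf-8-sig' removes a leading BOM, so float() never sees it.
    with open(path, encoding='utf-8-sig') as f:
        return [[float(x) for x in line.split()] for line in f if line.strip()]


# Hypothetical sample data: a BOM followed by two rows of two values.
sample = b'\xef\xbb\xbf15.514 15.433\n14.998 14.792\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'myfile.asc')
    with open(path, 'wb') as f:
        f.write(sample)
    bom_found = has_utf8_bom(path)
    rows = read_rows(path)

print(bom_found, rows)
```

Checking for the BOM first is optional, since utf-8-sig also reads files that have none. The check is mainly useful for confirming the diagnosis.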

falsetru answered Apr 01 '23 04:04