
Skip trailing rows containing text when reading a text file with NumPy to generate a numerical array

I'm trying to generate an array by reading a text file from the internet.

My goal is to use Python instead of MATLAB, replacing this MATLAB step:

url=['http://www.cdc.noaa.gov/Correlation/amon.us.long.data'];
urlwrite(url,'file.txt');

I'm using this code:

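import urllib
import numpy as np

# Download the file, then try to parse it while skipping the first line
urllib.urlretrieve('http://www.cdc.noaa.gov/Correlation/amon.us.long.data', '/Users/epy/file2.txt')
a = np.loadtxt('/Users/epy/file2.txt', skiprows=1, dtype=None)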

But it fails because of the text description at the end of the file.

Is there a way to skip the last X lines of the file, or do I have to use some sort of string manipulation (readlines?) instead?

epifanio asked Oct 25 '11 19:10



1 Answer

For more complex text loading, have a look at numpy.genfromtxt.

It's slower than numpy.loadtxt but more flexible.

In your case (I'm avoiding saving a temporary file here...):

import numpy as np
import urllib2  # Python 2 only

url = 'http://www.cdc.noaa.gov/Correlation/amon.us.long.data'
# Read straight from the open URL: skip the first line and the last four lines
data = np.genfromtxt(urllib2.urlopen(url), skip_header=1, skip_footer=4)
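For anyone on Python 3, where urllib2 no longer exists, here is a rough sketch of the same idea. The SAMPLE text is made up to imitate the layout described in the question (one header line, numeric rows, four lines of trailing text), and load_table and load_remote are hypothetical helper names, not part of NumPy. The footer count of 4 is taken from the snippet above; the URL may no longer serve the file, so load_remote is defined but not called here.

```python
import io
import urllib.request

import numpy as np

# Hypothetical sample imitating the layout described in the question:
# one header line, numeric rows, then four lines of trailing text.
SAMPLE = """\
1948 1950
1948 0.1 0.2 0.3
1949 0.4 0.5 0.6
1950 0.7 0.8 0.9
-99.99
AMO unsmoothed
some description text
http://example.invalid/info
"""


def load_table(source, skip_header=1, skip_footer=4):
    """Parse whitespace-separated numbers, dropping header and footer lines."""
    return np.genfromtxt(source, skip_header=skip_header, skip_footer=skip_footer)


def load_remote(url, skip_header=1, skip_footer=4):
    """Fetch the text over HTTP and parse it without a temporary file."""
    with urllib.request.urlopen(url) as response:
        # Assumption: the file is plain ASCII text.
        text = response.read().decode("ascii", errors="replace")
    return load_table(io.StringIO(text), skip_header, skip_footer)


# Parse the in-memory sample; only the three numeric rows should remain.
data = load_table(io.StringIO(SAMPLE))
print(data.shape)
```

Note that skip_footer needs the footer length to be known in advance. If it varies between downloads, filtering the lines yourself before parsing is probably the safer route.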
Joe Kington answered Sep 27 '22 17:09
