I have xyz text files that need to be gridded. For each xyz file I have information about the origin coordinates, the cell size and the number of rows/columns. However, records where there's no z value are missing from the xyz file, so simply creating a grid from the records that are present fails because of the missing values. So I tried this:
import numpy as np

nxyz = np.loadtxt(infile, delimiter=",", skiprows=1)

ncols = 4781
nrows = 4405
xllcorner = 682373.533843
yllcorner = 205266.898604
cellsize = 1.25

grid = np.zeros((nrows, ncols))
for item in nxyz:
    # convert real-world coordinates to integer row/column indices
    idx = int((item[0] - xllcorner) / cellsize)
    idy = int((item[1] - yllcorner) / cellsize)
    grid[idy, idx] = item[2]

# flip the rows so the northernmost row is written first
with open(r"e:\test\myrasout.txt", "w") as outfile:
    np.savetxt(outfile, grid[::-1], fmt="%.2f", delimiter=" ")
This gets me the grid, with zeroes where no records are present in the xyz file. It works for smaller files, but I got an out-of-memory error for a 290 MB file (~8,900,000 records). And this is not the largest file I have to process.
So I tried another (iterative) approach by Joe Kington that I found here for loading the xyz file. This worked for the 290 MB file, but failed with an out-of-memory error on the next bigger one (533 MB, ~15,600,000 records).
How can I grid these larger files correctly (accounting for the missing records) without running out of memory?
Based on the comments, I'd change the code to:

ncols = 4781
nrows = 4405
xllcorner = 682373.533843
yllcorner = 205266.898604
cellsize = 1.25

grid = np.zeros((nrows, ncols))
with open(infile) as f:
    next(f)  # skip the header row
    for line in f:
        item = line.split(",")  # fill in whatever is separating the values
        # split() returns strings, so convert before doing arithmetic
        idx = int((float(item[0]) - xllcorner) / cellsize)
        idy = int((float(item[1]) - yllcorner) / cellsize)
        grid[idy, idx] = float(item[2])
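This reads one line at a time, so only the output grid has to fit in memory; the full point array is never built (for the 533 MB file, 15.6 million records times three float64 values is already ~375 MB before np.loadtxt's own intermediate storage). The grid can then be written out exactly as in the question:

np.savetxt(r"e:\test\myrasout.txt", grid[::-1], fmt="%.2f", delimiter=" ")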
You can do fancy indexing with NumPy. Try something like this instead of the loop, which is probably the root of your problem:
grid = np.zeros((nrows, ncols))
# row index first (from y), then column index (from x); indices must be integers
grid[nxyz[:, 1].astype(int), nxyz[:, 0].astype(int)] = nxyz[:, 2]
With the origin and cell size conversion, it is a bit more involved:
grid = np.zeros((nrows, ncols))
iy = ((nxyz[:, 1] - yllcorner) / cellsize).astype(int)
ix = ((nxyz[:, 0] - xllcorner) / cellsize).astype(int)
grid[iy, ix] = nxyz[:, 2]
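As a sanity check of the indexing, here is a toy example on a made-up 4 x 4 grid with origin (0, 0) and cell size 1 (none of these numbers come from the question):

import numpy as np

# three records: x, y, z
nxyz = np.array([[0.0, 0.0, 1.5],
                 [2.0, 1.0, 2.5],
                 [3.0, 3.0, 3.5]])
grid = np.zeros((4, 4))
# one vectorized assignment replaces the whole Python loop
grid[nxyz[:, 1].astype(int), nxyz[:, 0].astype(int)] = nxyz[:, 2]
print(grid[0, 0], grid[1, 2], grid[3, 3])  # 1.5 2.5 3.5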
If this doesn't help, the nxyz array itself is too big, but I doubt that. If it is, you could load the text file in several parts and do the above for each part sequentially, as in the sketch below.
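A minimal sketch of that chunked approach, assuming the comma-separated layout and one-line header from the question (the input path and the chunk size are placeholders):

import itertools
import numpy as np

ncols, nrows = 4781, 4405
xllcorner, yllcorner, cellsize = 682373.533843, 205266.898604, 1.25
chunklines = 1000000  # records per chunk; tune to the available memory

grid = np.zeros((nrows, ncols))
with open(r"e:\test\myxyz.txt") as f:  # placeholder input path
    next(f)  # skip the header row
    while True:
        lines = list(itertools.islice(f, chunklines))
        if not lines:
            break
        # loadtxt accepts a list of strings; ndmin=2 keeps single-line chunks 2D
        nxyz = np.loadtxt(lines, delimiter=",", ndmin=2)
        iy = ((nxyz[:, 1] - yllcorner) / cellsize).astype(int)
        ix = ((nxyz[:, 0] - xllcorner) / cellsize).astype(int)
        grid[iy, ix] = nxyz[:, 2]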
P.S. You probably know the range of the data contained in your text files, and you can limit memory usage by explicitly stating it while loading the file, e.g. np.loadtxt("myfile.txt", dtype=np.int16) if you are dealing with at most 16-bit integers.
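In this question the x, y and z values are floats, so an integer dtype doesn't apply, but the same idea works for the output grid; a quick footprint check for the 4405 x 4781 grid (plain arithmetic from the question's dimensions):

import numpy as np

nrows, ncols = 4405, 4781
for dt in (np.float64, np.float32):
    mb = nrows * ncols * np.dtype(dt).itemsize / 1e6
    print(dt.__name__, round(mb), "MB")  # float64: ~168 MB, float32: ~84 MB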