I am trying to calculate the origin and offset of variable-size arrays and store them in a dictionary. Below is the (likely non-Pythonic) way I am doing this now. I am not sure whether I should be using map, a lambda function, or list comprehensions to make the code more Pythonic.
Essentially, I need to cut an array into chunks based on its total size and store xstart, ystart, x_number_of_rows_to_read, and y_number_of_columns_to_read for each chunk in a dictionary. The total size is variable. I cannot load the entire array into memory and use NumPy indexing, or I definitely would. The origin and offset are used to read each chunk into NumPy.
intervalx = xsize / xsegment #Get the size of the chunks
intervaly = ysize / ysegment #Get the size of the chunks
#Setup to segment the image storing the start values and key into a dictionary.
xstart = 0
ystart = 0
key = 0
d = defaultdict(list)
for y in xrange(0, ysize, intervaly):
    if y + (intervaly * 2) < ysize:
        numberofrows = intervaly
    else:
        numberofrows = ysize - y
    for x in xrange(0, xsize, intervalx):
        if x + (intervalx * 2) < xsize:
            numberofcolumns = intervalx
        else:
            numberofcolumns = xsize - x
        l = [x, y, numberofcolumns, numberofrows]
        d[key].append(l)
        key += 1
return d
I realize that xrange is not ideal for a port to Python 3.
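On the Python 3 point, a minimal sketch of what the loop setup might look like after a port. The sizes below are hypothetical and chosen only for illustration; the two changes shown are `range` in place of `xrange` and floor division (`//`) in place of `/`.

```python
xsize, xsegment = 10, 3  # hypothetical sizes, for illustration only

# In Python 3, `/` returns a float, and range() rejects float arguments,
# so floor division is needed to get an integer step.
intervalx = xsize // xsegment

# range() in Python 3 is lazy, much like xrange() in Python 2.
starts = list(range(0, xsize, intervalx))
print(starts)  # [0, 3, 6, 9]
```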
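This code looks fine except for your use of defaultdict. A list seems like a much better data structure here, because your keys are just consecutive integers starting at 0 and each key only ever holds a single entry.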
One thing you could do: replace the four-line if/else blocks with conditional expressions.
Here's a modified version of your code with my few suggestions.
intervalx = xsize / xsegment #Get the size of the chunks
intervaly = ysize / ysegment #Get the size of the chunks
#Setup to segment the image, storing the start values in a list.
xstart = 0
ystart = 0
output = []
for y in xrange(0, ysize, intervaly):
    numberofrows = intervaly if y + (intervaly * 2) < ysize else ysize - y
    for x in xrange(0, xsize, intervalx):
        numberofcolumns = intervalx if x + (intervalx * 2) < xsize else xsize - x
        lst = [x, y, numberofcolumns, numberofrows]
        output.append(lst)
        #If it doesn't make any difference to your program, the above 2 lines could read:
        #tple = (x, y, numberofcolumns, numberofrows)
        #output.append(tple)
        #This will be slightly more efficient
        #(tuple creation is faster than list creation)
        #and less memory hungry. In other words, if it doesn't need to be a list due
        #to other constraints (e.g. you append to it later), you should make it a tuple.
Now to get your data, you can do offset_list=output[5] instead of offset_list=d[5][0].
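Since the question asks about list comprehensions, here is a rough sketch of how the same idea might be written as one, in Python 3. The function name `chunk_windows` is made up for this example. It is also an assumption that leftover rows or columns should become their own smaller chunk (via `min`), which is not the same as the `interval * 2` test used in the code above, so check that it matches what you need before swapping it in.

```python
def chunk_windows(xsize, ysize, xsegment, ysegment):
    """Return a list of (x, y, ncols, nrows) tuples covering the array.

    Hypothetical helper: leftover rows/columns form their own smaller chunk.
    """
    # Floor division keeps the step an integer; max() guards against a zero step.
    intervalx = max(1, xsize // xsegment)
    intervaly = max(1, ysize // ysegment)
    return [
        (x, y, min(intervalx, xsize - x), min(intervaly, ysize - y))
        for y in range(0, ysize, intervaly)
        for x in range(0, xsize, intervalx)
    ]


windows = chunk_windows(xsize=10, ysize=6, xsegment=3, ysegment=2)
print(windows[3])  # (9, 0, 1, 3)
```

Each tuple can then be unpacked directly, e.g. `x, y, ncols, nrows = windows[3]`, and the windows never overlap, so every cell of the array is read exactly once.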