I'm trying to preprocess a dataset for a neural network. To do that, I need to reshape an array with the shape (2040906, 1) into an array of batches.
I need a batch size of around 1440 rows, but 2040906 is obviously not evenly divisible by that number.
I tried calculating the remainder of the division and dropping that many rows so that the division comes out even. But dropping rows from my dataset is not what I want to do.
Here is an example snippet that reproduces the problem:
import numpy as np
x = np.ones((2040906, 1))
np.split(x, 1440)
The ideal solution for me would be a function that returns the divisor of the array length nearest to a given value, i.e. one that leaves a remainder of 0.
Not sure this is the most elegant solution, but you can do the following:
def getDivisors(n, res=None):
    res = res or []
    i = 1
    while i <= n:
        if n % i == 0:
            res.append(i)
        i = i + 1
    return res
getDivisors(2040906)
Out[4]:
[1,
2,
3,
6,
7,
14,
21,
42,
48593,
97186,
145779,
291558,
340151,
680302,
1020453,
2040906]
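The loop above tests every integer up to n, which is fine here but gets slow for larger arrays. As a possible alternative, here is a minimal sketch that only tests candidates up to the square root of n and collects divisors in pairs. The name divisors_fast is hypothetical and not part of the original answer; math.isqrt assumes Python 3.8 or newer.

```python
import math

def divisors_fast(n):
    """Return all positive divisors of n in ascending order."""
    small, large = [], []
    # Only test candidates up to sqrt(n); each hit i yields the pair (i, n // i).
    for i in range(1, math.isqrt(n) + 1):
        if n % i == 0:
            small.append(i)
            if i != n // i:  # do not list the root of a perfect square twice
                large.append(n // i)
    # large was filled in descending order, so reverse it before joining.
    return small + large[::-1]

print(divisors_fast(2040906))
```

It should produce the same list as getDivisors, so it can be swapped in without changing the rest of the code.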
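def get_closest_split(n, close_to=1440):
    all_divisors = getDivisors(n)
    for ix, val in enumerate(all_divisors):
        # val is the first divisor larger than close_to
        if close_to < val:
            if ix == 0: return val
            # pick whichever neighbour is nearer: the divisor below or val itself
            if (val-close_to)>(close_to - all_divisors[ix-1]):
                return all_divisors[ix-1]
            return val
    # close_to is at least n, so the largest divisor (n itself) is the closest
    return all_divisors[-1]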
get_closest_split(2040906, close_to=1440)
Out[6]: 42
In your case this returns 42, the divisor of 2040906 closest to 1440 (the next one up is 48593). Thus, np.split(x, 42) runs without error and gives 42 equal sections of 48593 rows each.
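To make the result concrete, here is a short sketch of two ways to use the divisor 42 on the array from the question. The reshape variant is an assumption about what "batches" means for you (batches of 42 rows each); it is not part of the original answer.

```python
import numpy as np

x = np.ones((2040906, 1))

# np.split with an integer argument splits into that many equal sections.
sections = np.split(x, 42)
print(len(sections), sections[0].shape)  # 42 (48593, 1)

# For batches of 42 rows each, reshape instead; -1 lets NumPy infer the batch count.
batches = x.reshape(-1, 42, 1)
print(batches.shape)  # (48593, 42, 1)
```

Either way no rows are dropped, because 42 * 48593 = 2040906.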