Creating numpy array from list gives wrong shape

I'm creating several numpy arrays from a list of numpy arrays, like so:

seq_length = 1500
seq_diff = 200  # difference between start of two sequences
# x and y are 2D numpy arrays
x_seqs = [x[i:i+seq_length,:] for i in range(0, seq_diff*(len(x) // seq_diff), seq_diff)]
y_seqs = [y[i:i+seq_length,:] for i in range(0, seq_diff*(len(y) // seq_diff), seq_diff)]
boundary1 = int(0.7 * len(x_seqs))   # 70% is training set
boundary2 = int(0.85 * len(x_seqs))  # 15% validation, 15% test
x_train = np.array(x_seqs[:boundary1])
y_train = np.array(y_seqs[:boundary1])
x_valid = np.array(x_seqs[boundary1:boundary2])
y_valid = np.array(y_seqs[boundary1:boundary2])
x_test = np.array(x_seqs[boundary2:])
y_test = np.array(y_seqs[boundary2:])

I'd like to end up with 6 arrays of shape (n, 1500, 300) where n is either 70%, 15% or 15% of my data for the training, validation and test arrays, respectively.

This is where it goes wrong: the _train and _valid arrays turn out fine, but the _test arrays are one-dimensional arrays of arrays. That is:

  • x_train.shape is (459, 1500, 300)
  • x_valid.shape is (99, 1500, 300)
  • x_test.shape is (99,)

But printing x_test verifies that it contains the correct elements - i.e. it's a 99-element long array of (1500, 300) arrays.

Why do the _test matrices get the wrong shape, while the _train and _valid matrices don't?

tao_oat asked Oct 17 '22 22:10


1 Answer

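The items in x_seqs vary in length. When they all have the same shape, np.array can build a 3D array from them; when they differ, it falls back to a 1D object array whose elements are the individual 2D arrays (on NumPy 1.24 and later it raises a ValueError instead, unless you pass dtype=object). The slices that start near the end of x run past its last row and get truncated, and those all land in the test split, which is why only the _test arrays are affected. Check x_test.dtype and [len(i) for i in x_test].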
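
One way to catch this early is to compare the item shapes directly and to build the array with np.stack, which raises a ValueError on mismatched shapes rather than quietly returning an object array. A minimal sketch with small made-up arrays (seqs, shapes and full are illustrative names, not from your code):

```python
import numpy as np

# Hypothetical stand-ins for the real windows:
# two full-length ones and one truncated one.
seqs = [np.zeros((4, 3)), np.zeros((4, 3)), np.zeros((2, 3))]

# More than one distinct shape means the list is ragged.
shapes = {s.shape for s in seqs}
print(shapes)

# Items of equal shape stack into a regular 3D array.
full = np.stack(seqs[:2])
print(full.shape, full.dtype)  # (2, 4, 3) float64

# Items of unequal shape make np.stack fail loudly.
try:
    np.stack(seqs)
    stacked_ok = True
except ValueError as err:
    stacked_ok = False
    print("cannot stack:", err)
```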

I took your code and added:

x=np.zeros((2000,10))
y=x.copy()
...
print([len(i) for i in x_seqs])
print(x_train.shape)
print(x_valid.shape)
print(x_test.shape)

and got:

1520:~/mypy$ python3 stack40643639.py 
[1500, 1500, 1500, 1400, 1200, 1000, 800, 600, 400, 200]
(7,)
(1, 600, 10)
(2,)
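
A possible fix, sketched under the assumption that you only want full-length windows: stop the range at the last start index that still leaves room for seq_length rows. The starts and x_all names are introduced here for illustration, and x is the same zero-filled stand-in as above.

```python
import numpy as np

seq_length = 1500
seq_diff = 200  # difference between start of two sequences

x = np.zeros((2000, 10))  # stand-in data, not the real input

# The last valid start is len(x) - seq_length, so the range stops just after it.
starts = range(0, len(x) - seq_length + 1, seq_diff)
x_seqs = [x[i:i + seq_length, :] for i in starts]

print([len(s) for s in x_seqs])  # [1500, 1500, 1500]

# Every window now has the same shape, so stacking gives a 3D array.
x_all = np.stack(x_seqs)
print(x_all.shape)  # (3, 1500, 10)
```

With every window the same length, each of the train, validation and test slices of x_seqs converts to a proper (n, seq_length, features) array. If x has fewer than seq_length rows, starts is empty and np.stack raises a ValueError, so that case would need its own check.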

hpaulj answered Oct 21 '22 00:10