I'm new to ML and I was following this tutorial, which teaches how to do cryptocurrency predictions based on some features.
My code to do the prediction:
model = load_model("Path//myModel.model")
ready_x = preprocess_df(main_df)  # the function returns an array of price sequences and targets (0 = sell, 1 = buy): return np.array(X), y
predictions = []
for x in ready_x:
    l_p = model.predict_classes(x)  # error occurs on this line
    predictions.append(l_p[0])
plot_prediction(main_df, predictions)
But I got the error below:
ValueError: Error when checking input: expected lstm_input to have 3 dimensions, but got array with shape (69188, 1)
I don't really get the idea of this error. It's literally my second ML project after the famous cats-and-dogs classification, so I don't have much debugging experience. I did learn the theory first, about neurons and the relationships between them, but it's still really difficult to apply that knowledge to a real project. The idea of this project is to predict the future price, 3 minutes into the future, based on the last 60 minutes of prices (which is what the model was trained on).
The model looks like this:
model = Sequential()
model.add(LSTM(128, input_shape=(train_x.shape[1:]),return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(LSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(32, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(2, activation="softmax"))
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
main_df is a data frame and consists of:
My question is: how should I feed the model with the correct input data to do this prediction?
EDIT: the preprocess_df function:
from collections import deque
import random
import numpy as np
from sklearn import preprocessing

SEQ_LEN = 60  # length of each price sequence (last 60 minutes)

def preprocess_df(df):
    # scaling
    df = df.drop(columns=['future'])
    for col in df.columns:
        if col != "target":
            df[col] = df[col].pct_change()  # normalizes the data to percent change
            df.dropna(inplace=True)
            df[col] = preprocessing.scale(df[col].values)  # standardize the column (zero mean, unit variance)
    df.dropna(inplace=True)

    sequential_data = []
    prev_days = deque(maxlen=SEQ_LEN)
    for i in df.values:
        prev_days.append([n for n in i[:-1]])  # append each column, not taking the target
        if len(prev_days) == SEQ_LEN:
            sequential_data.append([np.array(prev_days), i[-1]])
    random.shuffle(sequential_data)

    # BALANCING THE DATA
    buys = []
    sells = []
    for seq, target in sequential_data:
        if target == 0:
            sells.append([seq, target])
        elif target == 1:
            buys.append([seq, target])

    random.shuffle(buys)
    random.shuffle(sells)

    lower = min(len(buys), len(sells))
    buys = buys[:lower]
    sells = sells[:lower]

    sequential_data = buys + sells
    random.shuffle(sequential_data)

    X = []
    y = []
    for seq, target in sequential_data:
        X.append(seq)
        y.append(target)

    return np.array(X), y
LSTM expects inputs shaped (batch_size, timesteps, channels); in your case, timesteps=60, and channels is the number of features per timestep (the columns you feed in, not the 128 LSTM units). batch_size is how many samples you're feeding at once, per fit / prediction.
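For illustration, a minimal sketch of that shape contract; the 8-feature channel count, the toy model, and the random data are placeholders, not values from the question:

import numpy as np
import tensorflow as tf

timesteps, channels = 60, 8   # 60 minutes of history, 8 features per minute (example values)

toy_model = tf.keras.Sequential([
    tf.keras.layers.LSTM(128, input_shape=(timesteps, channels)),
    tf.keras.layers.Dense(2, activation="softmax"),
])

batch = np.random.rand(4, timesteps, channels)          # 3-D: (batch_size, timesteps, channels)
print(toy_model.predict(batch).shape)                   # -> (4, 2)

one_sample = np.random.rand(timesteps, channels)        # 2-D: missing the batch dimension
print(toy_model.predict(one_sample[np.newaxis]).shape)  # add a batch dim -> (1, 2)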
Your error indicates preprocessing flaws: time should fill dim 1 of x (-> timesteps), and the per-timestep features should fill dim 2 of x (-> channels).
Once the above is accounted for, print(x.shape) should read (N, 60, channels), where N is the number of samples (>= 1). Since you iterate for x in ready_x, x will slice ready_x along its dim 0 - so print(ready_x.shape) should read (M, N, 60, channels), where M >= 1; it's the "batches" dimension, each slice being one batch.
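To make that layout concrete, a small sketch of iterating over such a 4-D ready_x; every size below is a placeholder, not the question's real data:

import numpy as np

M, N, timesteps, channels = 3, 32, 60, 8             # placeholder sizes
ready_x = np.random.rand(M, N, timesteps, channels)  # M batches of N samples each

for x in ready_x:      # slicing along dim 0, one batch per iteration
    print(x.shape)     # (32, 60, 8) - a valid 3-D LSTM input
    # preds = model.predict(x)   # would return one (32, 2) array per batch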
As basic debugging, insert print(item.shape) throughout your preprocessing code, where item is an array, DataFrame, etc., to see how the shapes change at each step. Ensure that there is a step which gives the number of features on the last dimension, and 60 on the second-to-last.
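As an alternative to the 4-D batched layout, if preprocess_df already returns a single 3-D array of shape (N, 60, channels), the whole array can be passed to predict in one call. The sketch below assumes exactly that, reuses the question's names (model, main_df, plot_prediction), and uses np.argmax over the softmax output as a stand-in for predict_classes:

import numpy as np

ready_x, ready_y = preprocess_df(main_df)  # unpack both return values; ready_x assumed (N, 60, channels)
print(ready_x.shape)                       # sanity check before predicting

probs = model.predict(ready_x)             # one batched call -> shape (N, 2)
predictions = np.argmax(probs, axis=1)     # 0 = sell, 1 = buy, matching preprocess_df
plot_prediction(main_df, predictions)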