 

How to apply SMOTE technique (oversampling) before word embedding layer

How do I apply the SMOTE algorithm before the word embedding layer in an LSTM?

I have a text binary classification problem (Good (9500) vs. Bad (500) reviews, 10000 training samples in total, so the training set is imbalanced). I am using an LSTM with pre-trained word embeddings (100-dimensional vector for each word), so each training input is a sequence of 50 word-dictionary ids (zero-padded when the description has fewer than 50 words, truncated when it exceeds 50 words).

Below is my general flow (a minimal Keras sketch follows the list):

  • Input - 1000 (batch) X 50 (sequence length)
  • Word embedding - 200 (unique vocabulary words) X 100 (word representation)
  • After the word embedding layer (new input for the LSTM) - 1000 (batch) X 50 (sequence) X 100 (features)
  • Final state from the LSTM - 1000 (batch) X 100 (units)
  • Final layer - [1000 (batch) X 100 (units)] X [100 (units) X 2 (output classes)] -> 1000 (batch) X 2
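Putting that flow into code, here is a minimal Keras sketch under my assumptions; the pre-trained embedding matrix is stood in for by a random placeholder, and names like `embedding_matrix` are illustrative, not part of the original question:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Shapes follow the flow above; embedding_matrix (200 x 100) stands in
# for the pre-trained word vectors and is a random placeholder here.
vocab_size, embed_dim, seq_len = 200, 100, 50
embedding_matrix = np.random.rand(vocab_size, embed_dim)  # placeholder

model = Sequential([
    # 1000 x 50 ids -> 1000 x 50 x 100 embedded sequences
    Embedding(vocab_size, embed_dim, weights=[embedding_matrix],
              input_length=seq_len, trainable=False),
    # final state: 1000 x 100
    LSTM(100),
    # final layer: 100 units -> 2 output classes
    Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Freezing the embedding layer (`trainable=False`) matches the use of fixed pre-trained vectors; any oversampling would have to happen on the id sequences before this model ever sees them, which is exactly the difficulty the question raises.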

All I want is to generate more data for the Bad reviews with the help of SMOTE.

asked Nov 19 '18 by user1531248

People also ask

Can SMOTE be applied to text data?

You need to balance the distribution for your classifier, not for a reader of the text data. So apply SMOTE in the traditional way (though I usually use solution 2 below, so I do not guarantee the result!) combined with some dimensionality reduction step.
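As a sketch of that suggestion (TF-IDF features, dimensionality reduction, then SMOTE) with scikit-learn and imbalanced-learn; the toy reviews and the choice of 20 SVD components are illustrative assumptions:

```python
from imblearn.over_sampling import SMOTE
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-ins for the reviews; real data would replace these lists.
good = ["fast delivery great product", "love the quality works well",
        "excellent value highly recommend", "very happy with this purchase"]
bad = ["broke after one day", "terrible quality waste of money"]
texts = good * 25 + bad * 10          # 100 Good vs 20 Bad: imbalanced
labels = [0] * 100 + [1] * 20

# Vectorize, reduce dimensionality, then let SMOTE interpolate new
# minority samples in the reduced feature space.
X = TfidfVectorizer().fit_transform(texts)
X_reduced = TruncatedSVD(n_components=20).fit_transform(X)
X_res, y_res = SMOTE(random_state=42).fit_resample(X_reduced, labels)
```

Note that the resampled `X_res` lives in the reduced TF-IDF space, not in the id-sequence space an embedding layer expects, which is why this route does not plug directly into the LSTM pipeline above.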

How does SMOTE deal with imbalanced data?

For an imbalanced dataset, SMOTE is first applied to create new synthetic minority samples and obtain a balanced distribution. Tomek links are then used to remove samples close to the boundary between the two classes, increasing the separation between them.
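imbalanced-learn packages this combination as `SMOTETomek`; a minimal sketch on a synthetic stand-in for the 9500/500 split:

```python
from collections import Counter
from imblearn.combine import SMOTETomek
from sklearn.datasets import make_classification

# Synthetic imbalanced dataset standing in for the 9500/500 review split.
X, y = make_classification(n_samples=10000, weights=[0.95, 0.05],
                           random_state=42)

# SMOTE oversamples the minority class, then samples forming Tomek links
# near the class boundary are removed to sharpen the separation.
X_res, y_res = SMOTETomek(random_state=42).fit_resample(X, y)
print(Counter(y), "->", Counter(y_res))
```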

How do you choose a sampling strategy in SMOTE?

The SMOTE algorithm works as follows: you draw a random sample from the minority class, identify the k nearest neighbors of the observations in this sample, then take one of those neighbors and compute the vector between the current data point and the selected neighbor.
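In imbalanced-learn these choices surface as the `sampling_strategy` and `k_neighbors` parameters; a hedged sketch, again on synthetic stand-in data:

```python
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=10000, weights=[0.95, 0.05],
                           random_state=0)

# sampling_strategy=0.5: oversample until the minority class reaches
# half the majority count; k_neighbors=5: interpolate against the 5
# nearest minority neighbours.
smote = SMOTE(sampling_strategy=0.5, k_neighbors=5, random_state=0)
X_res, y_res = smote.fit_resample(X, y)
print(Counter(y), "->", Counter(y_res))
```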

How does SMOTE oversampling work?

SMOTE creates synthetic data using a k-nearest-neighbours algorithm. It first picks a random sample from the minority class, then finds that sample's k nearest neighbours. A synthetic data point is then generated between the chosen sample and one randomly selected neighbour.
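That interpolation step can be written out directly; a minimal NumPy sketch (the helper `smote_sample` is illustrative, not a library function):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_sample(X_minority, k=5, rng=np.random.default_rng(0)):
    """Generate one synthetic minority sample (illustrative sketch)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_minority)
    # Pick a random minority point and find its k nearest neighbours
    # (the first neighbour returned is the point itself, so skip it).
    i = rng.integers(len(X_minority))
    _, idx = nn.kneighbors(X_minority[i:i + 1])
    neighbor = X_minority[rng.choice(idx[0][1:])]
    # Interpolate: the new point lies on the segment between the two.
    return X_minority[i] + rng.random() * (neighbor - X_minority[i])

X_min = np.random.rand(20, 100)   # e.g. 20 minority points, 100 features
synthetic = smote_sample(X_min)
```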


1 Answer

I faced the same issue. I found this post on Stack Exchange, which proposes adjusting the class weights in the loss function instead of oversampling. Apparently that is the standard way to deal with class imbalance in LSTMs / RNNs.

https://stats.stackexchange.com/questions/342170/how-to-train-an-lstm-when-the-sequence-has-imbalanced-classes
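A minimal sketch of that class-weight approach in Keras, with toy stand-ins for the question's data; the inverse-frequency weights below assume the 9500/500 Good/Bad split:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Toy stand-ins for the question's data: 50-id sequences, 0/1 labels.
X_train = np.random.randint(0, 200, size=(1000, 50))
y_train = (np.random.rand(1000) < 0.05).astype(int)   # ~5% "Bad"

model = Sequential([
    Embedding(200, 100, input_length=50),
    LSTM(100),
    Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Inverse-frequency weights for a 9500/500 Good/Bad split: each Bad
# example contributes ~19x more to the loss than a Good example.
class_weight = {0: 10000 / (2 * 9500), 1: 10000 / (2 * 500)}
model.fit(X_train, y_train, epochs=2, class_weight=class_weight)
```

This keeps the model and pipeline unchanged: no synthetic id sequences are needed, because the imbalance is handled in the loss rather than in the data.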

answered Nov 10 '22 by clagger