I am using BERT Word Embeddings for sentence classification task with 3 labels. I am using Google Colab for coding. My problem is, since I will have to execute the embedding part every time I restart the kernel, is there any way to save these word embeddings once it is generated? Because, it takes a lot of time to generate those embeddings.
The code I am using to generate BERT Word Embeddings is -
[get_features(text_list[i]) for text_list[i] in text_list]
Here, gen_features is a function which returns word embedding for each i in my list text_list.
I read that converting embeddings into bumpy tensors and then using np.save can do it. But I actually don't know how to code it.
You can save your embeddings data to a numpy file by following these steps:
all_embeddings = here_is_your_function_return_all_data()
all_embeddings = np.array(all_embeddings)
np.save('embeddings.npy', all_embeddings)
If you're saving into google colab, then you can download it to your local computer. Whenever you need it, just upload it and load it.
all_embeddings = np.load('embeddings.npy')
That's it.
Btw, You can also directly save your file to google drive.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With