How can I work with sequences longer than 512 tokens? I don't want to rely on truncation=True, because I want to actually process the longer sequences rather than cut them off.
You can use the stride parameter together with max_length and return_overflowing_tokens to handle larger documents.
encoding = processor(
    images,
    words,
    boxes=boxes,
    word_labels=word_labels,
    truncation=True,
    padding="max_length",
    max_length=512,
    stride=128,
    return_overflowing_tokens=True,
    return_offsets_mapping=True,
)
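Note that truncation=True is still set here, but because return_overflowing_tokens=True the tokens beyond max_length are not thrown away. They come back as additional windows of up to 512 tokens, each overlapping the previous one by 128 tokens (the stride), so the whole document is covered.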
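To make the windowing idea concrete, here is a minimal pure-Python sketch of overlapping windows. chunk_with_stride is a hypothetical helper written for illustration; it is not part of transformers, and it ignores the special tokens the real processor adds, so the exact window boundaries you get from the processor may differ slightly.

```python
def chunk_with_stride(token_ids, max_length=512, stride=128):
    """Split token_ids into windows of at most max_length tokens.

    Consecutive windows overlap by `stride` tokens. Illustrative only:
    special tokens and padding are not modelled here.
    """
    if not 0 <= stride < max_length:
        raise ValueError("stride must satisfy 0 <= stride < max_length")
    step = max_length - stride  # how far each new window advances
    chunks = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + max_length])
        if start + max_length >= len(token_ids):
            break  # this window already reaches the end of the input
        start += step
    return chunks


# A made-up "document" of 1000 token ids.
tokens = list(range(1000))
chunks = chunk_with_stride(tokens, max_length=512, stride=128)
print([len(c) for c in chunks])  # [512, 512, 232]
```

Each window would then be run through the model separately. For tokens that fall in an overlap you get two predictions, and you have to decide how to merge them (for example, keep the one from the window where the token has more context).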
Let me know if this is useful.