I want to run an OCR benchmark on scanned text (any typical scan, e.g. an A4 page). I was able to find some NEOCR datasets here, but NEOCR is not really what I want.
I would appreciate links to free datasets that provide suitable images together with the ground-truth text contained in each image.
I hope this thread will also be useful for other people searching for OCR datasets, since I couldn't find a good reference list of such sources.
Thanks!
COCO-Text dataset: https://vision.cornell.edu/se3/coco-text-2/
Chars74K dataset: http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
COCO is a widely used benchmark dataset for images, and many high-profile competitions are built on it. It can be used for object detection, image captioning, and OCR.
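Once you have images paired with their ground-truth text, a common way to score an OCR engine is the character error rate (CER): the edit distance between the recognized text and the reference, divided by the reference length. Below is a minimal sketch in plain Python. The function names `levenshtein` and `character_error_rate` are my own and not taken from any of the datasets above, and this is only one of several possible metrics.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute all cost 1)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

You would call `character_error_rate(ground_truth, ocr_output)` for each page and average the results over the dataset. Note that CER can exceed 1.0 when the OCR output is much longer than the reference.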