
Resources containing OCR benchmark test-sets for free [closed]

I want to run an OCR benchmark on scanned text (any typical scan, e.g. an A4 page). I was able to find some NEOCR datasets here, but NEOCR is not really what I want.

I would appreciate links to free datasets that provide suitable images together with the actual text contained in each image as ground truth.

I hope this thread will also be useful for other people working on OCR who are searching for datasets, since I couldn't find any good reference to such sources.

Thanks!

SuTron asked Dec 16 '16 10:12


1 Answer

COCO-Text dataset: https://vision.cornell.edu/se3/coco-text-2/

Chars74K dataset: http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/

COCO is a widely used benchmark dataset for images, and several well-known computer vision competitions are built on it. It can be used for object detection, image captioning, and OCR.
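Whichever dataset you pick, you will also need a way to score the OCR output against the ground-truth text. This thread doesn't name a metric, so the following is only a minimal sketch, assuming character error rate (edit distance divided by the length of the reference text). The function names are hypothetical and do not come from any library.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions to turn ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ground_truth: str, ocr_output: str) -> float:
    """Edit distance normalised by the ground-truth length (assumed metric)."""
    if not ground_truth:
        # Convention chosen for this sketch: an empty reference scores 0.0
        # only if the OCR output is empty as well.
        return 0.0 if not ocr_output else 1.0
    return edit_distance(ground_truth, ocr_output) / len(ground_truth)
```

For example, `character_error_rate("OCR benchmark", "0CR benchmark")` counts one substituted character against a 13-character reference. Averaging this value over all image/text pairs in a dataset gives a single benchmark score; lower is better.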

Manas Bhardwaj answered Sep 20 '22 19:09