
HOCR to HTML for visualizing

Tags:

html

ocr

hocr

How to convert hOCR to HTML for visualization?

If you open the raw hOCR file in a browser, it is rendered only as plain text (the elements are not positioned).

clarkk asked Jul 13 '16 20:07




2 Answers

To add the interface to a plain hOCR file, add this line just before the closing </body> tag:

<script src="https://unpkg.com/hocrjs"></script>

Then open the hOCR (HTML) file in your browser.
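
If you have many files, the same edit can be scripted. Here is a minimal sketch in JavaScript, assuming the hOCR file contains a closing </body> tag. The names `injectHocrjs` and `HOCRJS_TAG` are made up for this example and are not part of hocrjs; appending the tag when no </body> is found is also just an assumption about what you would want.

```javascript
// The script tag from the answer above.
const HOCRJS_TAG = '<script src="https://unpkg.com/hocrjs"></script>';

// Hypothetical helper: returns the hOCR markup with the hocrjs script tag
// inserted just before the last closing </body> tag.
function injectHocrjs(hocr) {
  // Do nothing if the viewer script is already referenced.
  if (hocr.includes('unpkg.com/hocrjs')) return hocr;

  // Matches the last </body> in the document, case-insensitively.
  const lastBodyClose = /<\/body\s*>(?![\s\S]*<\/body\s*>)/i;

  // Assumption: if there is no </body>, append the tag at the end instead.
  if (!lastBodyClose.test(hocr)) return hocr + '\n' + HOCRJS_TAG + '\n';

  // A function replacement avoids any special handling of "$" sequences.
  return hocr.replace(lastBodyClose, (match) => HOCRJS_TAG + '\n' + match);
}
```

In Node.js you could read a file with `fs.readFileSync(path, 'utf8')`, pass the string through `injectHocrjs`, and write the result back with `fs.writeFileSync`.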

Source

Philip Bergström answered Oct 06 '22 03:10


There are several solutions for this task; I know of these three:

  • https://github.com/kba/hocrjs (overlays the hOCR data on the image, with several options for how to display it)

  • https://github.com/not-implemented/hocr-proofreader (shows the image on the left and the hOCR data on the right; can be used for entering corrections)

  • https://github.com/ultrasaurus/hocr-javascript

All of these repos seem to consist mainly of JavaScript and CSS files. The first two each link to a demo page, from which I took the screenshots.

The first one provides a Greasemonkey/Tampermonkey userscript that can inject this overlay into any suitable hOCR page, online or local (this may take some configuration). I don't know how difficult it is to use the other solutions with your own hOCR files, but it should be doable.
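
To give an idea of what such scripts have to do: hOCR stores each element's position in its title attribute as "bbox x0 y0 x1 y1" (pixel coordinates on the page image), and a viewer turns that into CSS positioning. The following is only a rough sketch of that idea, not code taken from any of the three repos. The function names are made up, and it assumes words are marked with the class `ocrx_word` (as Tesseract's output does) and that only the words are positioned, so nested boxes do not offset each other.

```javascript
// Parse the "bbox x0 y0 x1 y1" entry from an hOCR title attribute.
// Returns null if the title has no bbox.
function parseBbox(title) {
  const m = /bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/.exec(title);
  if (!m) return null;
  const [x0, y0, x1, y1] = m.slice(1).map(Number);
  return { x0, y0, x1, y1 };
}

// Turn a bounding box into an inline CSS string for absolute positioning.
function bboxToStyle({ x0, y0, x1, y1 }) {
  return `position:absolute;left:${x0}px;top:${y0}px;` +
         `width:${x1 - x0}px;height:${y1 - y0}px`;
}

// Browser only: position every word inside a page element.
// "page" is assumed to be the element with class "ocr_page".
function positionWords(page) {
  // Make the page the reference box for the absolutely positioned words.
  page.style.position = 'relative';
  for (const word of page.querySelectorAll('.ocrx_word')) {
    const box = parseBbox(word.getAttribute('title') || '');
    if (box) word.style.cssText += ';' + bboxToStyle(box);
  }
}
```

In a browser you would call `positionWords(document.querySelector('.ocr_page'))` after the page has loaded. The real projects do considerably more (scaling, showing the scanned image, font sizing), so treat this only as an illustration of the principle.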

zuphilip answered Oct 06 '22 02:10