How to convert hOCR to HTML for visualization?
If you open the raw hOCR file in a browser, it is only rendered as plain text (the elements are not positioned).
To add the interface to a plain hOCR file, add this line just before the closing </body> tag:
<script src="https://unpkg.com/hocrjs"></script>
Then open the HTML (hOCR) file in your browser.
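If you have many hOCR files, adding the line by hand gets tedious. Below is a minimal sketch of how it could be automated, assuming the tag should go before the last closing </body> tag of the file. The function name `add_hocrjs` is made up for this example and is not part of hocrjs.

```python
import re

# The script tag from the answer above; unpkg serves the hocrjs bundle.
HOCRJS_TAG = '<script src="https://unpkg.com/hocrjs"></script>'


def add_hocrjs(hocr_html: str) -> str:
    """Return hocr_html with the hocrjs script tag inserted before the last </body>.

    Hypothetical helper for illustration, not part of hocrjs itself.
    """
    if HOCRJS_TAG in hocr_html:
        # Already injected, so leave the document unchanged.
        return hocr_html

    # Find every closing body tag (case-insensitive) and use the last one.
    matches = list(re.finditer(r"</body\s*>", hocr_html, flags=re.IGNORECASE))
    if not matches:
        raise ValueError("no closing </body> tag found")

    pos = matches[-1].start()
    return hocr_html[:pos] + HOCRJS_TAG + "\n" + hocr_html[pos:]
```

To use it on a file, read the text, pass it through the function and write the result back, for example `Path("page.hocr").write_text(add_hocrjs(Path("page.hocr").read_text(encoding="utf-8")), encoding="utf-8")` with `from pathlib import Path`, where `page.hocr` is a placeholder file name. Running it twice on the same file does not add the tag a second time.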
Source
There are several solutions for this task; I know of these three:
https://github.com/kba/hocrjs (overlays the hOCR data on the image, with different options for how to display it)
https://github.com/not-implemented/hocr-proofreader (shows the image on the left and the hOCR data on the right, and can be used for entering corrections)
https://github.com/ultrasaurus/hocr-javascript
All of these repos seem to consist mainly of some JavaScript and CSS files. The first two repos both link to a demo page, which is where I took the pictures from.
The first one provides a Greasemonkey/Tampermonkey script that lets you inject this overlay into any suitable hOCR page, online or local (some configuration may be needed for that). I don't know how difficult it is to use the other solutions with your own hOCR files, but it should be doable.