
Segmentation fault while calling cpp function from Python

I am trying to call this C++ function from Python:

TESS_API BOOL TESS_CALL TessBaseAPIProcessPages(TessBaseAPI* handle, const char* filename, 
  const char* retry_config, int timeout_millisec, TessResultRenderer* renderer)
{
    if (handle->ProcessPages(filename, retry_config, timeout_millisec, renderer))
        return TRUE;
    else
        return FALSE;
}

The last parameter of this function is a TessResultRenderer*. There is another C++ function for creating a TessResultRenderer:

TESS_API TessResultRenderer* TESS_CALL TessTextRendererCreate(const char* outputbase)
{
    return new TessTextRenderer(outputbase);
}

To call this from Python, I did the following:

outputbase = "stdout"
renderer = tesseract.TessTextRendererCreate(outputbase)
text_out = tesseract.TessBaseAPIProcessPages(api,
     ctypes.create_string_buffer(path),
     None, 0, renderer)  # Segmentation fault (core dumped) error on this line

but I keep getting a segmentation fault.

My question is: how can I call TessBaseAPIProcessPages from Python?

Some more reference links into the codebase:

referer api

Implementation of processPages(...)

Edit

After trying the suggestions from the comments, I did the following, but I get an error: item 1 in _argtypes_ has no from_param method

PTessResultRenderer = ctypes.POINTER(TessResultRenderer)
self.tesseract.TessTextRendererCreate.restype = PTessResultRenderer
outputbase = "stdout"
self.tesseract.TessTextRendererCreate.argtypes = [outputbase] #error here
self.tesseract.TessTextRendererCreate

ReturnVal = ctypes.c_bool
self.tesseract.TessBaseAPIProcessPages.argtypes = [self.api, path, None, 0, PTessResultRenderer]
self.tesseract.TessBaseAPIProcessPages.restype = ReturnVal
self.tesseract.TessBaseAPIProcessPages

class TessResultRenderer(ctypes.Structure):
    pass

Anthony asked May 24 '26 01:05


1 Answer

There is an example of using the Tesseract C API from ctypes in the contrib folder (contrib/tesseract-c_api-demo.py), although it seems to be a little out of date.

You need to set the restype and argtypes for a few functions. Also, don't forget to call the init function on the handle. The following example works for me. It reads the English text from a file called "test.bmp" into the text variable.

from ctypes import *
from ctypes.util import find_library

lang = b"eng"
filename = b"test.bmp"
TESSDATA_PREFIX = b"/usr/local/Cellar/tesseract/3.04.01_1/share/tessdata"

# find_library expects the bare library name, without the "lib" prefix or a file extension
path = find_library("tesseract")
tesseract = CDLL(path)

class TessBaseAPI(Structure):
    pass
class TessResultRenderer(Structure):
    pass

tesseract.TessBaseAPICreate.restype = POINTER(TessBaseAPI)
tesseract.TessBaseAPIInit3.argtypes = [POINTER(TessBaseAPI), c_char_p, c_char_p]
tesseract.TessBaseAPIInit3.restype = c_int  # int in the C API: 0 on success, -1 on failure
tesseract.TessBaseAPIProcessPages.argtypes = [POINTER(TessBaseAPI), c_char_p, c_char_p, c_int, POINTER(TessResultRenderer)]
tesseract.TessBaseAPIProcessPages.restype = c_bool
tesseract.TessBaseAPIGetUTF8Text.argtypes = [POINTER(TessBaseAPI)]
tesseract.TessBaseAPIGetUTF8Text.restype = c_char_p

api = tesseract.TessBaseAPICreate()
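rc = tesseract.TessBaseAPIInit3(api, TESSDATA_PREFIX, lang)
if rc != 0:
    # A nonzero return code means initialization failed
    tesseract.TessBaseAPIDelete(api)
    print("Could not initialize tesseract.\n")
    exit(3)

# No retry config, no timeout, and no renderer (NULL); the text is fetched below instead
success = tesseract.TessBaseAPIProcessPages(api, filename, None, 0, None)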

if success:
    text = tesseract.TessBaseAPIGetUTF8Text(api)
    print("="*78)
    print(text.decode("utf-8").strip())
    print("="*78)

The output looks like this:

==============================================================================
This is a lot of 12 point text to test the
ocr code and see if it works on all types
of file format.

The quick brown dog jumped over the
lazy fox. The quick brown dog jumped
over the lazy fox. The quick brown dog
jumped over the lazy fox. The quick
brown dog jumped over the lazy fox.
==============================================================================
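
A note on the two errors in the question, as a minimal sketch rather than Tesseract-specific code. The from_param error appears because argtypes must be a list of ctypes types, not the values you intend to pass. The segmentation fault is probably related to the missing restype: ctypes assumes a foreign function returns a C int unless told otherwise, so a 64-bit pointer returned by TessTextRendererCreate would be truncated before being passed back in. The example uses strlen from the C library as a stand-in (assuming a POSIX system where CDLL(None) exposes it), and the address 0x7F1234567890 is made up for illustration.

```python
import ctypes

# Assumption: POSIX system, where CDLL(None) gives access to libc symbols.
libc = ctypes.CDLL(None)

# Wrong: a value in argtypes. ctypes rejects the assignment with a TypeError.
try:
    libc.strlen.argtypes = [b"stdout"]
    assignment_error = None
except TypeError as exc:
    assignment_error = exc

# Right: ctypes types go in argtypes; values are passed at call time.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
length = libc.strlen(b"stdout")

# What an int-sized return slot does to a pointer-sized value.
# Assumption: C int is 32 bits, as on mainstream 64-bit platforms.
address = 0x7F1234567890  # made-up address, for illustration only
truncated = ctypes.c_int(address).value

print(assignment_error)
print(length)
print(hex(address), "->", hex(truncated))
```

Only the low 32 bits of the address survive, which is why a function that returns a pointer needs an explicit restype before its result is handed to another function.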

Edit: Replaced use of c_void_p with opaque types as suggested by eryksun. Thanks!
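
To illustrate the difference, here is a minimal sketch with no Tesseract dependency. A pointer to an empty Structure subclass lets ctypes reject a pointer of the wrong kind, whereas c_void_p accepts any pointer. takes_api and takes_void are hypothetical stand-ins built from Python callbacks, not Tesseract functions, and accepts is a helper introduced only for this example.

```python
import ctypes

class TessBaseAPI(ctypes.Structure):
    pass  # opaque type: no fields declared

class TessResultRenderer(ctypes.Structure):
    pass  # opaque type: no fields declared

def is_non_null(ptr):
    # Receives a pointer instance (typed case) or an int/None (void* case)
    return bool(ptr)

# Hypothetical stand-ins for C functions taking TessBaseAPI* and void*
takes_api = ctypes.CFUNCTYPE(ctypes.c_bool, ctypes.POINTER(TessBaseAPI))(is_non_null)
takes_void = ctypes.CFUNCTYPE(ctypes.c_bool, ctypes.c_void_p)(is_non_null)

null_api = ctypes.POINTER(TessBaseAPI)()                # NULL TessBaseAPI*
null_renderer = ctypes.POINTER(TessResultRenderer)()    # NULL TessResultRenderer*

def accepts(func, arg):
    """Return True if ctypes lets arg through the function's argument check."""
    try:
        func(arg)
        return True
    except ctypes.ArgumentError:
        return False

typed_ok = accepts(takes_api, null_api)          # matching pointer type
typed_none = accepts(takes_api, None)            # None is passed as NULL
typed_wrong = accepts(takes_api, null_renderer)  # wrong pointer type
void_wrong = accepts(takes_void, null_renderer)  # void* takes any pointer

print(typed_ok, typed_none, typed_wrong, void_wrong)
```

With the typed pointer, a mixed-up argument fails in Python with an ArgumentError instead of reaching the C++ code.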

Snorfalorpagus answered May 26 '26 14:05


