Exception when using ctypes with tesseract-ocr TessPageIteratorBoundingBox

Question

import ctypes
import os
os.putenv("PATH", r'C:\Program Files\Tesseract-OCR')
os.environ["TESSDATA_PREFIX"] = r'C:\Program Files\Tesseract-OCR\tessdata'

liblept = ctypes.cdll.LoadLibrary('liblept-5.dll')
pix = liblept.pixRead('test.png'.encode())
print(pix)

tesseractLib = ctypes.cdll.LoadLibrary('libtesseract-5.dll')

tesseractHandle = tesseractLib.TessBaseAPICreate()

tesseractLib.TessBaseAPIInit3(tesseractHandle, '.', 'eng')

tesseractLib.TessBaseAPISetImage2(tesseractHandle, pix)

# text_out = tesseractLib.TessBaseAPIGetUTF8Text(tesseractHandle)
# print(ctypes.string_at(text_out))

tessPageIterator = tesseractLib.TessResultIteratorGetPageIterator(tesseractHandle)
iteratorLevel = 3  # RIL_BLOCK,  RIL_PARA,  RIL_TEXTLINE,  RIL_WORD,  RIL_SYMBOL
tesseractLib.TessPageIteratorBoundingBox(tessPageIterator, iteratorLevel, ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(0))

I got this exception:

Traceback (most recent call last):
  File "D:\BaiduYunDownload\programming\Python\CtypesOCR.py", line 25, in <module>
    tesseractLib.TessPageIteratorBoundingBox(tessPageIterator, iteratorLevel, ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(0), ctypes.c_int(0))
OSError: exception: access violation reading 0x00000018

So what's wrong? The aim of this program is to get the bounding rectangle of each word. I know about projects like tesserocr and PyOCR.

P.S. Specifying the required argument types (function prototypes) for the DLL functions doesn't matter here. To check, uncomment the commented lines and comment out the last three lines. I posted this question before, and it was closed for this reason.

Tags: python, c++, c, tesseract, ctypes

asked on Stack Overflow Feb 12, 2020 by iMath

1 Answer

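I solved this myself. The working code differs from the original in three ways: the string arguments to TessBaseAPIInit3 are passed as bytes; TessBaseAPIRecognize is called and the result iterator is obtained with TessBaseAPIGetIterator before TessResultIteratorGetPageIterator, which expects a result iterator rather than the API handle; and the bounding-box coordinates, which are output parameters, are passed by reference as writable ctypes.c_int objects.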
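
One of the changes in the code below is that the four coordinates are passed with ctypes.byref instead of by value. That mechanism can be tried without Tesseract installed. The following is only a sketch: `BoxFn`, `_fill_box`, `fake_bounding_box`, and the coordinate values are hypothetical stand-ins that mimic the out-parameter shape of TessPageIteratorBoundingBox, not part of any library.

```python
import ctypes

# Hypothetical stand-in with the same out-parameter shape as
# TessPageIteratorBoundingBox: int f(int* left, int* top, int* right, int* bottom)
IntPtr = ctypes.POINTER(ctypes.c_int)
BoxFn = ctypes.CFUNCTYPE(ctypes.c_int, IntPtr, IntPtr, IntPtr, IntPtr)


def _fill_box(left, top, right, bottom):
    # Each argument arrives as a pointer; writing to index 0 stores the
    # value in the caller's c_int object. The numbers are made up.
    left[0], top[0], right[0], bottom[0] = 10, 20, 110, 45
    return 1


# Wrap the Python function so it is called through the C calling convention.
fake_bounding_box = BoxFn(_fill_box)

# Writable ctypes instances; a plain Python int cannot be passed to byref().
left, top, right, bottom = (ctypes.c_int(0) for _ in range(4))

ok = fake_bounding_box(
    ctypes.byref(left),
    ctypes.byref(top),
    ctypes.byref(right),
    ctypes.byref(bottom),
)
box = (left.value, top.value, right.value, bottom.value)
print(ok, box)
```

Passing ctypes.c_int(0) by value, as in the question, hands the callee the integer 0 where it expects an address, so there is nowhere valid to write the result. The full working program follows.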

import ctypes
import os
import io
os.putenv("PATH", r'C:\Program Files\Tesseract-OCR')
os.environ["TESSDATA_PREFIX"] = r'C:\Program Files\Tesseract-OCR\tessdata'

liblept = ctypes.cdll.LoadLibrary('liblept-5.dll')
pix = liblept.pixRead(b'test.png')  # the filename must be bytes (encoded), not str
print(pix)

tesseractLib = ctypes.cdll.LoadLibrary('libtesseract-5.dll')

tesseractHandle = tesseractLib.TessBaseAPICreate()

tesseractLib.TessBaseAPIInit3(tesseractHandle, b'.', b'eng')  # (TessBaseAPI* handle, const char* datapath,const char* language);

# from PIL import Image
# pixmap = Image.open("test.png")
# image = io.BytesIO()
# pixmap.save(image, 'png')  # There is no format type here, so just pick one arbitrarily; For images created by the library itself (via a factory function, or by running a method on an existing image), this attribute is set to None.
# image.seek(0)  # Must seek back to the start; otherwise a later read (e.g. by requests) starts at the end and gets no data

tesseractLib.TessBaseAPISetImage2(tesseractHandle, pix)  # pixmap.tobytes("raw", "RGB")
# text_out = tesseractLib.TessBaseAPIGetUTF8Text(tesseractHandle)
# print(ctypes.string_at(text_out))
tesseractLib.TessBaseAPIRecognize(tesseractHandle, None)  # required; otherwise the calls below fail
tessResultIterator = tesseractLib.TessBaseAPIGetIterator(tesseractHandle)  # needed by TessResultIteratorGetPageIterator

tessPageIterator = tesseractLib.TessResultIteratorGetPageIterator(tessResultIterator)
wordLevel = 3  # TessPageIteratorLevel: RIL_BLOCK=0, RIL_PARA=1, RIL_TEXTLINE=2, RIL_WORD=3, RIL_SYMBOL=4


left = ctypes.c_int(0)  # these receive output data, so they must be writable ctypes instances; byref() argument must be a ctypes instance, not 'int'
top = ctypes.c_int(0)
right = ctypes.c_int(0)
bottom = ctypes.c_int(0)

while True:
    r = tesseractLib.TessPageIteratorBoundingBox(
        tessPageIterator,
        wordLevel,
        ctypes.byref(left),  # byref behaves similar to pointer(obj), but the construction is a lot faster.
        ctypes.byref(top),
        ctypes.byref(right),
        ctypes.byref(bottom)
    )

    # TessResultIteratorGetUTF8Text is declared to take the result iterator
    text_out = tesseractLib.TessResultIteratorGetUTF8Text(tessResultIterator, wordLevel)
    print(ctypes.string_at(text_out), left.value, top.value, right.value, bottom.value)

    if not tesseractLib.TessPageIteratorNext(tessPageIterator, wordLevel):
        break
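
The code above relies on the default ctypes return type, C int, which is narrower than a pointer on 64-bit Python builds, so handles returned by the library could be truncated there. A possible hardening step is to declare prototypes before calling anything. This is a sketch, not a verified binding: `declare_prototypes` is a hypothetical helper, and the signatures are written from the Tesseract C API header (capi.h) and Leptonica's pixRead as I understand them, so they should be checked against the installed versions.

```python
import ctypes


def declare_prototypes(tess, lept):
    """Attach return and argument types to the C functions used above.

    `tess` and `lept` are the objects returned by ctypes.cdll.LoadLibrary.
    Opaque handles are declared as void pointers so they keep full width.
    """
    vp, ci, cp = ctypes.c_void_p, ctypes.c_int, ctypes.c_char_p
    int_p = ctypes.POINTER(ctypes.c_int)

    # Leptonica: PIX* pixRead(const char* filename)
    lept.pixRead.restype = vp
    lept.pixRead.argtypes = [cp]

    prototypes = {
        # name: (restype, argtypes)
        "TessBaseAPICreate": (vp, []),
        "TessBaseAPIInit3": (ci, [vp, cp, cp]),
        "TessBaseAPISetImage2": (None, [vp, vp]),
        "TessBaseAPIRecognize": (ci, [vp, vp]),
        "TessBaseAPIGetIterator": (vp, [vp]),
        "TessResultIteratorGetPageIterator": (vp, [vp]),
        "TessPageIteratorBoundingBox": (ci, [vp, ci, int_p, int_p, int_p, int_p]),
        "TessPageIteratorNext": (ci, [vp, ci]),
        # c_void_p rather than c_char_p keeps the raw address so the
        # buffer can be handed back to TessDeleteText afterwards.
        "TessResultIteratorGetUTF8Text": (vp, [vp, ci]),
        "TessDeleteText": (None, [vp]),
    }
    for name, (restype, argtypes) in prototypes.items():
        func = getattr(tess, name)
        func.restype = restype
        func.argtypes = argtypes
    return tess, lept
```

Assuming the libraries are loaded as in the code above, the helper would be called once as declare_prototypes(tesseractLib, liblept) right after the two LoadLibrary calls. text_out is then a plain integer address, which ctypes.string_at accepts, and it can be released with tesseractLib.TessDeleteText(text_out) after printing.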

answered on Stack Overflow Mar 11, 2020 by iMath

User contributions licensed under CC BY-SA 3.0