Setup:
Libraries:
import itertools
import pytesseract
import os
from PIL import Image, ImageEnhance, ImageFilter
import glob
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
from copy import deepcopy
import cv2
from matplotlib.pyplot import imshow
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import time
Sharable Code:
def img_prep(img):
    im = img.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(2)
    im = im.convert('1')
    return im
Code relying on C compiling:
text = pytesseract.image_to_string(img_prep(img))
img.crop([guess])
Exception:
Unhandled exception at 0x00007FFF0814ADEF (_imaging.cp36-win_amd64.pyd) in python.exe: 0xC0000005: Access violation reading location 0x0000000000000000.
Note:
Unfortunately, I can't post my code as it is work product, but I'm using cv2 and tesseract (with pytesseract) to read text from images. My script reads progressively smaller images and runs into trouble as the images get smaller and the OCR call therefore runs faster. When the runtime of the OCR call drops to about 0.9 seconds, the program crashes.
I'm running this in Jupyter Notebook and don't get a traceback when the kernel dies.
Updates: Dummy Example
def load_img(file_name):
    # Open the file passed in, rather than relying on the global img_name
    im = Image.open(file_name)
    return im
img_name = r"Path\To\Some\Img.jpg"
img = load_img(img_name)
print(img.size)
#(1700,2200)
# Note: evaluating img.crop([100,100,50,200]) on its own returns an error.
# However, assigning the result does not:
img = img.crop([100,100,50,200])
print(img.size)
#(0,100)
def img_prep(img):
    im = img.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(2)
    im = im.convert('1')
    return im
# Calling the following code crashes
text = pytesseract.image_to_string(img_prep(img))
# This appears to be the result of tesseract having no handling mechanism for mis-sized images
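One way to keep a zero-sized image from ever reaching the filter and OCR calls is to check its size first. The following is only a minimal sketch: `has_pixels` and `ocr_if_nonempty` are hypothetical helper names, and `ocr` stands for whatever callable runs the pipeline (for example, one that wraps `pytesseract.image_to_string(img_prep(im))`).

```python
def has_pixels(size):
    """Return True when both dimensions of a (width, height) pair are positive."""
    width, height = size
    return width > 0 and height > 0


def ocr_if_nonempty(img, ocr):
    """Run `ocr` on `img` only if the image has pixels; otherwise return ''.

    `img` is assumed to expose a PIL-style `.size` attribute, and `ocr` is a
    hypothetical callable supplied by the caller.
    """
    if not has_pixels(img.size):
        return ""
    return ocr(img)
```

With the dummy example above, this could be called as `text = ocr_if_nonempty(img, lambda im: pytesseract.image_to_string(img_prep(im)))`, so the `(0, 100)` image yields an empty string instead of reaching the native code.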
I added several prints to my code to dig further and found that one of my loops was cropping the image with a bug: the second vertex was smaller than the first. Fixing this loop should resolve the error.
The crop method used here is PIL's Image.crop (not a cv2 function). It takes a single box of four coordinates: img.crop([width_lower, height_lower, width_upper, height_upper])
This error arises when an upper bound is less than or equal to its lower bound in the crop box.
The issue is that the crop fails silently on assignment.
img.crop([width_lower, height_lower, width_upper, height_upper])
where upper is less than lower fails with an error, but
sub_img = img.crop([width_lower, height_lower, width_upper, height_upper])
does not, and causes this error when sub_img is processed downstream.
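Since the bad box goes unnoticed on assignment, the crop itself can be guarded so that an inverted or empty box fails loudly at the point where it is created. This is a sketch under assumptions, not part of PIL: `validate_box` and `safe_crop` are hypothetical helpers, and `img` is assumed to be any object with a PIL-style `crop(box)` method.

```python
def validate_box(box):
    """Return the box as a 4-tuple, or raise ValueError if it is empty or inverted.

    The box follows the order used above:
    (width_lower, height_lower, width_upper, height_upper).
    """
    width_lower, height_lower, width_upper, height_upper = box
    if width_upper <= width_lower or height_upper <= height_lower:
        raise ValueError(
            f"degenerate crop box {tuple(box)}: "
            "each upper bound must be greater than its lower bound"
        )
    return (width_lower, height_lower, width_upper, height_upper)


def safe_crop(img, box):
    """Crop `img` only after the box has passed validation."""
    return img.crop(validate_box(box))
```

Replacing `sub_img = img.crop(box)` with `sub_img = safe_crop(img, box)` in the buggy loop would raise a `ValueError` that names the offending box, rather than killing the kernel later.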