Tesseract OCR recognition of small numbers
-
Tesseract OCR fails to recognize some small digits, specifically 6 and 9; the other digits are recognized correctly.
Example:
import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread('src_path...')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
scale_percent = 400 # percent of original size
width = int(img_gray.shape[1] * scale_percent / 100)
height = int(img_gray.shape[0] * scale_percent / 100)
dim = (width, height)
resized_img = cv2.resize(img_gray, dim, interpolation=cv2.INTER_AREA)
blur_img = cv2.GaussianBlur(resized_img, (3, 3), 0)
blur_img = cv2.medianBlur(blur_img, 3)
thresh, new_img = cv2.threshold(blur_img, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY)
custom_config = '--psm 12 --oem 3 -c tessedit_char_whitelist=0123456789'
digits = pytesseract.image_to_string(new_img, lang='eng', config=custom_config)
print(digits)
After all the preprocessing, this is the resulting image:
But Tesseract does not recognize it. What can I do?
Update 1: A temporary workaround is to upscale the image 10x and apply a strong blur (medianBlur with a kernel size between 13 and 21) before passing it to tesseract.
That solves it for now, but is there a better solution to my problem?
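The workaround from update 1 can be sketched roughly as follows. This is only an illustration: `odd_kernel_sizes` and `first_readable` are hypothetical helper names, and the blur and OCR steps are passed in as callables so the sweep itself does not depend on a particular library. The one library fact assumed is that `cv2.medianBlur` requires an odd kernel size greater than 1.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple


def odd_kernel_sizes(low: int, high: int) -> List[int]:
    """Odd integers in [low, high] that are greater than 1."""
    return [k for k in range(low, high + 1) if k % 2 == 1 and k > 1]


def first_readable(
    image: Any,
    blur: Callable[[Any, int], Any],
    ocr: Callable[[Any], str],
    sizes: Iterable[int],
) -> Optional[Tuple[int, str]]:
    """Try each kernel size in order and return (size, digits) for the first
    blurred image whose OCR output consists only of digits, else None."""
    for ksize in sizes:
        text = ocr(blur(image, ksize)).strip()
        if text.isdigit():
            return ksize, text
    return None
```

With OpenCV and pytesseract, `blur` would be `cv2.medianBlur` and `ocr` something like `lambda im: pytesseract.image_to_string(im, config=custom_config)`, with `sizes=odd_kernel_sizes(13, 21)`.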
-
Found the reference image:
Invert the colors, since Tesseract works better with black text on a white background, and set psm to 8.
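If it is not known in advance whether an image needs inverting, the check could be sketched like this. It is a heuristic, not something Tesseract provides: `ensure_dark_on_light` is a hypothetical helper, and it assumes an 8-bit binarized image in which the background covers more than half of the pixels.

```python
import numpy as np


def ensure_dark_on_light(binary: np.ndarray) -> np.ndarray:
    """Return the image with dark text on a light background.

    Assumption: the background is the majority of the pixels, so a mean
    below mid-gray means the background is dark and the image is inverted.
    """
    if binary.dtype != np.uint8:
        raise ValueError("expected an 8-bit image")
    if binary.mean() < 127.5:
        # For uint8 arrays this gives 255 - value, like cv2.bitwise_not.
        return np.bitwise_not(binary)
    return binary
```

In the code from the question, this would be applied to `new_img` right after `cv2.threshold`.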
Possible psm values:
- 0 = Orientation and script detection (OSD) only.
- 1 = Automatic page segmentation with OSD.
- 2 = Automatic page segmentation, but no OSD, or OCR. (not implemented)
- 3 = Fully automatic page segmentation, but no OSD. (Default)
- 4 = Assume a single column of text of variable sizes.
- 5 = Assume a single uniform block of vertically aligned text.
- 6 = Assume a single uniform block of text.
- 7 = Treat the image as a single text line.
- 8 = Treat the image as a single word.
- 9 = Treat the image as a single word in a circle.
- 10 = Treat the image as a single character.
- 11 = Sparse text. Find as much text as possible in no particular order.
- 12 = Sparse text with OSD.
- 13 = Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
Possible oem values:
- 0 = Original Tesseract only.
- 1 = Neural nets LSTM only.
- 2 = Tesseract + LSTM.
- 3 = Default, based on what is available.
The mode descriptions are taken from https://github.com/tesseract-ocr/tesseract/blob/main/doc/tesseract.1.asc
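The psm and oem values listed above reach Tesseract through the `config` string given to pytesseract. A small sketch of assembling that string is shown here; `build_config` is a hypothetical helper, not part of pytesseract, and it only checks the value ranges from the lists above.

```python
from typing import Optional


def build_config(psm: int, oem: Optional[int] = None, whitelist: str = "") -> str:
    """Assemble a Tesseract config string from psm, oem and a character whitelist."""
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    parts = [f"--psm {psm}"]
    if oem is not None:
        if not 0 <= oem <= 3:
            raise ValueError("oem must be between 0 and 3")
        parts.append(f"--oem {oem}")
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


print(build_config(8, whitelist="0123456789"))
# --psm 8 -c tessedit_char_whitelist=0123456789
```

The result can be passed as `config=` to `pytesseract.image_to_string`, which makes it easy to try several psm values on the same image.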
import pytesseract
import cv2

img = cv2.imread('ZZt0xKV.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
scale_percent = 400 # percent of original size
width = int(img_gray.shape[1] * scale_percent / 100)
height = int(img_gray.shape[0] * scale_percent / 100)
dim = (width, height)
resized_img = cv2.resize(img_gray, dim, interpolation=cv2.INTER_AREA)
blur_img = cv2.GaussianBlur(resized_img, (3, 3), 0)
blur_img = cv2.medianBlur(blur_img, 3)
thresh, new_img = cv2.threshold(blur_img, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe" # raw string, so \t is not read as a tab
new_img = cv2.bitwise_not(new_img)
custom_config = '--psm 8 -c tessedit_char_whitelist=0123456789'
digits = pytesseract.image_to_string(new_img, lang='eng', config=custom_config)
print(digits.strip()) # 9
Useful page with tips for improving recognition quality: https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html
Update: I took the preprocessing code from the question and changed psm.