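This article walks through an example of improving OCR results with image processing, giving the detailed steps and source code.
Background
Text recognition runs into trouble in many situations, for example non-uniform backgrounds, noise, or partially missing text.
We want to recognize the black text in the image (12-14), but the background is fairly complex and contains other interference. If we run Tesseract on the image directly (code below), the recognition result is empty.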
# -*- coding:utf-8 -*-
import pytesseract
from PIL import Image
# open the image
image = Image.open('0.png')
# OCR: lang defaults to English
text = pytesseract.image_to_string(image)
# print the recognized text
print(text)
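For text in a complex scene like this, direct recognition fails easily. So the question is: can we use image processing to segment or highlight the part we need before running recognition? This article uses this image as a worked example.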
**Detailed implementation steps**
【1】Otsu binarization
image = cv2.imread('0.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Otsu", thresh)
【2】Distance transform + normalization
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
dist = cv2.normalize(dist, dist, 0, 1.0, cv2.NORM_MINMAX)
dist = (dist * 255).astype("uint8")
cv2.imshow("Dist", dist)
【3】Otsu binarization of the distance transform result
_,dist = cv2.threshold(dist, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imshow("Dist Otsu", dist)
【4】Morphological opening to filter out noise
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
opening = cv2.morphologyEx(dist, cv2.MORPH_OPEN, kernel)
cv2.imshow("Opening", opening)
【5】Contour filtering to find the text region
black_img = cv2.cvtColor(opening, cv2.COLOR_GRAY2BGR)
cnts = cv2.findContours(opening.copy(), cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
chars = []
# loop over the contours
for c in cnts:
  # compute the bounding box of the contour
  (x, y, w, h) = cv2.boundingRect(c)
  if w >= 35 and h >= 100:
    chars.append(c)
cv2.drawContours(black_img,chars,-1,(0,255,0),2)
cv2.imshow("chars", black_img)
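【6】Compute the convex hull of the contours to obtain a mask of the text region
# stack the points of all character contours and compute their convex hull
chars = np.vstack([chars[i] for i in range(0, len(chars))])
hull = cv2.convexHull(chars)
# draw the filled convex hull on an empty mask, then enlarge it via a dilation
mask = np.zeros(image.shape[:2], dtype="uint8")
cv2.drawContours(mask, [hull], -1, 255, -1)
mask = cv2.dilate(mask, None, iterations=2)
cv2.imshow("Mask", mask)
# take the bitwise AND of the opening image and the mask to reveal just
# the characters in the image
final = cv2.bitwise_and(opening, opening, mask=mask)
cv2.imshow("final", final)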
【7】Tesseract text recognition
text = pytesseract.image_to_string(final)
# print the recognized text
print(text)
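If the cleaned image is still misread, Tesseract can be constrained through the config argument of pytesseract.image_to_string. The sketch below is an assumption rather than part of the article's code: it treats the image as a single word (--psm 8) and restricts output to digits and a hyphen, which matches the target text 12-14. The helper names are hypothetical, and character whitelists are not honored by every Tesseract version.

```python
def build_tesseract_config(psm=8, whitelist="0123456789-"):
    """Build a Tesseract config string: page segmentation mode plus a character whitelist."""
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"

def ocr_with_config(image, config=None):
    """Run Tesseract on an image (PIL image or NumPy array) with the given config."""
    # Imported here so build_tesseract_config can be used without Tesseract installed.
    import pytesseract
    return pytesseract.image_to_string(image, config=config or build_tesseract_config()).strip()

print(build_tesseract_config())
```

With the pipeline above this would be called as ocr_with_config(final) in place of the plain image_to_string call.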
【8】Complete code:
# Source: WeChat official account "OpenCV与AI深度学习"
import cv2
import numpy as np
import imutils
import pytesseract
image = cv2.imread('0.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
cv2.imshow("Otsu", thresh)
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
dist = cv2.normalize(dist, dist, 0, 1.0, cv2.NORM_MINMAX)
dist = (dist * 255).astype("uint8")
cv2.imshow("Dist", dist)
# threshold the distance transform using Otsu's method
_,dist = cv2.threshold(dist, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imshow("Dist Otsu", dist)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
opening = cv2.morphologyEx(dist, cv2.MORPH_OPEN, kernel)
cv2.imshow("Opening", opening)
black_img = cv2.cvtColor(opening, cv2.COLOR_GRAY2BGR)
cnts = cv2.findContours(opening.copy(), cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
chars = []
# loop over the contours
for c in cnts:
  # compute the bounding box of the contour
  (x, y, w, h) = cv2.boundingRect(c)
  if w >= 35 and h >= 100:
    chars.append(c)
cv2.drawContours(black_img,chars,-1,(0,255,0),2)
cv2.imshow("chars", black_img)
chars = np.vstack([chars[i] for i in range(0, len(chars))])
hull = cv2.convexHull(chars)
# allocate memory for the convex hull mask, draw the convex hull on
# the image, and then enlarge it via a dilation
mask = np.zeros(image.shape[:2], dtype="uint8")
cv2.drawContours(mask, [hull], -1, 255, -1)
mask = cv2.dilate(mask, None, iterations=2)
cv2.imshow("Mask", mask)
# take the bitwise AND of the opening image and the mask to reveal just
# the characters in the image
final = cv2.bitwise_and(opening, opening, mask=mask)
cv2.imshow("final", final)
text = pytesseract.image_to_string(final)
# print the recognized text
print(text)
cv2.waitKey()
cv2.destroyAllWindows()
**References**
(1) https://pyimagesearch.com/2021/11/22/improving-ocr-results-with-basic-image-processing/
(2) https://stackoverflow.com/questions/33881175/remove-background-noise-from-image-to-make-text-more-clear-for-ocr