Detection And Removal Of Hand-Drawn Annotation Lines In Scanned Image

The performance of optical character recognition (OCR) is badly affected by various unwanted variations in a document. When such documents are passed to an OCR system, they cause misrecognition. The work presented in this paper removes hand-drawn underlines and annotation lines from documents, including circular annotations, strikethrough lines, straight underlines, underlines that touch or do not touch the text, broken underlines, and other text-surrounding lines. An RGB image is first converted into the Lab color space to obtain global features and is then divided into clusters so that the annotated area can be extracted from the background. The novelty of this method lies in its ability to compute the annotation area regardless of whether it touches a word, and to detect strikethrough lines. Inpainting is then used to remove the lines and fill in the lost parts.

Index Terms- Hand-Drawn Annotation Lines, Inpainting, Optical Character Recognition, Text Surrounding Lines.
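
The first two stages described above (conversion to Lab, then clustering to separate the annotation from the background) can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation. The function names (`srgb_to_lab`, `kmeans`, `annotation_mask`), the choice of k-means with k = 3, the farthest-point initialisation, and the rule that the annotation is the cluster with the highest chroma (that is, coloured ink on a black-and-white page) are all assumptions made for illustration. The inpainting stage is not shown.

```python
import math


def srgb_to_lab(rgb):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 white point)."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        return t ** (1 / 3) if t > 216 / 24389 else (24389 / 27 * t + 16) / 116

    fx, fy, fz = f(x / 0.95047), f(y), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation (assumed)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers


def annotation_mask(image, k=3):
    """Return a boolean mask marking the assumed annotation cluster.

    `image` is a list of rows, each a list of (R, G, B) tuples.
    Assumption: the annotation is the cluster with the largest chroma.
    """
    points = [srgb_to_lab(px) for row in image for px in row]
    labels, centers = kmeans(points, k)
    ink = max(range(k), key=lambda j: math.hypot(centers[j][1], centers[j][2]))
    width = len(image[0])
    flat = [lab == ink for lab in labels]
    return [flat[i:i + width] for i in range(0, len(flat), width)]


if __name__ == "__main__":
    W, K, R = (255, 255, 255), (0, 0, 0), (200, 30, 30)  # paper, text, red ink
    page = [
        [W, W, W, W],
        [W, K, K, W],  # printed text
        [R, R, R, R],  # hand-drawn underline
        [W, W, W, W],
    ]
    for row in annotation_mask(page):
        print("".join("#" if m else "." for m in row))
```

Clustering in Lab rather than RGB keeps lightness (L) separate from the two color axes (a, b), which is why a chroma test on the cluster centers is enough to pick out coloured ink in this toy case. Annotations drawn in black or grey ink would need a different criterion, and a real scan would need an image library for both the conversion and the subsequent inpainting of the masked pixels.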