SEVERAL METHODS OF FEATURE EXTRACTION TO HELP IN OPTICAL CHARACTER RECOGNITION

  • Binod Kumar Prasad Bengal College of Engineering and Technology, Durgapur, W.B., India.
  • Rajdeep Kundu Bengal College of Engineering, Durgapur, W.B., India.
Keywords: Optical Character Recognition, Feature extraction, Chain code, Directional features, Bengali numerals recognition system.

Abstract

An Optical Character Recognition (OCR) system consists of three broad steps: preprocessing, feature extraction, and classification. Feature extraction methods yield the feature vectors on which the classification of a test pattern is based. This paper proposes several feature extraction methods that may help in recognizing a Bengali numeral or character. The Pixel Ex-OR method applies a digital gating (Ex-OR) technique to extract the information in an image: two successive elements of a row of the image matrix are Ex-ORed, and the output is in turn Ex-ORed with the next element. Alphabetical coding encodes a binary character image using letters of the English alphabet. Directional features obtain gradient information using Sobel masks to make the position of strokes in an image clear. The features are derived in eight standard directions, and these eight feature vectors are then merged into four sets of features to reduce system complexity and thereby considerably reduce processing time. These features are intended to support the development of a Bengali numeral recognition system.
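As a rough illustration of two of the ideas named in the abstract, the following Python sketch shows a row-wise running Ex-OR and a Sobel-based directional feature. It is not the authors' implementation. The function names `running_xor` and `directional_features` are hypothetical, keeping every intermediate Ex-OR output is an assumption, and so is the choice to merge the eight direction bins into four by adding opposite directions; the abstract does not state how the merge is done.

```python
import math

def running_xor(row):
    """Ex-OR the first two pixels, then Ex-OR each output with the next pixel.

    Assumption: every intermediate output is kept as part of the feature.
    """
    if len(row) < 2:
        return []
    outputs = []
    acc = row[0]
    for pixel in row[1:]:
        acc ^= pixel
        outputs.append(acc)
    return outputs

# Standard 3x3 Sobel masks for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def directional_features(image):
    """Accumulate gradient magnitude into 8 direction bins, then merge to 4.

    Assumption: opposite directions (k and k + 4) are merged by addition.
    Border pixels are skipped because the 3x3 mask does not fit there.
    """
    rows, cols = len(image), len(image[0])
    bins8 = [0.0] * 8
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(SOBEL_X[i][j] * image[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no direction to record
            angle = math.atan2(gy, gx) % (2 * math.pi)
            k = int(round(angle / (math.pi / 4))) % 8  # nearest 45-degree bin
            bins8[k] += magnitude
    bins4 = [bins8[k] + bins8[k + 4] for k in range(4)]
    return bins8, bins4

if __name__ == "__main__":
    print(running_xor([1, 0, 1, 1]))  # [1, 0, 1]
    edge = [[0, 0, 1, 1] for _ in range(4)]  # vertical stroke boundary
    print(directional_features(edge)[1])
```

In this sketch the last element returned by `running_xor` is the parity of the row; whether the paper uses only that final value or the whole sequence is not specified in the abstract.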

Published
2017-01-25