Pattern recognition, or Optical Character Recognition (OCR), is a pipelined process consisting of several stages carried out in sequence. These stages are shown in Figure 2.
Each character is represented as a combination of pixels, and all pixels together form one large feature vector. The total number of pixels is w × h, where w is the number of pixels along the width and h is the number of pixels along the height. Figure 3 depicts how pixels form one particular character; x_i is the fraction of ink in pixel i. The classifier must be adaptive, that is, it must generalize, so that it can recognize patterns it encounters for the first time. A typical character image is 64 × 64 pixels, and each pixel can take one of 256 grey values, which makes the feature space very large. Training a recognizer therefore requires a huge amount of data to fill this vast space. To reduce the dimensionality, Principal Component Analysis is commonly used; it transforms the data into a lower-dimensional space (Yeung & Ruzzu, 2001).
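The pixel-based feature vector and the dimensionality reduction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the mapping from grey level to ink fraction (0 is full ink, 255 is blank paper) is an assumption, and the leading principal direction is estimated by plain power iteration as a simple stand-in for a full Principal Component Analysis.

```python
def ink_features(grey_rows, max_grey=255):
    """Flatten an h-by-w grey image into a list of w*h ink fractions x_i.

    Assumption: grey level 0 is full ink and max_grey is blank paper.
    """
    return [1.0 - g / max_grey for row in grey_rows for g in row]


def first_principal_direction(samples, iterations=100):
    """Estimate the mean and leading principal direction by power iteration."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    # Deterministic, non-uniform start vector; power iteration stalls only
    # if the start happens to be orthogonal to the leading direction.
    v = [float(j + 1) for j in range(d)]
    start_norm = sum(x * x for x in v) ** 0.5
    v = [x / start_norm for x in v]
    for _ in range(iterations):
        # Apply the covariance matrix as X^T (X v) / n without building it.
        scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) / n
             for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break  # no variance along v; keep the current vector
        v = [x / norm for x in w]
    return mean, v


def project(sample, mean, direction):
    """Reduce one feature vector to a single coordinate along the direction."""
    return sum((s - m) * u for s, m, u in zip(sample, mean, direction))


# A blank 64 x 64 image yields 64 * 64 = 4096 features.
blank = [[255] * 64 for _ in range(64)]
print(len(ink_features(blank)))  # 4096
```

With 4096 features per character, even this single projection shows why a lower-dimensional representation is attractive: each sample collapses to one coordinate, and further directions could be extracted in the same way if more components were needed.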
OCR should also distinguish between 'O' and '6'. Figure 4 shows one such case. If the ratio t/b is small, the letter is 'O'; otherwise it is '6'. A good algorithm must define the tolerance level (T) for this comparison adequately.
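The ratio test can be sketched as a one-line decision rule. This is only an illustration under stated assumptions: t and b are taken to be the two measurements defined in Figure 4 (not reproduced here), the function name is hypothetical, and comparing t/b directly against T is one plausible reading of "smaller". No default value for T is given because the text does not state one.

```python
def classify_o_or_6(t, b, tolerance):
    """Return 'O' when the ratio t/b falls below the tolerance T, else '6'.

    Assumption: t and b are the measurements from Figure 4, and b is positive.
    """
    if b <= 0:
        raise ValueError("b must be positive to form the ratio t/b")
    return "O" if t / b < tolerance else "6"
```

In practice T would have to be tuned on labelled samples, since a value that is too low or too high would push all characters into one class.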