Pattern Recognition, or Optical Character Recognition (OCR), is a pipelined process consisting of several stages performed in sequence. The stages are shown in Figure 2.
Each character is represented as a combination of pixels, and all pixels together make up one large feature vector. The total number of pixels is w × h, where w is the number of pixels across the width and h is the number of pixels along the height. Figure 3 depicts how the pixels form one particular character; x_i is the fraction of ink in pixel i. The classifier must be adaptive, that is, able to generalize, so that it can recognize patterns it encounters for the first time. A typical character image is 64 × 64 pixels, and each pixel can take one of 256 grey values, which makes the feature space large. Training a recognizer therefore requires a huge amount of data to fill this vast space. To reduce the dimensionality, Principal Component Analysis is most often used, which transforms the data into a lower-dimensional space (Yeung & Ruzzu, 2001).
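The feature vector and its reduction can be sketched as follows. This is a minimal illustration, not the method of the cited work: it assumes NumPy is available, that grey value 0 means no ink and 255 means full ink, and the function name pca_reduce and its parameters are hypothetical.

```python
import numpy as np

def pca_reduce(images, k):
    """Project grey-level character images onto their top-k principal components.

    images: array of shape (n, h, w) with grey values in 0..255
            (assumption: 0 = no ink, 255 = full ink).
    Returns (projected, components, mean).
    """
    n = len(images)
    # Flatten each image into a feature vector of length w*h; x_i = ink fraction.
    X = np.asarray(images, dtype=np.float64).reshape(n, -1) / 255.0
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA works on mean-centred data
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    return Xc @ components.T, components, mean

# Usage: 20 random 64 x 64 "images" reduced from 4096 features to 5.
rng = np.random.default_rng(0)
imgs = rng.integers(0, 256, size=(20, 64, 64))
Z, comps, mean = pca_reduce(imgs, 5)
print(Z.shape, comps.shape)
```

Using the SVD of the centred data avoids building the 4096 × 4096 covariance matrix explicitly, which matters when the number of training images is much smaller than the number of pixels.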
OCR should also distinguish between similar shapes such as 'O' and '6'. Figure 4 shows one such case. If the ratio t/b comes out small, the character is 'O'; otherwise it is '6'. A good algorithm must define the tolerance level (T) for this comparison adequately. Other examples of such cases are the letter 'q' and the digit '9'.
There are various algorithms and computer procedures available for pattern recognition. One example is Brian Sanderson's Pattern Recognition (PR) Algorithm.
Every pattern is identified according to three systems of notation, for example:
333 Conway-Thurston notation.
P3 International Union of Crystallography notation.
S333 Montesinos notation.
First, identify the maximum rotation order: 1, 2, 3, 4 or 6. Then check whether a mirror (m) is present, whether there is an indecomposable glide reflection (g), and finally whether any rotation axis lies on a mirror.
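These checks can be turned into a small decision procedure. The sketch below uses the standard decision table for the 17 plane symmetry groups and is not claimed to reproduce Sanderson's flowchart exactly. The function name, the boolean parameters, and in particular the extra question all_centres_on_mirrors are assumptions introduced for illustration; group names are returned in lowercase International Union of Crystallography style.

```python
def classify_pattern(max_rotation, has_mirror, has_glide=False,
                     centre_on_mirror=False, all_centres_on_mirrors=False):
    """Name a plane symmetry group from the answers to the checks above.

    has_glide means an indecomposable glide reflection is present.
    all_centres_on_mirrors is an extra (assumed) question needed to
    separate a few groups that share the other answers.
    """
    if max_rotation not in (1, 2, 3, 4, 6):
        raise ValueError("maximum rotation order must be 1, 2, 3, 4 or 6")
    if max_rotation == 1:
        if has_mirror:
            return "cm" if has_glide else "pm"
        return "pg" if has_glide else "p1"
    if max_rotation == 2:
        if not has_mirror:
            return "pgg" if has_glide else "p2"
        if not centre_on_mirror:
            return "pmg"
        return "pmm" if all_centres_on_mirrors else "cmm"
    if max_rotation == 3:
        if not has_mirror:
            return "p3"
        return "p3m1" if all_centres_on_mirrors else "p31m"
    if max_rotation == 4:
        if not has_mirror:
            return "p4"
        return "p4m" if all_centres_on_mirrors else "p4g"
    # max_rotation == 6
    return "p6m" if has_mirror else "p6"

# Usage: order-3 rotations, no mirrors and no glides gives the group
# written 333 / P3 / S333 in the three notations.
print(classify_pattern(3, has_mirror=False))
```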
Apart from this, the Genetic Algorithm (GA) can also be used for PR. The selection of patterns plays an important role in the PR process, as it determines the accuracy of the algorithm, its learning time, and the number of samples required. Choosing the best features is likewise important when developing classifiers. The problem becomes more difficult when the number of features is very large, and GA gives better results in that case. Because GAs are effective at rapid global search over large, nonlinear and sparsely populated spaces, they are applied to the feature recognition problem. GA combines different optimization problems into a single formulation (Morita).
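A genetic search over feature subsets can be sketched as below. This is a generic illustration with tournament selection, one-point crossover, bit-flip mutation and elitism, not the formulation of the cited work. All names and parameter values are hypothetical, and toy_fitness is a stand-in for a real score such as classifier accuracy on a validation set.

```python
import random

def toy_fitness(mask):
    # Assumed toy score: features 0, 2 and 5 are useful, every selected
    # feature costs a little. A real system would train a classifier here.
    useful = {0, 2, 5}
    gain = sum(1 for i, bit in enumerate(mask) if bit and i in useful)
    return gain - 0.1 * sum(mask)

def tournament(pop, fitness, rng):
    # Pick two distinct individuals at random and keep the fitter one.
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def ga_select_features(n_features, fitness, pop_size=20, generations=30,
                       mutation_rate=0.05, seed=0):
    """Return a 0/1 mask over n_features (n_features >= 2) found by a simple GA."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            cut = rng.randrange(1, n_features)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]                  # bit-flip mutation
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

mask = ga_select_features(8, toy_fitness)
print(mask, toy_fitness(mask))
```

Each individual is a bit mask with one bit per feature, so the search space has 2^n candidates; the GA samples it instead of enumerating it, which is why it stays practical when the number of features is large.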
The most effective approach to OCR today is neural-network-based recognition.
Neural Network: An Overview
A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain