SEGMENTATION-FREE RECOGNITION OF URDU SCRIPT USING HMM
Main Article Content
Abstract
All the Urdu literature is in the form of manuscripts and typewritten books.There is a need for converting all these physical libraries into electronic libraries. Various OCRs have been developed for different languages and are widely used. Building a complete Urdu OCR is a difficult task because Urdu is highly cursive language, where ligatures overlap and style variation poses challenges to the recognition system.
We are describing a technique for automatic recognition of off-line printed Urdu text using Hidden Markov Models. Our method does not require segmentation into characters and considers each shape of Urdu character as different class resulting in a total of 196 classes (compared to 38 Urdu letters). This paper presents a novel feature extraction method based on sliding window technique, using only 16 statistical features from each sliding window thereby eliminating the need for segmentation of Urdu text. The dependency of Recognition rate of Urdu script upon, the number of states of HMM, different sizes of hierarchical window and different fonts is presented. We are using HTK (Hidden Markov Model Toolkit) for training, recognition and result analysis.
We are describing a technique for automatic recognition of off-line printed Urdu text using Hidden Markov Models. Our method does not require segmentation into characters and considers each shape of Urdu character as different class resulting in a total of 196 classes (compared to 38 Urdu letters). This paper presents a novel feature extraction method based on sliding window technique, using only 16 statistical features from each sliding window thereby eliminating the need for segmentation of Urdu text. The dependency of Recognition rate of Urdu script upon, the number of states of HMM, different sizes of hierarchical window and different fonts is presented. We are using HTK (Hidden Markov Model Toolkit) for training, recognition and result analysis.
Downloads
Download data is not yet available.
Article Details
Section
Articles
COPYRIGHT
Submission of a manuscript implies: that the work described has not been published before, that it is not under consideration for publication elsewhere; that if and when the manuscript is accepted for publication, the authors agree to automatic transfer of the copyright to the publisher.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work
- The journal allows the author(s) to retain publishing rights without restrictions.
- The journal allows the author(s) to hold the copyright without restrictions.