APPROACH TO TEXT EXTRACTION FROM IMAGE

Disha Bhat, Charitha D, Dimple M K, Amruthashree R V, Shruthi G

Abstract


Multimedia resources such as images and videos are growing rapidly, both in databases and on the web, and it has become difficult to develop effective methods to manage and retrieve these resources by their content. Text is an important object that carries high-level semantic information useful for this task. Current optical character recognition (OCR) technology converts machine-generated text printed against a clean background into a computer-readable form (ASCII). However, text is often printed against shaded or textured backgrounds or is embedded in images; examples include maps, photographs, advertisements and videos. Current document segmentation and recognition technologies cannot handle these situations well. Our system takes advantage of the distinctive characteristics that make text stand out from other image material, namely that text possesses certain frequency and orientation information. We first clean the image by adjusting its contrast and gradient. The objects in the image are then identified and numbered. In the text recognition process, these numbered objects are segregated into text and non-text. The recognised text is then reconstructed to form the meaningful text present in the image. We also focus on extracting the text such that certain portions of the image, such as logos, are retained. This is done by calculating the pixels of the portion of the image to be retained and then training the system so that it extracts all the text except that portion.
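The "identify and number the objects, then segregate text from non-text" steps can be illustrated with a small sketch. The abstract does not say which labelling algorithm or classifier the system uses, so everything here is an assumption: 4-connected component labelling stands in for object numbering, and a hypothetical bounding-box size threshold (`max_height`, `max_width`) stands in for the text/non-text decision. The input is assumed to be an image that has already been cleaned and binarised (1 = foreground).

```python
from collections import deque

def label_components(binary):
    """Number each 4-connected foreground region; returns (labels, count)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                count += 1                      # new object found
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                    # flood-fill the object
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

def bounding_boxes(labels):
    """Map each object number to its (top, left, bottom, right) box."""
    boxes = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab:
                t, l, b, rt = boxes.get(lab, (r, c, r, c))
                boxes[lab] = (min(t, r), min(l, c), max(b, r), max(rt, c))
    return boxes

def looks_like_text(box, max_height=3, max_width=3):
    """Hypothetical rule: small objects are treated as character candidates."""
    top, left, bottom, right = box
    return (bottom - top + 1) <= max_height and (right - left + 1) <= max_width

# Toy binarised image: a small "U"-shaped glyph and a larger solid block.
image = [
    [1, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]

labels, count = label_components(image)
boxes = bounding_boxes(labels)
text_ids = [lab for lab, box in sorted(boxes.items()) if looks_like_text(box)]
print(count, boxes, text_ids)
```

In a real system the size rule would be replaced by a trained classifier, and a region that must be retained (for example a logo) could be excluded by skipping objects whose boxes fall inside that region.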



Keywords


image, text, extraction






DOI: https://doi.org/10.26483/ijarcs.v9i0.6270





Copyright (c) 2018 International Journal of Advanced Research in Computer Science