IJLLL 2016 Vol.2(4): 164-168 ISSN: 2382-6282
DOI: 10.18178/IJLLL.2016.2.4.88

Text Analysis and Information Retrieval of Historical Tamil Ancient Documents Using Machine Translation in Image Zoning

E. K. Vellingiriraj, M. Balamurugan, and P. Balasubramanie
Abstract—The aim of this paper is to develop a system that recognizes Brahmi, Grantha and Vattezhuthu characters in palm manuscripts of historical ancient Tamil documents, analyzes the text, and machine-translates it into present-day Tamil digital text. Though many researchers have implemented various algorithms and techniques for character recognition in different languages, converting ancient characters still poses a big challenge. Image recognition technology has reached near-perfection when it comes to scanning text in English and some other languages, but optical character recognition (OCR) software capable of digitizing printed Tamil text with high levels of accuracy is still elusive. Only a few people are familiar with the ancient characters and attempt to convert them into written documents manually. The proposed system addresses this situation by converting ancient historical documents from inscriptions and palm manuscripts into Tamil digital text, encoded in Tamil Unicode. Our algorithm comprises four stages: i) image preprocessing, ii) feature extraction, iii) character recognition and iv) digital text conversion. In the first phase, our algorithm converts the Brahmi script with an accuracy of 91.57% using a neural network and the image zoning method. The second phase, covering the Vattezhuthu character set, is to be implemented; the conversion accuracy for Vattezhuthu is 89.75%.

Index Terms—Character recognition, vattezhuthu, segmentation, image zoning, machine translation.
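
The abstract names image zoning as the feature extraction method but does not give the zone layout or the per-zone feature. As a minimal sketch, assuming a binarized glyph image (1 for ink, 0 for background) and a simple ink-density feature per zone, the idea could look like the following. The function name `zoning_features`, the 2 x 2 grid, and the sample glyph are illustrative assumptions, not the authors' implementation.

```python
from typing import List


def zoning_features(image: List[List[int]], rows: int, cols: int) -> List[float]:
    """Split a binary image into rows x cols zones and return the
    fraction of ink pixels in each zone, in row-major order."""
    height = len(image)
    width = len(image[0])
    features = []
    for r in range(rows):
        # Integer boundaries let the zones tile the image even when
        # its size is not an exact multiple of the grid.
        top = r * height // rows
        bottom = (r + 1) * height // rows
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            area = (bottom - top) * (right - left)
            ink = sum(
                image[y][x]
                for y in range(top, bottom)
                for x in range(left, right)
            )
            features.append(ink / area if area else 0.0)
    return features


# Hypothetical 4 x 4 glyph, used only to illustrate the call.
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
features = zoning_features(glyph, 2, 2)
print(features)  # [1.0, 0.0, 0.0, 0.75]
```

In a pipeline like the one described, a vector of this kind would be the input to the neural network classifier in the character recognition stage; the grid size is a tuning choice that trades spatial detail against sensitivity to small shifts in the glyph.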

E. K. Vellingiriraj and P. Balasubramanie are with the Department of Computer Science and Engineering, Kongu Engineering College, Perundurai, Tamil Nadu, India (e-mail: girirajek@rediffmail.com, balu_p@kongu.ac.in).
M. Balamurugan is with the Department of Computer Science and Engineering, Christ University, Bangalore, Karnataka, India (e-mail: balamurugan.m@christuniversity.in).


Cite: E. K. Vellingiriraj, M. Balamurugan, and P. Balasubramanie, "Text Analysis and Information Retrieval of Historical Tamil Ancient Documents Using Machine Translation in Image Zoning," International Journal of Languages, Literature and Linguistics, vol. 2, no. 4, pp. 164-168, 2016.
