Bangla text processing and recognition based on Fuzzy unsupervised Feature Extraction and SVM

M. A. Haque Monil, Md S.Q. Zulkar Nine, Bruce Poon, M. Ashraful Amini, Hong Yan

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Scopus citations

Abstract

Optical character recognition (OCR) is a widely used technology for converting text images into editable text. Researchers have already proposed many machine learning algorithms to address this problem. However, Bangla text recognition is a highly challenging task because of its complicated writing style, compound characters and highly diversified fonts. To address the segmentation problem, we propose an algorithm named Blob-Labeled Character Segmentation (BLCS), which begins with extensive preprocessing to extract the characters from text. Our novel character segmentation procedure extracts characters with 97.5% accuracy. Unsupervised feature learning has become a powerful tool in machine learning. To increase the recognition rate of the characters, we introduce a fuzzy unsupervised feature learning algorithm to learn features of individual characters. We then use an Artificial Neural Network (ANN) and a Support Vector Machine (SVM) to classify the characters. The SVM achieves 99.4% accuracy, outperforming all other approaches.
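
The abstract names blob labeling as the basis of the segmentation step but does not give the details of BLCS. As a rough illustration only, the sketch below shows generic connected-component (blob) labeling on a binarized image and returns one bounding box per blob, ordered left to right. It is not the authors' algorithm: the function name `label_blobs`, the 8-connectivity choice, and the toy image are all assumptions made for this example.

```python
from collections import deque


def label_blobs(grid):
    """Find 8-connected foreground blobs in a binary grid (1 = ink).

    Returns bounding boxes (top, left, bottom, right), sorted left to right.
    Generic connected-component labeling, not the paper's BLCS procedure.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    # Order blobs by their left edge, i.e. reading order within a line.
    return sorted(boxes, key=lambda box: box[1])


# Hypothetical toy image containing two separate blobs.
image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
]
boxes = label_blobs(image)
```

Bangla characters within a word are typically joined by a headline (matra), so plain blob labeling on raw text would tend to return whole words rather than characters. This is presumably why the abstract stresses extensive preprocessing before segmentation.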

Original language: English
Title of host publication: Proceedings - International Conference on Machine Learning and Cybernetics
Publisher: IEEE Computer Society
Pages: 1272-1278
Number of pages: 7
ISBN (Electronic): 9781479902576
State: Published - 2013
Externally published: Yes
Event: 12th International Conference on Machine Learning and Cybernetics, ICMLC 2013 - Tianjin, China
Duration: Jul 14 2013 to Jul 17 2013

Publication series

Name: Proceedings - International Conference on Machine Learning and Cybernetics
Volume: 3
ISSN (Print): 2160-133X
ISSN (Electronic): 2160-1348

Conference

Conference: 12th International Conference on Machine Learning and Cybernetics, ICMLC 2013
Country/Territory: China
City: Tianjin
Period: 07/14/13 to 07/17/13

Keywords

  • Artificial neural network (ANN)
  • Optical character recognition (OCR)
  • Support vector machine (SVM)
  • Type-I fuzzy system
  • Unsupervised feature learning
