TY - GEN
T1 - Bangla text processing and recognition based on Fuzzy unsupervised Feature Extraction and SVM
AU - Haque Monil, M. A.
AU - Zulkar Nine, Md S.Q.
AU - Poon, Bruce
AU - Ashraful Amini, M.
AU - Yan, Hong
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2013
Y1 - 2013
N2 - Optical character recognition (OCR) is a widely used technology for converting text images to editable text. Researchers have already proposed many machine learning algorithms to address this problem. However, Bangla text recognition is a highly challenging task because of its complicated writing style, compound characters, and highly diversified fonts. To address the segmentation problem, we propose an algorithm, Blob-Labeled Character Segmentation (BLCS), that begins with extensive preprocessing to extract the characters from text. Our novel character segmentation procedure extracts characters with 97.5% accuracy. Unsupervised feature learning has become a powerful tool in machine learning. To increase the recognition rate of the characters, we introduce a fuzzy unsupervised feature learning algorithm to learn the features of individual characters. We then use an Artificial Neural Network (ANN) and a Support Vector Machine (SVM) to classify the characters. The SVM achieves 99.4% accuracy, outperforming all other approaches.
AB - Optical character recognition (OCR) is a widely used technology for converting text images to editable text. Researchers have already proposed many machine learning algorithms to address this problem. However, Bangla text recognition is a highly challenging task because of its complicated writing style, compound characters, and highly diversified fonts. To address the segmentation problem, we propose an algorithm, Blob-Labeled Character Segmentation (BLCS), that begins with extensive preprocessing to extract the characters from text. Our novel character segmentation procedure extracts characters with 97.5% accuracy. Unsupervised feature learning has become a powerful tool in machine learning. To increase the recognition rate of the characters, we introduce a fuzzy unsupervised feature learning algorithm to learn the features of individual characters. We then use an Artificial Neural Network (ANN) and a Support Vector Machine (SVM) to classify the characters. The SVM achieves 99.4% accuracy, outperforming all other approaches.
KW - Artificial neural network (ANN)
KW - Optical character recognition (OCR)
KW - Support vector machine (SVM)
KW - Type-I fuzzy system
KW - Unsupervised feature learning
UR - http://www.scopus.com/inward/record.url?scp=84907270880&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2013.6890784
DO - 10.1109/ICMLC.2013.6890784
M3 - Conference contribution
AN - SCOPUS:84907270880
T3 - Proceedings - International Conference on Machine Learning and Cybernetics
SP - 1272
EP - 1278
BT - Proceedings - International Conference on Machine Learning and Cybernetics
PB - IEEE Computer Society
T2 - 12th International Conference on Machine Learning and Cybernetics, ICMLC 2013
Y2 - 14 July 2013 through 17 July 2013
ER -