5- A Multi-cascaded Deep Model for Bilingual SMS Classification

Published in International Conference on Neural Information Processing (ICONIP), 2019

Recommended citation: Muhammad Haroon Shakeel, Asim Karim, Imdadullah Khan (2019). A Multi-cascaded Deep Model for Bilingual SMS Classification. International Conference on Neural Information Processing (ICONIP). https://doi.org/10.1007/978-3-030-36708-4_24


Abstract: Most studies on text classification focus on the English language. However, short texts such as SMS are influenced by regional languages. This makes automatic text classification challenging because the language in the text is multilingual, informal, and noisy. In this work, we propose a novel multi-cascaded deep learning model called McM for bilingual SMS classification. McM exploits n-gram level information as well as long-term dependencies of text for learning. Our approach aims to learn a model without any code-switching indication, lexical normalization, language translation, or language transliteration. The model relies entirely upon the text, as no external knowledge base is utilized for learning. For this purpose, a 12-class bilingual text dataset is developed from citizens' SMS feedback on public services, written in a mix of Roman Urdu and English. Our model achieves high classification accuracy on this dataset and outperforms the previous model for multilingual text classification, highlighting the language independence of McM.