Al-Bahir Journal for Engineering and Pure Sciences

Abstract

The proliferation of social networking sites and their user base has led to an exponential increase in the amount of data generated daily. Textual content is common on these platforms and has been shown to have a significant impact on decision-making at the individual, group, and national levels. Among the largest and most important parts of this data are texts that express human intentions, feelings, and conditions. Understanding these texts is one of the biggest challenges facing data analysis: it underpins understanding people and their orientations, supports decision-making in many cases, and thus helps predict behavior. This paper proposes a model for understanding texts written by people on social media platforms and, from them, determining people's attitudes toward specific topics (positive, negative, or neutral) as well as their emotions. To do so, the system addresses several natural language processing tasks through four components: a topic classifier, a sentiment analyzer, a sarcasm detector, and an emotion classifier. A CNN-BiLSTM was used for each component, and the (F-measure, accuracy) pairs were (97%, 97.58%) for the topic classifier, (84%, 86%) for the sentiment analyzer, (95%, 97%) for the sarcasm detector, and (82%, 81.6%) for the emotion classifier.
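
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and F-measure can be computed for a multi-class classifier such as the sentiment analyzer. The abstract does not state how the F-measure was averaged across classes, so macro-averaging is an assumption here; the function names `accuracy` and `macro_f_measure` and the toy labels are hypothetical and are not taken from the paper or its data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1  # correct prediction for class t
        else:
            fp[p] += 1  # class p was predicted but is wrong
            fn[t] += 1  # class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic
        # mean of precision and recall for this class.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy illustration only; these labels are invented, not the paper's data.
y_true = ["pos", "neg", "neu", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "pos", "pos", "neu", "neu"]
acc = accuracy(y_true, y_pred)
f1 = macro_f_measure(y_true, y_pred)
print(f"accuracy={acc:.4f}, macro F-measure={f1:.4f}")
```

The two metrics can diverge when classes are imbalanced or when errors concentrate in one class, which is one reason both are commonly reported together.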

