Sentiment analysis and prediction of polarity vaccines based on Twitter data using deep NLP techniques

Hassan Badi, Imad Badi, Karim El Moutaouakil, Aziz Khamjane, Abdelkhalek Bahri

Abstract


The global impact of COVID-19 has been significant, and several vaccines have been developed to combat the virus. These vaccines differ in efficacy and in how well they prevent illness and confer immunity. As the pandemic continues, developing and distributing effective vaccines remains a top priority, which makes monitoring of prevention strategies necessary to limit the spread of the disease. The vaccines have sparked wide debate on social networks and in the media about their effectiveness and side effects. This debate has generated large volumes of data that call for intelligent tools capable of analyzing them in depth and extracting the underlying knowledge and sentiment. Few works both analyze sentiment and predict it from estimated polarities. In this work, we first use big data and Natural Language Processing (NLP) tools to extract the entities expressed in tweets about AstraZeneca and Pfizer and to estimate their polarities; second, we use a Long Short-Term Memory (LSTM) neural network to predict the future polarities of these two vaccines. To process data in parallel at scale on clustered systems, we use the Apache Spark Framework (ASF), which handles massive amounts of data in a distributed way. The results show that the Pfizer vaccine is more popular and more trusted than AstraZeneca. According to the predictions of the LSTM model, Pfizer is also likely to maintain its strong market position in the foreseeable future. Predictive analytics of this kind, which use advanced machine learning techniques, have proven accurate in forecasting trends and identifying patterns in data; we therefore have confidence in the LSTM's prediction of Pfizer's continued dominance in the industry.
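The two stages described in the abstract (estimating tweet polarities, then framing them as a time series for a sequence model) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say which polarity tool was used, so the toy lexicon, the sample tweets, and the function names `polarity`, `daily_polarity`, and `sliding_windows` are all assumptions made for illustration, and plain Python stands in for the distributed Spark processing.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon; the abstract does not name the polarity tool used.
POSITIVE = {"safe", "effective", "good", "trust", "protected"}
NEGATIVE = {"bad", "unsafe", "fear", "pain", "clot"}


def polarity(text):
    """Score in [-1, 1]: (pos - neg) / (pos + neg), or 0.0 with no lexicon hits."""
    tokens = [t.strip(".,!?#@:;\"'").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def daily_polarity(tweets, vaccine):
    """Mean polarity per day over tweets that mention `vaccine`."""
    by_day = defaultdict(list)
    for day, text in tweets:
        if vaccine.lower() in text.lower():
            by_day[day].append(polarity(text))
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]


def sliding_windows(series, width):
    """Frame a series as (input window, next value) pairs for a sequence model."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]


# Made-up sample tweets as (date, text) pairs, for illustration only.
tweets = [
    ("2021-03-01", "Pfizer shot was safe and effective"),
    ("2021-03-01", "Fear of a clot after AstraZeneca"),
    ("2021-03-02", "Pfizer: arm pain but I trust it"),
    ("2021-03-03", "Got my Pfizer dose, feeling good"),
]
pfizer = daily_polarity(tweets, "Pfizer")
pairs = sliding_windows([score for _, score in pfizer], width=2)
```

Each pair holds a window of past daily polarities and the value that follows it, which is the supervised form an LSTM forecaster is typically trained on. In a setting like the one the abstract describes, the per-tweet scoring step would presumably run as a distributed map over the tweet collection rather than in a local loop.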

Keywords


Natural Language Processing (NLP); Machine learning; Big Data; COVID-19; Sentiment analysis; Prediction; Vaccines; Long short-term memory (LSTM); Apache Spark Framework (ASF)



DOI: https://doi.org/10.32620/reks.2022.4.02
