Explainable artificial intelligence for multimodal sentiment analysis in revitalization project management

Serhii Dolhopolov, Yuliia Riabchun, Maksym Delembovskyi, Oleksandr Molodid

Abstract


This study develops and evaluates an explainable artificial intelligence (XAI) framework for multimodal sentiment analysis, applied to territorial revitalization project management. The research addresses the critical problem of “black box” AI models, whose lack of transparency hinders their adoption by project managers who require trustworthy information for high-stakes decision-making in complex social environments.

The goal of this study is to propose and rigorously validate a novel framework for multimodal sentiment analysis tailored to provide transparent, trustworthy, and actionable insights for decision-making in territorial revitalization project management. The tasks to be solved include developing a hybrid XAI technique that fuses insights from cross-modal attention and gradient-based attribution; designing a cohesive, user-centric explanation format that combines highlighted text with image heatmaps; constructing a custom RevitalizeSent-MM dataset for this domain; and empirically evaluating both the framework’s predictive accuracy and, crucially, the fidelity of its explanations.

The methods involve a transformer-based multimodal sentiment analysis (MSA) model that uses BERT and ViT with cross-modal attention for information fusion. The explainability component is a hybrid XAI technique that integrates cross-modal attention analysis with Integrated Gradients to assign importance scores to input features. Evaluation was performed using standard classification metrics for predictive performance and the “Accuracy Drop on Perturbation” metric for explanation fidelity.

The results confirmed the efficacy of the framework: the multimodal model demonstrated superior accuracy over unimodal baselines, and the proposed XAI method achieved significantly higher fidelity than naive explanation approaches, demonstrating its ability to accurately reflect the model’s internal reasoning.

The scientific novelty lies in three areas: the development of a fused, hybrid XAI technique specifically for transformer-based multimodal models; the creation of a unique, domain-specific dataset for revitalization analysis; and the validation of a methodology for adapting advanced XAI to overcome critical trust and adoption barriers, thereby confirming its practical significance in project management.
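
The predictive core described above fuses BERT text features and ViT image features through cross-modal attention. The sketch below is a minimal illustration of that kind of architecture, assuming base-size encoder checkpoints, a single fusion block, and a three-class sentiment head; the paper’s exact configuration is not specified here.

```python
# Minimal sketch of a BERT + ViT model with cross-modal attention fusion.
# Checkpoint names, hidden size, head count, and the classifier are assumptions.
import torch
import torch.nn as nn
from transformers import BertModel, ViTModel

class CrossModalSentimentModel(nn.Module):
    def __init__(self, num_classes=3, hidden=768, heads=8):
        super().__init__()
        self.text_encoder = BertModel.from_pretrained("bert-base-uncased")
        self.image_encoder = ViTModel.from_pretrained(
            "google/vit-base-patch16-224-in21k")
        # Text tokens act as queries over image patches (keys/values).
        self.cross_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, input_ids, attention_mask, pixel_values):
        text = self.text_encoder(
            input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        image = self.image_encoder(pixel_values=pixel_values).last_hidden_state
        fused, attn_weights = self.cross_attn(query=text, key=image, value=image)
        # Classify from the fused [CLS] position; attn_weights can later feed
        # the attention-based half of a hybrid explanation.
        logits = self.classifier(fused[:, 0])
        return logits, attn_weights
```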
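For the gradient-based half of the hybrid explanation, the sketch below computes Integrated Gradients over the text embeddings using the Captum library. Captum is one common implementation choice, and the rule for fusing these scores with the cross-modal attention weights is not given in the abstract, so the blend noted in the final comment is purely an assumption.

```python
# Hedged sketch: token-level Integrated Gradients via Captum.
from captum.attr import LayerIntegratedGradients

def explain_text(model, input_ids, attention_mask, pixel_values, target):
    def forward_fn(ids):
        logits, _ = model(ids, attention_mask, pixel_values)
        return logits

    lig = LayerIntegratedGradients(forward_fn, model.text_encoder.embeddings)
    attributions = lig.attribute(input_ids, target=target, n_steps=32)
    # Collapse the embedding dimension to one importance score per token.
    token_scores = attributions.sum(dim=-1)
    token_scores = token_scores / token_scores.abs().max()
    # A hybrid score could then blend these with the attention weights,
    # e.g. 0.5 * attention + 0.5 * gradients; that weighting is an
    # illustrative choice, not the paper's stated fusion rule.
    return token_scores
```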
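Explanation fidelity is evaluated with the “Accuracy Drop on Perturbation” metric: a faithful explanation should identify inputs the model actually relies on, so masking the top-ranked tokens should hurt accuracy more than masking random ones. The sketch below assumes text-side perturbation with the [MASK] token and an illustrative k; the paper’s exact perturbation protocol may differ.

```python
# Hedged sketch of Accuracy Drop on Perturbation for the text modality.
import torch

@torch.no_grad()
def accuracy_drop(model, batch, token_scores, labels, mask_token_id, k=5):
    logits, _ = model(batch["input_ids"], batch["attention_mask"],
                      batch["pixel_values"])
    base_acc = (logits.argmax(-1) == labels).float().mean()

    # Replace the k tokens ranked most important by the explanation.
    top_k = token_scores.topk(k, dim=-1).indices
    perturbed = batch["input_ids"].scatter(1, top_k, mask_token_id)
    logits_p, _ = model(perturbed, batch["attention_mask"],
                        batch["pixel_values"])
    pert_acc = (logits_p.argmax(-1) == labels).float().mean()

    # A larger drop means the explanation is more faithful to the model.
    return (base_acc - pert_acc).item()
```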

Keywords


explainable AI (XAI); multimodal sentiment analysis; project management; deep learning; revitalization


References


Yao, S., & Wang, L. Difficulties and Challenges Faced by the Implementation of China’s Rural Revitalization Strategy. Frontiers in Sustainable Development, 2023, vol. 3, no. 2, article no. 3768. DOI: 10.54691/fsd.v3i2.3768.

Guo, B., Yuan, L., & Lu, M. Analysis of Influencing Factors of Farmers’ Homestead Revitalization Intention from the Perspective of Social Capital. Land, 2023, vol. 12, no. 4, article no. 812. DOI: 10.3390/land12040812.

Jin, X., Chin, T., Yu, J., Zhang, Y., & Shi, Y. How Government’s Policy Implementation Methods Influence Urban Villagers’ Acceptance of Urban Revitalization Programs: Evidence from China. Land, 2020, vol. 9, no. 3, article no. 77. DOI: 10.3390/land9030077.

Chen, X., Xie, H., Tao, X., Wang, F. L., Leng, M., & Lei, B. Artificial intelligence and multimodal data fusion for smart healthcare: topic modeling and bibliometrics. Artificial Intelligence Review, 2024, vol. 57, article no. 91. DOI: 10.1007/s10462-024-10712-7.

Sulubacak, U., Caglayan, O., Grönroos, S.-A., Rouhe, A., Elliott, D., Specia, L., & Tiedemann, J. Multimodal machine translation through visuals and speech. Machine Translation, 2020, vol. 34, pp. 97–147. DOI: 10.1007/s10590-020-09250-0.

Singh, K., Piyush, P., Kumar, R., Chhabra, S., Goomer, N., & Kashyap, A. Multimodal Data Extraction & Fusion for Health Monitoring System and Early Diagnosis. 2024 International Conference on Computational Intelligence and Computing Applications (ICCICA), 2024, vol. 1, pp. 216–220. DOI: 10.1109/ICCICA60014.2024.10585027.

De Vito, S., Del Giudice, A., D’Elia, G., Esposito, E., Fattoruso, G., Ferlito, S., Formisano, F., Loffredo, G., Massera, E., D’Auria, P., & Di Francia, G. Future Low-Cost Urban Air Quality Monitoring Networks: Insights from the EU’s AirHeritage Project. Atmosphere, 2024, vol. 15, no. 11, article no. 1351. DOI: 10.3390/atmos15111351.

Chernyshev, D., Ryzhakova, G., Honcharenko, T., Petrenko, H., Chupryna, I., & Reznik, N. Digital Administration of the Project Based on the Concept of Smart Construction. Explore Business, Technology Opportunities and Challenges After the Covid-19 Pandemic. ICBT 2022. Lecture Notes in Networks and Systems, 2022, vol. 495, pp. 1316–1331. DOI: 10.1007/978-3-031-08954-1_114.

Honcharenko, T., Mihaylenko, V., Borodavka, Y., Dolya, E., & Savenko, V. Information Tools for Project Management of the Building Territory at the Stage of Urban Planning. CEUR Workshop Proceedings, 2021, vol. 2851, pp. 22–33.

Matsiievskyi, O., Achkasov, I., Borodavka, Y., & Mazurenko, R. Behavioral model of autonomous robotic systems using reinforcement learning methods. International Workshop on Information Technologies: Theoretical and Applied Problems, 2024, vol. 3896, pp. 1–9. Available at: https://ceur-ws.org/Vol-3896/short14.pdf (accessed 12.05.2025).

Riabchun, Y., Honcharenko, T., Honta, V., Chupryna, K., & Fedusenko, O. Methods and means of evaluation and development for prospective students’ spatial awareness. International Journal of Innovative Technology and Exploring Engineering (IJITEE), 2019, vol. 8, no. 11, pp. 4050–4058. DOI: 10.35940/ijitee.k1532.0981119.

Omas-as, R. M., & Encarnacion, R. E. Stakeholders’ Satisfaction on Institutional Assessment: A Proposal for Unified Feedback Management System with Text Analytics and Sentiment Analysis. International Journal of Advanced Research in Science, Communication and Technology (IJARSCT), 2024, vol. 4, no. 1, pp. 50–56. DOI: 10.48175/ijarsct-18716.

von Eschenbach, W. J. Transparency and the Black Box Problem: Why We Do Not Trust AI. Philosophy & Technology, 2021, vol. 34, no. 4, pp. 1607–1622. DOI: 10.1007/s13347-021-00477-0.

Dolhopolov, S., Honcharenko, T., Savenko, V., Balina, O., Bezklubenko, I. S., & Liashchenko, T. Construction Site Modeling Objects Using Artificial Intelligence and BIM Technology: A Multi-Stage Approach. 2023 IEEE International Conference on Smart Information Systems and Technologies (SIST), 2023, pp. 174–179. DOI: 10.1109/SIST58284.2023.10223543.

Dolhopolov, S., Honcharenko, T., Terentyev, O., Predun, K., & Rosynskyi, A. Information system of multi-stage analysis of the building of object models on a construction site. IOP Conference Series: Earth and Environmental Science, 2023, vol. 1254, no. 1, article no. 012075. DOI: 10.1088/1755-1315/1254/1/012075.

Hazarika, D., Zimmermann, R., & Poria, S. MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. Proceedings of the 28th ACM International Conference on Multimedia (MM '20), 2020, pp. 1363–1371. DOI: 10.1145/3394171.3413678.

Zadeh, A., Chen, M., Poria, S., Cambria, E., & Morency, L.-P. Tensor Fusion Network for Multimodal Sentiment Analysis. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017, pp. 1103–1114. DOI: 10.18653/v1/D17-1115.

Xu, W., Chen, J., Ding, Z., & Wang, J. Text sentiment analysis and classification based on bidirectional Gated Recurrent Units (GRUs) model. Applied and Computational Engineering, 2024, vol. 77, pp. 132–137. DOI: 10.54254/2755-2721/77/20240670.

Tang, Z. Review of Multimodal Sentiment Analysis Techniques. Applied and Computational Engineering, 2024, vol. 120, pp. 88–97. DOI: 10.54254/2755-2721/2025.18747.

Aldeen, M., MohajerAnsari, P., Ma, J., Chowdhury, M., Cheng, L., & Pesé, M. D. WIP: A First Look At Employing Large Multimodal Models Against Autonomous Vehicle Attacks. Proceedings of the 2024 Symposium on Vehicle Security and Privacy (VehicleSec), 2024, pp. 1–7. DOI: 10.14722/vehiclesec.2024.23044.

Beserra, A. A., Kishi, R. M., & Goularte, R. Evaluating Early Fusion Operators at Mid-Level Feature Space. Proceedings of the Brazilian Symposium on Multimedia and the Web (WebMedia '20), 2020, pp. 113–120. DOI: 10.1145/3428658.3431079.

Cheng, J., Feng, C., Xiao, Y., & Cao, Z. Late better than early: A decision-level information fusion approach for RGB-Thermal crowd counting with illumination awareness. Neurocomputing, 2024, vol. 594, article no. 127888. DOI: 10.1016/j.neucom.2024.127888.

Shaikh, M., Chai, D., Islam, S. M., & Akhtar, N. From CNNs to Transformers in Multimodal Human Action Recognition: A Survey. ACM Transactions on Multimedia Computing, Communications, and Applications, 2024, vol. 20, no. 8, article no. 260, pp. 1–24. DOI: 10.1145/3664815.

Lu, J., Batra, D., Parikh, D., & Lee, S. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks. Proceedings of the 33rd International Conference on Neural Information Processing Systems, 2019, article no. 2, pp. 13–23.

Tan, H., & Bansal, M. LXMERT: Learning Cross-Modality Encoder Representations from Transformers. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 5100–5111. DOI: 10.18653/v1/D19-1514.

Chen, Y., Li, L., Yu, L., El Kholy, A., Ahmed, F., Gan, Z., Cheng, Y., & Liu, J. UNITER: UNiversal Image-TExt Representation Learning. Computer Vision – ECCV 2020. Lecture Notes in Computer Science, 2020, vol. 12375, pp. 104–120. DOI: 10.1007/978-3-030-58577-8_7.

Wang, P., Liu, S., & Chen, J. CCDA: A Novel Method to Explore the Cross-Correlation in Dual-Attention for Multimodal Sentiment Analysis. Applied Sciences, 2024, vol. 14, no. 5, article no. 1934. DOI: 10.3390/app14051934.

Yu, W., Xu, H., Yuan, Z., & Wu, J. Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, vol. 35, no. 12, pp. 10798–10806. DOI: 10.1609/aaai.v35i12.17289.

Liu, Z., Braytee, A., Anaissi, A., Zhang, G., Qin, L., & Akram, J. Ensemble Pretrained Models for Multimodal Sentiment Analysis using Textual and Video Data Fusion. WWW '24: Companion Proceedings of the ACM Web Conference 2024, 2024, pp. 1841–1848. DOI: 10.1145/3589335.3651971.

Keane, M. T., & Smyth, B. Good Counterfactuals and Where to Find Them: A Case-Based Technique for Generating Counterfactuals for Explainable AI (XAI). Case-Based Reasoning Research and Development. ICCBR 2020. Lecture Notes in Computer Science, 2020, vol. 12311, pp. 163–178. DOI: 10.1007/978-3-030-58342-2_11.

Pradhan, B., Dikshit, A., Lee, S., & Kim, H. An explainable AI (XAI) model for landslide susceptibility modeling. Applied Soft Computing, 2023, vol. 142, article no. 110324. DOI: 10.1016/j.asoc.2023.110324.

Jang, H., Kim, S., & Yoon, B. An Explainable AI (XAI) model for text-based patent novelty analysis. Expert Systems with Applications, 2023, vol. 231, article no. 120839. DOI: 10.1016/j.eswa.2023.120839.

Hasan, R., Dattana, V., Mahmood, S., & Hussain, S. Towards Transparent Diabetes Prediction: Combining AutoML and Explainable AI for Improved Clinical Insights. Information, 2025, vol. 16, no. 1, article no. 7. DOI: 10.3390/info16010007.

Choi, S. R., & Lee, M. Transformer Architecture and Attention Mechanisms in Genome Data Analysis: A Comprehensive Review. Biology, 2023, vol. 12, no. 7, article no. 1033. DOI: 10.3390/biology12071033.

Gao, Y., Liu, J., Li, W., Hou, M., Li, Y., & Zhao, H. Augmented Grad-CAM++: Super-Resolution Saliency Maps for Visual Interpretation of Deep Neural Network. Electronics, 2023, vol. 12, no. 23, article no. 4846. DOI: 10.3390/electronics12234846.

Cheng, Z., Wu, Y., Li, Y., Cai, L., & Ihnaini, B. A Comprehensive Review of Explainable Artificial Intelligence (XAI) in Computer Vision. Sensors, 2025, vol. 25, no. 13, article no. 4166. DOI: 10.3390/s25134166.

Waa, J. V., Nieuwburg, E., Cremers, A. H., & Neerincx, M. A. Evaluating XAI: A comparison of rule-based and example-based explanations. Artificial Intelligence, 2021, vol. 291, article no. 103404. DOI: 10.1016/j.artint.2020.103404.

Engel, A., Wang, Z., Frank, N. S., Dumitriu, I., Choudhury, S., Sarwate, A. D., & Chiang, T. Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models. arXiv preprint, 2023, pp. 1–54. DOI: 10.48550/arXiv.2305.14585.

Sun, Y., He, S., Han, X., & Luo, Y. Interpretability in Sentiment Analysis: A Self-Supervised Approach to Sentiment Cue Extraction. Applied Sciences, 2024, vol. 14, no. 7, article no. 2737. DOI: 10.3390/app14072737.

Gipiškis, R., Tsai, C., & Kurasova, O. Explainable AI (XAI) in Image Segmentation in Medicine, Industry, and Beyond: A Survey. arXiv preprint, 2024, pp. 1–35. DOI: 10.48550/arXiv.2405.01636.

Mabokela, K. R., Primus, M., & Celik, T. Explainable Pre-Trained Language Models for Sentiment Analysis in Low-Resourced Languages. Big Data and Cognitive Computing, 2024, vol. 8, no. 11, article no. 160. DOI: 10.3390/bdcc8110160.

Cerasuolo, F., Guarino, I., Spadari, V., Aceto, G., & Pescapé, A. XAI for Interpretable Multimodal Architectures with Contextual Input in Mobile Network Traffic Classification. 2024 IFIP Networking Conference (IFIP Networking), 2024, pp. 757–762. DOI: 10.23919/IFIPNetworking62109.2024.10619769.

Kharchenko, V., Fesenko, H., & Illiashenko, O. Quality Models for Artificial Intelligence Systems: Characteristic-Based Approach, Development and Application. Sensors, 2022, vol. 22, no. 13, article no. 4865. DOI: 10.3390/s22134865.

DOI: https://doi.org/10.32620/reks.2025.4.07
