A Bayesian-driven feedforward neural network model for Kafka cluster latency forecasting

Olha Solovei, Tetiana Honcharenko

Abstract


This article addresses the design of a feedforward neural network (FFNN) whose architecture is derived from a discrete Bayesian network, together with a new method for setting the initial weights that connect neurons across layers. The goal of the study is to develop a neural network model that forecasts end-to-end latency in a Kafka cluster from given configuration parameters and performance metrics. The following tasks were completed: a discrete Bayesian network was developed and validated to identify the factors that influence end-to-end latency in Kafka clusters; a sensitivity analysis was conducted on this network; a matrix of initial weights was derived from the sensitivity analysis and used to initialize the FFNN; an FFNN architecture for predicting Kafka cluster end-to-end latency was designed and its parameters were configured; and the designed FFNN model was trained and evaluated. The research draws on methods from big data processing, probabilistic graphical models and Bayesian inference, artificial neural networks and deep learning, graph theory, and machine learning optimization. Results: the Mean Square Error (MSE) of the trained FFNN decreased consistently across epochs, from which we conclude that the model can be deployed as a tool to forecast Apache Kafka latency for given configuration parameters and performance metrics. We compared the MSE of an FFNN initialized with weights derived from the strength of influence in the Bayes Network against an FFNN given the same initial weights scaled by the Kaiming He factor; the comparison showed that the Kaiming He scaling factor primarily improves the initial phase of training by stabilizing weight initialization.
We therefore recommend scaling the initial weights as specified in our method to optimize the FFNN training process. Conclusions. The scientific novelty of the results is twofold: 1) a new methodology is introduced for defining the architecture of an FFNN from the structure of a discrete Bayesian network; 2) a method is proposed for setting the initial weights that connect neurons across layers.
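
The scaling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes that each initial weight equals a strength-of-influence value multiplied by the standard Kaiming He factor sqrt(2 / fan_in). The function name he_scaled_weights and the matrix values are hypothetical and are not taken from the article.

```python
import math


def he_scaled_weights(influence, fan_in=None):
    """Scale a matrix of influence strengths by the He factor sqrt(2 / fan_in).

    influence[i][j] is an assumed strength of influence of input node i on
    neuron j of the next layer (hypothetical values, not from the article).
    """
    if fan_in is None:
        # Assumption: rows index the input neurons, so fan_in = number of rows.
        fan_in = len(influence)
    factor = math.sqrt(2.0 / fan_in)
    return [[factor * w for w in row] for row in influence]


# Hypothetical strengths of influence: 4 input nodes -> 2 hidden neurons.
influence = [
    [0.8, 0.1],
    [0.4, 0.3],
    [0.0, 0.6],
    [0.2, 0.5],
]

weights = he_scaled_weights(influence)
for row in weights:
    print([round(w, 4) for w in row])
```

With four inputs the factor is sqrt(0.5), so every influence value shrinks by roughly 29 percent while the relative ordering of the weights, and any zero entries, are preserved.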

Keywords


Kafka cluster latency; Bayes Network; Feedforward Neural network; strength of influence; initial weights





DOI: https://doi.org/10.32620/reks.2025.3.05
