QPMMCOA and Bayesian Fuzzy Clustering: A Novel Approaches For Optimizing Queries in Big Data

Rani, Mursubai Sandhya; Sai, Raghavendra

doi:10.22111/ijfs.2025.47412.8350

Iranian Journal of Fuzzy Systems
Volume 22, Issue 2 (Khordad and Tir), 2025, Pages 1-24
Article type: Research Paper
Digital Object Identifier (DOI): 10.22111/ijfs.2025.47412.8350
Authors
Mursubai Sandhya Rani*¹; Raghavendra Sai²
¹Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India
²Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India
Abstract
The explosion of data over the last ten years has drawn substantial attention to big data (BD) in the information field. Query optimization (QO) is crucial to data retrieval in BD environments. Several cloud-based distributed data processing platforms have been developed to provide BD query optimization services that are both affordable and effective. Nevertheless, because they give little consideration to energy-related concerns and query characteristics, most of these solutions result in higher energy consumption (EC) and lower accuracy. To overcome this issue, we introduce a deep-learning approach to arranging big data. This work presents a query optimization technique that uses the Quantum parallel multi-layer Monte Carlo optimization method (QPMMCOA) optimizer and a load balancer based on Bayesian fuzzy clustering to address the problems associated with the query optimization process. The proposed technique has two phases: (1) big data arrangement and (2) query optimization. The first phase arranges BD through preprocessing, feature extraction, feature selection, and deep learning-based BD arrangement. The essential features are selected using the Chaotic Vertex Search algorithm (CVSA), and the improved Deep Residual Shrinkage Network (IDRSN) algorithm is used for the BD arrangement. In the second phase, a Bayesian fuzzy clustering-based load balancer is used with the QPMMCOA optimizer to improve overall query processing performance and avoid energy-inefficient query plans. Finally, similarity evaluation is carried out. The experimental results show that the method performs better than other existing algorithms.
Keywords
energy consumption; deep-learning; algorithms
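
The abstract does not give the algorithmic details of the Bayesian fuzzy clustering load balancer or of QPMMCOA, so the following is only a rough sketch of the second phase under loudly stated assumptions. Standard fuzzy c-means membership stands in for Bayesian fuzzy clustering, and plain random sampling of join orders stands in for QPMMCOA. The function names, the node-scoring rule, and the toy cost model (including `energy_weight`) are all hypothetical and are not taken from the paper.

```python
import random


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of `point` in each cluster center (sums to 1)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # The point coincides with one or more centers: share membership among them.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def pick_node(query_features, node_centers, node_loads, m=2.0):
    """Hypothetical rule: prefer nodes with high membership and low load (0..1)."""
    memberships = fuzzy_memberships(query_features, node_centers, m)
    scores = [u * (1.0 - load) for u, load in zip(memberships, node_loads)]
    return max(range(len(scores)), key=scores.__getitem__)


def plan_cost(order, rows, energy_per_row, energy_weight=0.5):
    """Toy cost model (assumption): each step pays for the rows accumulated so far."""
    time_cost = 0.0
    energy_cost = 0.0
    running = 0
    for table in order:
        running += rows[table]
        time_cost += running
        energy_cost += running * energy_per_row[table]
    return time_cost + energy_weight * energy_cost


def monte_carlo_plan(tables, rows, energy_per_row, samples=200, seed=0):
    """Sample random table orders and keep the cheapest one seen."""
    rng = random.Random(seed)
    best = list(tables)
    best_cost = plan_cost(best, rows, energy_per_row)
    for _ in range(samples):
        candidate = list(tables)
        rng.shuffle(candidate)
        cost = plan_cost(candidate, rows, energy_per_row)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    centers = [(0.0, 0.0), (1.0, 1.0)]
    print(pick_node((0.1, 0.1), centers, node_loads=[0.2, 0.2]))
    rows = {"orders": 1000, "customers": 100, "regions": 10}
    energy = {"orders": 0.3, "customers": 0.1, "regions": 0.1}
    print(monte_carlo_plan(list(rows), rows, energy))
```

Folding energy into a single weighted cost keeps the sketch short. A method that treats time and energy as separate objectives would need a different selection rule.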
