Optimizing Software Defect Detection Using Advanced Feature Selection, Ensemble Learning, and Class Imbalance Solutions


Tarunim Sharma, Shalini Bhaskar, Aman Jatain, Kavita Pabreja

Abstract

Software Anomaly Prediction (SAP) is a vital process in software development, designed to identify potential software defects early in the development lifecycle. Early detection not only enhances software quality and performance but also significantly reduces development costs. The advent of Machine Learning (ML) algorithms has markedly improved the accuracy of bug prediction, leading to more efficient resource allocation and cost management. However, traditional ML models often struggle with non-linear relationships, data imbalance, inadequate feature representation, and complex scenarios, resulting in sub-optimal performance. This research proposes a novel approach that optimizes the selection and refinement of classifiers, improving the accuracy and reliability of SAP. A key focus of this study is class imbalance, a crucial factor that significantly impacts the accuracy of software defect detection, as evidenced by performance metrics. Feature selection, which involves removing irrelevant features from a dataset, is also identified as essential for building more effective learning models. Recent research has also emphasized the importance of tuning model parameters to boost the performance of individual classifiers in SAP tasks. Additionally, Ensemble Learning (EL) techniques have demonstrated superior accuracy and effectiveness when applied to SAP datasets. This research introduces an innovative model that integrates Ensemble Learning with hyperparameter tuning, class imbalance handling, and careful feature selection to predict software bugs more effectively.
The study investigates whether Ensemble Learning models outperform individual models in software bug prediction, and whether integrating hyperparameter optimization, class imbalance handling, and feature selection further boosts their accuracy. The findings underscore the key role of these integrated approaches in improving the predictive power of SAP models. The proposed model, tested in Python, shows a substantial improvement in accuracy over single-classifier models on the PC1 and CM1 datasets from the PROMISE repository, highlighting the potential of these methods for advancing software bug prediction.
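
The combination described in the abstract (feature selection, class imbalance handling, an ensemble of classifiers, and hyperparameter tuning) can be illustrated with a minimal scikit-learn sketch. This is not the authors' model: the abstract does not name the classifiers, the feature selector, or the imbalance technique, so the choices below (soft voting over logistic regression and a random forest, `SelectKBest` with an ANOVA F-test, `class_weight="balanced"` in place of any resampling method, and a small grid search) are assumptions made only for illustration. The data are synthetic and merely stand in for an imbalanced defect dataset such as PC1 or CM1.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a defect dataset (assumption: roughly 10% defective
# modules described by 20 static-code metrics). This is NOT the PROMISE data.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Ensemble learning: soft voting over two base classifiers (illustrative choice).
# Class imbalance: class_weight="balanced" reweights the minority (defective) class.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(class_weight="balanced", max_iter=1000)),
        ("rf", RandomForestClassifier(class_weight="balanced", random_state=0)),
    ],
    voting="soft",
)

# Feature selection sits inside the pipeline so it is refit on each CV fold,
# which avoids leaking information from the validation folds.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("ensemble", ensemble),
])

# Hyperparameter tuning: a deliberately small grid to keep the sketch fast.
param_grid = {
    "select__k": [5, 10],
    "ensemble__lr__C": [0.1, 1.0],
    "ensemble__rf__n_estimators": [50, 100],
}
search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="balanced_accuracy",  # less misleading than plain accuracy when classes are skewed
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X_train, y_train)

test_score = balanced_accuracy_score(y_test, search.predict(X_test))
print("best parameters:", search.best_params_)
print(f"balanced accuracy on held-out split: {test_score:.3f}")
```

Balanced accuracy is used here because plain accuracy can look high on an imbalanced dataset even when most defective modules are missed. A resampling method could replace the class weights, but it would need to be applied to the training folds only.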


 
