Predicting software test effectiveness

We recently published an original research article on measuring test effectiveness with machine learning techniques. The high cost of testing can be reduced dramatically, provided that coverability, as an inherent feature of the code under test, is predictable. The article offers a machine learning model that predicts the extent to which tests can cover a class, expressed in terms of a new metric called Coverageability. The prediction model is an ensemble of four regression models. The learning samples are feature vectors whose features are source code metrics computed for a class, and each sample is labeled with the Coverageability value computed for its corresponding class. We offer a mathematical model that evaluates test effectiveness in terms of the size and coverage of the test suite generated automatically for each class. We enlarge the feature space by introducing a new approach that defines submetrics in terms of existing source code metrics. Using feature importance analysis on the learned prediction models, we rank source code metrics in order of their impact on test effectiveness. This analysis identifies class strict cyclomatic complexity as the most influential source code metric. Our experiments with the prediction models on a large corpus of Java projects containing about 23,000 classes yield a Mean Absolute Error (MAE) of 0.032, a Mean Squared Error (MSE) of 0.004, and an R² score of 0.855. Compared with state-of-the-art coverage prediction models, our models improve the MAE, MSE, and R² score by 5.78%, 2.84%, and 20.71%, respectively.
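To make the evaluation vocabulary above concrete, here is a minimal Python sketch of how several regressors can be combined and scored with MAE, MSE, and R². It is not the paper's implementation: the summary does not say how the four regression models are combined, so simple averaging is an assumption here, and the names `ensemble_predict`, `toy_models`, and the metric name `strict_cyclomatic_complexity` as a dictionary key are hypothetical. The toy models and numbers are invented purely for illustration.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_squared_error(y_true, y_pred):
    """Average squared difference between labels and predictions."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def ensemble_predict(models, features):
    """Assumption: combine the regressors by plain averaging."""
    return mean(model(features) for model in models)


# Hypothetical stand-ins for four trained regressors. Each maps a
# feature vector (source code metrics of one class) to a predicted
# Coverageability value; the formulas are toy values, not learned ones.
toy_models = [
    lambda f: 1.0 - 0.04 * f["strict_cyclomatic_complexity"],
    lambda f: 1.0 - 0.05 * f["strict_cyclomatic_complexity"],
    lambda f: 1.0 - 0.06 * f["strict_cyclomatic_complexity"],
    lambda f: 1.0 - 0.05 * f["strict_cyclomatic_complexity"],
]

# Toy evaluation set: (features, Coverageability label) pairs.
samples = [
    ({"strict_cyclomatic_complexity": 2}, 0.90),
    ({"strict_cyclomatic_complexity": 6}, 0.70),
    ({"strict_cyclomatic_complexity": 10}, 0.45),
]

y_true = [label for _, label in samples]
y_pred = [ensemble_predict(toy_models, feats) for feats, _ in samples]

print("MAE:", mean_absolute_error(y_true, y_pred))
print("MSE:", mean_squared_error(y_true, y_pred))
print("R2: ", r2_score(y_true, y_pred))
```

Lower MAE and MSE and a higher R² indicate a better fit, which is the direction of the improvements reported above. In practice the stand-in lambdas would be replaced by regressors trained on the labeled feature vectors.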

The full text of the paper can be found here.

