TY - GEN
T1 - Application of a statistical methodology to simplify software quality metric models constructed using incomplete data samples
AU - Chan, Victor K.Y.
AU - Wong, W. Eric
AU - Xie, T. F.
PY - 2006
Y1 - 2006
N2 - During the construction of a software metric model, incomplete data often appear in the data sample used for the construction. Moreover, the decision on whether a particular predictor metric should be included is most likely based on an intuitive or experience-based assumption that the predictor metric has a statistically significant impact on the target metric. However, this assumption is usually not verifiable "retrospectively" after the model is constructed, leading to redundant predictor metric(s) and/or unnecessary predictor metric complexity. To solve all these problems, the authors have earlier derived a methodology consisting of the k-nearest neighbors (k-NN) imputation method, statistical hypothesis testing, and a "goodness-of-fit" criterion. Whilst the methodology has been applied successfully to software effort metric models, it has only recently been applied to software quality metric models, which usually suffer from far more serious incomplete data. This paper documents the latter application based on a successful case study.
AB - During the construction of a software metric model, incomplete data often appear in the data sample used for the construction. Moreover, the decision on whether a particular predictor metric should be included is most likely based on an intuitive or experience-based assumption that the predictor metric has a statistically significant impact on the target metric. However, this assumption is usually not verifiable "retrospectively" after the model is constructed, leading to redundant predictor metric(s) and/or unnecessary predictor metric complexity. To solve all these problems, the authors have earlier derived a methodology consisting of the k-nearest neighbors (k-NN) imputation method, statistical hypothesis testing, and a "goodness-of-fit" criterion. Whilst the methodology has been applied successfully to software effort metric models, it has only recently been applied to software quality metric models, which usually suffer from far more serious incomplete data. This paper documents the latter application based on a successful case study.
UR - http://www.scopus.com/inward/record.url?scp=34250775940&partnerID=8YFLogxK
U2 - 10.1109/QSIC.2006.13
DO - 10.1109/QSIC.2006.13
M3 - Conference contribution
AN - SCOPUS:34250775940
SN - 0769527183
SN - 9780769527185
T3 - Proceedings - International Conference on Quality Software
SP - 15
EP - 21
BT - Proceedings - Sixth International Conference on Quality Software, QSIC 2006
T2 - 6th International Conference on Quality Software, QSIC 2006
Y2 - 27 October 2006 through 28 October 2006
ER -