Software effort prediction models using maximum likelihood methods require multivariate normality of the software metrics data sample: Can such a sample be made multivariate normal?

Research output: Contribution to journal › Conference article › peer-review

1 Citation (Scopus)

Abstract

Missing data often appear in the software metrics data samples used to construct software effort prediction models. So far, the least biased, and thus the most strongly recommended, family of such models capable of handling missing data is the one using maximum likelihood methods. However, the theory behind maximum likelihood methods assumes that the data samples underlying model construction are multivariate normal. Previous research on such models has simply ignored the violation of this assumption by empirical data samples. This paper proposes and empirically illustrates a relatively simple but effective technique for transforming the data sample so that it meets this assumption. The technique is shown empirically to work for typical software metrics data samples, and the author recommends applying it in any further research on, and practical industrial application of, software effort prediction models using maximum likelihood methods.

Original language: English
Pages (from-to): 274-279
Number of pages: 6
Journal: Proceedings - International Computer Software and Applications Conference
Volume: 1
Publication status: Published - 2004
Event: Proceedings of the 28th Annual International Computer Software and Applications Conference, COMPSAC 2004 - Hong Kong, China, Hong Kong
Duration: 28 Sept 2004 to 30 Sept 2004
