## Abstract

Missing data often appear in the software metrics data samples used to construct software effort prediction models. To date, the least biased, and thus the most strongly recommended, family of models capable of handling missing data is the one based on maximum likelihood methods. However, the theory behind these methods assumes that the data samples underlying model construction are multivariate normal. Previous research on such models has simply ignored violations of this assumption by empirical data samples. This paper proposes and empirically illustrates a simple but effective technique for transforming the data sample so that it meets this assumption. The technique is shown empirically to work for typical software metrics data samples, and the author recommends applying it in any further research on, and practical industrial application of, software effort prediction models that use maximum likelihood methods.

Original language | English |
---|---|
Pages (from-to) | 274-279 |
Number of pages | 6 |
Journal | Proceedings - International Computer Software and Applications Conference |
Volume | 1 |
Publication status | Published - 2004 |
Event | Proceedings of the 28th Annual International Computer Software and Applications Conference, COMPSAC 2004 - Hong Kong, China. Duration: 28 Sept 2004 → 30 Sept 2004 |