Estimation of Pareto Distribution Functions from Samples Contaminated by Measurement Errors

Kondlo, Lwando Orbet

Files

Kondlo_MA_MSci_2010.pdf (38.66 MB)

Date

2010

Authors

Kondlo, Lwando Orbet

Publisher

University of the Western Cape

Abstract

Estimation of population distributions from samples that are contaminated by measurement errors is a common problem. This study considers the problem of estimating the population distribution of independent random variables Xi from error-contaminated samples Yi (i = 1, ..., n) such that Yi = Xi + εi, where ε is the measurement error, which is assumed independent of X. The measurement error ε is also assumed to be normally distributed. Since the observed distribution function is a convolution of the error distribution with the true underlying distribution, estimation of the latter is often referred to as a deconvolution problem. A thorough study of the relevant deconvolution literature in statistics is reported. We also deal with the specific case when X is assumed to follow a truncated Pareto form. If observations are subject to Gaussian errors, then the observed Y is distributed as the convolution of the finite-support Pareto and Gaussian error distributions. The convolved probability density function (PDF) and cumulative distribution function (CDF) of the finite-support Pareto and Gaussian distributions are derived. The intention is to draw more specific connections between certain deconvolution methods and also to demonstrate the application of the statistical theory of estimation in the presence of measurement error. A parametric methodology for deconvolution when the underlying distribution is of the Pareto form is developed. Maximum likelihood estimation (MLE) of the parameters of the convolved distributions is considered. Standard errors of the estimated parameters are calculated from the inverse Fisher's information matrix and a jackknife method. Probability-probability (P-P) plots and Kolmogorov-Smirnov (K-S) goodness-of-fit tests are used to evaluate the fit of the posited distribution. A bootstrapping method is used to calculate the critical values of the K-S test statistic, which are not available.
Simulated data are used to validate the methodology. A real-life application of the methodology is illustrated by fitting convolved distributions to astronomical data.
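
The parametric approach described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that the truncation limits and the error standard deviation are known, evaluates the convolved density by Simpson quadrature rather than by the closed form derived in the thesis, and replaces a proper optimiser with a coarse grid search over the Pareto index. All function names, parameter values, and the random seed are hypothetical choices made for illustration.

```python
import math
import random


def trunc_pareto_pdf(x, a, lo, hi):
    """Pareto density with index a truncated to [lo, hi].

    The caller is assumed to pass lo <= x <= hi; no support check is done.
    """
    return a * lo**a * x ** (-a - 1) / (1.0 - (lo / hi) ** a)


def normal_pdf(z, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def convolved_pdf(y, a, lo, hi, sigma, n=200):
    """Density of Y = X + eps, integrated over [lo, hi] by composite Simpson.

    n must be even, and the step (hi - lo) / n should be small relative to sigma.
    """
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        w = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += w * trunc_pareto_pdf(x, a, lo, hi) * normal_pdf(y - x, sigma)
    return total * h / 3


def sample_y(n, a, lo, hi, sigma, rng):
    """Simulate Y_i = X_i + eps_i using the inverse CDF of the truncated Pareto."""
    c = 1.0 - (lo / hi) ** a
    ys = []
    for _ in range(n):
        x = lo * (1.0 - rng.random() * c) ** (-1.0 / a)
        ys.append(x + rng.gauss(0.0, sigma))
    return ys


def neg_log_lik(a, ys, lo, hi, sigma):
    """Negative log-likelihood of the index a under the convolved model."""
    return -sum(math.log(convolved_pdf(y, a, lo, hi, sigma)) for y in ys)


# Illustrative settings (assumptions, not values from the thesis).
TRUE_A, LO, HI, SIGMA = 1.5, 1.0, 10.0, 0.3
rng = random.Random(2010)
ys = sample_y(200, TRUE_A, LO, HI, SIGMA, rng)

# Coarse grid search over the index a in [0.5, 3.0] with step 0.1.
grid = [0.5 + 0.1 * k for k in range(26)]
a_hat = min(grid, key=lambda a: neg_log_lik(a, ys, LO, HI, SIGMA))
print(f"true index = {TRUE_A}, grid-search MLE = {a_hat:.1f}")
```

A fuller treatment along the lines of the abstract would also estimate the truncation limits and the error variance, obtain standard errors from the inverse Fisher information matrix or a jackknife, and check the fit with a bootstrapped K-S test.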