Philosophiae Doctor - PhD (Statistics and Population Studies)
Browsing by Issue Date
Analysis and estimation of customer survival Time in subscription-based businesses (University of the Western Cape, 2008)
Mohammed, Zakariya Mohammed Salih; Kotze, Danelle; Maritz, Johannes Stefan; Dept. of Statistics; Faculty of Science
Subscription-based industries have seen a massive expansion in recent decades. In this type of industry the customer has to subscribe to be able to enjoy the service; therefore, well-defined start and end points of the customer relationship with the service provider are known. The length of this relationship, that is, the time from subscription to service cancellation, is defined as customer survival time. Unlike transaction-based businesses, where the emphasis is on the quality of a product and customer acquisition, subscription-based businesses focus on the customer and customer retention. A customer focus requires a new approach: managing according to customer equity (the value of a firm's customers) rather than brand equity (the value of a firm's brands). The concept of customer equity is attractive and straightforward, but the implementation and management of the customer equity approach do present some challenges. Amongst these challenges is that the customer asset metric, customer lifetime value (the present value of all future profits generated from a customer), depends upon assumptions about the expected survival time of the customer (Bell et al., 2002; Gupta and Lehmann, 2003). In addition, managing and valuing customers as an asset require extensive data and complex modelling. The aim of this study is to illustrate, adapt and develop methods of survival analysis in analysing and estimating customer survival time in subscription-based businesses. Two particular objectives are studied. The first objective is to redefine the existing survival analysis techniques in business terms and to discuss their uses in order to understand various issues related to the customer-firm relationship.
The lesson to be learnt here is the ability of survival analysis techniques to extract important information on customers with regard to their loyalties, risk of cancellation of the service, and lifetime value. The ultimate outcome of this process of studying customer survival time will be to understand the dynamics and behaviour of customers with respect to their risk of cancellation, survival probability and lifetime value. The estimates of customer mean survival time obtained from different nonparametric and parametric approaches, namely the Kaplan-Meier method and the exponential, Weibull and gamma regression models, were found to vary greatly, showing the importance of the assumption imposed on the distribution of the survival time. The second objective is to extrapolate the customer survival curve beyond the empirical distribution. The practical motivation for extrapolating the survival curve beyond the empirical distribution originates from two issues: calculating survival probabilities (retention rates) beyond the empirical data, and calculating the conditional survival probability and conditional mean survival time at a specific point in time and for a specific time window in the future. The survival probabilities are the main components needed to calculate customer lifetime value and thereafter customer equity. In this regard, we propose a survivor function that can be used to extrapolate the survival probabilities beyond the last observed failure time; the estimation of the parameters of the newly proposed extrapolation function is based completely on the Kaplan-Meier estimate of the survival probabilities. The proposed function has shown good mathematical accuracy. Furthermore, the standard error of the estimate of the extrapolation survival function has been derived. The function is ready to be used by business managers where the objective is to enhance customer retention and to emphasise a customer-centric approach.
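As a rough illustration of the Kaplan-Meier estimate and the conditional survival probability referred to in this abstract, the sketch below computes both for a small, made-up set of subscription durations. The data, function names and time units are assumptions chosen for illustration only; the sketch does not reproduce the extrapolation function proposed in the thesis.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct cancellation time.

    durations: time from subscription to cancellation or censoring
    observed:  1 if the cancellation was observed, 0 if censored
    """
    events = Counter(t for t, d in zip(durations, observed) if d)
    surv = 1.0
    curve = []
    for t in sorted(events):
        # Customers still subscribed just before time t.
        at_risk = sum(1 for u in durations if u >= t)
        surv *= 1.0 - events[t] / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve, t):
    """Step-function lookup of S(t) from the Kaplan-Meier curve."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


def conditional_survival(curve, t, window):
    """P(T > t + window | T > t) = S(t + window) / S(t)."""
    return survival_at(curve, t + window) / survival_at(curve, t)


# Hypothetical durations in months; 0 marks a still-active (censored) customer.
durations = [2, 3, 3, 5, 8, 8, 12, 12]
observed = [1, 1, 0, 1, 1, 0, 0, 0]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"S({t}) = {s:.3f}")
print(f"P(T > 8 | T > 3) = {conditional_survival(curve, 3, 5):.3f}")
```

The estimate is only defined up to the last observed time, which is the limitation that motivates extrapolating the curve.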
The extrapolation function can be applied beyond customer survival time data to cover clinical trial applications. In general, the survival analysis techniques were found to be valuable in understanding and managing a customer-firm relationship; yet much still needs to be done in this area of research to make these techniques, which are traditionally used in medical studies, more useful and applicable in business settings.

Contraception and unmet-needs in Africa (University of the Western Cape, 2009)
Stiegler, Nancy; Dept. of Statistics; Faculty of Science
The first objective of this study is to show whether the diffusion of contraception in areas of traditionally high fertility has gone through profound changes. Indeed, we would like to know whether contraceptive behaviours have evolved because of new fertility perceptions and also because partners now have greater freedom to make choices in a relationship. The second objective of this study is not only to highlight the levels and trends of contraception and the factors influencing its use (government policies, role of family planning, etc.) in developing countries, but also to consider the population with unmet needs for contraception. Indeed, the level of contraceptive use depends obviously on users, but also on non-users with no needs and non-users with unsatisfied needs. Understanding this last category of women is essential to a more accurate estimation of contraception levels and, therefore, of fertility levels. This study analyses contraceptive use in several developing countries in Africa and highlights the unsatisfied needs for contraception, to understand why such needs exist. To do so, we analyse available demographic data for thirty-five African countries using the available Demographic and Health Surveys (DHS) from the 1980s to the 2000s, considering the DHS I, DHS II, DHS III and DHS IV.
This great variety of surveys, seventy-nine in total, permits one to compare levels of contraception and 'unmet-needs' from country to country. The surveys also make it possible to compare the evolution over time of specific countries or specific regions, and subsequently to comprehend the determining factors of contraceptive use or non-use.

A framework for evaluating an introductory statistics programme at the University of the Western Cape (University of the Western Cape, 2009)
Makapela, Nomawabo; Kotze, Danelle; Dept. of Statistics; Faculty of Science
There have been calls from both the government and the private sector for Higher Education institutions to introduce programmes that produce employable graduates whilst at the same time contributing to the growing economy of the country by addressing the skills shortage. Transformation and intervention committees have since been introduced to follow the extent to which the challenges are being addressed (DOE, 1996; 1997; Luescher and Symes, 2003; Forbes, 2007). Amongst the issues that needed to be addressed urgently were the skills shortage and the underperformance of students, particularly university-entering students (Daniels, 2007; De Klerk, 2006; Cooper, 2001). Research, particularly in the South African context, has revealed that the factors contributing to the underperformance of university-entering students and the shortage of skills are: the legacy of apartheid (forcing certain racial groups to focus on selected areas such as teaching and nursing), the schooling system (leaving university-entering students to struggle), the home language and the academic language. Barrell (1998) places stress on language as a contributing factor towards the performance of students. Although not much research has been done on the skills shortage, most of the areas with a skills shortage require Mathematics, either on a minimum or comprehensive scale.
Students who have a strong Mathematics background have proved to perform better than students who had limited or no Mathematics in Grade 12 (Hahn, 1988; Conners, McCown & Roskos-Ewoldsen, 1998; Nolan, 2002). The Department of Statistics offers an Introductory Statistics (IS) course at first-year level. Resources available to enhance student learning include a problem-solving component with web-based tutorials, and students attend lectures three hours per week. The course material and all the necessary information regarding the course, including teach-yourself problems and useful websites and links that students can make use of, are stored under the Knowledge Environment for Web-based Learning (KEWL). Despite all the available information, the students were not performing well and were not interested in the course. The department regards statistical numeracy as a life skill. The desire of the department is to break down the fear of Statistics and to bring about a change of perspective in students' mindsets. The study was part of a contribution to ensuring that the department has the best first-year students in Statistics in the Western Cape, achieving a success rate comparable to the national norm.

Methodological approach of the spatial distribution of maternal mortality in Burkina Faso and explanatory factors associated (University of the Western Cape, 2013)
Lougue, Siaka; Susuman, Sathiya A.
Maternal mortality is one of the most important problems related to reproductive health. This is why the reduction of maternal mortality by three quarters by 2015 was fixed as target No. 5 of the Millennium Development Goals (MDGs). Achieving this goal requires an annual decline of 5.5% in maternal mortality between 1990 and 2015. Unfortunately, the reduction as estimated in 1997 was less than 1% per year. Africa is the continent most affected by this problem.
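The 5.5% annual decline quoted in this abstract can be checked with a short calculation. The sketch below assumes a constant proportional decline over the 25 years from 1990 to 2015; the variable names are illustrative and not taken from the thesis.

```python
import math

# A reduction by three quarters leaves one quarter of the 1990 level.
target_ratio = 0.25
years = 2015 - 1990

# Continuously compounded annual rate of decline: ln(4) / 25.
continuous_rate = -math.log(target_ratio) / years

# Equivalent year-on-year percentage decline: 1 - 0.25 ** (1 / 25).
compound_rate = 1 - target_ratio ** (1 / years)

print(f"continuous rate: {continuous_rate:.2%}")
print(f"year-on-year decline: {compound_rate:.2%}")
```

Under this assumption the continuously compounded rate rounds to 5.5% per year and the equivalent year-on-year decline to 5.4%.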
In 2010, the number of maternal deaths in the world was estimated at 287 000, and Africa accounted for more than 52% (148 000) of them. In Burkina Faso, the maternal mortality ratio decreased from 566 in 1991 to 484 in 1998 and 341 in 2010 according to the DHS data, while the census estimate was 307 in 2006 and United Nations agencies reported 300 maternal deaths per 100 000 live births in 2010. Statistics provided by the different sources vary considerably. This situation creates confusion among data users. In addition, research on the issue remains very insufficient because of the complexity of the issue, the lack of data and the poor quality of existing data on maternal mortality. This study was initiated to fill the gap in knowledge about the determinants and estimates of maternal mortality at national and sub-national levels. The results of this research highlight explanatory factors of maternal mortality at national and regional level, with a focus on the factors behind regional disparities. The findings also provide estimates obtained by adjusting the 2006 census data for missingness and inconsistencies, improving the census method and testing several other methods. Finally, maternal mortality levels are projected from 2006 to 2050.

Female migration and housing in South Africa: evidence from the 2007 community survey (University of the Western Cape, 2013)
Nsengiyumva, Philomene; Tati, Gabriel
Throughout the world, growing evidence suggests an increase of female migrants in migration streams. In the context of South Africa, women are not exempted from migration mechanisms. This new migration phenomenon is observed to influence housing accessibility among female migrants in the areas of destination, specifically in metropolitan and non-metropolitan areas of South Africa. Yet little is known about the forms of housing tenure female migrants use to acquire a place to live in.
The methods of housing acquisition of female migrants are still imperfectly documented. Moreover, it is not clear how housing tenure differs among female migrants between metropolitan and non-metropolitan areas. The factors determining housing tenure, and the extent to which those factors are selective towards women in the places of destination, are not properly elaborated in the existing body of knowledge. The aim of this research is to highlight the relationship between female migration and housing acquisition in South Africa by specifically looking at household headship from a gender perspective, and at how housing acquisition differs between metropolitan and non-metropolitan areas of South Africa. It is assumed that inasmuch as migration is selective, so is a really differentiated selectivity of such places as metropolises and non-metropolises. This research makes use of the 2007 Community Survey secondary data derived from Statistics South Africa. The data analysis was carried out, first, by means of univariate analysis, cross-tabulation, and the Chi-square statistical test for association. Logistic regression analysis was then used to identify the determining factors of housing tenure among female migrants. Two groups of female migrants were considered, namely female migrants heading households and those who were not heading households. The units of analysis were metropolitan and non-metropolitan areas. This research focuses on internal female migration and housing in South Africa by examining different socio-demographic, socioeconomic, migratory, household and housing attributes, taking into account variables such as age, population group, marital status and level of education, to name a few.
By bringing together female migrants' characteristics, migratory characteristics and housing characteristics, the study found that female migrants heading households and living in metropolitan areas are more likely to stay in rented dwellings, while those living outside metropolitan areas (non-metropolitan areas) were highly represented in owned and fully paid dwellings. The study found further that, besides duration of residence, housing structure type, especially the availability of standalone housing, increases the likelihood of staying in owned and fully paid housing. The study concludes that this new female migration stream creates more tension and pressure on housing provision in metropolitan areas than in non-metropolitan areas. Thus, policy makers should be aware of female migration and its impact on the housing sector in order to plan accordingly.

Statistical modelling of clustered and incomplete data with applications in population health studies in developing countries (University of the Western Cape, 2014)
Adegboye, Oyelola Abdulwasiu; Kotze, Danelle
The United Nations (UN) Millennium Development Goals (MDGs) set out eight goals to be achieved by the year 2015, namely: eradicating extreme poverty and hunger, achieving universal primary education, promoting gender equality and women's empowerment, reducing child mortality, improving maternal health, combating HIV/AIDS, malaria and other diseases, ensuring environmental sustainability and, lastly, developing a global partnership for development. Many public health studies result in complicated and complex data sets, which may be clustered, multivariate, longitudinal, hierarchical, spatial, temporal or spatio-temporal in nature. This often results in what is called correlated data, because the assumption of independence among observations may not be appropriate.
The shared genetic traits in studies of illness, or shared household characteristics among family members in studies of poverty, are examples of correlated data. In cross-sectional studies, individuals may be nested within sub-clusters (e.g., families) that are nested within clusters (e.g., environment), thus causing correlation within clusters. Ignoring the structure of the data may result in asymptotically biased parameter estimates. Clustered data may also be a result of geographical location or time (spatial and temporal). A crucial step in modelling correlated data is the specification of the dependency by choosing the covariance/correlation function. However, the choice for a particular application is often unclear, and diagnostic tests will have to be carried out following the fitting of a model. With a focus on developing countries, this study investigates the prospects of achieving the MDGs through the development of flexible predictor statistical models. The first objective of this study is to explore the existing methods for modelling correlated data sets (hierarchical, multilevel and spatial) and then apply the methods in a novel way to several data sets addressing the underlying MDGs. One of the most challenging issues in spatial or spatio-temporal analysis is the choice of a valid and yet flexible correlation (covariance) structure. In cases of high dimensionality of the data, where the number of spatial locations or time points that produced the observations is large, the analysis of such data presents great computational challenges. It is debatable whether some of the classical correlation structures adequately reflect the dependency in the data. The second objective is to propose a new flexible technique for handling spatial, temporal and spatio-temporal correlations. The goal of this study is to resolve the dependency problems by proposing a more robust method for modelling spatial correlation.
The techniques are used for different correlation structures and then combined to form the resulting estimating equations using the platform of the Generalized Method of Moments. The proposed model will therefore be built on a foundation of the Generalized Estimating Equations; this has the advantage of producing consistent regression parameter estimates under mild conditions, because the estimation of the regression parameters is separated from the modelling of the correlation. Thirdly, to account for spatio-temporal correlation in data sets, a method that decouples the two sources of correlation is proposed. Specifically, the spatial and temporal effects were modelled separately and then combined optimally. The approach circumvents the need to invert the full covariance matrix and simplifies the modelling of complex relationships such as anisotropy, which is known to be extremely difficult or [...]. Lastly, large public health data sets contain a high proportion of zero counts, where it is very difficult to distinguish between "true" zeros and "imputed" zeros. This can be due to the reporting mechanism, as a result of insecurity or technical and logistical issues. The focus is therefore on the implementation of a technique that is capable of handling such a problem. The study makes the assumption that "imputed" zeros are a random event, considers the option of discarding the zeros, and then fits a conditional Poisson model, conditioning on all cases greater than 0.

An investigation into the health and well-being of older people in South Africa. (University of the Western Cape, 2014)
Chirinda, Witness; Susuman, Sathiya A.
Populations are rapidly growing older across the globe. In South Africa, life expectancy has been on the increase over the past decade, and the proportion of older people is projected to increase dramatically over the coming years.
Whilst this is a remarkable achievement, it does not mean that the additional years of life will be healthy. To this end, the question being asked by researchers and policy makers is whether people are living longer and healthier lives. In order to answer this important question, health expectancies have been developed, which combine morbidity and mortality data into a single index that measures population health. Health expectancies have become standard measures of population health across first world countries. Unfortunately, there is little awareness of their use in developing countries, including South Africa. The aim of this study was to estimate health expectancies based on various objective and subjective measures, in order to give a first comprehensive analysis of the health and wellbeing of older people in South Africa. The data were drawn from two nationally representative surveys, namely the WHO Study on Global Ageing and Adult Health (SAGE) and the South African National HIV Incidence, Prevalence, Behaviour and Communication Survey (SABSSM). The results are presented in the form of five manuscripts, each submitted for publication. The first manuscript estimates sexually active life expectancies and factors associated with sexual activity. The results show that older people are gaining more years of sexual activity. HIV in older women and chronic conditions in older men reduced the odds of sexual activity. The second manuscript found that there was both absolute and relative compression of morbidity in older people between 2005 and 2012, based on a self-rated health measure. The third manuscript estimates happy life expectancy and examines factors associated with happiness in older people. Happy life expectancy was greater for men than for women, and wealth status was the strongest predictor of happiness. In the fourth manuscript, subjective and objective measures were used to estimate health expectancies.
The former showed a more positive outlook compared to the latter. Gender differentials were evident in that although women live longer than men, they spent a greater part of their lifetime in poorer health than men. The fifth manuscript goes a crucial step further, to estimate the contribution of specific diseases to disability. This is important for policymakers as it identifies entry points for interventions aimed at reducing the onset and burden of disability in the elderly population. The largest contributors to disability were musculoskeletal and cardiovascular diseases. The thesis concludes that the health of older people is complex and multidimensional, and therefore requires several measures to give a comprehensive analysis. When measured using subjective measures, it can be concluded that the health of older people has been improving. However, a different conclusion could be reached if objective measures are used. It is important to continue to monitor the health status of older people, and to make appropriate interventions in order to improve their health, wellbeing and quality of life.

Public sector spending in Nigeria: implications for poverty, demographic changes and millennium development goals target (University of the Western Cape, 2015)
Kanayo, Ogujiuba; Stiegler, Nancy
Over the last two decades, budgetary allocations to both the Health and Education sectors have been on the increase in Nigeria, while counterfactual feedback on their effects for various economic groups, and their distributional effect across different population households, has not been well defined or understood. The resultant effect has been gross inefficiency and sub-optimality in terms of the observed outcomes of the fiscal framework. In addition, there has been a continuous quest by the citizenry for increased allocations to these sectors because of their supposed impact on the poverty index and standard of living.
Although this is a compelling reason, what is worrisome and equally troubling is that the increasing incidence of poverty and expanding inequality in Nigerian society have not been mitigated, despite the scaling up of funding for the social sectors. Furthermore, the current level of socioeconomic development in Nigeria is not in tandem with the distributive outcome targets set by the 2004 reforms. Thus, understanding the current structure of poverty in Nigeria, as well as the beneficiaries of public sector spending, provides a sound basis for tackling inequality and redesigning the current pro-poor frameworks. However, our analysis is focused on the distributional spread of beneficiaries from services and the counterfactual reciprocity of expenditure benefits, rather than on measuring the exact value to recipients of government-sponsored services. Our research methodology used the 2004 Nigerian Living Standard Survey, the 2010 Harmonized Nigerian Living Standard Survey, recent cross-sectional data (2014) from South East Nigeria, and secondary sources. Econometric methods (Error Correction Method), Marginal Odds estimation techniques, Concentration Curves and Ordered Logistic Regression were used for our analysis. Statistical and econometric software (E-Views, SPSS, DAD and STATA) was used for estimation. Econometric results showed misalignments between population dynamics and public sector expenditure on education, health and economic services. Government consumption expenditure was not sensitive to demographic changes. The derived adjustment coefficients of -1.38, -1.51 and 0.51 for education, health and economic services, respectively, indicate huge gaps in terms of what optimal spending should have been, given the population dynamics.
Our benefit incidence analysis indicates that substantive gains have been made at the primary education and health care level at the state level for SE Nigeria, but there is a gross misapplication of funds at the secondary and tertiary levels of both the education and health sectors. Results show that the state governments are subsidizing the rich at both the secondary and tertiary levels of education and health care. In addition, country-wide results indicate that, apart from public primary education and health care for urban residents, no other level of social service was absolutely progressive in general terms, by gender or by location, while the tertiary level of both services was regressive, as shown by the 2010 survey results in comparison to the 2004 survey results. Using Ordered Logistic Regression, our results incline to the lifecycle hypothesis, which maintains that poverty oscillates depending on age. At a younger age it tends to be high; it decreases during middle age and increases again at older ages. Our results discard the general feminization-of-poverty framework, which holds that women or female-headed households are more prone to poverty, due principally to low education and lack of opportunity to own assets such as land, amongst others. This was not the case for the South East Region of Nigeria. Estimates indicate that education status, health status and access to health facilities affected the welfare category of heads of households and, invariably, of the entire household. In general, our analysis shows misalignment of social expenditure for various population groups at both the federal and state levels, making the realization of basic MDGs doubtful. Nigeria has to combine growth policies with ensuring that demographics count, with the poor fully participating in economic development. Also, the need for a refocusing of resource allocation that takes gender dimensions into account cannot be overemphasized.
A general re-allocation of spending towards females and poorer households would lead to improvement in gender equality and in the health status of women and children. Expediting actions towards quality education will lead directly to an acceleration of many of the other MDGs, especially those focusing on the reduction of poverty and inequality. To attain MDG targets (post 2015) within a shorter period of time, there is a need to improve the quality of social infrastructure and services. Furthermore, research should be focused on improving knowledge and understanding of which policies, technologies and investments matter for sustained growth in the country. This will create the much needed multiplier effect on other aggregates. The degree to which the poor participate in the growth process and share in its proceeds matters, in both the pace and the pattern of growth. It is therefore important to categorize the population into economic groups when formulating a developmental framework for poverty reduction programmes. The study recommends sequencing of interventions, strengthening of institutions and several other interrelated areas to attain effectiveness of public sector spending.

Determinants of youth sexual behaviours and knowledge of reproductive tract infections (RTIs) and sexually transmitted infections (STIs) in Malawi : evidenced from the Demographic Health Survey 2010 (University of the Western Cape, 2015)
Ningpuanyeh, Wilson Chialepeh; Susuman, Sathiya A.
The sexual behaviour of youths is believed to play a role in the spread of Sexually Transmitted Infections (STIs) and Reproductive Tract Infections (RTIs). This study examines the determinants of youth sexual behaviours and knowledge of reproductive tract infections (RTIs) and sexually transmitted infections (STIs) in Malawi.
It explores rural/urban differentials in sexual behaviours using indicators such as early sexual initiation, multiple sexual partnerships, and non-use of condoms, in order to establish policy recommendations toward improving sexual behaviour among youths. The Malawi Demographic Health Survey 2010 data was used. Out of a sample of 2987 males and 9559 females aged 15-24 years, 5652 females and 1405 males (condom use), 675 females and 511 males (inconsistent condom use), 6470 females and 2026 males (multiple sexual partnerships (MSP)), and 15217 females and 1405 males (early sexual debut) were filtered into the study. Chi-square and logistic regression techniques were used to test for association between sexual behaviour indicators and socio-demographic variables. The prevalence of non-use of condoms was higher among Catholic females (OR=1.11), lower among Muslim males (OR=0.81) and higher among CCAP females (OR=1.19). Muslim females were more likely (OR=1.42) to initiate sexual activity early, while Muslim males were less likely (OR=0.57) to initiate sexual activity early. Females in the central region (OR=1.51) and Catholic males (OR=1.63) were more likely to have more sexual partners. Encouraging these young people to be faithful to one uninfected partner, abstain from sexual activity, use condoms consistently and delay sexual initiation will help curb the spread of STIs in Malawi.

A forgotten diaspora : forced Indian Migration to the Cape Colony, 1658 to 1834 (University of the Western Cape, 2015)
Rama, Parbavati; Shell, Robert C. H.; Stiegler, Nancy
This thesis aims to explore Indian forced migration to the Cape Colony from 1658 to 1834. The 'forgotten diaspora' of its title refers to the first Indians who had come to the shores of South Africa, long before the arrival, between 1860 and 1911, of the indentured Indians. This diaspora has been forgotten, partially because these migrants came as slaves.
The author uses data extracted from the newly transcribed Master of the Orphan Chamber (MOOC) series and slave transfers, which are housed in the Western Cape Provincial Archives and Records Service (WCARS). The Cape colonial data is considered among the best in the world. Earlier historians such as Victor de Kock, Anna Böeseken, Frank Bradlow and Margaret Cairns have made us aware of their existence, primarily through Transportenkennis and Schepenkennis (transport and shipping information) documents in the Deeds Registry. Not nearly enough, however, is known about these Indian slaves, especially about those who arrived between 1731 and 1834. These lacunae include the number of arrivals; their sex ratios, ages and origins; and the circumstances under which they came. This thesis aims to construct a census of Indian slaves brought to the Cape from 1658 to 1834, along the lines of Philip Curtin's aggregated census of the Trans-Atlantic slave trade, but based on individual case-level data coded directly from primary sources. This is the first time the size of the creole population born at the Cape will be established.

Survival modelling and analysis of HIV/AIDS patients on HIV care and antiretroviral treatment to determine longevity prognostic factors (University of the Western Cape, 2016)
Maposa, Innocent; Blignaut, Renette
The HIV/AIDS pandemic has been a torment to the African developmental agenda, especially in the Southern African Development Community (SADC), for the past two decades. The disease tends to affect the productive age groups. Children have also not been spared from the severe effects associated with the disease. The advent of antiretroviral treatment (ART) has brought great relief to governments and patients in these regions. More people living with HIV/AIDS have experienced a boost in their survival prospects and hence in their contribution to national developmental projects.
Survival analysis methods are usually used in biostatistics, epidemiological modelling and clinical research to model time-to-event data. The most interesting aspect of this analysis comes when survival models are used to determine risk factors for the survival of patients undergoing some treatment or living with a certain disease condition. The purpose of this thesis was to determine prognostic risk factors for patients' survival whilst on ART. The study sought to highlight the risk factors that impact the survival time negatively at different survival time points. The study utilized a sample of paediatric and adult datasets from Namibia and Zimbabwe respectively. The paediatric dataset from Katutura hospital (Namibia) comprised the adolescents and children on ART, whilst the adult dataset from Bulawayo hospital (Zimbabwe) comprised those patients on ART in the 15 years and above age categories. All datasets used in this thesis were based on retrospective cohorts followed for some period of time. Different methods to reduce errors in parameter estimation were applied to the datasets. The proportional hazards, Bayesian proportional hazards and the censored quantile regression models were utilized in this study. The results from the proportional hazards model show that most of the variables considered were not significant overall. The Bayesian proportional hazards model shows us that all the considered factors had different risk profiles at the different quartiles of the survival times. This highlights that by using the proportional hazards models, we only get a fixed constant effect of the risk factors, yet in reality, the effect of risk factors differs at different survival time points. 
This picture was strongly highlighted by the censored quantile regression model, which indicated that some variables were significant in the early periods of initiation whilst they did not significantly affect survival time at any other points in the survival time distribution. The censored quantile regression models clearly demonstrate that there are significant insights gained on the dynamics of how different prognostic risk factors affect patient survival time across the survival time distribution compared to when we use proportional hazards and Bayesian proportional hazards models. However, the advantages of using the proportional hazards framework, due to the estimation of hazard rates as well as its application in the competing risk framework, are still unassailable. The hazard rate estimation under the censored quantile regression framework is an area that is still under development and the computational aspects are yet to be incorporated into mainstream statistical software. This study concludes that, with the current literature and computational support, using both model frameworks to ascertain the dynamic effects of different prognostic risk factors for survival in people living with HIV/AIDS and on ART would give the researchers more insights. These insights will then help public health policy makers to draft relevant targeted policies aimed at improving these patients' survival time on treatment.

Item Health inequalities of children in sub-Saharan Africa from 1990 to 2010 : comparative analysis using data from Health and Demographic Surveys (University of the Western Cape, 2016) Bado, Aristide Romaric; Susuman, Sathiya A.

This study is based on the assumption that the under-five mortality rate, in recent decades, has declined, particularly in developing countries. 
However, not all social strata across many countries seem to benefit from this reduction in mortality, and mortality remains abnormally high among children, especially those from underprivileged social strata. This research is, therefore, a holistic approach to analyse and quantify the inequalities of health among children under five in sub-Saharan Africa over the last two decades (1990-2010). The research sought to investigate the trend and determinants of health inequalities (mortality and morbidity) among children under five in sub-Saharan Africa (SSA) from 1990 to 2010. Particular attention was devoted to the decomposition of effects and analysis of the contribution of the factors explaining these inequalities. The data used in the study come from Demographic and Health Surveys (DHS) done between 1990 and 2015 in sub-Saharan African countries. In order to analyse the inequalities in trends of mortality and morbidity of children, the study selected countries that had conducted at least three DHS during the 1990-2010 period. Several statistical methods were used for data analysis. Four chapters were prepared in article style. For the first paper, titled "Decomposing Inequalities in Under-Five Mortality in Selected African Countries", the concentration index (CI) and a Generalised Linear Model (GLM) with a logit link were used to analyse and measure under-five mortality inequalities and the associated factors. This paper has been published in the Iranian Journal of Public Health. For the second paper, titled "Determinants of Under-Five Mortality in Burkina Faso: A Concentration Dimension", logistic regression and the Oaxaca-Blinder decomposition method for binary outcomes were used to analyse the data. 
For data analysis of the third paper, titled "Women Education, Health Inequalities in Under-Five Mortality in sub-Saharan Africa, 1990 – 2013", logistic regression and Buis's decomposition method were used to examine the effect of mother's education level on childhood mortality. In the fourth paper, titled "Trends and Risk Factors for Childhood Diarrheal in sub-Saharan Countries (1990-2010): Assessing the Neighbourhood Inequalities", multilevel logistic regression modelling was used to determine the fixed and random effects of the risk factors associated with diarrheal morbidity. The work carried out during this thesis helps to understand the magnitude of inequalities in under-five mortality in sub-Saharan countries. The findings showed that the contributing factors of inequalities of child mortality were birth order, maternal age, parity and household size. With regard to the relationship between mother's education level and inequalities in mortality of children under five in sub-Saharan Africa, findings showed that children of mothers who did not attend school have a higher rate of death than those whose mothers had been to school. However, we have observed that the inequalities have narrowed over time. The results showed that the risk factors of diarrheal morbidity varied from one country to another, but the main factors included: child's age, the size of the child at birth, the quality of the main floor material, mother's education and her occupation, type of toilet, and place of residence. In conclusion, the results of this study show that inequalities in under-five mortality are still important among different social strata in sub-Saharan African countries. 
It is therefore urgent to take action to save the lives of children in disadvantaged social strata.

Item Some non-standard statistical dependence problems (University of the Western Cape, 2016) Bere, Alphonce; Koen, Chris

The major result of this thesis is the development of a framework for the application of pair-mixtures of copulas to model asymmetric dependencies in bivariate data. The main motivation is the inadequacy of mixtures of bivariate Gaussian models which are commonly fitted to data. Mixtures of rotated single parameter Archimedean and Gaussian copulas are fitted to real data sets. The method of maximum likelihood is used for parameter estimation. Goodness-of-fit tests performed on the models giving the highest log-likelihood values show that the models fit the data well. We use mixtures of univariate Gaussian models and mixtures of regression models to investigate the existence of bimodality in the distribution of the widths of autocorrelation functions in a sample of 119 gamma-ray bursts. Contrary to previous findings, our results do not reveal any evidence of bimodality. We extend a study by Genest et al. (2012) of the power and significance levels of tests of copula symmetry, to two copula models which have not been considered previously. Our results confirm that for small sample sizes, these tests fail to maintain their 5% significance level and that the Cramér-von Mises-type statistics are the most powerful.

Item Imputation techniques for non-ordered categorical missing data (University of the Western Cape, 2016) Karangwa, Innocent; Kotze, Danelle; Blignaut, Renette

Missing data are common in survey data sets. Enrolled subjects often do not have data recorded for all variables of interest. The inappropriate handling of missing data may lead to bias in the estimates and incorrect inferences. Therefore, special attention is needed when analysing incomplete data. 
The multivariate normal imputation (MVNI) and the multiple imputation by chained equations (MICE) have emerged as the best techniques to impute, or fill in, missing data. The former assumes a normal distribution of the variables in the imputation model, but can also handle missing data whose distributions are not normal. The latter fills in missing values taking into account the distributional form of the variables to be imputed. The aim of this study was to determine the performance of these methods when data are missing at random (MAR) or completely at random (MCAR) on unordered or nominal categorical variables treated as predictors or response variables in the regression models. Both dichotomous and polytomous variables were considered in the analysis. The baseline data used was the 2007 Demographic and Health Survey (DHS) from the Democratic Republic of Congo. The analysis model of interest was the logistic regression model of the woman’s contraceptive method use status on her marital status, controlling or not for other covariates (continuous, nominal and ordinal). Based on the data set with missing values, data sets with missing at random and missing completely at random observations on either the covariates or response variables measured on nominal scale were first simulated, and then used for imputation purposes. Under the MVNI method, unordered categorical variables were first dichotomised, and then K − 1 (where K is the number of levels of the categorical variable of interest) dichotomised variables were included in the imputation model, leaving the other category as a reference. These variables were imputed as continuous variables using a linear regression model. Imputation with MICE considered the distributional form of each variable to be imputed. That is, imputations were drawn using binary and multinomial logistic regressions for dichotomous and polytomous variables respectively. 
The performance of these methods was evaluated in terms of bias and standard errors in regression coefficients that were estimated to determine the association between the woman’s contraceptive methods use status and her marital status, controlling or not for other types of variables. The analysis was first done assuming that the sample was not weighted; then the sample weight was taken into account to assess whether the sample design would affect the performance of the multiple imputation methods of interest, namely MVNI and MICE. As expected, the results showed that for all the models, MVNI and MICE produced less biased estimates and smaller standard errors than the case deletion (CD) method, which discards items with missing values from the analysis. Moreover, it was found that when data were missing (MCAR or MAR) on the nominal variables that were treated as predictors in the regression model, MVNI reduced bias in the regression coefficients and standard errors compared to MICE, for both unweighted and weighted data sets. On the other hand, the results indicated that MICE outperformed MVNI when data were missing on the response variables, either binary or polytomous. Furthermore, it was noted that the sample design (sample weights), the rates of missingness and the missing data mechanisms (MCAR or MAR) did not affect the behaviour of the multiple imputation methods that were considered in this study. Thus, based on these results, it can be concluded that when missing values are present on the outcome variables measured on a nominal scale in regression models, the distributional form of the variable with missing values should be taken into account. When these variables are used as predictors (with missing observations), the parametric imputation approach (MVNI) would be a better option than MICE.

Item Proximate determinants of fertility and contraceptive use among currently married women in Ethiopia (University of the Western Cape, 2017) Lailulo, Yishak Abraham; Susuman, A. Sathiya

Fertility is one of the elements in population dynamics that contributes significantly to changing population size and structure over time. In Ethiopia, fertility dropped only slightly between 2000 and 2005, from 5.5 children per woman to 5.4, and then decreased further to 4.8 children in 2011 (CSA, 2012). Although a slight decreasing trend has been observed from year to year, fertility is still high compared to developed nations (Tewodros, 2011). The age at which childbearing begins is an important factor in the overall level of fertility as well as in the health and well-being of the mother and the child (CSA, 2012). In 2008, of the 1.4 billion women in the developing world of reproductive age (15-49 years), more than 570 women die per 100,000 live births, and 70 percent of them die due to totally avoidable reasons (World Bank, 2010). These women live in countries where their status is poor to extremely poor, and these conditions threaten their health in many ways. Sedgh, Hussain, Bankole, and Singh (2007) found that wherever fertility is high, maternal, infant and child mortality rates are high. In addition, high fertility and shorter birth intervals affect the survival chance of children and the health status of mothers. Demographic and Health Surveys (DHS) data from 18 developing countries in Asia, Latin America, Africa, and the Middle East showed that a birth interval of three years increases the survival status of under-five children (Rutstein, 2003). Moreover, a similar survey of 52 developing countries found that markedly short birth intervals have a negative effect on pregnancy outcomes, increased morbidity in pregnancy, and increased infant and child mortality (Rutstein, 2005). 
Setty-Venugopal and Upadhyay (2002) have documented that, in Sub-Saharan Africa, about 60% of women deliver the next child before the index child celebrates his/her third birthday, and almost a quarter before the second birthday.

Item Missing imputation methods explored in big data analytics (University of the Western Cape, 2018) Brydon, Humphrey Charles; Blignaut, Renette

The aim of this study is to look at the methods and processes involved in imputing missing data and, more specifically, complete missing blocks of data. A further aim of this study is to look at the effect that the imputed data has on the accuracy of various predictive models constructed on the imputed data and hence determine if the imputation method involved is suitable. The identification of the missingness mechanism present in the data should be the first process to follow in order to identify a possible imputation method. The identification of a suitable imputation method is easier if the mechanism can be identified as one of the following: missing completely at random (MCAR), missing at random (MAR) or not missing at random (NMAR). Predictive models constructed on the complete imputed data sets are shown to be less accurate for those models constructed on data sets which employed a hot-deck imputation method. The data sets which employed either a single or multiple Markov Chain Monte Carlo (MCMC) or the Fully Conditional Specification (FCS) imputation methods are shown to result in predictive models that are more accurate. The addition of an iterative bagging technique in the modelling procedure is shown to produce highly accurate prediction estimates. The bagging technique is applied to variants of the neural network, a decision tree and a multiple linear regression (MLR) modelling procedure. A stochastic gradient boosted decision tree (SGBT) is also constructed as a comparison to the bagged decision tree. 
Final models are constructed from 200 iterations of the various modelling procedures using a 60% sampling ratio in the bagging procedure. It is further shown that the addition of the bagging technique in the MLR modelling procedure can produce an MLR model that is more accurate than that of the other more advanced modelling procedures under certain conditions. The evaluation of the predictive models constructed on imputed data is shown to vary based on the type of fit statistic used. It is shown that the average squared error reports little difference in the accuracy levels when compared to the results of the Mean Absolute Prediction Error (MAPE). The MAPE fit statistic is able to magnify the difference in the prediction errors reported. The Normalized Mean Bias Error (NMBE) results show that all predictive models constructed produced estimates that were an over-prediction, although these did vary depending on the data set and modelling procedure used. The Nash-Sutcliffe efficiency (NSE) was used as a comparison statistic to compare the accuracy of the predictive models in the context of imputed data. The NSE statistic showed that the estimates of the models constructed on the imputed data sets employing a multiple imputation method were highly accurate. The NSE statistic results reported that the estimates from the predictive models constructed on the hot-deck imputed data were inaccurate and that a mean substitution of the fully observed data would have been a better method of imputation. The conclusion reached in this study shows that the choice of imputation method as well as that of the predictive model is dependent on the data used. 
Four unique combinations of imputation methods and modelling procedures were concluded for the data considered in this study.

Item Differentials in unemployment duration across households in South Africa: A two-level modelling approach (University of the Western Cape, 2018) Lartey, Nathaniel; Tati, Gabriel

This study aimed to examine the structural changes affecting the duration of unemployment across households in South Africa. It made use of existing datasets from the Labour Force Survey produced by Statistics South Africa, covering a period of six years (2011-2016). Relations among demographic and household variables were explored to determine how they related to unemployment duration. On the basis of the relations identified, a predictive analysis of unemployment duration was attempted using two-level modelling. The results suggest a significant difference in the duration of unemployment, according to the individual socio-demographic characteristics and the household moderating variables. More specifically, the greatest percentage share of both men and women experiencing long-term unemployment was found within the age group 25-34 years. The study also found that the percentage share of Non-White population groups experiencing longer duration of unemployment was greater than for the White population group. Another variable found to have great influence on the duration of unemployment was the individual’s previous work experience. Going beyond the individual’s socio-demographic characteristics to consider household variables, it was found that unemployed workers living in households headed by a female are more vulnerable to longer unemployment duration. The study found that individuals living in smaller households displayed longer unemployment duration. Also, it was found that individuals living in less endowed households (households where no one or few people were in gainful employment) were more vulnerable to experiencing longer unemployment spells. 
The study concluded with some recommendations for employment policy and follow-up research.

Item Developing a model of school climate unique to secondary schools in South Africa: A multilevel analysis approach (University of the Western Cape, 2018) Winnaar, Lolita Desiree; Blignaut, Rénette; Zuze, Linda

The educational landscape in South Africa is unique and has also seen many changes since the dawn of democracy more than 20 years ago. The apartheid education system was marred by severe inequalities between schools and, for this reason, the democratic government post 1994 established a number of policies and interventions in an attempt to improve access, equity and quality between schools. The country has made significant advances in improving access to education. This is reflected in the Millennium Development Goals progress indicators showing that, as of 2013, almost all learners between the ages of 7 and 15 were enrolled in schools. While great strides have also been made with regard to equity, evidence shows that many schools in South Africa are still largely inequitable. Education quality, however, is an area that is still of grave concern and the matter requires much attention from educational stakeholders. International studies, such as the Trends in International Mathematics and Science Study (TIMSS) and the Progress in International Reading Literacy Study (PIRLS), use learner performance to measure the quality of the system. Such studies consistently report that South Africa is performing poorly and that large inequalities still exist between schools in the country. Improved quality is associated with effective schools and, in South Africa, only 20% of schools have been found to be functional or effective. 
However, in much of the research on school effectiveness, both nationally and internationally, effectiveness has been explained by factors in the school, including the appropriateness of curriculum content, infrastructure, resources in the school and teacher content knowledge. These factors have been found to be strongly correlated with effective schools.

Item Inequalities in the use of maternal and reproductive health services in Sierra Leone (University of the Western Cape, 2019) Tsawe, Mluleki; Susuman, Sathiya

This thesis extends the literature on the trends and magnitude of health inequalities in the area of maternal and reproductive health services in Sierra Leone, and in particular across sub-Saharan Africa. It attempted to provide a good understanding of not only the determinants of maternal and reproductive healthcare use, but also the factors that enable health inequalities to exist in Sierra Leone. This is an appropriate topic in population health studies as it aims to address important questions on the research agenda in the context of sub-Saharan Africa, particularly in a country with poor health outcomes such as Sierra Leone. A proper understanding of not only the coverage rates of population health outcomes but also the extent of health inequalities as well as the factors that contribute to these inequalities is crucial for any government. The thesis applied various techniques in the analysis of DHS data (from 2008 and 2013 rounds) in an attempt to answer the research questions.

Item Fostering collaboration amongst business intelligence, business decision makers and statisticians for the optimal use of big data in marketing strategies (University of the Western Cape, 2019) De Koker, Louise; Tati, Gabriel

The aim of this study was to propose a model of collaboration adaptable for the optimal use of big data in an organisational environment. There is a paucity of knowledge on such collaboration and the research addressed this gap. 
More specifically, the research attempted to establish whether leadership, trust and knowledge sharing influence collaboration among the stakeholders identified at large organisations. The conceptual framework underlying this research was informed by collaboration theory and organisational theory. It was assumed that effective collaboration in the optimal use of big data is possibly associated with leadership, knowledge sharing and trust. These concepts were scientifically hypothesised to determine whether such associations exist within the context of big data. The study used a mixed methods approach, combining a qualitative with a quantitative study. The qualitative study was in the form of in-depth interviews with senior managers from different business units at a retail organisation in Cape Town. The quantitative study was an online survey conducted with senior marketing personnel at JSE-listed companies from various industries in Cape Town. A triangulation methodology was adopted, with additional in-depth interviews of big data and analytics experts from both South Africa and abroad, to strengthen the research. The findings of the research indicate the changing role of the statistician in the era of big data and the new discipline of data science. They also confirm the importance of leadership, trust and knowledge sharing in ensuring effective collaboration. Of the three hypotheses tested, two were confirmed. Collaboration has been applied in many areas. Unexpected findings of the research were the role the chief data officer plays in fostering collaboration among stakeholders in the optimal use of big data in marketing strategies, as well as the importance of organisational structure and culture in effective collaboration in the context of big data and data science in large organisations. 
The research has contributed to knowledge by extending the theory of collaboration to the domain of big data in the organisational context, with the proposal of an integrated model of collaboration in the context of big data. This model was grounded in the data collected from various sources, establishing the crucial new role of the chief data officer as part of the executive leadership and main facilitator of collaboration in the organisation. Collaboration among the specified stakeholders, led by the chief data officer, occurs both horizontally with peers and vertically with specialists at different levels within the organisation in the proposed model. The application of such a model of collaboration should facilitate the successful outcome of the collaborative efforts in data science in the form of financial benefits to the organisation through the optimal use of big data.