Browsing by Author "Witbooi, Peter"
Item Computational prediction of host-pathogen protein-protein interactions (University of the Western Cape, 2017) Ahmed, Ibrahim H.I.; Christoffels, Alan; Witbooi, Peter
Supervised machine learning approaches have been applied successfully to the prediction of protein-protein interactions (PPIs) within a single organism, i.e., intra-species predictions. However, because large amounts of experimentally validated PPI data for training and testing are not available, fewer studies have successfully applied these techniques to host-pathogen PPI, i.e., inter-species comparisons. Most of the host-pathogen studies have focused on human-virus interactions, specifically human-HIV PPI data. Additional improvements to machine learning techniques and feature sets are needed to improve the classification accuracy of host-pathogen PPI prediction. The primary aim of this bioinformatics thesis was to develop a binary classifier with an appropriate feature set for host-pathogen PPI prediction using published human-Hepatitis C virus PPI, and to test the model on available host-pathogen data for human-Bacillus anthracis PPI. Twelve different feature sets were compared to find the optimal set. The feature selection process reveals that our novel quadruple feature (a subsequence of four consecutive amino acids) combined with sequence similarity and human interactome network properties (such as degree, clustering coefficient, and betweenness centrality) was the best set. The optimal feature set outperformed those in the relevant published material, giving 95.9% sensitivity, 91.6% specificity and 89.0% accuracy. Using our optimal feature set, we developed a neural network model to predict PPI between human and Mycobacterium tuberculosis. The strategy is to develop a model trained with intra-species PPI data and extend it to inter-species prediction.
However, the lack of experimentally validated PPI data between human and Mycobacterium tuberculosis (M. tuberculosis) leads us to first assess the feasibility of using validated intra-species PPI data to build a model for inter-species PPI. In this model we used human intra-species PPI combined with Bacillus anthracis intra-species data to develop a binary classification model and extend the model to human-Bacillus anthracis inter-species prediction. We test our hypotheses on known human-Bacillus anthracis PPI data, and the result shows good performance with an average accuracy of 89.0%. The same approach was extended to the prediction of PPI between human and M. tuberculosis. The predicted human-M. tuberculosis PPI data were further validated using functional enrichment of experimentally verified secretory proteins in M. tuberculosis, cellular compartment analysis and pathway enrichment analysis. Results show that five of the M. tuberculosis secretory proteins within an infected host macrophage that correspond to the mycobacterial virulent strain H37Rv were extracted from the human-M. tuberculosis PPI dataset predicted by our model. Finally, a web server was created to predict PPIs between human and Mycobacterium tuberculosis, which is available online at http://hppredict.sanbi.ac.za. In summary, the concepts, techniques and technologies developed as part of this thesis have the potential not only to contribute to the understanding of PPIs between human and Mycobacterium tuberculosis, but also to be extended to other pathogens. Further materials related to this study are available at ftp://ftp.sanbi.ac.za/machine learning.

Item Does phylogeny have an influence on the date of first description? 
A comparative study of the world's fishes (Elsevier, 2020) Beukes, Brandon; Witbooi, Peter; Gibbons, Mark J.
The process of species description is not random, and understanding the factors that influence when a species is first described (the date of first description, DoFD) allows us to target environments and/or species' traits to increase our knowledge of diversity. Such studies typically correlate species traits (e.g. maximum size, occupational depth) against DoFD, forgetting that species are not statistically independent of each other, owing to the inheritance of shared characteristics. A recent study of extant fishes by Costello et al. (2015) identified depth and geographic range size as the most important (of many) predictors of the DoFD, implying that newly described species will likely occupy restricted areas and occur deep in the water column. However, these authors failed to account for “identity by descent” in their analyses. We correct that oversight here, and conclude that while the adjustments strengthen the associations between the different predictors and the DoFD, the overall effects are minimal and they do not materially change Costello et al.’s (2015) conclusions. This is briefly discussed.

Item A model of malaria population dynamics with migrants (Mathematical Biosciences and Engineering, 2021-08) Witbooi, Peter; Abiodun, Gbenga; Nsuami, Mozart
We present a compartmental model in ordinary differential equations of malaria disease transmission, accommodating the effect of indoor residual spraying on the vector population. The model allows for influx of infected migrants into the host population and for outflow of recovered migrants. The system is shown to have positive solutions. In the special case of no infected immigrants, we prove global stability of the disease-free equilibrium. Existence of a unique endemic equilibrium point is also established for the case of positive influx of infected migrants. 
As a case study we consider the combined South African malaria region. Using data covering 31 years, we quantify the effect of malaria infected immigrants on the South African malaria region.

Item Prediction of human-Bacillus anthracis protein–protein interactions using multi-layer neural network (Oxford University Press, 2018) Ahmed, Ibrahim; Witbooi, Peter; Christoffels, Alan
Triplet amino acids have successfully been included in feature selection to predict human-HPV protein-protein interactions (PPI). The utility of supervised learning methods is curtailed because experimental data are not available in sufficient quantities. Improvements in machine learning techniques and feature selection will enhance the study of PPI between host and pathogen. We present a comparison of a neural network model versus SVM for prediction of host-pathogen PPI based on a combination of features including: amino acid quadruplets, pairwise sequence similarity, and human interactome properties. The neural network and SVM were implemented using the Python Sklearn library. The neural network model using quadruplet features and other network features outperforms the SVM model. The models are tested against published predictors and then applied to the human-B. anthracis case. Gene ontology term enrichment analysis identifies immunology response and regulation as functions of interacting proteins. For prediction of human-viral PPI, our model (neural network) is a significant improvement in overall performance compared to a predictor using the triplets feature and achieves a good accuracy in predicting human-B. anthracis PPI.

Item Relative homotopy in relational structures (Cambridge University Press, 2018) Witbooi, Peter
The homotopy groups of a finite partially ordered set (poset) can be described entirely in the context of posets, as shown in a paper by B. Larose and C. Tardif. 
In this paper we describe the relative version of such a homotopy theory, for pairs (X, A) where X is a poset and A is a subposet of X. We also prove some theorems on the relevant version of the notion of weak homotopy equivalences for maps of pairs of such objects. We work in the category of reflexive binary relational structures, which contains the posets, as in the work of Larose and Tardif.

Item A stochastic TB model for a crowded environment (Hindawi, 2018) Vyambwera, Sibaliwe Maku; Witbooi, Peter
We propose a stochastic compartmental model for the population dynamics of tuberculosis. The model is applicable to crowded environments such as high density camps or prisons. We start off with a known ordinary differential equation model, and we impose stochastic perturbation. We prove the existence and uniqueness of positive solutions of the stochastic model. We introduce an invariant generalizing the basic reproduction number and prove the stability of the disease-free equilibrium when it is below unity or slightly higher than unity and the perturbation is small. Our main theorem implies that the stochastic perturbation enhances stability of the disease-free equilibrium of the underlying deterministic model. Finally, we perform some simulations to illustrate the analytical findings and the utility of the model.
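
The stochastic TB abstract above describes starting from an ordinary differential equation model and imposing a stochastic perturbation. The listing does not give the model's equations, so the following is only a minimal sketch of that general idea, not the authors' model: a hypothetical three-compartment system (susceptible, latent, infectious) with white noise added to the transmission term, integrated with the Euler-Maruyama scheme. The function name `simulate_tb`, the compartments, the placement of the noise and all parameter values are assumptions made for illustration.

```python
import math
import random


def simulate_tb(beta=0.5, sigma=0.1, t_end=50.0, dt=0.01, seed=0):
    """Euler-Maruyama path of a HYPOTHETICAL stochastic S-E-I model.

    beta  : transmission rate (assumed value)
    sigma : noise intensity on the transmission term (assumed form)
    Returns a list of (t, S, E, I) tuples.
    """
    rng = random.Random(seed)
    # Illustrative parameters only; none are taken from the paper.
    recruit, mu = 2.0, 0.02            # recruitment, natural death rate
    k, gamma, delta = 0.1, 0.3, 0.05   # progression, removal, disease death
    s, e, i = 95.0, 0.0, 5.0           # initial compartment sizes
    path = [(0.0, s, e, i)]
    steps = int(round(t_end / dt))
    for n in range(1, steps + 1):
        # Brownian increment over one step: Normal(0, dt)
        dw = rng.gauss(0.0, math.sqrt(dt))
        total = s + e + i
        incidence = s * i / total if total > 0 else 0.0
        drift_inf = beta * incidence * dt   # deterministic new infections
        noise_inf = sigma * incidence * dw  # stochastic perturbation
        s_new = s + (recruit - mu * s) * dt - drift_inf - noise_inf
        e_new = e + drift_inf + noise_inf - (mu + k) * e * dt
        i_new = i + (k * e - (mu + gamma + delta) * i) * dt
        # Clamp at zero: a numerical safeguard of this sketch, since the
        # discretised scheme does not inherit positivity automatically.
        s, e, i = max(s_new, 0.0), max(e_new, 0.0), max(i_new, 0.0)
        path.append((n * dt, s, e, i))
    return path


if __name__ == "__main__":
    t, s, e, i = simulate_tb()[-1]
    print(f"t={t:.1f}  S={s:.2f}  E={e:.2f}  I={i:.2f}")
```

Setting `sigma=0` recovers a plain Euler integration of the underlying deterministic system, so running the function with and without noise is one simple way to compare the two kinds of trajectory.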