Magister Scientiae - MSc (Bioinformatics)
Browsing by Author "Bajic, Vladimir"
Computational verification of published human mutations (University of the Western Cape, 2008)
Kamanu, Frederick Kinyua; Lehväslaiho, Heikki; Bajic, Vladimir; Faculty of Science
The completion of the Human Genome Project, a remarkable feat by any measure, has provided over three billion bases of reference nucleotides for comparative studies. The next, and perhaps more challenging, step is to analyse sequence variation and relate this information to important phenotypes. Most human sequence variations are characterized by structural complexity and are hence associated with abnormal functional dynamics. This thesis covers the assembly of a computational platform for verifying these variations, based on accurate, published, experimental data.

The development of a single nucleotide polymorphism database for forensic identification of specified physical traits (University of the Western Cape, 2009)
Naidu, Alecia Geraldine; Bajic, Vladimir; Faculty of Science
Many Single Nucleotide Polymorphisms (SNPs) found in coding or regulatory regions within the human genome lead to phenotypic differences that make prediction of physical appearance, based on genetic analysis, potentially useful in forensic investigations. Complex traits such as pigmentation can be predicted from the genome sequence, provided that genes with strong effects on the trait exist and are known. Phenotypic traits may also be associated with variations in gene expression due to the presence of SNPs in promoter regions. In this project, genes associated with physical traits of potential forensic relevance have been collated from the literature using a text mining platform and hand curation. The SNPs associated with these genes have been acquired from public SNP repositories such as the International HapMap project, dbSNP and Ensembl. Characterization of different population groups based on the SNPs has been performed and the results and data stored in a MySQL database.
This database contains SNP genotyping data with respect to physical phenotypic differences of forensic interest. The potential forensic relevance of the SNP information contained in this database has been verified through in silico SNP analysis aimed at establishing possible relationships between SNP occurrence and phenotype. The software used for this analysis is MATCH™. Data management and access have been enhanced by the use of a functional web-based front-end which enables users to extract and display SNP information without running complex Structured Query Language (SQL) statements from the command line. This Forensic SNP Phenotype resource can be accessed at http://forensic.sanbi.ac.za/alecia_forensics/Index.html

Incidence and regulatory implications of single Nucleotide polymorphisms among established ovarian cancer genes (University of the Western Cape, 2009)
Ramdayal, Kavisha; Lehväslaiho, Heikki; Bajic, Vladimir; Faculty of Science
Ovarian cancer research focuses on answering important questions related to the disease and on determining whether new approaches can feasibly contribute towards improving current treatments or discovering new ones. This study focused on the transcriptional regulation of genes that have been implicated in ovarian cancer, based on the occurrences of single nucleotide polymorphisms (SNPs) within transcription factor binding sites (TFBSs). Through the application of several in silico tools, databases and custom programs, this research aimed to contribute toward the identification of potentially bio-medically important genes or SNPs for pre-diagnosis and subsequent treatment planning of ovarian cancer. A total of 379 candidate genes that have been experimentally associated with ovarian cancer were analyzed. This led to the identification of 121 SNPs that were found to coincide with putative TFBSs, potentially influencing a total of 57 transcription factors that would normally bind to these TFBSs.
These SNPs with potential phenotypic effect were then evaluated among several population groups defined by the International HapMap consortium, resulting in the identification of three SNPs present in five or more of the eleven population groups that have been sampled.

A knowledgebase of stress reponsive gene regulatory elements in arabidopsis Thaliana (University of the Western Cape, 2011)
Adam, Muhammed Saleem; Bajic, Vladimir; Christoffels, Alan; South African National Bioinformatics Institute (SANBI); Faculty of Science
Stress responsive genes play a key role in shaping the manner in which plants process and respond to environmental stress. Their gene products are linked to DNA transcription and its consequent translation into a response product. However, whilst these genes play a significant role in manufacturing responses to stressful stimuli, transcription factors coordinate access to these genes, specifically by accessing a gene's promoter region, which houses transcription factor binding sites. Here transcriptional elements play a key role in mediating responses to environmental stress, where each transcription factor binding site may constitute a potential response to a stress signal. Arabidopsis thaliana, a model organism, can be used to identify the mechanism by which transcription factors shape a plant's survival in a stressful environment. Whilst there are numerous plant stress research groups, globally there is a shortage of publicly available stress responsive gene databases. In addition, a number of previous databases such as the Generation Challenge Programme's comparative plant stress responsive gene catalogue, Stresslink and DRASTIC have become defunct whilst others have stagnated. There is currently a single Arabidopsis thaliana stress response database, called STIFDB, which was launched in 2008 and only covers abiotic stresses as handled by major abiotic stress responsive transcription factor families.
Its data was sourced from microarray expression databases, contains numerous omissions as well as numerous erroneous entries, and has not been updated since its inception.
The Dragon Arabidopsis Stress Transcription Factor database (DASTF) was developed in response to the current lack of stress response gene resources. A total of 2333 entries were downloaded from SWISSPROT, manually curated and imported into DASTF. The entries represent 424 transcription factor families. Each entry has a corresponding SWISSPROT, ENTREZ GENBANK and TAIR accession number. The 5' untranslated regions (UTR) of 417 families were scanned against TRANSFAC's binding site catalogue to identify binding sites. The relational database consists of two tables, namely a transcription factor table and a transcription factor family table, called DASTF_TF and TF_Family respectively. Using a two-tier client-server architecture, a webserver was built with PHP, APACHE and MYSQL, and the data was loaded into these tables with a PYTHON script. The DASTF database contains 60 entries which correspond to biotic stress and 167 which correspond to abiotic stress, while 2106 respond to biotic and/or abiotic stress. Users can search the database using text, family, chromosome and stress type search options. Online tools such as HMMER, CLUSTALW, BLAST and HYDROCALCULATOR have been integrated into the DASTF database. Users can upload sequences and use HMMER to identify which transcription factor family their sequences belong to. The website can be accessed at http://apps.sanbi.ac.za/dastf/ and two updates per year are envisaged.

miRNAMatcher: High throughput miRNA discovery using regular expressions obtained via a genetic algorithm (University of the Western Cape, 2008)
Duvenage, Eugene; Bajic, Vladimir; Faculty of Science
In summary, there currently exist techniques to discover miRNA; however, both require many calculations to be performed during identification, limiting their use at a genomic level.
Machine learning techniques are currently providing the best results by combining a number of calculated and statistically derived features to identify miRNA candidates; however, almost all of these still include computationally intensive secondary-structure calculations. It is the aim of this project to produce a miRNA identification process that minimises and simplifies the number of computational elements required during the identification process.
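
As a rough illustration of why regular expressions are attractive for genome-scale screening, the sketch below scans a sequence for candidate regions with a single compiled pattern and no secondary-structure calculation. The pattern, the function name and the 22-nucleotide length are hypothetical placeholders chosen for the example; the abstract does not give the expressions produced by the genetic algorithm, and the genetic algorithm itself is not shown.

```python
import re

# Hypothetical pattern, for illustration only: a 'U', then 20 arbitrary
# bases, then a 'G' (22 nt in total). It is NOT taken from the thesis.
HYPOTHETICAL_PATTERN = r"U[ACGU]{20}G"


def find_candidates(sequence, pattern=HYPOTHETICAL_PATTERN):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # Normalise case and convert DNA notation (T) to RNA notation (U).
    seq = sequence.upper().replace("T", "U")
    # Wrapping the pattern in a zero-width lookahead lets finditer report
    # matches that overlap each other, one per start position.
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(seq)]


if __name__ == "__main__":
    example = "aat" + "a" * 20 + "gcc"
    for start, text in find_candidates(example):
        print(start, text)
```

Each position in the sequence is examined by the regular expression engine only, so the cost of a scan grows with sequence length rather than with the cost of folding every candidate window.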