Magister Scientiae - MSc (Bioinformatics)
Browsing by Issue Date
Now showing 1 - 20 of 44
Item HIV subtype C diversity: analysis of the relationship of sequence diversity to proposed epitope locations (University of the Western Cape, 2002) Ernstoff, Elana Ann; Hide, Winston; Seoighe, Cathal; South African National Bioinformatics Institute (SANBI); Faculty of Science
Southern Africa is facing one of the most serious HIV epidemics. This project contributes to the HIVNET, Network for Prevention Trials cohort for vaccine development. HIV's biology and rapid mutation rate have made vaccine design difficult. We examined HIV-1 subtype C diversity and how it relates to CTL epitope location along viral gag sequences. We found a negative correlation between codon sites under positive selection and epitope regions, suggesting epitope regions are evolutionarily conserved. It is possible that epitopes exist in non-conserved regions, yet fail to be detected due to the reference strain diverging from the circulating viral population. To test if CTL clustering is an artifact of the reference strain, we calculated differences between the gag codons and the reference strain. We found a weak negative correlation, suggesting epitopes in less conserved regions may be evading detection. Locating conserved and optimal epitopes that can be recognized by CTLs is essential for the design of vaccine reagents.

Item HIV Subtype C Diversity: Analysis of the Relationship of Sequence Diversity to Proposed Epitope Locations (University of the Western Cape, 2002) Ernstoff, Elana Ann; Hide, Winston
Southern Africa is facing one of the most serious HIV epidemics. This project contributes to the HIVNET, Network for Prevention Trials cohort for vaccine development. HIV's biology and rapid mutation rate have made vaccine design difficult. We examined HIV-1 subtype C diversity and how it relates to CTL epitope location along viral gag sequences. We found a negative correlation between codon sites under positive selection and epitope regions, suggesting epitope regions are evolutionarily conserved.
It is possible that epitopes exist in non-conserved regions, yet fail to be detected due to the reference strain diverging from the circulating viral population. To test if CTL clustering is an artifact of the reference strain, we calculated differences between the gag codons and the reference strain. We found a weak negative correlation, suggesting epitopes in less conserved regions may be evading detection. Locating conserved and optimal epitopes that can be recognized by CTLs is essential for the design of vaccine reagents.

Item Analyses of sequence divergence using completely sequenced genomes (University of the Western Cape, 2003) Nembaware, Victoria P.; Seoighe, Cathal
Using the complete genome of Saccharomyces cerevisiae, which duplicated after its speciation from Kluyveromyces lactis, a dataset of 119 putative S. cerevisiae - K. lactis ortholog pairs was constructed. S. cerevisiae paralogous pairs that are likely to have duplicated during the whole genome duplication of S. cerevisiae were obtained, and the approach taken in our previous work (Nembaware et al., 2002) was repeated to test whether the presence of a paralogue in S. cerevisiae had an effect on the rate of sequence divergence of the 119 pairs of orthologous genes. We found, however, that substitutions at synonymous sites had reached saturation, and this prevented us from being able to repeat the previous finding with S. cerevisiae and K. lactis. From this study, a publicly available web-server (http://hamlyn.sanbi.ac.zal-victoria) that automates the calculation of Ka:Ks values given pairs of homologous CDS sequences is presented.

Item Assessment of genome visualization tools relevant to HIV genome research: development of a genome browser prototype (University of the Western Cape, 2004) Boardman, Anelda Philine; Hide, Winston; Faculty of Science
Over the past two decades of HIV research, effective vaccine candidates have been elusive.
Traditionally, viral research has been characterized by a gene-by-gene approach, but in the light of the availability of complete genome sequences and the tractable size of the HIV genome, a genomic approach may improve insight into the biology and epidemiology of this virus. A genomic approach to finding HIV vaccine candidates can be facilitated by the use of genome sequence visualization. Genome browsers have been used extensively by various groups to shed light on the biology and evolution of several organisms including human, mouse, rat, Drosophila and C. elegans. Application of a genome browser to HIV genomes and related annotations can yield insight into forces that drive evolution, identify highly conserved regions as well as regions that yield a strong immune response in patients, and track mutations that appear over the course of infection. Access to graphical representations of such information is bound to support the search for effective HIV vaccine candidates. This study aimed to answer the question of whether a tool or application exists that can be modified to be used as a platform for development of an HIV visualization application and to assess the viability of such an implementation. Existing applications can only be assessed for their suitability as a basis for development of an HIV genome browser once a well-defined set of assessment criteria has been compiled.

Item High performance computing and algorithm development: application of dataset development to algorithm parameterization (University of the Western Cape, 2006) Jonas, Mario Ricardo Edward; Hide, Winston A; South African National Bioinformatics Institute (SANBI); Faculty of Science
A number of technologies exist that capture data from biological systems. In addition, several computational tools, which aim to organize the data resulting from these technologies, have been created.
The ability of these tools to organize the information into biologically meaningful results, however, needs to be stringently tested. The research contained herein focuses on data produced by technology that records short Expressed Sequence Tags (ESTs).

Item miRNAMatcher: High throughput miRNA discovery using regular expressions obtained via a genetic algorithm (University of the Western Cape, 2008) Duvenage, Eugene; Bajic, Vladimir; Faculty of Science
In summary, there currently exist techniques to discover miRNA; however, both require many calculations to be performed during the identification, limiting their use at a genomic level. Machine learning techniques are currently providing the best results by combining a number of calculated and statistically derived features to identify miRNA candidates; however, almost all of these still include computationally intensive secondary-structure calculations. It is the aim of this project to produce a miRNA identification process that minimises and simplifies the number of computational elements required during the identification process.

Item Market segmentation and factors affecting stock returns on the JSE (2008) Chimanga, Artwell S.; Kotze, Danelle
This study examines the relationship between stock returns and market segmentation. Monthly returns of stocks listed on the JSE from 1997-2007 are analysed using mostly the analytic factor and cluster analysis techniques. Evidence supporting the use of multi-index models in explaining the return generating process on the JSE is found. The results provide additional support for Van Rensburg (1997)'s hypothesis on market segmentation on the JSE.

Item Computational verification of published human mutations (University of the Western Cape, 2008) Kamanu, Frederick Kinyua; Lehväslaiho, Heikki; Bajic, Vladimir; Faculty of Science
The completion of the Human Genome Project, a remarkable feat by any measure, has provided over three billion bases of reference nucleotides for comparative studies.
The next, and perhaps more challenging, step is to analyse sequence variation and relate this information to important phenotypes. Most human sequence variations are characterized by structural complexity and are, hence, associated with abnormal functional dynamics. This thesis covers the assembly of a computational platform for verifying these variations, based on accurate, published, experimental data.

Item Development and implementation of ontology-based systems for mammalian gene expression profiling (University of the Western Cape, 2009) Kruger, Adele; Hide, Winston
The use of ontologies in the mapping of gene expression events provides an effective and comparable method to determine the expression profile of an entire genome across a large collection of experiments derived from different expression sources. In this dissertation I describe the development of the developmental human and mouse eVOC ontologies and demonstrate the ontologies by identifying genes showing a bias for developmental brain expression in human and mouse, identifying transcription factor complexes, and exploring the mouse orthologs of human cancer/testis genes.

Item An evolutionary genomics approach towards analysis of genes implicated in transmission of trypanosomes between tsetse fly and mammalian host (2009) Mwangi, Sarah Wambui; Christoffels, Alan
Human African trypanosomiasis is the world’s third most important parasitic disease affecting human health after malaria and schistosomiasis. The World Health Organization estimates approximately 60 million people at risk in sub-Saharan Africa and up to 50,000 deaths per year caused by trypanosomiasis. Current management of human African trypanosomiasis relies on active surveillance and chemotherapy of infected patients. Efforts to develop a vaccine to immunize the human host have been hampered by antigenic variation of the parasite's cell coat.
The advent of the genome era has opened up opportunities for developing novel strategies for interrupting the transmission cycle of trypanosomes, specifically using any of the three players: the human host, the tsetse fly vector and/or the parasite. The human genome has been deciphered and the genomes of several trypanosome species have been sequenced. Sequencing of additional neglected trypanosome species is in progress. The tsetse fly genome is currently being sequenced as part of the genomic activities of the International Glossina Genome Initiative (IGGI). In an attempt to support the tsetse fly sequencing effort, expressed sequence tags (ESTs) from various tissues and developmental stages of Glossina morsitans have been generated. In this study, tsetse fly EST data was analyzed using bioinformatics approaches, focusing on transcripts encoding serpin genes implicated in the immune defenses of tsetse flies. Glossina morsitans homologues to Drosophila melanogaster serpin4, serpin5, and serpin27A and Anopheles gambiae serpin10 were identified in the tsetse fly EST contigs. Comparison of the reactive center loop of tsetse fly serpins with human α-1-antitrypsin suggests that these tsetse serpins are inhibitory. Preliminary EST clustering did not succeed in assembling 3564 Tsal encoded ESTs into one contig. In this study, these ESTs were assembled together with three published Tsal cDNAs. A total of 29 Tsal-encoded contigs were generated. An analysis of the sequence variation within the Tsal EST assembled contigs identified five single base mismatches namely A-T, T-A, G-T and T-G. Results from this study form a basis onto which genetic and biochemical experimental studies can be designed, a process that will be successfully carried out once we have a reference genome. Specifically, these include studies aimed at genetic modification of tsetse flies towards populations that are inhospitable to trypanosomes.
Ultimately, this will supplement current vector control strategies towards elimination of human African trypanosomiasis.

Item A comparative genomics approach towards classifying immunity-related proteins in the tsetse fly (2009) Mpondo, Feziwe; Hide, Winston; Christoffels, Alan
Tsetse flies (Glossina spp) are vectors of African trypanosome (Trypanosoma spp) parasites, causative agents of Human African trypanosomiasis (sleeping sickness) and Nagana in livestock. Research suggests that tsetse fly immunity factors are key determinants in the success and failure of infection and the maturation process of parasites. An analysis of tsetse fly immunity factors is limited by the paucity of genomic data for Glossina spp. Nevertheless, completely sequenced and assembled genomes of Drosophila melanogaster, Anopheles gambiae and Aedes aegypti provide an opportunity to characterize protein families in species such as Glossina by using a comparative genomics approach. In this study we characterize thioester-containing proteins (TEPs), a sub-family of immunity-related proteins, in Glossina by leveraging the EST data for G. morsitans and the genomic resources of D. melanogaster, A. gambiae as well as A. aegypti. A total of 17 TEPs corresponding to Drosophila (four TEPs), Anopheles (eleven TEPs) and Aedes aegypti (two TEPs) were collected from published data supplemented with Genbank searches. In the absence of genome data for G. morsitans, 124 000 G. morsitans ESTs were clustered and assembled into 18 413 transcripts (contigs and singletons). Five Glossina contigs (Gmcn1115, Gmcn1116, Gmcn2398, Gmcn2281 and Gmcn4297) were identified as putative TEPs by BLAST searches. Phylogenetic analyses were conducted to determine the relationship of collected TEP proteins. Gmcn1115 clustered with DmtepI and DmtepII while Gmcn2398 is placed in a separate branch, suggesting that it is specific to G. morsitans. The TEPs are highly conserved within D.
melanogaster, as reflected in the conservation of the thioester domain, while only two TEPs in A. gambiae and one in A. aegypti show conservation of the thioester domain, suggesting that these proteins are subjected to high levels of selection. Despite the absence of a sequenced genome for G. morsitans, at least two putative TEPs were identified from EST data.

Item Incidence and regulatory implications of single nucleotide polymorphisms among established ovarian cancer genes (University of the Western Cape, 2009) Ramdayal, Kavisha; Lehväslaiho, Heikki; Bajic, Vladimir; Faculty of Science
Ovarian cancer research focuses on answering important questions related to the disease, determining whether new approaches are feasible to contribute towards improving current treatments or discovering new ones. This study focused on the transcriptional regulation of genes that have been implicated in ovarian cancer, based on the occurrences of single nucleotide polymorphisms (SNPs) within transcription factor binding sites (TFBSs). Through the application of several in silico tools, databases and custom programs, this research aimed to contribute toward the identification of potentially bio-medically important genes or SNPs for pre-diagnosis and subsequent treatment planning of ovarian cancer. A total of 379 candidate genes that have been experimentally associated with ovarian cancer were analyzed. This led to the identification of 121 SNPs that were found to coincide with putative TFBSs, potentially influencing a total of 57 transcription factors that would normally bind to these TFBSs.
These SNPs with potential phenotypic effect were then evaluated among several population groups defined by the International HapMap consortium, resulting in the identification of three SNPs present in five or more of the eleven population groups that have been sampled.

Item The development of a single nucleotide polymorphism database for forensic identification of specified physical traits (University of the Western Cape, 2009) Naidu, Alecia Geraldine; Bajic, Vladimir; Faculty of Science
Many Single Nucleotide Polymorphisms (SNPs) found in coding or regulatory regions within the human genome lead to phenotypic differences that make prediction of physical appearance, based on genetic analysis, potentially useful in forensic investigations. Complex traits such as pigmentation can be predicted from the genome sequence, provided that genes with strong effects on the trait exist and are known. Phenotypic traits may also be associated with variations in gene expression due to the presence of SNPs in promoter regions. In this project, genes associated with these physical traits of potential forensic relevance have been identified and collated from the literature using a text mining platform and hand curation. The SNPs associated with these genes have been acquired from public SNP repositories such as the International HapMap project, dbSNP and Ensembl. Characterization of different population groups based on the SNPs has been performed and the results and data stored in a MySQL database. This database contains SNP genotyping data with respect to physical phenotypic differences of forensic interest. The potential forensic relevance of the SNP information contained in this database has been verified through in silico SNP analysis aimed at establishing possible relationships between SNP occurrence and phenotype. The software used for this analysis is MATCH™.
Data management and access has been enhanced by the use of a functional web-based front-end which enables the users to extract and display SNP information without running complex Structured Query Language (SQL) statements from the command line. This Forensic SNP Phenotype resource can be accessed at http://forensic.sanbi.ac.za/alecia_forensics/Index.html

Item A comparative genomics approach towards classifying immunity-related proteins in the tsetse fly (University of the Western Cape, 2009) Mpondo, Feziwe; Hide, Winston; Christoffels, Alan
Tsetse flies (Glossina spp) are vectors of African trypanosome (Trypanosoma spp) parasites, causative agents of Human African trypanosomiasis (sleeping sickness) and Nagana in livestock. Research suggests that tsetse fly immunity factors are key determinants in the success and failure of infection and the maturation process of parasites. An analysis of tsetse fly immunity factors is limited by the paucity of genomic data for Glossina spp. Nevertheless, completely sequenced and assembled genomes of Drosophila melanogaster, Anopheles gambiae and Aedes aegypti provide an opportunity to characterize protein families in species such as Glossina by using a comparative genomics approach. In this study, we characterize thioester-containing proteins (TEPs), a sub-family of immunity-related proteins, in Glossina by leveraging the EST data for G. morsitans and the genomic resources of D. melanogaster, A. gambiae as well as A. aegypti.

Item A knowledgebase of stress responsive gene regulatory elements in Arabidopsis thaliana (University of the Western Cape, 2011) Adam, Muhammed Saleem; Bajic, Vladimir; Christoffels, Alan; South African National Bioinformatics Institute (SANBI); Faculty of Science
Stress responsive genes play a key role in shaping the manner in which plants process and respond to environmental stress. Their gene products are linked to DNA transcription and its consequent translation into a response product.
However, whilst these genes play a significant role in manufacturing responses to stressful stimuli, transcription factors coordinate access to these genes, specifically by accessing a gene's promoter region, which houses transcription factor binding sites. Here transcriptional elements play a key role in mediating responses to environmental stress, where each transcription factor binding site may constitute a potential response to a stress signal. Arabidopsis thaliana, a model organism, can be used to identify the mechanism of how transcription factors shape a plant's survival in a stressful environment. Whilst there are numerous plant stress research groups, globally there is a shortage of publicly available stress responsive gene databases. In addition, a number of previous databases such as the Generation Challenge Programme's comparative plant stress-responsive gene catalogue, Stresslink and DRASTIC have become defunct whilst others have stagnated. There is currently a single Arabidopsis thaliana stress response database called STIFDB, which was launched in 2008 and only covers abiotic stresses as handled by major abiotic stress responsive transcription factor families. Its data, sourced from microarray expression databases, contains numerous omissions as well as numerous erroneous entries and has not been updated since its inception. The Dragon Arabidopsis Stress Transcription Factor database (DASTF) was developed in response to the current lack of stress response gene resources. A total of 2333 entries were downloaded from SWISSPROT, manually curated and imported into DASTF. The entries represent 424 transcription factor families. Each entry has a corresponding SWISSPROT, ENTREZ GENBANK and TAIR accession number. The 5' untranslated regions (UTR) of 417 families were scanned against TRANSFAC's binding site catalogue to identify binding sites.
The relational database consists of two tables, namely a transcription factor table and a transcription factor family table, called DASTF_TF and TF_Family respectively. Using a two-tier client-server architecture, a webserver was built with PHP, APACHE and MYSQL, and the data was loaded into these tables with a PYTHON script. The DASTF database contains 60 entries which correspond to biotic stress and 167 which correspond to abiotic stress, while 2106 respond to biotic and/or abiotic stress. Users can search the database using text, family, chromosome and stress type search options. Online tools such as HMMER, CLUSTALW, BLAST and HYDROCALCULATOR have been integrated into the DASTF database. Users can upload sequences to identify which transcription factor family their sequences belong to by using HMMER. The website can be accessed at http://apps.sanbi.ac.za/dastf/ and two updates per year are envisaged.

Item Normalization and statistical methods for cross-platform expression array analysis (University of the Western Cape, 2012) Mapiye, Darlington S; Gamieldien, Junaid; Christoffels, Alan
A large volume of gene expression data exists in public repositories like the NCBI’s Gene Expression Omnibus (GEO) and the EBI’s ArrayExpress, and a significant opportunity exists to re-use data in various combinations for novel in-silico analyses that would otherwise be too costly to perform or for which the equivalent sample numbers would be difficult to collect. For example, combining and re-analysing large numbers of data sets from the same cancer type would increase statistical power, while the effects of individual study-specific variability are weakened, which would result in more reliable gene expression signatures. Similarly, as the number of normal control samples associated with various cancer datasets is often limiting, datasets can be combined to establish a reliable baseline for accurate differential expression analysis.
However, combining different microarray studies is hampered by the fact that different studies use different analysis techniques, microarray platforms and experimental protocols. We have developed and optimised a method which transforms gene expression measurements from continuous to discrete data points by grouping similarly expressed genes into quantiles on a per-sample basis. Each probe on each chip is cross-mapped to the gene it represents, enabling us to integrate experiments based on the genes they have in common across different platforms. We optimised the quantile discretization method on previously published prostate cancer datasets produced on two different array technologies and then applied it to a larger breast cancer dataset of 411 samples from 8 microarray platforms. Statistical analysis of the breast cancer datasets identified 1371 differentially expressed genes. Cluster, gene set enrichment and pathway analysis identified functional groups that were previously described in breast cancer, and we also identified a novel module of genes encoding ribosomal proteins that have not been previously reported, but whose overall functions have been implicated in cancer development and progression. The former indicates that our integration method does not destroy the statistical signal in the original data, while the latter is strong evidence that the increased sample size increases the chances of finding novel gene expression signatures.
Such signatures are also robust to inter-population variation, and show promise for translational applications like tumour grading, disease subtype classification, informing treatment selection and molecular prognostics.

Item The potential of commercial praziquantel formulations as "off label" treatments for Diplectanum oliveri (Monogenea) infecting cultured Argyrosomus species in the South African marine finfish aquaculture industry (University of the Western Cape, 2012) Joubert, Casper Jan Hendrik; Christison, Kevin; Kaiser, Horst
Aquaculture is a vast industry all over the world and has increased significantly during the past 30 years. In South Africa, finfish aquaculture farms stretch from Gansbaai to as far as Richards Bay, with the potential of extending into Mozambique. The future success of this fast growing industry in South Africa strongly relies on the development of supporting sectors such as government legislation, sponsorship, participation of the pharmaceutical industry and research and development in aquatic organism health management. Diplectanum oliveri Williams, 1989, a monogenean gill parasite of both Argyrosomus japonicus (Temminck & Schlegel, 1843) (dusky kob) and A. inodorus Griffiths & Heemstra, 1995 (silver kob), is currently regarded in South Africa as the most persistent ectoparasite associated with the culture of both fish species, causing pathological tissue changes in the areas associated with attachment and feeding, which can result in stock losses. The egg production of D. oliveri was used to evaluate and develop a method to quantify monogenean infections on fish, by counting the eggs produced by infra-populations of these parasites over a 24-hour period, and to determine the reliability of this method as a non-invasive/non-destructive method to quantify the intensity of an individual infra-population of parasites on a single host. Currently, Diplectanum spp.
on dusky kob are being controlled in local mariculture facilities using methods and drugs that are traditionally used for monogeneans (flukes) and are regarded as effective. Most of these drugs are, however, no longer approved for use in food fish and none of them has proven to be very effective in controlling D. oliveri in culture facilities, which can result in subsequent re-infections of epidemic proportion. Currently, there are no anthelmintics registered for aquaculture in South Africa. A registered anthelmintic used in terrestrial animals (sheep, goats, cattle and ostriches) containing praziquantel was tested at various concentrations and exposures against D. oliveri on A. japonicus to determine the efficacy of two different formulations and the potential for "off label" use. The 20 ppm (high) praziquantel concentration treatments eliminated all adult parasites, but caused significant measurable stress and affected the central nervous system of the fish, which resulted in death of all fish in the solution group after 18 hours. The 2 ppm (low) concentrations failed to remove all adult parasites. Although both the 2 hour (short) exposure/high concentration and 24 hour (long) exposure/low concentration of the suspension formulation were effective, only the short exposure/high concentration eliminated all adult parasites with little change in behaviour by the treated fish.

Item The identification of biologically important secondary structures in disease-causing RNA viruses. (University of the Western Cape, 2012) Tanov, Emil Pavlov; Harkins, Gordon W.; Christoffels, Alan
Viral genomes consist of either deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). The viral RNA molecules are responsible for two functions: firstly, their sequences contain the genetic code, which encodes the viral proteins, and secondly, they may form structural elements important in the regulation of the viral life-cycle.
Using a host of computational and bioinformatics techniques, we investigated how predicted secondary structure may influence the evolutionary dynamics of a group of single-stranded RNA viruses from the Picornaviridae family. We detected significant and marginally significant correlations between regions predicted to be structured and synonymous substitution constraints in these regions, suggesting that selection may be acting on those sites to maintain the integrity of certain structures. Additionally, coevolution analysis showed that nucleotides predicted to be base paired tended to co-evolve with one another in a complementary fashion in four out of the eleven species examined. Our analyses were then focused on individual structural elements within the genome-wide predicted structures. We ranked the predicted secondary structural elements according to their degree of evolutionary conservation, their associated synonymous substitution rates and the degree to which nucleotides predicted to be base paired coevolved with one another. Top ranking structures coincided with well characterized secondary structures that have been previously described in the literature. We also assessed the impact that genomic secondary structures had on the recombinational dynamics of picornavirus genomes, observing a strong tendency for recombination breakpoints to occur in non-coding regions. However, convincing evidence for the association between the distribution of predicted RNA structural elements and breakpoint clustering was not detected.

Item The identification of biologically important secondary structures in disease-causing RNA viruses (University of the Western Cape, 2012) Tanov, Emil Pavlov; Harkins, Gordon W.; Christoffels, Alan
Viral genomes consist of either deoxyribonucleic acid (DNA) or ribonucleic acid (RNA).
The viral RNA molecules are responsible for two functions: firstly, their sequences contain the genetic code, which encodes the viral proteins, and secondly, they may form structural elements important in the regulation of the viral life-cycle. Using a host of computational and bioinformatics techniques, we investigated how predicted secondary structure may influence the evolutionary dynamics of a group of single-stranded RNA viruses from the Picornaviridae family. We detected significant and marginally significant correlations between regions predicted to be structured and synonymous substitution constraints in these regions, suggesting that selection may be acting on those sites to maintain the integrity of certain structures. Additionally, coevolution analysis showed that nucleotides predicted to be base paired tended to co-evolve with one another in a complementary fashion in four out of the eleven species examined. Our analyses were then focused on individual structural elements within the genome-wide predicted structures. We ranked the predicted secondary structural elements according to their degree of evolutionary conservation, their associated synonymous substitution rates and the degree to which nucleotides predicted to be base paired coevolved with one another. Top ranking structures coincided with well characterized secondary structures that have been previously described in the literature. We also assessed the impact that genomic secondary structures had on the recombinational dynamics of picornavirus genomes, observing a strong tendency for recombination breakpoints to occur in non-coding regions.
However, convincing evidence for the association between the distribution of predicted RNA structural elements and breakpoint clustering was not detected.

Item Regulatory attributes of the carotenoid biosynthetic pathway in Arabidopsis thaliana under abiotic stress (University of the Western Cape, 2012) Khan, Firdous; Christoffels, Alan
Carotenoids are tetraprenoid (C40) molecules synthesized in plants, fungi, bacteria and algae via the carotenoid biosynthetic pathway (CBP). Some carotenoids are readily converted to vitamin A (VA) in humans, e.g. β-carotene, α-carotene and β-cryptoxanthin [1,2]. Vitamin A deficiency (VAD) affects millions, especially children under the age of five. The CBP in plants is a key source of pro-vitamin A and is vital to the biofortification of staple crops such as maize, rice and sorghum, which could alleviate the global VAD problem. However, the incomplete understanding of regulation of the pathway is a limiting factor to predictably control carotenoid content at the systems level. Previous studies have shown that growth conditions, such as light, play a major role in the biosynthesis of carotenoids. A systems biology approach was therefore used to analyse microarray data sets derived from A. thaliana grown under various conditions and treated with different stimuli. Thirty-two genes have previously been identified as being involved in the CBP. These genes were found to be highly differentially expressed depending on stress type. All stimuli except wounding, including drought, cold, heat, osmotic, oxidative and salt stress, had a significant influence on the CBP genes. Gene expression induced by abiotic stress occurred 30 min after exposure. These findings are indicative that an immediate systemic signal is sent to the rest of the plant in response to stress. A correlation analysis revealed a strongly positive correlation between PSY and its co-expressed genes, suggesting they share a common regulatory mechanism.
Promoter content analyses identified 20 enriched TFBMs among carotenoid genes. The most prevalent TFBMs found in the promoter regions of the CBP genes show a 1.25-3 fold increase in prevalence with a p-value < 0.05. Similar GO terms are enriched for CBP genes and their co-expressed genes. These findings indicate that carotenoid biosynthetic pathway genes and their co-expressed genes are involved in similar metabolic pathways and functional processes. This study identified cold, drought and heat to influence carotenoid gene expression and has led to the identification of molecular switches that can be modulated to control the biosynthetic pathway. Four motifs without any GO annotation and no specific known motif in plant databases were identified using MEME suite. In this study I propose that these predictions might be novel motifs and could be specific to carotenoid genes, and may be directly involved in the regulation of carotenoid biosynthesis. These findings may lead to a better understanding of the underlying regulatory mechanisms involved in the biosynthesis of carotenoids. Furthermore, these findings may assist in establishing ways of enhancing the production of carotenoids, especially pro-vitamin A, in Arabidopsis thaliana.