Bajic, Vladimir
Schaefer, Ulf
2023-06-09
2024-05-09
2009
https://hdl.handle.net/10566/13557
Philosophiae Doctor - PhD

The initiation of transcription in mammalian genomes occurs predominantly at 5' promoter regions; however, initiation events have increasingly been observed within introns, coding exons and 3' UTRs. Nevertheless, there are large segments of mammalian genomes that are not prone to transcription initiation. These locations can be understood as 'transcription initiation deserts'. Demarcating these segments of the genome is both challenging and useful. The availability of a large volume of transcript data has provided an opportunity to develop a methodology to predict and annotate these genomic segments. A comprehensive collection of data for Homo sapiens and Mus musculus, consisting of CAGE tags and other evidence for the existence of transcription, was used to develop a methodology that annotates locations in mammalian genomes either as highly likely to initiate transcription or as unlikely to harbour transcription start sites (TSSs). The algorithm recognises TSSs with 100% sensitivity, which makes it superior to existing promoter prediction algorithms for the task of annotating TSS deserts.

en
Genome-wide identification and comprehensive analysis of transcriptional desert regions
University of the Western Cape