Mr. Peter van Heusden

Permanent URI for this collection: https://hdl.handle.net/10566/2163

Position:	Systems Administrator
Department:	South African National Bioinformatics Institute (SANBI)
Faculty:	Faculty of Natural Science
Tel:	021 959 2356
Fax:	021 959 2512
Email:	pvh@sanbi.ac.za

Transcriptomics of the Rooibos (Aspalathus linearis) Species Complex
(MDPI, 2020) Stander, Emily Amor; Williams, Wesley; Mgwatu, Yamkela; Hesse, Uljana; van Heusden, Peter A.
Rooibos (Aspalathus linearis), widely known as a herbal tea, is endemic to the Cape Floristic Region of South Africa (SA). It produces a wide range of phenolic compounds that have been associated with diverse health-promoting properties of the plant. The species comprises several growth forms that differ in their morphology and biochemical composition, only one of which is cultivated and used commercially. Here, we established methodologies for non-invasive transcriptome research of wild-growing South African plant species, including (1) harvesting and transport of plant material suitable for RNA sequencing; (2) inexpensive, high-throughput biochemical sample screening; (3) extraction of high-quality RNA from recalcitrant, polysaccharide- and polyphenol-rich plant material; and (4) biocomputational analysis of Illumina sequencing data, together with the evaluation of programs for transcriptome assembly (Trinity, IDBA-Trans, SOAPdenovo-Trans, CLC), protein prediction, as well as functional and taxonomic transcript annotation. In the process, we established a biochemically characterized sample pool from 44 distinct rooibos ecotypes (1–5 harvests) and generated four in-depth annotated transcriptomes (each comprising on average ≈86,000 transcripts) from rooibos plants that represent distinct growth forms and differ in their biochemical profiles. These resources will serve future rooibos research and plant breeding endeavors.
Capacity building for whole genome sequencing of Mycobacterium tuberculosis and bioinformatics in high TB burden countries
(Oxford University Press, 2020) van Heusden, Peter A.
Background: Whole-genome sequencing (WGS) is increasingly used for Mycobacterium tuberculosis (Mtb) research. Countries with the highest tuberculosis (TB) burden face important challenges to integrate WGS into surveillance and research. Methods: We assessed the global status of Mtb WGS and developed a 3-week training course coupled with long-term mentoring and WGS infrastructure building. Training focused on genome sequencing, bioinformatics and development of a locally relevant WGS research project. The aim of the long-term mentoring was to support trainees in project implementation and funding acquisition. The focus of WGS infrastructure building was on the DNA extraction process and bioinformatics. Findings: Compared to their TB burden, Asia and Africa are grossly underrepresented in Mtb WGS research. Challenges faced resulted in adaptations to the training, mentoring and infrastructure building. Out-of-date laptop hardware and operating systems were overcome by using online tools and a Galaxy WGS analysis pipeline. A case studies approach created a safe atmosphere for students to formulate and defend opinions. Because quality DNA extraction is paramount for WGS, a biosafety level 3 and general laboratory skill training session were added, use of commercial DNA extraction kits was introduced and a 2-week training in a highly equipped laboratory was combined with a 1-week training in the local setting. Interpretation: By developing and sharing the components of and experiences with a sequencing and bioinformatics training program, we hope to stimulate capacity building programs for Mtb WGS and empower high-burden countries to play an important role in WGS-based TB surveillance and research.
Chromosomal-level assembly of the Asian seabass genome using long sequence reads and multi-layered scaffolding
(Public Library of Science, 2016) Vij, Shubha; van Heusden, Peter A.; Christoffels, Alan; Mbandi, Stanley K.; Mwangi, Sarah
We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species’ native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
Taste and odorant receptors of the coelacanth: a gene repertoire in transition
(Wiley, 2014) Picone, Barbara; Hesse, Uljana; Panji, Sumir; van Heusden, Peter A.; Jonas, Mario; Christoffels, Alan
G-protein coupled chemosensory receptors (GPCR-CRs) aid in the perception of odors and tastes in vertebrates. So far, six GPCR-CR families have been identified that are conserved in most vertebrate species. Phylogenetic analyses indicate differing evolutionary dynamics between teleost fish and tetrapods. The coelacanth Latimeria chalumnae belongs to the lobe-finned fishes, which represent a phylogenetic link between these two groups. We searched the genome of L. chalumnae for GPCR-CRs and found that coelacanth taste receptors are more similar to those in tetrapods than in teleost fish: two coelacanth T1R2s co-segregate with the tetrapod T1R2s that recognize sweet substances, and our phylogenetic analyses indicate that the teleost T1R2s are more closely related to T1R1s (umami taste receptors) than to tetrapod T1R2s. Furthermore, coelacanths are the first fish with a large repertoire of bitter taste receptors (58 T2Rs). Considering current knowledge on feeding habits of coelacanths, the question arises whether perception of bitter taste is the only function of these receptors. Similar to teleost fish, coelacanths have a variety of olfactory receptors (ORs) necessary for perception of water-soluble substances. However, they also have seven genes in the two tetrapod OR subfamilies predicted to recognize airborne molecules. The two coelacanth vomeronasal receptor families are larger than those in teleost fish, and similar to tetrapods, form V1R and V2R monophyletic clades. This may point to an advanced development of the vomeronasal organ as reported for lungfish. Our results show that the intermediate position of Latimeria in the phylogeny is reflected in its GPCR-CR repertoire.
The African coelacanth genome provides insights into tetrapod evolution
(Macmillan Publishers, 2013) Christoffels, Alan; Hesse, Uljana; Gamieldien, Junaid; Panji, Sumir; Picone, Barbara; van Heusden, Peter A.
The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.
Inferring bona fide transfrags in RNA-Seq derived-transcriptome assemblies of non-model organisms
(BioMed Central, 2015) Mbandi, Stanley K.; Hesse, Uljana; van Heusden, Peter A.; Christoffels, Alan
Background: De novo transcriptome assembly of short transcribed fragments (transfrags) produced from sequencing-by-synthesis technologies often results in redundant datasets with differing levels of unassembled, partially assembled or mis-assembled transcripts. Post-assembly processing intended to reduce redundancy typically involves reassembly or clustering of assembled sequences. However, these approaches are mostly based on common word heuristics and often create clusters of biologically unrelated sequences, resulting in loss of unique transfrags annotations and propagation of mis-assemblies. Results: Here, we propose a structured framework that consists of a few steps in pipeline architecture for Inferring Functionally Relevant Assembly-derived Transcripts (IFRAT). IFRAT combines 1) removal of identical subsequences, 2) error tolerant CDS prediction, 3) identification of coding potential, and 4) complements BLAST with a multiple domain architecture annotation that reduces non-specific domain annotation. We demonstrate that independent of the assembler, IFRAT selects bona fide transfrags (with CDS and coding potential) from the transcriptome assembly of a model organism without relying on post-assembly clustering or reassembly. The robustness of IFRAT is inferred on RNA-Seq data of Neurospora crassa assembled using de Bruijn graph-based assemblers, in single (Trinity and Oases-25) and multiple (Oases-Merge and additive or pooled) k-mer modes. Single k-mer assemblies contained fewer transfrags compared to the multiple k-mer assemblies. However, Trinity identified a comparable number of predicted coding sequence and gene loci to Oases pooled assembly. IFRAT selects bona fide transfrags representing over 94% of cumulative BLAST-derived functional annotations of the unfiltered assemblies. Between 4-6% are lost when orphan transfrags are excluded and this represents only a tiny fraction of annotation derived from functional transference by sequence similarity. 
The median length of bona fide transfrags ranged from 1.5 kb (Trinity) to 2 kb (Oases), which is consistent with the average coding sequence length in fungi. The fraction of transfrags that could be associated with gene ontology terms ranged from 33-50%, which is also high for domain-based annotation. We showed that unselected transfrags were mostly truncated and represent sequences from intronic, untranslated (5′ and 3′) regions and non-coding gene loci. Conclusions: IFRAT simplifies post-assembly processing, providing a reference transcriptome enriched with functionally relevant assembly-derived transcripts for non-model organisms.
