An evaluation of galaxy and ruffus-scripting workflows system for DNA-seq analysis

dc.contributor.advisor: Christoffels, Alan
dc.contributor.author: Oluwaseun, Ajayi Olabode
dc.date.accessioned: 2019-05-09T11:09:06Z
dc.date.accessioned: 2024-05-17T07:20:04Z
dc.date.available: 2019-05-09T11:09:06Z
dc.date.available: 2024-05-17T07:20:04Z
dc.date.issued: 2018
dc.description: Magister Scientiae - MSc (en_US)
dc.description.abstract: Functional genomics determines the biological functions of genes on a global scale by using large volumes of data obtained through techniques including next-generation sequencing (NGS). The application of NGS in biomedical research is gaining momentum, and as its adoption becomes more widespread, there is an increasing need for customizable computational workflows that simplify, and offer access to, computationally intensive analyses of genomic data. In this study, the Galaxy and Ruffus frameworks were designed and implemented with a view to addressing the challenges faced in biomedical research. Galaxy, a graphical web-based framework, allows researchers to build a graphical NGS data analysis pipeline for accessible, reproducible, and collaborative data-sharing. Ruffus, a UNIX command-line framework used by bioinformaticians as a Python library to write scripts in an object-oriented style, allows a workflow to be built in terms of task dependencies and execution logic. In this study, a dual data analysis technique was explored which focuses on a comparative evaluation of the Galaxy and Ruffus frameworks used in composing analysis pipelines. To this end, we developed an analysis pipeline in both Galaxy and Ruffus for the analysis of Mycobacterium tuberculosis sequence data. Furthermore, this study aimed to compare the Galaxy framework with Ruffus; preliminary analysis revealed that the analysis pipeline in Galaxy displayed a higher percentage of load and store instructions. In comparison, pipelines in Ruffus tended to be CPU bound and memory intensive. The CPU usage, memory utilization, and runtime execution are graphically represented in this study. Our evaluation suggests that workflow frameworks have distinctly different features, from ease of use, flexibility, and portability to architectural designs. (en_US)
dc.identifier.uri: https://hdl.handle.net/10566/15221
dc.language.iso: en (en_US)
dc.publisher: University of the Western Cape (en_US)
dc.rights.holder: University of the Western Cape (en_US)
dc.subject: Data-intensive (en_US)
dc.subject: Galaxy Framework (en_US)
dc.subject: Functional Genomic (en_US)
dc.subject: Ruffus Framework (en_US)
dc.subject: Runtime Execution (en_US)
dc.title: An evaluation of galaxy and ruffus-scripting workflows system for DNA-seq analysis (en_US)

Files

Original bundle
Name: ajayi_msc_nsc_2019.pdf
Size: 8.14 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text
Description: