Magister Scientiae - MSc (Computer Science)

Permanent URI for this collection: https://hdl.handle.net/10566/14793

Substrate blockchain-enabled interplanetary file system for distributed data storage
(University of the Western Cape, 2025) Nododile, Thandile
Distributed data storage systems are vital in the era of expanding data across various applications. Industries like finance, healthcare, and the Internet of Things (IoT) require efficient management of vast and significantly growing data. Traditional centralised storage systems face challenges in scalability, security, and availability. They’re prone to bottlenecks, cyber-attacks, and data loss together with latency during network failures. These limitations significantly impact the reliability and efficiency of data storage, making them unsuitable for critical IoT applications where real-time data access and integrity are important.
A semantic knowledge base of the national cybersecurity environment in South Africa
(University of the Western Cape, 2024) Kondlo, Aphile Gift
Cybercriminals are placing increasing focus on South Africa, making it a notable target for cyber-attacks. The South African government has initiated several measures to counter cyber-attacks, such as the National Cybersecurity Policy Framework (NCPF) and the Cybercrimes Act. In the author’s review, the NCPF represents a vital initiative aimed at fostering a secure and resilient cyberspace, thereby enabling government, businesses, and civil society to fully harness the benefits of digital technologies. Various structures are being established because of the NCPF, but to achieve an efficient cybersecurity strategy there must be a strong partnership amongst business, government, and civil society. South Africa requires a holistic approach to ensure a more secure cyberspace to avoid major cybercrimes, cyber-attacks, and cyber warfare. The structures and policies that follow from these measures have not been fully implemented yet. Although the government published the NCPF in 2015 and enacted the Cybercrimes Act in May 2021, there is still a gap in terms of interoperability and shared understanding within the environment. Numerous new structures have been established, and others are still being planned. One example of a new structure is the Cybersecurity Hub (Hub), the national Computer Security Incident Response Team (CSIRT), that is mandated to co-ordinate attack information and provide support for cyber incidents.
Investigating Frequent Pattern-based Models for Improving Community Policing in South Africa
(University of the Western Cape, 2025) Macingwane, Apiwe
South Africa (SA) faces significant challenges due to a high crime rate, posing threats to citizens’ safety and economic growth. Adopting a comprehensive approach that integrates traditional crime prevention methods with advanced technologies is essential to address these challenges effectively. One of the relevant advanced techniques is pattern mining in Data Mining (DM), which utilizes association rules to identify relationships within dataset variables. This method is pivotal in the crime sector as it mines frequent patterns associated with prevailing criminal activities, aiding law enforcement in making strategic decisions to combat crime. Despite limited research on pattern-based models in SA, Frequent Pattern Growth (FP-Growth) and Hyper Structure Mining (Hmine) models have proven to be effective and efficient across various fields. Therefore, this research explores two prominent frequent pattern-based mining algorithms, FP-Growth and Hmine, and then proposes a novel Hybrid Pattern-Growth algorithm (HP-Growth), which combines the strengths of FP-Growth and Hmine. An experiment was conducted with these models using Python to generate frequent patterns of crime with the South African crime statistics (Stats SA crime) dataset between 2005 and 2016 across all nine provinces. The experiment employed the Mean and Floor functions to compute the Minimum Support Value (MSV) required for pattern generation. Each model was evaluated using standard metrics such as memory usage, runtime, scalability, and the reliability of their frequent patterns. The study found that among the three models, HP-Growth performed more efficiently with sparse datasets, while Hmine performed best with dense datasets. FP-Growth exhibits higher time complexity and memory usage compared to Hmine and HP-Growth.
Thus, Hmine emerges as the preferred algorithm for pattern generation due to its speed and memory efficiency with dense datasets, making it ideal for resource-constrained environments. The study establishes association rule thresholds and emphasizes the importance of selecting the most appropriate pattern-based model with low time and memory complexity for large-scale datasets and real-time processing for crime knowledge support. Furthermore, this research developed a crime knowledge support system, called CrimeTracker, which allows users to report and view crime information, and receive crime alerts. The system integrates the most suitable pattern-based model using a Representational State Transfer (REST) Application Programming Interface (API). The development of CrimeTracker utilized Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), Python, and JavaScript. The research findings on the developed CrimeTracker are intended to have far-reaching implications for law enforcement agencies and crime analysts in SA. These insights could be crucial in guiding efficient resource allocation, refining crime prevention strategies, and bolstering community empowerment for improved safety and social cohesion. This further aligns with Sustainable Development Goal (SDG) 16 on peace, justice, and strong institutions.
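The minimum-support step mentioned in the abstract above can be illustrated with a rough sketch. It assumes the MSV is the floor of the mean single-item count; the abstract does not state what the mean is taken over, so this is only one plausible reading. The toy transactions and function names are hypothetical, and the brute-force counting merely stands in for FP-Growth, Hmine, or HP-Growth, none of which is reproduced here.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions; each set lists crime categories reported together.
TRANSACTIONS = [
    {"burglary", "theft"},
    {"burglary", "theft", "assault"},
    {"theft", "assault"},
    {"burglary", "theft"},
    {"robbery"},
]


def minimum_support(transactions):
    """Assumed rule: floor of the mean number of transactions per item."""
    counts = Counter(item for t in transactions for item in t)
    return math.floor(sum(counts.values()) / len(counts))


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force support counting (a stand-in, not FP-Growth or Hmine)."""
    counts = Counter()
    for t in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(t), size):
                counts[combo] += 1
    return {items: n for items, n in counts.items() if n >= min_support}


msv = minimum_support(TRANSACTIONS)
patterns = frequent_itemsets(TRANSACTIONS, msv)
```

With this toy data the item counts are 3, 4, 2, and 1, so the mean is 2.5 and the assumed MSV is 2; the pair ("burglary", "theft") survives with support 3, while "robbery" is filtered out.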
Enhancing federated learning performance in realistic network conditions using a customised UDP protocol and greedy parameter aggregation
(University of the Western Cape, 2025) Mahembe, Bright Kudzaishe
Federated Learning (FL) is an emerging paradigm enabling decentralized Machine Learning (ML) model training and updates while prioritizing data privacy. Extensive research in this field has led to the development and enhancement of multiple solutions, including TensorFlow and PyTorch, to facilitate decentralized ML simulations. However, existing frameworks often have limited capabilities for customizing network configurations and transport protocols to create realistic network environments. Transport layer protocols, situated at Layer 4 of the Open Systems Interconnection (OSI) model, facilitate end-to-end communication between multiple hosts, ensuring data is reliably transmitted across the network. This study leverages the NS-3 network simulator alongside TensorFlow's deep learning framework to create a realistic network environment tailored for Federated Learning applications. To address the specific efficiency and reliability requirements of these applications, a modified User Datagram Protocol (UDP) was developed. A detailed implementation of the proposed NS-3-based Federated Learning simulator is provided, along with an in-depth explanation of the modified UDP protocol. The simulator was employed to validate the simulation by comparing the performance of standard and modified UDP protocols using the CIFAR-10 (Canadian Institute for Advanced Research) and MNIST (Modified National Institute of Standards and Technology) datasets. Results indicate that the modified UDP model demonstrated robust performance, achieving an accuracy of 78% under poor network conditions, representing only a 2% decline from the 80% accuracy attained in ideal conditions.
This performance is primarily attributed to its effective packet retrieval mechanism, whereas the standard UDP protocol model suffered a significant performance drop, achieving only 10% accuracy under poor network conditions, corresponding to a 69% decline from its performance in ideal conditions.
Automated detection of race condition vulnerabilities in binary programs using symbolic execution
(University of the Western Cape, 2025) Makan, Keith
Identifying race conditions in binary programs is challenging due to limited research on the subject and the lack of thoroughly evaluated methods, especially those applied to consumer-grade, off-the-shelf binary programs without requiring source code. Symbolic execution is a static analysis technique that significantly enhances vulnerability identification, especially in fuzzing-driven security analysis. However, its scalability is often restricted by the computational demands of constraint solving, high memory consumption, and the state explosion problem, the rapid increase in the number of states caused by program structures. To overcome these challenges, many vulnerability analysis methods incorporate techniques to manage state explosion, enabling the analysis of more complex programs. The approach presented by this thesis employs symbolic vulnerability analysis for binaries by introducing a novel approach called "Xegmap" which uses Directed Symbolic Execution to strategically guide the exploration of program states. Xegmap is designed to prioritise the detection of Global Memory Access Points, operating on the premise that identifying these points facilitates the detection of potentially race-prone thread interactions. It accomplishes this through a two-phase process: a naive symbolic execution phase, Negmap, followed by a directed phase, Degmap. Degmap directs symbolic execution toward global memory access points and evaluates memory interactions using a hybrid lock-set and happens-before analysis. Experimentation demonstrates Xegmap’s enhanced capacity to increase code coverage in consumer-grade, off-the-shelf binaries and to detect race conditions in binaries of varying complexity, including those with intricate input constraints and numerous threads, all without the need for source code.
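To make the lock-set half of the "hybrid lock-set and happens-before analysis" more concrete, here is a minimal sketch of a generic lock-set check in the style commonly described in the race-detection literature. It is not Xegmap's implementation: the `Access` trace format, the function name, and the sample trace are all hypothetical, and the happens-before half is omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One recorded access to a global memory location (hypothetical trace format)."""
    thread: str
    address: str
    is_write: bool
    locks_held: frozenset


def lockset_races(trace):
    """Return addresses whose accesses share no common lock.

    An address is reported when at least two threads touch it, at least one
    access is a write, and the intersection of the held lock sets is empty.
    """
    candidate = {}   # address -> locks held on every access seen so far
    threads = {}     # address -> threads that accessed it
    written = set()  # addresses with at least one write
    for a in trace:
        if a.address in candidate:
            candidate[a.address] &= a.locks_held
        else:
            candidate[a.address] = set(a.locks_held)
        threads.setdefault(a.address, set()).add(a.thread)
        if a.is_write:
            written.add(a.address)
    return sorted(
        addr for addr, locks in candidate.items()
        if not locks and len(threads[addr]) > 1 and addr in written
    )


TRACE = [
    Access("T1", "counter", True, frozenset({"m"})),
    Access("T2", "counter", True, frozenset({"m"})),
    Access("T1", "flag", True, frozenset({"m"})),
    Access("T2", "flag", False, frozenset()),
]
racy = lockset_races(TRACE)
```

In this toy trace, `counter` is always accessed under lock `m` and is not reported, whereas `flag` is read by T2 without any lock and is flagged. A pure lock-set check can report accesses that are in fact ordered by other synchronisation, which is one reason such checks are often paired with happens-before reasoning.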
Advanced computational techniques for pipe burst detection and localisation in water distribution networks
(University of the Western Cape, 2025) Mzembegwa, Takudzwa Sikumbuzo
Pipe bursts cause a considerable loss of treated water, increase the risks of environmental contamination and are a health hazard for the end-user as they can create a passage for contaminants to enter water distribution networks (WDN). Identifying pipe burst locations will help water service providers repair pipe bursts in a timely manner. Given the ever-increasing importance of water, a great number of methods to locate pipe bursts have been proposed. However, none has proved accurate enough for water service providers to rely on heavily. Therefore, this thesis presents a comprehensive investigation addressing two critical challenges in pipe burst localisation: optimising fully-linear deep learning (FL-DL) architectures for accurate detection and developing real-time localisation methods using change point detection (CPD) algorithms. The research is structured in two main phases to tackle these challenges. The first phase conducts a comparative analysis of hyperparameter optimisation techniques, Particle Swarm Optimisation (PSO) and Population-Based Training (PBT), for FL-DL architectures. This investigation addresses the limitations of traditional detection methods, which are often costly, labour-intensive, and limited in scalability. Results demonstrate PSO’s superior performance, with PSO-optimised models consistently achieving higher accuracy and lower variance compared to PBT implementations. Notably, PSOFL-ResNet achieved a mean accuracy of 98.92% and PSOFL-DenseNet reached 98.78%, significantly outperforming their PBT counterparts at 96.70% and 97.22% respectively.
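For readers unfamiliar with PSO as a hyperparameter optimiser, the following is a minimal sketch of a basic global-best particle swarm. It is a textbook-style illustration, not the thesis's configuration: the inertia and acceleration coefficients are commonly used values chosen here as assumptions, and `toy_loss` is a hypothetical stand-in for training and validating a model with the candidate hyperparameters.

```python
import random


def pso(objective, dim, bounds, particles=20, iterations=100, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimise `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    low, high = bounds
    pos = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Keep each coordinate inside the search bounds.
                pos[i][d] = min(high, max(low, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def toy_loss(x):
    """Hypothetical stand-in for a validation loss; minimum at (0.01, 0.3)."""
    return (x[0] - 0.01) ** 2 + (x[1] - 0.3) ** 2


best, loss = pso(toy_loss, dim=2, bounds=(0.0, 1.0))
```

In a real hyperparameter search each call to the objective would train and evaluate a network, so the particle and iteration counts would be chosen with that cost in mind.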
Research asset management for promoting collaboration using publish-subscribe and immersive technology
(University of the Western Cape, 2025) Kessel, Okinga Koumou
Assets, ranging from tangible assets (equipment and infrastructure) to intangible assets (knowledge), are essential economic resources owned or controlled by a person or organizations and play a vital role across diverse domains. Effective asset management is critical for optimal utilization and fostering collaboration among researchers within an institution. However, the level of collaboration can vary depending on the researchers’ needs. The International Organization for Standardization (ISO) 55000 standard defines asset management as the coordinated activity of an organization to realize value from assets. A preliminary investigation at a South African institution reveals that researchers often work in silos and face difficulties accessing potentially mutual assets such as chemicals or equipment, leading to under-utilization and high disposal costs for them. While several researchers have highlighted similar challenges in asset management within some organizations, there is limited evidence of implemented solutions for the problem. Therefore, this research developed an innovative asset management system, coined SciAssetHub (https://www.sciassethub.com/), using a publish-subscribe mechanism and immersive technology. The system notifies users of newly added assets and allows them to request the assets.
Adversarial deep reinforcement learning for autonomous cyber defense in software defined networks
(University of the Western Cape, 2024) Borchjes, Luke David
The rapid advancement of technologies such as IoT devices, Internet-based AI systems, and 5G networks has significantly heightened the demand for robust cybersecurity solutions. Traditional network architectures face challenges in scalability, security, and manageability, leading to the adoption of software-defined networking (SDN) as a flexible and secure alternative. However, the dynamic nature of traffic patterns in SDN necessitates frequent reconfiguration, highlighting the importance of autonomous decision-making tools. Reinforcement learning (RL), and its deep learning extension, deep reinforcement learning (DRL), have proven effective in addressing these challenges. Nevertheless, DRL algorithms are susceptible to adversarial attacks, underscoring the need for research into more resilient algorithms, such as NEC2DQN and DDQN, for both defensive and offensive applications in SDN environments. The evolution of network security has shifted from rule-based systems to autonomous solutions, with Deep Reinforcement Learning (DRL) emerging as a leading approach. Algorithms such as Deep Q-Network (DQN), Double Deep Q-Network (DDQN), and Neural Episodic Control to Deep Q-Network (NEC2DQN) have advanced the field, but their adaptability has introduced vulnerabilities to adversarial attacks, necessitating ongoing testing and improvement to ensure robust, adaptive models.
Machine learning techniques for the determination of vehicle hijacking spots using twitter data
(University of the Western Cape, 2024) Patel, Taahir Aiyoob; Nyirenda, Clement
Vehicle hijacking is one of the leading crime-related incidents in South Africa. Each day, many travelers are caught unprepared due to a lack of knowledge regarding incident locations. This information is usually not easily accessible to the general public, and the currently available information is commonly released in large time increments, such as monthly or yearly reports. Therefore, an alternative approach to obtaining this data is needed. Social media provides an open-source alternative for data collection. One of the largest of these platforms is Twitter (newly named X.com). With Twitter, users share information regarding each aspect of their lives, such as notable daily incidents. It is also quite common for certified news outlets to inform users of current events as they occur. However, Twitter data raises the issue of relevance. Because users are free to choose both topic and textual format, not all collected data constitutes a relevant hijacking report. To remedy this, Machine Learning is employed. By nature, this is a text classification problem. A large number of works have addressed such problems with supervised and unsupervised learning methods. For supervised learning approaches, one issue is the need for data to train a defined model, which removes the possibility of a true real-time approach. Unsupervised learning avoids this requirement by learning as data arrives; however, it has commonly been found to perform worse. Therefore, variations of both methods are implemented in this work.
Semantic data access for relational databases using an ontology
(University of the Western Cape, 2024) Jafta, Yahlieel; Leenen, Louise
Data analysis-based decision-making is performed daily by domain experts. As data grows in size and heterogeneity, accessing relevant data becomes challenging. In an Ontology-based data access (OBDA) approach, ontologies are advocated as a suitable formal tool to address complex data access. This technique falls within the Semantic Web domain, combining a domain ontology with a data source by using a declarative mapping specification to enable data access using a domain vocabulary. In this research, we investigate this approach by: a) studying the theoretical background that enables this technique; b) conducting a literature review on the existing open source tools that implement OBDA; c) implementing OBDA on a “real-world” relational dataset using an OBDA tool; and d) providing results and analysis of query answering. We selected Ontop (https://ontop-vkg.org) among various OBDA tools to illustrate how this technique enhances the data usage of the GitHub community. Ontop is an open-source tool applying OBDA in the domain of relational databases. We used the GHTorrent dataset, a relational database, in combination with the SemanGit ontology for our implementation.
On the efficacy of enhanced feature selection methods for supervised crime prediction
(University of the Western Cape, 2023) May, Sphamandla Innocent; Isafiade, Omowunmi
The challenge of crime across the globe has necessitated several considerations for crime preventive measures. There exist a variety of crime prevention strategies, such as the use of necessary weapons or tools to respond to crime. However, for resource-constrained nations such as South Africa, where the current police to civilian ratio is overwhelming, this may not suffice. Consequently, crime continues to be on the rise, necessitating alternative prevention strategies. Among alternative prevention approaches, the use of historical crime data can be explored through machine learning. Crime prediction using machine learning has been explored and has shown promising results. However, the choice of algorithm and feature selection methods plays a critical role in creating an effective predictive model. This study, therefore, explores the efficacy of enhanced feature selection methods in supervised machine learning algorithms for crime prediction. Four (4) baseline algorithms are adopted, which are Random Forest (RF), Extremely Randomized Trees (ERT), Naïve Bayes (NB), and Support Vector Machine (SVM). This research further proposes three algorithms, with the first derived from hybridizing RF and ERT (RF-Plus), while the other two (2) were obtained from enhancing NB and SVM using recursive feature elimination (RFE), obtaining (RFE-NB) and (RFE-SVM) respectively, totaling seven algorithms. Finally, a comparative evaluation of these algorithms with their respective baselines is conducted to report on their efficacy, and they are contrasted against two (2) additional algorithms from the literature, which amounts to a total of nine (9) algorithms. The study conducted performance evaluation on the models using two distinct publicly available datasets, which are the Chicago and Los Angeles crime datasets. Results confirm that feature selection positively impacts prediction accuracy.
The enhancement on the pure NB improved its accuracy from 72.5% to 96.6% and 80.45% to 95.78% for Chicago and Los Angeles datasets, respectively. The enhancement improved the accuracy of pure SVM from 74.73% to 89.91% and 75.73% to 88.70% for the Chicago and Los Angeles datasets, respectively, while achieving 97.04% and 95.5% on RF-Plus for both Chicago and Los Angeles datasets, respectively.
Exploring low-cost solution for 3D crime scene data gathering with immersive technology
(University of the Western Cape, 2023) Mfundo Andrew, Maneli; Isafiade, Omowunmi Elizabeth
3D crime scene data gathering is critical for law enforcement and investigators during crime scene investigations. Crime scene investigations have seen the effective usage of Light Detection and Ranging (LiDAR) scanners for 3D reconstruction alongside immersive technologies, such as Augmented Reality (AR) and Virtual Reality (VR). However, the inability to afford the existing high-end devices that can offer the desired accuracy of 3D scene data collection in low-resource settings cannot be overlooked, as this may impede crime investigations or render some crime cases insoluble.
Application of Several Time Series Methods to Three Important Financial Time Series
(University of the Western Cape, 2007) O'Connell, Bryan; Koean, C
This study is concerned with three different financial time series over an eight-year period, namely: the government repurchase rate, the Rand-Dollar exchange rate and the Allshare Index. The aim is to better understand the statistical nature of the time series. The theory employed will be discussed briefly and then the results will be reported. Different methods are employed to model the different time series. The following topics are discussed: unit root tests, autoregressive integrated moving average models, outlier tests, transformations, generalised autoregressive conditional heteroscedasticity models, cointegration, transfer function models and vector autoregressive models.
Semi-synchronous video for deaf telephony with an adapted synchronous codec
(University of the Western Cape, 2009) Ma, Zhenyu; Tucker, William D.
As Information and Communication Technology (ICT) matures, communication services must be improved to meet the needs of all types of users. For some uses, current Video over Internet Protocol (IP) brings unsatisfactory and even unrecognisable quality of video sequences. Such communication does not always meet the needs of Deaf people. Asynchronous video messaging, such as EyeJot (www.eyejot.com), offers Deaf people the ability to send and receive video messages like email. Unfortunately, communicating like this incurs much delay, resulting in slow response. Even though text messaging via cellphone or Internet is popular among Deaf people, they would prefer to use sign language for communication. Video Relay Service (VRS) attempts to help Deaf users communicate with hearing people in sign language. VRS provides synchronous video and voice services to enable those who use sign language to communicate with hearing people through a relay interpreter across the world via the Internet.
Long short-term memory recurrent neural networks for signature verification
(UWC, 2003) Tiflin, C; Omlin, C
Handwritten signature verification is defined as the classification process that strives to learn the manner in which an individual makes use of the muscular memory of their hands, fingers, and wrist to reproduce a signature. A handwritten signature is captured by a pen input device and sampled at a high frequency, which results in time series with several hundred data points. A novel recurrent neural network architecture known as long short-term memory was designed for modeling such long time series. This research investigates the suitability of long short-term memory recurrent neural networks for the task of online signature verification. We design and experiment with various network architectures to determine if this model can be trained to discriminate between authentic and fraudulent signatures. We further determine whether the complexity of a signature impacts the performance level of the network when applied to fraudulent signatures. We also investigate the performance level of the network when varying the number of signature features.
Towards a chereme based dynamic South African sign language gesture recognition system
(University of the Western Cape, 2007) Machanja, Addmore; Bajic, Vladimir B.
Hand gestures are a natural and intuitive way of human-to-human communication. Motivated by the achievements made towards automatic speech recognition, and by the ease with which people sign, many researchers started working on sign language recognition systems. Besides, technologies used to build gesture recognition systems offer an alternative to the cumbersome and failure-prone mechanical devices that are currently used as human-machine interface devices. Most of the available gesture recognition systems represent each sign language gesture with an individual gesture model. Such systems can only recognize a limited number of dynamic sign language gestures. It is cumbersome to build and maintain a gesture recognition system that uses thousands and thousands of individual gesture models. Sign language linguists argue that all sign language gestures are derived from small sets of reusable components, the cheremes.
Automatic real-time facial expression recognition for signed language translation
(University of the Western Cape, 2006) Whitehill, Jacob Richard; Omlin, Christian W
We investigated two computer vision techniques designed to increase both the recognition accuracy and computational efficiency of automatic facial expression recognition. In particular, we compared a local segmentation of the face around the mouth, eyes, and brows to a global segmentation of the whole face. Our results indicated that, surprisingly, classifying features from the whole face yields greater accuracy despite the additional noise that the global data may contain. We attribute this in part to correlation effects within the Cohn-Kanade database. We also developed a system for detecting FACS action units based on Haar features and the Adaboost boosting algorithm. This method achieves equally high recognition accuracy for certain AUs but operates two orders of magnitude more quickly than the Gabor+SVM approach. Finally, we developed a software prototype of a real-time, automatic signed language recognition system using FACS as an intermediary framework.
Handwritten alphabet character recognition using audio signatures and machine learning
(University of the Western Cape, 2023) Beck, Bruce; Ghaziasgar, Mehrdad
This research investigates the creation of an audio-based character recognition system that is able to segment, process and recognise uppercase English letters continuously drawn by the user on a given writing surface such as a table-top using a generic writing implement. The aim is to make use of the microphones on a single smartphone to capture the acoustic signal generated by the user as they draw letters on the writing surface, followed by the application of audio segmentation to subdivide the audio signal into segments corresponding to each letter, and finally the application of a combination of the Mel-Frequency Cepstral Coefficients feature descriptor and Support Vector Machines to recognise the segmented letters.
A comparative evaluation of population-based optimization algorithms for workflow scheduling in cloud-fog environments
(University of the Western Cape, 2022) Subramoney, Dineshan; Nyirenda, Clement
Scientific workflows are denoted by interdependent tasks and computations that are aimed at achieving some scientific objectives. The scheduling of these workflows involves the allocation of the tasks to particular computational resources, traditionally on the cloud infrastructure. This process is, however, very challenging. It is associated with high computation and communication costs because scientific workflows are data-intensive and computationally complex. In recent years, there has been overwhelming interest in using population-based optimization algorithms such as Particle Swarm Optimization (PSO) and Genetic Algorithms (GA) for scientific workflow scheduling, predominantly in cloud environments.
Credit Card Transactions Fraud Detection, and Machine Learning: Modelling Time with LSTM Recurrent Neural Networks
(University of the Western Cape, 2007) Wiese, Benard Jacobus; Omlin, Christian W.
In recent years, topics such as fraud detection and fraud prevention have received a lot of attention on the research front, in particular from plastic card issuers. The reason for this increase in research activity can be attributed to the huge annual financial losses incurred by card issuers due to fraudulent use of their card products. A successful strategy for dealing with fraud can quite literally mean millions of dollars in savings per year on operational costs. Artificial neural networks have come to the front as an at least partially successful method for fraud detection. The success of neural networks in this field is, however, limited by their underlying design: a feedforward neural network is simply a static mapping of input vectors to output vectors, and as such is incapable of adapting to changing shopping profiles of legitimate card holders. Thus, fraud detection systems in use today are plagued by misclassifications and their usefulness is hampered by high false positive rates. We address this problem by proposing the use of a dynamic machine learning method in an attempt to model the time series inherent in sequences of same-card transactions. We believe that, instead of looking at individual transactions, it makes more sense to look at sequences of transactions as a whole; a technique that can model time in this context will be more robust to minor shifts in legitimate shopping behaviour. In order to form a clear basis for comparison, we did some investigative research on feature selection, pre-processing, and on the selection of performance measures; the latter will facilitate comparison of results obtained by applying machine learning methods to the biased data sets largely associated with fraud detection.
We ran experiments on real world credit card transactional data using three machine learning techniques: a conventional feedforward neural network (FFNN), and two innovative methods, the support vector machine (SVM) and the long short-term memory recurrent neural network (LSTM).
