UWCScholar :: Browsing by Author "Lochner, Michelle"

Anomaly Detection With Machine Learning In Astronomical Images
(University of the Western Cape, 2020) Etsebeth, Verlon; Lochner, Michelle
Observations that push the boundaries have historically fuelled scientific breakthroughs, and these observations frequently involve phenomena that were previously unseen and unidentified. Data sets have increased in size and quality as modern technology advances at a record pace. Finding these elusive phenomena within these large data sets becomes a tougher challenge with each advancement made. Fortunately, machine learning techniques have proven to be extremely valuable in detecting outliers within data sets. Astronomaly is a framework that utilises machine learning techniques for anomaly detection in astronomy and incorporates active learning to provide target-specific results. It is used here to evaluate whether machine learning techniques are suitable to detect anomalies within the optical astronomical data obtained from the Dark Energy Camera Legacy Survey (DECaLS). Using the machine learning algorithm isolation forest, Astronomaly is applied to subsets of the DECaLS data set. The pre-processing stage of Astronomaly had to be significantly extended to handle real survey data from DECaLS, with the changes made resulting in up to 10% more sources having their features extracted successfully. For the top 500 sources returned, 292 were ordinary sources, 86 were artefacts and masked sources, and 122 were interesting anomalous sources. A supplementary machine learning algorithm known as active learning enhances the identification probability of outliers in data sets by making it easier to identify target-specific sources. The addition of active learning further increases the number of interesting sources returned by almost 40%, with 273 ordinary sources, 56 artefacts and 171 interesting anomalous sources returned. Among the anomalies discovered are some merger events that have been successfully identified in known catalogues and several candidate merger events that have not yet been identified in the literature. The results indicate that machine learning, in combination with active learning, can be effective in detecting anomalies in actual data sets. The extensions integrated into Astronomaly pave the way for its application on future surveys like the Vera C. Rubin Observatory Legacy Survey of Space and Time.
Application of anomaly detection techniques to astrophysical transients
(University of the Western Cape, 2021) Ramonyai, Malema Hendrick; Lochner, Michelle
We are fast moving into an era where data will be the primary driving factor for discovering new unknown astronomical objects and also improving our understanding of the current rare astronomical objects. Wide-field survey telescopes such as the Square Kilometre Array (SKA) and the Vera C. Rubin Observatory will be producing enormous amounts of data over short timescales. The Rubin Observatory is expected to record ∼15 terabytes of data every night during its ten-year Legacy Survey of Space and Time (LSST), while the SKA will collect ∼100 petabytes of data per day. Fast, automated, and data-driven techniques, such as machine learning, are required to search for anomalies in these enormous datasets, as traditional techniques such as manual inspection will take months to fully exploit such datasets.
Astronomaly at scale: searching for anomalies amongst 4 million galaxies
(Oxford University Press, 2024) Etsebeth, Verlon; Lochner, Michelle; Walmsley, Mike
Modern astronomical surveys are producing data sets of unprecedented size and richness, increasing the potential for high-impact scientific discovery. This possibility, coupled with the challenge of exploring a large number of sources, has led to the development of novel machine-learning-based anomaly detection approaches, such as Astronomaly. For the first time, we test the scalability of Astronomaly by applying it to almost 4 million images of galaxies from the Dark Energy Camera Legacy Survey. We use a trained deep learning algorithm to learn useful representations of the images and pass these to the anomaly detection algorithm isolation forest, coupled with Astronomaly's active learning method, to discover interesting sources. We find that data selection criteria have a significant impact on the trade-off between finding rare sources such as strong lenses and introducing artefacts into the data set. We demonstrate that active learning is required to identify the most interesting sources and reduce artefacts, while anomaly detection methods alone are insufficient. Using Astronomaly, we find 1635 anomalies among the top 2000 sources in the data set after applying active learning, including eight strong gravitational lens candidates, 1609 galaxy merger candidates, and 18 previously unidentified sources exhibiting highly unusual morphology. Our results show that by leveraging the human-machine interface, Astronomaly is able to rapidly identify sources of scientific interest even in large data sets.
Considerations for optimizing the photometric classification of supernovae from the Rubin Observatory
(IOP Publishing, 2022) Alves, Catarina S.; Peiris, Hiranya V.; Lochner, Michelle
The Vera C. Rubin Observatory will increase the number of observed supernovae (SNe) by an order of magnitude; however, it is impossible to spectroscopically confirm the class for all SNe discovered. Thus, photometric classification is crucial, but its accuracy depends on the not-yet-finalized observing strategy of Rubin Observatory’s Legacy Survey of Space and Time (LSST). We quantitatively analyze the impact of the LSST observing strategy on SNe classification using simulated multiband light curves from the Photometric LSST Astronomical Time-Series Classification Challenge (PLAsTiCC). First, we augment the simulated training set to be representative of the photometric redshift distribution per SNe class, the cadence of observations, and the flux uncertainty distribution of the test set. Then we build a classifier using the photometric transient classification library snmachine, based on wavelet features obtained from Gaussian process fits, yielding a similar performance to the winning PLAsTiCC entry. We study the classification performance for SNe with different properties within a single simulated observing strategy. We find that season length is important, with light curves of 150 days yielding the highest performance. Cadence also has an important impact on SNe classification; events with median inter-night gap <3.5 days yield higher classification performance. Interestingly, we find that large gaps (>10 days) in light-curve observations do not impact performance if sufficient observations are available on either side, due to the effectiveness of the Gaussian process interpolation. This analysis is the first exploration of the impact of observing strategy on photometric SN classification with LSST.
Designing an optimal LSST deep drilling program for cosmology with type Ia supernovae
(American Astronomical Society, 2023) Gris, Philippe; Regnault, Nicolas; Lochner, Michelle
The Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST) is forecast to collect a large sample of Type Ia supernovae (SNe Ia) expected to be instrumental in unveiling the nature of dark energy. The feat, however, requires accurately measuring the two components of the Hubble diagram, distance modulus and redshift. Distance is estimated from SN Ia parameters extracted from light-curve fits, where the average quality of light curves is primarily driven by survey parameters. An optimal observing strategy is thus critical for measuring cosmological parameters with high accuracy. We present in this paper a three-stage analysis to assess the impact of the deep drilling (DD) strategy parameters on three critical aspects of the survey: redshift completeness, the number of well-measured SNe Ia, and cosmological measurements. We demonstrate that the current DD survey plans (internal LSST simulations) are characterized by a low completeness (z ∼ 0.55–0.65), and irregular and low cadences (several days), which dramatically decrease the size of the well-measured SN Ia sample. We propose a method providing the number of visits required to reach higher redshifts. We use the results to design a set of optimized DD surveys for SN Ia cosmology taking full advantage of spectroscopic resources for host galaxy redshift measurements. The most accurate cosmological measurements are achieved with deep rolling surveys characterized by a high cadence (1 day), a rolling strategy (at least two seasons of observation per field), and ultradeep (z ≳ 0.8) and deep (z ≳ 0.6) fields. A deterministic scheduler including a gap recovery mechanism is critical to achieving a high-quality DD survey.
Detecting anomalous transients in MeerTRAP data
(University of the Western Cape, 2024) Petersen-Charles, Jade Lindsay; Lochner, Michelle
In an era distinguished by significant technological progress, the prevalence of large and complex datasets characterizes the "big data" era across various disciplines. With improved telescopes being built aimed at generating datasets of unprecedented volumes, there is incredible potential for discovery. The MeerKAT radio telescope in South Africa has proven to be an excellent telescope to search for fast radio transients such as pulsars and fast radio bursts (FRBs). MeerTRAP (more TRAnsients and Pulsars), which commensally uses MeerKAT to search for fast radio transients, detects tens of thousands of candidate objects daily (on average), although the vast majority are not of astrophysical origin. Automated techniques such as machine learning are routinely used to identify targeted astrophysical transients. However, an emerging application of machine learning is to aid the detection of unidentified or rare sources, referred to as anomalies.
Enabling unsupervised discovery in astronomical images through self-supervised representations
(Oxford University Press, 2024) Mohale, Koketso; Lochner, Michelle
Unsupervised learning, a branch of machine learning that can operate on unlabelled data, has proven to be a powerful tool for data exploration and discovery in astronomy. As large surveys and new telescopes drive a rapid increase in data size and richness, these techniques offer the promise of discovering new classes of objects and of efficient sorting of data into similar types. However, unsupervised learning techniques generally require feature extraction to derive simple but informative representations of images. In this paper, we explore the use of self-supervised deep learning as a method of automated representation learning. We apply the algorithm Bootstrap Your Own Latent to Galaxy Zoo DECaLS images to obtain a lower dimensional representation of each galaxy, known as features. We briefly validate these features using a small supervised classification problem. We then move on to apply an automated clustering algorithm, demonstrating that this fully unsupervised approach is able to successfully group together galaxies with similar morphology. The same features prove useful for anomaly detection, where we use the framework astronomaly to search for merger candidates. While the focus of this work is on optical images, we also explore the versatility of this technique by applying the exact same approach to a small radio galaxy data set. This work aims to demonstrate that applying deep representation learning is key to unlocking the potential of unsupervised discovery in future data sets from telescopes such as the Vera C. Rubin Observatory and the Square Kilometre Array.
Finding radio transients with anomaly detection and active learning based on volunteer classifications
(Oxford University Press, 2025) Lochner, Michelle; Andersson, Alex; Woudt, Patrick
In this work, we explore the applicability of unsupervised machine learning algorithms to finding radio transients. Facilities such as the Square Kilometre Array (SKA) will provide huge volumes of data in which to detect rare transients; the challenge for astronomers is how to find them. We demonstrate the effectiveness of anomaly detection algorithms using 1.3 GHz light curves from the SKA precursor MeerKAT. We make use of three sets of descriptive parameters ('feature sets') as applied to two anomaly detection techniques in the astronomaly package and analyse our performance by comparison with citizen science labels on the same data set. Using transients found by volunteers as our ground truth, we demonstrate that anomaly detection techniques can recall over half of the radio transients in the 10 per cent of the data with the highest anomaly scores. We find that the choice of anomaly detection algorithm makes a minor difference, but that feature set choice is crucial, especially when considering available resources for human inspection and/or follow-up. Active learning, where human labels are given for just 2 per cent of the data, improves recall by up to 20 percentage points, depending on the combination of features and model used. The best-performing results reduce the number of sources requiring vetting by experts by a factor of 5. This is the first effort to apply anomaly detection techniques to finding radio transients and shows great promise for application to other data sets, and as a real-time transient detection system for upcoming large surveys.
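The recall figure quoted in this abstract (the share of volunteer-identified transients that land in the highest-scoring 10 per cent of the data) can be illustrated with a minimal sketch. The function name `recall_in_top_fraction` and the toy scores and labels are hypothetical, introduced here for illustration only; they are not taken from the paper or from the astronomaly package.

```python
def recall_in_top_fraction(scores, is_transient, fraction=0.10):
    """Return the share of true transients ranked in the top `fraction` by score.

    `scores` are anomaly scores (higher means more anomalous) and
    `is_transient` are ground-truth booleans, e.g. from volunteer labels.
    """
    if len(scores) != len(is_transient):
        raise ValueError("scores and labels must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Number of sources a human would inspect at this cut (at least one).
    n_top = max(1, int(round(len(scores) * fraction)))

    # Indices of sources, ordered from most to least anomalous.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    total = sum(1 for label in is_transient if label)
    if total == 0:
        return 0.0  # no transients to recall
    found = sum(1 for i in ranked[:n_top] if is_transient[i])
    return found / total


# Toy data (made up): 20 sources with increasing scores, 4 labelled transients.
toy_scores = [i / 20 for i in range(20)]
toy_labels = [i in (2, 5, 18, 19) for i in range(20)]

recall_10 = recall_in_top_fraction(toy_scores, toy_labels, fraction=0.10)
print(f"Recall in top 10 per cent: {recall_10:.2f}")
```

In this toy case the top 10 per cent contains two sources, both labelled transients, so half of the four transients are recalled. Raising `fraction` trades more human inspection for higher recall.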
The Impact of Observing Strategy on Cosmological Constraints with LSST
(2022) Lochner, Michelle; Scolnic, Dan; Almoubayyed, Husni; Anguita, Timo
The generation-defining Vera C. Rubin Observatory will make state-of-the-art measurements of both the static and transient universe through its Legacy Survey of Space and Time (LSST). With such capabilities, it is immensely challenging to optimize the LSST observing strategy across the survey's wide range of science drivers. Many aspects of the LSST observing strategy relevant to the LSST Dark Energy Science Collaboration, such as survey footprint definition, single-visit exposure time, and the cadence of repeat visits in different filters, are yet to be finalized. Here, we present metrics used to assess the impact of observing strategy on the cosmological probes considered most sensitive to survey design; these are large-scale structure, weak lensing, type Ia supernovae, kilonovae, and strong lens systems (as well as photometric redshifts, which enable many of these probes). We evaluate these metrics for over 100 different simulated potential survey designs. Our results show that multiple observing strategy decisions can profoundly impact cosmological constraints with LSST; these include adjusting the survey footprint, ensuring repeat nightly visits are taken in different filters, and enforcing regular cadence. We provide public code for our metrics, which makes them readily available for evaluating further modifications to the survey design. We conclude with a set of recommendations and highlight observing strategy factors that require further research.
Impact of Rubin Observatory cadence choices on supernovae photometric classification
(American Astronomical Society, 2023) Alves, Catarina S.; Peiris, Hiranya V.; Lochner, Michelle
The Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST) will discover an unprecedented number of supernovae (SNe), making spectroscopic classification for all the events infeasible. LSST will thus rely on photometric classification, whose accuracy depends on the not-yet-finalized LSST observing strategy. In this work, we analyze the impact of cadence choices on classification performance using simulated multiband light curves. First, we simulate SNe with an LSST baseline cadence, a nonrolling cadence, and a presto-color cadence, which observes each sky location three times per night instead of twice. Each simulated data set includes a spectroscopically confirmed training set, which we augment to be representative of the test set as part of the classification pipeline. Then we use the photometric transient classification library snmachine to build classifiers. We find that the active region of the rolling cadence used in the baseline observing strategy yields a 25% improvement in classification performance relative to the background region. This improvement in performance in the actively rolling region is also associated with an increase of up to a factor of 2.7 in the number of cosmologically useful Type Ia SNe relative to the background region. However, adding a third visit per night as implemented in presto-color degrades classification performance due to more irregularly sampled light curves. Overall, our results establish desiderata on the observing cadence related to classification of full SNe light curves, which in turn impacts photometric SNe cosmology with LSST.
Impact of Rubin Observatory Cadence Choices on Supernovae Photometric Classification
(The Astrophysical Journal Supplement Series, 2023) Lochner, Michelle; Alves, Catarina; Peiris, Hiranya
The Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST) will discover an unprecedented number of supernovae (SNe), making spectroscopic classification for all the events infeasible. LSST will thus rely on photometric classification, whose accuracy depends on the not-yet-finalized LSST observing strategy. In this work, we analyse the impact of cadence choices on classification performance using simulated multiband light curves. First, we simulate SNe with an LSST baseline cadence, a nonrolling cadence, and a presto-colour cadence, which observes each sky location three times per night instead of twice. Each simulated data set includes a spectroscopically confirmed training set, which we augment to be representative of the test set as part of the classification pipeline. Then we use the photometric transient classification library snmachine to build classifiers. We find that the active region of the rolling cadence used in the baseline observing strategy yields a 25% improvement in classification performance relative to the background region. This improvement in performance in the actively rolling region is also associated with an increase of up to a factor of 2.7 in the number of cosmologically useful Type Ia SNe relative to the background region. However, adding a third visit per night as implemented in presto-colour degrades classification performance due to more irregularly sampled light curves. Overall, our results establish desiderata on the observing cadence related to classification of full SNe light curves, which in turn impacts photometric SNe cosmology with LSST.
Optimization of the observing cadence for the Rubin Observatory Legacy Survey of Space and Time: A pioneering process of community-focused experimental design
(IOP Publishing, 2022) Bianco, Federica B.; Ivezić, Željko; Lochner, Michelle
Vera C. Rubin Observatory is a ground-based astronomical facility under construction, a joint project of the National Science Foundation and the U.S. Department of Energy, designed to conduct a multipurpose 10 yr optical survey of the Southern Hemisphere sky: the Legacy Survey of Space and Time. Significant flexibility in survey strategy remains within the constraints imposed by the core science goals of probing dark energy and dark matter, cataloging the solar system, exploring the transient optical sky, and mapping the Milky Way. The survey’s massive data throughput will be transformational for many other astrophysics domains and Rubin’s data access policy sets the stage for a huge community of potential users. To ensure that the survey science potential is maximized while serving as broad a community as possible, Rubin Observatory has involved the scientific community at large in the process of setting and refining the details of the observing strategy. The motivation, history, and decision-making process of this strategy optimization are detailed in this paper, giving context to the science-driven proposals and recommendations for the survey strategy included in this Focus Issue.
Practical galaxy morphology tools from deep supervised representation learning
(Oxford University Press, 2022) Walmsley, Mike; Scaife, Anna M. M.; Lochner, Michelle
Astronomers have typically set out to solve supervised machine learning problems by creating their own representations from scratch. We show that deep learning models trained to answer every Galaxy Zoo DECaLS question learn meaningful semantic representations of galaxies that are useful for new tasks on which the models were never trained. We exploit these representations to outperform several recent approaches at practical tasks crucial for investigating large galaxy samples. The first task is identifying galaxies of similar morphology to a query galaxy. Given a single galaxy assigned a free text tag by humans (e.g. ‘#diffuse’), we can find galaxies matching that tag for most tags. The second task is identifying the most interesting anomalies to a particular researcher. Our approach is 100 per cent accurate at identifying the most interesting 100 anomalies (as judged by Galaxy Zoo 2 volunteers). The third task is adapting a model to solve a new task using only a small number of newly labelled galaxies. Models fine-tuned from our representation are better able to identify ring galaxies than models fine-tuned from terrestrial images (ImageNet) or trained from scratch. We solve each task with very few new labels; either one (for the similarity search) or several hundred (for anomaly detection or fine-tuning).
A unique, ring-like radio source with quadrilateral structure detected with machine learning
(Oxford University Press, 2023) Lochner, Michelle; Rudnick, Lawrence; Heywood, Ian
We report the discovery of a unique object in the MeerKAT Galaxy Cluster Legacy Survey (MGCLS) using the machine learning anomaly detection framework ASTRONOMALY. This strange, ring-like source is 30′ from the MGCLS field centred on Abell 209, and is not readily explained by simple physical models. With an assumed host galaxy at redshift 0.55, the luminosity (10²⁵ W Hz⁻¹) is comparable to powerful radio galaxies. The source consists of a ring of emission 175 kpc across, quadrilateral enhanced brightness regions bearing resemblance to radio jets, two ‘ears’ separated by 368 kpc, and a diffuse envelope. All of the structures appear spectrally steep, ranging from −1.0 to −1.5. The ring has high polarization (25 per cent) except on the bright patches (< 10 per cent). We compare this source to the Odd Radio Circles recently discovered in ASKAP data and discuss several possible physical models, including a termination shock from starburst activity, an end-on radio galaxy, and a supermassive black hole merger event. No simple model can easily explain the observed structure of the source. This work, as well as other recent discoveries, demonstrates the power of unsupervised machine learning in mining large data sets for scientifically interesting sources.
Unsupervised machine learning applied to radio data
(University of the Western Cape, 2023) Mohale, Koketso; Lochner, Michelle
This thesis presents work motivated by the belief that the next generation of discoveries in the field of astronomy will be made by the marriage of advanced data analysis algorithms in the form of unsupervised learning techniques, and the unprecedented volumes and complexities of data from the next generation of surveys. For several years, computers have been governed by Moore’s law, which posited that computing power would double every two years. The consequence was that computing has also become increasingly cost-effective, which has been a driving force in the ability to generate and analyse large volumes of data. These include machine learning advances like the use of deep learning and scalable techniques such as self-supervised learning which have been revolutionising areas of research, for example, natural language processing and computer vision. Similarly, astronomy is also met with a rapid growth in the availability of large datasets. Modern sky-observing instruments such as the radio telescope MeerKAT and the optical telescope Blanco (which was used for the Dark Energy Survey) are already producing data volumes at unprecedented scales. The next generation of instruments like the Square Kilometre Array (SKA) and the Vera C. Rubin Observatory are expected to produce orders of magnitude more astronomical data at higher resolution and sensitivity. Ongoing efforts in the form of surveys and data analysis techniques in astronomy are motivated in part by outstanding questions in galaxy evolution and cosmology as well as the potential to discover new unknown phenomena.
Unsupervised machine learning for transient discovery in Deeper, Wider, Faster light curves
(Oxford University Press, 2020) Lochner, Michelle; Webb, Sara; Muthukrishna, Daniel
Identification of anomalous light curves within time-domain surveys is often challenging. In addition, with the growing number of wide-field surveys and the volume of data produced exceeding astronomers’ ability for manual evaluation, outlier and anomaly detection is becoming vital for transient science. We present an unsupervised method for transient discovery using a clustering technique and the ASTRONOMALY package. As proof of concept, we evaluate 85 553 minute-cadenced light curves collected over two ∼1.5 h periods as part of the Deeper, Wider, Faster program, using two different telescope dithering strategies.
