Populating galaxies into haloes via machine learning on the Simba simulation

Publisher

Oxford University Press

Abstract

We present machine inferred galaxy (MIG), a machine learning (ML) framework designed to populate dark matter haloes with galaxies in N-body simulations. MIG predicts galaxy stellar mass ($M_*$), star formation rate (SFR), atomic and molecular gas masses (H I mass, $M_{\rm H\,{\small I}}$, and H$_2$ mass, $M_{\rm H_2}$), and metallicity, and can be readily extended to other galaxy properties and simulations. The framework first separates haloes into central and satellite systems, then uses ML classifiers to distinguish star-forming (SF) from quenched (Q) galaxies, followed by separate regressors trained on the SF subgroups for both centrals and satellites. MIG is trained on the $(100\, h^{-1}\mathrm{Mpc})^3$ Simba galaxy formation simulation at $z=0$ and achieves high accuracy for key baryonic properties, including a regression score close to 0.9 for $M_{\rm H\,{\small I}}$ predictions of central galaxies. We further demonstrate its robustness at $z=1$ and $z=2$. Training on fractional quantities (e.g. $M_{\rm H\,{\small I}}/M_*$) and then rescaling by the predicted $M_*$ yields improved performance over direct predictions across all properties and redshifts. MIG also reproduces galaxy mass distribution functions with higher fidelity, an essential step for accurately predicting integrated quantities such as H I intensity maps. These results establish MIG as an efficient and physically consistent tool for generating mock galaxy catalogues and baryonic tracers in large cosmological volumes for various surveys.
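
The staged structure described in the abstract (central/satellite split, SF/Q classification, then regression on a fractional quantity rescaled by the predicted $M_*$) can be illustrated with a minimal sketch. This is not the authors' code: the halo record layout, the function names `predict_log_hi_mass` and `populate`, and the toy stand-in models are all hypothetical, and returning `None` for quenched galaxies is an assumption made only to keep the example short. In log space, the rescaling step is simply $\log M_{\rm H\,I} = \log(M_{\rm H\,I}/M_*) + \log M_*$.

```python
from typing import Callable, Dict, Optional, Tuple

Halo = Dict[str, float]  # hypothetical halo feature record
# (classifier, log10 M* regressor, log10(M_HI / M*) regressor); all hypothetical
ModelSet = Tuple[Callable[[Halo], bool],
                 Callable[[Halo], float],
                 Callable[[Halo], float]]


def predict_log_hi_mass(halo: Halo, models: ModelSet) -> Optional[float]:
    """Return log10(M_HI) for a star-forming galaxy, or None if quenched.

    Assumption: quenched galaxies are simply skipped in this sketch.
    """
    is_star_forming, log_mstar_model, log_hi_fraction_model = models
    if not is_star_forming(halo):
        return None
    log_mstar = log_mstar_model(halo)            # predicted log10 M*
    log_fraction = log_hi_fraction_model(halo)   # predicted log10(M_HI / M*)
    # Rescale the fractional prediction by the predicted stellar mass.
    return log_fraction + log_mstar


def populate(halo: Halo, models_by_kind: Dict[str, ModelSet]) -> Optional[float]:
    """Route a halo to the central or satellite model set, then predict."""
    kind = "central" if halo["is_central"] else "satellite"
    return predict_log_hi_mass(halo, models_by_kind[kind])


# Toy stand-ins for trained models (illustrative values only, not fitted).
toy_models: ModelSet = (
    lambda h: h["log_mhalo"] < 12.5,   # "classifier"
    lambda h: h["log_mhalo"] - 2.0,    # "M* regressor"
    lambda h: -0.5,                    # "HI fraction regressor"
)
models_by_kind = {"central": toy_models, "satellite": toy_models}

sf_result = populate({"is_central": 1.0, "log_mhalo": 12.0}, models_by_kind)
q_result = populate({"is_central": 0.0, "log_mhalo": 13.0}, models_by_kind)
```

In practice each callable would be replaced by a trained classifier or regressor; the point of the sketch is only the routing order and the fact that the gas mass is recovered from a predicted fraction rather than predicted directly.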

Citation

Das, P.K., Davé, R. and Cui, W., 2026. Populating galaxies into haloes via machine learning on the Simba simulation. Monthly Notices of the Royal Astronomical Society, 545(3), p.staf2096.