Statistics for Data Science techniques for predicting plant genes involved in secondary metabolite production