Automated feature selection for a machine learning approach toward modeling a mosquito distribution
This paper introduces a data science method for determining the set of features used to train a support vector machine (SVM). The SVM models the relationship between the distribution of one particular invasive mosquito species and climate data. Two biologists selected training data on the basis of their domain expertise, and their selection was compared with the result of the data science simulation. The paper then explores how data science might be used to generate new knowledge and identifies the weaknesses of this technique.