This repository contains a separate term project completed for the course Introduction to Machine Learning, which is offered by the Data Science Master's program of the University of Helsinki. In this term project I evaluated four classifiers from the classifier families neural networks, generalized linear models, random forests and support vector machines on four data sets, each posing a two-class classification problem. The resulting report reproduces part of the analysis done in the paper "Do We Need Hundreds of Classifiers to Solve Real World Classification Problems?" [1].
I originally completed the term project in March 2021. This repository, created in July 2022, contains the work from that time. The generated PDF report shows a more recent date because it was generated later from the already existing R Markdown file.
The directory data contains files provided by the authors of the original article, and the directory Original_paper_results contains results that the original authors obtained for the classifiers and data sets that I used.
[1] Fernández-Delgado, M., Cernadas, E., Barro, S., & Amorim, D. (2014). Do We Need Hundreds of Classifiers to Solve Real World Classification Problems? Journal of Machine Learning Research, 15(90), 3133–3181. http://jmlr.org/papers/v15/delgado14a.html