Analysis of Feature Selection Across Different Machine Learning Algorithms

For my machine learning class, my partner and I chose to do an analysis on how different machine learning algorithms handle feature selection and dimensionality reduction implicitly or explictly. We chose 4 machine learning models: support vector machines, Bayesian networks, and logistic regression with L1 and L2 norms. We compared the results to existing preprocessing techniques that are used for dimensionality reduction - Principle Component Analysis and Linear Discriminant Analysis. The linear transformation using PCA is shown below.


Feature importance and dimensionality reduction are important for effectively visualizing and interpreting real-world datasets, as well as improving prediction accuracy. Using the Students Academic Performance Dataset from Kaggle, we implemented support vector machines, Bayesian networks, and logistic regression with L1 and L2 norms. These algorithms with various hyperparamets were implemented to determine feature importance and predictive power of the models. These results were then compared to the important features determined from the preprocessing techniques: Principle Component Anaylsis (PCA) and Linear Discriminant Analysis (LDA). Using R2 values, which explain the percentage of the response variable explained by the variation in the model, it was found that L1 regularized logistic regression and Bayesian networks best fit the data. The higher R2 values identify that these algorithms had the most predictive power, allowing them to identify the ost important features in the dataset. SVM, LDA and L2 regularized logistic regression least accurately fit the data with SVM having the next highest R2 value and L1 having the lowerst. The important features between Bayesian networks, PCA, and LDA were compared, while the irrelevant features determined by SVM and L1/L2 logistic regression were also evaluated.

Moving forward, I would like to do more analysis in the sci-kit learn library feature_selection. I would also like to use multiple datasets and have more baselines to compare results to.

-- Link to paper --

About Me

I am a computer science and cognitive science double major at Swarthmore College graduating in the Spring of 2020.
In addition, I am also fascinated by psychology, neuroscience, and nutrition. In my free time, I like to play baksetball and workout, cook, make/listen to music, and go chasing sunsets. I play many instruments, but mainly the piano and GuZheng. My favorite classical pieces include Grades etudes de Paganini by Franz Liszt, all three of Rachmaninioff's Piano Concertos, and Chopin Scherzo No.3 in C Sharp Minor, Op.39.

In computer science, I am passionate about parallel and distributed systems and machine learning applications. In cognitive science, I am fascinated by the microbiome of the gut, the psychology of language, as well as physiological psychology.

On a journey to quench my thirst for knowledge