Projects

Project Portfolio

1. Powerlifting Visualization and Analysis

GitHub Repository: Powerlifting

Dash App: App

Project Synopsis: This was the final project for my visualization design class. I chose a powerlifting dataset, which is linked in the repository's README. The goal was to synthesize what we learned in the course about visualization and build a Plotly Dash app around the dataset. My analysis focused on strength distribution, the relationship between bodyweight and strength, and differences across age groups. Features of the dataset included age group, gender, best squat, best bench press, best deadlift, bodyweight, location, and tournament, among others. Key findings: the men's group showed higher variation in strength than the women's group across lifts, and in both groups strength began to decline around age 40. This project helped me solidify many of the concepts taught in the course, such as color theory, designing easy-to-understand visualizations, and working with the Plotly library.
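As a rough illustration of how strength variation can be compared between groups, the sketch below computes the mean and sample standard deviation of a lift per group. The records and the `spread_by_group` helper are hypothetical and use made-up toy values, not the project dataset or the code in the repository.

```python
from statistics import mean, stdev

# Hypothetical toy records: (gender, best squat in kg).
# These values are invented for illustration and are NOT from the project dataset.
records = [
    ("M", 180.0), ("M", 250.0), ("M", 140.0), ("M", 300.0),
    ("F", 100.0), ("F", 120.0), ("F", 90.0), ("F", 130.0),
]

def spread_by_group(rows):
    """Return {group: (mean, sample standard deviation)} for (group, value) rows."""
    groups = {}
    for group, value in rows:
        groups.setdefault(group, []).append(value)
    return {g: (mean(v), stdev(v)) for g, v in groups.items()}

summary = spread_by_group(records)
for group, (avg, sd) in summary.items():
    print(f"{group}: mean={avg:.1f} kg, sd={sd:.1f} kg")
```

A larger standard deviation for one group means its lifts are more spread out around the group mean, which is the kind of comparison behind the variation finding.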

2. Relational Database Design: Music Industry

Report: Final Report (PDF)

Project Synopsis: This final project for Database Design involved creating a relational database for the music industry from scratch. Key concepts applied included entity-relationship diagrams (ERDs), normalization, and data dictionaries. The database was implemented using MySQL with phpMyAdmin as the interface. Major challenges included defining table relationships and applying normalization, which were addressed through a step-by-step approach. The result was a fully functional and well-structured database that demonstrated practical application of database design principles.

3. Screen Time Analysis: Part One

GitHub Repository: Screen Time

Project Synopsis: This project was inspired by the pervasive problem of high screen time. I wanted to use data analysis to uncover which factors might contribute to it, so I collected data myself through social media platforms. The features I chose were age, gender, sleep time, workout (yes/no), and trying to reduce screen time (yes/no); the output variable was phone screen time. Because the data was collected from younger individuals (ages 16-30) in the state of Indiana, the scope is limited to the Midwest region. For transparency, the collection form stated that the data would be used only for an educational project. My original plan also included predictive modeling of a person's screen usage, but the initial data collection did not yield enough samples for machine learning. I therefore split the project into two parts: the first focuses on exploratory data analysis and the second on machine learning. The first part is complete, and its GitHub repository is linked above. The next phase will explore machine learning once more data is collected. This project could benefit anyone from people curious about which factors affect their screen usage to general data enthusiasts.

4. Statistical Modeling and Analysis for Apartment Rent

Report: Final Report (PDF)

Project Synopsis: This was a group project for my Statistical Learning course. We applied machine learning algorithms to the Apartment for Rent in US dataset (available on the UC Irvine Machine Learning Repository) and compared their performance. I focused on fitting linear regression, while my groupmates ran other models such as smoothing splines, random forests, and support vector machines (SVMs). Results showed that linear regression performed best, with the lowest RMSE. Full details of the project can be found in the attached report.
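For readers unfamiliar with the metric, here is a minimal sketch of how models can be ranked by RMSE (root mean squared error), where lower is better. The rents, predictions, and model names are hypothetical placeholders, not results from the report.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))

# Hypothetical monthly rents and model predictions (made-up values).
actual = [1000.0, 1500.0, 2000.0]
predictions = {
    "model_a": [1100.0, 1400.0, 2100.0],
    "model_b": [900.0, 1800.0, 2000.0],
}

# Score every model on the same held-out values, then pick the lowest RMSE.
scores = {name: rmse(actual, preds) for name, preds in predictions.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Because RMSE squares each error before averaging, a single large miss (as in `model_b`) is penalized more heavily than several small ones.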
