Projects
Visualizing Fantasy Football Matchups (2022):
Used Python and the ESPN Football API to build a dataset of fourteen weeks' worth of fantasy football league data. Stored the data in Excel and used Tableau to visualize overall team records and cumulative points scored, and to identify unlikely wins and unfortunate losses.
Food Delivery Case Study (2021):
Analyzed a dataset of two months of food delivery orders to identify opportunities to improve company performance. Identified key performance indicators and performance trends. Summarized findings in a slideshow. Wrote reproducible functions to validate, clean, and analyze the data, allowing on-demand analysis of similar datasets.
Classifying Hard Drive Reliability (2020):
Predicted and classified hard drive reliability using drive performance data published by the cloud storage company Backblaze. Working on a team of four, identified the primary indicators of early drive failure and developed a model to predict early failures from SMART attributes (drive performance metrics).
NLP with GitHub Repositories (2019):
Collected 400 repositories from GitHub.com (via API) related to the Advent of Code challenge. Extracted the README files and converted them to JSON format. Used this data to design and configure a custom database. Used natural language processing to identify keywords associated with each programming language. Built a classification model to predict the programming language of each repository.
Fitbit Time Series Analysis (2019):
Analyzed an individual’s Fitbit data to determine their physical attributes and fitness activity patterns. Cleaned and structured CSV files into a pandas DataFrame, then used time series models to predict future activity levels. Delivered findings in a two-slide presentation to a general audience.
Predicting Home Price with Clustering (2019):
Experimented with clustering on a Zillow property database to find patterns between price prediction error and property characteristics. Engineered new features and presented findings to other data scientists in a Jupyter Notebook.
Identifying Churning Customers with Classification (2019):
Created and evaluated several classification models to predict whether a customer would churn. Identified patterns and characteristics of churning customers. Delivered a slideshow presentation tailored to company executives.