Me and My Humble Portfolio

Predict Average Response Time of the Los Angeles Fire Department

Implemented a gradient-boosted decision tree model (XGBoost) and tuned its hyperparameters with a parallel random search.

About the Project

The main goal of this project was to predict the average response time of the Los Angeles Fire Department. The evaluation metric was mean squared error (MSE).
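For reference, MSE is the mean of the squared differences between predicted and observed values, so large misses are penalized more heavily than small ones. The snippet below is a minimal standard-library sketch of that metric; the function name `mean_squared_error` and the sample numbers are illustrative and not taken from the project code.

```python
def mean_squared_error(y_true, y_pred):
    """Return the mean of squared differences between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one observation is required")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Hypothetical response times in seconds, for illustration only.
    observed = [300.0, 420.0, 360.0]
    predicted = [310.0, 400.0, 365.0]
    print(mean_squared_error(observed, predicted))
```

Because the errors are squared, the score is in squared units of the target (for example, seconds squared if response time is measured in seconds).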

Methodology

Imported external district-level data for Los Angeles from the LA Times
Engineered new features using regular expressions, aggregation, and similar techniques
Selected features by repeatedly adding candidate features to a baseline model to see which ones contributed the most
Implemented an XGBoost regression model with the 10 selected features

6 were original features and 4 were newly engineered

Tuned hyperparameters with parallel mapping (hyperparameters tuned: eta, nrounds, max_depth)
Used 10-fold cross-validation, run in parallel, to guard against overfitting
Post-processed predictions by removing negative values, since a response time cannot be negative
Ranked 3rd out of 92 teams
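The tuning steps above (random search, parallel mapping, k-fold cross-validation, and handling negative predictions) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the project's code: a one-parameter ridge slope stands in for XGBoost, the single hyperparameter `lam` stands in for eta, nrounds, and max_depth, negative predictions are clipped to zero as one possible reading of "removing negative values", and every function name here is hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over range(n)."""
    if k < 2 or k > n:
        raise ValueError("k must satisfy 2 <= k <= n")
    start = 0
    for fold in range(k):
        size = n // k + (1 if fold < n % k else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size


def fit_ridge_slope(xs, ys, lam):
    """Stand-in model (NOT XGBoost): fit y ~ w * x with an L2 penalty lam."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cv_mse(lam, xs, ys, k=10):
    """Average validation MSE over k folds for one hyperparameter value."""
    fold_errors = []
    for train, val in k_fold_indices(len(xs), k):
        w = fit_ridge_slope([xs[i] for i in train], [ys[i] for i in train], lam)
        # Assumption: "removing negative values" is modeled as clipping at zero.
        preds = [max(0.0, w * xs[i]) for i in val]
        fold_errors.append(
            sum((ys[i] - p) ** 2 for i, p in zip(val, preds)) / len(val)
        )
    return sum(fold_errors) / len(fold_errors)


def random_search(xs, ys, n_trials=20, k=10, seed=0, workers=4):
    """Sample lam log-uniformly, score candidates in parallel, return the best."""
    rng = random.Random(seed)
    candidates = [10 ** rng.uniform(-3, 2) for _ in range(n_trials)]
    # Threads keep the sketch portable; a process pool would be the usual
    # choice for CPU-bound model training.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda lam: cv_mse(lam, xs, ys, k), candidates))
    best = min(range(n_trials), key=scores.__getitem__)
    return candidates[best], scores[best]


if __name__ == "__main__":
    # Synthetic data for illustration only: y = 3x plus Gaussian noise.
    data_rng = random.Random(42)
    xs = [i / 10 for i in range(1, 101)]
    ys = [3 * x + data_rng.gauss(0, 0.5) for x in xs]
    best_lam, best_score = random_search(xs, ys)
    print(best_lam, best_score)
```

Each candidate is scored independently of the others, which is what makes random search straightforward to map across workers. Averaging the error over held-out folds means the chosen setting is judged on data the model did not see during fitting.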

Further Details

For more information, check out the Project Markdown here and the Presentation Deck here.