Predict Average Response Time of the Los Angeles Fire Department

Implemented a gradient boosting decision tree model and tuned its hyperparameters with a parallelized random search.

View on GitHub

About the Project

The main goal of this project was to predict the average response time of the Los Angeles Fire Department. Predictions were evaluated by mean squared error (MSE).
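
For reference, MSE averages the squared gap between each observed and predicted response time. The symbols n, y_i, and \hat{y}_i are notation introduced here for illustration, not taken from the project materials:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

Because the errors are squared, a few badly mispredicted response times raise the score far more than many small misses.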

Methodology

  • Imported external district data for Los Angeles from the LA Times
  • Engineered new features using regular expressions, aggregation, and similar techniques
  • Selected features by repeatedly adding candidates to a baseline model and keeping those that contributed the most
  • Trained an XGBoost regression model on the 10 selected features
    • 6 were original features, 4 were newly created
  • Tuned hyperparameters (eta, nrounds, max_depth) with parallel mapping
  • Ran 10-fold cross-validation in parallel to guard against overfitting
  • Post-processed predictions by removing negative values, since a response time cannot be negative
  • Ranked 3rd out of 92 teams
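
The tuning steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: a tiny one-feature boosted-tree regressor stands in for XGBoost, the data is synthetic, the search ranges are invented, and "removing negative values" is interpreted as flooring predictions at zero. Only the hyperparameter names (eta, nrounds, max_depth), the random search, the parallel map, and the 10 folds come from the description. Threads are used so the sketch runs anywhere; CPU-bound tuning would normally use a process pool or the library's own parallelism.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def mean(values):
    return sum(values) / len(values)


def mse(y_true, y_pred):
    """Mean squared error, the project's evaluation metric."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def clip_negative(preds):
    """Assumed post-processing: floor negative predicted times at zero."""
    return [max(p, 0.0) for p in preds]


def build_tree(xs, rs, max_depth):
    """Fit a tiny one-feature regression tree to the residuals rs."""
    values = sorted(set(xs))
    if max_depth == 0 or len(values) < 2:
        return mean(rs)  # leaf: predict the mean residual
    best_sse, best_thr = None, None
    for thr in values[1:]:
        left = [r for x, r in zip(xs, rs) if x < thr]
        right = [r for x, r in zip(xs, rs) if x >= thr]
        ml, mr = mean(left), mean(right)
        sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best_sse is None or sse < best_sse:
            best_sse, best_thr = sse, thr
    lo = [(x, r) for x, r in zip(xs, rs) if x < best_thr]
    hi = [(x, r) for x, r in zip(xs, rs) if x >= best_thr]
    return (
        best_thr,
        build_tree([x for x, _ in lo], [r for _, r in lo], max_depth - 1),
        build_tree([x for x, _ in hi], [r for _, r in hi], max_depth - 1),
    )


def tree_predict(tree, x):
    """Walk from the root to a leaf; internal nodes are (threshold, left, right)."""
    while isinstance(tree, tuple):
        thr, left, right = tree
        tree = left if x < thr else right
    return tree


def fit_predict(train, test_x, eta, nrounds, max_depth):
    """Toy gradient boosting for squared error (stand-in for XGBoost)."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    base = mean(ys)
    preds = [base] * len(xs)
    trees = []
    for _ in range(nrounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = build_tree(xs, residuals, max_depth)
        trees.append(tree)
        preds = [p + eta * tree_predict(tree, x) for p, x in zip(preds, xs)]
    return [base + eta * sum(tree_predict(t, x) for t in trees) for x in test_x]


def cv_mse(params, data, k=10):
    """Average held-out MSE over k folds for one hyperparameter setting."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        preds = fit_predict(train, [x for x, _ in held_out], **params)
        scores.append(mse([y for _, y in held_out], clip_negative(preds)))
    return mean(scores)


def random_search(data, n_trials=6, seed=0, workers=4):
    """Sample settings at random and score them with a parallel map."""
    rng = random.Random(seed)
    candidates = [
        {
            "eta": rng.uniform(0.05, 0.5),   # hypothetical search range
            "nrounds": rng.randint(5, 20),   # hypothetical search range
            "max_depth": rng.randint(1, 3),  # hypothetical search range
        }
        for _ in range(n_trials)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda p: cv_mse(p, data), candidates))
    best = min(range(n_trials), key=scores.__getitem__)
    return candidates[best], scores[best]


def make_toy_data(n=40, seed=42):
    """Synthetic (feature, response time) pairs; not the LAFD data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 3.0 + 0.5 * x + rng.gauss(0, 0.3)))
    return data


if __name__ == "__main__":
    best_params, best_score = random_search(make_toy_data())
    print("best parameters:", best_params)
    print("cross-validated MSE:", best_score)
```

Random search samples each setting independently, so the candidates can be scored in any order and mapped across workers without coordination. That independence is what makes the parallel map a natural fit for this step.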

Further Details

For more information, check out the Project Markdown here and the Presentation Deck here.
