Title

Generalized Linear Model for Automobile Fatality Rate Prediction in R

Document Type

Book Chapter

Comments

Is part of Open Source Software for Statistical Analysis of Big Data

Keywords

variable grouping; GIGO; 80/20; interaction term; k-fold cross validation; 75/25 validation; holdout sample; backward selection; forward selection; calendar year validation; stepwise variable selection

Identifier Data

https://doi.org/10.4018/978-1-7998-2768-9.ch005

Publisher

IGI Global

Publication Source

Open Source Software for Statistical Analysis of Big Data: Emerging Research and Opportunities

Abstract

This chapter demonstrates the descriptive and statistical modeling functions in R. Automobile fatal accident data for the United States are extracted from the Fatality Analysis Reporting System (FARS). The resulting model is used to identify the significant factors contributing to death when a fatal crash occurs. First, descriptive analysis is performed with basic R functions and packages. Then, a generalized linear model (GLM) with a logit link function is explored and constructed. Finally, multiple validation metrics are introduced and calculated to check the reasonableness and accuracy of the predictions. The focus of this chapter is to demonstrate the power and flexibility of the most popular Open Source Statistical Software (OSSS) through a real data analysis.
