A Comparison of Machine Learning Algorithms of Big Data for Time Series Forecasting Using Python
autoregressive integrated moving average; perceptron; support vector machines; recurrent neural networks; regression trees; exponential smoothing models; long short-term memory neural networks; multilayer perceptrons
Open Source Software for Statistical Analysis of Big Data: Emerging Research and Opportunities
This chapter compares the performance of several Big Data techniques for time series forecasting with that of traditional time series models on three Big Data sets. The traditional models, Autoregressive Integrated Moving Average (ARIMA) and exponential smoothing, serve as baselines against which the machine learning methods are evaluated. These methods include regression trees, Support Vector Machines (SVM), Multilayer Perceptrons (MLP), Recurrent Neural Networks (RNN), and Long Short-Term Memory (LSTM) neural networks. Across the three time series data sets used (unemployment rate, bike rentals, and transportation), the study finds that LSTM neural networks performed best. It concludes that Big Data machine learning algorithms applied to time series can outperform traditional time series models. All computations were carried out in Python, one of the most popular open-source platforms for data science and Big Data analysis.
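The kind of comparison described above, a traditional baseline scored against a learned model on held-out data, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's actual procedure: the series is synthetic, simple exponential smoothing stands in for the traditional baselines, and a least-squares model on lagged values stands in for the machine learning methods. The function names, the 240/60 chronological split, the smoothing constant, and the number of lags are all hypothetical choices made for this example.

```python
import numpy as np


def make_series(n=300, seed=0):
    """Synthetic series (illustrative only): linear trend + period-12 cycle + noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    return 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.3, n)


def ses_one_step(series, n_train, alpha=0.5):
    """One-step-ahead simple exponential smoothing forecasts for the test span."""
    level = series[0]
    preds = []
    for i in range(1, len(series)):
        if i >= n_train:
            preds.append(level)  # forecast for time i uses data up to i - 1 only
        level = alpha * series[i] + (1 - alpha) * level
    return np.array(preds)


def lagged_linear_one_step(series, n_train, n_lags=12):
    """One-step-ahead forecasts from a least-squares fit on the previous n_lags values."""
    n = len(series)
    # Row j holds series[j : j + n_lags]; its target is series[j + n_lags].
    lags = np.column_stack([series[k:n - n_lags + k] for k in range(n_lags)])
    y = series[n_lags:]
    X = np.column_stack([np.ones(len(y)), lags])  # add an intercept column
    split = n_train - n_lags  # rows whose target falls inside the training span
    coef, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
    return X[split:] @ coef


def rmse(actual, predicted):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


series = make_series()
n_train = 240  # chronological split: first 240 points train, last 60 test
actual = series[n_train:]

ses_preds = ses_one_step(series, n_train)
lag_preds = lagged_linear_one_step(series, n_train)

rmse_ses = rmse(actual, ses_preds)
rmse_lag = rmse(actual, lag_preds)

print(f"Simple exponential smoothing RMSE: {rmse_ses:.3f}")
print(f"Lagged linear model RMSE:          {rmse_lag:.3f}")
```

The split is chronological rather than random, so every test-span forecast uses only observations that precede it. Both models are scored with the same error measure on the same held-out span. Results on this synthetic series say nothing about the three data sets studied in the chapter.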