Demand prediction for shared mobility services using time series modelling

Figure 1. The methodology used in this study

People are becoming increasingly interested in new mobility solutions, such as shared mobility, which are both cheaper and more environmentally friendly than traditional transportation methods.

According to multiple reports, the worldwide shared-mobility market is estimated to be worth over €550 billion by 2025 [1]. However, as promising as ride-hailing might be, it also has its challenges, including longer passenger waiting times, longer trips due to different pickup points, and vacant vehicles driving around aimlessly while waiting for a request [2]. The main objective of this project was to analyse and investigate the possibilities of optimising shared mobility using historical data and multiple predictive techniques. A further goal was to predict the total number of requests generated per hour in the near future. Applying the latest ICT technologies to the transportation industry contributes to the development of better transport infrastructure, with safer roads and better traffic management.

The accompanying diagram outlines the methodology used throughout the study, starting with data collection, pre-processing and data visualisation. In the third step, where necessary, the time series is differenced to make it stationary, and the optimal combination of model parameters is identified. This is followed by training and building the forecasting model. Finally, the trained model predicts future values, which are validated by comparing them with the testing dataset.
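The differencing step can be illustrated with a minimal sketch. It assumes hourly data with a 24-hour cycle; the `difference` helper, the lag values and the synthetic series are illustrative assumptions and are not taken from the study.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Synthetic hourly demand (illustrative only, not the study's data):
# a linear trend plus a repeating 24-hour cycle.
hourly = [0.5 * t + 10 * math.sin(2 * math.pi * t / 24) for t in range(96)]

# Seasonal differencing at lag 24 cancels the daily cycle; the linear
# trend is left behind as a constant offset.
seasonal_diff = difference(hourly, lag=24)

# A first difference then removes that constant offset.
stationary = difference(seasonal_diff, lag=1)
```

In practice a statistical test would normally decide whether differencing is needed at all; the sketch only shows the transformation itself.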

Using the data compiled by a Maltese ride-hailing company, this study first investigates where and how pickup requests were made. The data was subsequently examined for any time-series patterns [3]. The three main patterns considered to be the pillars of any time series are trend, seasonality and cycles. When the requests data is visualised geographically, it can be noted that most requests were booked around tourist areas, mainly Sliema and Valletta. The results suggest that exogenous, or independent, data such as temperature and public holidays do not affect the predictive model. The models were validated by splitting the data into a training set and a testing set: each model is fitted on the training set and its forecasts are compared with the observed values in the testing set. Because of the seasonality in the time series, the most suitable models were the Holt-Winters model [4], the seasonal autoregressive integrated moving average (SARIMA) model [5] and Facebook Prophet [6]. Across the performance metrics used, the Holt-Winters model outperformed the other models, with an overall mean absolute error (MAE) of 8.039 and a root mean squared error (RMSE) of 11.159.
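The validation procedure can be sketched as follows, under stated assumptions: a chronological train/test split, a hand-written additive Holt-Winters forecaster in one common textbook formulation, and the two error metrics. The function names, smoothing constants and synthetic data are hypothetical and are not the study's implementation, which may well have relied on a forecasting library.

```python
import math


def train_test_split(series, test_size):
    """Chronological split: the last `test_size` points form the test set."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series[:-test_size], series[-test_size:]


def holt_winters_additive(train, period, horizon,
                          alpha=0.3, beta=0.05, gamma=0.2):
    """Additive Holt-Winters forecast; smoothing constants are assumed."""
    if len(train) < 2 * period:
        raise ValueError("need at least two full seasons of training data")
    first, second = train[:period], train[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [y - level for y in first]
    for t in range(period, len(train)):
        y, prev_level, s_old = train[t], level, seasonal[t % period]
        level = alpha * (y - s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y - level) + (1 - gamma) * s_old
    n = len(train)
    # Forecast h steps ahead, reusing the latest seasonal component.
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Synthetic hourly requests (illustrative only): flat level plus daily cycle.
hourly = [50 + 10 * math.sin(2 * math.pi * t / 24) for t in range(24 * 8)]
train, test = train_test_split(hourly, test_size=24)
forecast = holt_winters_additive(train, period=24, horizon=len(test))
print(f"MAE={mae(test, forecast):.3f}  RMSE={rmse(test, forecast):.3f}")
```

The split is chronological rather than random because a forecasting model must be evaluated on observations that come after those it was fitted on.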

Finally, the study suggests that some models do not work as expected due to the lack of yearly seasonality in the data. With this in mind, a report generator has been developed to forecast data in real time using the three above-mentioned predictive models.


[1] “Mobility Scooter Market – Global Industry Analysis, Share, Forecast – 2024.” [Online].

[2] B. Schaller, “The new automobility: Lyft, Uber and the future of American cities,” 2018.

[3] V. Shah, “A Comparative Study of Univariate Time-series Methods for Sales Forecasting,” 2020.

[4] P. R. Winters, “Forecasting sales by exponentially weighted moving averages,” Manage. Sci., vol. 6, no. 3, pp. 324–342, 1960.

[5] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel, Time series analysis: forecasting and control, vol. 734. John Wiley & Sons, 2011.

[6] S. J. Taylor and B. Letham, “Forecasting at scale,” Am. Stat., vol. 72, no. 1, pp. 37–45, 2018.

Student: Rudi Camilleri
Course: B.Sc. IT (Hons.) Software Development
Supervisor: Mr. Tony Spiteri Staines
Co-supervisor: Dr. Peter Albert Xuereb