New Paper: Forecasting reconciliation with a top-down alignment of independent level forecasts


The fuma paper is accepted in Journal of the Operational Research Society

Our fuma paper is accepted in the Journal of the Operational Research Society.

Xiaoqian Wang, Yanfei Kang, Fotios Petropoulos, Feng Li (2021). The uncertainty estimation of feature-based forecast combinations (in press), Journal of the Operational Research Society. 

[Working paper | R package]

Forecasting is an indispensable element of operational research (OR) and an important aid to planning. The accurate estimation of the forecast uncertainty facilitates several operations management activities, predominantly in supporting decisions in inventory and supply chain management and effectively setting safety stocks. In this paper, we introduce a feature-based framework, which links the relationship between time series features and the interval forecasting performance into providing reliable interval forecasts. We propose an optimal threshold ratio searching algorithm and a new weight determination mechanism for selecting an appropriate subset of models and assigning combination weights for each time series tailored to the observed features. We evaluate our approach using a large set of time series from the M4 competition. Our experiments show that our approach significantly outperforms a wide range of benchmark models, both in terms of point forecasts as well as prediction intervals.


New Paper: Forecast with Forecasts: Diversity Matters

Authors: Yanfei Kang, Wei Cao, Fotios Petropoulos, Feng Li

Abstract: Forecast combination has been widely applied in the last few decades to improve forecast accuracy. In recent years, the idea of using time series features to construct forecast combination models has flourished in the forecasting area. Although this idea has proved beneficial in several forecast competitions such as the M3 and M4 competitions, it may not be practical in many situations. For example, the task of selecting appropriate features to build forecasting models can be a big challenge for many researchers. Even if there is one acceptable way to define the features, existing features are estimated based on the historical patterns, which are doomed to change in the future, or infeasible in the case of limited historical data. In this work, we suggest a change of focus from the historical data to the produced forecasts to extract features. We calculate the diversity of a pool of models based on the corresponding forecasts as a decisive feature and use meta-learning to construct diversity-based forecast combination models. A rich set of time series is used to evaluate the performance of the proposed method. Experimental results show that our diversity-based forecast combination framework not only simplifies the modelling process but also achieves superior forecasting performance.
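As a rough illustration of extracting features from the produced forecasts rather than from the history, the Python sketch below computes a pairwise diversity value for a small pool of forecasts. The measure used here (the mean squared difference between two methods' forecasts over the horizon), the function name `pairwise_diversity`, and the toy forecasts are all assumptions made for this example; the paper's exact definition and scaling may differ.

```python
from itertools import combinations


def pairwise_diversity(forecasts):
    """Mean squared difference between every pair of forecast vectors.

    `forecasts` maps a method name to its h-step-ahead forecasts.
    This definition is an illustrative assumption, not the paper's exact formula.
    """
    diversity = {}
    for (name_a, fc_a), (name_b, fc_b) in combinations(sorted(forecasts.items()), 2):
        if len(fc_a) != len(fc_b):
            raise ValueError("all forecasts must share the same horizon")
        squared_diffs = [(a - b) ** 2 for a, b in zip(fc_a, fc_b)]
        diversity[(name_a, name_b)] = sum(squared_diffs) / len(squared_diffs)
    return diversity


# Hypothetical forecasts from three methods over a horizon of 4.
forecasts = {
    "naive": [10.0, 10.0, 10.0, 10.0],
    "drift": [10.5, 11.0, 11.5, 12.0],
    "mean": [9.0, 9.0, 9.0, 9.0],
}
features = pairwise_diversity(forecasts)
for pair, value in features.items():
    print(pair, round(value, 3))
```

The resulting dictionary has one value per pair of methods, so a pool of M methods yields M(M-1)/2 diversity features that a meta-learner could take as input.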

Links: Working Paper


We are presenting at ISF2020 Invited Session

Our lab members will be presenting our work at the invited session of the 40th International Symposium on Forecasting virtually.

Session: Forecast Combination

Time: October 26, Monday, 17:00-18:00 GMT+8

Detailed Schedule:


  • Yanfei Kang (Speaker) Associate Professor, School of Economics and Management, Beihang University
  • Xiaoqian Wang (Speaker) PhD student, Beihang University
  • Xixi Li (Speaker) The University of Manchester
  • Feng Li (Speaker) Assistant Professor at School of Statistics and Mathematics, Central University of Finance and Economics

Chair: Yanfei Kang

Forecast with forecasts: diversity matters

  • Yanfei Kang (Speaker) Associate Professor, School of Economics and Management, Beihang University

Forecast combination has been widely applied in the last few decades to improve forecast accuracy. In recent years, the idea of using time series features to construct forecast combination models has flourished in the forecasting area. Although this idea has proved beneficial in several forecast competitions such as the M3 and M4 competitions, it may not be practical in many situations. For example, the task of selecting appropriate features to build forecasting models can be a big challenge for many researchers, and the features can be hard to interpret, so that little valuable information can be drawn from them. Hence, it is crucially important to improve the interpretability of forecast combination, making it feasible in practical applications. In this work, we treat the diversity of a pool of algorithms as an alternative to state-of-the-art time series features, and use meta-learning to construct diversity-based forecast combination models. A rich set of time series is used to evaluate the performance of the proposed method. Experimental results show that our diversity-based combination forecasting framework not only simplifies the modeling process but also achieves superior forecasting performance.

Distributed ARIMA Models for Ultra-long Time Series

Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle challenges associated with forecasting ultra-long time series by utilizing the industry-standard MapReduce framework. The proposed model combination approach facilitates distributed time series forecasting by combining the local estimators of ARIMA (AutoRegressive Integrated Moving Average) models delivered from worker nodes and minimizing a global loss function. In this way, instead of unrealistically assuming the data generating process (DGP) of an ultra-long time series stays invariant, we make assumptions only on the DGP of subseries spanning shorter time periods. We investigate the performance of the proposed distributed ARIMA models on an electricity demand dataset. Compared to ARIMA models, our approach results in significantly improved forecasting accuracy and computational efficiency both in point forecasts and prediction intervals, especially for longer forecast horizons. Moreover, we explore some potential factors that may affect the forecasting performance of our approach.
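The split, estimate locally, then combine pattern described above can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the paper's method: it fits an AR(1) model without intercept instead of a full ARIMA model, runs in a single process rather than on MapReduce, and combines the local estimates with length-proportional weights instead of minimizing a global loss function. All function names and the simulated data are hypothetical.

```python
import random


def fit_ar1(series):
    """Least-squares estimate of phi in y_t = phi * y_{t-1} + e_t (no intercept)."""
    numerator = sum(prev * cur for prev, cur in zip(series[:-1], series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator


def split_into_subseries(series, n_parts):
    """Split a series into n_parts contiguous, nearly equal subseries."""
    size, remainder = divmod(len(series), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < remainder else 0)
        parts.append(series[start:end])
        start = end
    return parts


def combine_local_estimates(estimates, lengths):
    """Length-weighted average of local estimates (an assumed, simplified combine step)."""
    total = sum(lengths)
    return sum(est * n / total for est, n in zip(estimates, lengths))


# Simulate a long AR(1) series with a hypothetical true coefficient of 0.6.
rng = random.Random(42)
true_phi, series = 0.6, [0.0]
for _ in range(20000):
    series.append(true_phi * series[-1] + rng.gauss(0.0, 1.0))

subseries = split_into_subseries(series, n_parts=8)   # one chunk per worker
local_phis = [fit_ar1(part) for part in subseries]    # local estimators
global_phi = combine_local_estimates(local_phis, [len(p) for p in subseries])
print(round(global_phi, 3))
```

Each local fit only assumes that the data generating process is stable within its own subseries, which mirrors the motivation given in the abstract.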

Improving forecasting with sub-seasonal time series patterns

Time series forecasting plays an increasingly important role in modern business decisions. In today’s data-rich environment, people often want to choose the optimal forecasting model for their data. However, identifying the optimal model often requires professional knowledge and experience, making accurate forecasting a challenging task. To reduce the reliance on model selection, we propose a simple and reliable algorithm that improves forecasting performance. Specifically, we construct multiple time series with different sub-seasons from the original time series. These derived series highlight different sub-seasonal patterns of the original time series, making it possible for the forecasting methods to capture diverse patterns and components of the data. Subsequently, we make extrapolation forecasts for these multiple time series separately with classical statistical models (ETS or ARIMA). Finally, forecasts of these multiple time series are averaged together with equal weights. We evaluate our approach on the widely used competition datasets M1, M3, and M4; for both point and interval predictions, it improves forecasting performance over the benchmarks across the full forecast horizon. We also study which time series patterns are more suitable for our method.

Feature-based Bayesian Forecasting Model Averaging

  • Feng Li (Speaker) Assistant Professor at School of Statistics and Mathematics, Central University of Finance and Economics

In this work, we propose a feature-based Bayesian forecasting model averaging framework (febama). Our Bayesian framework estimates weights of the feature-based forecasting combination via a Bayesian log predictive score, in which the optimal forecasting combination is connected and determined by time-series features from historical information. In particular, we utilize the prior knowledge of the coefficients of time-series features. We use an efficient Bayesian variable selection method to weight important features that may affect the forecasting combinations. As a result, our approach has better interpretability compared to other black-box forecasting combination schemes. Our framework is also more computationally efficient because the log predictive score and time-series features are calculated in the offline phase. We apply our framework to stock market data and M4 competition data. Based on our structure, a simple maximum-a-posteriori scheme outperforms the optimal prediction pools (Geweke and Amisano, 2011) and simple averaging, and Bayesian variable selection further enhances the forecasting performance.
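To make the idea of feature-dependent combination weights scored by a log predictive score more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the weights are a softmax of linear functions of the features, that each model issues a normal predictive density, and that the last model acts as a reference with score zero. None of these choices, nor the names `log_predictive_score` and `betas`, are taken from the febama implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normal_pdf(y, mean, sd):
    """Density of a normal predictive distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_predictive_score(observations, features, predictions, betas):
    """Sum over time of log( sum_k w_k(features_t) * p_k(y_t) ).

    betas[k] holds the feature coefficients of model k. The last model is a
    reference with score 0 (an assumption made here for identifiability).
    """
    score = 0.0
    for y, x, preds in zip(observations, features, predictions):
        linear = [sum(b * xi for b, xi in zip(beta, x)) for beta in betas] + [0.0]
        weights = softmax(linear)
        mixture = sum(w * normal_pdf(y, m, s) for w, (m, s) in zip(weights, preds))
        score += math.log(mixture)
    return score


# Hypothetical data: 3 time points, 2 features (intercept + one feature), 2 models.
observations = [1.0, 2.0, 1.5]
features = [[1.0, 0.2], [1.0, 0.8], [1.0, 0.5]]
predictions = [  # (mean, sd) of each model's predictive density at each time
    [(1.1, 0.5), (0.5, 1.0)],
    [(1.8, 0.5), (1.0, 1.0)],
    [(1.4, 0.5), (0.8, 1.0)],
]
score_equal = log_predictive_score(observations, features, predictions, [[0.0, 0.0]])
print(round(score_equal, 4))
```

With all coefficients at zero the weights reduce to simple averaging. A maximum-a-posteriori estimate would maximize this score plus a log prior over `betas`; that optimization step is omitted here.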


New Paper: Distributed ARIMA Models for Ultra-long Time Series


The forecasting with time series imaging paper is accepted in Expert Systems with Applications

Our paper on forecasting with a time series imaging approach is accepted in Expert Systems with Applications.

Xixi Li, Yanfei Kang, and Feng Li*. (2020). Forecasting with time series imaging, Expert Systems with Applications. (In Press)

[Working Paper | Software]

Feature-based time series representations have attracted substantial attention in a wide range of time series analysis methods. Recently, the use of time series features for forecast model averaging has been an emerging research focus in the forecasting community. Nonetheless, most of the existing approaches depend on the manual choice of an appropriate set of features. Exploiting machine learning methods to extract features from time series automatically becomes crucial in state-of-the-art time series analysis. In this paper, we introduce an automated approach to extract time series features based on time series imaging. We first transform time series into recurrence plots, from which local features can be extracted using computer vision algorithms. The extracted features are used for forecast model averaging. Our experiments show that forecasting based on automatically extracted features, with less human intervention and a more comprehensive view of the raw time series data, yields highly comparable performances with the best methods in the largest forecasting competition dataset (M4) and outperforms the top methods in the Tourism forecasting competition dataset.

Feature-based time series forecasting has proved highly promising, primarily through the extraction and selection of an appropriate set of features. Nonetheless, traditional time series feature extraction requires manual design of feature metrics, which is typically complicated for time series forecasting practitioners. Known features used in the time series forecasting literature are global characteristics of a time series, which may ignore important local patterns. Evidence from the literature further indicates that feature-based forecast combination might not perform as well as simple averaging when the feature extraction and selection are not properly conducted.

We propose an automated time series imaging feature extraction approach with computer vision algorithms, and our experimental results show that our approach works well for forecast combination. An innovative point of our approach over other feature-based time series forecasting methods is that time series features are extracted automatically from time series images, which are obtained using recurrence plots. In principle, any image feature extraction algorithm is applicable to our proposed framework. We employ two widely used algorithms to extract features from time series images, namely the spatial bag-of-features (SBoF) model and deep convolutional neural networks (CNN).

The SBoF model, combining the scale-invariant feature transform (SIFT) algorithm, the locality-constrained linear coding (LLC) method, spatial pyramid matching (SPM), and max pooling, can capture both global and local characteristics of images. The traditional SBoF model is a fast, industry-level model in computer vision applications. One may notice that the features extracted with the traditional SIFT-based model perform better than those from the deep CNN model in some scenarios with our testing data. It is worth mentioning, however, that the SIFT method is not a fully automated image feature extraction process, because it requires a careful specification of four steps, namely (1) detecting extreme values in the scale spaces, (2) finding the key points, (3) assigning feature directions, and (4) describing key points. Moreover, the SIFT algorithm is patent protected, which means other open-source programs could not incorporate it without the patent owner’s permission. Having an alternative approach with highly comparable performance but without patent restrictions is important to time series forecasters.

The alternative feature extraction algorithm based on deep CNN is an automated process once the source task is confirmed. We use transfer learning to borrow the information of well pre-trained neural network models for image classification, which avoids the complication of setting the network structure and tuning the hyper-parameters. Unlike traditional CNN tasks that require fine-tuning and massive computation, we transfer the convolutional layers and fully-connected layers from the ImageNet competition results to our task. Hence only one new adaptation layer needs to be trained, which significantly saves computational power.

Although the aims of the source task in ImageNet and the target task of time series forecasting are naturally different, the image features generated from time series share similar shapes and angles with the images of real objects. This explains why we could transfer a model from a different task to time series forecasting. In practice, forecasting practitioners may train a customized CNN model to further improve the forecasting performance if a rich collection of time series is available.

Another significant merit of using deep CNN and transfer learning for time series feature extraction is that the pre-trained neural network models (e.g., on ImageNet) are continuously updated and improved in the image processing literature. Thus, we believe that this line of automated time series feature extraction approaches has great potential in the future.

In this paper, we use the features extracted from recurrence plots to reveal the characteristics of the corresponding time series. The recurrence plot for a given time series displays its dynamics based on the distance correlations within the time series. However, other features such as cross-correlation coefficients can also be used to generate cross-correlation recurrence plots. Thus, multi-channel images, with more comprehensive information, can be obtained for each time series, which can potentially improve the feature extraction and feature-based forecast combination performances. Therefore, time series forecasting based on multi-channel imaging can be one potential extension of our current work.
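For readers unfamiliar with recurrence plots, the Python sketch below builds the textbook binary version for a univariate series: entry (i, j) is marked when the values at times i and j are within a chosen distance of each other. The threshold, the absence of a time-delay embedding, and the function name `recurrence_plot` are simplifying assumptions for this example; the variant used in the paper may differ.

```python
def recurrence_plot(series, threshold):
    """Binary recurrence matrix: entry (i, j) is 1 when |x_i - x_j| <= threshold."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


# A short hypothetical periodic series; the threshold value is arbitrary.
series = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
plot = recurrence_plot(series, threshold=0.1)
for row in plot:
    print("".join("#" if cell else "." for cell in row))
```

The matrix is symmetric with a filled main diagonal, and periodic behaviour in the series shows up as lines parallel to that diagonal. This is the kind of local texture an image feature extractor can pick up.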

The forecasting framework based on time series image features is in line with the work on feature-based forecast model averaging (FFORMA), which uses 42 manual time series features and nine forecasting methods to optimize the weights for forecast combination. FFORMA won second place in the M4 competition. To be consistent and comparable, in our study, we employ the same set of forecasting methods in the M4 dataset. However, we want to mention that the choice of candidate forecasting methods for forecast combination also requires expert knowledge and practical experience. The performance of forecast combinations depends on the accuracy of individual forecasting methods and the diversity among them, since the merits of forecast combination stem from the independent information across multiple forecasts. How to automatically select an appropriate set of candidate methods for combination is another interesting direction for future research.

In our experiments, all the time series are independent data. Therefore we treat the time series features as independent images and apply them to the CNN framework, which is also used for classifying objects in ImageNet. A further extension of our work is to extend time series forecasting with imaging to (1) forecasting with time-varying image features, and (2) hierarchical time series or multivariate time series with recurrent dependence. In both cases, a hierarchical image classification framework that mixes CNN and RNN could be further explored.

We make our code publicly available. Making it open-source can enrich the toolboxes of forecasting support systems by providing a competitive alternative to the existing feature-based time series forecasting methods.


The GRATIS paper is accepted in Statistical Analysis and Data Mining

Our GRATIS paper for GeneRAting TIme Series with diverse and controllable characteristics is accepted in the ASA data science journal: Statistical Analysis and Data Mining.

Yanfei Kang, Rob J Hyndman, and Feng Li*. (2020). GRATIS: GeneRAting TIme Series with diverse and controllable characteristics, Statistical Analysis and Data Mining. (In Press)

[Journal version | Working Paper | R Package | Web App]

The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires either collecting or simulating a diverse set of time series benchmarking data to enable reliable comparisons against alternative approaches. We propose GeneRAting TIme Series with diverse and controllable characteristics, named GRATIS, with the use of mixture autoregressive (MAR) models. We simulate sets of time series using MAR models and investigate the diversity and coverage of the generated time series in a time series feature space. By tuning the parameters of the MAR models, GRATIS is also able to efficiently generate new time series with controllable features. In general, as a costless surrogate to the traditional data collection approach, GRATIS can be used as an evaluation tool for tasks such as time series forecasting and classification. We illustrate the usefulness of our time series generation process through a time series forecasting application.

We have proposed an efficient simulation method, GRATIS, for generating time series with diverse characteristics requiring minimal input of human effort and computational resources. Our generated dataset can be used as benchmarking data in the time series domain, which functions similarly to other machine learning data repositories. The simulation method is based on mixture autoregressive models where the parameters are assigned with statistical distributions. In such a way, we provide a general benchmarking tool serving for advanced time series analysis where a large collection of benchmarking data is required, including forecasting comparison, model averaging, and time series model training with self-generated data. To the best of our knowledge, this is the first paper that thoroughly studies the possibility of generating a rich collection of time series. Our method not only generates realistic time series data but also gives a higher coverage of the feature space than existing time series benchmarking data.
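As a minimal sketch of what simulating from a mixture autoregressive (MAR) model involves, the Python function below draws one mixture component at each time step and generates the next value from that component's autoregression with Gaussian noise. The function name `simulate_mar`, the burn-in length, and the parameter values are hypothetical choices for this example and are not taken from the GRATIS R package.

```python
import random


def simulate_mar(n, weights, intercepts, ar_coefs, sigmas, seed=0, burn_in=100):
    """Simulate n points from a mixture autoregressive (MAR) model.

    At each step one component k is drawn with probability weights[k], and the
    next value is intercepts[k] + sum_i ar_coefs[k][i] * y[t-1-i] + sigmas[k] * noise.
    """
    rng = random.Random(seed)
    max_lag = max(len(coefs) for coefs in ar_coefs)
    values = [0.0] * max_lag  # zero start-up values, discarded with the burn-in
    components = list(range(len(weights)))
    for _ in range(n + burn_in):
        k = rng.choices(components, weights=weights)[0]
        mean = intercepts[k] + sum(
            coef * values[-1 - lag] for lag, coef in enumerate(ar_coefs[k])
        )
        values.append(mean + sigmas[k] * rng.gauss(0.0, 1.0))
    return values[len(values) - n:]


# Hypothetical two-component model: an AR(1) and an AR(2) component.
simulated = simulate_mar(
    n=200,
    weights=[0.7, 0.3],
    intercepts=[0.0, 1.0],
    ar_coefs=[[0.5], [-0.3, 0.2]],
    sigmas=[1.0, 0.5],
    seed=123,
)
print(len(simulated), round(min(simulated), 2), round(max(simulated), 2))
```

Drawing the weights, coefficients, and variances themselves from probability distributions before each call, as the paragraph above describes, would produce a collection of series with diverse characteristics.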

The GRATIS approach is also able to efficiently generate new time series with controllable target features, by tuning the parameters of MAR models. This is particularly useful in time series classification or specific areas where only some features are of interest. This procedure is the inverse of feature extraction, which usually requires much computational power. Our approach of generating new time series from given features can speed up the computation by a factor of 40, making feature-driven time series analysis tasks feasible.

We further show that the GRATIS scheme can serve as a useful resource for time series applications. In particular, we present a novel time series forecasting approach by exploiting the time series features of current generated time series. Our application also sheds light on a potential direction to forecasting with private data where the model training could be purely based on our generated data. The reader should take home the message that simulated series are similar to the original series in terms of features, but this does not mean they visually look alike.

Other potential extensions include: (i) GRATIS with exogenous information via mixture of ARIMA with explanatory variables (ARIMAX) to allow for local patterns due to external events, (ii) GRATIS with multivariate time series by exploring mixtures of vector autoregression models, (iii) GRATIS with cross-sectional information about the time series by exploring the approaches, (iv) extending GRATIS to discrete time series by investigating the mixture of integer-valued autoregressive processes or Poisson autoregression, and (v) using GRATIS to serve as a pre-training process of deep learning methods to save time and improve accuracy.


We have a lab now!

We are so excited to announce that Yanfei and Feng’s group now has an official lab, named KLLAB (pronounced as [col·lab], meaning collaborating), and a website.

Have fun!