A Hybrid Framework for Sequential Data Prediction with End-to-End Optimization

by Mustafa E. Aydın, et al.

We investigate nonlinear prediction in an online setting and introduce a hybrid model that, through an end-to-end architecture, mitigates two shortcomings of conventional nonlinear prediction/regression methods: the need for hand-designed features and for manual model selection. Specifically, we use recursive structures to extract features from sequential signals while preserving the state information, i.e., the history, and boosted decision trees to produce the final output. The two components are connected end to end, and we jointly optimize the whole architecture using stochastic gradient descent, for which we also provide the backward pass update equations. Concretely, we employ a recurrent neural network (LSTM) for adaptive feature extraction from sequential data and a gradient boosting machine (soft GBDT) for effective supervised regression. Our framework is generic: other deep learning architectures (such as RNNs and GRUs) can be used for feature extraction and other machine learning algorithms for decision making, as long as they are differentiable. We demonstrate the learning behavior of our algorithm on synthetic data and its significant performance improvements over conventional methods on various real-life datasets. Furthermore, we openly share the source code of the proposed method to facilitate further research.
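To make the idea of joint optimization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a scalar tanh recurrent cell in place of the LSTM, and a sum of depth-1 soft decision trees (sigmoid-gated stumps) trained on squared error in place of the soft GBDT. All function and parameter names (`init_params`, `forward`, `backward`, `sgd_step`, `wx`, `wh`, and so on) are hypothetical and chosen only for illustration. The point it illustrates is that, because every stage is differentiable, the error gradient flows from the tree leaves back through the gates and through time into the recurrent weights, so one SGD step updates both parts.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def init_params(num_stumps=2, seed=0):
    rng = random.Random(seed)
    u = lambda: rng.uniform(-0.5, 0.5)
    return {
        "wx": u(), "wh": u(), "b": 0.0,          # recurrent feature extractor
        "a": [u() for _ in range(num_stumps)],   # soft split slopes
        "c": [u() for _ in range(num_stumps)],   # soft split offsets
        "l0": [u() for _ in range(num_stumps)],  # leaf values (gate side)
        "l1": [u() for _ in range(num_stumps)],  # leaf values (other side)
    }

def forward(p, xs):
    # Recurrent pass: hs[t] is the state after consuming xs[t - 1].
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(p["wx"] * x + p["wh"] * hs[-1] + p["b"]))
    h = hs[-1]
    # Each soft stump routes the feature h to its two leaves with
    # probabilities g and 1 - g; the prediction is the sum over stumps.
    gates = [sigmoid(a * h + c) for a, c in zip(p["a"], p["c"])]
    y_hat = sum(g * l0 + (1 - g) * l1
                for g, l0, l1 in zip(gates, p["l0"], p["l1"]))
    return y_hat, hs, gates

def backward(p, xs, y, y_hat, hs, gates):
    # Gradients of the loss 0.5 * (y_hat - y)^2 with respect to all parameters.
    d = y_hat - y
    grads = {"wx": 0.0, "wh": 0.0, "b": 0.0,
             "a": [], "c": [], "l0": [], "l1": []}
    h, dh = hs[-1], 0.0
    for k, g in enumerate(gates):
        ds = d * (p["l0"][k] - p["l1"][k]) * g * (1 - g)
        grads["a"].append(ds * h)
        grads["c"].append(ds)
        grads["l0"].append(d * g)
        grads["l1"].append(d * (1 - g))
        dh += ds * p["a"][k]  # gradient handed back to the feature extractor
    # Backpropagation through time over the recurrent cell.
    for t in range(len(xs), 0, -1):
        dz = dh * (1 - hs[t] ** 2)
        grads["wx"] += dz * xs[t - 1]
        grads["wh"] += dz * hs[t - 1]
        grads["b"] += dz
        dh = dz * p["wh"]
    return grads

def sgd_step(p, xs, y, lr=0.05):
    # One online update of the whole architecture; returns the pre-update loss.
    y_hat, hs, gates = forward(p, xs)
    grads = backward(p, xs, y, y_hat, hs, gates)
    for name, g in grads.items():
        if isinstance(g, list):
            p[name] = [v - lr * gv for v, gv in zip(p[name], g)]
        else:
            p[name] -= lr * g
    return 0.5 * (y_hat - y) ** 2

# Online usage on an assumed toy series: predict the next sample from the
# previous `window` samples, updating after every observation.
series = [math.sin(0.3 * t) for t in range(200)]
params = init_params()
window = 5
for t in range(window, len(series)):
    loss = sgd_step(params, series[t - window:t], series[t])
```

Summing the stumps and training them on a single squared-error loss is a simplification of boosting, chosen here to keep the backward pass short; a closer analogue of gradient boosting would fit each tree to the residual left by the previous ones.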



Hybrid State Space-based Learning for Sequential Data Prediction with Joint Optimization

We investigate nonlinear prediction/regression in an online setting and ...

Context-Aware Ensemble Learning for Time Series

We investigate ensemble methods for prediction in an online setting. Unl...

Markovian RNN: An Adaptive Time Series Prediction Network with HMM-based Switching for Nonstationary Environments

We investigate nonlinear regression for nonstationary sequential data. I...

A Tree Architecture of LSTM Networks for Sequential Regression with Missing Data

We investigate regression for variable length sequential data containing...

Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

We present recursive recurrent neural networks with attention modeling (...

Takens-inspired neuromorphic processor: a downsizing tool for random recurrent neural networks via feature extraction

We describe a new technique which minimizes the amount of neurons in the...

Event-based Feature Extraction Using Adaptive Selection Thresholds

Unsupervised feature extraction algorithms form one of the most importan...
