Recurrent Neural Network (RNN) Time Series Prediction With PyTorch
A Comprehensive Guide to Sequence Forecasting with Recurrent Neural Networks
Introduction
Recurrent Neural Networks (RNNs) are a class of neural networks designed to work with sequential data. Unlike traditional feedforward networks, RNNs maintain a "memory" of previous inputs through hidden states, making them particularly effective for time series forecasting.
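At each time step, the hidden state is updated from the current input and the previous hidden state. A minimal sketch of that recurrence in plain PyTorch (the tanh activation mirrors nn.RNN's default; the weight names and sizes here are illustrative assumptions):

import torch

# Illustrative sizes (assumptions): 1 input feature, 8 hidden units, 5 time steps
input_size, hidden_size, seq_len = 1, 8, 5
W_xh = torch.randn(hidden_size, input_size)   # input-to-hidden weights
W_hh = torch.randn(hidden_size, hidden_size)  # hidden-to-hidden weights
b_h = torch.zeros(hidden_size)                # hidden bias

x = torch.randn(seq_len, input_size)          # one univariate sequence
h = torch.zeros(hidden_size)                  # initial hidden state (the "memory")
for t in range(seq_len):
    # The new memory depends on the current input AND the previous hidden state
    h = torch.tanh(W_xh @ x[t] + W_hh @ h + b_h)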
Key Concepts
- Sequential Processing: RNNs process data points in sequence, maintaining information about previous steps
- Hidden State: Internal memory that captures information about previous inputs
- Time Unrolling: The process of visualizing RNNs across time steps
- Backpropagation Through Time (BPTT): The training algorithm for RNNs
Applications
- Stock price prediction
- Weather forecasting
- Energy demand prediction
- Sensor data analysis
- Economic indicators forecasting
Note: While this guide focuses on basic RNNs, the same principles apply to more advanced architectures like LSTMs and GRUs, which are better at capturing long-term dependencies.
Implementation
1. Data Preparation
Time series data requires special preprocessing to create sequences suitable for RNN training.
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
def generate_sine_wave(seq_len, num_samples=1000):
    """Generate sine wave time series data with sequences."""
    x = np.linspace(0, 100, num_samples)
    y = np.sin(x)
    sequences = []
    targets = []
    for i in range(len(y) - seq_len):
        sequences.append(y[i:i+seq_len])
        targets.append(y[i+seq_len])
    return np.array(sequences), np.array(targets)
class TimeSeriesDataset(Dataset):
    """Custom Dataset for time series data."""
    def __init__(self, sequences, targets):
        # Add a trailing feature dimension: (num_samples, seq_len, 1) and (num_samples, 1)
        self.sequences = torch.FloatTensor(sequences).unsqueeze(-1)
        self.targets = torch.FloatTensor(targets).unsqueeze(-1)

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return self.sequences[idx], self.targets[idx]
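A minimal usage sketch tying these pieces together (the seq_len and batch_size values are assumptions; the resulting train_loader is the one used by the training loop later in this guide):

seq_len = 20                                                      # assumed window length
sequences, targets = generate_sine_wave(seq_len)
dataset = TimeSeriesDataset(sequences, targets)
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)   # assumed batch size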
Key Steps
- Generate or load time series data
- Create sliding window sequences
- Split into input sequences and target values
- Convert to PyTorch tensors
- Create Dataset and DataLoader for batching
Parameters
- seq_len: Length of input sequences (time steps)
- num_samples: Total number of data points
- batch_size: Number of sequences per training batch
2. Model Architecture
The model pairs a recurrent layer (nn.RNN) with a fully connected output layer; an embedding layer would only be needed for discrete inputs, which this univariate regression task does not use.
import torch.nn as nn
class RNNModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers=1):
        super(RNNModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        # Recurrent layer
        self.rnn = nn.RNN(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True
        )
        # Output layer
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x, hidden=None):
        # Initialize hidden state if not provided
        if hidden is None:
            hidden = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(x.device)
        # Forward pass through RNN: out has shape (batch, seq_len, hidden_size)
        out, hidden = self.rnn(x, hidden)
        # Use only the last time step's output for the prediction
        out = self.fc(out[:, -1, :])          # shape: (batch, output_size)
        return out, hidden
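A quick shape check under the same assumptions as the data section (a batch of 16 windows, 20 time steps, one feature):

model = RNNModel(input_size=1, hidden_size=32, output_size=1)
dummy = torch.randn(16, 20, 1)                 # (batch, seq_len, features)
out, hidden = model(dummy)
print(out.shape, hidden.shape)                 # torch.Size([16, 1]) torch.Size([1, 16, 32])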
Components
- RNN Layer: Processes sequential data with hidden state
- Linear Layer: Maps final hidden state to output
- Hidden State: Maintains memory between time steps
Hyperparameters
- input_size: Dimension of input features (1 for univariate)
- hidden_size: Number of units in the hidden state
- output_size: Dimension of output (1 for regression)
- num_layers: Number of stacked RNN layers (default 1)
For better performance on long sequences, consider using LSTM (nn.LSTM) or GRU (nn.GRU) layers, which are better at capturing long-range dependencies.
3. Training Process
The training loop involves forward passes, loss calculation, and backpropagation through time.
# Initialize model, loss function, and optimizer
model = RNNModel(input_size=1, hidden_size=32, output_size=1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# Training loop
num_epochs = 100
for epoch in range(num_epochs):
    model.train()
    total_loss = 0
    for batch_x, batch_y in train_loader:
        # Zero gradients
        optimizer.zero_grad()
        # Forward pass: batch_x is (batch, seq_len, 1), outputs is (batch, 1)
        outputs, _ = model(batch_x)
        loss = criterion(outputs, batch_y)
        # Backward pass and optimize
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    # Print training progress
    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss/len(train_loader):.4f}')
Training Components
- Loss Function: Mean Squared Error (MSE) for regression
- Optimizer: Adam with a learning rate of 0.001
- Epochs: Complete passes through the dataset
- Batches: Subsets of data for gradient updates
Common Issues
- Vanishing/exploding gradients (see the clipping sketch after this list)
- Overfitting on training data
- Difficulty learning long-term dependencies
- Sensitivity to hyperparameters
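For the vanishing/exploding-gradient issue listed above, one common mitigation is to clip the global gradient norm between the backward pass and the optimizer step (the max_norm value is an assumption to tune for your data):

# Inside the training loop, between loss.backward() and optimizer.step()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # assumed clipping threshold
optimizer.step()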
4. Model Evaluation
After training, we evaluate the model on test data to assess its generalization ability.
import matplotlib.pyplot as plt

def evaluate(model, test_sequences, test_targets):
    model.eval()
    with torch.no_grad():
        # Convert to tensor with a trailing feature dimension: (num_samples, seq_len, 1)
        test_seq = torch.FloatTensor(test_sequences).unsqueeze(-1)
        # Make predictions
        predictions, _ = model(test_seq)
        predictions = predictions.view(-1).numpy()
    # Plot results
    plt.figure(figsize=(12, 6))
    plt.plot(np.arange(len(test_targets)), test_targets, label='True Values')
    plt.plot(np.arange(len(predictions)), predictions, 'ro', label='Predictions')
    plt.legend()
    plt.title('Model Predictions vs Actual Values')
    plt.show()
    return predictions
Evaluation Metrics
- Mean Squared Error (MSE): Standard regression metric
- Mean Absolute Error (MAE): Robust to outliers
- R-squared: Variance explained by model
- Visual Inspection: Plot predictions vs actual
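A small sketch computing the first three metrics with NumPy, assuming `predictions` and `test_targets` are the 1-D arrays from the evaluation step above:

mse = np.mean((predictions - test_targets) ** 2)              # Mean Squared Error
mae = np.mean(np.abs(predictions - test_targets))             # Mean Absolute Error
ss_res = np.sum((test_targets - predictions) ** 2)
ss_tot = np.sum((test_targets - test_targets.mean()) ** 2)
r2 = 1 - ss_res / ss_tot                                       # R-squared
print(f'MSE: {mse:.4f}, MAE: {mae:.4f}, R2: {r2:.4f}')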
Prediction Strategy
- Single-step: Predict next value only
- Multi-step: Predict multiple future values
- Recursive: Use predictions as new inputs (see the sketch after this list)
- Direct: Separate model for each future step
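A sketch of the recursive strategy, feeding each prediction back in as the newest input. It assumes the trained RNNModel above and a hypothetical seed window `seed_seq` of shape (1, seq_len, 1):

def recursive_forecast(model, seed_seq, steps):
    """Predict `steps` future values by sliding the window over the model's own predictions."""
    model.eval()
    window = seed_seq.clone()
    preds = []
    with torch.no_grad():
        for _ in range(steps):
            out, _ = model(window)                     # out: (1, 1), the next-step prediction
            preds.append(out.item())
            next_step = out.view(1, 1, 1)              # treat the prediction as a new time step
            window = torch.cat([window[:, 1:, :], next_step], dim=1)  # slide the window forward
    return preds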
Advanced Topics
LSTM and GRU Architectures
Advanced RNN variants that address the vanishing gradient problem.
LSTM (Long Short-Term Memory)
class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # batch_first=True keeps the (batch, seq_len, features) layout used above
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, (h_n, c_n) = self.lstm(x)
        # Predict from the last time step's output
        return self.fc(out[:, -1, :])
Key Features:
- Input, output, and forget gates
- Cell state for long-term memory
- Better at learning long-range dependencies
GRU (Gated Recurrent Unit)
class GRUModel(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # batch_first=True keeps the (batch, seq_len, features) layout used above
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, h_n = self.gru(x)
        # Predict from the last time step's output
        return self.fc(out[:, -1, :])
Key Features:
- Update and reset gates
- Simpler than LSTM but often comparable
- Fewer parameters than LSTM
Attention Mechanisms
Allows the model to focus on relevant parts of the input sequence.
class Attention(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.attention = nn.Linear(hidden_size, 1)

    def forward(self, rnn_output):
        # rnn_output shape: (seq_len, batch, hidden_size)
        attention_weights = torch.softmax(
            self.attention(rnn_output), dim=0
        )
        # Weighted sum over time steps -> (batch, hidden_size)
        return (attention_weights * rnn_output).sum(dim=0)
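A small usage sketch (the layer sizes are illustrative assumptions) showing the module pooling a seq-first RNN output into one context vector per batch element:

rnn = nn.RNN(input_size=1, hidden_size=32)     # default layout: (seq_len, batch, features)
attn = Attention(hidden_size=32)
rnn_output, _ = rnn(torch.randn(20, 4, 1))     # (seq_len=20, batch=4, hidden=32)
context = attn(rnn_output)                     # (batch=4, hidden=32)
prediction = nn.Linear(32, 1)(context)         # (batch=4, 1)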
Benefits of Attention:
- Interpretability (can see which time steps are important)
- Better performance on long sequences
- Flexibility in focusing on relevant parts of history
Multivariate Time Series
Extending the model to handle multiple input features.
# Multivariate data shape: (num_samples, seq_len, num_features)
# feature1, feature2, feature3 are placeholder arrays, each of shape (num_samples, seq_len)
multivariate_data = np.stack([feature1, feature2, feature3], axis=-1)

class MultivariateRNN(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # batch_first=True keeps the (batch, seq_len, features) layout used above
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, h_n = self.rnn(x)
        # Predict from the last time step's output
        return self.fc(out[:, -1, :])
Considerations:
- Input size becomes number of features
- May need more hidden units
- Feature scaling becomes more important
- Can capture cross-feature dependencies
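A hypothetical usage sketch of the points above (standardized features, input_size equal to the number of features; the hidden size is an assumption):

# Standardize each feature, then feed all three features per time step
scaled = (multivariate_data - multivariate_data.mean(axis=(0, 1))) / multivariate_data.std(axis=(0, 1))
x = torch.FloatTensor(scaled)                             # (num_samples, seq_len, 3)
mv_model = MultivariateRNN(input_size=3, hidden_size=64)  # input_size = number of features
predictions = mv_model(x)                                 # (num_samples, 1)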
Production Considerations
Deploying time series models in real-world applications.
Deployment Options
- TorchScript for optimized inference (see the export sketch after this list)
- ONNX runtime for cross-platform
- Flask/FastAPI web services
- Cloud functions (AWS Lambda, GCP Functions)
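A minimal TorchScript export sketch for the first deployment option (the example input shape and file name are assumptions):

# Trace the trained model with a representative input, then save it for optimized inference
example_input = torch.randn(1, 20, 1)          # assumed (batch, seq_len, features) shape
traced = torch.jit.trace(model, example_input)
traced.save('rnn_forecaster.pt')               # assumed file name
# The saved module can later be loaded with torch.jit.load('rnn_forecaster.pt')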
Monitoring
- Track prediction drift over time
- Monitor input data distribution
- Set up alerting for anomalies
- Periodic model retraining
Additional Resources
Useful Libraries
- sktime - Time series machine learning
- Prophet - Forecasting at scale
- TensorFlow - Alternative to PyTorch