Recurrent Neural Network (RNN) Time Series Prediction With PyTorch

A Comprehensive Guide to Sequence Forecasting with Recurrent Neural Networks

Introduction

Recurrent Neural Networks (RNNs) are a class of neural networks designed to work with sequential data. Unlike traditional feedforward networks, RNNs maintain a "memory" of previous inputs through hidden states, making them particularly effective for time series forecasting.

Key Concepts

  • Sequential Processing: RNNs process data points in sequence, maintaining information about previous steps
  • Hidden State: Internal memory that captures information about previous inputs (see the single-step sketch after this list)
  • Time Unrolling: The process of visualizing RNNs across time steps
  • Backpropagation Through Time (BPTT): The training algorithm for RNNs
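
As a concrete picture of the hidden-state update that BPTT unrolls through time, here is a single vanilla RNN step written by hand (a minimal sketch; the weight names are illustrative, not PyTorch internals):

import torch

def rnn_step(x_t, h_prev, W_ih, W_hh, b_h):
    """One vanilla RNN update: h_t = tanh(x_t @ W_ih.T + h_prev @ W_hh.T + b_h)."""
    return torch.tanh(x_t @ W_ih.T + h_prev @ W_hh.T + b_h)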

Applications

  • Stock price prediction
  • Weather forecasting
  • Energy demand prediction
  • Sensor data analysis
  • Economic indicators forecasting

Note: While this guide focuses on basic RNNs, the same principles apply to more advanced architectures like LSTMs and GRUs, which are better at capturing long-term dependencies.

Implementation

The implementation walks through four stages:

  • Data Preparation
  • Model Architecture
  • Training Process
  • Evaluation

1. Data Preparation

Time series data requires special preprocessing to create sequences suitable for RNN training.

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

def generate_sine_wave(seq_len, num_samples=1000):
    """Generate sine wave time series data with sequences."""
    x = np.linspace(0, 100, num_samples)
    y = np.sin(x)
    
    sequences = []
    targets = []
    for i in range(len(y) - seq_len):
        sequences.append(y[i:i+seq_len])
        targets.append(y[i+seq_len])
    
    return np.array(sequences), np.array(targets)

class TimeSeriesDataset(Dataset):
    """Custom Dataset for time series data"""
    def __init__(self, sequences, targets):
        self.sequences = torch.FloatTensor(sequences).unsqueeze(-1)
        self.targets = torch.FloatTensor(targets).unsqueeze(-1)
        
    def __len__(self):
        return len(self.sequences)
    
    def __getitem__(self, idx):
        return self.sequences[idx], self.targets[idx]

Key Steps

  1. Generate or load time series data
  2. Create sliding window sequences
  3. Split into input sequences and target values
  4. Convert to PyTorch tensors
  5. Create Dataset and DataLoader for batching

Parameters

  • seq_len: Length of input sequences (time steps)
  • num_samples: Total number of data points
  • batch_size: Number of sequences per training batch
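
The training loop later in this guide assumes a DataLoader named train_loader. A minimal setup using the pieces above (the seq_len and batch_size values are illustrative, not tuned):

seq_len = 50
sequences, targets = generate_sine_wave(seq_len)
train_dataset = TimeSeriesDataset(sequences, targets)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)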

2. Model Architecture

The model consists of a recurrent layer that processes the input sequence step by step and a fully connected output layer that maps the final hidden state to a prediction. (An embedding layer would only be needed for discrete inputs such as tokens; numeric time series values are fed in directly.)

import torch.nn as nn

class RNNModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers=1):
        super(RNNModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        
        # Recurrent layer
        self.rnn = nn.RNN(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True
        )
        
        # Output layer
        self.fc = nn.Linear(hidden_size, output_size)
    
    def forward(self, x, hidden=None):
        # Initialize hidden state if not provided
        if hidden is None:
            hidden = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(x.device)
        
        # Forward pass through RNN
        out, hidden = self.rnn(x, hidden)
        
        # Use the output of the last time step for the prediction
        out = out[:, -1, :]
        out = self.fc(out)
        
        return out, hidden

Components

  • RNN Layer: Processes sequential data with hidden state
  • Linear Layer: Maps final hidden state to output
  • Hidden State: Maintains memory between time steps

Hyperparameters

  • input_size: Dimension of input features (1 for univariate)
  • hidden_size: Number of units in hidden state
  • output_size: Dimension of output (1 for regression)
  • num_layers: Stacked RNN layers (default 1)
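
To sanity-check these shapes, you can push a dummy batch through the model (the batch size and sequence length below are arbitrary):

model = RNNModel(input_size=1, hidden_size=32, output_size=1)
dummy = torch.randn(16, 50, 1)      # (batch, seq_len, input_size)
out, hidden = model(dummy)
print(out.shape)                    # torch.Size([16, 1])
print(hidden.shape)                 # torch.Size([1, 16, 32])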

For long sequences, consider LSTM (nn.LSTM) or GRU (nn.GRU) layers, which capture long-range dependencies more reliably; both are covered in the Advanced Topics section below.

3. Training Process

The training loop involves forward passes, loss calculation, and backpropagation through time.

# Initialize model, loss function, and optimizer
model = RNNModel(input_size=1, hidden_size=32, output_size=1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Training loop
num_epochs = 100
for epoch in range(num_epochs):
    model.train()
    total_loss = 0
    
    for batch_x, batch_y in train_loader:
        # Zero gradients
        optimizer.zero_grad()
        
        # Forward pass
        outputs, _ = model(batch_x)
        loss = criterion(outputs, batch_y.view(-1, 1))
        
        # Backward pass and optimize
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item()
    
    # Print training progress
    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss/len(train_loader):.4f}')

Training Components

  • Loss Function: Mean Squared Error (MSE) for regression
  • Optimizer: Adam optimizer with learning rate
  • Epochs: Complete passes through the dataset
  • Batches: Subsets of data for gradient updates

Common Issues

  • Vanishing/exploding gradients (a gradient-clipping mitigation is sketched after this list)
  • Overfitting on training data
  • Difficulty learning long-term dependencies
  • Sensitivity to hyperparameters
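
A common first mitigation for exploding gradients is to clip the global gradient norm inside the training loop above, between loss.backward() and optimizer.step() (max_norm=1.0 is an illustrative value, not a tuned one):

        # Backward pass, clip the global gradient norm, then update weights
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()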

4. Model Evaluation

After training, we evaluate the model on test data to assess its generalization ability.

import matplotlib.pyplot as plt

def evaluate(model, test_sequences, test_targets):
    model.eval()
    with torch.no_grad():
        # Convert to a (num_samples, seq_len, 1) float tensor
        test_seq = torch.FloatTensor(test_sequences).unsqueeze(-1)
        
        # Make prediction
        predictions, _ = model(test_seq)
        predictions = predictions.view(-1).numpy()
        
        # Plot results
        plt.figure(figsize=(12, 6))
        plt.plot(np.arange(len(test_targets)), test_targets, label='True Values')
        plt.plot(np.arange(len(predictions)), predictions, 'ro', label='Predictions')
        plt.legend()
        plt.title('Model Predictions vs Actual Values')
        plt.show()
        
        return predictions
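
One quick way to exercise this function is to generate a fresh slice of the sine wave as a stand-in test set (in practice you would hold out a portion of the original series instead):

test_sequences, test_targets = generate_sine_wave(seq_len=50, num_samples=300)
predictions = evaluate(model, test_sequences, test_targets)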

Evaluation Metrics

  • Mean Squared Error (MSE): Standard regression metric
  • Mean Absolute Error (MAE): Robust to outliers
  • R-squared: Variance explained by model
  • Visual Inspection: Plot predictions vs actual
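
These metrics can be computed directly from the arrays returned by the evaluation step above (a minimal sketch assuming scikit-learn is installed):

from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

mse = mean_squared_error(test_targets, predictions)
mae = mean_absolute_error(test_targets, predictions)
r2 = r2_score(test_targets, predictions)
print(f'MSE: {mse:.4f}  MAE: {mae:.4f}  R^2: {r2:.4f}')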

Prediction Strategy

  • Single-step: Predict next value only
  • Multi-step: Predict multiple future values
  • Recursive: Use predictions as new inputs (see the sketch after this list)
  • Direct: Separate model for each future step
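
A recursive multi-step forecast feeds each prediction back in as the newest observation. A minimal sketch, assuming the trained RNNModel from above and a seed window seed_seq of shape (seq_len, 1):

def recursive_forecast(model, seed_seq, n_steps):
    """Roll the model forward n_steps, feeding each prediction back into the window."""
    model.eval()
    window = torch.FloatTensor(seed_seq).unsqueeze(0)   # (1, seq_len, 1)
    preds = []
    with torch.no_grad():
        for _ in range(n_steps):
            out, _ = model(window)                       # (1, 1)
            preds.append(out.item())
            # Drop the oldest step and append the new prediction
            window = torch.cat([window[:, 1:, :], out.view(1, 1, 1)], dim=1)
    return preds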

Advanced Topics

The rest of this guide covers four extensions:

  • LSTM & GRU
  • Attention
  • Multivariate
  • Production

LSTM and GRU Architectures

Advanced RNN variants that address the vanishing gradient problem.

LSTM (Long Short-Term Memory)

class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)
    
    def forward(self, x):
        # x shape: (batch, seq_len, input_size)
        out, (h_n, c_n) = self.lstm(x)
        # Use the last time step's output for the prediction
        return self.fc(out[:, -1, :])

Key Features:

  • Input, output, and forget gates
  • Cell state for long-term memory
  • Better at learning long-range dependencies

GRU (Gated Recurrent Unit)

class GRUModel(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)
    
    def forward(self, x):
        # x shape: (batch, seq_len, input_size)
        out, h_n = self.gru(x)
        return self.fc(out[:, -1, :])

Key Features:

  • Update and reset gates
  • Simpler than LSTM but often comparable
  • Fewer parameters than LSTM

Attention Mechanisms

Allows the model to focus on relevant parts of the input sequence.

class Attention(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.attention = nn.Linear(hidden_size, 1)
    
    def forward(self, rnn_output):
        # rnn_output shape: (batch, seq_len, hidden_size)
        attention_weights = torch.softmax(
            self.attention(rnn_output), dim=1
        )
        # Weighted sum over the time dimension -> (batch, hidden_size)
        return (attention_weights * rnn_output).sum(dim=1)

Benefits of Attention:

  • Interpretability (can see which time steps are important)
  • Better performance on long sequences
  • Flexibility in focusing on relevant parts of history
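
To put this module to work, it can sit on top of any recurrent encoder. A minimal sketch (the class name and layer sizes are illustrative) that pools the GRU outputs with the Attention module above:

class AttentiveRNN(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, batch_first=True)
        self.attention = Attention(hidden_size)
        self.fc = nn.Linear(hidden_size, 1)
    
    def forward(self, x):
        out, _ = self.gru(x)            # (batch, seq_len, hidden_size)
        context = self.attention(out)   # (batch, hidden_size)
        return self.fc(context)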

Multivariate Time Series

Extending the model to handle multiple input features.

# Multivariate data: stack per-feature arrays of shape (num_samples, seq_len)
# into a single array of shape (num_samples, seq_len, num_features)
multivariate_data = np.stack([feature1, feature2, feature3], axis=-1)

class MultivariateRNN(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)
    
    def forward(self, x):
        # x shape: (batch, seq_len, num_features)
        out, h_n = self.rnn(x)
        return self.fc(out[:, -1, :])

Considerations:

  • Input size becomes number of features
  • May need more hidden units
  • Feature scaling becomes more important
  • Can capture cross-feature dependencies

Production Considerations

Deploying time series models in real-world applications.

Deployment Options

  • TorchScript for optimized inference (see the sketch after this list)
  • ONNX runtime for cross-platform
  • Flask/FastAPI web services
  • Cloud functions (AWS Lambda, GCP Functions)
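
As one example of the TorchScript route, a trained model can be traced with an example input and saved for inference (the file name and input shape are illustrative):

example_input = torch.randn(1, 50, 1)             # (batch, seq_len, input_size)
scripted = torch.jit.trace(model, example_input)
scripted.save('rnn_forecaster.pt')

# Later, in the serving process
loaded = torch.jit.load('rnn_forecaster.pt')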

Monitoring

  • Track prediction drift over time
  • Monitor input data distribution
  • Set up alerting for anomalies
  • Periodic model retraining
