Multivariate LSTM with SLO-Aware Loss for Virtual Machine Workload Prediction on Cloud Data Center

  • Agus Hariyanto
  • Ahmad Fahriyannur Rosyady Politeknik Negeri Jember, Indonesia
  • Adi Sucipto Politeknik Negeri Jember, Indonesia
  • Bekti Maryuni Susanto Politeknik Negeri Jember, Indonesia
  • Sapta Nugraha Universitas Maritim Raja Ali Haji Tanjung Pinang, Kepulauan Riau, Indonesia
  • Nicolas Chenu Polytech Annecy-Chambery, Université Savoie Mont Blanc, France
Keywords: VM workload prediction, LSTM, SLO-aware loss, cloud computing

Abstract

Accurate virtual machine (VM) workload prediction is a key component of cloud resource management, particularly for supporting auto-scaling and maintaining Service Level Objectives (SLOs). Conventional prediction models that rely on symmetric loss functions such as Mean Squared Error (MSE) treat under-prediction and over-prediction errors equally, even though under-prediction has far more severe operational consequences: it leads directly to capacity shortages and SLO violations. This study proposes a CPU workload prediction approach based on a multivariate Long Short-Term Memory (LSTM) network trained with an SLO-aware loss, an asymmetric loss function that penalizes under-prediction ten times more heavily than over-prediction. Experiments are conducted on a 25,000-row subset of the Bitbrains GWA-T-12 fastStorage dataset with four input features (CPU, memory, network received, network transmitted), using a fixed random seed for reproducibility. Two models are trained and compared: one with the SLO-aware loss and one with standard MSE as the baseline, both sharing an identical architecture and hyperparameters. The primary evaluation metric is the under-prediction rate, which directly quantifies SLO violation risk. Results show that the SLO-aware model achieves an under-prediction rate of 0.04%, compared to 0.16% for the MSE baseline, a fourfold reduction. These findings provide empirical evidence that the SLO-aware loss steers the model toward conservative predictions that protect SLO compliance, and they establish loss function design as a critical and actionable dimension of cloud VM workload prediction.
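The asymmetric penalty and the evaluation metric described in the abstract can be illustrated with a minimal sketch. The abstract states only that under-prediction is penalized ten times more heavily than over-prediction; the squared-error form of the loss, the definition of the under-prediction rate as the percentage of samples whose prediction falls below the actual value, and the names `slo_aware_loss` and `under_prediction_rate` are assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence

# From the abstract: under-prediction is weighted ten times more than over-prediction.
UNDER_PENALTY = 10.0


def slo_aware_loss(y_true: Sequence[float], y_pred: Sequence[float],
                   under_penalty: float = UNDER_PENALTY) -> float:
    """Mean asymmetric squared error (assumed form, not the paper's exact loss)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = actual - predicted  # positive error means under-prediction
        weight = under_penalty if error > 0 else 1.0
        total += weight * error * error
    return total / len(y_true)


def under_prediction_rate(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Percentage of samples where the prediction is below the actual value
    (assumed definition of the metric)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    under = sum(1 for actual, predicted in zip(y_true, y_pred) if predicted < actual)
    return 100.0 * under / len(y_true)
```

Under these assumptions, predicting 8 when the actual CPU usage is 10 costs ten times as much as predicting 12, although both errors have the same magnitude. With a symmetric MSE loss the two cases would be penalized identically.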

Published
2026-06-29
Section
Articles