
Hierarchical Attention-Based Modeling for Intelligent Scheduling Delay Prediction in Complex Backend Systems

Abstract

This paper addresses backend scheduling delay prediction and proposes a hierarchical attention model to overcome the limitations of traditional approaches in the presence of high-dimensional features, complex dependencies, and dynamic environments. The model takes multidimensional features such as task load, resource usage, and component invocation relationships as input and generates low-level representations through linear embedding followed by a nonlinear mapping. Local attention then models short-term dependencies and local features, while global attention and hierarchical aggregation capture cross-component and cross-temporal dependencies, balancing fine-grained local modeling against modeling of overall dependencies. A prediction layer fuses the local and global representations to generate the delay prediction. To verify the effectiveness of the method, experiments were conducted on a public dataset with Transformer, FedFormer, iTransformer, and TimeMixer as baselines. The proposed method outperforms all four baselines on MSE, MAE, RMSE, and MAPE. Sensitivity analyses were also carried out over hyperparameters and environmental factors, including learning rate, weight decay, hidden dimension size, node scale, and missing rate. The results indicate that the method remains robust and stable across these conditions. Overall, this study improves both prediction accuracy and generalization in backend scheduling delay prediction and supports more efficient scheduling in complex systems.
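The pipeline described in the abstract (embedding, local attention, global attention, fusion, prediction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give projection matrices, the window size, the fusion operator, or the aggregation scheme, so the identity query/key/value projections, the banded window mask, concatenation-based fusion, mean pooling over time, and all names (`attention`, `local_mask`, `predict_delay`, `W_embed`, `w_out`) are assumptions chosen only to keep the example small. The weights are random and untrained.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(h, mask=None):
    # Scaled dot-product self-attention over time steps.
    # Assumption: identity Q/K/V projections instead of learned ones.
    d = h.shape[-1]
    scores = h @ h.T / np.sqrt(d)
    if mask is not None:
        # Positions outside the mask get a very low score (weight ~ 0).
        scores = np.where(mask, scores, -1e9)
    return softmax(scores, axis=-1) @ h

def local_mask(num_steps, window):
    # True where two time steps are at most `window` apart (banded mask).
    idx = np.arange(num_steps)
    return np.abs(idx[:, None] - idx[None, :]) <= window

def predict_delay(x, W_embed, w_out, window=2):
    # Low-level representation: linear embedding + nonlinear mapping.
    h = np.tanh(x @ W_embed)
    # Local attention: each step attends only to nearby steps.
    local = attention(h, local_mask(len(h), window))
    # Global attention: each step attends to every step.
    glob = attention(h)
    # Assumed fusion: concatenate local and global representations,
    # then mean-pool over time before a linear prediction layer.
    fused = np.concatenate([local, glob], axis=-1)
    pooled = fused.mean(axis=0)
    return float(pooled @ w_out)

# Toy input: 12 time steps, 5 features (e.g. task load, resource usage).
rng = np.random.default_rng(0)
T, F, D = 12, 5, 8
x = rng.normal(size=(T, F))
W_embed = rng.normal(size=(F, D)) * 0.1
w_out = rng.normal(size=(2 * D,)) * 0.1
y_hat = predict_delay(x, W_embed, w_out)
```

In this sketch the only difference between the local and global branches is the mask, so widening `window` to cover the whole sequence makes the two branches identical. A trained model would replace the random weights with learned parameters and would likely use separate projections per branch.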
