Intelligent Cloud Service Anomaly Monitoring via Uncertainty Estimation and Causal Graph Inference

Abstract

This paper addresses three challenges in cloud service environments: complex dependencies, diverse anomaly patterns, and the coexistence of label scarcity and pseudo-label noise. It proposes an anomaly monitoring method that integrates uncertainty estimation with causal inference. The method models cloud service interactions as dependency graphs, extracts cross-temporal and cross-service contextual features through graph embedding, and applies uncertainty estimation to provide confidence intervals for boundary samples, thereby mitigating prediction instability caused by short-term fluctuations and noise. On this basis, a causal inference mechanism is introduced to suppress spurious correlations, while causal consistency constraints improve the identification of complex anomalies under cross-tenant coupling and multi-tenant interference. The optimization objective jointly incorporates classification loss, contrastive loss, and uncertainty calibration to balance threshold performance and global ranking stability. Experiments analyze three kinds of sensitivity: hyperparameter sensitivity (the effects of prediction head depth and width on boundary confidence), environmental sensitivity (the trade-off between false positives and false negatives under varying interference and coupling), and data sensitivity (the impact of label scarcity and pseudo-label noise on causal accuracy). Results show that the proposed method outperforms existing public models on metrics such as AUC, F1-Score, Precision, Recall, and AUROC, and remains robust and stable under complex interference and high-noise conditions, supporting its effectiveness and applicability in cloud service anomaly monitoring tasks.
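
The joint objective named in the abstract can be written schematically as below. This is only a sketch: the abstract lists the three terms but does not give their form or weighting, so the additive weighted combination, the symbols $\mathcal{L}_{\mathrm{cls}}$ (classification loss), $\mathcal{L}_{\mathrm{con}}$ (contrastive loss), and $\mathcal{L}_{\mathrm{cal}}$ (uncertainty calibration term), and the weights $\lambda_1, \lambda_2 \ge 0$ are all assumed notation rather than the authors' definitions.

```latex
\mathcal{L}
  = \mathcal{L}_{\mathrm{cls}}
  + \lambda_1 \, \mathcal{L}_{\mathrm{con}}
  + \lambda_2 \, \mathcal{L}_{\mathrm{cal}},
\qquad \lambda_1, \lambda_2 \ge 0
```

Under this reading, the two weights would control how strongly the contrastive and calibration terms influence training relative to the classification term.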
