Automatic Elastic Scaling in Distributed Microservice Environments via Deep Q-Learning
 
Abstract
This study addresses the challenges of elastic scaling strategy design, low resource utilization, and high response latency in microservice systems by proposing an automatic scaling and elasticity optimization algorithm based on Deep Q-Learning. The method models the microservice system as a Markov Decision Process and constructs a multi-dimensional state space. A deep neural network approximates the Q-value function, and mechanisms such as experience replay and target network updates are introduced to improve training stability and policy generalization. The model incorporates current resource loads, service dependency structures, and runtime states to construct a refined action space that supports dynamic selection among scaling out, scaling in, and maintaining the current state. To evaluate performance in practical scheduling scenarios, several experiments are conducted, including comparisons with mainstream reinforcement learning methods, sensitivity analyses of key hyperparameters, and adaptability tests under different sampling frequencies and training data sizes. Results show that the proposed method outperforms existing approaches in average response time, resource cost, and QoS violation rate, demonstrating good convergence speed, robustness, and adaptability. Built on a complete policy learning framework, this study systematically quantifies the modeling capability of reinforcement learning for elastic scheduling of microservices and provides empirical support for resource management in complex service environments.
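
For concreteness, the following is a minimal Python/PyTorch sketch of the kind of Deep Q-Learning scaler the abstract describes. The state dimension, network sizes, hyperparameters, and names (QNetwork, DQNScaler, target_sync, and so on) are illustrative assumptions rather than the paper's exact design; only the three-way action space (scale out, scale in, hold), the experience replay buffer, and the periodic target network update follow the mechanisms named above.

    # Illustrative sketch only; state layout, reward handling, and sizes are assumptions.
    import random
    from collections import deque

    import torch
    import torch.nn as nn
    import torch.optim as optim

    STATE_DIM = 8                      # e.g. CPU/memory load, request rate, replicas, latency
    ACTIONS = ["scale_out", "scale_in", "hold"]

    class QNetwork(nn.Module):
        """Approximates Q(s, a) for the three scaling actions."""
        def __init__(self, state_dim: int, n_actions: int):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, n_actions),
            )

        def forward(self, x):
            return self.net(x)

    class DQNScaler:
        def __init__(self, state_dim=STATE_DIM, n_actions=len(ACTIONS),
                     gamma=0.99, lr=1e-3, buffer_size=10_000, batch_size=64,
                     target_sync=500):
            self.q = QNetwork(state_dim, n_actions)
            self.target_q = QNetwork(state_dim, n_actions)
            self.target_q.load_state_dict(self.q.state_dict())
            self.optim = optim.Adam(self.q.parameters(), lr=lr)
            self.buffer = deque(maxlen=buffer_size)   # experience replay buffer
            self.gamma, self.batch_size = gamma, batch_size
            self.target_sync, self.steps = target_sync, 0

        def act(self, state, epsilon=0.1):
            """Epsilon-greedy choice among scale-out / scale-in / hold."""
            if random.random() < epsilon:
                return random.randrange(len(ACTIONS))
            with torch.no_grad():
                q_values = self.q(torch.as_tensor(state, dtype=torch.float32))
            return int(q_values.argmax())

        def remember(self, s, a, r, s_next, done):
            self.buffer.append((s, a, r, s_next, done))

        def train_step(self):
            if len(self.buffer) < self.batch_size:
                return
            batch = random.sample(self.buffer, self.batch_size)
            s, a, r, s_next, done = map(
                lambda x: torch.as_tensor(x, dtype=torch.float32), zip(*batch))
            a = a.long()
            # Q-values of the actions actually taken.
            q_sa = self.q(s).gather(1, a.unsqueeze(1)).squeeze(1)
            # Bootstrapped target computed with the frozen target network.
            with torch.no_grad():
                target = r + self.gamma * self.target_q(s_next).max(1).values * (1 - done)
            loss = nn.functional.mse_loss(q_sa, target)
            self.optim.zero_grad()
            loss.backward()
            self.optim.step()
            # Periodically copy the online weights into the target network.
            self.steps += 1
            if self.steps % self.target_sync == 0:
                self.target_q.load_state_dict(self.q.state_dict())

In a deployment loop, each decision interval would build the state vector from monitored metrics, call act() to pick a scaling action, apply it, observe a reward reflecting response time, resource cost, and QoS violations, and feed the transition back through remember() and train_step().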