Reinforcement Learning-Based Dynamic Cache Replacement Using Deep Q-Networks
Abstract
This paper addresses the challenge of dynamic, shifting data-access patterns in caching systems and proposes a reinforcement learning-based method for optimizing cache replacement policies. The method models cache management as a Markov Decision Process and adopts the Deep Q-Network (DQN) framework, enabling adaptive policy learning and dynamic eviction decisions in complex environments. To improve convergence efficiency and system stability, the learning architecture incorporates experience replay, a target network, and an ε-greedy exploration mechanism, allowing continuous policy optimization under varying access patterns, resource constraints, and mixed workloads. The experimental evaluation spans multiple dimensions: policy adaptability under changing access patterns, robustness under cache-capacity variation, generalization across multi-workload combinations, and the impact of different state-modeling approaches on convergence. The results show that the proposed method outperforms several recent algorithms on key metrics, including hit rate, average latency, and policy stability, demonstrating strong practicality and flexibility. Overall, the method is well structured and tightly integrated across its components; it confirms the effectiveness of reinforcement learning for cache management and offers a data-driven pathway for cache-policy design in complex systems.
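To make the ingredients named above concrete, the following is a minimal, hypothetical sketch (not the paper's implementation): a tabular stand-in for a DQN eviction agent that keeps the same moving parts — experience replay, a periodically synced target estimator in place of a target network, and ε-greedy exploration. The state encoding (the tuple of cached items), the delayed-reward scheme, and all constants are illustrative assumptions.

```python
import random
from collections import deque

# Illustrative constants (assumptions, not from the paper).
CACHE_SIZE = 4
GAMMA = 0.9        # discount factor
ALPHA = 0.1        # learning rate for the tabular update
EPSILON = 0.2      # exploration probability
TARGET_SYNC = 50   # steps between target-table syncs
BATCH_SIZE = 8     # minibatch size sampled from replay
WINDOW = 20        # steps before an eviction is judged "good"

q_table, target_table = {}, {}   # (state, action) -> Q estimate
replay = deque(maxlen=500)       # experience replay buffer

def q(table, state, action):
    return table.get((state, action), 0.0)

def select_eviction(state):
    """Epsilon-greedy choice of which cache slot to evict."""
    if random.random() < EPSILON:
        return random.randrange(CACHE_SIZE)
    return max(range(CACHE_SIZE), key=lambda a: q(q_table, state, a))

def train_step():
    """Sample a minibatch from replay; bootstrap from the target table."""
    if len(replay) < BATCH_SIZE:
        return
    for s, a, r, s2 in random.sample(replay, BATCH_SIZE):
        best_next = max(q(target_table, s2, b) for b in range(CACHE_SIZE))
        old = q(q_table, s, a)
        q_table[(s, a)] = old + ALPHA * (r + GAMMA * best_next - old)

def run(trace):
    """Simulate a request trace; reward an eviction -1 if the evicted
    item is re-requested within WINDOW steps, +1 otherwise."""
    cache, pending, hits = [], [], 0
    for step, item in enumerate(trace):
        # Resolve delayed rewards for earlier evictions.
        for p in pending[:]:
            s, a, evicted, t, s2 = p
            if item == evicted:          # evicted item needed again: bad
                replay.append((s, a, -1.0, s2)); pending.remove(p)
            elif step - t > WINDOW:      # survived the window: good
                replay.append((s, a, 1.0, s2)); pending.remove(p)
        if item in cache:
            hits += 1
        elif len(cache) < CACHE_SIZE:
            cache.append(item)
        else:
            state = tuple(cache)
            a = select_eviction(state)
            pending.append((state, a, cache[a], step, None))
            cache[a] = item
            pending[-1] = (state, a, pending[-1][2], step, tuple(cache))
        train_step()
        if step % TARGET_SYNC == 0:
            target_table.update(q_table)  # target-network-style sync
    return hits / len(trace)

random.seed(0)
# Synthetic trace with a shifting hot set to mimic changing access patterns.
trace = ([random.choice(range(6)) for _ in range(400)]
         + [random.choice(range(4, 10)) for _ in range(400)])
print(f"hit rate: {run(trace):.2f}")
```

A real DQN would replace the lookup tables with online and target neural networks over a feature-based state (e.g. recency and frequency statistics), but the control flow — act ε-greedily, store transitions, sample minibatches, sync the target — is the same.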