Journals / CMC / Vol.68, No.3

Research Article


Q-Learning Based Routing Protocol for Congestion Avoidance

Daniel Godfrey1, Beom-Su Kim1, Haoran Miao1, Babar Shah2, Bashir Hayat3, Imran Khan4, Tae-Eung Sung5, Ki-Il Kim1,*
1 Department of Computer Science and Engineering, Chungnam National University, Korea
2 College of Technological Innovation, Zayed University, Abu Dhabi, UAE
3 Institute of Management Sciences, Peshawar, Pakistan
4 Department of Electrical Engineering, University of Engineering and Technology, Peshawar, Pakistan
5 Department of Computer and Telecommunications Engineering, Yonsei University, Korea
* Corresponding Author: Ki-Il Kim. Email:
(This article belongs to this Special Issue: Intelligent Software-defined Networking (SDN) Technologies for Future Generation Networks)


The end-to-end delay in a wired network depends strongly on congestion at intermediate nodes. Among the many feasible approaches to avoiding congestion, congestion-aware routing protocols search for an uncongested path toward the destination using rule-based, reactive (incident-driven), and distributed methods. However, these approaches have difficulty adapting dynamically to changing network environments in an autonomous, self-adaptive manner. To overcome this drawback, we present a new congestion-aware routing protocol based on the Q-learning algorithm for software-defined networks, where logically centralized network operation enables intelligent control and management of network resources. In the proposed routing protocol, either one of the uncongested neighboring nodes is randomly selected as the next hop to distribute the traffic load over multiple paths, or the Q-learning algorithm is applied to decide the next hop by modeling the state, Q-value, and reward function to set the desired path toward the destination. A new reward function consisting of buffer occupancy, link reliability, and hop count is considered. Moreover, a look-ahead algorithm is employed to update the Q-value with values within two hops simultaneously. This approach selects the optimal next hop by taking the congestion status within two hops into account. Finally, the simulation results show an approximately 20% higher packet delivery ratio and a 15% shorter end-to-end delay compared with the existing scheme, achieved by avoiding congestion adaptively.
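
The abstract does not give the actual formulas, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation. It assumes a weighted-sum reward over buffer occupancy, link reliability, and hop count, a Q-update whose target adds the best Q-values one and two hops beyond the candidate next hop, and an epsilon-style switch between random selection among uncongested neighbors and greedy Q-based selection. All names, weights, thresholds, and the exact form of the update (`ALPHA`, `GAMMA`, `W_BUFFER`, `CONGESTION_THRESHOLD`, `update_q`, and so on) are hypothetical.

```python
import random

# Every constant here is an illustrative assumption, not a value from the paper.
ALPHA = 0.5                               # learning rate
GAMMA = 0.8                               # discount factor
W_BUFFER, W_LINK, W_HOPS = 0.5, 0.3, 0.2  # reward weights
CONGESTION_THRESHOLD = 0.7                # occupancy at or above this counts as congested


def reward(buffer_occupancy, link_reliability, hop_count, max_hops):
    """Assumed reward: favors emptier buffers, reliable links, and fewer hops."""
    return (W_BUFFER * (1.0 - buffer_occupancy)
            + W_LINK * link_reliability
            + W_HOPS * (1.0 - hop_count / max_hops))


def best_q_from(q, neighbors, node):
    """Best known Q-value over the outgoing links of `node` (0.0 if none)."""
    return max((q.get((node, nxt), 0.0) for nxt in neighbors.get(node, [])),
               default=0.0)


def update_q(q, neighbors, node, next_hop, r):
    """Q-update whose target looks one and two hops past `next_hop` (assumed form)."""
    one_hop = best_q_from(q, neighbors, next_hop)
    two_hop = max((best_q_from(q, neighbors, n2)
                   for n2 in neighbors.get(next_hop, [])), default=0.0)
    target = r + GAMMA * one_hop + GAMMA ** 2 * two_hop
    old = q.get((node, next_hop), 0.0)
    q[(node, next_hop)] = old + ALPHA * (target - old)
    return q[(node, next_hop)]


def choose_next_hop(q, neighbors, occupancy, node, epsilon=0.1, rng=random):
    """With probability epsilon pick a random uncongested neighbor, else the best Q."""
    candidates = neighbors[node]
    uncongested = [n for n in candidates if occupancy[n] < CONGESTION_THRESHOLD]
    if uncongested and rng.random() < epsilon:
        return rng.choice(uncongested)
    return max(candidates, key=lambda n: q.get((node, n), 0.0))


# Toy topology: A can reach D through a congested node B or an idle node C.
neighbors = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
occupancy = {"B": 0.9, "C": 0.2, "D": 0.1}
q = {}
for hop in ("B", "C"):
    update_q(q, neighbors, "A", hop, reward(occupancy[hop], 1.0, 1, 4))
print(choose_next_hop(q, neighbors, occupancy, "A", epsilon=0.0))
```

In this toy topology the congested neighbor B earns a lower reward than C, so the greedy choice from A is C. In a software-defined network the controller would hold the Q-table and the per-node buffer statistics centrally, which is what makes the two-hop look-ahead cheap to evaluate.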


Congestion-aware routing; reinforcement learning; Q-learning; software-defined networks