Traffic-Aware Horizontal Pod Autoscaler in Kubernetes-Based Edge Computing Infrastructure

Abstract: 

Container-based Internet of Things (IoT) applications in an edge computing environment require autoscaling to dynamically adapt to fluctuations in IoT device requests. Although Kubernetes’ horizontal pod autoscaler provides resource autoscaling by monitoring the resource status of nodes and then adjusting the number of pods when necessary, it allocates pods evenly to worker nodes without considering the imbalance of resource demand between nodes in an edge computing environment. This paper proposes the traffic-aware horizontal pod autoscaler (THPA), which operates on top of Kubernetes to enable real-time traffic-aware resource autoscaling for IoT applications in an edge computing environment. THPA performs upscaling and downscaling actions based on network traffic information from nodes to improve the quality of IoT services in the edge computing infrastructure. Experimental results show that Kubernetes with THPA improves the average response time and throughput of IoT applications by approximately 150% compared to Kubernetes with the horizontal pod autoscaler. This indicates that resource scaling proportional to the network traffic distribution is important for maximizing IoT application performance in an edge computing environment.

Introduction  

Edge computing is a new paradigm that overcomes the inherent limitations of cloud computing by distributing edge nodes with computing resources closer to IoT devices. Because containers are lightweight and portable, it is easy to deploy, install, update, and delete application services on edge nodes, and various types of IoT services can be provided simultaneously at each edge node. As such, containerization is widely considered the most suitable technology for providing IoT services in edge computing environments. However, containerization technology is limited to deploying and managing container-level application services, so container orchestration is required to monitor and manage resource status across multiple edge nodes in an edge computing environment. Kubernetes is the de facto standard platform for such container orchestration.

However, despite the many benefits of Kubernetes, it is still in its infancy in an edge computing environment. In edge computing infrastructure, requests from devices are handled by container-based applications on edge nodes, and the traffic load varies over location and time. Namely, as some nodes are too busy handling a large amount of traffic while others are idle, an imbalance of demand occurs between nodes. In Kubernetes, the kube-proxy balances resource usage between nodes by distributing the incoming traffic at each node to all pods in the cluster in a random or round-robin manner. Because edge nodes are geographically distributed, there is network delay in their communication, so this kind of redirection offered by the kube-proxy can increase the response time of applications. Therefore, it is necessary to allocate additional resources, or terminate redundant ones, according to the network traffic at each node, maximizing the amount of locally handled traffic while minimizing the number of requests redirected to pods on remote nodes and the network delay that redirection incurs. Nevertheless, when the resource demands of the applications change, Kubernetes’ HPA (KHPA) only tries to distribute new pods evenly to nodes or terminate redundant pods based on pod status, without considering the network delay between edge nodes or the volume of network traffic accessing them in real time. This limitation of KHPA can degrade the quality of service and the overall throughput of the system.
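For reference, KHPA's scaling decision follows the standard Kubernetes formula: the desired replica count depends only on the ratio of the aggregate observed metric to its target, with no per-node traffic information. A minimal Python sketch of that rule (the function name is illustrative):

```python
import math

def khpa_desired_replicas(current_replicas: int,
                          current_metric: float,
                          target_metric: float) -> int:
    """Standard Kubernetes HPA rule:
    desired = ceil(currentReplicas * currentMetric / targetMetric).
    Note that no per-node traffic distribution enters the decision;
    the scheduler then spreads the new pods evenly across nodes."""
    return math.ceil(current_replicas * current_metric / target_metric)

# Example: 4 pods averaging 90% CPU with a 60% target -> scale up to 6 pods.
print(khpa_desired_replicas(4, 90.0, 60.0))  # 6
```

Because the formula aggregates the metric across all pods, a node saturated with local traffic and an idle node contribute to the same average, which is exactly the blind spot THPA addresses.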

To solve the aforementioned problem of KHPA in a Kubernetes-based edge computing infrastructure, this paper proposes the traffic-aware HPA (THPA), which operates on top of Kubernetes to provide dynamic resource autoscaling that accounts for the IoT service demand at each edge node. Specifically, in an upscaling event, THPA allocates additional pods in proportion to the distribution of network traffic accessing the nodes, whereas in a downscaling event, it terminates pods on the nodes with low demand. Experimental evaluations show that THPA significantly improves the average response time and throughput by maximizing the amount of traffic handled locally and avoiding the round-trip delay caused by redirection between edge nodes in an edge computing environment.
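The traffic-proportional upscaling step described above can be sketched as follows. The paper does not publish this exact code; the largest-remainder split below is an illustrative assumption about how "pods proportional to the traffic distribution" could be computed from per-node traffic counts:

```python
def traffic_aware_allocation(extra_pods: int, traffic: dict) -> dict:
    """Split `extra_pods` new pods across nodes in proportion to the
    network traffic observed at each node (illustrative sketch using
    a largest-remainder apportionment)."""
    total = sum(traffic.values())
    # Ideal fractional share of new pods for each node.
    shares = {node: extra_pods * t / total for node, t in traffic.items()}
    # Floor each share, then hand leftover pods to the nodes with the
    # largest fractional remainders.
    alloc = {node: int(s) for node, s in shares.items()}
    leftover = extra_pods - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - alloc[n],
                          reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1
    return alloc

# Example: 5 new pods, with node "edge-a" carrying most of the traffic.
print(traffic_aware_allocation(5, {"edge-a": 600, "edge-b": 300, "edge-c": 100}))
```

Downscaling would be the mirror image: pods are removed first from the nodes with the lowest traffic shares, so busy nodes keep their local capacity.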

Base Paper - https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9709810
