I have talked to many people in the industry and conducted many interviews. When someone says they are good at Kubernetes, my first question is "What exactly does kube-proxy do?" or some variation of it, such as "Does kube-proxy sit in the data path?" Most of the time, there is confusion about what exactly kube-proxy does.
In this writeup, let us look at what exactly kube-proxy does.
In short, kube-proxy talks to the API server to learn about all the Services and the pods backing them, and then configures iptables or IPVS rules accordingly. These rules are present on every node, so a request for any Service can land on any worker node, and the rules on that node will forward the packet to one of the Service's pods.
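To make this concrete, here is a minimal Python sketch of the kind of output this step produces: given a snapshot of Services and their pod endpoints, it renders iptables-style rules. The chain layout (KUBE-SERVICES, then a per-Service chain, then a per-endpoint chain ending in DNAT) follows what kube-proxy sets up in iptables mode, but the `Service` class, the `render_rules` function, the chain suffixes and all IP addresses are hypothetical simplifications; the real kube-proxy uses hashed chain names and adds many more rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str          # hypothetical Service name, used to build chain names
    cluster_ip: str    # virtual IP of the Service
    port: int          # Service port
    endpoints: tuple   # ("pod_ip:port", ...) of the pods backing the Service


def render_rules(services):
    """Return iptables-style rule strings for a snapshot of Services."""
    rules = []
    for svc in services:
        svc_chain = f"KUBE-SVC-{svc.name.upper()}"
        # Traffic addressed to the ClusterIP jumps to the per-Service chain.
        rules.append(
            f"-A KUBE-SERVICES -d {svc.cluster_ip}/32 -p tcp "
            f"--dport {svc.port} -j {svc_chain}"
        )
        total = len(svc.endpoints)
        for i, endpoint in enumerate(svc.endpoints):
            sep_chain = f"KUBE-SEP-{svc.name.upper()}-{i}"
            remaining = total - i
            if remaining > 1:
                # Pick this endpoint with probability 1/remaining; otherwise
                # fall through to the next rule in the chain.
                rules.append(
                    f"-A {svc_chain} -m statistic --mode random "
                    f"--probability {1 / remaining:.5f} -j {sep_chain}"
                )
            else:
                # The last endpoint catches everything that fell through.
                rules.append(f"-A {svc_chain} -j {sep_chain}")
            # The per-endpoint chain rewrites the destination to the pod.
            rules.append(
                f"-A {sep_chain} -p tcp -j DNAT --to-destination {endpoint}"
            )
    return rules


if __name__ == "__main__":
    snapshot = [
        Service("web", "10.96.0.20", 80, ("10.244.1.5:8080", "10.244.2.7:8080")),
    ]
    for rule in render_rules(snapshot):
        print(rule)
```

Note that nothing in this sketch touches a packet. It only turns cluster state into rules, which is the whole job being described here.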
kube-proxy does not sit in the data path. What this means is that kube-proxy does not handle packets itself. It only programs the node's iptables (or IPVS) rules, and the kernel applies those rules to the traffic. This is also the reason why kube-proxy needs to run on every node.
Let us understand this concept with the help of a diagram.
kube-proxy talks to the API server, which gets the details of all the Services and pods running in the cluster from etcd. Once kube-proxy has this information, it configures the iptables rules on its node. That was the control plane part. Now let's look at how traffic actually flows.
Traffic (the green line in the diagram) for any Service may land on any node. The iptables rules on that node rewrite the destination to the IP of one of the Service's pods, and the packet is forwarded to the node where that pod actually runs. Once it reaches that node, it is delivered to the pod.
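The forwarding step can be imitated with a small Python simulation. This is only a sketch under stated assumptions: `nat_table`, `pick_endpoint` and `dnat` are hypothetical names, the addresses are made up, and a dictionary lookup stands in for the kernel walking the rule chains. It assumes the iptables-mode scheme in which rule i of n matches with probability 1/(n - i), which makes every pod equally likely to be chosen.

```python
import random
from collections import Counter


def pick_endpoint(endpoints, rng):
    """Walk the endpoints in order, the way a rule chain is evaluated.

    Entry i of n matches with probability 1 / (n - i); the last entry
    always matches, so the overall choice is uniform.
    """
    total = len(endpoints)
    for i, endpoint in enumerate(endpoints):
        remaining = total - i
        if remaining == 1 or rng.random() < 1 / remaining:
            return endpoint
    raise ValueError("service has no endpoints")


def dnat(packet, nat_table, rng):
    """Rewrite the destination if it is a Service address, else pass through."""
    endpoints = nat_table.get((packet["dst_ip"], packet["dst_port"]))
    if not endpoints:
        return packet
    pod_ip, pod_port = pick_endpoint(endpoints, rng)
    return {**packet, "dst_ip": pod_ip, "dst_port": pod_port}


if __name__ == "__main__":
    # Hypothetical Service 10.96.0.20:80 backed by three pods.
    nat_table = {
        ("10.96.0.20", 80): [
            ("10.244.1.5", 8080),
            ("10.244.2.7", 8080),
            ("10.244.3.9", 8080),
        ],
    }
    rng = random.Random(0)
    packet = {"src_ip": "10.244.1.9", "dst_ip": "10.96.0.20", "dst_port": 80}
    hits = Counter(dnat(packet, nat_table, rng)["dst_ip"] for _ in range(30000))
    print(hits)  # roughly equal counts per pod IP
```

The rewrite happens entirely inside `dnat`, which plays the role of the kernel here. No kube-proxy code runs per packet, only the rules it installed earlier.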
In this way, kube-proxy facilitates access to Services from both inside and outside the cluster.
kube-proxy never sits in the data path; its only task is to configure the iptables or IPVS rules.