The problem is that during downscaling, clients get 502 errors: when a pod is stopping, its containers cannot correctly close existing connections.
So, in this post, we will take a closer look at the pods' termination process in general, and at NGINX and PHP-FPM containers in particular. Testing will be performed on the AWS Elastic Kubernetes Service with the Yandex.Tank utility. An Ingress resource will create an AWS Application Load Balancer via the AWS ALB Ingress Controller. Docker is used to manage containers on the Kubernetes WorkerNodes.

So, let's take an overview of the pods' stopping and termination process. Basically, a pod is a set of processes running on a Kubernetes WorkerNode, which are stopped by standard IPC (Inter-Process Communication) signals. To give the pod the ability to finish all its operations, a container runtime at first tries to stop it softly (graceful shutdown) by sending a SIGTERM signal to the PID 1 process in each of its containers (see docker stop). The signal to be sent can be redefined with the STOPSIGNAL instruction in the image's Dockerfile. Also, the cluster starts counting a grace period (terminationGracePeriodSeconds, 30 seconds by default) before force-killing the pod with a SIGKILL.

Thus, the whole flow of the pod's deletion is (actually, the part below is kinda a copy of the Kubernetes Termination of Pods documentation):

1. a user issues a kubectl delete pod or a scale-down command, and the cluster starts counting the grace period (30 seconds by default)
2. the API server marks the pod as Terminating, and the pod is removed from the Service endpoints list
3. if the pod has a preStop hook, the kubelet runs it
4. the kubelet sends the SIGTERM (or the image's STOPSIGNAL) to the PID 1 process in each container of the pod
5. when the grace period is over, any still-running processes are killed with SIGKILL, and the pod object is deleted from the API server
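Just to show where these settings live in practice, below is a minimal sketch of a pod spec with an explicit grace period (the pod name here is hypothetical, for illustration only):

```
apiVersion: v1
kind: Pod
metadata:
  # a hypothetical pod, for illustration only
  name: graceful-demo
spec:
  # time between the SIGTERM and the SIGKILL; 30 seconds if omitted
  terminationGracePeriodSeconds: 30
  containers:
  - name: web
    image: nginx
```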
Actually, there are two issues here:

1. NGINX and PHP-FPM treat the SIGTERM as a fast, "brutal" shutdown and will just drop active connections instead of waiting for them to finish
2. the removal of the pod from the Service endpoints and the SIGTERM delivery happen at the same time, so a terminating pod can still receive new requests
E.g. if we have a connection to an NGINX server, the NGINX master process during the fast shutdown will just drop this connection, and our client will receive a 502 error; see the Avoiding dropped connections in nginx containers with "STOPSIGNAL SIGQUIT" post.

NGINX STOPSIGNAL and 502

Okay, now we've got some understanding of how it's going – let's try to reproduce the first issue with NGINX. The example below is taken from the post above and will be deployed to a Kubernetes cluster.

Prepare a Dockerfile:

```
FROM nginx

RUN echo 'server {\n\
listen 80 default_server;\n\
location / {\n\
proxy_pass http://httpbin.org/delay/10;\n\
}\n\
}' > /etc/nginx/conf.d/default.conf

CMD ["nginx", "-g", "daemon off;"]
```

Here NGINX will proxy_pass a request to http://httpbin.org, which will respond with a 10-second delay to emulate a PHP backend.

Build an image and push it to a repository:

```
docker build -t setevoy/nginx-sigterm .
docker push setevoy/nginx-sigterm
```
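Before going to the cluster, you can observe the difference between the two signals locally with plain Docker; a rough sketch, assuming the image built above (the container name and the port mapping are arbitrary):

```
# start the container locally
docker run -d --name nginx-test -p 8080:80 setevoy/nginx-sigterm

# start a slow request in the background
curl http://localhost:8080/ &

# 'docker stop' sends the image's STOPSIGNAL (SIGTERM here),
# and the in-flight request above is dropped
docker stop nginx-test

# after restarting the container, send SIGQUIT instead:
# NGINX performs a graceful shutdown and finishes the request
docker start nginx-test
curl http://localhost:8080/ &
docker kill --signal=SIGQUIT nginx-test
```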
Now, add a Deployment manifest to spin up 10 pods from this image. Here is the full file with a Namespace, Service, and Ingress; in the following parts of this post, only the updated parts of the manifest will be shown:

```
---
apiVersion: v1
kind: Namespace
metadata:
  name: test-namespace
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
  namespace: test-namespace
  labels:
    app: test
spec:
  replicas: 10
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - name: web
        image: setevoy/nginx-sigterm
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        readinessProbe:
          tcpSocket:
            port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: test-svc
  namespace: test-namespace
spec:
  type: NodePort
  selector:
    app: test
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-ingress
  namespace: test-namespace
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: test-svc
          servicePort: 80
```

Deploy it:

```
kubectl apply -f test-deployment.yaml
namespace/test-namespace created
deployment.apps/test-deployment created
service/test-svc created
ingress.extensions/test-ingress created
```

Check the Ingress:

```
curl -I aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com
HTTP/1.1 200 OK
```

And we have 10 pods running:

```
kubectl -n test-namespace get pod
NAME                              READY   STATUS    RESTARTS   AGE
test-deployment-ccb7ff8b6-2d6gn   1/1     Running   0          26s
test-deployment-ccb7ff8b6-4scxc   1/1     Running   0          35s
test-deployment-ccb7ff8b6-8b2cj   1/1     Running   0          35s
test-deployment-ccb7ff8b6-bvzgz   1/1     Running   0          35s
test-deployment-ccb7ff8b6-db6jj   1/1     Running   0          35s
test-deployment-ccb7ff8b6-h9zsm   1/1     Running   0          20s
test-deployment-ccb7ff8b6-n5rhz   1/1     Running   0          23s
test-deployment-ccb7ff8b6-smpjd   1/1     Running   0          23s
test-deployment-ccb7ff8b6-x5dc2   1/1     Running   0          35s
test-deployment-ccb7ff8b6-zlqxs   1/1     Running   0          25s
```

Prepare a load.yaml for the Yandex.Tank:

```
phantom:
  address: aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com
  header_http: "1.1"
  headers:
    - "[Host: aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com]"
  uris:
    - /
  load_profile:
    load_type: rps
    schedule: const(100,30m)
  ssl: false
console:
  enabled: true
telegraf:
  enabled: false
  package: yandextank.plugins.Telegraf
  config: monitoring.xml
```

Here, we will perform 100 requests per second for 30 minutes (the const(100,30m) schedule) against the pods behind our Ingress.

Run the tests – all good so far. Now, scale down the Deployment to only one pod:

```
kubectl -n test-namespace scale deploy test-deployment --replicas=1
deployment.apps/test-deployment scaled
```

Pods became Terminating:

```
kubectl -n test-namespace get pod
NAME                              READY   STATUS        RESTARTS   AGE
test-deployment-647ddf455-67gv8   1/1     Terminating   0          4m15s
test-deployment-647ddf455-6wmcq   1/1     Terminating   0          4m15s
test-deployment-647ddf455-cjvj6   1/1     Terminating   0          4m15s
test-deployment-647ddf455-dh7pc   1/1     Terminating   0          4m15s
test-deployment-647ddf455-dvh7g   1/1     Terminating   0          4m15s
test-deployment-647ddf455-gpwc6   1/1     Terminating   0          4m15s
test-deployment-647ddf455-nbgkn   1/1     Terminating   0          4m15s
test-deployment-647ddf455-tm27p   1/1     Running       0          26m
...
```
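By the way, if you don't have the Tank at hand, a plain curl loop against the same Ingress can surface the errors during the scale-down; a rough sketch (the hostname is our ALB endpoint from above):

```
# print only the HTTP status code, once per second;
# any 502 during the scale-down indicates a dropped connection
while true; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com
  sleep 1
done
```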
And we got our 502 errors.

Next, update the Dockerfile – add the STOPSIGNAL SIGQUIT:

```
FROM nginx

RUN echo 'server {\n\
listen 80 default_server;\n\
location / {\n\
proxy_pass http://httpbin.org/delay/10;\n\
}\n\
}' > /etc/nginx/conf.d/default.conf

STOPSIGNAL SIGQUIT

CMD ["nginx", "-g", "daemon off;"]
```

Build, push:

```
docker build -t setevoy/nginx-sigquit .
docker push setevoy/nginx-sigquit
```

Update the Deployment with the new image:

```
...
    spec:
      containers:
      - name: web
        image: setevoy/nginx-sigquit
        ports:
        - containerPort: 80
...
```

Redeploy, and check again. Run the tests, then scale down the deployment again:

```
kubectl -n test-namespace scale deploy test-deployment --replicas=1
deployment.apps/test-deployment scaled
```

And no errors this time. Great!
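As a side note, you can verify which stop signal is baked into an image with docker inspect; a quick sketch:

```
# prints "SIGQUIT" for the new image;
# an empty result means the default SIGTERM is used
docker inspect --format '{{.Config.StopSignal}}' setevoy/nginx-sigquit
```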
Traffic, preStop, and sleep

But still, if we repeat the tests a few times, we can still get some 502 errors. This time, most likely, we are facing the second issue – the endpoints update is performed at the same time as the SIGTERM is sent.

Let's add a preStop hook with a sleep to give some time to update the endpoints and our Ingress, so that after the cluster receives a request to stop the pod, the kubelet on its WorkerNode will wait for 5 seconds before sending the SIGTERM:

```
...
    spec:
      containers:
      - name: web
        image: setevoy/nginx-sigquit
        ports:
        - containerPort: 80
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sleep","5"]
...
```

Repeat the tests – and now everything is fine.

Our PHP-FPM had no such issue, as its image was initially built with the STOPSIGNAL SIGQUIT.

Other possible solutions

And of course, during debugging, I've tried some other approaches to mitigate the issue. See the links at the end of this post; here I'll describe them in short.

preStop and SIGQUIT
One of the solutions was to add a preStop hook which would send a SIGQUIT to NGINX:

```
...
        lifecycle:
          preStop:
            exec:
              command:
              - /usr/sbin/nginx
              - -s
              - quit
...
```

Or:

```
...
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/kill
              - -SIGQUIT
              - "1"
...
```

But it didn't help. Not sure why, as the idea seems to be correct – instead of waiting for the SIGTERM from Kubernetes/Docker, we gracefully stop the NGINX master process ourselves by sending it a SIGQUIT. You can also run the strace utility to check which signal is really received by NGINX.

NGINX + PHP-FPM, supervisord, and stopsignal

Our application is running in two containers in one pod, but during the debugging I've also tried to use a single container with both NGINX and PHP-FPM, for example, trafex/alpine-nginx-php7. There I've tried to add the stopsignal option to the supervisord configuration for both NGINX and PHP-FPM with the QUIT value, but this also didn't help, although the idea seems to be correct. Still, one can try this way.

PHP-FPM and process_control_timeout

In the Graceful shutdown in Kubernetes is not always trivial post, and on Stack Overflow in the Nginx / PHP FPM graceful stop (SIGQUIT): not so graceful question, there is a note that FPM's master process can be killed before its children, which can lead to 502 errors as well. It is not our current case, but pay attention to the process_control_timeout setting.
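For reference, this option lives in the global section of the FPM configuration; a minimal sketch (the 10s value is just an example):

```
; php-fpm.conf
[global]
; give the child processes time to finish their current request
; before the master reacts to SIGINT/SIGTERM/SIGQUIT (default: 0)
process_control_timeout = 10s
```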
NGINX, HTTP, and keep-alive sessions

Also, it can be a good idea to use the Connection: close header – then the client will close its connection right after a request is finished, and this can decrease the number of 502 errors.
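In our test config, this could look like the following sketch: keepalive_timeout 0 disables keep-alive, so NGINX returns the Connection: close header with each response.

```
server {
    listen 80 default_server;

    # disable keep-alive: NGINX adds "Connection: close"
    # and closes the connection after each response
    keepalive_timeout 0;

    location / {
        proxy_pass http://httpbin.org/delay/10;
    }
}
```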