Does ALB maintain a single back-end connection with each EC2 instance, or a separate back-end connection for each client?

1

When an HTTP request is submitted to an ALB, the ALB maintains two connections:

  1. A TCP connection with the client that submitted the request. We call this the front-end connection.
  2. A TCP connection with the EC2 instance, registered in the target group, to which the ALB forwards the request. We call this the back-end connection.

The ALB scales automatically with the number of clients, so in practice it can establish as many front-end connections as it needs.

My question is about back-end connections. Does the ALB establish a single connection with each underlying EC2 instance, or a separate connection for each client connecting to the ALB?

For example, say there is one EC2 instance registered in the ALB target group, and the ALB receives two HTTP requests from two different clients over the internet at the same time. The ALB will establish two front-end connections, one to each client. It will then forward the requests to the registered EC2 instance over one or more back-end connections. Will there be a separate back-end connection for each client (in this case, two back-end connections)? Or will there be a single back-end connection, irrespective of the number of clients, through which all requests are routed to the EC2 instance?

We have an Apache Tomcat server deployed on EC2 instances behind an ALB. The server is configured to accept no more than 200 HTTP requests over a single HTTP connection. During load testing, we get many 504 Gateway Timeout errors when we send 400 requests over an interval of 10 seconds from a single machine.

Suppose the ALB establishes only a single TCP connection with each underlying EC2 instance, and the server has already received 200 requests over that back-end connection. If the responses to some of those requests are still pending, the connection can't be closed yet. Any new request that reaches the server before this exhausted connection is closed will end in an error.

If instead the ALB maintains a separate back-end connection for each client, what factors does it consider when deciding to open a new back-end connection rather than reuse an existing one?

We want to load test our application, and the answers to the questions above will help me understand how to do that properly.

2 Answers
0
Accepted Answer

Hello,

You're absolutely right about the two connections the ALB maintains:

  • Front-end connection: This connects the ALB to the client making the request.
  • Back-end connection: This connects the ALB to the EC2 instance in the target group.

ALB establishes a single back-end connection with each EC2 instance in the target group, irrespective of the number of clients.

In your example, even with two clients sending requests, ALB will maintain only one back-end connection to the EC2 instance. This connection acts as a channel for forwarding requests from multiple clients to the EC2 instance.

Reason for 504 Gateway Timeout Errors:

Your assumption about the single back-end connection being the culprit for the 504 errors is likely correct. Here's why:

Tomcat Server Limit: With a limit of 200 requests per connection, exceeding this limit while the connection is still open will result in new requests being rejected.

Multiple Clients, Single Connection: With a single back-end connection, if the first 200 requests from both clients fill the connection, subsequent requests will be rejected until the existing requests are processed and the connection frees up. This delay can lead to timeouts.

Understanding ALB's Decision for New Connections:

While ALB uses a single connection per EC2 instance, it does consider certain factors before reusing an existing connection:

Healthy Target: The EC2 instance needs to be healthy in the target group to receive new requests.

Connection Idle Timeout: ALB maintains an idle timeout period for back-end connections. If a connection remains inactive for a certain time (configurable), it may be closed and a new connection established for the next request.

Least Outstanding Requests: In some configurations, ALB might prefer back-end connections with fewer outstanding requests to minimize queuing. (This behavior can be disabled)

ALB Connection Management: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html

ALB Target Group Health: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html

Tomcat Connection Management: https://tomcat.apache.org/tomcat-9.0-doc/config/index.html

Tomcat Connection Pooling: https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html

ALB Metrics: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-cloudwatch-metrics.html

EXPERT
Sandeep
answered a month ago
EXPERT
reviewed a month ago
  • "ALB establishes a single back-end connection with each EC2 instance in the target group, irrespective of the number of clients." This is not correct. We maintain a pool of connections to each target. Someone from the ELB team will post a correct and complete response shortly.

0

Here is what I found:

  • We are using Tomcat to serve our web application.
  • It is configured to use the HTTP/1.1 protocol.
  • It doesn't support HTTP pipelining.
  • The target group is configured with the HTTP/1.1 protocol.

When a request arrives at the Application Load Balancer, the ALB needs a TCP connection with the underlying server before it can send the request (TCP/IP model). It has two options at that moment:

  1. Reuse an existing keep-alive TCP connection and send the request over it.
  2. Establish a new TCP connection and send the request over it.

Because the underlying server doesn't support HTTP pipelining, the ALB can't send another request over a TCP connection while an earlier request on that connection is still waiting for its response. In other words, only one request can be in flight over a TCP connection at any moment. Keep-alive connections do allow multiple requests over a single TCP connection, but under HTTP/1.1 without pipelining those requests must be sent one after another: to send the next request, the ALB has to wait for the response to the previous one.
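To make the "one request in flight per connection" rule concrete, here is a minimal sketch using only Python's standard library. It is not how the ALB or Tomcat is implemented: the local test server, `Handler`, `seen_client_ports`, and `run_demo` are all made up for illustration. It shows two requests reusing one keep-alive connection in sequence, and the client library refusing to send a second request while the first is still unanswered.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Source port of every request the demo server handles (illustrative only).
seen_client_ports = []


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection alive

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    """Return (statuses of two sequential requests, whether pipelining was blocked)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        statuses = []
        for _ in range(2):
            # Each request waits for its response before the next one is sent.
            conn.request("GET", "/")
            resp = conn.getresponse()
            resp.read()
            statuses.append(resp.status)

        # Try to send a new request while one is still outstanding.
        conn.request("GET", "/")
        try:
            conn.request("GET", "/")
            pipelining_blocked = False
        except http.client.CannotSendRequest:
            pipelining_blocked = True
        return statuses, pipelining_blocked
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

After `run_demo()` returns, every entry in `seen_client_ports` is the same port, which indicates that all requests travelled over one TCP connection.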

So for the ALB to reuse an existing keep-alive TCP connection, there must NOT be an outstanding request on it. (An outstanding request is one for which the ALB is still waiting for the server's response.) If the ALB finds a keep-alive connection with no outstanding request, it sends the request over that connection and waits for the response before sending another request over it.

A TCP connection doesn't live forever. We have configured our Tomcat server to close a connection 20 seconds after it is opened, or once it has served 200 requests.

In the second case, where the ALB doesn't find an existing keep-alive TCP connection without an outstanding request, it immediately tries to establish a new TCP connection with the underlying server. (This must happen within 10 seconds of the ALB receiving the HTTP request.) Once the connection is established, the ALB sends the request over it.
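The two cases above can be sketched as a toy connection pool. This is only a model of the behaviour described in this answer, not ALB's actual algorithm. `BackendConnection`, `BackendPool`, and `run_waves` are hypothetical names, and the 200-request limit is the value from our Tomcat configuration.

```python
class BackendConnection:
    """One keep-alive connection that carries at most one request at a time."""

    def __init__(self):
        self.served = 0    # requests completed on this connection
        self.busy = False  # True while a request is outstanding


class BackendPool:
    def __init__(self, max_requests_per_connection=200):
        self.max_requests = max_requests_per_connection
        self.connections = []
        self.opened = 0  # total connections ever opened

    def acquire(self):
        # Case 1: reuse a keep-alive connection with no outstanding request.
        for conn in self.connections:
            if not conn.busy:
                conn.busy = True
                return conn
        # Case 2: every connection is busy (or none exists), so open a new one.
        conn = BackendConnection()
        conn.busy = True
        self.connections.append(conn)
        self.opened += 1
        return conn

    def release(self, conn):
        conn.served += 1
        conn.busy = False
        # The server closes the connection once it has served its quota.
        if conn.served >= self.max_requests:
            self.connections.remove(conn)


def run_waves(pool, total_requests, concurrency):
    """Send requests in waves of `concurrency` simultaneous requests.

    Returns how many back-end connections were opened in total.
    """
    remaining = total_requests
    while remaining > 0:
        wave = [pool.acquire() for _ in range(min(concurrency, remaining))]
        for conn in wave:
            pool.release(conn)
        remaining -= len(wave)
    return pool.opened
```

In this model, `run_waves(BackendPool(), 400, 1)` opens 2 connections, because strictly sequential requests exhaust the 200-request quota once. `run_waves(BackendPool(), 400, 40)` opens 40, because 40 simultaneous requests each need their own connection.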

There is a limit on the number of active TCP connections our Tomcat server can hold at any moment. If a new connection attempt arrives over that limit, the server does not accept it, so the client trying to connect (here, the ALB) fails to establish the connection. The ALB responds to such failures with a 504 Gateway Timeout error.
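When planning a load test, a rough way to check whether a test will hit this limit is to estimate how many connections are busy at once as arrival rate times average response time (Little's law). The sketch below is a back-of-the-envelope aid, not something ALB or Tomcat computes. The function names are made up, and the 2-second response time and 50-connection limit in the usage note are assumed values, not measurements from our setup.

```python
import math


def estimate_concurrent_connections(total_requests, window_seconds, avg_response_seconds):
    """Estimate how many back-end connections are busy at the same time.

    Without pipelining, each in-flight request occupies one connection, so
    concurrency is roughly arrival rate multiplied by average response time.
    """
    arrival_rate = total_requests / window_seconds  # requests per second
    return math.ceil(arrival_rate * avg_response_seconds)


def exceeds_limit(total_requests, window_seconds, avg_response_seconds, max_connections):
    """True if the estimated concurrency is above the server's connection limit."""
    needed = estimate_concurrent_connections(
        total_requests, window_seconds, avg_response_seconds
    )
    return needed > max_connections
```

For the test in the question (400 requests over 10 seconds), an assumed average response time of 2 seconds gives `estimate_concurrent_connections(400, 10, 2.0) == 80`. Against a hypothetical limit of 50 connections, `exceeds_limit(400, 10, 2.0, 50)` is `True`, so some connection attempts would fail.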

answered a month ago