Redirection issue with load balancer: https://xyz.com redirects to https://www.xyz.com, but curl shows https://www.xyz.com:443/ in the Location header

0

I'm running into issues when I check my site with the curl -I command.

curl -I http://xyz.com/ redirects to https://xyz.com/. Shouldn't this redirect to https://www.xyz.com/?

curl -I https://xyz.com/ returns Location: https://www.xyz.com:443/. Should the :443 be there?

Is it all right for the port to show there? I'm also having trouble getting my site crawled: crawlers only fetch the home page, and some tools report that I have two domains, which shouldn't be the case.

  • The main issue is that my website is not getting crawled. My concern is that tools see two domains, xyz.com and www.xyz.com, and I think that is the reason the site isn't being crawled. I have checked the redirection and it looks correct: xyz.com redirects to www.xyz.com with a 301.

asked a month ago, 329 views
2 Answers
0

I suggest you configure a custom listener rule for the HTTP listener port: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/listener-update-rules.html#add-rule.

Set the rule to inspect the "Host" header to match the pattern xyz.com and any other patterns you want to redirect. Set the action to "redirect to URL". You can construct the destination URL from the variables #{host}, #{port}, and #{path}, which are extracted from the originally requested URI. As you already saw, if the redirect specifies #{host} as the target host, xyz.com will get sent to xyz.com and not www.xyz.com.

If you only have one site behind the load balancer, the cleanest setup is to specify the exact host you want to redirect to, such as www.xyz.com (not using #{host} at all), and perhaps to set the path to / instead of #{path}, so that users trying an outdated HTTP link get sent to the HTTPS front page. When you configure the target as www.xyz.com, your users will get sent directly to the www address.
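To make the difference between #{host} and a fixed host concrete, here is a minimal Python sketch that imitates the placeholder substitution. It is a simplified model and not the ALB's actual implementation: the function name `expand_redirect` is made up for illustration, it assumes #{path} holds the path without its leading slash, and it always writes the port into the result, mirroring the Location header shown in the question.

```python
from urllib.parse import urlsplit

# Default ports used when the request URL does not name one explicitly.
DEFAULT_PORTS = {"http": "80", "https": "443"}


def expand_redirect(request_url, protocol="#{protocol}", host="#{host}",
                    port="#{port}", path="/#{path}"):
    """Build a Location value by filling ALB-style placeholders (simplified model)."""
    parts = urlsplit(request_url)
    values = {
        "#{protocol}": parts.scheme,
        "#{host}": parts.hostname,
        "#{port}": str(parts.port or DEFAULT_PORTS[parts.scheme]),
        # Assumption: the placeholder carries the path without its leading "/".
        "#{path}": parts.path.lstrip("/"),
    }

    def fill(template):
        for placeholder, value in values.items():
            template = template.replace(placeholder, value)
        return template

    return f"{fill(protocol)}://{fill(host)}:{fill(port)}{fill(path)}"


# Redirect that keeps #{host}: the bare domain stays the bare domain.
print(expand_redirect("http://xyz.com/", protocol="https", port="443"))
# -> https://xyz.com:443/

# Redirect with a fixed host: the request lands on the www address directly.
print(expand_redirect("http://xyz.com/", protocol="https",
                      host="www.xyz.com", port="443"))
# -> https://www.xyz.com:443/
```

Passing `path="/"` instead of the default models the option of sending every outdated link to the front page.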

Finally, set the default rule for the HTTP listener to return a "fixed response" and set it to HTTP status code 404 and empty content. The reason I'm suggesting this is that all remaining real users who legitimately connect over HTTP are guaranteed to know the URL of your site, so they'll hit the custom listener rule you created above. The only ones to hit the default rule returning a 404 are the vast numbers of bots and shady actors constantly scanning the whole internet to find potential targets for attacks. You'd only be doing them a small favour by pointing them to the real address of your site when they connect to your ALB blindly, without even knowing its name, instead of returning a 404 to such blind requests.

For the same reason, I'd also suggest you configure your HTTPS listener to require a valid Host header to forward the request to your backend compute, and return 404 in the default listener rule also for HTTPS.
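If you prefer to script this rather than click through the console, the two settings could look roughly like the sketch below. It only builds the argument dictionaries; the helper names `redirect_rule` and `default_404`, the priority value, and the listener ARN placeholder are illustrative assumptions. The dictionaries are meant to be passed to boto3's `elbv2` client (`create_rule` and `modify_listener`), which is shown in comments only so the snippet runs without AWS credentials.

```python
def redirect_rule(listener_arn, source_hosts, target_host,
                  path="/#{path}", priority=10):
    """Arguments for elbv2.create_rule: match the Host header, redirect with a 301."""
    return {
        "ListenerArn": listener_arn,
        "Priority": priority,  # assumed free priority slot on the listener
        "Conditions": [{
            "Field": "host-header",
            "HostHeaderConfig": {"Values": list(source_hosts)},
        }],
        "Actions": [{
            "Type": "redirect",
            "RedirectConfig": {
                "Protocol": "HTTPS",
                "Port": "443",
                "Host": target_host,   # fixed host instead of #{host}
                "Path": path,          # pass "/" to send everyone to the front page
                "Query": "#{query}",
                "StatusCode": "HTTP_301",
            },
        }],
    }


def default_404(listener_arn):
    """Arguments for elbv2.modify_listener: default action returns an empty 404."""
    return {
        "ListenerArn": listener_arn,
        "DefaultActions": [{
            "Type": "fixed-response",
            "FixedResponseConfig": {"StatusCode": "404", "ContentType": "text/plain"},
        }],
    }


# Hypothetical usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# elbv2 = boto3.client("elbv2")
# arn = "<http-listener-arn>"  # placeholder, not a real ARN
# elbv2.create_rule(**redirect_rule(arn, ["xyz.com", "www.xyz.com"], "www.xyz.com"))
# elbv2.modify_listener(**default_404(arn))
```

Requests whose Host header matches one of the listed names get the 301 to the www address; everything else falls through to the default 404.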

EXPERT
Leo K
answered a month ago
EXPERT
reviewed a month ago
  • Thanks for your response. I will set up the 404 for HTTP.

    But my main issue is that I'm unable to crawl my website, because tools report two domains, xyz.com and www.xyz.com. Whenever I try to crawl the site, only the home page and a few fragments get crawled.

0

When you enable the redirect HTTP to HTTPS, the ALB is only going to change the protocol/port. It will not redirect to another (sub) domain such as www.xyz.com.
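On the :443 in the Location header: 443 is the default port for https, so under standard URL normalization https://www.xyz.com:443/ and https://www.xyz.com/ refer to the same resource, while xyz.com and www.xyz.com remain different hosts. The sketch below illustrates this with Python's standard library. The helper name `strip_default_port` is made up for illustration, it ignores any user-info part of the URL, and whether a given crawler or SEO tool normalizes the port this way is something to verify for that tool.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports per scheme; an explicit default port adds no information.
DEFAULT_PORTS = {"http": 80, "https": 443}


def strip_default_port(url):
    """Return the URL without its port when the port is the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None and parts.port == DEFAULT_PORTS.get(parts.scheme):
        # Rebuild the URL with the bare hostname as the network location.
        return urlunsplit((parts.scheme, parts.hostname, parts.path,
                           parts.query, parts.fragment))
    return url


print(strip_default_port("https://www.xyz.com:443/"))
# -> https://www.xyz.com/

# The host is what actually differs between the two names.
print(strip_default_port("https://xyz.com:443/") == "https://www.xyz.com/")
# -> False
```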

You should look to create an apex A record for xyz.com that is an alias to the ALB, so that requests for xyz.com reach the load balancer and can be redirected to www.xyz.com.
Relevant docs: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resource-record-sets-choosing-alias-non-alias.html
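As a rough illustration, an apex alias record could be described with a change batch like the one built below. The helper name `apex_alias_change` and all argument values are placeholders. The dictionary is meant for boto3's `route53` client (`change_resource_record_sets`), shown only in comments. Note that the alias by itself just points the name at the ALB; the redirect to the www host still has to come from a listener rule.

```python
def apex_alias_change(hosted_zone_id, apex_name, alb_dns_name, alb_zone_id):
    """Arguments for route53.change_resource_record_sets: apex A record aliased to an ALB."""
    return {
        "HostedZoneId": hosted_zone_id,  # the Route 53 zone that holds xyz.com
        "ChangeBatch": {
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": apex_name,
                    "Type": "A",
                    "AliasTarget": {
                        # The load balancer's own canonical hosted zone ID,
                        # which is different from the zone that holds the record.
                        "HostedZoneId": alb_zone_id,
                        "DNSName": alb_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }],
        },
    }


# Hypothetical usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# route53 = boto3.client("route53")
# route53.change_resource_record_sets(**apex_alias_change(
#     "<route53-zone-id>", "xyz.com.", "<alb-dns-name>", "<alb-canonical-zone-id>"))
```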

Hope this helps!

AWS
EXPERT
iBehr
answered a month ago
EXPERT
reviewed a month ago