Installing libraries in a Glue Python shell job while connected to a VPC


I am using a public subnet whose route table sends destination 0.0.0.0/0 to an internet gateway. I am still not able to reach PyPI to download the necessary libraries in a Glue Python shell job. I am getting the error below; any help would be appreciated. Also, if I do not connect the job to the Redshift VPC, I am able to download libraries using --additional-python-modules.

Error: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f5213a9a0>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/geopy/

1 Answer

If the download attempt is actually being made from inside your VPC, the job should be attached to a private subnet whose default route points to a NAT gateway in a public subnet. The public subnet where the NAT gateway resides should have its default route pointing to the internet gateway (IGW).

An IPv4-only network interface in a public subnet can't connect to the internet if it doesn't have a public IP address. The NAT gateway provides the translation that maps the private IP to an internet-routable public IP address.

There's a diagram of this standard design here: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-example-private-subnets-nat.html
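To check a subnet against the layout described above, something like the following sketch may help. It inspects a route table in the dictionary shape returned by the EC2 `describe_route_tables` call and reports whether the default route goes to a NAT gateway or to an internet gateway. The function names and the sample IDs (`igw-0example`, `nat-0example`) are made up for illustration; only the `Routes`, `DestinationCidrBlock`, `GatewayId`, and `NatGatewayId` keys come from the EC2 API.

```python
def default_route_target(route_table):
    """Classify the 0.0.0.0/0 route of one route table.

    Returns "nat", "igw", "other", or None when there is no default route.
    """
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("NatGatewayId"):
            return "nat"
        if str(route.get("GatewayId", "")).startswith("igw-"):
            return "igw"
        return "other"
    return None


def suitable_for_glue_internet_access(route_table):
    """True only when the default route goes through a NAT gateway."""
    return default_route_target(route_table) == "nat"


# Hypothetical sample data shaped like describe_route_tables output.
PUBLIC_RT = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
        {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0example"},
    ]
}
PRIVATE_RT = {
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
        {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0example"},
    ]
}

print(default_route_target(PUBLIC_RT))    # igw
print(default_route_target(PRIVATE_RT))   # nat
```

With boto3, the route table for a given subnet can usually be fetched with `describe_route_tables` and the `association.subnet-id` filter; a subnet with no explicit association falls back to the VPC's main route table, so an empty result there does not mean the subnet has no routes.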

EXPERT
Leo K
answered 2 months ago
EXPERT
reviewed 2 months ago
  • Once I create a private subnet in the same Availability Zone, will I need to use the private subnet for the VPC connection in Glue, and not the public subnet?

  • Yes, precisely right @Vanishree.

  • Glue will not work with a public subnet. The subnet needs to be private, with a NAT gateway, as Leo K describes.

  • It worked, thank you.

  • @Gary Mclean: I really wish the documentation were clearer about that, because it led me to believe that if I used a public subnet, the Glue job runner would be able to use the IGW to reach the internet without needing a NAT gateway from a private subnet ... https://docs.aws.amazon.com/glue/latest/dg/setup-vpc-for-pypi.html