AWS Transit Gateway – Active/Passive IPSEC Tunnels

Like so many others, my organization over the last few years has been leveraging AWS cloud services for IT infrastructure.  It started out with one account, one VPC, one VPN tunnel (2 for redundancy) and a handful of EC2 instances.  Gradually over time this simple cloud presence has multiplied drastically.  There are now numerous EC2 instances, VPCs, VPN tunnels, regions in use, and AWS accounts.  I was not around to witness this AWS footprint grow; I inherited the network responsibilities after the fact. 

When I arrived, I investigated the on premises to AWS connectivity and discovered a web of data center to VPC VPN tunnels.  Setting up internal connectivity from an on prem router/firewall to a VPC is relatively straightforward.  In AWS you follow the steps here to set up a site to site IPSEC tunnel and then download a configuration ‘walk through’ to configure the physical device where you want connectivity.  Each VPN connection you configure on the AWS side results in two tunnels to your on prem gateway for redundancy.  In the end, on the customer network side you get two VPN tunnels to every VPC needing internal network connectivity.  As one can imagine, this doesn’t scale very well.  Even 15 AWS VPCs result in 30 VPN tunnels with 30 separate BGP peers or static routes, depending on which routing option you go for.

I was able to simplify my org’s cloud networking by utilizing AWS’ Transit Gateway service.  A Transit Gateway (TGW) essentially moves your cloud networking architecture to a hub and spoke model.  Instead of setting up direct connections from your data center to each VPC separately, each VPC is attached to the TGW, and the local router/firewall has a single IPSEC VPN connection that terminates on the TGW as well.  The TGW provides all internal connectivity between the VPCs and on prem.  Per the example above, moving to a TGW hub and spoke takes you from 30 VPN tunnels to two.  The two VPN tunnels to the TGW are for redundancy, and for equal cost multi-path if you want to go the ECMP route.
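The tunnel arithmetic above can be sketched in a few lines of Python.  This is only an illustration: the function names are made up for this post, and it assumes every Site-to-Site VPN connection yields two tunnels, as described earlier.

```python
# Assumption from the text: each AWS Site-to-Site VPN connection builds two tunnels.
TUNNELS_PER_VPN_CONNECTION = 2


def tunnels_direct_to_vpcs(vpc_count: int) -> int:
    """One VPN connection per VPC, each terminated on the on-prem device."""
    return vpc_count * TUNNELS_PER_VPN_CONNECTION


def tunnels_via_transit_gateway(vpc_count: int) -> int:
    """One VPN connection to the TGW, regardless of how many VPCs are attached."""
    return TUNNELS_PER_VPN_CONNECTION if vpc_count > 0 else 0


if __name__ == "__main__":
    for vpcs in (1, 15, 40):
        print(vpcs, tunnels_direct_to_vpcs(vpcs), tunnels_via_transit_gateway(vpcs))
```

The direct model grows linearly with the number of VPCs, while the hub and spoke model stays at two tunnels on the on prem device.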

    Example Before:

    Example After:

Per the AWS Transit Gateway documentation, the maximum bandwidth per VPN tunnel is 1.25 Gbps, which exceeds the bandwidth of the internet connection the org uses to terminate the site to site tunnels.  A single tunnel can therefore carry everything the link can deliver, so ECMP is not needed and an active/passive tunnel approach is more suitable.  An issue I ran into while setting this up had to do with tunnel priority and BGP.  I discovered that when setting up a standard VPN connection direct to a VPC, AWS advertises routes to the on prem device over BGP with a MED value of 200 for one tunnel and 100 for the other.  Because BGP prefers the lower MED, this gives you a primary and a secondary tunnel and avoids asymmetric routing.  However, when two tunnels are configured to a TGW, AWS does not advertise routes to the customer on prem device with different MED values.  This is a problem if you’re looking for an active/passive approach with zero ECMP.
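To make the MED difference concrete, here is a minimal sketch of the tie-break, assuming every other BGP attribute is equal on both tunnels.  The class and function names, the tunnel labels, and the prefix are hypothetical.  The 100/200 values come from the direct-to-VPC case above; the TGW case simply uses the same placeholder value on both tunnels, since all that matters is that they do not differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    tunnel: str  # hypothetical tunnel label
    prefix: str  # hypothetical VPC prefix
    med: int     # Multi-Exit Discriminator received from AWS


def preferred_tunnels(ads: list[Advertisement]) -> list[str]:
    """Return the tunnel(s) carrying the lowest MED (lower MED is preferred)."""
    lowest = min(ad.med for ad in ads)
    return sorted(ad.tunnel for ad in ads if ad.med == lowest)


# Direct-to-VPC VPN: AWS sends different MED values per tunnel.
direct_to_vpc = [
    Advertisement("tunnel-1", "10.0.0.0/16", 100),
    Advertisement("tunnel-2", "10.0.0.0/16", 200),
]

# VPN to a TGW: MED values are the same on both tunnels (placeholder value).
to_tgw = [
    Advertisement("tunnel-1", "10.0.0.0/16", 100),
    Advertisement("tunnel-2", "10.0.0.0/16", 100),
]

if __name__ == "__main__":
    print(preferred_tunnels(direct_to_vpc))
    print(preferred_tunnels(to_tgw))
```

With these inputs the first call returns only tunnel-1, while the second returns both tunnels.  That tie is what leaves the on prem device without a clear primary.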

After digging around AWS documentation and the internet, I discovered that BGP peering to a TGW does in fact honor AS Path Prepend.  As a result, I was able to prepend the AS path on the routes our on prem device advertises to AWS over the secondary tunnel, and set a higher Local Preference on the routes our on prem device receives from AWS over the primary tunnel.  The first steers traffic from AWS onto the primary tunnel; the second does the same for traffic headed to AWS.
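The effect of those two attributes can be sketched with a heavily simplified subset of BGP best-path selection: highest Local Preference first, then shortest AS path, then lowest MED.  Real implementations evaluate more steps than this.  The ASNs, prefixes, tunnel labels, and helper names below are all illustrative assumptions, not values from my environment.

```python
from dataclasses import dataclass, replace

ON_PREM_ASN = 65000  # hypothetical private ASN for the on-prem device
AWS_ASN = 64512      # illustrative Amazon-side ASN


@dataclass(frozen=True)
class Route:
    tunnel: str
    prefix: str
    as_path: tuple[int, ...]
    local_pref: int = 100
    med: int = 0


def prepend(route: Route, asn: int, times: int) -> Route:
    """Return a copy of the route with `asn` prepended `times` times."""
    return replace(route, as_path=(asn,) * times + route.as_path)


def best_path(routes: list[Route]) -> Route:
    """Simplified selection: highest local pref, shortest AS path, lowest MED."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.med))


def as_seen_by_aws() -> list[Route]:
    """On-prem prefix advertised over both tunnels, prepended on the secondary."""
    base = Route("primary", "192.168.0.0/16", (ON_PREM_ASN,))
    secondary = prepend(replace(base, tunnel="secondary"), ON_PREM_ASN, 2)
    return [base, secondary]


def as_seen_on_prem() -> list[Route]:
    """VPC prefix received over both tunnels, local pref raised on the primary."""
    primary = Route("primary", "10.0.0.0/16", (AWS_ASN,), local_pref=200)
    secondary = Route("secondary", "10.0.0.0/16", (AWS_ASN,))
    return [primary, secondary]


if __name__ == "__main__":
    print(best_path(as_seen_by_aws()).tunnel)
    print(best_path(as_seen_on_prem()).tunnel)
```

In both directions the primary tunnel wins, which is the active/passive behavior I was after.  Note that Local Preference is applied locally on the on prem device to inbound routes, whereas the prepended AS path is what AWS actually sees.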

I believe these same BGP attributes can be used by AWS Direct Connect customers as well.

For some reason I struggled to find proper AWS documentation, and setting this up took longer than it needed to.  I hope this will save someone some time if they run into this same situation.

AWS Route Propagation – Site to Site VPN

When first starting to work with AWS networking I obviously ran into the term Route Propagation. Similar to nearly all layer 3 IP devices, routes in AWS route tables are populated with either manual static routes or the routes are dynamically populated from an outside neighbor or source. Typically when talking about dynamically populated routing tables the topic is about widely used routing protocols such as OSPF or BGP, and with AWS the only traditional routing protocol we can use is BGP.

In the AWS networking world Route Propagation comes into play when connecting an on premises network to an AWS Virtual Private Cloud (VPC). When using an IPSEC tunnel for connectivity, we have the routing options of Dynamic or Static.

Option when creating Site to Site IPSEC Tunnel in AWS Console

If Dynamic is selected then the on premises device (router, firewall, load balancer, etc.) needs to support BGP. After the tunnel is established, the operator sets up BGP peering over the connection, and provided Route Propagation is enabled on the AWS side, routes are advertised between the on premises and AWS VPC routing tables. Routing protocol advertisements feel very natural to someone working in the networking space, which ultimately led me to believe the term Propagation is AWS’ way of saying Advertisement. This is not quite true. Route Propagation can actually be used with the ‘Static’ option as well.

When the Static Routing option is selected for IPSEC site to site connectivity, the operator will get the option to add some Static IP Prefixes into the configuration. After the connection is built and the Virtual Private Gateway is attached to the proper VPC, we’ll find that some routes need to be added into the VPC routing table in order to route traffic over the new connection.

AWS Console – Static Routing IPSEC VPN Configuration
AWS Console – Empty Route Table – Zero Static Routes

If we select the tab for Route Propagation under the route table we can see that there is an option to enable this feature with the Virtual Private Gateway. Once this feature is enabled, then the static routes added into the VPN configuration are automatically placed into the routing table.

AWS Console – VGW Route Propagation Configuration
AWS Console – VPC Route Table with Static Route Propagation

So ultimately AWS Route Propagation is not exactly like a traditional routing protocol advertisement. Route Propagation is used with AWS Virtual Private Gateways to populate routing tables in conjunction with the Site-To-Site VPN configuration. For instance, with AWS’ static routing option, any routing table associated with a VPC that has an attached Virtual Private Gateway can have Route Propagation enabled. Once enabled, that routing table will automatically receive the routes from the tunnel prefix configuration.
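As a toy model of that behavior (not how AWS implements it), the sketch below treats a route table as a set of manually added routes plus, when propagation is enabled, whatever prefixes were entered in the VPN configuration. The class name, the `vgw-example` identifier, and the prefixes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RouteTable:
    # destination CIDR -> target, for routes added by hand
    static_routes: dict[str, str] = field(default_factory=dict)
    propagation_enabled: bool = False

    def routes(self, vgw_id: str, vpn_prefixes: list[str]) -> list[tuple[str, str, str]]:
        """Return (destination, target, origin) rows as the table would show them."""
        rows = [(dest, target, "static") for dest, target in self.static_routes.items()]
        if self.propagation_enabled:
            # Prefixes from the VPN configuration appear with the VGW as target.
            rows += [(prefix, vgw_id, "propagated") for prefix in vpn_prefixes]
        return rows


if __name__ == "__main__":
    vpn_prefixes = ["192.168.0.0/16", "172.16.0.0/12"]  # hypothetical on-prem prefixes
    table = RouteTable()
    print(table.routes("vgw-example", vpn_prefixes))  # propagation off
    table.propagation_enabled = True
    print(table.routes("vgw-example", vpn_prefixes))  # propagation on
```

With propagation off the table stays empty even though the VPN configuration holds prefixes; turning it on makes those prefixes show up without anyone adding a route by hand.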

I ran into someone’s VPC route table with both Propagated and Static routes going to the same destination, which led me to figure out what AWS meant by this term. The person who set up the VPN tunnel added static routes manually, and then later on, for whatever reason, Route Propagation was turned on. Prefixes had already been added in the tunnel configuration, which resulted in both the Static and Propagated routes showing in the VPC route table.
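A quick way to spot this kind of overlap is to compare route origins. The sketch below assumes a dictionary shaped like a single route table entry from the EC2 DescribeRouteTables response, where manually added routes carry the origin `CreateRoute` and propagated ones carry `EnableVgwRoutePropagation`. The function name and the sample data are hypothetical, and the sample is hard-coded rather than fetched from AWS.

```python
def overlapping_destinations(route_table: dict) -> list[str]:
    """Return destinations that exist as both a static and a propagated route."""
    static, propagated = set(), set()
    for route in route_table.get("Routes", []):
        dest = route.get("DestinationCidrBlock")
        if dest is None:
            continue  # skip routes without an IPv4 CIDR destination
        origin = route.get("Origin")
        if origin == "CreateRoute":
            static.add(dest)
        elif origin == "EnableVgwRoutePropagation":
            propagated.add(dest)
    return sorted(static & propagated)


# Hypothetical sample, hard-coded for illustration.
SAMPLE_ROUTE_TABLE = {
    "RouteTableId": "rtb-example",
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local",
         "Origin": "CreateRouteTable"},
        {"DestinationCidrBlock": "192.168.0.0/16", "GatewayId": "vgw-example",
         "Origin": "CreateRoute"},
        {"DestinationCidrBlock": "192.168.0.0/16", "GatewayId": "vgw-example",
         "Origin": "EnableVgwRoutePropagation"},
        {"DestinationCidrBlock": "172.16.0.0/12", "GatewayId": "vgw-example",
         "Origin": "EnableVgwRoutePropagation"},
    ],
}

if __name__ == "__main__":
    print(overlapping_destinations(SAMPLE_ROUTE_TABLE))
```

In the sample, only 192.168.0.0/16 is reported, since it is the one destination present with both origins.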

This post did not talk about Direct Connect, but Direct Connect does use the same Route Propagation terminology.