At Amazon Web Services, one of the most common practices is to divide one account into multiple sub-accounts, each with its own credentials, instances and services. This practice adds complexity when the network is large, but it makes many things easier: each account can have people with different roles and authorizations, without complex IAM rules; each cost center is entirely separated (without relying on tags, as a single account would require); big projects can have their own totally independent infrastructure; among others.
When we have multiple accounts in the same company, we usually need to link these accounts in a secure way. A great way of doing this is to use VPC Peering to connect VPCs in the same region. However, when the VPCs are in different regions, how can they communicate with each other? In this case, instead of using the native VPC Peering offered by AWS, we can create EC2 instances with IPsec configured to establish encrypted VPNs between any networks. We call this type of VPN a site-to-site VPN.
Here at Movile we use multiple AWS accounts in many different regions. To keep the monitoring and automation services up and secure, we had to implement these mechanisms. And here, in this tutorial, we show you how we did this.
- 1 Preparing
- 2 Creating the instances
- 3 Configuring the VPC routes
- 4 Configuring kernel parameters
- 5 Configuring openswan/libreswan
- 6 Testing
- 7 Troubleshooting
- 8 References
Preparing
Before setting up the VPN between VPCs in different regions, it’s good to keep the following observations in mind:
- As with VPC Peering, it’s preferable that the two VPCs use CIDRs that don’t conflict. When creating VPCs, reserve subnets in an organized way so they never repeat. If, for any reason, the subnet CIDRs collide, it’s still possible to configure the scenario, but it gets a little more complex.
- Each VPC must have an EC2 instance running, which adds the cost of these servers.
- The ipsec instances are a single point of failure for communication between the VPCs. It’s important to also configure some kind of High Availability scheme to keep at least one ipsec instance always running.
- All traffic between the VPCs passes through these instances, so it’s really important to choose a proper instance type, with good network connectivity (for high throughput) and processing power (a little less important, but necessary to encrypt the packets between networks). For a network with little traffic, an m3.medium may be sufficient. For bigger networks, with dozens of instances and considerable traffic, we recommend an m3.large or larger.
- The software used in this tutorial is CentOS and openswan (or the newer libreswan). Although the installation method varies, everything works normally on other distributions like Debian or Ubuntu.
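To make the first observation concrete, here is a minimal sketch of a CIDR conflict check in plain POSIX shell. The function names ip_to_int and cidrs_overlap are our own (hypothetical) helpers, not part of any AWS or openswan tooling, and the sketch assumes well-formed IPv4 CIDRs without leading zeros in the octets.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    # Split the address on dots into the positional parameters.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 (true) when two CIDR blocks overlap, 1 otherwise.
cidrs_overlap() {
    a_ip=${1%/*}; a_len=${1#*/}
    b_ip=${2%/*}; b_len=${2#*/}
    # Two blocks overlap when they are equal under the shorter prefix.
    if [ "$a_len" -lt "$b_len" ]; then len=$a_len; else len=$b_len; fi
    if [ "$len" -eq 0 ]; then return 0; fi
    mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    a_net=$(( $(ip_to_int "$a_ip") & mask ))
    b_net=$(( $(ip_to_int "$b_ip") & mask ))
    [ "$a_net" -eq "$b_net" ]
}

if cidrs_overlap 10.70.0.0/24 10.80.0.0/24; then
    echo "CIDRs overlap"
else
    echo "CIDRs do not overlap"
fi
```

Running a check like this against every pair of VPC CIDRs before creating a new VPC is a cheap way to keep the address plan free of collisions.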
These are the data for our example structure. Feel free to replace any values with your configuration.
VPC 1 (us-east-1)
- CIDR: 10.70.0.0/24
- Public subnet: 10.70.0.0/25
- Private subnet: 10.70.0.128/25
- Instance hostname: openswan-us
- Instance type: m3.large
- Instance elastic IP: EIP_US (placeholder; use the Elastic IP you allocate in us-east-1)
VPC 2 (sa-east-1)
- CIDR: 10.80.0.0/24
- Public subnet: 10.80.0.0/25
- Private subnet: 10.80.0.128/25
- Instance hostname: openswan-br
- Instance type: m3.large
- Instance elastic IP: EIP_BR (placeholder; use the Elastic IP you allocate in sa-east-1)
- The instances are located in the public subnets. This is mandatory because they need a fixed Elastic IP on the Internet in order to connect with each other.
- Each public subnet has an IGW (Internet Gateway) configured and associated with the default route (0.0.0.0/0). Each private subnet has only the route to the local network, using the CIDR initially specified on the VPC.
Creating the instances
First of all, allocate an Elastic IP in each VPC and reserve it for the VPN instance. Write down the IP for each VPC.
You can do this in the AWS panel or with the CLI:
aws ec2 allocate-address --domain vpc
In our case, the Elastic IPs are the ones listed in our example structure above.
Now, create a security group called ipsec in each VPC, allowing what ipsec needs to function correctly: UDP 500, UDP 4500, Custom Protocol 50, Custom Protocol 51. Remember that each VPC’s security group must allow access from the instance at the other VPC, so the source of these rules is the other instance’s Elastic IP.
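As a rough sketch, the rules above could be created with the AWS CLI. The function below only prints the commands so you can review them before running anything. The function name, the security group ID and the peer address are hypothetical placeholders, and we are assuming the --ip-permissions shorthand accepts a protocol number without a port range.

```shell
#!/bin/sh
# Print (not run) the AWS CLI calls that open the IPsec ports and protocols
# in a security group for a single peer address.
# $1 = security group ID (placeholder), $2 = Elastic IP of the peer instance.
print_ipsec_sg_rules() {
    sg_id=$1
    peer_cidr="$2/32"
    # IKE (UDP 500) and NAT traversal (UDP 4500)
    for port in 500 4500; do
        echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol udp --port $port --cidr $peer_cidr"
    done
    # ESP (IP protocol 50) and AH (IP protocol 51)
    for proto in 50 51; do
        echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --ip-permissions 'IpProtocol=$proto,IpRanges=[{CidrIp=$peer_cidr}]'"
    done
}

# Example with a made-up group ID and a documentation-range address.
print_ipsec_sg_rules sg-0123456789abcdef0 203.0.113.10
```

Run it once per VPC, each time passing the Elastic IP of the instance on the other side.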
Now, create two instances, one on each VPC, using those security groups.
Next, you’ll have to disable the Source/Destination Check on the instance’s network interface. By default, when you create an instance, it receives a virtual network interface (ENI) with this check enabled, which makes the interface drop traffic whose source or destination is not the instance itself (e.g., to mitigate IP spoofing). Disabling it lets the instance receive and forward traffic that is not addressed to it. This is exactly what we want for our instances, since they will route traffic among different VPC networks through ipsec.
You can do this through the AWS panel or with the CLI:
# The ID eni-7e712e1b corresponds to the instance's interface. Replace it with your instance's ENI ID.
aws ec2 modify-network-interface-attribute --network-interface-id eni-7e712e1b --no-source-dest-check
Configuring the VPC routes
This part is extremely important, because without it the instances inside each VPC won’t be able to communicate with the other VPC. To configure the needed routes at the AWS panel, go to VPC (Virtual Private Cloud), then Route Tables, and select the route tables of your VPC. Here, we have two route tables: one for the public subnet and another for the private subnet.
In each route table, add one route to the CIDR of the other VPC, with the VPN instance we created as the target.
These routes tell that, if the instances try to communicate with the network at the other VPC, the packets will be routed to the instance that we created, so they can be routed through the VPN/ipsec software. This type of route is also useful to create NAT instances on AWS.
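The same routes can be sketched with the AWS CLI. As before, the function only prints the commands. Its name, the instance ID and the route table IDs are hypothetical placeholders that you would replace with your own.

```shell
#!/bin/sh
# Print (not run) one create-route call per route table, pointing the
# other VPC's CIDR at the local VPN instance.
# $1 = CIDR of the other VPC, $2 = VPN instance ID, remaining args = route table IDs.
print_vpn_routes() {
    peer_cidr=$1
    instance_id=$2
    shift 2
    for rtb in "$@"; do
        echo "aws ec2 create-route --route-table-id $rtb --destination-cidr-block $peer_cidr --instance-id $instance_id"
    done
}

# us-east-1 side: send 10.80.0.0/24 to openswan-us from both route tables.
print_vpn_routes 10.80.0.0/24 i-0123456789abcdef0 rtb-0aaa1111 rtb-0bbb2222
```

On the sa-east-1 side the call is mirrored: the destination becomes 10.70.0.0/24 and the instance is openswan-br.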
Configuring kernel parameters
As the instances will function like routers between networks, it’s necessary to activate/deactivate some kernel parameters.
Inside the instances, create a file called /etc/sysctl.d/ipsec.conf with the following content:
net.ipv4.ip_forward = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.all.rp_filter = 0
Load these configurations immediately, with the command:
sysctl -p /etc/sysctl.d/ipsec.conf
If the instance reboots, the distribution automatically loads the files with the .conf extension inside the /etc/sysctl.d/ directory, so this part is complete!
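If you want to double-check the file before moving on, a small helper like the one below can confirm that each parameter has the expected value. This is only a sketch: check_sysctl_file is a hypothetical name, and it inspects the file contents rather than the live kernel values.

```shell
#!/bin/sh
# Check that a sysctl-style file sets each "key value" pair given after the
# file name. Prints the first mismatch and returns 1, or returns 0 if all match.
check_sysctl_file() {
    file=$1
    shift
    while [ "$#" -ge 2 ]; do
        key=$1; want=$2
        shift 2
        # Escape the dots so the key is matched literally by sed.
        key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
        # Take the last assignment of the key, ignoring spaces around "=".
        got=$(sed -n "s/^[[:space:]]*$key_re[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p" "$file" | tail -n 1)
        if [ "$got" != "$want" ]; then
            echo "mismatch: $key is '${got:-unset}', expected '$want'"
            return 1
        fi
    done
    return 0
}

# Usage on a VPN instance:
#   check_sysctl_file /etc/sysctl.d/ipsec.conf \
#       net.ipv4.ip_forward 1 net.ipv4.conf.all.rp_filter 0 && echo "sysctl file OK"
```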
Configuring openswan/libreswan
Now, it’s time to configure ipsec on the instances. You can install openswan on CentOS with:
yum install openswan
Or using Debian/Ubuntu, with:
apt-get install openswan
In both instances, create the file /etc/ipsec.conf with the following content:
config setup
    protostack=netkey
    dumpdir=/var/run/pluto/
    nat_traversal=yes

include /etc/ipsec.d/*.conf
The last line is important and serves to organize configurations in separate files inside the /etc/ipsec.d/ directory. It’s also interesting to notice that the nat_traversal option is enabled, because AWS instances don’t have public IPs configured inside them. Instead, they use an IP allocated by the VPC routers (in practice it is as if the VPN instance went to the Internet through NAT).
Instance 1: openswan-us
Now, on the openswan-us (us-east-1), create the file /etc/ipsec.d/us-to-br.conf with the following content:
# Replace EIP_US and EIP_BR with the Elastic IPs of openswan-us and openswan-br.
conn us-to-br
    type=tunnel
    authby=secret
    left=%defaultroute
    leftid=EIP_US
    leftnexthop=%defaultroute
    leftsubnet=10.70.0.0/24
    right=EIP_BR
    rightsubnet=10.80.0.0/24
    pfs=yes
    auto=start
The left* options indicate the local instance network (us-east-1). The right* options correspond to the instance’s network at the other VPC (sa-east-1). Notice that the elastic IPs are specified on the leftid and right options.
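To establish the ipsec connection between the instances, both have to know a shared password (Pre-Shared Key, or PSK). The two ends use this key to authenticate each other when negotiating the connection; the keys that actually encrypt and decrypt the packets are derived during that negotiation.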
Again, on the openswan-us server, create the file /etc/ipsec.d/us-to-br.secrets with the following content:
# EIP_US and EIP_BR stand for the Elastic IPs of openswan-us and openswan-br.
EIP_US EIP_BR: PSK "iamasecretpsk"
Note that the first IP is the elastic IP of the origin (left), and the second is from the other end (right). This means that the PSK “iamasecretpsk” will be used when the connection is made between the two servers. Later in this tutorial, you’ll see that the other end will have a very similar file with the same key.
Remember! Create a long, strong key, with uppercase and lowercase letters, numbers and symbols.
Oh, and as the file contains a password, make sure that its permissions are secure:
chmod 600 /etc/ipsec.d/us-to-br.secrets
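One possible way to produce such a key is to read random bytes from the system and encode them as base64. This is a sketch, not the method we used for the example key: the function names are ours, the two addresses are documentation-range placeholders, and it assumes head -c and base64 are available (as they are on CentOS, Debian and Ubuntu).

```shell
#!/bin/sh
# Generate a random pre-shared key: 36 random bytes encode to exactly
# 48 base64 characters (letters, digits, "+" and "/"), with no padding.
generate_psk() {
    head -c 36 /dev/urandom | base64 | tr -d '\n'
}

# Print a secrets line in the "left right: PSK" form used above.
# $1 and $2 are the two Elastic IPs.
secrets_line() {
    printf '%s %s: PSK "%s"\n' "$1" "$2" "$(generate_psk)"
}

secrets_line 203.0.113.10 198.51.100.20
```

Generate the key once and copy the same value to both ends; each call to generate_psk returns a different key.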
Instance 2: openswan-br
On the openswan-br (sa-east-1), create the file /etc/ipsec.d/br-to-us.conf with the following content:
# Replace EIP_BR and EIP_US with the Elastic IPs of openswan-br and openswan-us.
conn br-to-us
    type=tunnel
    authby=secret
    left=%defaultroute
    leftid=EIP_BR
    leftnexthop=%defaultroute
    leftsubnet=10.80.0.0/24
    right=EIP_US
    rightsubnet=10.70.0.0/24
    pfs=yes
    auto=start
Now you can see that we switched the values. The left* options indicate the sa-east-1 VPC, while the right* options correspond to the us-east-1 VPC (that is now the other end).
Configure the key file /etc/ipsec.d/br-to-us.secrets with the following content:
# EIP_BR and EIP_US stand for the Elastic IPs of openswan-br and openswan-us.
EIP_BR EIP_US: PSK "iamasecretpsk"
And the file permission:
chmod 600 /etc/ipsec.d/br-to-us.secrets
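Since the two connection files are mirror images of each other, they can be generated from a single template to avoid copy-and-paste mistakes. The sketch below assumes a hypothetical render_conn helper and uses documentation-range addresses in place of the real Elastic IPs.

```shell
#!/bin/sh
# Render a conn section; the same function serves both ends by swapping
# the local and remote arguments.
# $1 = conn name, $2 = local Elastic IP, $3 = local subnet,
# $4 = remote Elastic IP, $5 = remote subnet
render_conn() {
    cat <<EOF
conn $1
    type=tunnel
    authby=secret
    left=%defaultroute
    leftid=$2
    leftnexthop=%defaultroute
    leftsubnet=$3
    right=$4
    rightsubnet=$5
    pfs=yes
    auto=start
EOF
}

# Placeholder Elastic IPs; replace with the ones you allocated.
US_EIP=203.0.113.10
BR_EIP=198.51.100.20
render_conn us-to-br "$US_EIP" 10.70.0.0/24 "$BR_EIP" 10.80.0.0/24
render_conn br-to-us "$BR_EIP" 10.80.0.0/24 "$US_EIP" 10.70.0.0/24
```

Redirect each call to the matching file in /etc/ipsec.d/ on the corresponding instance.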
Establishing the VPN
With openswan configured, start the service:
service ipsec start
After that, use the verify command to see if everything is right. If something shows in red, you need to review your configuration. Example:
[root@openswan-us ipsec.d]# ipsec verify
Verifying installed system and configuration files

Version check and ipsec on-path                         [OK]
Libreswan 3.8 (netkey) on 3.10.0-123.20.1.el7.x86_64
Checking for IPsec support in kernel                    [OK]
 NETKEY: Testing XFRM related proc values
         ICMP default/send_redirects                    [OK]
         ICMP default/accept_redirects                  [OK]
         XFRM larval drop                               [OK]
Pluto ipsec.conf syntax                                 [OK]
Hardware random device                                  [N/A]
Checking rp_filter                                      [OK]
Checking that pluto is running                          [OK]
 Pluto listening for IKE on udp 500                     [OK]
 Pluto listening for IKE/NAT-T on udp 4500              [OK]
 Pluto ipsec.secret syntax                              [OK]
Checking NAT and MASQUERADEing                          [TEST INCOMPLETE]
Checking 'ip' command                                   [OK]
Checking 'iptables' command                             [OK]
Checking 'prelink' command does not interfere with FIPSChecking for obsolete ipsec.conf options [OK]
Opportunistic Encryption                                [DISABLED]
After you do this on both instances, check the connection status:
ipsec auto --status
The last lines are what matter to us:
000 "us-to-br": 10.70.0.0/24===10.70.0.106[EIP_US]---10.70.0.1...EIP_BR<EIP_BR>===10.80.0.0/24; erouted; eroute owner: #4
000 "us-to-br": oriented; my_ip=unset; their_ip=unset;
000 "us-to-br": xauth info: us:none, them:none, my_xauthuser=[any]; their_xauthuser=[any]; ;
000 "us-to-br": modecfg info: us:none, them:none, modecfg policy:push, dns1:unset, dns2:unset, domain:unset, banner:unset;
000 "us-to-br": labeled_ipsec:no, loopback:no;
000 "us-to-br": policy_label:unset;
000 "us-to-br": ike_life: 3600s; ipsec_life: 28800s; rekey_margin: 540s; rekey_fuzz: 100%; keyingtries: 0;
000 "us-to-br": sha2_truncbug:no; initial_contact:no; cisco_unity:no; send_vendorid:no;
000 "us-to-br": policy: PSK+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW+SAREFTRACK+IKE_FRAG;
000 "us-to-br": conn_prio: 24,24; interface: eth0; metric: 0; mtu: unset; sa_prio:auto;
000 "us-to-br": newest ISAKMP SA: #8; newest IPsec SA: #4;
000 "us-to-br": IKE algorithm newest: AES_CBC_128-SHA1-MODP2048
000 "us-to-br": ESP algorithm newest: AES_128-HMAC_SHA1; pfsgroup=<Phase1>
000
000 Total IPsec connections: loaded 1, active 1
000
000 State list:
000
000 #8: "us-to-br":4500 STATE_MAIN_R3 (sent MR3, ISAKMP SA established); EVENT_SA_REPLACE in 1739s; newest ISAKMP; lastdpd=-1s(seq in:0 out:0); idle; import:not set
000 #4: "us-to-br":4500 STATE_QUICK_R2 (IPsec SA established); EVENT_SA_REPLACE in 16497s; newest IPSEC; eroute owner; isakmp#3; idle; import:not set
000 #4: "us-to-br" firstname.lastname@example.org email@example.com firstname.lastname@example.org email@example.com ref=0 refhim=4294901761 Traffic: ESPin=840B ESPout=840B! ESPmax=4194303B
Besides the configuration description and the path that the packets make between IPs, we also have the last lines, that must have these strings: ISAKMP SA established and IPsec SA established. If those messages appear, it means that the VPN was successfully established.
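This check is easy to script, for example for a monitoring probe. The sketch below uses a hypothetical tunnel_is_up function that reads the status output from standard input and looks for the two strings mentioned above.

```shell
#!/bin/sh
# Read "ipsec auto --status" output on stdin and succeed only when both
# the Phase 1 (ISAKMP) and Phase 2 (IPsec) SAs are reported as established.
tunnel_is_up() {
    status=$(cat)
    printf '%s\n' "$status" | grep -q 'ISAKMP SA established' &&
        printf '%s\n' "$status" | grep -q 'IPsec SA established'
}

# Usage on a VPN instance:
#   ipsec auto --status | tunnel_is_up && echo "VPN up" || echo "VPN down"
```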
Another useful check is to verify that there is an ipsec policy directing the other VPC’s traffic through the tunnel, using the Elastic IP. For example, running this on openswan-br:
[root@openswan-br ~]# ip xfrm policy
src 10.80.0.0/24 dst 10.70.0.0/24
    dir out priority 2344 ptype main
    tmpl src 10.80.0.32 dst EIP_US
        proto esp reqid 16385 mode tunnel
src 10.70.0.0/24 dst 10.80.0.0/24
    dir fwd priority 2344 ptype main
    tmpl src EIP_US dst 10.80.0.32
        proto esp reqid 16385 mode tunnel
Looking at the output, we can see that the traffic between networks 10.80.0.0/24 and 10.70.0.0/24 uses the Elastic IP of the other end (shown here as EIP_US, the Elastic IP of openswan-us) and passes through the tunnel.
Oh, and don’t forget to enable the service to start at the boot:
chkconfig ipsec on
Testing
The first thing that we can use to test and see if everything works is ping!
From us-east-1 instance to sa-east-1:
[root@openswan-us ~]# ping -c3 10.80.0.32
PING 10.80.0.32 (10.80.0.32) 56(84) bytes of data.
64 bytes from 10.80.0.32: icmp_seq=1 ttl=64 time=123 ms
64 bytes from 10.80.0.32: icmp_seq=2 ttl=64 time=123 ms
64 bytes from 10.80.0.32: icmp_seq=3 ttl=64 time=123 ms

--- 10.80.0.32 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 123.103/123.254/123.417/0.425 ms
From sa-east-1 instance to us-east-1:
[root@openswan-br ~]# ping -c3 10.70.0.106
PING 10.70.0.106 (10.70.0.106) 56(84) bytes of data.
64 bytes from 10.70.0.106: icmp_seq=1 ttl=64 time=123 ms
64 bytes from 10.70.0.106: icmp_seq=2 ttl=64 time=123 ms
64 bytes from 10.70.0.106: icmp_seq=3 ttl=64 time=122 ms

--- 10.70.0.106 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 122.954/123.128/123.427/0.212 ms
In these examples, 10.80.0.32 is the internal IP for openswan-br and 10.70.0.106 is the internal IP for openswan-us. If ping is working, great!
But besides pinging from one instance to another, we must ping other instances inside the subnets, to see if the VPC route tables are correct.
Suppose we have two more instances on the private subnets (or public, it doesn’t matter, since we configured both route tables). They will be identified as:
- client-us: IP 10.70.0.250
- client-br: IP 10.80.0.14
Testing with ping from one to another:
From us-east-1 instance to sa-east-1:
[root@client-us ~]# ping -c3 10.80.0.14
PING 10.80.0.14 (10.80.0.14) 56(84) bytes of data.
64 bytes from 10.80.0.14: icmp_seq=1 ttl=62 time=126 ms
64 bytes from 10.80.0.14: icmp_seq=2 ttl=62 time=125 ms
64 bytes from 10.80.0.14: icmp_seq=3 ttl=62 time=125 ms

--- 10.80.0.14 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 125.461/125.954/126.564/0.457 ms
From sa-east-1 instance to us-east-1:
[root@client-br ~]# ping -c3 10.70.0.250
PING 10.70.0.250 (10.70.0.250) 56(84) bytes of data.
64 bytes from 10.70.0.250: icmp_seq=1 ttl=62 time=125 ms
64 bytes from 10.70.0.250: icmp_seq=2 ttl=62 time=125 ms
64 bytes from 10.70.0.250: icmp_seq=3 ttl=62 time=125 ms

--- 10.70.0.250 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 125.272/125.513/125.827/0.470 ms
Another interesting way to see the route working is to use traceroute. An example from client-us to client-br:
[root@ip-10-70-0-250 ~]# traceroute -n 10.80.0.14
traceroute to 10.80.0.14 (10.80.0.14), 30 hops max, 60 byte packets
 1  10.70.0.106  2.791 ms  2.740 ms  2.697 ms
 2  10.80.0.32  124.710 ms  124.682 ms  126.029 ms
 3  10.80.0.14  127.201 ms  127.158 ms  127.109 ms
We can see that the packet goes first to the openswan-us instance (10.70.0.106), then goes directly to the openswan-br instance (10.80.0.32) on the other VPC, then to its final destination: client-br. If the VPN was not active or the VPC route table was not configured, the packet would try to go through the Internet and it would not work, because the 10.80.0.0/24 network does not publicly exist on the Internet.
Troubleshooting
My two openswan instances ping between each other, but the internal instances don’t
- Check if the kernel parameter /proc/sys/net/ipv4/ip_forward is 1. It needs to be 1.
- Check if the VPC route table for your instances is configured to use the openswan instance’s network interface (ENI) when trying to reach the other VPC network.
- Check if the openswan instances are in a security group that allows traffic between all instances on internal subnets.
Sometimes the transfer between VPCs gets too slow
Make sure that the instance type used for the openswan instances supports the network throughput that you need. Move to a larger instance type and see if the problem persists. Use programs like iptraf to watch the bandwidth in real time and iperf to run tests between the networks.
A common symptom in this scenario is packet loss.
Openswan can’t establish a connection to the other end
If you check the ipsec status and see something like this:
000 #1: "us-to-br":500 STATE_MAIN_I1 (sent MI1, expecting MR1); EVENT_RETRANSMIT in 23s; nodpd; idle; import:admin initiate
000 #1: pending Phase 2 for "us-to-br" replacing #0
it’s because you didn’t pass the Phase 1 of the ipsec connection. This means that openswan couldn’t even get to the other instance to negotiate the key, cryptography, networks, and so on.
This is generally easy to solve:
- Make sure that both openswan instances are in a security group that allows: UDP 500, UDP 4500, Custom Protocol 50, Custom Protocol 51.
- Review your openswan configuration and see if the left* and right* values are correct.
There’s no traffic between VPCs
Supposing openswan could establish a connection and negotiate Phase 1 and Phase 2, start monitoring the packets to see if they’re arriving and being encrypted/decrypted. On both ends, use the command:
ip xfrm monitor
You should see packets come and go between the local IP from the instance and the elastic IP on the other end.
Also check if there’s no local firewall on the instance (iptables) blocking packets. Pay attention to the FORWARD chain at the filter table, and make sure there are no SNAT/DNAT rules rewriting the packets.
Finally, try what was described at the first item of this troubleshooting section.
My traffic only works with NAT or only works on one VPC
Some tutorials on the Internet teach how to configure openswan on the same instance that does NAT to the Internet (a common scenario). It’s possible that, in this configuration, packets to the other VPC go out with the Elastic IP in the source field. There are two ways to identify when this problem occurs:
- When analyzing traffic from the openswan instance, the packets go with the Elastic IP and not the internal one.
- When analyzing traffic between internal instances from both VPCs, the IP that arrives at the destination is always from the openswan instance, not from the internal instance (origin).
In the first case, when we execute tcpdump on the openswan instance and do a test with ping, we can see something like this:
[root@openswan-us ~]# tcpdump -i any -nn icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
16:01:49.597273 IP EIP_US > 10.80.0.14: ICMP echo request, id 13053, seq 2, length 64
16:01:50.597281 IP EIP_US > 10.80.0.14: ICMP echo request, id 13053, seq 3, length 64
16:01:51.597269 IP EIP_US > 10.80.0.14: ICMP echo request, id 13053, seq 4, length 64
Notice that the origin IP is the elastic IP, but it should be the openswan-us internal IP (10.70.0.106). And also: the ping won’t work because when it reaches the other end, it’ll try to come out from the Internet and not from the tunnel (since the elastic IP is public, and not from the 10.70.0.0/24 subnet).
This can happen if you set the leftsourceip option in the connection configuration in openswan.
So, don’t use this option unless you really know what you’re trying to do.
Another symptom of this problem is that when the VPN connection gets established, openswan will create an additional route:
[root@openswan-us ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.70.0.1       0.0.0.0         UG    0      0        0 eth0
10.70.0.0       0.0.0.0         255.255.255.128 U     0      0        0 eth0
10.80.0.0       10.70.0.1       255.255.255.0   UG    0      0        0 eth0
This route to the 10.80.0.0 network doesn’t need to exist. If you don’t use the leftsourceip option, the route won’t be created and the packets will normally pass through the tunnel, without being rewritten.
If you didn’t use this option, also check on iptables if there’s a NAT/Masquerade that rewrites the packets. For example, this common rule at NAT instances would not be a good idea:
iptables -t nat -A POSTROUTING -s 10.70.0.0/24 -j MASQUERADE
This would make iptables rewrite the packets to always use the openswan instance’s IP instead of the origin instance’s IP. A better rule would be:
iptables -t nat -A POSTROUTING -s 10.70.0.0/24 ! -d 10.80.0.0/24 -j MASQUERADE
This way the instance would do NAT to its entire network, unless the packets go to the other VPC. The packets that go through the tunnel would not be rewritten.
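To spot the problematic rule on an existing instance, the NAT table can be scanned for MASQUERADE rules that don’t exclude the other VPC. The sketch below is only an illustration: find_unsafe_masquerade is a hypothetical helper, and it assumes the rule format printed by iptables -t nat -S, where the exclusion appears as "! -d" followed by the CIDR.

```shell
#!/bin/sh
# Read "iptables -t nat -S" output on stdin and print the MASQUERADE rules
# that do not exclude the other VPC's CIDR ($1) with "! -d".
# Returns 0 when no such rule is found, 1 otherwise.
find_unsafe_masquerade() {
    peer_cidr=$1
    bad=$(grep -e '-j MASQUERADE' | grep -v -F -e "! -d $peer_cidr" || true)
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    return 0
}

# Usage on openswan-us:
#   iptables -t nat -S | find_unsafe_masquerade 10.80.0.0/24 || echo "rules above rewrite tunnel traffic"
```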
References
- Connecting Multiple VPCs with EC2 Instances (IPSec) – https://aws.amazon.com/articles/5472675506466066
- Working with Amazon AWS VPC: Software-based VPN Part 3 – http://www.heitorlessa.com/working-with-amazon-aws-vpc-software-based-vpn-part-3/