Using VPC Peering to Share Internal Services on AWS
Leo Sjöberg • January 28, 2018
If you use AWS to run your infrastructure, you likely run your production and staging environments in different VPCs, or even in different accounts altogether. However, if you want to run shared services, like an instance of Graylog, you realise you need a common place to put them.
Sure, you could expose all those services publicly, but then you have to deal with authentication, higher latency, and everything else the public internet exposes you to. Instead, it makes more sense in such a scenario to create a new VPC dedicated to shared services (either in a brand-new account or an existing one, whichever suits your architecture).
The Theoretical
In order to connect the VPCs, we need to set up VPC peering between our service VPC and our production and staging VPCs. It's important to note that VPC peering is not transitive, so connecting your staging and production VPCs to the service VPC will not automatically allow the staging and production VPCs to communicate with each other, as demonstrated by this illustration from the AWS guide.
Once peering is in place, we simply need to configure the correct routes from our VPCs to the service VPC, and make sure our service can accept connections.
Implementation
To implement this, I will walk through the setup with 3 separate AWS accounts. Before beginning, it is important to note that peered VPCs cannot have overlapping CIDR ranges. Throughout this post, I will refer to the environments as "production", "staging", and "shared" (or "service"), where production and staging would represent VPCs B and C in the illustration above, and the shared environment is VPC A.
Setting up our environments
If you already have your environments ready, just skip this section
First, I'll set up the production environment. I'll start by creating the VPC in my first account, choosing 10.0.0.0/16 as my CIDR block.
We also need to create a subnet in this VPC for our server to be in:
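If you prefer scripting the setup, the same VPC and subnet can be created with the AWS CLI. This is just a sketch: the subnet CIDR and the example `vpc-` ID are assumptions, and you'd substitute the ID returned by the first command.

```shell
# Create the production VPC with the 10.0.0.0/16 CIDR block
aws ec2 create-vpc --cidr-block 10.0.0.0/16 --region us-east-2

# Create a subnet inside it for our server
# (vpc-0123456789abcdef0 is a placeholder for the ID returned above)
aws ec2 create-subnet \
  --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 10.0.1.0/24 \
  --region us-east-2
```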
Make sure you have some instance in this VPC so that you can try out the peering connection. I opted for a simple Amazon Linux image on EC2, which I've made accessible to myself over SSH using an Elastic IP. Note that you also need to configure an Internet Gateway to make it reachable from the internet.
Next, we'll launch the staging environment. This is done in exactly the same manner, but with the CIDR block 172.16.0.0/16. We'll also create an instance in that VPC (accessible from the internet as well, so that we can SSH into our server and perform some requests).
Finally, we'll set up our shared VPC. I'll create this with a 192.168.0.0/16 CIDR block. Here, I'll be launching an instance of Graylog (as logging is a service you'd be likely to share between prod and staging to keep it in one place). If you want to launch Graylog on AWS, you can grab the official AMI for your region of choice from Github. Depending on your setup, you may not need (or even want) this accessible from the public interface. For now, I'll make it available through an EIP, and we will explore how to set it up without public access in a future post.
In doing my setup, I've also placed the shared account's VPC in another region; production and staging in us-east-2, and the service VPC in us-east-1, to demonstrate cross-region and cross-account peering.
With Graylog launched and configured, I double-checked that I could log in to the web interface:
Okay, time for the fun part.
Creating our peering connections
In order to create a VPC Peering connection, we need to send a peering request from one VPC to the other. Note that a peering connection goes both ways, so it doesn't matter if you send it from VPC A to VPC B, or the other way around; your routing rules and security groups determine what traffic is and isn't allowed.
To send a peering request, on the VPC Dashboard, go to "Peering Connections" and click "create peering connection". I've chosen to create them both from the service account. Fill out the required details, and send the request.
To accept the request, go to "Peering Connections" on your other account, select the connection request, and click "accept request".
Do this for all VPCs you wish to connect to the shared VPC. That's it, the connection is created! Now just one step remains: setting up the routes.
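The same request/accept flow can also be done from the AWS CLI. A sketch, assuming the shared VPC lives in us-east-1 and production in us-east-2; the VPC IDs, peering connection ID, and account number are placeholders you'd replace with your own.

```shell
# From the shared-services account: request peering with the production VPC
# (IDs and the account number below are placeholders)
aws ec2 create-vpc-peering-connection \
  --region us-east-1 \
  --vpc-id vpc-0aaaaaaaaaaaaaaaa \
  --peer-vpc-id vpc-0bbbbbbbbbbbbbbbb \
  --peer-owner-id 111111111111 \
  --peer-region us-east-2

# From the production account: accept the request
aws ec2 accept-vpc-peering-connection \
  --region us-east-2 \
  --vpc-peering-connection-id pcx-0123456789abcdef0
```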
Configuring Routes to a Peered VPC
To allow us to send our log messages to Graylog from both staging and production, we need to configure the correct routes on those route tables, and configure the security group for inbound connections from our instances.
In your staging and production VPCs, navigate to the route table associated with the subnets from which you want to send logs, and add a new route with the shared VPC's CIDR block as the destination. In the target, your peering connection should show up as a suggestion:
You also need to configure corresponding routes on the shared VPC:
Make sure you pick the right peering connection here.
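These routes can also be added with the AWS CLI. A sketch: the route table and peering connection IDs are placeholders, and the CIDR blocks match the ones chosen earlier in this post.

```shell
# On a production route table: send traffic for the shared VPC's CIDR
# through the peering connection (IDs are placeholders)
aws ec2 create-route \
  --route-table-id rtb-0123456789abcdef0 \
  --destination-cidr-block 192.168.0.0/16 \
  --vpc-peering-connection-id pcx-0123456789abcdef0

# On the shared VPC's route table: the corresponding return route
aws ec2 create-route \
  --route-table-id rtb-0fedcba9876543210 \
  --destination-cidr-block 10.0.0.0/16 \
  --vpc-peering-connection-id pcx-0123456789abcdef0
```

Repeat the second command for the staging CIDR block (172.16.0.0/16) through its own peering connection.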
Configuring Inbound Security Groups
Last but not least, we need to configure security groups to allow communication to our shared service.
I've opted to launch a Graylog GELF HTTP input on port 12201. We'll allow the production and staging CIDR blocks as custom TCP rules on our Graylog instance's Security Group:
Note that the AWS interface will not auto-complete the sources for you, so make sure you double check that you enter the correct CIDR block.
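The same rules can be added from the CLI. A sketch, assuming the security group ID below is replaced with that of your Graylog instance:

```shell
# Allow GELF HTTP (TCP 12201) from the production CIDR block
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 12201 --cidr 10.0.0.0/16

# ...and from the staging CIDR block
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 12201 --cidr 172.16.0.0/16
```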
Testing our connection
Phew, that's it. All that's left now is to make sure it works!
SSH into the production and staging instances, and execute the following:
curl -XPOST http://192.168.1.140:12201/gelf -d '{"short_message":"Test message", "host":"10.0.1.240", "facility":"production"}'
Replace the IP address you're hitting with that of your service instance, and the host with that of the instance you're sending the request from. Adjust accordingly when testing from staging (and change the facility to staging to see the difference in Graylog).
Lastly, let's check the Graylog dashboard for the messages:
Perfect! It works.
Wrapping up
We've now configured a fully working multi-account, multi-region shared service solution using VPC Peering. From here, there are a couple of things you might want to consider:
- Set up a DNS A record pointing to your service's private IP (I like {service}.in.{company}.tld, e.g. logs.in.decahedron.io). This means you don't have to remember a private IP, and you can easily modify your environment without worrying about your applications; when something changes, all you need to do is update the DNS target.
- Set up a software VPN like pritunl in your cloud (either in another VPC, or in your shared VPC), and have this as your only publicly exposed shared service. You would then generate VPN profiles for users, who would connect and then be able to use the internal IPs to reach Graylog, your instances, etc. As a result, you can have a setup where only your load balancer is exposed publicly, and reaching your instances requires going through the VPN.
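As a sketch of the DNS suggestion above, an A record can be created in Route 53 from the CLI. The hosted zone ID and the private IP are placeholders, and logs.in.decahedron.io stands in for whatever name you pick:

```shell
# Point logs.in.decahedron.io at the Graylog instance's private IP
# (hosted-zone-id and the IP below are placeholders)
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0000000000000 \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "logs.in.decahedron.io",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{"Value": "192.168.1.140"}]
      }
    }]
  }'
```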
Most importantly, before you build out your infrastructure, make sure you plan it with pen and paper first, so that you know what connects where and how!