Growing out of Heroku to Terraform, Docker and AWS
Heroku is great, but how about using Terraform, AWS and Docker, and having full control over the entire stack?
After a recent conversation, I've been thinking about how I'd go about moving an application out of Heroku into a fully GitOps-compliant stack (AWS, Docker, Kubernetes, etc.). Despite knowing what all these technologies are and how they work, I don't have hands-on experience with them.
That, to me, sounds like an interesting challenge worthy of a small weekend project (along with entertaining a 1.5-year-old daughter).
Heroku
The App
Fair warning, I don't intend to spend much time with Heroku right now. But let's see how to deploy a very simple Python app to Heroku.
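Here's a minimal sketch of such an app, built on Python's built-in http module (the handler and response text are assumptions of mine; the PORT lookup is the important bit):

```python
# main.py
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Heroku tells us which port to listen on via the PORT environment variable
PORT = int(os.environ.get("PORT", 8080))

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Hello from Heroku!")

if __name__ == "__main__":
    HTTPServer(("", PORT), Handler).serve_forever()
```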
Look closely at how PORT is defined. Heroku apps are expected to be self-contained, which, in our case, means we have to listen on whatever port Heroku assigns to the app. Heroku takes care of routing traffic from port 80 (HTTP) to whatever port the app is listening on.
Deployment requirements
Heroku provides many different deployment options; for our project, let's use the Heroku CLI. Details on how to install it are available here.
The deployment steps are simple, but there are two important details with Heroku: it needs to be able to tell what language/technology is in use, and it requires a Procfile. The Procfile tells Heroku what kind of app it will run and the command to start the app. Our Procfile has only one line: web: python main.py
To tell Heroku we're deploying a Python app, it's as simple as creating a requirements.txt file; in our case, the file is empty.
Deployment steps
With all requirements ready, deploying is fairly straightforward:
1. `heroku login` asks for your credentials.
2. `heroku create` creates an application in Heroku. No code at this point.
3. `git add .` adds all files.
4. `git commit -m "Your very own super nice commit message."` commits the code.
5. `git push heroku main` pushes the main branch to Heroku. Pushing to Heroku will also trigger the deployment of the app. When something goes wrong, running `heroku logs --tail` will show the latest logs from the app.
6. `heroku open` opens the newly deployed app in the browser.
And voilà
A different stack
Imagine that the simple app running on Heroku is actually far more complex. Also imagine that for whatever reason (growth, new dev team, you name it), Heroku is not suitable anymore and the app is moving into another stack.
Today the options are endless. There are so many tools and technologies to cover the areas of interest here: infrastructure, continuous integration and deployment (CI/CD), how to handle app dependencies, etc.
This is a weekend project and I won't have the time to juggle entertaining my 1.5-year-old and all the different components of this endeavor. So I'll focus on:
- Infrastructure as Code
- Using Docker to ease the app deployment
Infrastructure
Let's use Terraform to manage the infrastructure and use AWS as the infrastructure provider for this little project.
Terraform
Terraform is a popular and very robust technology to work with infrastructure as code. The main concepts to be aware of for our project are:
- Providers: plugins used by Terraform to interact with external systems. Here we use the AWS provider to have Terraform communicate with the AWS API.
- Resources: used to describe the infrastructure, which is made of elements. An element can be a machine, a virtual network, DNS records, an SSH key pair and so on.
- Data sources: used to retrieve information from an external source or from a separate Terraform configuration.
Terraform code
To start things off Terraform needs to know the AWS provider is required:
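A minimal sketch of that block (the exact version constraint is an assumption):

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.0" # minimum provider version
    }
  }
}
```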
We'll use Terraform's official AWS provider (the source tag) and require a minimum version of the provider.
To make the AWS provider work, we need to define a few things:
- A profile
- AWS region
- AWS access key
- AWS Secret Access Key
- A key_name: used by AWS to identify the key pair
- The public key of the computer running the Terraform script (or its location)
The code to do all that is as follows:
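A sketch of what that can look like (the profile, region, key name and key path are assumptions; the two variables are defined next):

```hcl
provider "aws" {
  profile    = "default"
  region     = "us-east-1"
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
}

# Registers our machine's public key with AWS under a name we choose
resource "aws_key_pair" "deployer" {
  key_name   = "deployer-key"
  public_key = file("~/.ssh/id_rsa.pub")
}
```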
Note that for the access key and the secret key, we're relying on two variables. These variables are defined in a file called secrets.tf (you can name it however you wish), and since it only contains the variable definitions (and my own keys) it's not available in the repository. Also note that Terraform loads all .tf files in the folder, so we don't need to include/import the file in main.tf.
This is how to define the variables:
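Something along these lines in secrets.tf (the variable names are my choice; replace the defaults with your own keys):

```hcl
variable "aws_access_key" {
  type    = string
  default = "YOUR-ACCESS-KEY-ID"
}

variable "aws_secret_key" {
  type    = string
  default = "YOUR-SECRET-ACCESS-KEY"
}
```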
Terraform is now able to connect to AWS! We can move on to creating an EC2 instance, deploying our app, etc.
For the EC2 instance in this project we'll use Canonical's Ubuntu 18.04 LTS AMI (Amazon Machine Image). This way we don't need to create our own image and we can leverage a nice, easy-to-use Ubuntu installation.
Instead of hard coding the version, let's have Terraform retrieve the latest AMI for Ubuntu 18.04. We can easily do that with a data source:
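Something like this (the owner ID is Canonical's well-known AWS account ID):

```hcl
data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}
```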
The code above is fairly simple:
- most_recent: tells Terraform to fetch the latest version of what is returned by the query
- filter "name": tells Terraform to fetch data from a repository that matches the string "ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*" (note the wildcard in the end)
- filter "virtualization-type": tells Terraform to fetch a virtualization-type object, a hvm. This is essentially the AMI we're looking for.
- owners: tells Terraform who owns the image, in this case Canonical. Knowing the owner lets Terraform look in the right account for AMIs matching the name.
With the AMI at hand, we can move on and define our EC2 instance.
There are 4 main sections in our EC2 resource definition.
1: Configuration variables
For the AMI ID, we're using the data source to fetch the ID from Canonical's repository (data.aws_ami.ubuntu.id). We'll use a t2.micro (so this little project stays free), we name the key pair (this will be visible in the AWS Console once we deploy) and we also tag the instance as UbuntuVM. Finally, we need to reference a security group (we'll take a closer look at it later) so we can actually access the instance from the outside world.
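Roughly (the resource name and the security group reference are my own naming; the security group itself is defined further down):

```hcl
resource "aws_instance" "ubuntu_vm" {
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = "t2.micro" # free-tier eligible
  key_name               = aws_key_pair.deployer.key_name
  vpc_security_group_ids = [aws_security_group.allow_app.id]

  tags = {
    Name = "UbuntuVM"
  }

  # the connection and provisioner blocks from the next sections live here
}
```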
2: Connection
Terraform needs to know how to connect to the instance:
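Inside the aws_instance block, something like this (the private key path is an assumption):

```hcl
  connection {
    type        = "ssh"
    user        = "ubuntu"
    host        = self.public_ip
    private_key = file("~/.ssh/id_rsa")
  }
```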
ubuntu is the default user in Canonical's Ubuntu AMI. As we don't know the hostname before deploying, we tell Terraform to take the value of public_ip from the resource we're defining; we can access the resource with the keyword self. And lastly, we load the content of the private key matching the public key we registered with AWS, using Terraform's file function.
3: Setup script
We'll use a shell script to run some setup on the instance (install a few things like Docker). We need Terraform to upload the file to the instance for us. This is easy: we use the provisioner "file" and give it a source (local path) and a destination (remote path).
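A sketch, still inside the resource block (the script name and remote path are assumptions):

```hcl
  provisioner "file" {
    source      = "setup.sh"
    destination = "/home/ubuntu/setup.sh"
  }
```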
4: Remote execution
To have Terraform execute the script for us, we'll just make sure the script has execution rights and tell Terraform to run it by simply providing the path.
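Again inside the resource block:

```hcl
  provisioner "remote-exec" {
    inline = [
      "chmod +x /home/ubuntu/setup.sh",
      "/home/ubuntu/setup.sh",
    ]
  }
```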
That's it! We have defined our EC2 instance! There's one last thing to do: create our security group so we can access it.
Security group
This section is fairly easy to understand. We're allowing incoming traffic on ports 80 (default HTTP), 8080 (used by our little app) and 22 (SSH). And we're allowing all traffic out of the instance.
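A sketch of that security group (the resource name matches the one referenced from the instance above; all the naming is mine):

```hcl
resource "aws_security_group" "allow_app" {
  name = "allow_app"

  ingress { # HTTP
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress { # our app's port
    from_port   = 8080
    to_port     = 8080
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress { # SSH
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress { # all outbound traffic allowed
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```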
I won't comment on the content of the setup script, but for completeness, it's here:
```sh
sudo apt update
sudo apt install -y python3.8
sudo apt install -y python3-distutils
sudo apt install -y docker.io
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
sudo python3 get-pip.py
sudo pip install docker-compose
sudo service docker start
git clone https://github.com/fsgeorgee/hello_aws.git
cd hello_aws/application
sudo docker-compose build
sudo docker-compose up
```
The application
We said before that our application outgrew Heroku. So instead of a simple app using Python's very own http module, we now have a simple Flask app (yes, you're right, it's not enough to trigger the move, but hey! it's just an example):
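A minimal sketch of such an app (the route and message are assumptions):

```python
# app/main.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from AWS!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```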
Instead of installing Flask directly on our EC2 instance, let's run it in a container. There's no official Flask image, so we start from the official Python image and install Flask inside it with a simple Dockerfile. We'll also need a requirements.txt file to make sure dependencies are met.
Dockerfile:
```dockerfile
FROM python:3.9.5-buster
RUN pip install flask
```
requirements.txt:

```
Flask
pika==0.9.14
```
Wait a minute, Flask alone will not suffice. We need an HTTP server; let's use Nginx, and while we're at it, let's use Nginx's official Docker image. We'll need a config file for our app to run:
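A minimal nginx.conf along these lines should do (assuming it gets mounted as the container's default server config):

```nginx
server {
    listen 80;

    location / {
        # hand every request over to the Flask container
        proxy_pass http://flask-app:5000;
    }
}
```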
We're simply relaying requests to the Flask app by using proxy_pass. http://flask-app:5000 is an internal URL: flask-app is the network alias of the Flask container we'll define soon, and 5000 is the port where Flask is running.
The main trick here is that we'll use a docker-compose.yml file to handle building and "connecting" the two containers (Flask and nginx) to make our app work.
Here is the first part of our docker-compose.yml. We're using docker-compose file version 3.3 (because of the docker-compose version that will be installed on the EC2 instance).
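A sketch (the volume mount and the network name are assumptions):

```yaml
version: "3.3"

services:
  nginx:
    image: nginx:1.13.7
    container_name: nginx
    depends_on:
      - flask
    volumes:
      # our nginx.conf replaces the image's default server config
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
    networks:
      - app-network
    ports:
      - "80:80"
```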
Note that we're using a specific version of Nginx image (1.13.7) and we're naming the container nginx. We're also telling docker-compose that this container depends on another container called flask (we'll define it soon).
We're also using a custom network. This is not strictly required: containers defined in the same docker-compose file fall back to a default network created by docker-compose if no networks are declared. And lastly, we're telling docker-compose that nginx should receive requests sent to port 80.
For Flask, we're telling docker-compose to build the image based on the Dockerfile available in the ./app folder. Similarly to nginx, we name the container, we tag the image with a specific version (0.0.1) and we copy the application code into the container with the volume directive.
Flask requires an environment variable FLASK_APP telling it where to find the application entry point. We can define environment variables with docker-compose by creating the environment node.
Again we tell docker-compose to use our custom network, but here it becomes clear why we created one: we're giving this container an alias, flask-app. This alias is what we used as the URL in the nginx.conf file. And here we tell Docker to redirect requests coming to port 8080 to 5000, where Flask runs.
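Putting the whole flask service together, it might look like this (the FLASK_APP path and the run command are assumptions of mine):

```yaml
  flask:
    build: ./app
    image: flask:0.0.1
    container_name: flask
    volumes:
      # mount the application code into the container
      - ./app:/app
    environment:
      - FLASK_APP=/app/main.py
    command: flask run --host=0.0.0.0 # Flask listens on 5000 by default
    networks:
      app-network:
        aliases:
          - flask-app # the hostname nginx proxies to
    ports:
      - "8080:5000"

networks:
  app-network:
```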
Deploying the infrastructure and application
Now to the really nice part. Let's deploy!
The first thing we need to do is run terraform init, so Terraform can retrieve the providers it needs to run our configuration.
Now we can go ahead and deploy with terraform apply:
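In the terminal:

```sh
terraform init   # downloads the AWS provider
terraform apply  # shows the execution plan and asks for confirmation before creating anything
```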
And voilà
Conclusion and next steps
Well, this was a simple exercise, and there are many things that were not considered. The application, of course, is super simple. We're not using auto-scaling groups, and we could better define security rules, firewalls, persistent storage… There are many, many things that are not covered, but they weren't important for this project and so they're out of scope.
A couple of next steps for something like this, besides of course enhancing what's been done already, could be:
- CI/CD Pipeline, something like Jenkins to orchestrate it
- An operator to trigger the deployment (GitOps!)