This is more of a note to self as much as a blog post, as well as I would like to get blogging again.
Lambda integration vs. Lambda-Proxy
We just went through the process of setting up an AWS API proxy for lambda again today, and API Gateway seems to have changed a lot over the last little bit.
One of the new Integration Request options is “Lambda Proxy” which basically just sends the full request to Lambda (including the URL info) as opposed to just invoking the function.
if key isNone: log.warn('Creating new key for {}'.format(site_name)) key = bucket.new_key('{}/endpoint.json'.format(site_name)) key.content_type = 'application/json'
At work we’ve been looking for a good solution to consolidate our service layer. We have around 30-40 backing services for our application and web tiers. The problem with the current setup is that they’re deployed all over the place; some in EC2-land, and some are colocated, and they’re in various languages.
We’re working on consolidating our services into a consolidated architecture:
service stack (Python/WSGI)
load balancer
service cache
service proxy
…or something like that. I’m primarily interested in the first item today: service stack.
The problem with EC2
We absolutely love Amazon Web Services. Their services remove a lot of the headache from our day-to-day life. However, EC2 instances get pricey, especially when deploying a lot of services. Even if we put all of our services on their own micro instances (which isn’t a great idea) the hard cost is pretty clear… let’s assume 50 services at today’s prices:
50 t1.micro instances x ~$14/month = $700/month
This doesn’t seem like an astounding number, but to be fair, this probably isn’t an accurate estimate; we’re still forgetting high availability and load balancing, as well as snapshots. Assuming a t1.micro is a great option for all services, let’s factor in the new costs:
(50 t1.micro instances x 2 availability zones x $14/month) + (50 elastic load balancers x $14/month) = $2100/month
Having a high availability setup is starting to add up a bit, and there are still additional costs for EBS, snapshots, S3 (if you’re using it), and data transfer.
The case for Deis
Deis is a platform-as-a-service (Paas) much like Heroku or Elastic Beanstalk. There are a few other open source solutions out there like Flynn.io (which is at the time of writing still in alpha) and dokku. Dokku will only run on a single instance, which doesn’t make it quite as appealing as Deis.
The whole idea behind Deis is that you can simple git push your application code to the cluster controller, and the platform will take care of the rest, including providing the hostname and options to scale. The solution is based on Docker which makes it extra appealing.
Enter AWS EC2
There is a contrib setup for EC2 in the source repository, and it looks like it should work out of the box. It doesn’t currently appear to have support for AWS VPC just yet, but it is just using CloudFormation stacks behind the scenes, so the configuration should be pretty straight forward. Now, I don’t know about anyone else, but CloudFormation templates (example make my brain hurt. AWS’s documentation is pretty clear for the most part, but sometimes it seems like you need a very specific configuration in about 6 different places for something to work. VPC is very much like this.
Fortunately, someone has already tackled this and there is a pull request open to merge it into deis trunk. The pull request effectively adds a couple configuration values that should allow deis to operate in a VPC. Primarily, there are a couple environment variables you need to set before provisioning the cluster:
I did run into one problem with this, however, when I specified the environment variables, I found my CloudFormation stack creation was failing (you can find this in your AWS console under the CloudFormation section, click on a stack, and select the ‘Events’ tab below). The error I was getting was related to the VPC and AutoScalingGroup not being created in the same availability zone, so I had to tweak the CloudFormation template under the Resources:CoreOSServerAutoScale:Properties:AvailabilityZones key to reflect my AZ (namely us-west-2b).
The IGW
Another problem I ran into (that had nothing to do with Deis) was I was deploying to an availability zone that didn’t actually have access to the public internet (no gateway/nat). This manifested as an error when trying to make run the cluster:
1
Failed creating job deis-registry.service: 501: All the given peers are not reachable (Tried to connect to each peer twice and failed) [0]
Once I got into the proper availability zone, the problem went away.
make run
I ran into some more issues when I finally got make run-ning…first the “activation” part of it took a really long time, like 15 minutes. I finally got this error:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
af:deis aaronfay$ make run fleetctl --strict-host-key-checking=false submit registry/systemd/deis-registry.service logger/systemd/deis-logger.service cache/systemd/deis-cache.service database/systemd/deis-database.service Starting 1 router(s)... Job deis-router.1.service scheduled to 22e48bb9.../10.0.14.17 Starting Deis! Deis will be functional once all services are reported as running... fleetctl --strict-host-key-checking=false start registry/systemd/deis-registry.service logger/systemd/deis-logger.service cache/systemd/deis-cache.service database/systemd/deis-database.service Job deis-registry.service scheduled to 4cb60f67.../10.0.14.16 Job deis-logger.service scheduled to 22e48bb9.../10.0.14.17 Job deis-database.service scheduled to 4cb60f67.../10.0.14.16 Job deis-cache.service scheduled to bc34904c.../10.0.14.13 Waiting for deis-registry to start (this can take some time)... Failed initializing SSH client: Timed out while initiating SSH connection Status: Failed initializing SSH client: Timed out while initiating SSH connection Failed initializing SSH client: Timed out while initiating SSH connection One or more services failed! Check which services by running 'make status' You can get detailed output with 'fleetctl status deis-servicename.service' This usually indicates an error with Deis - please open an issue on GitHub or ask for help in IRC
One of the admins on the IRC channel for #deis mentioned that you can make run again with no problems, however after several runs I still couldn’t get the command to complete free from errors. make status pointed out the issue with the controller:
1 2 3 4 5 6 7 8 9
af:deis aaronfay$ make status fleetctl --strict-host-key-checking=false list-units UNIT LOAD ACTIVE SUB DESC MACHINE deis-cache.service loaded active running deis-cache bc34904c.../10.0.14.13 deis-controller.service loaded failed failed deis-controller 22e48bb9.../10.0.14.17 deis-database.service loaded active running deis-database 4cb60f67.../10.0.14.16 deis-logger.service loaded active running deis-logger 22e48bb9.../10.0.14.17 deis-registry.service loaded active running deis-registry 4cb60f67.../10.0.14.16 deis-router.1.service loaded active running deis-router 22e48bb9.../10.0.14.17
failed hey? The same admin on the channel recommended fleetctl start deis-controller although I think I’m using an older version of fleetctl (0.2.0?) and I had to actually run fleetctl start deis-controller.service. That appears to have worked:
1 2 3 4 5 6 7 8 9 10 11 12
af:deis aaronfay$ fleetctl start deis-controller.service Job deis-controller.service scheduled to 22e48bb9.../10.0.14.17
af:deis aaronfay$ make status fleetctl --strict-host-key-checking=false list-units UNIT LOAD ACTIVE SUB DESC MACHINE deis-cache.service loaded active running deis-cache bc34904c.../10.0.14.13 deis-controller.service loaded active running deis-controller 22e48bb9.../10.0.14.17 deis-database.service loaded active running deis-database 4cb60f67.../10.0.14.16 deis-logger.service loaded active running deis-logger 22e48bb9.../10.0.14.17 deis-registry.service loaded active running deis-registry 4cb60f67.../10.0.14.16 deis-router.1.service loaded active running deis-router 22e48bb9.../10.0.14.17
So far so good
I now have Deis running in the VPC, well, the first bits anyway. I will update with the second part which includes DNS configuration, initializing a cluster, and deploying an app.