I’m a big fan of Sidekiq, but recently one of my clients asked me to host their new application on Amazon’s Elastic Beanstalk, which imposed certain limitations I had to overcome to get my usual setup working. In this post I want to describe some of them.
I usually either run Sidekiq on the same node as the web app, or place it on a dedicated one to keep demanding workers away from the main application. It requires Redis to maintain the queue of jobs and a copy of the application to be able to execute them.
Amazon’s Elastic Beanstalk is a platform with load balancing and autoscaling that lets you focus on your application, not your infrastructure. A copy of your Rails application (in this case) is unzipped and configured on each instance with Nginx and Passenger. If you need a database, Amazon provides RDS, which runs MySQL (in this case).
It appears that a setup with one dedicated node for Sidekiq and another for Redis isn’t quite possible with the current state of Beanstalk. There’s no distinction between nodes – each one is just like its siblings. There is a notion of a “leader”, the first node handled during a deployment, but we can’t use it to reliably identify a particular node afterwards: it is only meant for migrations and other one-off tasks during the deployment itself, and it can change from deployment to deployment. On top of this, the auto-scaler is free to terminate any instance when scaling down, and from what I’ve read it tends to pick older instances more often than not.
Here’s what I finally came up with:
- use RedisToGo as an external Redis database. They host it on EC2 in the same zone, so the latency is tiny.
- deploy workers on each web node. Even though it’s an unusual setup, it makes sense: as the load grows, the number of jobs to run is likely to increase as well, so the workers scale along with the web tier.
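For the RedisToGo part, pointing Sidekiq at the external Redis can be done in an initializer. A minimal sketch, assuming the connection URL arrives in a `REDISTOGO_URL` environment variable — that name is an assumption, use whatever your Beanstalk environment actually exposes:

```ruby
# config/initializers/sidekiq.rb
# REDISTOGO_URL is an assumed environment variable name; fall back to a
# local Redis instance for development.
redis_url = ENV.fetch("REDISTOGO_URL", "redis://localhost:6379/0")

Sidekiq.configure_server do |config|
  config.redis = { url: redis_url }
end

Sidekiq.configure_client do |config|
  config.redis = { url: redis_url }
end
```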
To (re)start workers on the nodes, I wired a script into Beanstalk’s deployment hooks.
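A sketch of what such deployment hooks can look like — the file names, pid-file and log paths are assumptions, not the original script, and this assumes the classic hook layout under `/opt/elasticbeanstalk/hooks/appdeploy`. The `USR1` signal quiets Sidekiq (it became `TSTP` in Sidekiq 5+), and the `-d` daemonize flag exists in Sidekiq versions of that era (it was removed in Sidekiq 6):

```yaml
# .ebextensions/sidekiq.config -- a sketch, not the original script.
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/pre/03_quiet_sidekiq.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      # Tell Sidekiq to stop picking up new jobs before the new version lands.
      PIDFILE=/var/app/support/pids/sidekiq.pid
      if [ -f "$PIDFILE" ]; then
        kill -USR1 "$(cat "$PIDFILE")" || true  # USR1 = quiet (TSTP in Sidekiq 5+)
      fi

  "/opt/elasticbeanstalk/hooks/appdeploy/post/50_restart_sidekiq.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      # Stop the old workers, give in-flight jobs time to finish, then start
      # fresh ones against the newly deployed code.
      PIDFILE=/var/app/support/pids/sidekiq.pid
      if [ -f "$PIDFILE" ]; then
        kill -TERM "$(cat "$PIDFILE")" || true
        sleep 10
      fi
      cd /var/app/current
      bundle exec sidekiq -e production -d \
        -P "$PIDFILE" -L /var/app/support/logs/sidekiq.log
```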
Basically, after unzipping (pre-init task 1) and setting up the environment (pre-init task 2), we tell Sidekiq not to accept any more jobs. Then the new app version is deployed, and finally (post-init task 50) we restart the workers: first stopping them, giving them time to terminate, and then starting them over again.
Figuring this out took about two days of dirty time, so hopefully this will be of some help to others.