AWS EC2 Spot Instances: The Cloud at its Best!

Once in a while, your data project requires more resources than your laptop can offer. If you deal with big data using Hadoop and/or large databases on MySQL, Postgres, MongoDB or SQL Server, you will need a server or a cluster of servers. If you are lucky, your organization has the capacity to host your project and you can work and experiment to your heart's content. If you do not have access to server resources on-site or in your internal cloud, you need to find a reliable and cost-effective cloud service provider. Amazon Web Services (AWS) is one of the most widely known and used cloud service providers, offering an affordable pay-as-you-go cloud hosting model.

AWS consists of a number of services such as EC2 for computing; SSD (highest speed, most expensive), EBS (high speed, more expensive) and S3 (slower, cheapest) for storage; and CloudFront for dynamic, demand-based, distributed web delivery. AWS, and especially the Elastic Compute Cloud (EC2) service, has been around for a while for projects that require hosted servers as opposed to shared server hosting. As the story is usually told, EC2 was an innovative idea that came about when Amazon, the online bookstore, decided to rent out its spare server computing time at a fraction of the cost of other cloud offerings, thus making money from its expensive servers when they were not in use. EC2 is the most common (and one of the oldest) AWS services: quite simply a virtual server that is billed hourly, with pricing based on computing power. Under EC2, you have a few options:

  1. On-Demand (most common), where you only pay for what you use. You pay a fixed hourly price while your server is running, without any long-term commitment.
  2. Reserved, which is identical to On-Demand except that you commit to a minimum period of time (not less than a year). This can reduce costs by up to 75%.
  3. Dedicated Hosts (if you want to go there), where you get a physical server in the AWS server farm. These can be billed the same way as On-Demand or Reserved. To my mind this is mainly for enterprises.
  4. Spot Instances, the subject of this post, where you bid for computing time. This can reduce costs by up to 90%!

EC2 On-Demand, the "traditional" EC2, offers a wide variety of server instances to fit any project and budget, from a nano server with 1 CPU + 0.5 GB RAM @$0.0059 (KShs 0.60, yes, 60 Kenyan cents!) per hour, upwards with virtually no limit if you can afford it. For illustration purposes, one of the largest offerings currently is 64 CPUs + 288 GB RAM @$4.256 (KShs 425.6) per hour! For more information on the numerous AWS and EC2 offerings and their pricing, see the official AWS pricing pages.

EC2 instance types

AWS EC2 represents a huge saving over traditional buy-a-server (local hosting) or host-a-server (cloud hosting) solutions while keeping the flexibility and power that come with "owning" your own server. You have root/admin access, you can install whatever operating system or software you like, download speeds are incredible, and you get near-zero downtime and stellar customer support. The best part is that you only pay for what you use, i.e. services which are active. Server instances which have been stopped are not billed (or are billed at a negligible rate).

Still, charges, like those for most other AWS offerings, are per hour, and for those who, like me, go to AWS to minimize computing costs, this is a double-edged sword. Pricing per hour means that you can get short-term access to a powerful server and pay only for what you use, but you must know what you're doing. Long-term use, or forgetting to stop or terminate (know the difference*!) a server, can attract a hefty bill. I have made this mistake myself, thankfully discovering it before my bill got out of hand. Billed per hour, EC2 delivers huge savings for short-term projects, but if you run the service for months you must have calculated your costs and benefits. This is part of the extra work you have to put in when using AWS - you have to do a lot of reading and calculating** to understand how much you will actually pay, what you save, and at what point (if ever) you should be thinking of getting a different service or your own physical server.
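To make the "reading and calculating" concrete, here is a minimal sketch of the arithmetic, using the two On-Demand rates quoted earlier in this post. The 730-hour month and the $5,000 price of an alternative server are my own assumptions for illustration, not AWS figures, and the sketch ignores storage, data transfer and other charges.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)


def projected_cost(hourly_rate_usd, hours):
    """Total charge for keeping one instance running for `hours` hours."""
    return hourly_rate_usd * hours


def break_even_hours(hourly_rate_usd, alternative_cost_usd):
    """Hours of running time after which hourly billing has cost as much
    as a fixed-price alternative (e.g. buying your own server)."""
    return alternative_cost_usd / hourly_rate_usd


# Hourly rates quoted in this post (they will have changed since).
rates = {"nano": 0.0059, "largest": 4.256}

# Hypothetical price of a physical server, purely for illustration.
own_server_usd = 5000.0

for name, rate in rates.items():
    month = projected_cost(rate, HOURS_PER_MONTH)
    hours = break_even_hours(rate, own_server_usd)
    print(f"{name}: ${month:,.2f} per month if never stopped, "
          f"break-even after {hours:,.0f} hours")
```

Left running for a full 730-hour month, the largest instance quoted above comes to about $3,107, which is exactly the kind of surprise a forgotten server produces.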

BUT THERE'S MORE. You can get even lower server pricing with AWS's spot instances! The catch is more work for you - you have to rethink how you design your projects/apps. Just as EC2 began with Amazon's online bookstore leasing out its servers when they were not in use, Spot Instances take the idea a step further by offering the computing time that is left over when EC2 servers themselves are not in use! AWS spot instances are spare EC2 computing resources put up for auction at a fraction of their usual cost. This again means that the servers are rarely idle - better a server that gives even a meager return than one that just sits idle incurring power and maintenance costs. With spot instances you can cut your EC2 costs by up to 90%! For example, I experimented over a couple of days with a few spot instance sizes, ranging from 2 vCPUs and 8 GB RAM @$0.0128 (KShs 1.28) per hour to 16 vCPUs and 64 GB RAM @$0.1093 (KShs 10.93) per hour. This came to a total of about 35 hours of active server time and a bill of only $3.60 (KShs 360)! How's that for cheap?! Current prices are listed on the EC2 spot instance pricing page.

Example of EC2 On-Demand pricing

Example of EC2 Spot Pricing

An example comparison of spot and on-demand EC2 pricing

As mentioned earlier, spot instances require more work on your part. When setting up a spot instance, you set the maximum price that you are willing to pay for it, and AWS will only start your server when your maximum bid price is higher than the auction price, which varies with supply and demand. But this means that once your server is running, if demand rises and the auction price exceeds your maximum bid price, or if EC2 capacity is exceeded, your server is terminated. What is worse, you only get 2 minutes to prepare for the server to be terminated (!) AND AWS DOES NOT OFFER ANY GUARANTEE THAT YOU WILL RECEIVE THE TERMINATION NOTIFICATION BEFORE YOUR SERVER IS TERMINATED!
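The bidding rule can be sketched as a toy simulation. This is a simplification under stated assumptions, not AWS's actual algorithm: the price moves once per hour, the server runs in every hour where the auction price does not exceed your maximum, you are charged the auction price rather than your maximum, and the request is persistent so the server comes back when the price drops. The price series is made up.

```python
def simulate_spot_run(max_price, hourly_spot_prices):
    """Replay a series of hourly auction prices against a maximum bid.

    Returns (hours_run, total_cost, interruptions).
    Assumptions: one price per hour; the server runs whenever
    price <= max_price; each running hour is charged at that hour's
    auction price; capacity shortages are ignored.
    """
    hours_run = 0
    total_cost = 0.0
    interruptions = 0
    running = False
    for price in hourly_spot_prices:
        if price <= max_price:
            running = True
            hours_run += 1
            total_cost += price
        else:
            if running:
                interruptions += 1  # the price rose above our bid
            running = False
    return hours_run, total_cost, interruptions


# Made-up hourly auction prices in USD, for illustration only.
prices = [0.10, 0.11, 0.15, 0.12, 0.09]

for bid in (0.10, 0.12, 0.20):
    hours, cost, stops = simulate_spot_run(bid, prices)
    print(f"max bid ${bid:.2f}: ran {hours} h, "
          f"cost ${cost:.2f}, interrupted {stops} time(s)")
```

Replaying different maximum bids against a recorded price history like this is a cheap way to see the trade-off between run-time and interruptions before spending any money.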

Look out for the instance termination notification but do not rely on it

Therefore, you should programmatically poll for the notification and trigger your own termination procedure when it is detected. All the same, your service/application must be able to recover from an unexpected termination and, where spot instance persistence is configured, resume from where it left off at any time. A spot instance can be configured to persist, which means that if the server instance is terminated for any reason, a NEW server can be started when your maximum bid price next exceeds the auction price or when capacity becomes available. NOTE that this is a different server each time, so your data must be persisted elsewhere (e.g. in an S3 bucket or on a separate EBS volume). To ensure that your app resumes as required you can create an Amazon Machine Image (AMI) of the server that has all the configuration you need (including mounting your persistent storage), use a "user data" script that is run when the server instance is first started (only the one time!), or combine the two. More on the nitty-gritty of setting up EC2 instances, including spot instances, in a later post.
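A minimal sketch of such a polling loop is shown below. It reads the `spot/termination-time` entry of the EC2 instance metadata service, which returns a 404 until a termination has been scheduled, and uses the session-token handshake of the newer metadata service version. The 5-second interval, the function names and the `on_notice` callback are my own choices for illustration. Because the notice is not guaranteed to arrive, treat this as a best-effort early warning on top of a design that already survives sudden termination.

```python
import time
import urllib.error
import urllib.request

METADATA = "http://169.254.169.254/latest"  # only reachable from inside EC2


def fetch_termination_time(timeout=2):
    """Return the scheduled termination timestamp as a string,
    or None if no notice has been posted (or metadata is unreachable)."""
    try:
        # Get a short-lived session token for the metadata service.
        token_req = urllib.request.Request(
            METADATA + "/api/token",
            method="PUT",
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        )
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode()
        req = urllib.request.Request(
            METADATA + "/meta-data/spot/termination-time",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode().strip()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no termination scheduled yet
            return None
        raise
    except OSError:  # network error or timeout, e.g. when not on EC2
        return None


def wait_for_termination(on_notice, fetch=fetch_termination_time,
                         interval=5, max_polls=None, sleep=time.sleep):
    """Poll until a notice appears, then call on_notice(timestamp).

    Returns the timestamp, or None if max_polls ran out first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        when = fetch()
        if when is not None:
            on_notice(when)  # e.g. flush state to persistent storage
            return when
        polls += 1
        sleep(interval)
    return None
```

On a spot instance you might run `wait_for_termination(save_state_and_exit)` in a background thread, where `save_state_and_exit` is a hypothetical handler of your own that writes your progress to persistent storage.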

So, in summary AWS spot instances offer by far the cheapest option for server hosting but introduce:

  1. major uncertainties in when and how long your server is up and running
  2. (possibly) major rethinking of how your applications/services are designed
  3. the need for some experimentation to balance bid prices with auction prices and server capacities to maximize run-times and minimize cost.

Spot Instances are particularly good for large batch processing jobs where you have many small tasks that save state frequently and can resume from where they left off any time they are called upon. AWS publishes a list of use-cases and testimonials showing how spot instances are being used.
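The "save state frequently and resume" pattern can be sketched as follows. This is one simple way to do it, assuming tasks are processed in a fixed order and the checkpoint file lives on storage that outlives the server (for example a mounted EBS volume). The function name and the JSON layout are made up for illustration.

```python
import json
import os


def run_batch(tasks, process, checkpoint_path):
    """Process tasks in order, recording progress after each one.

    If a checkpoint file exists, tasks already recorded as done are
    skipped. Returns the number of tasks processed in this call.
    """
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)["done"]

    for index in range(done, len(tasks)):
        process(tasks[index])
        # Write to a temporary file and rename it, so that a termination
        # in the middle of a write cannot leave a corrupt checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"done": index + 1}, f)
        os.replace(tmp_path, checkpoint_path)

    return len(tasks) - done
```

If the server dies part-way through a task, that one task is repeated on the next run, so each task should be safe to run twice. Calling `run_batch` from the start-up script of a replacement server picks up at the first unfinished task.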

*Stopping a server instance shuts down the server but does not destroy it. The server may still incur costs from storage, assigned static IP addresses etc., but not from the server itself. A stopped instance can be restarted. Terminating an instance shuts down and destroys the server. Any data that lives only on that server is lost and cannot be retrieved.

**Similar to other services that have many offerings, it is not always easy to get all the information you need about EC2 and other AWS services in one place. A lot of reading and cross-referencing from the EC2 site and from blogs/forums is essential.