A few months ago Amazon Web Services announced three new instance types for EC2: the T2 instances (t2.micro, t2.small and t2.medium). These new T2 instances are different because you don't get a fixed amount of computing power all of the time.
The status quo
With any other EC2 instance type, once you boot the instance you get a certain amount of CPU power, a certain amount of memory, disk space and so on. You can use those resources however you like, up to the limit. In practice, this works very much like the limitations you would have if you were running a physical server.
T2 instances
On the other hand, the T2 instances serve a slightly different purpose. They're designed for workloads where you sometimes need a lot of CPU power, but where most of the time you don't.
The instances themselves come with the same Xeon E5-2670 CPUs (running at 2.5GHz) that come in the M3/R3/I2 instance types (roughly 3 ECUs per core). Instead of being able to use that power whenever and however you like, they're subject to a CPU "credit" system.
This system gives you a certain amount of baseline performance, but it allows you to "burst" much higher when you need the extra power. Effectively, this feature means you can use as much power as you like if your average usage falls below a certain value.
For the t2.micro, that value is 10%. For the t2.small, it's 20%. For the t2.medium, it's also 20% per core, but you get two cores and the average is taken across both – so you could, for example, run one core at 40% while the other sits idle.
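The credit system can be sketched as a toy model. The earn rates below match the figures AWS documents for each type (one credit is one vCPU-minute at 100%), but the initial balance and the exact accounting details are simplified assumptions – see the AWS documentation for the real rules:

```python
# Simplified model of the T2 CPU-credit system. Earn rates per AWS
# docs: one CPU credit = one vCPU at 100% for one minute, and the
# balance is capped at 24 hours' worth of earnings.
EARN_RATE = {"t2.micro": 6, "t2.small": 12, "t2.medium": 24}  # credits/hour

def simulate(instance_type, hourly_cpu_pct, initial_balance=30):
    """Track the credit balance over a series of hours.

    hourly_cpu_pct: average CPU utilisation (0-100) for each hour,
    summed across vCPUs for multi-core types. The initial balance
    of 30 is an assumption for illustration.
    """
    balance = float(initial_balance)
    history = []
    for pct in hourly_cpu_pct:
        spent = pct / 100 * 60  # credits burned this hour
        balance += EARN_RATE[instance_type] - spent
        # Clamp between zero and the 24-hour accrual cap
        balance = min(max(balance, 0.0), 24 * EARN_RATE[instance_type])
        history.append(round(balance, 1))
    return history

# A t2.micro that bursts to 100% for one hour, then idles and
# slowly earns its credits back at 6 per hour:
print(simulate("t2.micro", [100, 0, 0, 0]))  # [0.0, 6.0, 12.0, 18.0]
```

Run any workload whose long-run average stays under the baseline (six credits per hour is 10% of one core) and the balance only ever grows.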
Where T2s work best
Plenty of different usage profiles could fit this pattern. For example, consider a service that maxes out the whole CPU but only does so 10% of the time; in that situation, a t2.micro instance is perfect. Similarly, if an application or database needs lots of CPU to serve individual requests with low latency, but idles between requests, then a T2 is a good fit too.
How GoSquared is using T2s
For example, here are some use cases that we've found work really well on T2 instances:
- Our email-sending service, which uses a lot of CPU on the hour, every hour, as we send out daily traffic reports across different timezones, but sits mostly idle in between.
- Most of the instances serving our front-end applications (everything you see on gosquared.com) – these typically spend their time either idle between requests or waiting on network calls to other services.
- Our Jenkins build server, which takes care of building, testing and deploying all our other services – T2s are particularly brilliant here, because the server sits idle between builds while the high peak CPU performance keeps our builds fast.
- The applications and databases powering our internal tools, which see relatively low traffic.
Service-oriented architecture
T2 instances are really well-suited to a service-oriented architecture. When components are broken down into small services and those services can be placed on separate hardware (or in the case of EC2, separate virtual machines), then T2s are a good option.
They also make it easier than ever to take advantage of service-oriented architecture by providing extremely cheap, small building blocks, so an architecture can be broken down into its smallest constituent components. For the price of an old m1.small, you can run three t2.micro instances, each with approximately 3x the peak processing power.
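As a back-of-the-envelope check on that comparison, here are us-east-1 on-demand prices from around the time of the T2 launch (assumed figures; always check the current AWS pricing page):

```python
# Hourly on-demand prices, us-east-1, circa the T2 launch (assumed)
M1_SMALL_HOURLY = 0.044   # $/hour, old m1.small
T2_MICRO_HOURLY = 0.013   # $/hour, t2.micro

three_micros = 3 * T2_MICRO_HOURLY
print(f"3 x t2.micro: ${three_micros:.3f}/hr "
      f"vs 1 x m1.small: ${M1_SMALL_HOURLY:.3f}/hr")
print(three_micros < M1_SMALL_HOURLY)  # still cheaper than one m1.small
```

At these rates, three t2.micros together come in just under a single m1.small, while offering far more peak CPU.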
Performance, power and low costs
The best part about the T2 instances is that, so long as you don't spend all your CPU credits, you get all the power of a much larger instance at a fraction of the cost. As far as the services running on your instance are concerned, they're on something close to a c3.large.
We’ve recently migrated a number of services to T2 instances. For some, we’re now running more than 50% more EC2 instances than before, but paying roughly 30% less in equivalent on-demand cost. Compared to the previous instances, we’re enjoying all the benefits of great peak performance for these services, such as reduced request latency.