Cloudonomics - maximizing return on cloud investment

In this post, I address the question "Is my organization maximizing ROI from the cloud?" and explore various options with practical examples.

Use reservations effectively

Cloud vendors like AWS and Azure provide substantial discounts for longer-term commitments. For example, AWS offers significant discounts on EC2 instances, DynamoDB, RDS and ElastiCache in exchange for a longer commitment. Aim to maximize reservation coverage by reserving resources whose projected utilization equals or exceeds the break-even point, which is 100% minus the reservation discount percentage.

Here is a simple example illustrating the savings from reserved instances over on-demand pricing using this technique. Let's assume a total of 200 instances, 100 of which are used 24 hours a day, 50 for 18 hours and 50 more for 12 hours. Let's further assume that one-year reservations provide a 35% discount over on-demand pricing, so the monthly on-demand cost per instance is $100 while the reserved cost is $65. In this case, the rule of thumb is to reserve any capacity that is used more than 65% of the time, which is the break-even point for reservations.

Cost savings through reservations

| Instances | On-demand costs ($) | Reserved costs ($) | Reserve? | Actual costs ($) | Savings ($) (on-demand - actual) |
|---|---|---|---|---|---|
| 100 instances used 100% of the time | 10,000 | 6,500 | Yes | 6,500 | 3,500 |
| 50 instances used 75% of the time | 3,750 | 3,250 | Yes | 3,250 | 500 |
| 50 instances used 50% of the time | 2,500 | 3,250 | No | 2,500 | N/A |
| Total | 16,250 | N/A | N/A | 12,250 | 4,000 |

Simply reserving 150 instances results in savings of about 24.6% ($4,000 of $16,250). Keep in mind that instance types change frequently, so the benefits of reserving instances for 3 years are not as clear. However, for managed services like DynamoDB it is more economical to invest in 3-year reservations than 1-year ones, as customers are insulated from underlying hardware changes.
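The reservation arithmetic from the example can be reproduced with a short script. This is a minimal sketch using only the example's figures ($100 on-demand, $65 reserved, three usage groups); the names `UsageGroup`, `break_even_utilization` and `plan` are illustrative and not part of any vendor tool.

```python
from dataclasses import dataclass

ON_DEMAND_MONTHLY = 100.0  # $ per instance per month at 100% usage (from the example)
RESERVED_MONTHLY = 65.0    # $ per instance per month after the 35% discount


@dataclass
class UsageGroup:
    instances: int
    utilization: float  # fraction of the time the instances run, 0.0 to 1.0


def break_even_utilization(on_demand: float, reserved: float) -> float:
    """Utilization at or above which a reservation costs no more than on-demand."""
    return reserved / on_demand


def plan(groups, on_demand=ON_DEMAND_MONTHLY, reserved=RESERVED_MONTHLY):
    """Return (total on-demand cost, total actual cost), reserving only where it pays."""
    threshold = break_even_utilization(on_demand, reserved)
    total_on_demand = 0.0
    total_actual = 0.0
    for group in groups:
        on_demand_cost = group.instances * on_demand * group.utilization
        reserved_cost = group.instances * reserved
        # Reserve only when utilization reaches the break-even point.
        actual = reserved_cost if group.utilization >= threshold else on_demand_cost
        total_on_demand += on_demand_cost
        total_actual += actual
    return total_on_demand, total_actual


groups = [UsageGroup(100, 1.0), UsageGroup(50, 0.75), UsageGroup(50, 0.5)]
on_demand_total, actual_total = plan(groups)
savings = on_demand_total - actual_total
print(f"On-demand: ${on_demand_total:,.0f}, actual: ${actual_total:,.0f}, "
      f"savings: ${savings:,.0f} ({savings / on_demand_total:.1%})")
```

Changing the prices or the usage groups shows how quickly the break-even point moves with the discount.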

Cloud service management platforms like CloudHealth Technologies automate these processes to maximize ROI from the cloud.

Scaling through hardware and not optimizing enough

While the cloud is elastic and can scale with demand, costs can get out of hand quickly. Elasticity can encourage uneconomical behavior: throwing compute resources at problems instead of applying efficient engineering practices.

Too often, scaling problems are solved by adding hardware instead of improving the design. An efficient design minimizes the amount of data transferred between computing systems, which lowers processing requirements and results in cost savings.

Servers and databases become more efficient when they process smaller volumes of data, which lowers compute, memory and networking needs. Data transfer costs are also reduced, and those savings can be passed on to consumers.

Data compression can achieve similar results. The CPU overhead of compressing and decompressing is offset by the gains from transferring less data.

Breaking the data transferred into smaller chunks involves a higher level of engineering effort. For example, it is programmatically simpler to save an entire user profile when a single attribute has changed, but doing so is inefficient and expensive, especially when repeated.
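As a rough illustration of the compression trade-off, the sketch below compresses a JSON payload with Python's standard `zlib` module and compares sizes. The payload is made up for the example; real savings depend on how repetitive the data is.

```python
import json
import zlib

# Hypothetical payload: 500 repetitive records, similar in shape to many API responses.
records = [{"user_id": i, "status": "active", "plan": "standard"} for i in range(500)]
raw = json.dumps(records).encode("utf-8")

# Compression level 6 is a common middle ground between CPU cost and size.
compressed = zlib.compress(raw, 6)
restored = zlib.decompress(compressed)

print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```

Repetitive, text-based formats like JSON tend to compress well, while data that is already compressed (images, video) gains little.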
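The user profile case can be sketched as follows: rather than writing the whole profile, compute and send only the attributes that changed. The function name `changed_attributes` and the sample profile are hypothetical, and this simple version ignores deleted attributes.

```python
import json


def changed_attributes(old: dict, new: dict) -> dict:
    """Return only the attributes that are new or whose values differ from the stored profile."""
    return {
        key: value
        for key, value in new.items()
        if key not in old or old[key] != value
    }


# Hypothetical stored profile with one large attribute.
stored = {"name": "Ada", "email": "ada@example.com", "city": "London", "bio": "x" * 2000}
updated = dict(stored, city="Paris")  # a single attribute changes

delta = changed_attributes(stored, updated)
full_size = len(json.dumps(updated))
delta_size = len(json.dumps(delta))
print(f"full profile: {full_size} bytes, delta: {delta_size} bytes")
```

The extra engineering effort lies in having the storage layer accept partial updates, which this sketch does not cover.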

Use resources only when needed

Follow the business hours cost model, where resources are active only when they are being used. This is usually 12 hours a day, Monday to Friday: 60 of 168 hours a week, or 35.7% of the time. Resources for software development and testing are excellent candidates for this type of usage.

Batch compute tasks, both scheduled and ad hoc, should run and then release all resources when complete. A word of caution: on-demand compute capacity may not always be available, so plan for variable schedules.
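The business hours arithmetic, together with a check that a scheduler could use to decide whether resources should be running, can be sketched like this. The 07:00 to 19:00 window is an assumption chosen to give 12 hours a day; the source does not specify start and end times.

```python
from datetime import datetime

START_HOUR = 7   # assumed start of the 12-hour window
END_HOUR = 19    # assumed end of the window (exclusive)


def is_business_hours(moment: datetime) -> bool:
    """True on Monday to Friday between START_HOUR and END_HOUR."""
    return moment.weekday() < 5 and START_HOUR <= moment.hour < END_HOUR


HOURS_PER_WEEK = 7 * 24
active_hours = 5 * (END_HOUR - START_HOUR)
active_fraction = active_hours / HOURS_PER_WEEK
print(f"{active_hours} of {HOURS_PER_WEEK} hours = {active_fraction:.1%}")
```

A scheduled job could call `is_business_hours(datetime.now())` and start or stop development and test resources accordingly.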

Containerize

When an application is containerized, it can be deployed consistently across different hardware. When new hardware is introduced, the application can be benchmarked on it and moved over if the new hardware is more efficient (it usually is).

Containerization future-proofs applications. As applications mature, the engineers who built them may no longer be around and the technology will have evolved, so redeploying the application on newer hardware can get complicated. For example, there is no clear migration path for legacy applications running on AWS from PV AMIs to the newer HVM AMIs without knowing how the application is configured or deployed. There are applications running on the older AWS m1 and m2 instance families which cannot be moved to newer instances without significant engineering effort. A containerized application will not have this problem, resulting in savings over the entire lifecycle of the application.

In addition, containerization facilitates on-demand usage, supporting use cases like the business hours cost model and batch computing.

Use managed services

Managed services tend to be more elastic, and their costs correlate with usage. For example, AWS DynamoDB follows an elastic pricing model where capacity can be changed as needed, multiple times a day. In addition, these services are managed programmatically, which suits a DevOps culture and lowers administrative overhead. Other administrative tasks, like backing up data and setting up cross-region replication for high availability, are greatly simplified. As a result, managed services improve productivity and need fewer personnel.

In summary, to get the maximum ROI from the cloud: use reservations effectively, optimize applications to minimize data transfer, follow the business hours cost model to run resources only during working hours, containerize to future-proof applications and take advantage of newer, cheaper hardware, and finally use managed services, which are elastic and simpler to manage.

Appendix

Here is a presentation on the same topic.