In our last blog post, we looked at how to identify the big spenders in your cloud deployment. In this post, we'll look at strategies to make your cloud spend more efficient.
One of the keys to maximising the efficiency of your workloads in the cloud is understanding what resources they consume and how they consume them. If you don't understand how your application behaves, and simply plan your cloud deployment around the highest possible peak it could reach, you will end up with highly inefficient consumption and waste money on underutilised resources. Cloud consumption is charged on unit consumption, or unit consumption over time. When you are charged $3/hour to run your web server VM, you want to use as much of that VM's performance as possible, because whether you use 0% or 99% of it, you're still being charged $3/hour.
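To make the $3/hour example concrete, here is a small sketch of cost per unit of useful work. The request capacity figure is a hypothetical assumption; plug in whatever your application's unit of work is:

```python
def effective_cost_per_unit(hourly_rate: float,
                            units_per_hour_at_full_load: float,
                            utilisation: float) -> float:
    """Cost per unit of useful work at a given utilisation (0.0 to 1.0).

    The VM costs the same per hour regardless of load, so the cost of
    each unit of work rises as utilisation falls.
    """
    units_served = units_per_hour_at_full_load * utilisation
    return hourly_rate / units_served

# Hypothetical web server: $3/hour, 10,000 requests/hour at full load.
# At 90% utilisation each request costs a ninth of what it costs at 10%.
busy = effective_cost_per_unit(3.00, 10_000, 0.9)
idle = effective_cost_per_unit(3.00, 10_000, 0.1)
```

The same dollars buy nine times more work at 90% utilisation than at 10%, which is why understanding real consumption matters before sizing anything.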
Building for clouds
Despite all the flexibility that cloud gives us, few clouds offer the ability to add CPUs and memory to a running system without interruption, and even fewer guest operating systems reliably support it. Generally speaking, scaling up a server is an event that requires downtime. So when you build cloud deployments, you should build them using a scale-out architecture. With a scale-out architecture, you simply add or remove servers to handle changes in demand.
As you evaluate the best way to build a scale-out architecture, it's also important to understand which metrics you can use to decide when to scale out. A very simplistic metric may work for your application; you may simply deploy more servers when the CPU load of the application servers gets high. But sometimes that isn't appropriate. In some architectures, you may have to monitor how much unprocessed data is building up in system A to determine when to scale system B. Regardless of which scale-out metric you choose and how you implement it, understanding how your application performs, and what the key indicators of performance bottlenecks are, is critical.
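As a sketch of the backlog-driven approach, this hypothetical function sizes "system B" from the amount of unprocessed data queuing up in "system A". The processing rate, drain target, and worker bounds are all illustrative assumptions, not values from any particular platform:

```python
import math

def desired_worker_count(backlog_items: int,
                         items_per_worker_per_min: int,
                         target_drain_minutes: int,
                         min_workers: int = 1,
                         max_workers: int = 20) -> int:
    """How many 'system B' workers are needed to drain 'system A's
    backlog within the target time, clamped to sane bounds."""
    needed = math.ceil(
        backlog_items / (items_per_worker_per_min * target_drain_minutes)
    )
    # Never scale to zero, and cap the fleet so a backlog spike
    # can't run away with the budget.
    return max(min_workers, min(needed, max_workers))

# 5,000 queued items, 50 items/worker/minute, drain within 10 minutes:
# each worker clears 500 items in that window, so we need 10 workers.
```

A scheduler would call this on each evaluation interval and add or remove servers to match, which is exactly the scale-out pattern described above, just keyed on queue depth instead of CPU.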
Similarly, if you don’t periodically review your application behaviour and performance, you won’t be able to tune and optimise your deployment to find new efficiencies as your workloads develop.
Not everything can be made cloudy
However, it’s not always possible to change the application architecture of our cloud deployments. In those situations, we have to find more fundamental ways to save money when running workloads in the cloud. One approach is to take the key ideas behind our application design and apply them more generally. For example, knowing that our developers and testers only work business hours, we can schedule the start-up and shutdown of our dev and test environments so that they only run when they are needed.
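As a minimal sketch of that scheduling idea, assuming a business day of 8am to 6pm, Monday to Friday (both figures are assumptions; adjust them to your teams' actual hours), a scheduler could check a predicate like this before starting or stopping environments:

```python
from datetime import datetime

BUSINESS_START_HOUR = 8   # assumption: environments come up at 8am
BUSINESS_END_HOUR = 18    # assumption: environments shut down at 6pm

def should_be_running(now: datetime) -> bool:
    """Dev/test environments run only Monday-Friday, business hours.

    weekday() is 0 for Monday through 6 for Sunday, so < 5 means a weekday.
    """
    return now.weekday() < 5 and BUSINESS_START_HOUR <= now.hour < BUSINESS_END_HOUR
```

With a 10-hour weekday window, the environments run 50 of the 168 hours in a week, which is roughly a 70% reduction in billed hours for anything charged by the hour.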
We can also apply this process to user-based cloud services. For some SaaS applications, cloud providers charge a prorated per-user monthly fee. When consuming those services, it’s worth evaluating whether you can take a just-in-time approach to user provisioning: provision access to the SaaS when a user needs it, then remove it once they have finished what they need to do. This is especially useful where you use generic accounts for training, but they are only used for a week at a time, and training only happens 6-7 times a year. Rather than leaving all the accounts provisioned, active, and accruing charges in the application, provision them before training starts and deprovision them after it ends, so that you are not paying per-user charges for accounts that sit unused most of the time.
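To see why this pays off, here is a rough back-of-the-envelope calculation. The $30/user/month fee and 25 training accounts are hypothetical figures, and we assume the fee is prorated by the day, as described above:

```python
MONTHLY_FEE = 30.0    # assumption: $30 per user per month, prorated daily
TRAINING_USERS = 25   # assumption: 25 generic training accounts
TRAINING_WEEKS = 7    # ~6-7 one-week sessions a year; take the high end

# Leaving every account provisioned all year:
always_on_cost = MONTHLY_FEE * 12 * TRAINING_USERS

# Just-in-time: accounts exist only during training weeks.
daily_rate = MONTHLY_FEE * 12 / 365
just_in_time_cost = daily_rate * 7 * TRAINING_WEEKS * TRAINING_USERS

annual_saving = always_on_cost - just_in_time_cost
```

Under these assumed numbers the always-on bill is $9,000 a year while the just-in-time bill is a little over $1,200, so the accounts cost roughly 13% of what they otherwise would.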
As you can see, there are a number of different approaches to making your cloud spend more efficient, and this blog only covers the tip of the iceberg.