The five pillars of cloud architecture
The five pillars of cloud architecture is a framework for designing and building cloud infrastructure to run any type of workload at any scale on the cloud.
Why do I need a cloud architecture framework?
A framework guides you with a set of best practices and design patterns. In the absence of a framework, you would be on your own, running into pitfalls often.
The five-pillar cloud architecture framework defines a set of best practices across five key aspects (or pillars) of the cloud:
- Performance efficiency
- Operational excellence
- Resiliency
- Cost optimization
- Security
These five pillars sometimes have competing requirements.
You can reduce cost, but at the expense of reliability or security.
Or, you can build in more cushion for resiliency, at a higher cost.
Or, you can operate efficiently while completely forgetting security (at least for a short time - until you are under attack).
The five-pillar architecture helps you balance these competing priorities and design your cloud to meet business goals.
What is cloud architecture?
At its fundamental layer, the cloud consists of computing, storage, and networking resource pools. Cloud service providers offer these resources for you to consume via a range of cloud services. Computing resources are made available as virtual machines, containers, serverless functions, etc. Storage and networking resources are also made available via similar services.
You can build your cloud infrastructure by provisioning computing, storage, and networking resources via these cloud services. But first, you need to know what services to choose and how to integrate those services to build a holistic platform to run your software applications.
This is the problem addressed in the five-pillar cloud architectural design.
Now, let’s explore the five pillars and how you should build your architecture balancing all pillars.
Pillar #1: Performance efficiency
You get maximum performance when your cloud resources are optimally utilized but not overly stressed.
If your cloud resources are not being optimally utilized, you are spending money on unused capacity.
On the other hand, your software applications suffer if your cloud resources are overly stressed, either because you do not have enough capacity to handle the load or because you have not architected your cloud for the type of workload.
Follow these guidelines to ensure optimal utilization of your cloud resources.
- Select the right cloud service
Cloud service providers offer a wide range of services to consume computing, storage, and networking resources on the cloud. There is no one-size-fits-all service. Select the right services based on the type of workload you intend to run.
- Set the right size
Maintain enough capacity to handle the current load. Use auto-scaling to scale your capacity in proportion to variations in the load. Set the scaling thresholds to appropriate values so scaling is triggered at the right time. If you trigger scaling too early, you pay for excess capacity. But if you scale too late, performance degrades.
- Measure continuously
Maintaining efficient performance in your cloud resources is a continuous task. Your application changes. The number of users and their behavior also change. To accommodate these changes, continuously monitor and optimize your cloud resources. Cloud providers have monitoring services that let you view the performance metrics of all your cloud services in a single dashboard. Utilize these services to monitor the performance parameters and retain historical data to support your architectural and design decisions. The cloud monitoring services also support detecting anomalies in performance data. Report these anomalies to an external system to get notified when they occur.
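As a rough illustration of anomaly detection on a performance metric, here is a minimal sketch. It is not how any particular cloud monitoring service works. It assumes you have already exported a series of CPU utilization samples, and it flags a sample when it deviates strongly from the samples just before it. The function name, the window size, and the threshold are all hypothetical choices.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate more than `threshold`
    standard deviations from the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilization samples (percent), one per minute.
cpu_percent = [41, 43, 42, 40, 44, 42, 95, 43, 41]
print(find_anomalies(cpu_percent))  # → [6]
```

In practice you would forward the flagged samples to whatever notification system you use, rather than printing them.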
The performance efficiency guidelines and practices emphasize using the right service at the right capacity on the cloud.
These guidelines help you gain the maximum benefit for the dollars you spend on your cloud infrastructure.
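To make the "Set the right size" guideline concrete, here is a minimal sketch of a threshold-based scaling decision. Cloud providers implement auto-scaling for you, so this is only a model of the trade-off. The function name, the 70% and 30% thresholds, and the instance limits are assumptions for illustration.

```python
def desired_instances(current, utilization_pct,
                      scale_out_above=70.0, scale_in_below=30.0,
                      min_instances=2, max_instances=10):
    """Decide the next instance count from the current average utilization."""
    if utilization_pct > scale_out_above:
        target = current + 1   # overly stressed: add capacity
    elif utilization_pct < scale_in_below:
        target = current - 1   # under-utilized: stop paying for idle capacity
    else:
        target = current       # within the healthy band: do nothing
    # Never go below the floor or above the ceiling.
    return max(min_instances, min(max_instances, target))


for load in (85.0, 50.0, 20.0):
    print(load, desired_instances(4, load))
```

Lowering `scale_out_above` triggers scaling earlier and costs more. Raising it saves money but risks degraded performance under a sudden load.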
Pillar #2: Operational excellence
Operating cloud resources can be a challenging task due to the scale and complexity of software applications.
The operational excellence pillar defines a set of guidelines on how you should approach this challenge.
- Treat your infrastructure as code (IaC)
Manage all your cloud resources with IaC. Make changes only via the IaC tool. Do not make ad-hoc changes in the production setup bypassing the IaC. Keep your IaC code in version control so you can revert to the previous state at any time.
- Automate every (repetitive) thing
Make it your goal to automate every repetitive task. Continuing to perform a repetitive task manually is an anti-pattern in the cloud: it does not scale and is error-prone.
- Eliminate guesswork in troubleshooting
Collect and store metrics and logs from all your cloud resources. When a problem occurs, use the metrics and logs to pinpoint it. Guessing at the cause of a problem is time-consuming and becomes impractical at scale.
- Run granular operations
Make changes in cloud infra one at a time so that if a problem occurs, it is easy to locate and easy to revert. Granular changes can increase the frequency of operations. But that’s OK. Deal with this increased frequency via automation.
- Experiment in a sandbox
All changes intended for production must first be tested and verified in a sandbox or staging environment. Apply the changes in the production setup only after testing thoroughly in the sandbox.
- Refine continuously
Your operational practices need to evolve. Do not expect things to stay the same. Your customers change, the business requirements change, and the scale changes. Your operational practices must evolve accordingly.
The volume of operational work tends to increase as the cloud infra scales. These guidelines help you manage that increased volume so that you can keep operating sustainably at scale.
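The "Run granular operations" and "Automate every (repetitive) thing" guidelines can be combined in a small sketch. It applies changes one at a time, checks health after each, and reverts the change that broke things. The state is a plain in-memory dictionary standing in for real infrastructure, and all names here are hypothetical. A real setup would drive an IaC tool instead.

```python
def apply_one_at_a_time(state, changes, is_healthy):
    """Apply changes one by one. Revert and stop at the first change
    that leaves the system unhealthy.

    Each change is a (name, apply, revert) tuple of callables that
    take the state. Returns (names_applied, name_of_failed_change_or_None).
    """
    applied = []
    for name, apply, revert in changes:
        apply(state)
        if not is_healthy(state):
            revert(state)          # only one change to undo, so this is easy
            return applied, name
        applied.append(name)
    return applied, None


# Hypothetical configuration and changes.
config = {"replicas": 2, "timeout_s": 30}
changes = [
    ("scale-replicas", lambda s: s.update(replicas=3), lambda s: s.update(replicas=2)),
    ("bad-timeout", lambda s: s.update(timeout_s=0), lambda s: s.update(timeout_s=30)),
    ("never-reached", lambda s: s.update(replicas=5), lambda s: s.update(replicas=3)),
]

applied, failed = apply_one_at_a_time(config, changes, lambda s: s["timeout_s"] > 0)
print(applied, failed, config)
```

Because each step is small, the failing change is identified by name and the earlier, healthy change stays in place.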
Pillar #3: Resiliency
The resiliency practices make your software applications immune to failures in cloud resources.
- Set resiliency goals
Your cloud infra should be resilient enough to support your business goals. If you over-engineer resiliency, the cost will increase with no added business benefit.
- Automate failure recovery
Automate the fallback and recovery procedures so that the system recovers from a failure with no manual intervention. Use logging and metrics to record failures. These logs and metrics can give you valuable insights for future decisions.
- Practice failure recovery
The worst thing that can happen during a failure is for the recovery mechanism itself to malfunction. Test failure recovery regularly so that you know your recovery mechanisms kick in when a failure occurs.
- Back up your data always
Cloud providers offer many storage services. Most of these services have inherent redundancy mechanisms. And that’s a good thing. But, do not rely solely on that redundancy. Back up your data regularly.
Your software application must keep serving your customers no matter what happens in your cloud infra. Use resiliency practices to support this goal. Build a resilient architecture that can eliminate the impact of common failures and minimize the impact in extreme cases.
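Here is a minimal sketch of automated failure recovery, assuming a primary endpoint and a standby that can serve the same request. The endpoint functions are hypothetical stand-ins for real services. The point is only the pattern: retry, fall back, and record every failure.

```python
import logging

logger = logging.getLogger("recovery")


def call_with_failover(endpoints, request, max_attempts_per_endpoint=2):
    """Try each (name, handler) endpoint in order and return the first
    successful response together with the list of recorded failures."""
    failures = []
    for name, handler in endpoints:
        for attempt in range(1, max_attempts_per_endpoint + 1):
            try:
                return handler(request), failures
            except Exception as exc:  # broad on purpose: this is a sketch
                failures.append((name, attempt, str(exc)))
                logger.warning("endpoint %s failed (attempt %d): %s",
                               name, attempt, exc)
    raise RuntimeError(f"all endpoints failed: {failures}")


def primary(request):
    # Simulated outage, so the recovery path is actually exercised.
    raise ConnectionError("primary unreachable")


def standby(request):
    return f"standby handled {request}"


response, failures = call_with_failover(
    [("primary", primary), ("standby", standby)], "ping")
print(response)
print(failures)
```

Simulating the outage, as `primary` does here, is also a small-scale way to practice failure recovery: you confirm that the fallback really kicks in.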
Pillar #4: Cost optimization
The cloud gives you access to cloud resources at the click of a button. But this agility can easily be misused. If you keep on enhancing your cloud infra, soon you will be paying more than what is justifiable for your business.
The guidelines in the cost optimization pillar help you overcome this problem.
- Use cloud resources with a purpose
Make sure that every computing, storage, or networking resource in your cloud infra serves a clear business objective. If a resource does not, remove it. Over-engineering is a common problem in the cloud. Be aware of it and architect your cloud strictly to support your business objectives.
- Analyze granular costs
You cannot optimize your cloud cost by looking only at the big picture - the summary of your monthly cloud bill. Dig into the details. Visualize how each part of your workload contributes to the overall cost. Only then will you be able to optimize your cloud bill.
- Refine continuously
Cost optimization is not a one-time project. Continuously refine your cloud architecture to optimize costs.
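As a sketch of what "Analyze granular costs" can look like, the snippet below groups billing line items by an attribute such as team or service. The line items and their field names are made up for illustration. Real billing exports differ from provider to provider.

```python
from collections import defaultdict


def cost_by(line_items, key):
    """Sum the cost of line items per value of `key`, largest first.
    Items that lack the key are grouped under "untagged"."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item.get(key, "untagged")] += item["cost"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical monthly line items (cost in USD).
line_items = [
    {"service": "compute", "team": "checkout", "cost": 120.0},
    {"service": "storage", "team": "checkout", "cost": 45.5},
    {"service": "compute", "team": "search", "cost": 80.0},
    {"service": "network", "cost": 30.0},  # no team tag
]

print(cost_by(line_items, "team"))
print(cost_by(line_items, "service"))
```

The "untagged" bucket is worth watching. A resource that nobody claims is a candidate for the "Use cloud resources with a purpose" review.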
Pillar #5: Security
The security pillar defines a set of overarching guidelines on how to secure your cloud resources.
- Centrally manage user accounts and authorization policies
Use an Identity and Access Management (IAM) system to manage all users in a central repository. It allows you to centrally implement policies to control access to all your cloud resources.
- Log all account activities
Users can access your cloud via GUI, CLI, or API. Log all user activities done via any of these interfaces. Make sure to protect the logs so that users cannot tamper with them.
- Grant minimum authorization level
Grant your users the minimum level of authorization to get the job done. It reduces the possibility of human error.
- Security harden all cloud resources
The cloud service provider is responsible only for the infrastructure that underlies the cloud. The security of the cloud resources you run on top of it is your responsibility. So, make sure to apply security hardening to all your cloud resources, such as OSes, containers, software frameworks, databases, and firewalls. Also, routinely apply security fixes to all of them.
- Allow only minimum required network access
Restrict all network access to your cloud resources using firewalls, ACLs, etc. Allow only the minimum required IP/port combinations so that the attack surface is reduced.
- Audit regularly
Compare the actual configurations with the desired state of your cloud resources to detect unsolicited changes. If any such change is detected, use logs to locate the culprit. This should not be an annual or monthly audit but should be done at least daily using configuration management tools.
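The comparison of desired and actual state can be sketched as a simple dictionary diff. This is only a model of what configuration management tools do. The setting names and values below are hypothetical.

```python
MISSING = "<missing>"


def detect_drift(desired, actual):
    """Return {setting: (desired_value, actual_value)} for every setting
    whose actual value differs from the desired one."""
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical desired state (from version control) and observed state.
desired = {"instance_type": "small", "open_ports": [443], "encrypted": True}
actual = {"instance_type": "small", "open_ports": [22, 443],
          "encrypted": True, "public_ip": "enabled"}

print(detect_drift(desired, actual))
```

Run on a schedule, a check like this surfaces a newly opened port or an unexpected setting within a day, not at the next manual audit.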
In line with the automation guideline in the operational excellence pillar, security-related activities must also be automated using tools from cloud service providers or third-party suppliers.
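The "Allow only minimum required network access" guideline boils down to an explicit allow list with everything else denied. Here is a minimal sketch using Python's standard `ipaddress` module. The rules themselves are hypothetical, and in a real setup they would live in your provider's firewall or ACL configuration, not in application code.

```python
import ipaddress

# Hypothetical allow list of (source CIDR, destination port) pairs.
ALLOW_RULES = [
    ("10.0.0.0/16", 5432),  # database reachable only from the private network
    ("0.0.0.0/0", 443),     # HTTPS open to everyone
]


def is_allowed(source_ip, port, rules=ALLOW_RULES):
    """Default deny: allow traffic only if some rule matches both the
    source address and the destination port."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(cidr) and port == allowed_port
        for cidr, allowed_port in rules
    )


print(is_allowed("10.0.3.7", 5432))     # private host to database
print(is_allowed("203.0.113.9", 5432))  # internet host to database
print(is_allowed("203.0.113.9", 443))   # internet host to HTTPS
```

Every rule you add widens the attack surface, so each entry should be traceable to a concrete need.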
Wrap up
The five-pillar framework is the North Star in your cloud journey.
Yes, being on the cloud is a journey. Cloud tech keeps evolving. Your business requirements keep changing. As such, you need to be continuously updating your cloud architecture.
So, whether you are just starting or halfway through your cloud journey, this framework matters. Align your cloud architectural design with the guidelines in the five-pillar framework so you can build on a foundation that’s proven in the cloud industry.
The guidelines we provide here are in an abstract form. That’s intentional, so that you can convert them to actionable tasks using the tools and features available from your cloud provider.
Make sure that the five-pillar framework is not an afterthought in your cloud architecture. It must be at the core.