To help our customers on their path to success with Google Cloud, we published the Google Cloud Architecture Framework – a set of canonical best practices for building and operating workloads that are secure, efficient, resilient, high performing, and cost effective.
Today, we’re diving deeper into the Architecture Framework System Design pillar, including the four key principles of system design and recent improvements to our documentation. We’ll also expand on the new space of the Google Cloud Community dedicated to the Architecture Framework, which was created to help you achieve your goals with a global community of supportive and knowledgeable peers, Googlers, and product experts.
What is system design?
The System Design pillar is the foundation of the Architecture Framework. It covers the Google Cloud products, features, and design principles that help you define the architecture, components, and data you need to satisfy your business and system requirements.
The System Design concepts and recommendations can be further applied across the other five pillars of the Architecture Framework: Operational Excellence; Security, Privacy, and Compliance; Reliability; Cost Optimization; and Performance Optimization.
You can evaluate the current state of your architecture against the guidance provided in the System Design Pillar to identify potential gaps or areas for improvement.
System design core principles
A robust system design is secure, reliable, scalable, and independent, enabling you to apply changes atomically, minimize potential risks, and improve operational efficiency. To achieve a robust system design, we recommend you follow four core principles:
Document everything
When customers are looking to move to the cloud or are starting to build their applications, one of the biggest blockers to success we see is a lack of documentation. This is especially true when it comes to correctly visualizing current architecture deployments.
A properly documented cloud architecture helps establish a common language and standards, enabling your cross-functional teams to communicate and collaborate effectively. It also provides the information needed to identify and guide future design decisions that power your use cases.
Your design decisions will grow and change, and the change history provides the context your teams need to align initiatives, avoid duplication, and measure performance changes effectively over time. Change logs are particularly valuable when you’re onboarding a new cloud architect who is not yet familiar with your current system design, strategy, or history.
Simplify your design (use fully managed services)
When it comes to system design, simplicity is key. If your architecture is too complex to understand, your developers and operations teams can face complications during implementation or ongoing management. Wherever possible, we highly recommend using fully managed services to reduce the risk, time, and effort involved in managing and maintaining baseline systems.
If you’re already running your workloads in production, testing managed service offerings can help you reduce operational complexity. If you’re starting from scratch, start simple, establish an MVP, and resist the urge to over-engineer. You can identify edge cases, iterate, and improve your systems incrementally over time.
Decouple your architecture
Decoupling is a technique for separating your applications and service components – such as a monolithic application stack – into smaller components that can operate independently. A decoupled architecture can therefore run its functions independently, irrespective of its various dependencies.
With a decoupled architecture, you have increased flexibility to apply independent upgrades, enforce specific security controls, establish reliability goals, monitor health, and control granular performance and cost parameters.
You can start decoupling early in your design phase or incorporate it as part of your system upgrades as you scale.
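As a rough illustration of the idea, the sketch below separates two components with a queue so that neither calls the other directly. Everything in it is an assumption made for the example: the `OrderService`, `BillingService`, and `OrderPlaced` names are hypothetical, and Python's in-process `queue.Queue` merely stands in for whatever managed messaging service you would use between real components.

```python
from __future__ import annotations

import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event exchanged between the two components."""
    order_id: str
    amount_cents: int


class OrderService:
    """Producer: accepts orders and publishes events; knows nothing about billing."""

    def __init__(self, events: queue.Queue) -> None:
        self._events = events

    def place_order(self, order_id: str, amount_cents: int) -> None:
        self._events.put(OrderPlaced(order_id, amount_cents))


class BillingService:
    """Consumer: drains events on its own schedule; knows nothing about ordering."""

    def __init__(self, events: queue.Queue) -> None:
        self._events = events
        self.total_billed_cents = 0

    def drain(self) -> int:
        """Process every event currently queued and return how many were handled."""
        handled = 0
        while True:
            try:
                event = self._events.get_nowait()
            except queue.Empty:
                return handled
            self.total_billed_cents += event.amount_cents
            handled += 1


def run_demo() -> tuple[int, int]:
    # The queue is an in-process stand-in for a managed message service (assumption).
    events: queue.Queue = queue.Queue()
    orders = OrderService(events)
    billing = BillingService(events)
    # Orders are accepted even though billing has not processed anything yet.
    orders.place_order("a-1", 1250)
    orders.place_order("a-2", 800)
    handled = billing.drain()
    return handled, billing.total_billed_cents
```

Because the only shared contract is the event type, either side could be upgraded, scaled, or monitored separately, which is the flexibility described above.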
Use a stateless architecture
To perform a task, stateful applications rely on various dependencies, such as locally cached data, and often require additional mechanisms to capture progress and survive restarts.
On the other hand, stateless applications can perform tasks without significant local dependencies by using shared storage or caching services. This enables your applications to scale up quickly with minimal boot dependencies, thereby withstanding hard restarts, reducing downtime, and maximizing service performance for end users.
The System Design pillar describes recommendations for making your applications stateless, or for using cloud-native features to better capture machine state for your stateful applications.
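To make the contrast concrete, here is a minimal sketch of a stateless component that keeps its progress in an external store rather than in local memory. The names (`StateStore`, `InMemoryStore`, `StatelessCounter`) are hypothetical, and the dictionary-backed store is only a stand-in for shared storage or a cache service; it is not taken from the Architecture Framework itself.

```python
from typing import Dict, Protocol


class StateStore(Protocol):
    """Minimal interface the application needs from external state (assumption)."""

    def get(self, key: str) -> int: ...

    def set(self, key: str, value: int) -> None: ...


class InMemoryStore:
    """Stand-in for shared storage or a cache service, used only for this demo."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str) -> int:
        # Unknown keys are treated as a count of zero.
        return self._data.get(key, 0)

    def set(self, key: str, value: int) -> None:
        self._data[key] = value


class StatelessCounter:
    """Keeps no progress on the instance; every call reads and writes the store."""

    def __init__(self, store: StateStore) -> None:
        self._store = store

    def record_visit(self, user_id: str) -> int:
        visits = self._store.get(user_id) + 1
        self._store.set(user_id, visits)
        return visits


def simulate_restart() -> int:
    store = InMemoryStore()
    first_instance = StatelessCounter(store)
    first_instance.record_visit("user-42")
    first_instance.record_visit("user-42")
    del first_instance  # the instance is lost, as in a hard restart
    # A replacement instance picks up where the first one left off.
    replacement = StatelessCounter(store)
    return replacement.record_visit("user-42")
```

The replacement instance continues counting from the stored value because nothing important lived on the instance that was lost. Note that the read-then-write in `record_visit` is not atomic; with several instances running concurrently, a real shared store would need an atomic increment or similar safeguard.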
System design principles applied across other pillars
The core System Design principles can be applied across the other five pillars of the Architecture Framework: Operational Excellence; Security, Privacy, and Compliance; Reliability; Cost Optimization; and Performance Optimization. Here are a few examples of how this looks in practice.
- Operational Excellence: Use fully managed and highly available operational tools to deploy and monitor your workloads, so you can minimize the operational overhead of maintaining and optimizing them.
- Security, Privacy, and Compliance: Apply security controls at the component level. By decoupling and isolating components, you can apply fine-grained governance controls to effectively manage compliance and minimize the blast radius of potential security vulnerabilities.
- Reliability: Design for high availability and scalability. A decoupled architecture enables you to define and control granular reliability goals, so you can maximize the durability, scalability, and availability of your critical services while optimizing non-critical components as you go.
- Cost Optimization: Define budgets and design for cost efficiency. Cost usually becomes a significant factor as you define reliability goals, so it’s important to consider various cost metrics early on when you’re designing your applications. A decoupled architecture helps you enforce granular cost budgets and controls, thereby improving operational efficiency and cost optimization.
- Performance Optimization: Optimize your design for speed and performance. As you design your service availability within your cost budget, make sure you also consider performance metrics. Various operational tools provide insights into performance bottlenecks and highlight opportunities to improve performance efficiency.
These are just a few examples, but you can see how the System Design principles can be expanded into various other use cases across the other five pillars of the Architecture Framework.
The Architecture Framework is now part of The Google Cloud Community
The Google Cloud Community is an innovative, trusted, and vibrant hub for Google Cloud users to ask questions and find answers, engage and build meaningful connections, share ideas and have an impact on product roadmaps, as well as learn new skills and develop expertise.
Today, we’re announcing the launch of a new space in the Google Cloud Community dedicated to the Architecture Framework. In this space, you can:
- Access canonical articles that provide practical guidance and address specific questions and challenges related to the System Design pillar. We’ll be releasing articles focused on the remaining five pillars in the coming months.
- Engage in open discussion forums where members can ask questions and receive answers.
- Participate in Community events, such as our “Ask Me Anything” series, where we’ll host a webinar on a specific topic of the Architecture Framework and open it up for questions from the audience.
Together, the Google Cloud Community and Architecture Framework provide a trusted space for you to achieve your goals alongside a global community of supportive and knowledgeable peers, Googlers, and product experts.
Explore the new space of the Community today and, if you haven’t already, sign up to become a member so you can take full advantage of all the opportunities available.
What’s new for System Design 2.0?
Earlier this year, we released an updated version (2.0) of the Architecture Framework, and we’ve been continuing to enhance our catalog of best practices based on feedback from our global partner and customer base, as well as our team of Google product experts.
Here’s what’s new in the System Design Pillar:
- Resource labels and tags best practices were added to simplify resource management.
- The compute section is now reorganized to focus on choosing, designing, operating, and scaling compute workloads.
- The database section is reorganized into topics such as selecting, migrating, and operating database workloads, and highlights best practices around workflow management.
- The data analytics section now includes sections on data lifecycle, data processing, and transformation.
- A new section on artificial intelligence (AI) and machine learning (ML) was added, covering best practices for deploying and managing ML workloads.
As always, we welcome your feedback so we can continue to improve and support you on your path to success with Google Cloud.
Special thanks to Andrew Biernat, Willie Turney, Lauren van der Vaart, Michelle Lynn, and Shylaja Nukala for helping host the Architecture Framework on the Google Cloud Community site, and to Minh “MC” Chung, Rachel Tsao, Sam Moss, Nitin Vashishtha, Pritesh Jani, Ravi Bhatt, Olivia Zhang, Zach Seils, Hamsa Buvaraghan, Maridi Makaraju, Gargi Singh, and Nahuel Lofeudo for helping make the System Design content a success!
By: Omkar Suram (Solutions Engineer, Project Lead – Architecture Framework) and Rob Rosen (Director Solutions Architecture, Google Cloud)
Source: Google Cloud Blog