aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Data
  • Engineering

How Spanner And BigQuery Work Together To Handle Transactional And Analytical Workloads

  • aster.cloud
  • March 27, 2022
  • 6 minute read

As businesses scale to meet the demands of their customers, so do their need for efficient products to collect, manage and analyze data to meet their business goals. Whether you are building a multi-player game or a global e-commerce platform, it’s critical to ensure that data can be stored and queried at scale with strong consistency and then processed for analysis to deliver real-time insights.

In this blog we’ll discuss how Cloud Spanner and BigQuery are a match made in heaven, and can be used together to process transactions at scale and generate real-time insights to deliver amazing customer experiences. Using Cloud Spanner and BigQuery also allows customers to build their data clouds using Google Cloud, a unified, open approach to data-driven transformation that is unmatched for speed, scale, and security.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

Industry leading speed, scalability and reliability

Spanner is Google Cloud’s fully managed relational database optimized for transactional workloads.

 

(Click to enlarge)

 

Spanner provides seamless replication across regions in Google Cloud. It also has strong external consistency, unlimited scale and processes over 1 billion requests per second at peak. Customers can start Spanner with 100 Processing Units at $65/month and will not have to worry about the scalability of their database even if their business grows 100x.

BigQuery is a serverless, highly scalable, and cost-effective multi cloud data warehouse designed for business agility.

(Click to enlarge)

 

It is optimized to remove the limits for ad-hoc analysis and reporting, which makes it best suited for gaining organizational insights. BigQuery customers analyze over 110 terabytes (TB) of data per second. BigQuery and Spanner are powerful tools independently, but seamlessly work together to execute transactional and analytical workloads and handle high throughput needs.

Built on proven infrastructure

Spanner and BigQuery can scale independently from each other in both compute and storage resources as workload demands change. Historically databases have been architected with tightly coupled storage and compute, but Spanner and BigQuery are architected with separate compute and storage. Both products are built on Colossus, Google’s distributed storage system, that powers Google’s most popular, globally available products, like YouTube, Search, Maps, and Gmail.

Colossus provides high durability, availability, performance and scalability globally for the services built on it. Users never have to worry about running out of capacity during peak seasons. For example, year end financial planning and forecasting exercises that run many complex queries against large datasets or the massive retail transaction volumes during Cyber Monday can all be handled by both services.

Read More  What's Google Talking About? Google-y Podcast Weekend Listening

Spanner and BigQuery can scale up or down independently or together as needed by users. Both products are also built on Borg, Google’s internal cluster management system and Jupiter, in-house custom network hardware and software that connects all servers in Google’s data centers.

 

 

Real time analytics made easy

Users have long performed analytics against transactional data by extracting required data from OLTP databases into analytics (OLAP) systems, with homegrown scripts or external ETL/ELT tools. This worked well for many years, but as workloads grew from tens to millions of transactions per second and analytics expanded from weekly and daily static reports to include ad hoc queries against the latest transactional data, entire teams were dedicated to maintain the integration of these systems. Spanner and BigQuery were architected to unify the data lifecycle and quickly enable analytics in BigQuery using the latest transactional data from Cloud Spanner.

Depending on the use case, users have the option of ingesting data into BigQuery as physical tables first or querying the data in Spanner directly when needed. In both cases, users can leverage query federation for a simple setup. Users only need to configure an external data source in BigQuery pointing to the intended Spanner instance and write the appropriate queries. These queries may be used to populate a BigQuery table on demand or scheduled to run as needed. Alternatively it can be joined with another BigQuery result set dynamically. There is no additional maintenance or dependency on other components for BigQuery users to be able to consume the latest and greatest OLTP data from Spanner. In cases where there may be more complex transformations or external dependencies, users may also ingest Spanner data into BigQuery using a service like DataFlow.

Sample BigQuery and Spanner usage scenario

To get a better picture of how BigQuery and Spanner are used together, look at a sample scenario of an online gaming startup company Cymbal with just over a million players around the world.

(Click to enlarge)

 

 

 

 The company is fictional but the scenario is real. Cymbal stores game data in Spanner including user profile, inventory items, customizations, actions and other gaming activities. This information is physically located in Colossus in Spanner’s file format designed for efficiency and scalability. In addition to high transaction rates to support game play data, Cymbal also performs analytics on player behavior including how they interact with other players, equipment, even non-player characters. Cymbal analysts routinely need to combine competition data already in BigQuery with data in Spanner. Cymbal has two ways to accomplish this. They can replicate data from Spanner into BigQuery and perform analytics against local data or they can use federated queries to retrieve data from Spanner on-demand.

Read More  Connecting Urban Environments With IoT And Digital Twins

Replicate data

As mentioned earlier, users can leverage federated query to read data from Spanner and write to a native BigQuery table. This is most commonly used when data replication for analytics is needed infrequently and the data can be minutes to days old. This works well for some of Cymbal’s analysts and we will dive deeper into federated queries in the next section including step-by-step details on setting it up. Other Cymbal analysts need to have data refreshed daily from Spanner and do not want to manage this themselves. In this case, Cymbal admins can configure a recurring job to replicate data from Spanner to BigQuery using federated queries, a service like Google DataFlow or one of the many solutions by Google Cloud partners.

Federated queries

BigQuery users can easily run federated queries against data residing in Spanner on-demand. The data can be consumed in real-time as part of a bigger analytics query. The on-demand nature of federated queries provides users flexibility and control over how fresh they need the data to be. No need to wait for the next scheduled execution of a data movement (ETL/ELT) job to complete.

Although federated queries are powerful there are some limitations. A federated query might not be as fast as querying local BigQuery tables. There may be higher latency because of the small wait time for the source database to execute the external query and then move the resultset from Spanner to BigQuery. Federated queries really are just another workload on the Spanner instance that consumes compute resources so users should make sure it does not negatively impact the existing OLTP workload..

Read More  Achieving Autonomic Security Operations: Why Metrics Matter (But Not How You Think)

How to run federated queries

The steps to run federate queries are simple!

1) Start by launching BigQuery and choosing the Google Cloud project that contains the Spanner instance that includes the database you need.
2) Then set up an external data source for the Spanner database in BigQuery. You’ll need to have bigquery.admin permissions to set this up.

 

 

Note: The “Read data in parallel” checkbox will enable Spanner to divide a federated query into smaller partitions for parallel execution. This option is restricted to queries whose first operator in the execution plan is a distributed union operator. Other queries will return an error. To learn more, see  Understand how Cloud Spanner executes queries.

3) Finally write a query in BigQuery that accesses the data in the Spanner data source. If you’d like other users to access this external data source in BigQuery, simply grant them permission to use the connection resource you just created.

 

When to use which?

If your use case has a clear scope of the Spanner data to be used in BigQuery, fixed data freshness requirements (e.g. 15 minutes or 1 day), have a large number of users of the same dataset or only want to share a specific dataset (not the entire database), replicating data to local BigQuery tables might be the right approach. This provides security and resource isolation of your BigQuery analytic workload from your Spanner OLTP workload. If your use case involves mostly ad hoc queries that are unpredictable in the data to be retrieved, when the queries might run and can allow permitted BigQuery users access to all Spanner data, then federated queries might be the better option. To learn more about the topic in this blog post, please visit the links below.

Learn more

  • To get started with Spanner, create an instance or try it out with a Spanner Qwiklab.
  • To learn more about Spanner federated queries with BigQuery, visit this tutorial.
  • To learn more about how to replicate data from Spanner To Big Query, visit this blog post.
  • To learn more about BigQuery, make sure to check out these training and tutorials

 

 

By: Bukola Ayodele (Developer Advocate)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • BigQuery;
  • Cloud Spanner
  • Google Cloud
You May Also Like
Getting things done makes her feel amazing
View Post
  • Computing
  • Data
  • Featured
  • Learning
  • Tech
  • Technology

Nurturing Minds in the Digital Revolution

  • April 25, 2025
View Post
  • Engineering
  • Technology

Guide: Our top four AI Hypercomputer use cases, reference architectures and tutorials

  • March 9, 2025
View Post
  • Computing
  • Engineering

Why a decades old architecture decision is impeding the power of AI computing

  • February 19, 2025
View Post
  • Engineering
  • Software Engineering

This Month in Julia World

  • January 17, 2025
View Post
  • Engineering
  • Software Engineering

Google Summer of Code 2025 is here!

  • January 17, 2025
View Post
  • Data
  • Engineering

Hiding in Plain Site: Attackers Sneaking Malware into Images on Websites

  • January 16, 2025
View Post
  • Computing
  • Design
  • Engineering
  • Technology

Here’s why it’s important to build long-term cryptographic resilience

  • December 24, 2024
IBM and Ferrari Premium Partner
View Post
  • Data
  • Engineering

IBM Selected as Official Fan Engagement and Data Analytics Partner for Scuderia Ferrari HP

  • November 7, 2024

Stay Connected!
LATEST
  • college-of-cardinals-2025 1
    The Definitive Who’s Who of the 2025 Papal Conclave
    • May 7, 2025
  • conclave-poster-black-smoke 2
    The World Is Revalidating Itself
    • May 6, 2025
  • oracle-ibm 3
    IBM and Oracle Expand Partnership to Advance Agentic AI and Hybrid Cloud
    • May 6, 2025
  • 4
    Conclave: How A New Pope Is Chosen
    • April 25, 2025
  • Getting things done makes her feel amazing 5
    Nurturing Minds in the Digital Revolution
    • April 25, 2025
  • 6
    AI is automating our jobs – but values need to change if we are to be liberated by it
    • April 17, 2025
  • 7
    Canonical Releases Ubuntu 25.04 Plucky Puffin
    • April 17, 2025
  • 8
    United States Army Enterprise Cloud Management Agency Expands its Oracle Defense Cloud Services
    • April 15, 2025
  • 9
    Tokyo Electron and IBM Renew Collaboration for Advanced Semiconductor Technology
    • April 2, 2025
  • 10
    IBM Accelerates Momentum in the as a Service Space with Growing Portfolio of Tools Simplifying Infrastructure Management
    • March 27, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • 1
    Tariffs, Trump, and Other Things That Start With T – They’re Not The Problem, It’s How We Use Them
    • March 25, 2025
  • 2
    IBM contributes key open-source projects to Linux Foundation to advance AI community participation
    • March 22, 2025
  • 3
    Co-op mode: New partners driving the future of gaming with AI
    • March 22, 2025
  • 4
    Mitsubishi Motors Canada Launches AI-Powered “Intelligent Companion” to Transform the 2025 Outlander Buying Experience
    • March 10, 2025
  • PiPiPi 5
    The Unexpected Pi-Fect Deals This March 14
    • March 13, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.