aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Data
  • Engineering
  • Practices

Cloud Wisdom Weekly: 6 Tips To Optimize Data Management And Analytics

  • aster.cloud
  • September 29, 2022
  • 6 minute read

“Cloud Wisdom Weekly: for tech companies and startups” is a new blog series we’re running this fall to answer common questions our tech and startup customers ask us about how to build apps faster, smarter, and cheaper. In this installment, Google Cloud Big Data & Analytics Consultant Julianne Cuneo explores how to get started using BigQuery effectively.

Working with large amounts of data – like those encountered with traditional data warehouses and data lakes – can be challenging, complex, expensive, and reliant on specialized skills that can be difficult to source. To compete in today’s customer-centric and data-driven marketplaces, these challenges are critical to overcome.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

Analyzing data at scale is crucial to this effort, but so is managing costs and resources. Many businesses are thus looking to the cloud to find solutions and strike the right balance. In this article, we will explore how growing tech companies and startups leverage BigQuery for innovation, and we will share tips that will help you do more with Google’s industry-leading enterprise cloud data warehouse.

Optimizing data management and analytics

Oftentimes, companies rush into loading data and running queries for the sake of seeing how a new technology will perform. This is reasonable for a quick proof-of-concept or evaluation, but it doesn’t necessarily set you up for success in the long term, as you’re encouraged to be more sophisticated in your approach to business, security, and budgetary needs. The below tips will help you set up a strong, scalable foundation, including specific examples of how to optimize a data platform architecture with BigQuery.

1. Independently scale storage and compute

When it comes to handling massive amounts of data, having the right storage capabilities is one of the biggest challenges. Assuming you can afford the cost associated with maintaining large volumes of information, effectively analyzing and extracting value from it can be even more daunting. A serverless architecture can help you overcome these challenges in a couple ways.

First, serverless platforms such as BigQuery separate compute and storage, letting you pay independently for the resources you use, flexibly scaling up or down as your data needs change. Whereas some services bundle resources such that you get (and pay for) more compute and storage than you need, this approach makes storing large amounts of data more cost-effective and therefore more feasible

Read More  Shorten The Path To Insights With Aiven For Apache Kafka And Google BigQuery

Second, if you can afford to store more data, you create more potential for insights. To that end, BigQuery’s scalable compute capacity allows you to query terabytes or even petabytes of data in a single request.

Combined, these capabilities enable you to scale analytics efforts according to your needs, rather than a predefined amount of storage or compute resources.

2. Carefully organize storage and datasets

Providing secure and consistent data access to the necessary people at the right cost is another crucial aspect of data management and analytics. Appropriately planning for resource optimization can save time and circumvent security, billing, and workflow problems down the road. For instance, in BigQuery’s resource organization, key design considerations include:

  • Datasets and their objects (e.g., tables, views, ML models, etc.) only belong to a single project. This is the project to which that dataset’s storage costs will be billed. Peruse this resource to consider whether you’d want to implement a centralized data warehouse approach, allocate data marts to individual projects, or mix both approaches.
  • Access to objects in BigQuery can be controlled at the dataset, table, row, and column level, which should also be factored into your storage design (e.g., grouping closely-related objects in the same dataset to simplify access grants).

3. Optimize compute cost and performance across teams and use cases

Some use cases may require precise cost controls or resource planning to meet tight service-level agreements (SLAs). In BigQuery, for instance, data only belongs to a single project, but can be queried from anywhere, with compute resources billed to the project that runs the query, regardless of data location. Therefore, to granularly track query usage, you can create individual projects for different teams (e.g., finance, sales) or use cases (e.g., BI, data science).

In addition to segmenting your compute projects by team or use case for billing purposes, you should think about how you may want to control compute resources across projects for workload management. In BigQuery, you can use “slot commitments” to switch between an on-demand model and a flat-rate billing model, including mixing and matching approaches to balance on-demand efficiency with flat-rate predictability. “Slot commitments” are dedicated compute resources that can be further divided into smaller allocations (or “reservations”). These allocations can either be assigned to an individual project or shared by multiple projects, providing flexibility that allows you to reserve compute power for high-priority or compute-intensive workloads while enjoying cost savings over the on-demand query model.

Read More  Bio Farma Collaborates With Google Cloud And ‘Connectedlife With Fitbit’ To Empower Healthier Living And Chronic Disease Prevention Across Indonesia

For example, say your company has committed to 1,000 slots. You may choose to allocate 500 to your compute-intensive data science projects, 300 to ETL, and 200 to internal BI which has a more flexible SLA. Best of all, your idle slots aren’t isolated in a silo to be left unused. If your ETL projects aren’t using all of their 300 slots, these idle resources can be seamlessly shared with your other data science or BI projects until they are needed again.

4. Load and optimize your data schemas

Once you understand how your data will be organized, you can start populating your data warehouse. BigQuery provides numerous ways to ingest data through flat files in Google Cloud Storage, pre-built connectors to apps and databases through Data Transfer Service, streaming inserts, and compatibility with numerous third party data migration and ETL tools.

A few simple optimizations to your table schemas can help you achieve the best results. In most cases, this means applying partitioning and/or clustering based on your expected query patterns to significantly reduce the amount of data scanned by queries.

5. Unify your data investments

Your data and analysis needs might involve working with unstructured and semi-structured data alongside your more broadly-understood, structured data. For this, it is helpful to think beyond just “enterprise data warehouse” and broaden your focus to include solutions that provide a true, centralized data lake.

If you’re using BigQuery, the platform’s federation capabilities can seamlessly query data stored in Google services including Cloud Storage, Drive, Bigtable, Cloud SQL, and Cloud Spanner, as well as data in other clouds. BigQuery’s Storage API also gives other services such as Dataproc, Dataflow, ML, and BI tools fast access to BigQuery storage at high volumes. Features such as these can help ensure that your data efforts are part of a unified, consistent approach, rather than being splintered across platforms and teams.

Read More  Cloud Spanner Myths Busted

6. Run queries and have fun!

Once your data is available, it’s time to start querying! To make sure you don’t hit any snags, your platform should ideally provide an easy onramp that lets people get started right away.

As an ANSI-compliant solution, BigQuery SQL provides the average SQL developer with the ability to leverage their existing skills right from the start. There are also numerous third-party tools that provide native connectors to BigQuery or leverage BigQuery’s JDBC/ODBC drivers to author queries on the user’s behalf. If you have numerous SQL scripts from a previous data warehouse investment, BigQuery’s Migration Service can help automate translation of jobs coming from Teradata, Redshift, and several other services. These features allow you to make data available, protected, and smartly-budgeted, and helps ensure it can easily plug into user-friendly interfaces for analysis.

And if you’re making the move to BigQuery, be sure to take advantage of BigQuery’s unique features, rather than just moving existing queries and continuing to operate in the status quo. Run those large analyses that wouldn’t have been able to execute on another system. Try training a prototype machine learning model using SQL-based BigQuery ML. Query streaming data in real-time. Perform geospatial analysis with built-in GIS functions. It’s time to innovate.

Building a solid data foundation takes time and planning

The tips put forth in this article should help position your company for success in the near- and long-term, sparing you from the need to rearchitect your warehousing solution as your business matures. Deciding to put the time, effort, and monetary investment into any new technology requires careful evaluation, so we encourage you to get hands-on with BigQuery through quickstarts, and by visiting our Startups page or reaching out to Google Cloud experts.

 

 

By: Julianne Cuneo (Big Data & Analytics Cloud Consultant)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Best Practice
  • Data Analytics
  • Data Management
  • Google Cloud
  • Startups
You May Also Like
Getting things done makes her feel amazing
View Post
  • Computing
  • Data
  • Featured
  • Learning
  • Tech
  • Technology

Nurturing Minds in the Digital Revolution

  • April 25, 2025
View Post
  • Engineering
  • Technology

Guide: Our top four AI Hypercomputer use cases, reference architectures and tutorials

  • March 9, 2025
View Post
  • Computing
  • Engineering

Why a decades old architecture decision is impeding the power of AI computing

  • February 19, 2025
View Post
  • Engineering
  • Software Engineering

This Month in Julia World

  • January 17, 2025
View Post
  • Engineering
  • Software Engineering

Google Summer of Code 2025 is here!

  • January 17, 2025
View Post
  • Data
  • Engineering

Hiding in Plain Site: Attackers Sneaking Malware into Images on Websites

  • January 16, 2025
View Post
  • Computing
  • Design
  • Engineering
  • Technology

Here’s why it’s important to build long-term cryptographic resilience

  • December 24, 2024
IBM and Ferrari Premium Partner
View Post
  • Data
  • Engineering

IBM Selected as Official Fan Engagement and Data Analytics Partner for Scuderia Ferrari HP

  • November 7, 2024

Stay Connected!
LATEST
  • college-of-cardinals-2025 1
    The Definitive Who’s Who of the 2025 Papal Conclave
    • May 7, 2025
  • conclave-poster-black-smoke 2
    The World Is Revalidating Itself
    • May 6, 2025
  • 3
    Conclave: How A New Pope Is Chosen
    • April 25, 2025
  • Getting things done makes her feel amazing 4
    Nurturing Minds in the Digital Revolution
    • April 25, 2025
  • 5
    AI is automating our jobs – but values need to change if we are to be liberated by it
    • April 17, 2025
  • 6
    Canonical Releases Ubuntu 25.04 Plucky Puffin
    • April 17, 2025
  • 7
    United States Army Enterprise Cloud Management Agency Expands its Oracle Defense Cloud Services
    • April 15, 2025
  • 8
    Tokyo Electron and IBM Renew Collaboration for Advanced Semiconductor Technology
    • April 2, 2025
  • 9
    IBM Accelerates Momentum in the as a Service Space with Growing Portfolio of Tools Simplifying Infrastructure Management
    • March 27, 2025
  • 10
    Tariffs, Trump, and Other Things That Start With T – They’re Not The Problem, It’s How We Use Them
    • March 25, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • 1
    IBM contributes key open-source projects to Linux Foundation to advance AI community participation
    • March 22, 2025
  • 2
    Co-op mode: New partners driving the future of gaming with AI
    • March 22, 2025
  • 3
    Mitsubishi Motors Canada Launches AI-Powered “Intelligent Companion” to Transform the 2025 Outlander Buying Experience
    • March 10, 2025
  • PiPiPi 4
    The Unexpected Pi-Fect Deals This March 14
    • March 13, 2025
  • Nintendo Switch Deals on Amazon 5
    10 Physical Nintendo Switch Game Deals on MAR10 Day!
    • March 9, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.