aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • People

How To Use Google Cloud To Find And Protect PII

  • aster.cloud
  • September 27, 2022
  • 4 minute read

BigQuery is a leading data warehouse solution in the market today, and is valued by customers who need to gather insights and advanced analytics on their data. Many common BigQuery use cases involve the storage and processing of Personal Identifiable Information (PII)—data that needs to be protected within Google Cloud from unauthorized and malicious access.

Too often, the process of finding and identifying PII in BigQuery data relies on manual PII discovery and duplication of that data. One common way to do this is by taking an extract of columns used for PII and copying them into a separate table with restricted access. However, creating unnecessary copies of this data and processing it manually to identify PII increases the risks of failure and subsequent security events.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

In addition, the security of PII data is often mandated by multiple regulations and failure to apply appropriate safeguards may result in heavy penalties. To address this issue, customers need solutions that 1) identify PII in BigQuery and 2) automatically implement access control on that data to prevent unauthorized access and misuse, all without having to duplicate it.

This blog will discuss a solution developed by Google Professional Services for leveraging Google Cloud DLP to inspect and classify sensitive data and suggest a solution for using these insights to automatically tag and protect data in BigQuery tables.

BigQuery Auto Tagging solution overview

Automatic DLP can help to identify sensitive data, such as PII, in BigQuery. Organizations can leverage Automatic DLP to automatically search across their entire BigQuery data warehouse for tables that contain sensitive data fields and report detailed findings in the console (see Figure 1 below,) in Data Studio, and in a structured format (such as a BigQuery results table.) Newly created and updated tables can be discovered, scanned, and classified automatically in the background without a user needing to invoke or schedule it. This way you have an ongoing view into your sensitive data.

Read More  2022 Resolution: Learn Google Cloud, Free Of Charge
Figure 1: Visibility of sensitive data fields in BigQuery

 

In this blog, we show how a new open source solution called BigQuery Auto Tagging Solution solves our second goal—automating access control on data. This solution sits as a layer on top of Automatic DLP and automatically enforces column-level access controls to restrict access to specific sensitive data types based on user-defined data classification taxonomies (such as high confidentiality or low confidentiality) and domains (such as Marketing, Finance, or ERP System.) This solution minimizes the risk of unrestricted access to PII and ensures that there is only one copy of data maintained with appropriate access control applied down to the column level.

The code for this solution is available on Github at GoogleCloudPlatform/bq-pii-classifier. Please note that while Google does not maintain this code, you can reach out to your Sales Representative to get in contact with our Professional Services team for guidance on how to implement it.

BigQuery and Data Catalog Policy Tags (now Dataplex) have some limitations that you should be aware of before implementing this solution to ensure that it will work for your organization:

  • Taxonomies and Policy Tags are not shared across regions: If you have data in multiple regions you will need to create or replicate your taxonomy in each region that you want to apply policy tags.
  • Maximum number of 40 taxonomies per project: If you require different taxonomies for different business domains or have replications to support multiple Cloud regions those will count against this quota.
  • Maximum number of 100 policy tags per taxonomy: Cloud DLP supports up to 150 infoTypes for classification, however, a single policy taxonomy can only support up to 100 including any nested categories. If you need to support more than 100 data types, you may need to split these across more than one taxonomy.
Read More  Green Game Jam: Environmentally Conscious Gaming

High-level overview of the solution

Figure 2: High level architecture of solution

 

The solution is composed mainly of the following components: Dispatcher Requests topic, Dispatcher service, BigQuery Policy Tagger Requests topic, and BigQuery Policy Tagger service and logging components.

The Dispatcher Service is a Cloud Run service that expects a BigQuery scope to be expressed as inclusion and exclusion lists of projects, datasets, and tables. This Dispatcher service will query Automatic DLP Data Profiles to check if the tables in-scope have data profiles generated. For these tables, it will publish one request per table to the “BigQuery Policy Tagger Requests” PubSub topic. This topic enables rate limiting of BigQuery column tagging operations and apply auto-retries with backoffs.

The “BigQuery Policy Tagger” Service is also a Cloud Run service that receives the information of the DLP scan results of a BigQuery table. This service will determine the final InfoType of each column and apply the appropriate Policy Tags as defined in the InfoType – Policy Tags mapping. Only-one INFO_TYPE is selected and the function assigns the associated policy tag.

Lastly, all Cloud Run services maintain structured logs that are exported by a log sink to BigQuery. There are multiple BigQuery views that help with monitoring and debugging Cloud Run call chains and tagging actions on columns.

Deployment options

After deploying the solution, it can be used in two different ways:

[Option 1] Automatic DLP-triggered immediate tagging:

Figure 3: Deployment option 1 – Automatic DLP triggered tagging and inspection

 

Automatic DLP is configured to send a Pub/Sub notification on each inspection job completion. The Pub/Sub notification includes the resource name and it triggers the Tagger service directly.

Read More  Helping European Education Providers Navigate Privacy Assessments

[Option 2] Scheduled tagging:

Figure 4: Deployment option 2 – Scheduled tagging and inspection

 

In this scenario, the Dispatcher service is invoked on a schedule with a payload representing a BigQuery scope to list inspected tables by Automatic DLP and create a tagging request per table. You could use Cloud Scheduler (or any Orchestration tool) to invoke the Dispatcher service. If the solution is deployed within a VPC-SC perimeter, other schedulers that support VPC-SC should be used (such as Composer or Custom App.)

In addition, more than one Cloud Scheduler/Trigger could be defined to group projects/datasets/tables that have the same tagging schedule (such as daily or monthly.)

To learn more about Automatic DLP, see our webpage. To learn more about the BQ classifier, see the open source project on Github: GoogleCloudPlatform/bq-pii-classifier and get started today!

 

 

By: Karim Wadie (Strategic Cloud Engineer) and Nikhiel Sawhney (Head of EMEA Security Practice)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Certifications
  • Education
  • Google Cloud
  • Training
You May Also Like
View Post
  • Featured
  • People

Conclave: How A New Pope Is Chosen

  • April 25, 2025
View Post
  • People
  • Technology

AI is automating our jobs – but values need to change if we are to be liberated by it

  • April 17, 2025
Cloud
View Post
  • Engineering
  • People
  • Public Cloud

Why We Need Both Cloud Engineers And Cloud Architects

  • March 11, 2024
View Post
  • People
  • Technology
  • Work & Jobs

Usher’s New Look Embarks on AI-Focused Collaboration with IBM to Set Students up for Career Success

  • March 7, 2024
goswifties-taylor-swift-travis-kelce-fathers-daughters-super-bowl-bonding
View Post
  • Featured
  • People

How Taylor Swift Unexpectedly Brought Fathers and Daughters Together Through Football

  • February 11, 2024
Cloud
View Post
  • People
  • Technology

Clean Air Is A Valuable Economic Asset. Here Are 4 Steps To Achieve It

  • January 31, 2024
Earth
View Post
  • People

Why Standards And Controls Are Essential To The Future Of Digital Financial Markets

  • January 31, 2024
Netherlands
View Post
  • People

How We Can Identify Climate-Vulnerable Neighbourhoods And Protect Inhabitants

  • January 26, 2024

Stay Connected!
LATEST
  • college-of-cardinals-2025 1
    The Definitive Who’s Who of the 2025 Papal Conclave
    • May 7, 2025
  • conclave-poster-black-smoke 2
    The World Is Revalidating Itself
    • May 6, 2025
  • oracle-ibm 3
    IBM and Oracle Expand Partnership to Advance Agentic AI and Hybrid Cloud
    • May 6, 2025
  • 4
    Conclave: How A New Pope Is Chosen
    • April 25, 2025
  • Getting things done makes her feel amazing 5
    Nurturing Minds in the Digital Revolution
    • April 25, 2025
  • 6
    AI is automating our jobs – but values need to change if we are to be liberated by it
    • April 17, 2025
  • 7
    Canonical Releases Ubuntu 25.04 Plucky Puffin
    • April 17, 2025
  • 8
    United States Army Enterprise Cloud Management Agency Expands its Oracle Defense Cloud Services
    • April 15, 2025
  • 9
    Tokyo Electron and IBM Renew Collaboration for Advanced Semiconductor Technology
    • April 2, 2025
  • 10
    IBM Accelerates Momentum in the as a Service Space with Growing Portfolio of Tools Simplifying Infrastructure Management
    • March 27, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • 1
    Tariffs, Trump, and Other Things That Start With T – They’re Not The Problem, It’s How We Use Them
    • March 25, 2025
  • 2
    IBM contributes key open-source projects to Linux Foundation to advance AI community participation
    • March 22, 2025
  • 3
    Co-op mode: New partners driving the future of gaming with AI
    • March 22, 2025
  • 4
    Mitsubishi Motors Canada Launches AI-Powered “Intelligent Companion” to Transform the 2025 Outlander Buying Experience
    • March 10, 2025
  • PiPiPi 5
    The Unexpected Pi-Fect Deals This March 14
    • March 13, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.