
Introducing Easier De-Identification Of Cloud Storage Data

  • aster.cloud
  • August 24, 2022
  • 4 minute read

De-identification of Cloud Storage just got easier

Many organizations require effective processes and techniques for removing or obfuscating certain sensitive information in the data they store. An important tool for achieving this goal is de-identification, which NIST defines as a technique that “removes identifying information from a dataset so that individual data cannot be linked with specific individuals. De-identification can reduce the privacy risk associated with collecting, processing, archiving, distributing or publishing information.”

We are always striving to make data security easier, and today we are happy to announce the availability of a de-identification action for our Cloud Storage inspection jobs. You can now de-identify Cloud Storage objects, folders, and buckets without needing to run your own pipeline or custom code. Additionally, we have enhanced our transforms by adding a new dictionary replacement method that can help you achieve stronger privacy protection, especially with unstructured data such as customer support chat logs.



The “De-identify findings” Action

The “de-identify findings” action for Cloud DLP inspection jobs is a fully managed feature that creates a de-identified copy of the data objects that are inspected. This means that you can inspect a Cloud Storage bucket for sensitive data such as personally identifiable information (PII) and then create a redacted copy of these objects, all with a few clicks in the Console UI. There is no need to write custom code or manage complex pipelines, and because the feature is fully managed, it auto-scales for you without you needing to manage quota.

 

This new action supports the following data types:

  • Text files
  • Comma- or tab-separated values
  • Images (see regional limitations)

Once the action is enabled, the DLP job inspects the data and writes a de-identified copy of all supported files to the output bucket or folder.

You can also use the new de-identify action on Job Triggers, which run on a recurring schedule, to automatically de-identify new content as it appears. This is useful for creating a workflow with a safe drop zone for incoming files that need to be de-identified before being made accessible.
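As a rough illustration of what such a trigger could look like outside the Console, the sketch below builds a request body as a plain Python dictionary and prints it as JSON. The field names are written as best recalled from the DLP v2 REST reference and should be verified against the current documentation before use. The bucket names, template path, display name, daily schedule, and choice of infoTypes are all hypothetical placeholders, not values from this announcement.

```python
import json

# Hypothetical placeholders: replace with your own resources.
INPUT_URL = "gs://example-drop-zone/*"
OUTPUT_URL = "gs://example-deidentified-output/"
DEID_TEMPLATE = (
    "projects/example-project/locations/global/"
    "deidentifyTemplates/example-template"
)

# Assumed shape of a job trigger request body (REST-style camelCase names).
job_trigger_request = {
    "jobTrigger": {
        "displayName": "example-deidentify-drop-zone",
        "status": "HEALTHY",
        # Run once a day (86400 seconds); the period is an illustrative choice.
        "triggers": [{"schedule": {"recurrencePeriodDuration": "86400s"}}],
        "inspectJob": {
            "storageConfig": {
                "cloudStorageOptions": {"fileSet": {"url": INPUT_URL}},
                # Limit each run to content that is new since the previous run.
                "timespanConfig": {"enableAutoPopulationOfTimespanConfig": True},
            },
            "inspectConfig": {
                "infoTypes": [{"name": "PERSON_NAME"}, {"name": "EMAIL_ADDRESS"}]
            },
            "actions": [
                {
                    "deidentify": {
                        "cloudStorageOutput": OUTPUT_URL,
                        "fileTypesToTransform": ["TEXT_FILE", "CSV", "TSV"],
                        "transformationConfig": {
                            "deidentifyTemplate": DEID_TEMPLATE
                        },
                    }
                }
            ],
        },
    }
}

print(json.dumps(job_trigger_request, indent=2))
```

The dictionary is only data: nothing is sent to Cloud DLP here. Keeping the input and output locations in separate buckets mirrors the drop-zone workflow described above, where only the de-identified copy is made accessible.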

What can automatic De-identification do?

Cloud DLP provides a set of transformation techniques that de-identify sensitive data while aiming to keep the data useful for your business. These techniques include:

  • Redaction: Deletes all or part of a detected sensitive value.
  • Replacement: Replaces a detected sensitive value with a specified surrogate value.
  • Masking: Replaces a number of characters of a sensitive value with a specified surrogate character, such as a hash (#) or asterisk (*).
  • Crypto-based tokenization: Encrypts the original sensitive data value using a cryptographic key. Cloud DLP supports several types of tokenization, including transformations that can be reversed, or “re-identified.”
  • Bucketing: “Generalizes” a sensitive value by replacing it with a range of values. (For example, replacing a specific age with an age range, or temperatures with ranges corresponding to “Hot,” “Medium,” and “Cold.”)
  • Date shifting: Shifts sensitive date values by a random amount of time.
  • Time extraction: Extracts or preserves specified portions of date and time values.
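To make a few of these techniques concrete, here is a minimal local sketch in Python of masking, bucketing, date shifting, and time extraction. It only illustrates the ideas: it is not how Cloud DLP implements them, and the function names, parameters, and defaults are hypothetical.

```python
import random
from datetime import date, timedelta


def mask(value: str, masking_char: str = "#", keep_last: int = 4) -> str:
    """Replace all but the last `keep_last` characters with `masking_char`."""
    keep_last = max(keep_last, 0)
    hidden = max(len(value) - keep_last, 0)
    return masking_char * hidden + value[hidden:]


def bucket_age(age: int, width: int = 10) -> str:
    """Generalize an exact age into a fixed-width range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def shift_date(d: date, max_days: int, rng: random.Random) -> date:
    """Shift a date by a random number of days in [-max_days, max_days]."""
    return d + timedelta(days=rng.randint(-max_days, max_days))


def extract_year(d: date) -> int:
    """Keep only the year portion of a date."""
    return d.year


print(mask("4111111111111111"))   # only the last four digits stay visible
print(bucket_age(37))             # the exact age becomes a ten-year range
print(shift_date(date(2022, 8, 24), 30, random.Random()))
print(extract_year(date(2022, 8, 24)))
```

Each function trades some precision for privacy in a different way: masking hides characters, bucketing coarsens a value, date shifting perturbs it, and time extraction discards part of it.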

New Dictionary Replace method

When a sensitive data element is found, dictionary replacement replaces it with a randomly selected value from a list of words that you provide. This transformation method is especially useful if you want the redacted output to have more realistic surrogate values.


Consider the following example: You collect customer support chat logs as part of providing service to your customers. These support chat logs contain various types of personally identifiable information (PII), including people’s names and email addresses. Cloud DLP can find and de-identify the sensitive elements with static replacements such as “[REDACTED]” to help prevent someone from seeing this sensitive data.

With the new dictionary replacement method you can instead replace these findings with a randomly selected value from a dictionary. This dictionary replacement provides two key benefits over static replacement:

  1. The resulting output can look more realistic
  2. Because the output looks more realistic, it can help conceal any residual names (a privacy de-identification technique sometimes referred to as “hiding in plain sight”)

An example of this:


Input:

[Agent] Hi, my name is Jason, can I have your name?

[Customer] My name is Valeria

[Agent] In case we need to contact you, what is your email address?

[Customer] My email is [email protected]

[Agent] Thank you.  How can I help you?

De-identified Output:

[Agent] Hi, my name is Gavaia, can I have your name?

[Customer] My name is Bijal

[Agent] In case we need to contact you, what is your email address?

[Customer] My email is [email protected]

[Agent] Thank you.  How can I help you?


As you can see in the output, the names and email addresses have been replaced with random values that both protect the original sensitive information and make the output look more realistic. This can make the data more useful and help “hide” any residual PII.
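The idea behind dictionary replacement can be sketched locally in a few lines of Python. This is an illustration under stated assumptions, not Cloud DLP's implementation: the helper `dictionary_replace` is hypothetical, and the list of findings is supplied by hand here, whereas in Cloud DLP the findings come from inspection.

```python
import random
import re


def dictionary_replace(text, findings, word_list, rng):
    """Replace each finding in `text` with a random entry from `word_list`."""
    if not findings:
        return text
    # Try longer findings first so that a short finding never shadows a
    # longer one that begins with the same characters.
    alternatives = sorted((re.escape(f) for f in findings), key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(alternatives) + r")\b")
    return pattern.sub(lambda _match: rng.choice(word_list), text)


chat = (
    "[Agent] Hi, my name is Jason, can I have your name?\n"
    "[Customer] My name is Valeria"
)
# Surrogate names taken from the example output above.
word_list = ["Gavaia", "Bijal"]

result = dictionary_replace(chat, ["Jason", "Valeria"], word_list, random.Random())
print(result)
```

Because each surrogate is drawn at random, the same original name can map to different replacements on different runs, and a name that inspection missed would sit among realistic-looking surrogates rather than standing out next to “[REDACTED]” markers.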


Next Steps:

To learn more about de-identification, check out our technical docs, try de-identification of Cloud Storage data in the Cloud Console, and watch a recent Google I/O talk on de-identification of data.

 

 

By: Scott Ellis (Senior Product Manager) and Jordanna Chord (Staff Software Engineer)
Source: Google Cloud Blog

