aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Platforms

Building More Secure Data Pipelines With Cloud Data Fusion

  • aster.cloud
  • February 11, 2020
  • 4 minute read

For those of you working in data analytics, ETL and ELT pipelines are an important piece of your data foundation. Cloud Data Fusion is our fully managed data integration service for quickly building and managing data pipelines.

Cloud Data Fusion is built on the open source project CDAP, and this open core lets you build portable data pipelines. A CDAP server might satisfy your need to run a few simple data pipelines. But when it comes to securing a larger number of business-critical data pipelines, you’ll often need to put a lot more effort into logging and monitoring those pipelines. You will also need to manage authentication and authorization to protect that data when you have servers running workloads for multiple teams and environments. These additional services can require a lot of maintenance effort from your operations team and take time away from development. The goal is running pipelines, not logging, monitoring, or the identity and access management (IAM) service.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

We designed Cloud Data Fusion to take care of most of this work for you. And since it’s part of Google Cloud, you can take advantage of built-in security benefits when using Cloud Data Fusion rather than self-managed CDAP servers:

  • Cloud-native security control with Cloud IAM—Identity management and authentication efforts are taken care of by Cloud Identity
  • Full observability with Stackdriver Logging and Monitoring—Logs include pipeline logs and audit logs
  • Reduced exposure to public internet with private networking

Let’s take a look at these features in detail.

Access control with Cloud IAM

The number one reason to use Cloud Data Fusion over self-managed CDAP servers is that it integrates seamlessly with Cloud IAM. That lets you control access to your Cloud Data Fusion resources. With Cloud IAM, Cloud Data Fusion is able to easily integrate with other Google Cloud services. You can also use Cloud Identity for users and groups management and authentication [such as multi-factor authentication (MFA)], instead of implementing or deploying your own.

Read More  Google Workspace Extends Enterprise-Grade Security And Device Management For Hybrid Work With Okta And VMware

There are two predefined roles in Cloud Data Fusion: admin and viewer. As a practice of the IAM principle of least privilege, the admin role should only be assigned to users who need to manage (create and delete) the instances. The viewer role should be assigned to users who only need to access the instances, not manage them. Both roles can access the Cloud Data Fusion web UI to create pipelines and plugins.

Assign roles and permissions to groups with users instead of assigning them to users directly whenever possible. This helps you control users’ access to Cloud Data Fusion resources in a more organized manner, especially when you assign permissions to the groups repeatedly on multiple projects.

Read more about the two Cloud Data Fusion roles and their corresponding permissions.

Private IP instance

The private IP instance of Cloud Data Fusion connects with your Virtual Private Cloud (VPC) privately. Traffic over this network does not go through the public internet, and reduces potential attack surface as a result. You can find more about setting up private IP for Cloud Data Fusion.

VPC Service Controls

We’re also announcing beta support for VPC Service Controls to Cloud Data Fusion. You can now prevent data exfiltration by adding a Cloud Data Fusion instance to your service perimeter. When configured with VPC-SC, any pipeline that reads data from within the perimeter will fail if it tries to write the data outside the service perimeter.

Stackdriver Logging

Stackdriver Logging and Monitoring are disabled by default in Cloud Data Fusion, but we recommend you enable these tools for observability.

Read More  Google Cloud Next 2019 | Kroger Digital's Journey to GCP

With the extra information provided by the logs and metrics, you can not only investigate and respond to incidents faster, but understand how to manage your particular infrastructure and workloads more effectively in the long run. There are a range of logs that can help you run your Cloud Data Fusion pipelines better.

Pipeline logs

These are generated by your pipelines in Cloud Data Fusion. They are useful for understanding and troubleshooting your Cloud Data Fusion pipelines. You can find these logs in the Cloud Data Fusion UI as well as in the Stackdriver logs of the Dataproc clusters that execute the pipelines.

Admin activity audit logs

These logs record operations that modify the configuration or metadata of your resources. Admin activity audit logs are enabled by default and cannot be disabled.

Data access audit logs

Data access audit logs contain API calls that read the configuration or metadata of the resources, as well as user-driven API calls that create, modify, or read user-provided resource data.

Admin activity audit logs and data access audit logs are useful for tracking who accessed or made changes to your Cloud Data Fusion resources. In case there’s any malicious activity, a security admin will be able to find and track down the bad actor in the audit logs.

These Google Cloud features can give you extra control and visibility into your Cloud Data Fusion pipelines. Cloud IAM helps you to control who can access your Cloud Data Fusion resources; private instance minimizes exposure to public internet; and Stackdriver Logging and Monitoring provides information about your workloads, changes in permission, and access to your resources. Together, they create a more secure solution for your data pipeline on Google Cloud.

Read More  Building Security Guardrails For Developers With Google Cloud

 

Jeanno Cheung

Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Cloud Data Fusion
  • Cloud IAM
  • Google Cloud
You May Also Like
Google Cloud and Smart Communications
View Post
  • Platforms
  • Technology

Smart Communications, Inc. Dials into Google Cloud AI to Help Personalize Digital Services for Filipinos

  • October 25, 2024
View Post
  • Platforms
  • Public Cloud

Empowering builders with the new AWS Asia Pacific (Malaysia) Region

  • August 30, 2024
Red Hat and Globe Telecoms
View Post
  • Platforms
  • Technology

Globe Collaborates with Red Hat Open Innovation Labs to Modernize IT Infrastructure for Greater Agility and Scalability

  • August 19, 2024
Huawei Cloud Cairo Region Goes Live
View Post
  • Cloud-Native
  • Computing
  • Platforms

Huawei Cloud Goes Live in Egypt

  • May 24, 2024
Asteroid
View Post
  • Computing
  • Platforms
  • Technology

Asteroid Institute And Google Cloud Identify 27,500 New Asteroids, Revolutionizing Minor Planet Discovery With Cloud Technology

  • April 30, 2024
IBM
View Post
  • Hybrid Cloud
  • Platforms

IBM To Acquire HashiCorp, Inc. Creating A Comprehensive End-to-End Hybrid Cloud Platform

  • April 24, 2024
View Post
  • Platforms
  • Technology

Canonical Delivers Secure, Compliant Cloud Solutions for Google Distributed Cloud

  • April 9, 2024
Redis logo
View Post
  • Platforms
  • Software

Redis Moves To Source-Available Licenses

  • April 2, 2024

Stay Connected!
LATEST
  • 1
    Just make it scale: An Aurora DSQL story
    • May 29, 2025
  • 2
    Reliance on US tech providers is making IT leaders skittish
    • May 28, 2025
  • Examine the 4 types of edge computing, with examples
    • May 28, 2025
  • AI and private cloud: 2 lessons from Dell Tech World 2025
    • May 28, 2025
  • 5
    TD Synnex named as UK distributor for Cohesity
    • May 28, 2025
  • Weigh these 6 enterprise advantages of storage as a service
    • May 28, 2025
  • 7
    Broadcom’s ‘harsh’ VMware contracts are costing customers up to 1,500% more
    • May 28, 2025
  • 8
    Pulsant targets partner diversity with new IaaS solution
    • May 23, 2025
  • 9
    Growing AI workloads are causing hybrid cloud headaches
    • May 23, 2025
  • Gemma 3n 10
    Announcing Gemma 3n preview: powerful, efficient, mobile-first AI
    • May 22, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • 1
    Cloud adoption isn’t all it’s cut out to be as enterprises report growing dissatisfaction
    • May 15, 2025
  • 2
    Hybrid cloud is complicated – Red Hat’s new AI assistant wants to solve that
    • May 20, 2025
  • 3
    Google is getting serious on cloud sovereignty
    • May 22, 2025
  • oracle-ibm 4
    Google Cloud and Philips Collaborate to Drive Consumer Marketing Innovation and Transform Digital Asset Management with AI
    • May 20, 2025
  • notta-ai-header 5
    Notta vs Fireflies: Which AI Transcription Tool Deserves Your Attention in 2025?
    • May 16, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.