
Building More Secure Data Pipelines With Cloud Data Fusion

  • aster.cloud
  • February 11, 2020
  • 4 minute read

For those of you working in data analytics, ETL and ELT pipelines are an important piece of your data foundation. Cloud Data Fusion is our fully managed data integration service for quickly building and managing data pipelines.

Cloud Data Fusion is built on the open source project CDAP, and this open core lets you build portable data pipelines. A self-managed CDAP server might satisfy your needs if you run only a few simple data pipelines. But when it comes to securing a larger number of business-critical data pipelines, you’ll often need to put much more effort into logging and monitoring them. You will also need to manage authentication and authorization to protect that data when your servers run workloads for multiple teams and environments. These additional services can require a lot of maintenance effort from your operations team and take time away from development. Your goal is to run pipelines, not to operate logging, monitoring, or identity and access management (IAM) services.



We designed Cloud Data Fusion to take care of most of this work for you. And since it’s part of Google Cloud, you can take advantage of built-in security benefits when using Cloud Data Fusion rather than self-managed CDAP servers:

  • Cloud-native security control with Cloud IAM: identity management and authentication are handled by Cloud Identity
  • Full observability with Stackdriver Logging and Monitoring: logs include pipeline logs and audit logs
  • Reduced exposure to the public internet with private networking

Let’s take a look at these features in detail.

Access control with Cloud IAM

The number one reason to use Cloud Data Fusion over self-managed CDAP servers is that it integrates seamlessly with Cloud IAM, which lets you control access to your Cloud Data Fusion resources. Cloud IAM also makes it easy for Cloud Data Fusion to integrate with other Google Cloud services. And you can use Cloud Identity for user and group management and for authentication, including multi-factor authentication (MFA), instead of implementing or deploying your own.


There are two predefined roles in Cloud Data Fusion: admin and viewer. Following the principle of least privilege, assign the admin role only to users who need to manage (create and delete) instances, and assign the viewer role to users who only need to access instances, not manage them. Both roles can access the Cloud Data Fusion web UI to create pipelines and plugins.

Whenever possible, assign roles and permissions to groups of users rather than to individual users. This helps you control access to Cloud Data Fusion resources in a more organized manner, especially when you assign the same permissions across multiple projects.

Read more about the two Cloud Data Fusion roles and their corresponding permissions.
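As a rough illustration of group-based role assignment, the sketch below edits an IAM policy represented as a plain Python dictionary in the standard `bindings` layout. The role IDs `roles/datafusion.admin` and `roles/datafusion.viewer` correspond to the two predefined roles; the helper name `grant_role_to_group` and the group addresses are hypothetical, and the code only manipulates local data rather than calling any Google Cloud API.

```python
import copy

# Predefined Cloud Data Fusion role IDs.
ADMIN_ROLE = "roles/datafusion.admin"
VIEWER_ROLE = "roles/datafusion.viewer"


def grant_role_to_group(policy, role, group_email):
    """Return a copy of `policy` with the group bound to `role`.

    `policy` is a dict in the IAM policy layout:
    {"bindings": [{"role": ..., "members": [...]}]}.
    The function is idempotent: binding the same group twice is a no-op.
    """
    updated = copy.deepcopy(policy)
    member = f"group:{group_email}"
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    # No binding for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical groups: most users get viewer, a small group gets admin.
policy = {"bindings": []}
policy = grant_role_to_group(policy, VIEWER_ROLE, "pipeline-devs@example.com")
policy = grant_role_to_group(policy, ADMIN_ROLE, "platform-admins@example.com")
```

Because the bindings reference groups, onboarding a new developer becomes a group membership change rather than a policy edit on every project.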

Private IP instance

A private IP instance of Cloud Data Fusion connects to your Virtual Private Cloud (VPC) privately. Traffic over this network does not go through the public internet, which reduces the potential attack surface. You can read more about setting up private IP for Cloud Data Fusion.

VPC Service Controls

We’re also announcing beta support for VPC Service Controls (VPC-SC) in Cloud Data Fusion. You can now prevent data exfiltration by adding a Cloud Data Fusion instance to your service perimeter. When configured with VPC-SC, any pipeline that reads data from within the perimeter will fail if it tries to write that data outside the service perimeter.
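The perimeter rule described above can be pictured with a toy model. This is only an illustration of the behavior, not the VPC Service Controls API: the function name `check_pipeline_write` and the project IDs are hypothetical, and real enforcement happens inside Google Cloud rather than in pipeline code.

```python
def check_pipeline_write(perimeter_projects, source_project, sink_project):
    """Toy model of the exfiltration rule for a service perimeter.

    A pipeline that reads from a project inside the perimeter may not
    write to a project outside it. Returns True when the write is allowed
    and raises PermissionError otherwise.
    """
    inside = set(perimeter_projects)
    if source_project in inside and sink_project not in inside:
        raise PermissionError(
            f"write from {source_project} to {sink_project} leaves the perimeter"
        )
    return True


# Hypothetical project IDs.
perimeter = ["analytics-prod", "warehouse-prod"]

# Both ends are inside the perimeter, so the write is allowed.
allowed = check_pipeline_write(perimeter, "analytics-prod", "warehouse-prod")

# Reading inside and writing outside is blocked.
try:
    check_pipeline_write(perimeter, "analytics-prod", "external-sandbox")
    blocked = False
except PermissionError:
    blocked = True
```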

Stackdriver Logging

Stackdriver Logging and Monitoring are disabled by default in Cloud Data Fusion, but we recommend you enable these tools for observability.
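Since these options are off by default, they have to be requested explicitly when an instance is created. The sketch below builds a request body as a plain dictionary. The field names (`enableStackdriverLogging`, `enableStackdriverMonitoring`, `privateInstance`, `type`) reflect my understanding of the Cloud Data Fusion REST `Instance` resource and should be checked against the current API reference; the helper `build_instance_body` is hypothetical and makes no API call.

```python
import json


def build_instance_body(instance_type="BASIC", private_instance=False):
    """Build an instance request body with observability turned on.

    Field names are assumed from the Data Fusion REST Instance resource;
    verify them against the API reference before relying on them.
    """
    return {
        "type": instance_type,
        # Both settings default to off, so set them explicitly.
        "enableStackdriverLogging": True,
        "enableStackdriverMonitoring": True,
        "privateInstance": private_instance,
    }


body = build_instance_body(private_instance=True)
print(json.dumps(body, indent=2))
```

Keeping the body in one helper makes it harder to create an instance in one environment with logging enabled and forget it in another.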


With the extra information provided by the logs and metrics, you can not only investigate and respond to incidents faster, but also understand how to manage your particular infrastructure and workloads more effectively in the long run. There is a range of logs that can help you run your Cloud Data Fusion pipelines better.

Pipeline logs

These are generated by your pipelines in Cloud Data Fusion. They are useful for understanding and troubleshooting your Cloud Data Fusion pipelines. You can find these logs in the Cloud Data Fusion UI as well as in the Stackdriver logs of the Dataproc clusters that execute the pipelines.

Admin activity audit logs

These logs record operations that modify the configuration or metadata of your resources. Admin activity audit logs are enabled by default and cannot be disabled.

Data access audit logs

Data access audit logs contain API calls that read the configuration or metadata of the resources, as well as user-driven API calls that create, modify, or read user-provided resource data.

Admin activity audit logs and data access audit logs are useful for tracking who accessed or made changes to your Cloud Data Fusion resources. In case there’s any malicious activity, a security admin will be able to find and track down the bad actor in the audit logs.
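To make the "who did what" use case concrete, here is a minimal sketch that summarizes audit log entries that have already been exported as JSON. It relies on the general Cloud Audit Logs layout (`logName`, `protoPayload.serviceName`, `protoPayload.methodName`, `protoPayload.authenticationInfo.principalEmail`). The sample entries, including the project ID, principals, and method names, are made up for illustration, and `summarize_audit_entries` is a hypothetical helper.

```python
DATA_FUSION_SERVICE = "datafusion.googleapis.com"


def summarize_audit_entries(entries):
    """Return (timestamp, kind, principal, method) for Data Fusion audit entries."""
    summary = []
    for entry in entries:
        payload = entry.get("protoPayload", {})
        if payload.get("serviceName") != DATA_FUSION_SERVICE:
            continue
        log_name = entry.get("logName", "")
        # Audit log names end with an URL-encoded "/activity" or "/data_access".
        if log_name.endswith("%2Factivity"):
            kind = "admin_activity"
        elif log_name.endswith("%2Fdata_access"):
            kind = "data_access"
        else:
            continue
        principal = payload.get("authenticationInfo", {}).get(
            "principalEmail", "unknown"
        )
        summary.append(
            (entry.get("timestamp"), kind, principal, payload.get("methodName"))
        )
    return summary


# Illustrative entries only; all values are invented for this example.
sample_entries = [
    {
        "logName": "projects/example-project/logs/cloudaudit.googleapis.com%2Factivity",
        "timestamp": "2020-02-11T10:00:00Z",
        "protoPayload": {
            "serviceName": "datafusion.googleapis.com",
            "methodName": "ExampleCreateInstance",
            "authenticationInfo": {"principalEmail": "admin@example.com"},
        },
    },
    {
        "logName": "projects/example-project/logs/cloudaudit.googleapis.com%2Fdata_access",
        "timestamp": "2020-02-11T10:05:00Z",
        "protoPayload": {
            "serviceName": "datafusion.googleapis.com",
            "methodName": "ExampleGetInstance",
            "authenticationInfo": {"principalEmail": "viewer@example.com"},
        },
    },
    {
        "logName": "projects/example-project/logs/cloudaudit.googleapis.com%2Factivity",
        "timestamp": "2020-02-11T10:07:00Z",
        "protoPayload": {
            "serviceName": "storage.googleapis.com",
            "methodName": "ExampleOtherServiceCall",
            "authenticationInfo": {"principalEmail": "someone@example.com"},
        },
    },
]

report = summarize_audit_entries(sample_entries)
for row in report:
    print(row)
```

The same selection could be done server-side with a Logging filter such as `protoPayload.serviceName="datafusion.googleapis.com"`, so that only the relevant entries are exported in the first place.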

These Google Cloud features can give you extra control over and visibility into your Cloud Data Fusion pipelines. Cloud IAM helps you control who can access your Cloud Data Fusion resources; a private IP instance minimizes exposure to the public internet; and Stackdriver Logging and Monitoring provide information about your workloads, changes in permissions, and access to your resources. Together, they create a more secure solution for your data pipelines on Google Cloud.


Jeanno Cheung

Source: Google Cloud Blog

