
Building More Secure Data Pipelines With Cloud Data Fusion

  • aster_cloud
  • February 11, 2020
  • 4 minute read

For those of you working in data analytics, ETL and ELT pipelines are an important piece of your data foundation. Cloud Data Fusion is our fully managed data integration service for quickly building and managing data pipelines.

Cloud Data Fusion is built on the open source project CDAP, and this open core lets you build portable data pipelines. A CDAP server might satisfy your need to run a few simple data pipelines. But when it comes to securing a larger number of business-critical data pipelines, you’ll often need to put a lot more effort into logging and monitoring those pipelines. You will also need to manage authentication and authorization to protect that data when you have servers running workloads for multiple teams and environments. These additional services can require a lot of maintenance effort from your operations team and take time away from development. Your goal is to run pipelines, not to operate logging, monitoring, or identity and access management (IAM) services.


We designed Cloud Data Fusion to take care of most of this work for you. And since it’s part of Google Cloud, you can take advantage of built-in security benefits when using Cloud Data Fusion rather than self-managed CDAP servers:

  • Cloud-native security control with Cloud IAM—Identity management and authentication efforts are taken care of by Cloud Identity
  • Full observability with Stackdriver Logging and Monitoring—Logs include pipeline logs and audit logs
  • Reduced exposure to public internet with private networking

Let’s take a look at these features in detail.

Access control with Cloud IAM

The number one reason to use Cloud Data Fusion over self-managed CDAP servers is that it integrates seamlessly with Cloud IAM, which lets you control access to your Cloud Data Fusion resources. Through Cloud IAM, Cloud Data Fusion can also integrate easily with other Google Cloud services. You can use Cloud Identity for user and group management and for authentication, such as multi-factor authentication (MFA), instead of implementing or deploying your own.


There are two predefined roles in Cloud Data Fusion: admin and viewer. Following the IAM principle of least privilege, assign the admin role only to users who need to manage (create and delete) the instances. Assign the viewer role to users who only need to access the instances, not manage them. Both roles can access the Cloud Data Fusion web UI to create pipelines and plugins.

Assign roles and permissions to groups with users instead of assigning them to users directly whenever possible. This helps you control users’ access to Cloud Data Fusion resources in a more organized manner, especially when you assign permissions to the groups repeatedly on multiple projects.
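The group-based assignment described above can be sketched as a small helper that edits an IAM policy document. This is a minimal illustration, not an official client: it assumes the standard IAM policy shape (a `bindings` list of `role` and `members` entries) and the predefined role IDs `roles/datafusion.admin` and `roles/datafusion.viewer`. The group addresses and the helper name `grant_role_to_group` are hypothetical.

```python
from copy import deepcopy

# Predefined Cloud Data Fusion role IDs as they appear in IAM policies.
ADMIN_ROLE = "roles/datafusion.admin"
VIEWER_ROLE = "roles/datafusion.viewer"


def grant_role_to_group(policy, role, group_email):
    """Return a copy of an IAM policy dict with the group bound to the role.

    Hypothetical helper: it only edits the in-memory policy document and
    does not call any Google Cloud API.
    """
    updated = deepcopy(policy)
    member = f"group:{group_email}"
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == role:
            # Reuse the existing binding and avoid duplicate members.
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    bindings.append({"role": role, "members": [member]})
    return updated


# Example group addresses are placeholders.
policy = {"bindings": []}
policy = grant_role_to_group(policy, ADMIN_ROLE, "df-admins@example.com")
policy = grant_role_to_group(policy, VIEWER_ROLE, "df-developers@example.com")
```

Binding roles to groups keeps the policy short: onboarding a new developer becomes a group membership change rather than a policy change in every project.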

Read more about the two Cloud Data Fusion roles and their corresponding permissions.

Private IP instance

The private IP instance of Cloud Data Fusion connects with your Virtual Private Cloud (VPC) privately. Traffic over this network does not go through the public internet, which reduces the potential attack surface. You can find more about setting up private IP for Cloud Data Fusion.

VPC Service Controls

We’re also announcing beta support for VPC Service Controls (VPC-SC) in Cloud Data Fusion. You can now prevent data exfiltration by adding a Cloud Data Fusion instance to your service perimeter. When configured with VPC-SC, any pipeline that reads data from within the perimeter will fail if it tries to write the data outside the service perimeter.
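To make the perimeter rule concrete, here is a toy model of it in Python. It is only a conceptual sketch of the behavior described above, not how VPC Service Controls is implemented or configured; the function name, the exception type, and the project IDs are all hypothetical.

```python
class PerimeterViolation(Exception):
    """Raised when a write would move perimeter data outside the perimeter."""


def check_pipeline(source_project, sink_project, perimeter_projects):
    """Model the exfiltration rule: data read inside the perimeter stays inside.

    Conceptual illustration only; real enforcement happens in Google Cloud,
    not in pipeline code.
    """
    reads_inside = source_project in perimeter_projects
    writes_outside = sink_project not in perimeter_projects
    if reads_inside and writes_outside:
        raise PerimeterViolation(
            f"{source_project} is inside the perimeter but {sink_project} is not"
        )
    return True


perimeter = {"analytics-prod", "warehouse-prod"}

# Allowed: both the source and the sink are inside the perimeter.
check_pipeline("analytics-prod", "warehouse-prod", perimeter)

# Blocked: the sink is outside the perimeter.
try:
    check_pipeline("analytics-prod", "partner-sandbox", perimeter)
except PerimeterViolation as err:
    print(f"pipeline would fail: {err}")
```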

Stackdriver Logging

Stackdriver Logging and Monitoring are disabled by default in Cloud Data Fusion, but we recommend you enable these tools for observability.
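As a rough sketch, enabling both tools when creating an instance through the REST API comes down to two boolean fields in the request body. The field names below (`type`, `enableStackdriverLogging`, `enableStackdriverMonitoring`) reflect my understanding of the Cloud Data Fusion `Instance` resource and should be checked against the current API reference; the helper `instance_body` is hypothetical.

```python
import json


def instance_body(instance_type="BASIC", logging=True, monitoring=True):
    """Build a create-instance request body with observability switched on.

    Field names are assumptions based on the Cloud Data Fusion REST API
    Instance resource; verify them before use.
    """
    return {
        "type": instance_type,
        "enableStackdriverLogging": logging,
        "enableStackdriverMonitoring": monitoring,
    }


# Print the body that would be sent with the create request.
print(json.dumps(instance_body(), indent=2))
```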


With the extra information provided by the logs and metrics, you can not only investigate and respond to incidents faster, but also understand how to manage your particular infrastructure and workloads more effectively in the long run. A range of logs can help you run your Cloud Data Fusion pipelines better.

Pipeline logs

These are generated by your pipelines in Cloud Data Fusion. They are useful for understanding and troubleshooting your Cloud Data Fusion pipelines. You can find these logs in the Cloud Data Fusion UI as well as in the Stackdriver logs of the Dataproc clusters that execute the pipelines.

Admin activity audit logs

These logs record operations that modify the configuration or metadata of your resources. Admin activity audit logs are enabled by default and cannot be disabled.

Data access audit logs

Data access audit logs contain API calls that read the configuration or metadata of the resources, as well as user-driven API calls that create, modify, or read user-provided resource data.

Admin activity audit logs and data access audit logs are useful for tracking who accessed or made changes to your Cloud Data Fusion resources. In case there’s any malicious activity, a security admin will be able to find and track down the bad actor in the audit logs.
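The sketch below builds log filters for the two audit log types and for the Dataproc cluster logs that carry pipeline output. It assumes the usual audit log names (`cloudaudit.googleapis.com%2Factivity` and `cloudaudit.googleapis.com%2Fdata_access`), the service name `datafusion.googleapis.com`, and the monitored resource type `cloud_dataproc_cluster`. The helper names and the project ID are hypothetical.

```python
AUDIT_KINDS = ("activity", "data_access")


def audit_log_filter(project_id, kind):
    """Return a logging filter for Cloud Data Fusion audit entries.

    kind is "activity" (admin activity) or "data_access".
    """
    if kind not in AUDIT_KINDS:
        raise ValueError(f"kind must be one of {AUDIT_KINDS}")
    log_name = f"projects/{project_id}/logs/cloudaudit.googleapis.com%2F{kind}"
    return (
        f'logName="{log_name}" '
        'AND protoPayload.serviceName="datafusion.googleapis.com"'
    )


def pipeline_log_filter():
    """Return a filter for the Dataproc clusters that execute pipelines."""
    return 'resource.type="cloud_dataproc_cluster"'


# "my-project" is a placeholder project ID.
for kind in AUDIT_KINDS:
    print(audit_log_filter("my-project", kind))
print(pipeline_log_filter())
```

Each filter string can be pasted into the logs viewer or passed to a logging client to narrow results to the relevant entries.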

These Google Cloud features can give you extra control over and visibility into your Cloud Data Fusion pipelines. Cloud IAM helps you control who can access your Cloud Data Fusion resources; a private IP instance minimizes exposure to the public internet; and Stackdriver Logging and Monitoring provide information about your workloads, changes in permissions, and access to your resources. Together, they create a more secure solution for your data pipelines on Google Cloud.


Jeanno Cheung

Source: Google Cloud Blog

