aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Engineering

Pub/Sub Group Kafka Connector Now GA: A Drop-In Solution For Data Movement

  • aster.cloud
  • January 3, 2023
  • 4 minute read

We’re excited to announce that the Pub/Sub Group Kafka Connector is now Generally Available with active support from the Google Cloud Pub/Sub team. The Connector (packaged in a single jar file) is fully open source under an Apache 2.0 license and hosted on our GitHub repository. The packaged binaries are available on GitHub and Maven Central.

The source and sink connectors packaged in the Connector jar allow you to connect your existing Apache Kafka deployment to Pub/Sub or Pub/Sub Lite in just a few steps.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

Simplifying data movement

As you migrate to the cloud, it can be challenging to keep systems deployed on Google Cloud in sync with those running on-premises. Using the sink connector, you can easily relay data from an on-prem Kafka cluster to Pub/Sub or Pub/Sub Lite, allowing different Google Cloud services as well as your own applications hosted on Google Cloud to consume data at scale. For instance, you can stream Pub/Sub data straight to BigQuery, enabling analytics teams to perform their workloads on BigQuery tables.

If you have existing analytics tied to your on-prem Kafka cluster, you can easily bring any data you need from microservices deployed on Google Cloud or your favorite Google Cloud services using the source connector. This way you can have a unified view across your on-prem and Google Cloud data sources.

The Pub/Sub Group Kafka Connector is implemented using Kafka Connect, a framework for developing and deploying solutions that reliably stream data between Kafka and other systems. Using Kafka Connect opens up the rich ecosystem of connectors for use with Pub/Sub or Pub/Sub Lite. Search your favorite source or destination system on Confluent Hub.

Read More  Workflows Patterns And Best Practices - Part 1

Flexibility and scale

You can configure exactly how you want messages from Kafka to be converted to Pub/Sub messages and vice versa with the available configuration options. You can also choose your desired Kafka serialization format by specifying which key/value converters to use. For use cases where message order is important, the sink connectors transmit the Kafka record key as the Pub/Sub message `ordering_key`, allowing you to use Pub/Sub ordered delivery and ensuring compatibility with Pub/Sub Lite order guarantees. To keep the message order when sending data to Kafka using the source connector, you can set the Kafka record key as a desired field.

The Connector can also take advantage of Pub/Sub’s and Pub/Sub Lite’s high-throughput messaging capabilities and scale up or down dynamically as stream throughput requires. This is achieved by running the Kafka Connect cluster in distributed mode. In distributed mode, Kafka Connect runs multiple worker processes on separate servers, each of which can host source or sink connector tasks. Configuring the `tasks.max` setting to greater than 1 allows Kafka Connect to enable parallelism and shard relay work for a given Kafka topic across multiple tasks. As message throughput increases, Kafka Connect spawns more tasks, increasing concurrency and thereby increasing total throughput.

A better approach

Compared to existing ways of transmitting data between Kafka and Google Cloud, the connectors are a step-change.

To connect Kafka to Pub/Sub or Pub/Sub Lite, one option is to write a custom relay application to read data from the source and write to the destination system. For developers with Kafka experience who want to connect to Pub/Sub Lite, we provide a Kafka Shim Client that can make the task of consuming from and producing to a Pub/Sub Lite topic easier using the familiar Kafka API. This approach has a couple of downsides. It can take significant effort to develop and can be challenging for high-throughput use-cases since there is no out-of-the-box horizontal scaling. You’ll also need to learn to operate this custom solution from scratch and add any monitoring to ensure data is relayed smoothly. Instead there are easier options to build or deploy using existing frameworks.Pub/Sub, Pub/Sub Lite, and Kafka all have respective I/O connectors with Apache Beam. You can write a Beam pipeline using KafkaIO to move data between a cluster Pub/Sub or Pub/Sub Lite and then run it on an execution engine like Dataflow. This requires some familiarity with the Beam programming model, writing code to create the pipeline and possibly expanding your architecture to a supported runner like Dataflow. Using the Beam programming model with Dataflow gives you the flexibility to perform transformations on streams connecting your Kafka cluster to Pub/Sub or to create complex topologies like fan-out to multiple topics. For simple data movement especially when using an existing Connect cluster, however, the connectors offer a simpler experience requiring no development and low-operational overhead.No code is required to set up a data integration pipeline in Cloud Data Fusion between Kafka and Pub/Sub, thanks to plugins that support all three products. Like a Beam pipeline that must execute somewhere, a Data Fusion pipeline needs to execute on a Cloud Dataproc cluster. It is a valid option most suitable for Cloud-native data practitioners who prefer drag-and-drop option in a GUI and who do not manage Kafka clusters directly. If you do manage Kafka clusters already, you may prefer a native solution, i.e., deploying the connector directly into a Kafka Connect cluster between your sources/sinks and your Kafka cluster, for more direct control.To give the Pub/Sub connector a try, head over to the how-to guide.

Read More  Developing And Securing A Platform For Healthcare Innovation With Google Cloud

 


1. Dataflow compute cost. 2. autoscaling. 3. Cloud Data Fusion cost

 

By: Samarth Singal (Software Engineer) and Tianzi Cai (Developer Relations Engineer)
Source: Google Cloud Blog


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Data Analytics
  • Google Cloud
  • Kafka
  • Pub/Sub
You May Also Like
View Post
  • Engineering

Just make it scale: An Aurora DSQL story

  • May 29, 2025
View Post
  • Engineering
  • Technology

Guide: Our top four AI Hypercomputer use cases, reference architectures and tutorials

  • March 9, 2025
View Post
  • Computing
  • Engineering

Why a decades old architecture decision is impeding the power of AI computing

  • February 19, 2025
View Post
  • Engineering
  • Software Engineering

This Month in Julia World

  • January 17, 2025
View Post
  • Engineering
  • Software Engineering

Google Summer of Code 2025 is here!

  • January 17, 2025
View Post
  • Data
  • Engineering

Hiding in Plain Site: Attackers Sneaking Malware into Images on Websites

  • January 16, 2025
View Post
  • Computing
  • Design
  • Engineering
  • Technology

Here’s why it’s important to build long-term cryptographic resilience

  • December 24, 2024
IBM and Ferrari Premium Partner
View Post
  • Data
  • Engineering

IBM Selected as Official Fan Engagement and Data Analytics Partner for Scuderia Ferrari HP

  • November 7, 2024

Stay Connected!
LATEST
  • 1
    Just make it scale: An Aurora DSQL story
    • May 29, 2025
  • 2
    Reliance on US tech providers is making IT leaders skittish
    • May 28, 2025
  • Examine the 4 types of edge computing, with examples
    • May 28, 2025
  • AI and private cloud: 2 lessons from Dell Tech World 2025
    • May 28, 2025
  • 5
    TD Synnex named as UK distributor for Cohesity
    • May 28, 2025
  • Weigh these 6 enterprise advantages of storage as a service
    • May 28, 2025
  • 7
    Broadcom’s ‘harsh’ VMware contracts are costing customers up to 1,500% more
    • May 28, 2025
  • 8
    Pulsant targets partner diversity with new IaaS solution
    • May 23, 2025
  • 9
    Growing AI workloads are causing hybrid cloud headaches
    • May 23, 2025
  • Gemma 3n 10
    Announcing Gemma 3n preview: powerful, efficient, mobile-first AI
    • May 22, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • Understand how Windows Server 2025 PAYG licensing works
    • May 20, 2025
  • By the numbers: How upskilling fills the IT skills gap
    • May 21, 2025
  • 3
    Cloud adoption isn’t all it’s cut out to be as enterprises report growing dissatisfaction
    • May 15, 2025
  • 4
    Hybrid cloud is complicated – Red Hat’s new AI assistant wants to solve that
    • May 20, 2025
  • 5
    Google is getting serious on cloud sovereignty
    • May 22, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.