Editor’s notes: In this guest blog, we have the pleasure of inviting Alok Pareek, Founder & EVP Products, at Striim to share latest experimental results from a performance study on real-time data integration from Oracle to Google Cloud BigQuery using Striim.

Relational databases like Oracle are designed to store data, but they aren’t well suited for supporting analytics at scale. Google Cloud BigQuery is a serverless, scalable cloud data warehouse that is ideal for analytics use cases. To ensure timely and accurate analytics, it is essential to be able to continuously move data streams to BigQuery with minimal latency.

Partner with aster.cloud
for your next big idea.
Let us know here.

From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.

CYBERPOGO.COM :: For the Arts, Sciences, and Technology.

DADAHACKS.COM :: Parenting For The Rest Of Us.

ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.

TAKUMAKU.COM :: For The Hearth And Home.

ASTER.CLOUD :: From The Cloud And Beyond.

LIWAIWAI.COM :: Intelligence, Inside and Outside.

GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.

FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.

ASTERCASTER.COM :: Supra Astra. Beyond The Stars.

BARTDAY.COM :: Prosperity For Everyone.

The best way to stream data from databases to BigQuery is through log-based Change Data Capture (CDC). Log-based CDC works by directly reading the transaction logs to collect DML operations, such as inserts, updates, and deletes. Unlike other CDC methods, log-based CDC provides a non-intrusive approach to streaming database changes that puts minimal load on the database.

Striim — a unified real-time data integration and streaming platform — comes with out-of-the-box log-based CDC readers that can move data from various databases (including Oracle) to BigQuery in real-time. Striim enables teams to act on data quickly, producing new insights, supporting optimal customer experiences, and driving innovation.

In this blog post, we will outline experimental results cited in Striim’s recent white paper, Real-Time Data Integration from Oracle to Google BigQuery: A Performance Study.

Building a Data Pipeline from Oracle to Google BigQuery with Striim: Components

We used the following components to build a data pipeline to move data between an Oracle database to BigQuery in real time:Oracle CDC AdaptersA Striim adapter is a process that connects the Striim platform to a specific type of external application or file. Adapters enable various data sources to be connected to target systems with streaming data pipelines for real-time data integration.Striim comes with two Oracle CDC adapters to help manage different workloads.

LogMiner-based Oracle CDC Reader uses Oracle LogMiner to ingest database changes on the server side and replicate them to the streaming platform. This adapter is ideal for low and medium workloads.
OJet adapter uses a high-performance log mining API to support high volumes of database changes on the source and replicate them in real time. This adaptor is ideal for high volume high throughput CDC workloads.

With two types of Oracle adapters to choose from, when is it advisable to use one over the other?

Our results show that if your DB workload profile is between 20GB and 80GB of CDC data per hour, the LogMiner based Oracle CDC reader is a good choice. If you work with a higher amount of data, then the OJet adapter is better; currently, it’s the fastest Oracle CDC Reader available. Here’s a table and chart that shows the latency (read-lag) for both adapters:

BigQuery WriterStriim’s BigQuery Writer is designed to save time and storage; it takes advantage of partitioned tables on the target BigQuery system and supports partition pruning in its merge queries.Database WorkloadFor our experiment, we used a custom-built, high-scale database workload simulation. This workload, SwingerMultiOps, is based on Swingbench — a popular workload for Oracle databases. It’s a multithreaded JDBC (Java Database Connectivity) application that generates concurrent DB sessions against the source database. We took the Order Entry (OE) schema of the Swingbench workload. In SwingerMultiOps, we continued to add more tables until we reached a total of 50 tables. Each of these tables comprised of varying data types.

Building the Data Pipeline: Steps

We built the data pipeline for our experiment following these steps:

1. Configure the source database and profile the workload

Striim’s Oracle adapters connect to Oracle server instances to mine for redo data. Therefore it’s important to have the source database instance tuned for optimum redo mining performance. Here’s what you need to keep in mind about the configuration:

Profile the DB workload to measure the load it generates on the source database
Redo log sizes to a reasonably large value of 2G per log group
For the OJet adapter, set a large size for the DB streams_pool_size to mine redo as quickly as possible
For an extremely high CDC data rate of around 150 Gb/hour, set streams_pool_size to 4G

2. Configure the Oracle adapter

For both adapters, default settings are enough to get started. The only configuration required is to set the DB endpoints to read data from the source database. Based on your need, you can use Striim to perform any of the following:

Handle large transactions
Read and write data to a downstream database
Mine from a specific SCN or timestamp

Regardless of which Oracle adapter you choose, only one adapter is needed to collect all data streams from the source database. This practice helps to cut the overhead incurred by both adapters.

3. Configure the BigQuery Writer

Use BigQuery Writer to configure how your data moves from source to database. For instance, you can set your writers to work with a specified dataset to move large amounts of data in parallel.

For performance improvement, you can use multiple BigQuery writers to integrate incoming data in parallel. Using a router ensures that events are distributed such that a single event isn’t sent to multiple writers.

Tuning the number of writers and their properties helps to ensure that data is moved from Oracle to BigQuery in real time. Since we’re dealing with large volumes of incoming streams, we configure 20 BigQuery Writers in our experiment. There are many other BigQuery Writer properties that can help you to move and control data. You can learn about them in detail here.

How to Execute the Striim App and Analyze Results

We used a Google BigQuery dataset to run our data integration infrastructure. We performed the following tasks to run our simulation and capture results for analysis:

Start the Striim app on the Striim server
Start monitoring our app components using the Tungsten Console by passing a simple script
Start the Database Workload
Capture all DB events in the Striim app, and let the app commit all incoming data to the BigQuery target
Analyze the app performance

The Striim UI image below shows our app running on the Striim server. From this UI, we can monitor the app throughput and latency in real time.

Results Analysis: Comparing the Performance of two Oracle Readers

At the end of the DB workload run, we looked at our captured performance data and analyzed the performance. Details are tabulated below for each of the source adapter types.

*LEE => Lag End-to-End

The charts below show how the CDC reader lag varies with the input rate as the workload progresses on the DB server.Lag chart for Oracle Reader:

Lag chart for OJet Reader:

Use Striim to Move Data in Real Time to Google Cloud BigQuery

This experiment showed how to use Striim to move large amounts of data in real time from Oracle to BigQuery. Striim offers two high-performance Oracle CDC readers to support data streaming from Oracle databases. We demonstrated that Striim’s OJet Oracle reader is optimal for larger workloads, as measured by read-lag, end-to-end lag, and CPU and memory utilization. For smaller workloads, Striim’s LogMiner-based Oracle reader offers excellent performance.For more in-depth information, please refer to the white paper, check out a demo, Striim’s Marketplace listing or contact Striim.

By: Sudhir Hasbe (Sr. Director of Product Management, Google Cloud) and Alok Pareek (Founder & EVP Products, Striim)
Source: Google Cloud Blog

For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

Real-Time Data Integration From Oracle To Google BigQuery Using Striim

From our partners:

Building a Data Pipeline from Oracle to Google BigQuery with Striim: Components

Building the Data Pipeline: Steps

How to Execute the Striim App and Analyze Results

Results Analysis: Comparing the Performance of two Oracle Readers

Use Striim to Move Data in Real Time to Google Cloud BigQuery

For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

aster.cloud

Related Topics

The Summer Adventures : Camping Essentials

Host a static website on AWS with Amazon S3 and Route 53

Prioritize security from the edge to the cloud

6 edge monitoring best practices in the cloud

AlphaGenome: AI for better understanding the genome

Pure Accelerate 2025: All the news and updates live from Las Vegas

‘This was a very purposeful strategy’: Pure Storage unveils Enterprise Data Cloud in bid to unify data storage, management

What is cloud bursting?

There’s a ‘cloud reset’ underway, and VMware Cloud Foundation 9.0 is a chance for Broadcom to pounce on it

What is confidential computing?

Most Popular

Oracle adds xAI Grok models to OCI

Fine-tune your storage-as-a-service approach

Advanced audio dialog and generation with Gemini 2.5

Google Cloud, Cloudflare struck by widespread outages

Global cloud spending might be booming, but AWS is trailing Microsoft and Google

Real-Time Data Integration From Oracle To Google BigQuery Using Striim

From our partners:

Building a Data Pipeline from Oracle to Google BigQuery with Striim: Components

Building the Data Pipeline: Steps

How to Execute the Striim App and Analyze Results

Results Analysis: Comparing the Performance of two Oracle Readers

Use Striim to Move Data in Real Time to Google Cloud BigQuery

For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Related Topics

You May Also Like