
Ingestion As A Service: How Tyson Foods Reimagined Their Data Platform

  • relay
  • March 10, 2022
  • 4 minute read

As data environments become more complex, companies are turning to streaming analytics solutions that analyze data as it’s ingested and deliver immediate, high-value insights into what is happening now. These insights enable decision makers to act in real time to take advantage of opportunities or respond to issues as they occur.

While understanding what is happening now has great business value, forward-thinking companies are taking things a step further, using real-time analytics integrated with artificial intelligence (AI) and business intelligence (BI) to answer the question, “what might happen in the future?” Arkansas-based Tyson Foods has embraced AI/BI analytics to enable predictive insights that unlock new opportunities and drive future growth.

Creating a digital twin for company-wide connected intelligence

Before using AI/BI, Tyson’s analytics capabilities consisted of traditional BI solutions focused on KPIs and simplifying data so that humans could understand it. Tyson wanted to leverage its data to uncover ways to improve current processes and grow its business. But with BI alone, Tyson struggled to use data to run the simulations and scenarios essential to make educated decisions. To keep growing, it had to embrace the complexity of its data, building ways to analyze it and use it to inform decision making.

Tyson’s on-premises analytics solutions limited its ability to be aggressive and make intelligent, timely, prescriptive decisions. The solution was to create a digital twin to scale optimizations within business processes, moving from local optimizations to system-wide connected optimizations. Doing so meant shifting entirely to cloud computing, with an initial focus on building the ingestion component of the digital twin platform.

Investing in a digital twin enabled Tyson to accelerate new capabilities like supply chain simulation “what-if” scenarios, prescriptive price elasticity recommendations, and improvement of customer intimacy.

Solving the ingestion problem for faster time to insights

Before its migration to Google Cloud, Tyson’s analytics projects suffered from uncertainty over how to obtain the data. The problem was pervasive and extended project timelines by weeks or even months, because teams had to write and support one-off data ingestion processes at the front end. It also prevented the IT team from delivering analytics solutions fast enough for the business to take full advantage of them.

To solve this analytics problem, the team created Data Ingestion Compute Engine (DICE). DICE is a Google Cloud-hosted, open-source, cloud-native ingestion platform developed to provide configuration-based, no-ops, code-free ingestion from disparate enterprise data systems, both internal and external. It is centered on three high-level goals:

  1. Accelerate the speed of delivery of IT analytics solutions
  2. Enable growth of IT capabilities to produce meaningful insight
  3. Reduce long-term total cost of ownership for ingestion solutions

Creating the DICE ingestion platform with Google Cloud services

Teams use DICE to set up secure data ingestion jobs in minutes without having to manage complex connections or write, deploy, and support their own code. DICE provides unbounded scale and highly parallel processing, supports DevSecOps practices, builds on open source, and enables the implementation of a Lambda data architecture.

A DICE job is the logical unit of work in the DICE platform. It consists of immutable and mutable configurations persisted as JSON documents in Firestore. The job serves as an instruction set for the DICE data engine, which is Apache Beam running on Dataflow: it tells the engine which data to pull, how to pull it, how often to pull it, how to process it, how to tell when it changes, and where to direct it.
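To make the idea of a job as a configuration document more concrete, here is a minimal sketch in Python. The article does not publish the DICE schema, so every class and field name below (`ImmutableConfig`, `MutableConfig`, `schedule_cron`, and so on) is a hypothetical stand-in, and the JSON is only built in memory rather than written to Firestore.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass(frozen=True)
class ImmutableConfig:
    # Hypothetical fields that are fixed once the job is created.
    job_id: str
    source_type: str    # e.g. "postgres"
    source_object: str  # e.g. a table name
    target_type: str    # e.g. "bigquery"


@dataclass
class MutableConfig:
    # Hypothetical fields that may change over the job's lifetime.
    schedule_cron: str = "0 */6 * * *"
    enabled: bool = True


@dataclass
class IngestionJob:
    immutable: ImmutableConfig
    mutable: MutableConfig = field(default_factory=MutableConfig)

    def to_document(self) -> str:
        """Serialize the job to a JSON document, as it might be persisted."""
        return json.dumps(asdict(self), sort_keys=True)


job = IngestionJob(
    ImmutableConfig("orders-daily", "postgres", "public.orders", "bigquery")
)
job.mutable.schedule_cron = "0 2 * * *"  # allowed: this section is mutable
doc = json.loads(job.to_document())
print(json.dumps(doc, indent=2))
```

Freezing the immutable section means an accidental reassignment such as `job.immutable.job_id = "x"` raises an error, which is one simple way to keep a job's identity stable while its schedule stays adjustable.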

Two of DICE’s primary layers are the metadata engine and the data engine. The metadata engine is responsible for creating and managing DICE job configuration and orchestration. It is made up of many microservices, including the job configuration creation API, the job build configuration helper API, and the job execution scheduler API, which interact with multiple Google Cloud services.

The data engine is responsible for the physical ingestion of data, the change detection processing of that data, and the delivery of that data to specified targets. The data engine is Java code that uses the Apache Beam unified programming model and runs in Dataflow. It is composed of streaming jobs and Dataflow Flex Template batch jobs. Logically, the data engine is segmented into three layers: the inbound processing layer, the DICE file system layer, and the target processing layer, which takes the data from the DICE file system and moves it to targets.
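The article does not say how DICE detects changes. One common approach, shown here purely as an illustrative assumption and not as Tyson's implementation, is to fingerprint each row and compare it with the fingerprint stored from the previous run. The function names and the `id`/`qty` sample rows are hypothetical, and the sketch is plain Python rather than Beam code.

```python
import hashlib
import json


def row_fingerprint(row: dict) -> str:
    """Return a stable SHA-256 digest of a row's contents."""
    payload = json.dumps(row, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def detect_changes(previous: dict, incoming: list, key: str):
    """Compare incoming rows with the fingerprints kept from the last run.

    `previous` maps primary key -> fingerprint.
    Returns (inserted rows, updated rows, deleted keys, new state).
    """
    inserted, updated, state = [], [], {}
    for row in incoming:
        pk = row[key]
        digest = row_fingerprint(row)
        state[pk] = digest
        if pk not in previous:
            inserted.append(row)
        elif previous[pk] != digest:
            updated.append(row)
    deleted = sorted(set(previous) - set(state))
    return inserted, updated, deleted, state


# First run: nothing is known yet, so every row counts as an insert.
_, _, _, snapshot = detect_changes(
    {}, [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}], "id"
)

# Second run: row 1 disappeared, row 2 changed, row 3 is new.
inserted, updated, deleted, snapshot = detect_changes(
    snapshot, [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}], "id"
)
```

Only the inserted and updated rows would then need to move on to the target layer, which keeps repeated runs cheap when most of the source data is unchanged.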

 

Rolling DICE for thousands of ingestion jobs each day

DICE was first deployed to a production environment in November 2019, and just two years later, it has more than 3,000 data ingestion jobs from more than a hundred disparate data systems, both internal and external to Tyson Foods. Most of these jobs run multiple times a day. On a daily basis the DICE environment sees more than 25,000 Dataflow jobs running and an average of 3.25 terabytes of new data being ingested.
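A quick back-of-the-envelope calculation puts these figures in proportion. It treats the published "more than" numbers as exact and assumes that each Dataflow job corresponds to one run of an ingestion job, which the article does not state, so the results are rough averages only.

```python
# Figures quoted in the article (each is stated as "more than" or "an average").
ingestion_jobs = 3_000
dataflow_runs_per_day = 25_000
new_data_tb_per_day = 3.25

# Assumption: one Dataflow job equals one run of one ingestion job.
runs_per_job_per_day = dataflow_runs_per_day / ingestion_jobs

# Decimal units: 1 TB = 1,000,000 MB.
mb_per_run = new_data_tb_per_day * 1_000_000 / dataflow_runs_per_day

print(f"about {runs_per_job_per_day:.1f} runs per ingestion job per day")
print(f"about {mb_per_run:.0f} MB of new data per run on average")
```

Under those assumptions the averages come out to roughly eight runs per job per day and about 130 MB of new data per run, which is consistent with the statement that most jobs run multiple times a day.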

DICE @ Tyson Platform in Numbers

 

DICE supports ingestion from many different types of technologies, including BigQuery, SQL Server, SAP HANA, Postgres, Oracle, MySQL, Db2, various types of file systems, and FTP servers. As targets for ingestion jobs, DICE supports multiple JDBC targets, multiple file system targets, BigQuery, and queue-based store-and-forward technologies.

The platform continues to see linear growth of DICE jobs, all while keeping platform costs relatively flat. With increasing demand for the platform, Tyson’s IT team is constantly enhancing DICE to support new sources and targets.

This intelligent platform keeps adding new value and makes it simple for Tyson to take advantage of its data. This innovation is a necessity in this fast-changing world of digital business in which companies must transform a high volume of complex data into actionable insight.

 

 

By: Nathan Marks (Staff Engineer, Tyson) and Sagar Kewalramani (Customer Engineer, Google)
Source: Google Cloud Blog

Related Topics
  • Consumer Packaged Goods
  • Data
  • DICE
  • Google Cloud
  • Ingestion as a Service