aster.cloud aster.cloud
  • Computing
  • Technology

Google Cloud and NVIDIA expand AI innovation across industries at GTC 2026

  • aster.cloud
  • March 17, 2026
  • 7 minute read

The era of agentic AI is fundamentally changing enterprise infrastructure needs. As organizations build systems capable of dynamic reasoning and autonomous execution, the underlying infrastructure must evolve as well. Scaling these agentic workloads alongside massive mixture-of-experts (MoE) architectures demands a deeply optimized, co-engineered stack.

To meet these demands, we’ve built the Google Cloud AI Hypercomputer, an AI-optimized infrastructure as a service that integrates performance-optimized hardware, leading software, open frameworks, and flexible consumption models into a single, cohesive system. It is designed to deliver ultra-low-latency, high-throughput, and cost-effective inference. To give our customers even more options within this integrated architecture, we are expanding our partnership with NVIDIA.



This week at NVIDIA GTC 2026, Google Cloud and NVIDIA are expanding our partnership with a wave of new announcements, showcasing a co-engineered AI infrastructure foundation:

  • Infrastructure and hardware
    • Strong momentum for Google Cloud G4 VMs, powered by NVIDIA RTX PRO™ 6000 Blackwell Server Edition
    • Preview of flexible, fractional G4 VMs using NVIDIA vGPU technology, an industry first for NVIDIA RTX PRO™ 6000 Blackwell Server Edition
    • Upcoming support for NVIDIA Vera Rubin NVL72 Platform
  • Software and platform
    • NVIDIA Dynamo integration with GKE Inference Gateway
    • Enhanced NVIDIA support across Vertex AI Training and Model Garden
  • Ecosystem
    • Launch of a dedicated public sector AI startup accelerator program

Let’s take a closer look at the announcements.

Accelerating AI workloads with G4 VMs

G4 VMs, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, are built for a diverse spectrum of high-performance workloads, from advanced spatial computing to complete AI development lifecycles. For instance, companies like Otto Group One.O and WPP use G4 VMs to run physically accurate simulations and real-time 3D rendering at scale.

Beyond simulation, the G4 also shines in model fine-tuning and inference, particularly for models ranging from 30B to more than 100B parameters. By leveraging 4-bit floating point (FP4) precision and Google’s peer-to-peer (P2P) communication, customers are achieving higher throughput for model serving and considerable latency reductions, enabling a new class of real-time, multimodal AI agents and highly responsive generative AI applications.
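As a rough illustration of why FP4 matters for models in this size range, the sketch below estimates the memory needed just to hold model weights at different precisions. It is back-of-the-envelope arithmetic, not a sizing tool: the function name is hypothetical, and the estimate ignores KV cache, activations, and the scale factors that real quantization formats store alongside the weights.

```python
# Approximate weight storage per precision. This is an illustrative
# estimate only: it ignores KV cache, activations, and quantization
# metadata such as per-block scale factors.
BITS_PER_PARAM = {"fp16": 16, "fp8": 8, "fp4": 4}


def weight_gb(num_params: float, precision: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    bits = BITS_PER_PARAM[precision]
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    for params in (30e9, 100e9):
        sizes = {p: weight_gb(params, p) for p in BITS_PER_PARAM}
        print(f"{params / 1e9:.0f}B parameters: {sizes}")
```

Under these assumptions, a 30B-parameter model needs about 15 GB for weights at FP4 versus about 60 GB at FP16, and a 100B-parameter model needs about 50 GB versus about 200 GB.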

Here are some examples of how customers are already leveraging the performance and efficiency of G4 VMs to accelerate their most demanding workloads:

“Google Cloud’s G4 VMs give us the scalable GPU backbone we need to push billions of miles of photorealistic simulation through our pipeline. The 4x lift in throughput means our ML teams can iterate faster, train on richer data, and validate edge cases long before our models ever see the real world.” – Sony Mohapatra, Director, AI/ML Engineering, General Motors

“Now with G4 VMs powered by NVIDIA Blackwell, we’re pushing our multimodal models even further — faster inference, better reliability, instant replies across languages. The goal stays the same: making voice agents that work at enterprise scale without compromise. We are excited to keep building together and see what our customers deploy with this.” – Mati Staniszewski, Cofounder, ElevenLabs

“Google Cloud G4 VMs provide the computational backbone for our Robotic Coordination Layer, allowing us to synchronize autonomous fleets across our logistics centers with millisecond precision. By simulating complex warehouse environments in a high-fidelity digital twin, we can optimize our entire supply chain virtually before a single robot moves on the floor.” – Dr. Stefan Borsutzky, CEO of Otto Group One.O

“After transitioning to G4 VMs, we achieved a 50% reduction in processing latency and 6x increase in throughput just by updating our Terraform scripts. It’s rare to get that kind of performance boost for our core workloads without adding any operational overhead.” – Alfonso Acosta, Head of Engineering, Imgix

Introducing fractional G4 VMs 

We are excited to announce the preview of fractional G4 VMs, providing a highly efficient and cost-effective entry point for AI and graphics workloads. These new configurations, using NVIDIA virtual GPU (vGPU) technology, allow you to leverage the power of the NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs in flexible, smaller increments, so you can right-size your infrastructure to match the specific demands of your applications.

“Enterprises need unprecedented flexibility to scale complex, agentic AI workloads. With Google Cloud, we’re introducing fractional G4 VMs powered by NVIDIA RTX PRO 6000 to let customers right‑size GPU capacity and maximize ROI. Together with our co‑engineered stack – from NVIDIA NeMo on Vertex AI to NVIDIA Dynamo with GKE – we’re delivering an open, high‑performance platform for next‑generation reasoning and MoE models.” – Ian Buck, VP / General Manager, Hyperscale and HPC, NVIDIA

By providing more granular access to advanced hardware, fractional G4 VMs let you optimize resource allocation and reduce overhead without sacrificing performance. You can now select from additional GPU slice sizes for your specific needs:

  • 1/2 GPU: Ideal for more intensive tasks such as LLM inference, robotics sensor simulation, and high-fidelity 3D rendering.
  • 1/4 GPU: Optimized for mainstream workloads, including mid-range creative design, video transcoding, and real-time data visualization.
  • 1/8 GPU: Great for lightweight applications such as remote desktops, productivity tools, and entry-level streaming services.

This flexible portfolio of G4 sizes lets you:

  • Right-size infrastructure: Precisely match GPU capacity to application demands, ranging from lightweight remote desktops to intensive data processing.
  • Maximize cost efficiency: Lower operational overhead by utilizing — and paying for — only the fractional GPU resources you need for specific tasks.
  • Scale diverse workloads: Power a broad spectrum of innovation, from high-fidelity creative design and streaming to complex robotics simulations and real-time inference.
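To make the right-sizing idea concrete, here is a minimal sketch that picks the smallest slice able to hold a workload's GPU memory footprint. It rests on two assumptions that should be checked against the G4 documentation: that a full GPU provides 96 GB of memory, and that each slice receives a proportional share of it. The function name is hypothetical and no Google Cloud API is involved.

```python
from fractions import Fraction

# Assumption: memory of one full GPU, in GB. Verify against the G4 docs.
FULL_GPU_MEMORY_GB = 96

# Slice sizes named in the announcement, plus a full GPU, smallest first.
SLICES = [Fraction(1, 8), Fraction(1, 4), Fraction(1, 2), Fraction(1, 1)]


def smallest_slice(required_gb: float) -> Fraction:
    """Return the smallest slice whose assumed memory share covers required_gb.

    Assumes memory is divided proportionally across slices, which is an
    illustrative simplification.
    """
    for gpu_slice in SLICES:
        if float(gpu_slice) * FULL_GPU_MEMORY_GB >= required_gb:
            return gpu_slice
    raise ValueError("workload does not fit on a single GPU")


if __name__ == "__main__":
    for need_gb in (10, 20, 40, 90):
        print(f"{need_gb} GB needed -> {smallest_slice(need_gb)} GPU")
```

Memory is only one dimension of right-sizing. A real choice would also weigh compute throughput, encoder capacity, and latency targets, which this sketch leaves out.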

These fractional G4 VMs can be managed by Google Kubernetes Engine (GKE), allowing developers to use advanced container binpacking to achieve even higher price-performance and resource utilization. When managed through Dynamic Workload Scheduler, you can set fallback priorities for fractional slices. This significantly improves obtainability by allowing the scheduler to automatically find available GPU configurations for each workload.
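The general idea behind binpacking can be sketched with a classic first-fit-decreasing heuristic, which places the largest requests first and opens a new unit of capacity only when nothing fits. This is not how GKE or Dynamic Workload Scheduler is implemented. It is a simplified illustration with a hypothetical function name, and it ignores real placement constraints such as which slice sizes may share a physical GPU.

```python
from fractions import Fraction


def pack_first_fit_decreasing(requests):
    """Pack fractional GPU requests into bins of capacity 1 (one GPU each).

    Returns a list of bins, where each bin is the list of requests
    placed on that GPU. Uses the first-fit-decreasing heuristic.
    """
    gpus = []
    for request in sorted(requests, reverse=True):
        for gpu in gpus:
            # Place the request on the first GPU with enough room left.
            if sum(gpu) + request <= 1:
                gpu.append(request)
                break
        else:
            # No existing GPU had room, so open a new one.
            gpus.append([request])
    return gpus


if __name__ == "__main__":
    # Hypothetical mix of workloads requesting 1/2, 1/4, and 1/8 slices.
    requests = [
        Fraction(1, 2), Fraction(1, 8), Fraction(1, 4), Fraction(1, 2),
        Fraction(1, 8), Fraction(1, 4), Fraction(1, 8),
    ]
    for index, gpu in enumerate(pack_first_fit_decreasing(requests)):
        print(f"GPU {index}: {[str(r) for r in gpu]} (used {sum(gpu)})")
```

In this example the seven requests total 15/8 of a GPU and fit on two GPUs, whereas giving each workload a dedicated full GPU would take seven.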

“The G4 vGPU’s flexible sizing allows us to precisely tailor compute resources to the scale of each molecular simulation, ensuring maximum efficiency across our drug discovery pipeline. This granular control means our researchers can seamlessly pivot between smaller workflows and massive parallel processing without being constrained by fixed hardware configurations.” – Shane Brauner, EVP, CIO, Schrödinger

Scaling AI Hypercomputer with NVIDIA Vera Rubin NVL72

Building on our deep engineering partnership with NVIDIA, we’re proud to support the successor to the NVIDIA Blackwell architecture, the recently announced NVIDIA Vera Rubin platform. We plan to be among the first cloud providers to offer NVIDIA Vera Rubin NVL72 rack-scale systems in the second half of 2026, integrating them into our AI Hypercomputer architecture to empower the next generation of reasoning and agentic AI.

Delivering efficiency across the AI infrastructure stack 

As part of our commitment to a fully open ecosystem, we are excited to announce the integration of NVIDIA Dynamo with GKE Inference Gateway. This integration provides a modular, open-source control plane that spans the application layer and the hardware. By combining Dynamo with Inference Gateway on GKE, teams can tailor their infrastructure to their exact needs, extract the maximum ROI from accelerators, accelerate time-to-market for new AI models, and future-proof their deployments.

You can learn to maximize performance for massive MoE architectures through new advanced scaling recipes for A4X VMs (powered by NVIDIA GB200 NVL72 and Dynamo). These configurations show how to overcome memory and interconnect bottlenecks when running AI inference workloads on AI Hypercomputer.

We are also enhancing resource obtainability through the Dynamic Workload Scheduler, with Calendar Mode and Flex Start for A4X and A4X Max (powered by NVIDIA GB300 NVL72), as well as new Flex Start support for G4 VMs. Dynamic Workload Scheduler lets you reserve the precise capacity that you need, or use flexible start windows. 

Snap, a long-time Google Cloud customer, achieved significant cost savings by migrating two of its primary data processing pipelines to Google Cloud G2 VMs powered by NVIDIA L4 Tensor Core GPUs. The migration used Spark on GKE alongside NVIDIA’s new cuDF libraries, which automatically tuned its shuffle-heavy workloads for GPU efficiency. Learn more at GTC session S81678.

Advancing Vertex AI training and Model Garden 

We are meeting the demands of next-generation AI with two major infrastructure advancements to Vertex AI training clusters. First, support for A4X VM domains lets you leverage Vertex AI’s managed infrastructure and framework capabilities for massive-scale training on NVIDIA GB200 NVL72 rack-scale systems. Second, to ensure these intensive workloads remain uninterrupted, new hardware resiliency capabilities let you apply configurable, proactive fault detection scans, which identify and mitigate potential hardware issues before they can disrupt critical “hero” training runs. These capabilities enable higher goodput and help ensure that multi-week training jobs stay on track without costly restarts.
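Goodput is commonly described as the share of wall-clock time that turns into retained training progress. The sketch below uses that simple definition to show how avoided restarts affect a long run. The definition, the function name, and every number in the example are illustrative assumptions, not measurements from Vertex AI.

```python
def goodput(total_hours: float, lost_hours: float) -> float:
    """Return the fraction of wall-clock time that produced kept progress.

    lost_hours covers time spent on restarts plus work redone since the
    last checkpoint. This is a simplified, illustrative definition.
    """
    if total_hours <= 0 or not 0 <= lost_hours <= total_hours:
        raise ValueError("expected 0 <= lost_hours <= total_hours, total_hours > 0")
    return (total_hours - lost_hours) / total_hours


if __name__ == "__main__":
    run_hours = 21 * 24  # hypothetical three-week training run
    hours_lost_per_failure = 6  # hypothetical restart plus recomputation cost

    for failures in (5, 1):
        lost = failures * hours_lost_per_failure
        print(f"{failures} failure(s): goodput = {goodput(run_hours, lost):.3f}")
```

With these hypothetical numbers, cutting disruptive failures from five to one raises goodput from roughly 94% to roughly 99% of the run.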

“We are setting a new standard for the agentic enterprise — delivering highly capable, consistent, accurate, and responsive AI agents with Google and NVIDIA. By leveraging Vertex AI training clusters on NVIDIA GB200 NVL72 to power our Agentforce 360 Platform, we’ve eliminated infrastructure bottlenecks to keep our GPUs fully saturated. This high-performance, resilient architecture allows our researchers to focus on innovation at scale, driving substantial gains for our most complex reasoning workloads.” – Silvio Savarese, Chief Scientist, Salesforce

At the same time, we continue to broaden Vertex AI Model Garden with support for NVIDIA’s Nemotron 3 family of open models. These include the Nemotron 3 Nano, featuring one-click deployment to simplify integration into private VPCs. We’ve also expanded our catalog to include the NVIDIA Nemotron 3 Super 120B model for immediate access to high-performance, large-scale reasoning. To maximize the value of these models, we’ve integrated NVIDIA’s latest performance libraries directly into Vertex AI to optimize popular open-source models on NVIDIA TensorRT-LLM. 

Empowering public sector AI startups 

To foster continued innovation within the ecosystem, Google Public Sector and NVIDIA are launching an AI startup accelerator program. This year-long initiative will support a select cohort of AI-focused Independent Software Vendors (ISVs) building solutions for the public sector.

Participants gain dual access to both NVIDIA Inception and Google Cloud’s ISV accelerator resources. Kicking off at GTC and continuing through Google Cloud Next, this joint program will equip emerging technology leaders with the co-engineered infrastructure, technical guidance, and go-to-market support required to scale mission-critical public sector applications. To learn more about the program, please complete the interest form. Additional cohorts will be selected and announced in the future.

Co-engineering collaboration powers every layer of the AI stack

The transition to complex, agentic AI demands more than just raw compute. It requires a fully optimized, co-engineered stack. By integrating flexible hardware like fractional G4 instances and the upcoming Vera Rubin platform into our AI Hypercomputer architecture, and pairing that hardware with deep software co-engineering, we provide the scale, resilience, and efficiency you need to turn your most ambitious AI visions into reality.

Coming to GTC? Stop by booth #513 to learn more and talk to our team. And you can always learn more about our collaboration with NVIDIA at cloud.google.com/NVIDIA.

Source: zedreviews.com

