Scale Your EDA Flows: How Google Cloud Enables Faster Verification

  • aster.cloud
  • December 23, 2020
  • 4 minute read

Companies modernize their infrastructure in the cloud for three main reasons: 1) to accelerate product delivery, 2) to reduce system downtime, and 3) to enable innovation. Chip designers with Electronic Design Automation (EDA) workloads share these goals and can benefit greatly from using the cloud.

Chip design and manufacturing involve several tools across the flow, with varied compute and memory footprints. Register Transfer Level (RTL) design and modeling is one of the most time-consuming steps in the design process, accounting for more than half the time needed in the entire design cycle. RTL designers use Hardware Description Languages (HDL) such as SystemVerilog and VHDL to create a design, which then goes through a series of tools. Mature RTL verification flows include static analysis (checks for design integrity without the use of test vectors), formal property verification (mathematically proving or falsifying design properties), dynamic simulation (test vector-based simulation of actual designs) and emulation (a complex system that imitates the behavior of the final chip, especially useful for validating the functionality of the software stack).



Dynamic simulation arguably takes up the most compute in any design team’s data center. We wanted to create an easy setup using Google Cloud technologies and open-source designs and solutions to showcase three key points:

  1. How simulation can be accelerated with more compute
  2. How verification teams can benefit from auto-scaling cloud clusters
  3. How organizations can effectively leverage the elasticity of cloud to build highly utilized technology infrastructure

[Figure: OpenPiton tile architecture (a) and chipset architecture (b). Source: OpenPiton: An Open Source Manycore Research Framework, Balkind et al.]

We did this using the OpenPiton design verification scripts, the Icarus Verilog simulator, the SLURM workload management solution and Google Cloud standard compute configurations.

  • OpenPiton is the world’s first open-source, general-purpose, multithreaded manycore processor and framework. Developed at Princeton University, it is portable and scales up to 500 million cores. It’s popular within the research community and comes with scripts for performing the typical steps in the design flow, including dynamic simulation, logic synthesis and physical synthesis.
  • Icarus Verilog, sometimes known as iverilog, is an open-source Verilog simulation and synthesis tool.

Simple Linux Utility for Resource Management, or SLURM, is an open-source, fault-tolerant and highly scalable cluster management and job scheduling system for Linux clusters. SLURM provides functionality such as enabling user access to compute nodes, managing a queue of pending work, and a framework for starting and monitoring jobs. Auto-scaling of a SLURM cluster refers to the capability of the cluster manager to spin up nodes on demand and shut them down automatically after jobs are completed.
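The auto-scaling idea can be illustrated with a toy model. The sketch below is not SLURM's actual scheduling logic; the function name `desired_dynamic_nodes` and the one-job-per-node default are assumptions made purely for illustration. It only shows the principle: nodes are requested while work exists, capped at a configured maximum, and the count falls back to zero when the queue is empty.

```python
import math


def desired_dynamic_nodes(pending_jobs: int, running_jobs: int,
                          max_dynamic_nodes: int, jobs_per_node: int = 1) -> int:
    """Toy autoscaling rule (hypothetical, not SLURM's real algorithm).

    Keep enough nodes alive to cover all pending and running jobs,
    never exceeding the configured maximum.
    """
    if pending_jobs < 0 or running_jobs < 0:
        raise ValueError("job counts must be non-negative")
    if jobs_per_node < 1 or max_dynamic_nodes < 0:
        raise ValueError("invalid cluster configuration")
    demand = pending_jobs + running_jobs
    needed = math.ceil(demand / jobs_per_node)
    return min(needed, max_dynamic_nodes)


# A large queue saturates the cluster, a small one uses only a few nodes,
# and an empty queue lets every dynamic node shut down.
print(desired_dynamic_nodes(pending_jobs=46, running_jobs=0, max_dynamic_nodes=10))
print(desired_dynamic_nodes(pending_jobs=0, running_jobs=3, max_dynamic_nodes=10))
print(desired_dynamic_nodes(pending_jobs=0, running_jobs=0, max_dynamic_nodes=10))
```

A real cluster manager also accounts for node boot time and idle timeouts before shutting nodes down, which this sketch ignores.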

[Figure: SLURM components. Source: slurm.schedmd.com/quickstart.html]

 

Setup

We used a very basic reference architecture for the underlying infrastructure. While simple, it was sufficient to achieve our goals. We used standard N1 machines (n1-standard-2, with 2 vCPUs and 7.5 GB of memory) and set up the SLURM cluster to auto-scale to 10 compute nodes. The reference architecture is shown here. All required scripts are provided in this GitHub repo.

[Figure: reference architecture]

 

Running the OpenPiton regression

The first step in running the OpenPiton regression is to complete the setup steps outlined in the GitHub repo.

The next step is to download the design and verification files. Instructions are provided in the GitHub repo. Once downloaded, there are three simple setup tasks to perform:

  1. Set up the PITON_ROOT environment variable (%export PITON_ROOT=<location of root of OpenPiton extracted files>)
  2. Set up the simulator home (%export ICARUS_HOME=/usr). The scripts provided in the GitHub repo already take care of installing Icarus on the provisioned machines. This shows yet another advantage of cloud: simplified machine configuration.
  3. Finally, source your required settings (%source $PITON_ROOT/piton/piton_settings.bash)

For the verification run, we used the single tile setup for OpenPiton, the regression script ‘sims’ provided in the OpenPiton bundle and the ‘tile1_mini’ regression. We tried two runs—sequential and parallel. The parallel runs were managed by SLURM.


We invoked the sequential run using the following command:

%sims -sim_type=icv -group=tile1_mini

And the distributed run using this command:

%sims -sim_type=icv -group=tile1_mini -slurm -sim_q_command=sbatch
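The two invocations differ only in the SLURM-related flags. As a minimal sketch, a wrapper script could assemble either command from one helper. The helper name `build_sims_command` is hypothetical; the flags are exactly those shown above, and the sketch assumes `sims` is on the PATH after sourcing `piton_settings.bash`.

```python
from typing import List


def build_sims_command(group: str, distributed: bool = False) -> List[str]:
    """Assemble the `sims` argument list (hypothetical helper).

    Only flags that appear in the commands above are used.
    """
    cmd = ["sims", "-sim_type=icv", f"-group={group}"]
    if distributed:
        # Hand each test to SLURM through sbatch instead of running it locally.
        cmd += ["-slurm", "-sim_q_command=sbatch"]
    return cmd


print(" ".join(build_sims_command("tile1_mini")))
print(" ".join(build_sims_command("tile1_mini", distributed=True)))
```

The resulting list can be passed to `subprocess.run` from a shell session in which the OpenPiton settings have already been sourced.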

 

Results

The ‘tile1_mini’ regression has 46 tests. Running all 46 tests sequentially took an average of 120 minutes. The parallel run with 10 auto-scaled SLURM nodes completed in 21 minutes, a speedup of roughly 6X (120/21 ≈ 5.7).
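The arithmetic behind the speedup figure is short enough to check directly. The sketch below uses only the run times and node count reported above; "parallel efficiency" (speedup divided by node count) is a standard derived metric rather than a number from the original measurements.

```python
sequential_minutes = 120  # all 46 tests run back to back
parallel_minutes = 21     # same tests on 10 auto-scaled SLURM nodes
nodes = 10

speedup = sequential_minutes / parallel_minutes
efficiency = speedup / nodes  # fraction of the ideal 10x actually achieved

print(f"speedup: {speedup:.1f}x")                 # speedup: 5.7x
print(f"parallel efficiency: {efficiency:.0%}")   # parallel efficiency: 57%
```

The gap between 5.7x and the ideal 10x is expected: dynamic nodes take time to start, and 46 tests do not divide evenly across 10 nodes.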

[Figure: View of VM instances on GCP console; node instances edafarm-compute0-<0-9> are created when the regression is launched]

[Figure: View of VM instances on GCP console when the regression was winding down; notice that the number of nodes has decreased]

Further, we wanted to highlight the value of autoscaling. The SLURM cluster was set up with two static nodes and 10 dynamic nodes. The dynamic nodes were up and active soon after the distributed run was invoked. Because nodes are shut down when they have no jobs, the cluster scaled back to 0 dynamic nodes after the run was complete. The additional cost of the dynamic nodes for the duration of the simulation was $8.46.

[Figure: Report generated to view compute utilization of SLURM nodes; notice the high utilization of the top 5 nodes]

[Figure: The cost of the extra compute can also be easily viewed in the several existing reports on GCP console]

The above example shows a simple regression run on very standard machines. Scaling to more than 10 machines can further improve turnaround time. In real life, it is common for project teams to run millions of simulations. With access to elastic compute capacity, you can dramatically reduce verification time and shave months off verification sign-off.
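To get a feel for how turnaround time might change with more machines, here is an idealized estimate. It rests on assumptions that are not in the measurements above: every test takes the same time (120 minutes / 46 tests), each node runs one test at a time, and node startup and scheduling overhead are ignored. The function name `ideal_makespan_minutes` is hypothetical.

```python
import math

TESTS = 46                 # size of the tile1_mini regression
SEQUENTIAL_MINUTES = 120   # measured sequential run time


def ideal_makespan_minutes(nodes: int) -> float:
    """Lower-bound run time under the equal-duration assumption.

    Tests run in "waves" of at most `nodes` tests; each wave lasts
    one average test duration.
    """
    if nodes < 1:
        raise ValueError("need at least one node")
    avg_test_minutes = SEQUENTIAL_MINUTES / TESTS
    waves = math.ceil(TESTS / nodes)
    return waves * avg_test_minutes


for n in (1, 10, 23, 46):
    print(f"{n:>2} nodes: {ideal_makespan_minutes(n):.1f} min")
```

Under these assumptions, 10 nodes would need about 13 minutes, so the measured 21 minutes includes roughly 8 minutes of overhead and imbalance; beyond 46 nodes this small regression gains nothing further, because no test can be split across nodes.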

Other considerations

Typical simulation environments use commercial simulators that extensively leverage multi-core machines and large compute farms. On Google Cloud infrastructure, it’s possible to build many different machine types (often referred to as “shapes”) with various numbers of cores, disk types, and memory. Further, while a simulation can only tell you whether the simulator ran successfully, verification teams have the subsequent task of validating the results of a simulation. Elaborate infrastructure that captures results across simulation runs, and provides follow-up tasks based on findings, is an integral part of the overall verification process. You can use Google Cloud solutions such as Cloud SQL and Bigtable to create a high-performance, highly scalable and fault-tolerant simulation and verification environment. You can also use solutions such as AutoML Tables to infuse ML into your verification flows.


Interested? Try it out!

All the required scripts are publicly available, and no cloud experience is necessary to try them out. Google Cloud provides everything you need, including free Google Cloud credits to get you up and running. Click here to learn more about high performance computing (HPC) on Google Cloud.

 

Source: Google Cloud, by Sashi Obilisetty (Chief Architect, Silicon Solutions) and Mark Mims (Solutions Architect, Big Data)

