Providing Open Access To The Genome Aggregation Database (gnomAD) On Google Cloud

We are excited to announce a collaboration between Google Cloud Healthcare & Life Sciences and the Broad Institute of MIT and Harvard to provide free access to one of the world’s most comprehensive public genomic datasets, the Genome Aggregation Database (gnomAD).

gnomAD brings together data from numerous large-scale sequencing projects, including population and disease-specific genetic studies. With more than 241 million unique short human genetic variants and 335,000 structural variants observed in more than 141,000 healthy adult individuals across a diverse range of genetic ancestry groups, this dataset is a near-ubiquitous resource for human genetics research and clinical variant interpretation. It is used in clinical genetic diagnostic pipelines worldwide.

gnomAD data is hosted in several formats to address a broad range of biomedical and healthcare use cases. It is available as Hail-formatted tables and Variant Call Format (VCF) files in Google Cloud Storage, and in BigQuery as part of the Public Datasets Program. Users receive 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Google Cloud users can securely access the data in any of these formats from their bioinformatics pipelines in any Google Cloud region without paying egress charges.

To make gnomAD available in BigQuery, the Google Cloud team used Variant Transforms to ingest the VCF files. The ingested variants were then sharded so that the output tables are split by chromosome. In addition, we used integer range partitioning and clustering to reduce the cost of queries. This work enables researchers to explore gnomAD quickly and efficiently, without needing to request or pay for dedicated cloud compute resources. A query restricted to a small targeted genomic region is expected to cost significantly less than a query over the whole dataset. Partners and customers such as the Mayo Clinic and Color Genomics have used Variant Transforms in the same way to accelerate their genomics research. More information on using gnomAD in BigQuery is available in this tutorial.
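As a rough illustration of a region-restricted query, the Python sketch below builds a BigQuery Standard SQL string that filters one per-chromosome table on a position range. It is a sketch under stated assumptions, not the team's own code: the table name `bigquery-public-data.gnomAD.v2_1_1_exomes__chr17`, the column names (`reference_name`, `start_position`, `end_position`, `reference_bases`), and the idea that `start_position` is the partitioning column are assumptions based on typical Variant Transforms output and should be checked against the actual schema. The helper `build_region_query` and the coordinates are hypothetical.

```python
def build_region_query(table: str, start: int, end: int, limit: int = 10) -> str:
    """Build a SQL string selecting variants whose start lies in [start, end].

    Filtering on the position column is what lets BigQuery skip
    partitions outside the region (assuming the table is partitioned
    on that column, which is an assumption here).
    """
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return (
        # Column names are assumed, not confirmed by the article.
        "SELECT reference_name, start_position, end_position, reference_bases\n"
        f"FROM `{table}`\n"
        f"WHERE start_position BETWEEN {int(start)} AND {int(end)}\n"
        "ORDER BY start_position\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Assumed table name and arbitrary illustrative coordinates.
    table = "bigquery-public-data.gnomAD.v2_1_1_exomes__chr17"
    print(build_region_query(table, 41_000_000, 41_100_000))
```

The printed string could be pasted into the BigQuery console or passed to a BigQuery client. Note that the `LIMIT` clause only trims the result set; any cost reduction would come from the `WHERE` filter on the position column.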

The data in the Google Cloud Storage bucket also includes standard truth sets used to assess and validate variant calls, data from the Broad Institute’s papers in Nature, interval lists, and other annotation resources.

To access gnomAD on Google Cloud, explore the documentation here. Files can also be browsed and downloaded using the Cloud Console or the command-line tool gsutil. After installing gsutil, start browsing with:

$ gsutil ls gs://gcp-public-data--gnomad

Explore additional Healthcare and Life Sciences dataset offerings on Google Cloud here.

By Johanna Katz, Grace Tiao

Source https://cloud.google.com/blog/topics/healthcare-life-sciences/google-cloud-providing-free-access-to-genome-aggregation-database
