
There Is No Upside To VM Colocation

  • aster.cloud
  • July 26, 2022
  • 4 minute read
Posted on July 13, 2022

Guest post originally published on the Clockwork blog

TL;DR:


  • Contrary to expectation, colocated VMs do not enjoy lower-latency connectivity to each other
  • On network links between colocated VMs, packet drops are just as likely as on non-colocated links
  • For maximum cloud system performance, VM colocation should be avoided

In the previous post of this series, we discussed how cloud providers’ VM placement algorithms end up placing multiple VMs on the same physical machine, and how the resulting VM colocation can impair network bandwidth.

In this second blog post, we investigate the impact of VM colocation on additional key network performance metrics, namely network latency and network packet drops.

Network Latency

Using Clockwork’s highly accurate clock synchronization, we can identify packets that do not incur any queueing delay on their path from the sender VM to the receiver VM, and measure the intrinsic networking latency. We use the raw measurements to compute two-way delays, defined as the sum of the one-way delay from source to destination and the one-way delay from destination back to source, while ensuring that neither of the two terms is affected by queueing delays. This measurement differs from round-trip time (ping time), which additionally includes turnaround processing time at the destination VM.
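The two-way delay defined above can be sketched in a few lines of Python. This is a minimal illustration, not Clockwork's implementation: it assumes every probe record already carries send and receive timestamps from synchronized clocks, and it approximates "no queueing delay" by taking the smallest one-way delay observed in each direction. The names `Probe`, `min_one_way_delay`, and `two_way_delay` are hypothetical, and the timestamps are made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Probe:
    """One probe packet with timestamps from synchronized clocks (microseconds)."""
    sent_us: float      # timestamp taken at the sender VM
    received_us: float  # timestamp taken at the receiver VM


def min_one_way_delay(probes: Iterable[Probe]) -> float:
    """Smallest observed one-way delay; assumed here to be free of queueing delay."""
    return min(p.received_us - p.sent_us for p in probes)


def two_way_delay(forward: Iterable[Probe], reverse: Iterable[Probe]) -> float:
    """Sum of the minimum one-way delays in both directions.

    The two directions are measured independently, so the result contains
    no turnaround processing time at the destination VM (unlike a ping).
    """
    return min_one_way_delay(forward) + min_one_way_delay(reverse)


# Illustrative values only, not measurements from the article.
a_to_b = [Probe(0.0, 48.0), Probe(100.0, 135.0), Probe(200.0, 262.0)]
b_to_a = [Probe(50.0, 90.0), Probe(150.0, 186.0)]
delay_us = two_way_delay(a_to_b, b_to_a)
print(delay_us)
```

Taking the minimum per direction is only one plausible way to filter out queued packets; it works in this sketch because any queueing can only add to the intrinsic delay.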

Since colocated VMs can communicate without using physical network links, it is reasonable to assume that their network latency is lower than in the non-colocated case. Let’s see if this assumption holds in reality. To this end, we measure the minimum two-way delay for small UDP packets while the network is at 50% load.

Amazon Web Services (EKS)

Our data demonstrates that in Amazon Web Services, the latency does not significantly differ between the colocated case and the non-colocated case.


Drawing data from 188 EKS clusters of 50 m4.xlarge instances in eu-west-2 clearly shows that two-way delays have the same distribution regardless of colocation.

The same observation holds across 178 EKS clusters in us-east-1. Interestingly, there are two distinct modes near 70μs and 210μs, which may be explained by two different generations of networking software or hardware being used in different parts of the region.

This conclusion holds for all other regions and instance types in EKS that we investigated. The distribution of two-way delays is not affected by colocation. The AWS virtual network implementation hides any potential latency benefit of colocation and yields the same latency regardless of colocation.

Microsoft Azure (AKS)

It turns out that in Azure, contrary to expectations, the communication latency between colocated VMs is actually larger than between non-colocated VMs. Consider the following histogram:


Across 70 clusters in the southeastasia region on Microsoft Azure (AKS), each with 50 Standard_D4s_v3 VMs, two-way delays are consistently larger if the source and destination VMs are colocated. 

We observe the same counterintuitive behavior in all regions and all instance types that we investigated. The explanation lies in Azure’s accelerated networking, which is turned on by default for AKS. Packets destined for a VM on the same physical host take longer because they do not benefit from hardware acceleration. Instead, they are handled as exception packets in the software-based programmable virtual switch (VFP) on the physical host [Link].

Google Cloud Platform (GKE)

In Google Cloud Platform, the latency impact of colocation depends on a combination of region and VM type.

In some cases, links between colocated VMs have higher latency than for non-colocated VMs. This pattern looks very similar to the latency behavior in Microsoft Azure, and may be caused by a similar hardware acceleration that does not apply to colocated VMs.


For n1-standard-4 instances in GCP’s europe-west2-b zone, links between colocated VMs often have higher latency than links between non-colocated VMs.

For n2-standard-4 instances in GCP’s asia-southeast1-b zone, links between colocated VMs often have slightly lower latency than links between non-colocated VMs.

For n2-standard-4 instances in GCP’s us-east4-c zone, link latency is not affected by colocation.

Network Packet Drops

During a Clockwork Latency Sensei audit, we measure the fraction of packets that are dropped under idle conditions and under moderate (50%) network load. Under idle conditions, typically very few packets are lost. Under moderate load, intermittent congestion causes packet drops at rates of up to several hundred packets per million (ppm), which degrades network performance through retransmissions and shrinking transmission windows.
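For readers unfamiliar with the unit, the following minimal sketch shows how a drop rate in ppm can be derived from two packet counters. The function name `drop_rate_ppm` and the counter values are hypothetical; the article does not describe how Latency Sensei collects its counts.

```python
def drop_rate_ppm(packets_sent: int, packets_received: int) -> float:
    """Fraction of lost packets, expressed in packets per million (ppm)."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    if not 0 <= packets_received <= packets_sent:
        raise ValueError("packets_received must be between 0 and packets_sent")
    dropped = packets_sent - packets_received
    # Multiply before dividing to keep the arithmetic exact for integer counts.
    return dropped * 1_000_000 / packets_sent


# Illustrative counters only: 220 of 1,000,000 packets were lost.
rate = drop_rate_ppm(1_000_000, 999_780)
print(f"{rate:.0f} ppm")
```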

Does colocation affect the rate of packet drops? For colocated VMs, packets do not pass through physical network hops, thus encountering fewer potential drop occasions. If drops occur mostly within the network (rather than at the edge of the network), then links between colocated VMs should exhibit a lower packet drop rate.

Across thousands of 50-node cluster instances, we observe that colocation does not have a significant effect on packet drop rate, as shown in the table below.

Packet drop rate

       Links between non-colocated VMs    Links between colocated VMs
AKS    68 ppm                             60 ppm
EKS    220 ppm                            213 ppm
GKE    60 ppm                             62 ppm

This indicates that practically all packet drops are caused by overflowing queues that are traversed by links between colocated and non-colocated VMs alike. In other words, most packet drops occur in the virtualization hypervisor, not within the network proper.
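To put the table in perspective, the short sketch below computes the relative change in drop rate when moving from non-colocated to colocated links. It uses only the six values from the table. The helper name `relative_change_percent` is hypothetical, and this is plain arithmetic, not a statistical significance test.

```python
# Packet drop rates in ppm, copied from the table above:
# (links between non-colocated VMs, links between colocated VMs)
drop_ppm = {
    "AKS": (68, 60),
    "EKS": (220, 213),
    "GKE": (60, 62),
}


def relative_change_percent(non_colocated: float, colocated: float) -> float:
    """Change in drop rate for colocated links, relative to non-colocated links."""
    return (colocated - non_colocated) * 100 / non_colocated


changes = {
    provider: relative_change_percent(*rates)
    for provider, rates in drop_ppm.items()
}
for provider, change in changes.items():
    print(f"{provider}: {change:+.1f}%")
```

In absolute terms the gap is at most 8 ppm for any of the three providers, and its sign is not consistent across them.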

VM colocation is bad for business

The analysis in this article and the previous blog post clearly shows that VM colocation has a net negative effect on cloud networking performance: Colocated VMs achieve lower bandwidth, incur the same or higher latency (except in certain special cases in Google Cloud), and do not provide a benefit of lower packet drop rate.

For optimal cloud system performance, colocation should be avoided.

Clockwork Latency Sensei provides visibility into VM colocation. The Latency Sensei audit report indicates which VMs are colocated, and quantifies the performance impact of colocation. Thankfully, in the cloud, once a colocation problem is identified, it is easy to shut down the underperforming VMs and replace them with new, hopefully non-colocated VMs.

 

 

Source CNCF
