
Schedule Connectivity Tests For Continuous Networking Reachability Diagnostics

  • aster.cloud
  • October 18, 2022
  • 6 minute read

As the scope and size of your cloud deployments expand, the need for automation to quickly and consistently diagnose service-affecting issues increases in parallel.

Connectivity Tests, part of the Network Intelligence Center capabilities for Google Cloud network observability, monitoring, and troubleshooting, help you quickly troubleshoot network connectivity issues by analyzing your configuration and, in some cases, validating the data plane by sending synthetic traffic. It’s common to start using Connectivity Tests in an ad hoc manner, for example, to determine whether an issue reported by your users is caused by a recent configuration change. Another popular use case is to verify that applications and services are reachable post-migration, which confirms that the cloud networking design is working as intended. Once workloads are migrated to Google Cloud, Connectivity Tests help prevent regressions caused by misconfiguration or maintenance issues. As you become more familiar with Connectivity Tests, you may discover use cases for running them on a continuous basis. In this post, we’ll walk through a solution to continuously run Connectivity Tests.


Scheduling Connectivity Tests leverages existing Google Cloud platform tools to continuously execute tests and surface failures through Cloud Monitoring alerts.  We use the following products and tools as part of this solution:

  • One or more Connectivity Tests to check connectivity between network endpoints by analyzing the cloud networking configuration and (when eligible) performing live data plane analysis between the endpoints.
  • A single Cloud Function to programmatically run the Connectivity Tests using the Network Management API, and publish results to Cloud Logging.
  • One or more Cloud Scheduler jobs that run the Connectivity Tests on a continuous schedule that you define.
  • Google Cloud’s operations suite, which integrates logging, logs-based metrics, and alerting to surface test results that require your attention.

Let’s get started.

In this example there are two virtual machines running in different cloud regions of the same VPC.

 

Connectivity Tests

We configure a connectivity test to verify that the VM instance in cloud region us-east4 can reach the VM instance in cloud region europe-west1 on port 443 using the TCP protocol.  The following Connectivity Test UI example shows the complete configuration of the test.

 

For more detailed information on the available test parameters, see the Connectivity Tests documentation.
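
For readers who prefer a text form over the console, the same test can be described as a request body for the Network Management API. The following is a minimal sketch rather than the authors' exact configuration: the project ID project6 comes from the URLs used later in this post, while the instance names, the zones, and the helper function names are hypothetical placeholders.

```python
import json


def instance_uri(project: str, zone: str, instance: str) -> str:
    """Return the partial URI that identifies a Compute Engine VM instance."""
    return f"projects/{project}/zones/{zone}/instances/{instance}"


def build_connectivity_test_body(source_vm: str, destination_vm: str,
                                 port: int, protocol: str = "TCP") -> dict:
    """Build a ConnectivityTest resource body using REST field names."""
    return {
        "source": {"instance": source_vm},
        "destination": {"instance": destination_vm, "port": port},
        "protocol": protocol,
    }


# Hypothetical instance names and zones; substitute your own.
body = build_connectivity_test_body(
    instance_uri("project6", "us-east4-a", "vm-us-east4"),
    instance_uri("project6", "europe-west1-b", "vm-europe-west1"),
    port=443,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary mirrors what the console form collects: a source endpoint, a destination endpoint with a port, and a protocol.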

At this point you can verify that the test passes both the configuration and data plane analysis steps, which tells you that the cloud network is configured to allow the VM instances to communicate and the packets transmitted between the VM instances were successfully passed through the network.

 

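Before moving on to the next step, note the name of the connectivity test in URI format, which is visible in the equivalent REST response output. In our example, the name is:

projects/project6/locations/global/connectivityTests/inter-region-test-1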

We’ll use this value as part of the Cloud Scheduler configuration in a later step.
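
The test name follows the pattern projects/PROJECT/locations/global/connectivityTests/TEST_ID, as the example URL in the Cloud Scheduler section shows. As a small sketch, helpers like the following could build and sanity-check such names before they are pasted into other configuration. The function names are hypothetical and not part of any Google library.

```python
import re
from typing import Tuple

# Pattern matching the resource name format used in this post's example.
_TEST_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/global/"
    r"connectivityTests/(?P<test_id>[^/]+)$")


def connectivity_test_name(project: str, test_id: str) -> str:
    """Build the full resource name of a connectivity test."""
    return f"projects/{project}/locations/global/connectivityTests/{test_id}"


def parse_connectivity_test_name(name: str) -> Tuple[str, str]:
    """Split a resource name into (project, test_id) or raise ValueError."""
    match = _TEST_NAME.match(name)
    if match is None:
        raise ValueError(f"Not a connectivity test name: {name!r}")
    return match["project"], match["test_id"]


name = connectivity_test_name("project6", "inter-region-test-1")
print(name)
print(parse_connectivity_test_name(name))
```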

Create Cloud Function

Cloud Functions provide a way to interact with the Network Management API to run a connectivity test.  While there are other approaches for interacting with the API, we take advantage of the flexibility in Cloud Functions to run the test and enrich the output we send to Cloud Logging.  Cloud Functions also provide support for numerous programming languages, so you can adapt these instructions to the language of your choice.  In this example, we use Python for interfacing with the Network Management API.

Let’s walk through the high-level functionality of the code.

First, the Cloud Function receives an HTTP request with the name of the connectivity test that you want to execute.  By providing the name of the connectivity test as a variable, we can reuse the same Cloud Function for running any of your configured connectivity tests.

 

if http_request.method != 'GET':
  return flask.abort(
      flask.Response(
          http_request.method +
          ' requests are not supported, use GET instead',
          status=405))
if 'name' not in http_request.args:
  return flask.abort(
      flask.Response("Missing 'name' URL parameter", status=400))
test_name = http_request.args['name']

 

Next, the code runs the connectivity test specified using the Network Management API.

 

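client = network_management_v1.ReachabilityServiceClient()
rerun_request = network_management_v1.RerunConnectivityTestRequest(
    name=test_name)
try:
  # Rerunning a test is a long-running operation; wait up to 60 seconds.
  # The matching except clause appears in the full example further down.
  response = client.rerun_connectivity_test(request=rerun_request).result(
      timeout=60)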

 

And finally, if the connectivity test fails for any reason, a log entry is created that we’ll later configure to generate an alert.

 

if (response.reachability_details.result !=
    types.ReachabilityDetails.Result.REACHABLE):
  entry = {
      'message':
          f'Reran connectivity test {test_name!r} and the result was '
          'unreachable',
      'logging.googleapis.com/labels': {
          'test_resource_id': test_name
      }
  }
  print(json.dumps(entry))

 

There are a couple of things to note about this last portion of sample code:

  • We define a custom label (test_resource_id: test_name) used when a log entry is written.  We’ll use this as part of the logs-based metric in a later step.
  • We only write a log entry when the connectivity test fails.  You can customize the logic for other use cases, for example logging when tests that you expect to fail succeed or writing logs for successful and unsuccessful test results to generate a ratio metric.
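
As one possible take on the customization mentioned above, the sketch below builds a structured log entry for every test result instead of only for failures. The severity mapping and the extra result label are assumptions added for illustration and do not appear in the original sample. The result names are those of the API's ReachabilityDetails.Result enum.

```python
import json

# Assumed mapping from reachability result name to log severity.
_SEVERITY_BY_RESULT = {
    "REACHABLE": "INFO",
    "UNREACHABLE": "ERROR",
    "AMBIGUOUS": "WARNING",
    "UNDETERMINED": "WARNING",
}


def build_log_entry(test_name: str, result: str) -> dict:
    """Build a structured log entry that Cloud Logging parses from stdout."""
    return {
        "severity": _SEVERITY_BY_RESULT.get(result, "WARNING"),
        "message": (f"Reran connectivity test {test_name!r} and the result "
                    f"was {result}"),
        "logging.googleapis.com/labels": {
            "test_resource_id": test_name,
            "result": result,  # hypothetical extra label
        },
    }


entry = build_log_entry(
    "projects/project6/locations/global/connectivityTests/inter-region-test-1",
    "UNREACHABLE")
print(json.dumps(entry))
```

If you log successful results as well, the logs-based metric filter would presumably also need to match on the result label so that only failures are counted.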

The full example code for the Cloud Function is below.

 

import json
import flask
from google.api_core import exceptions
from google.cloud import network_management_v1
from google.cloud.network_management_v1 import types


def rerun_test(http_request):
  """Reruns a connectivity test and prints an error message if the test fails."""
  # Only GET is supported; the test name arrives as the 'name' URL parameter.
  if http_request.method != 'GET':
    return flask.abort(
        flask.Response(
            http_request.method +
            ' requests are not supported, use GET instead',
            status=405))
  if 'name' not in http_request.args:
    return flask.abort(
        flask.Response("Missing 'name' URL parameter", status=400))
  test_name = http_request.args['name']
  client = network_management_v1.ReachabilityServiceClient()
  rerun_request = network_management_v1.RerunConnectivityTestRequest(
      name=test_name)
  try:
    # Rerunning a test is a long-running operation; wait up to 60 seconds.
    response = client.rerun_connectivity_test(request=rerun_request).result(
        timeout=60)
    if (response.reachability_details.result !=
        types.ReachabilityDetails.Result.REACHABLE):
      # Structured (JSON) log entry carrying the custom label that the
      # logs-based metric matches on.
      entry = {
          'message':
              f'Reran connectivity test {test_name!r} and the result was '
              'unreachable',
          'logging.googleapis.com/labels': {
              'test_resource_id': test_name
          }
      }
      print(json.dumps(entry))
    return flask.Response(status=200)
  except exceptions.GoogleAPICallError as e:
    # Log the API error and report a server error to the caller.
    print(e)
    return flask.abort(500)

 

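We use the code above to create a Cloud Function named run_connectivity_test.  Because the Python function in the sample is named rerun_test, set the Cloud Function's entry point to rerun_test.  Use the default trigger type of HTTP and make note of the trigger URL to use in a later step: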

 

https://us-east4-project6.cloudfunctions.net/run_connectivity_test

 

Under Runtime, build, connections and security settings, increase the Runtime Timeout to 120 seconds.

For the function code, select Python for the Runtime.

For main.py, use the sample code provided above and configure the following dependencies for the Cloud Function in requirements.txt.

 

# Function dependencies, for example:
# package>=version
google-cloud-network-management>=1.3.1
google-api-core>=2.7.2

 

Click Deploy and wait for the Cloud Function deployment to complete.

Cloud Scheduler

Cloud Scheduler executes the Cloud Function on a periodic schedule.  Create a separate Cloud Scheduler job for each connectivity test you want to schedule.

The following Cloud Console example shows the Cloud Scheduler configuration for our example.

 

Note that the Frequency is specified in unix-cron format and in our example schedules the Cloud Function to run once an hour.  Make sure you take the Connectivity Tests pricing into consideration when configuring the frequency of the tests.
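
To get a feel for how the frequency drives the number of test executions, a rough calculation like the one below may help. It is only a sketch: the cron expression is one common way to write "once an hour" and is assumed here, since the original configuration is shown only as a screenshot, and the helper function is hypothetical. Multiply the counts by the current Connectivity Tests pricing yourself; no prices are assumed.

```python
# Assumed unix-cron expression for "at minute 0 of every hour".
HOURLY_SCHEDULE = "0 * * * *"


def executions_per_period(interval_minutes: int, days: int = 30) -> int:
    """Return how many times a fixed-interval job runs in the given period."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return (days * 24 * 60) // interval_minutes


hourly_runs = executions_per_period(60)         # one test, once an hour
quarter_hour_runs = executions_per_period(15)   # one test, every 15 minutes
print(HOURLY_SCHEDULE, hourly_runs, quarter_hour_runs)
```

Remember that each scheduled connectivity test has its own job, so the total is the per-test count multiplied by the number of tests.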

The URL parameter of the execution configuration is where we bring together the name of the connectivity test and the Cloud Function trigger URL from the previous steps.  The format of the URL is:

{cloud_function_trigger}?name={connectivity-test-name}

In our example, the URL is configured as:

https://us-east4-project6.cloudfunctions.net/run_connectivity_test?name=projects/project6/locations/global/connectivityTests/inter-region-test-1

 

The following configuration options complete the Cloud Scheduler configuration:

  • Change the HTTP method to GET.
  • Select Add OIDC token for the Auth header.
  • Specify a service account that has the Cloud Functions Invoker role (roles/cloudfunctions.invoker) for your Cloud Function.
  • Set the Audience to the URL minus the query parameters, e.g.:

https://us-east4-project6.cloudfunctions.net/run_connectivity_test
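
The URL, audience, and job settings described above can be sketched as data. The snippet below derives the job URL and the OIDC audience from the trigger URL and the test name, then assembles a job body using Cloud Scheduler REST field names. The service account email, the cron expression, and the helper names are hypothetical placeholders, and this is an illustration rather than the authors' configuration.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def scheduler_url(trigger_url: str, test_name: str) -> str:
    """Append the test name as the 'name' query parameter."""
    # safe="/" keeps the slashes in the resource name readable.
    return f"{trigger_url}?{urlencode({'name': test_name}, safe='/')}"


def audience_for(url: str) -> str:
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def build_scheduler_job(trigger_url: str, test_name: str,
                        service_account_email: str,
                        schedule: str = "0 * * * *") -> dict:
    """Build a Cloud Scheduler job body with an OIDC-authenticated GET."""
    uri = scheduler_url(trigger_url, test_name)
    return {
        "schedule": schedule,
        "httpTarget": {
            "uri": uri,
            "httpMethod": "GET",
            "oidcToken": {
                "serviceAccountEmail": service_account_email,
                "audience": audience_for(uri),
            },
        },
    }


job = build_scheduler_job(
    "https://us-east4-project6.cloudfunctions.net/run_connectivity_test",
    "projects/project6/locations/global/connectivityTests/inter-region-test-1",
    "scheduler-invoker@project6.iam.gserviceaccount.com",  # hypothetical
)
print(job["httpTarget"]["uri"])
```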

Logs-based Metric

The logs-based metric converts the unreachable log entries created by our Cloud Function into a Cloud Monitoring metric that we can use to create an alert.  We start by configuring a Counter logs-based metric named unreachable_connectivity_tests.  Next, configure a filter to match the `test_resource_id` label that is included in the unreachable log messages.

The complete metric configuration is shown below.
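
As a hedged text equivalent, a counter metric along these lines could be expressed as a body for the Cloud Logging metrics API. The filter string and the label extractor are assumptions based on the description above, not the authors' exact settings, so verify them against your own log entries before relying on them.

```python
import json

METRIC_NAME = "unreachable_connectivity_tests"


def build_log_metric(name: str = METRIC_NAME) -> dict:
    """Build a counter logs-based metric body using REST field names."""
    return {
        "name": name,
        "description": "Connectivity test reruns that were not reachable.",
        # Assumed filter: Cloud Function logs that carry the custom label.
        "filter": ('resource.type="cloud_function" '
                   'AND labels.test_resource_id:*'),
        "metricDescriptor": {
            "metricKind": "DELTA",   # counter metrics are DELTA / INT64
            "valueType": "INT64",
            "labels": [{"key": "test_resource_id", "valueType": "STRING"}],
        },
        # Copy the log label into a metric label so alerts can group by it.
        "labelExtractors": {
            "test_resource_id": "EXTRACT(labels.test_resource_id)",
        },
    }


metric = build_log_metric()
print(json.dumps(metric, indent=2))
```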

 

Alerting Policy

The Alerting Policy is triggered any time the logs-based metric increments, indicating that one of the continuous connectivity tests has failed.  The alert includes the name of the test that failed, allowing you to quickly focus your effort on the resources and traffic included in the test parameters.

To create a new Alerting Policy, select the logging/user/unreachable_connectivity_tests metric for the Cloud Function resource.

 

Under Transform data, configure the following parameters:

  • Within each time series
    • Rolling window = 2 minutes
    • Rolling window function = rate
  • Across time series
    • Time series aggregation = sum
    • Time series group by = test_resource_id

Next, configure the alert trigger using the parameters shown in the figure below.

 

Finally, configure the Documentation text field to include the name of the specific test that logged an unreachable result.
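
Put together, the alerting policy can be sketched as a body for the Cloud Monitoring API. The aggregation settings mirror the Transform data parameters listed above. The trigger (any value above zero, with no retest window), the display names, and the documentation wording are assumptions, because the original trigger settings are shown only as a screenshot.

```python
import json


def build_alert_policy(metric_name: str = "unreachable_connectivity_tests") -> dict:
    """Build an alert policy body for the logs-based metric (REST names)."""
    metric_filter = (
        f'metric.type="logging.googleapis.com/user/{metric_name}" '
        'AND resource.type="cloud_function"')
    return {
        "displayName": "Connectivity test unreachable",  # hypothetical
        "combiner": "OR",
        "conditions": [{
            "displayName": "Unreachable connectivity test logged",
            "conditionThreshold": {
                "filter": metric_filter,
                "aggregations": [{
                    "alignmentPeriod": "120s",            # rolling window
                    "perSeriesAligner": "ALIGN_RATE",     # window function
                    "crossSeriesReducer": "REDUCE_SUM",   # aggregation
                    "groupByFields": ["metric.label.test_resource_id"],
                }],
                # Assumed trigger: fire as soon as the rate exceeds zero.
                "comparison": "COMPARISON_GT",
                "thresholdValue": 0,
                "duration": "0s",
            },
        }],
        "documentation": {
            # The variable is expanded by Cloud Monitoring, not by Python.
            "content": ("Connectivity test ${metric.label.test_resource_id} "
                        "logged an unreachable result."),
            "mimeType": "text/markdown",
        },
    }


policy = build_alert_policy()
print(json.dumps(policy, indent=2))
```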

 

Connectivity Tests provide critical insights into the configuration and operation of your cloud networking environment.  By combining multiple Google Cloud services, you can transform your Connectivity Tests usage from an ad hoc troubleshooting tool into a solution for ongoing service validation and issue detection.

We hope you found this information useful.  For a more in-depth look into Network Intelligence Center check out the What is Network Intelligence Center? post and our documentation.

 

 

By: Zach Seils (Networking Specialist, Google Cloud) and Maor Itzkovitch (Software Engineer, Google Cloud)
Source: Google Cloud Blog

