How To Deploy The Google Cloud Ops Agent With Ansible

  • aster.cloud
  • January 16, 2022
  • 4 minute read

Site Reliability Engineering (SRE) and Operations teams responsible for operating virtual machines (VMs) are always looking for ways to provide a more reliable, more scalable environment for their development partners. Part of providing that stable experience is having telemetry data (metrics, logs and traces) from systems and applications so you can monitor and troubleshoot effectively. Many Google Cloud services, including Google Compute Engine, provide basic system metrics out of the box. However, if you want in-depth metrics about your VMs or application telemetry, installing the Google Cloud Ops Agent is necessary.

At Cloud Ops we make it easy to install the Ops Agent in our UI on one or a handful of VMs. But installing, configuring, and managing an agent on a fleet of VMs, especially when many are hosting production workloads at an enterprise organization, can be incredibly taxing. There are simply too many configuration and provisioning tools, and often too much complexity. In that vein, we at Cloud Operations want to meet our users where they are in their process of digital transformation. That’s why we’ve introduced support for the most common automation tools in the configuration and provisioning space to deploy the Cloud Ops Agent. This lets our users prioritize automation as a way to reduce operational toil so they can focus on building and managing reliable and highly performant infrastructure.


Today we’ll be taking a look at how to deploy the Cloud Ops Agent in an automated fashion across a fleet of VMs, and in this example we’ll use Ansible. Ansible is a popular open source configuration management tool that provides a lightweight way to get started automating your infrastructure. We’ll also look at a more advanced example that uses templating to streamline your automation code. But first, let’s talk a little about what Ansible is and how it works.

What is Ansible, and how does it work?

Ansible is an open source tool written in Python that provides an agentless framework for connecting to and interacting with machines. To do this it leverages the native connection protocols for Linux and Windows: SSH and WinRM (PowerShell remoting), respectively. The key benefit of using existing connection protocols is that it reduces overhead on the managed systems while benefiting from the security of these longstanding and heavily adopted protocols. When working with Ansible, one of the simplest units of work is a playbook:

---
- name: Sample playbook
  hosts: localhost
  tasks:
    - ansible.builtin.debug:
        msg: "Hello World!"

This really simple playbook runs against your localhost and executes a single task that is essentially equivalent to echoing “Hello World!”.
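
Because Ansible is agentless, a quick way to confirm it can reach remote machines is a playbook built on the `ansible.builtin.ping` module. The following is a minimal sketch, assuming an inventory of Linux hosts that are already reachable over SSH; the play and task names are illustrative:

```yaml
---
- name: Check connectivity to the fleet
  hosts: all
  # Fact gathering is skipped here so the play only tests the connection.
  gather_facts: false
  tasks:
    - name: Confirm Ansible can log in and run Python on each host
      ansible.builtin.ping:
```

Each reachable host should report a successful `ping` task; a host that fails here will also fail any later playbook that targets it.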

Deploying the Ops Agent to monitor and troubleshoot VMs

The new Google Cloud Ops Agent makes it easy to start collecting telemetry data from your systems. Simply by installing the agent, we can immediately ingest standard system logs and additional system telemetry beyond the defaults, including running processes.
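
As a starting point, a playbook that installs the agent with its default configuration might look like the sketch below. It reuses the role name and the `agent_type` variable that appear in the more complex playbook later in this post, and assumes the role has already been installed on the control node (for example, from Ansible Galaxy):

```yaml
---
- name: Install the Cloud Ops Agent with default settings
  hosts: all
  become: true
  roles:
    # Role name and variable are taken from the later example in this post.
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent
```

With no custom configuration file supplied, the agent is expected to run with its built-in defaults.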

Adding workload specifics to your configuration

Now let’s take a look at a more complex example, like a playbook that will deploy Nginx and a custom configuration for the Ops Agent to collect telemetry.

Here’s what the simple custom configuration file looks like for the Ops Agent, to collect default metrics and logs from Nginx, also written in YAML format:

logging:
  receivers:
    nginx_default_access:
      type: nginx_access
    nginx_default_error:
      type: nginx_error
  service:
    pipelines:
      nginx:
        receivers:
          - nginx_default_access
          - nginx_default_error
metrics:
  receivers:
    nginx_metrics:
      type: nginx
      stub_status_url: http://127.0.0.1:80/status
      collection_interval: 60s
  service:
    pipelines:
      nginx_pipeline:
        receivers:
          - nginx_metrics

And here’s a playbook, specifying the custom `ops_agent.yaml` configuration file in the role:

---
- name: Deploy and configure Cloud Ops Agent
  hosts: all
  become: true
  roles:
    - role: googlecloudplatform.google_cloud_ops_agents
      vars:
        agent_type: ops-agent
        version: 1.0.1
        main_config_file: ops_agent.yaml
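      # `notify` is a task keyword, not a role keyword, so it is not set here.
      # This assumes the role restarts the agent itself when its config changes.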

  tasks:
    - name: Install nginx
      ansible.builtin.package: 
        name: nginx
        state: present

    - name: Customize nginx config for telemetry
      ansible.builtin.template:
        src: ansible_templates/status.conf
        dest: /etc/nginx/conf.d/status.conf
      notify:
        - Restart Nginx


    - name: Start nginx
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: yes

    - name: Start Ops Agent
      ansible.builtin.service:
        name: google-cloud-ops-agent
        state: started
        enabled: yes

  handlers:
    - name: Restart Nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
        enabled: yes

    - name: Restart Ops Agent
      ansible.builtin.service:
        name: google-cloud-ops-agent
        state: restarted
        enabled: yes

After running this playbook, we should have successfully installed Nginx on all hosts within our inventory, and should be submitting both metrics and logs from Nginx! To copy the example playbook, check out this GitHub sample.
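
The playbook above templates `ansible_templates/status.conf`, which is not shown in this post. For the `nginx_metrics` receiver to scrape `http://127.0.0.1:80/status`, Nginx has to serve its `stub_status` page at that address. The sketch below shows what such a config might contain, written inline with `ansible.builtin.copy` for brevity. The server block is an assumption, not the post's actual template:

```yaml
---
- name: Expose the Nginx status page for the Ops Agent (sketch)
  hosts: all
  become: true
  tasks:
    - name: Write a local-only stub_status endpoint
      ansible.builtin.copy:
        dest: /etc/nginx/conf.d/status.conf
        mode: "0644"
        # Assumed contents; adjust to match your Nginx layout.
        content: |
          server {
              listen 80;
              server_name 127.0.0.1;
              location /status {
                  stub_status on;   # serve basic connection metrics
                  access_log off;
                  allow 127.0.0.1;  # only local scrapes are allowed
                  deny all;
              }
          }
      notify:
        - Restart Nginx

  handlers:
    - name: Restart Nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```

Restricting the endpoint to `127.0.0.1` keeps the status page private to the VM while still letting the locally running agent read it. This relies on Nginx being built with the `stub_status` module, which is worth verifying for your distribution's package.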

Now it’s time to visualize some of this information! We provide an out-of-the-box dashboard for Nginx that you can import.

And that’s it! Now we can see the metrics we’ve been collecting from Nginx with the Cloud Ops Agent.

Get started today

Whether you are managing a handful of VMs or an entire fleet, ensuring robust observability data is available from systems and applications is key to effective monitoring and troubleshooting. With the VM Instances dashboard in Cloud Monitoring, Agent Policies, or use of open source tooling such as Ansible, Chef, Puppet and Terraform, you have many options to install agents on your Google Cloud VMs. The Ops Agent helps you gather data to keep your infrastructure and applications performing their very best, and automating the deployment makes day to day management all that much easier.

If you’d like to watch a video where I walk through these steps, check out our YouTube video that demonstrates this blog post, and see the rest of our O11y In Depth playlist!

Or, if you’d like to get started with a tutorial, you can also use our Cloud Ops Agent tutorial for Ansible to walk through a simple deployment in Google Cloud Shell.

Lastly, if you have feedback or want to ask us questions, drop us a line on the Google Cloud Community Cloud Ops area!

By: Kyle Benson (Product Manager, Cloud Ops) and Rahul Harpalani (Product Manager)
Source: Google Cloud Blog

