Posts in tag: PyTorch

Previously in the PyTorch on Google Cloud series, we trained, tuned, and deployed a PyTorch text classification model using the Training and Prediction services on Vertex AI. In this post, we will show how to automate and monitor a PyTorch-based ML workflow by orchestrating the pipeline in a serverless manner using Vertex AI Pipelines. Let’s …
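For a taste of what the post covers, here is a minimal, hypothetical sketch of defining and running a pipeline on Vertex AI Pipelines. It assumes the KFP v2 SDK and google-cloud-aiplatform are installed and `aiplatform.init()` has been called; the component body, pipeline names, and bucket path are illustrative placeholders, not the post's actual code:

```python
# Minimal sketch of a Vertex AI pipeline (assumes kfp v2 and
# google-cloud-aiplatform; names and the bucket are placeholders).
from kfp import dsl, compiler
from google.cloud import aiplatform

@dsl.component(base_image="python:3.10")
def train_step(epochs: int) -> str:
    # Placeholder for the PyTorch training logic described in the post.
    return f"trained for {epochs} epochs"

@dsl.pipeline(name="pytorch-text-classifier")
def pipeline(epochs: int = 3):
    train_step(epochs=epochs)

# Compile the pipeline to a self-describing spec file.
compiler.Compiler().compile(pipeline, "pipeline.json")

job = aiplatform.PipelineJob(
    display_name="pytorch-text-classifier",
    template_path="pipeline.json",
    pipeline_root="gs://YOUR_BUCKET/pipeline_root",  # placeholder bucket
)
job.run()  # executes serverlessly on Vertex AI Pipelines
```

Because the compiled spec fully describes the workflow, Vertex AI can schedule and monitor each step without any cluster for you to manage.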

This article is the final part of a three-part series exploring the performance debugging ecosystem of PyTorch/XLA on Google Cloud TPU VM. In the first part, we introduced the key concepts for reasoning about training performance using the PyTorch/XLA profiler and ended with an interesting performance bottleneck we encountered in the Multi-Head-Attention (MHA) implementation …
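For context, the profiler-driven reasoning in that first part revolves around annotating the training loop so spans show up in the trace viewer. A minimal sketch, assuming a TPU VM with torch_xla installed (the port and span names are illustrative):

```python
# Sketch: annotating a training step for the PyTorch/XLA profiler
# (assumes torch_xla on a TPU VM; names and port are placeholders).
import torch_xla.debug.profiler as xp
import torch_xla.core.xla_model as xm

server = xp.start_server(9012)  # serves profiles for capture tools

def train_step(model, data, target, loss_fn, optimizer):
    with xp.StepTrace('train_step'):  # marks one step in the trace
        with xp.Trace('forward'):
            loss = loss_fn(model(data), target)
        with xp.Trace('backward'):
            loss.backward()
        xm.optimizer_step(optimizer)
```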

This article is part II of the series on ‘PyTorch/XLA: Performance Debugging on TPU VM’. In the previous article we introduced the basic metrics of performance analysis. We used client-side debugging with the PyTorch/XLA profiler to identify how the .equal() operator used inside the Multihead Attention module implementation caused frequent recompilation of the graph, causing the …
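The mechanism at issue: `torch.equal()` compares tensor values, so on an XLA device the lazily built graph has to be cut, compiled, and executed early just to fetch values to the host. A minimal sketch, assuming a TPU VM with torch_xla installed, of observing this with the debug metrics (not the article's actual benchmark):

```python
# Sketch: observing early execution triggered by a value-dependent check
# (assumes torch_xla on a TPU VM; shapes are arbitrary).
import torch
import torch_xla.core.xla_model as xm
import torch_xla.debug.metrics as met

device = xm.xla_device()
q = torch.randn(8, 128, device=device)
k = torch.randn(8, 128, device=device)

# torch.equal() needs concrete values, so the pending lazy graph is
# compiled and executed here instead of at the usual step boundary.
_ = torch.equal(q, k)

print(met.metrics_report())  # inspect CompileTime and execution counters
```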

In this three-part series we explore the performance debugging ecosystem of PyTorch/XLA on Google Cloud TPU VM, which became available earlier this year (2021). The TPU VM architecture allows ML practitioners to work directly on the host where the TPU hardware is attached. With the TPU profiler launched earlier this year, debugging your PyTorch training …

We are releasing Opacus, a new high-speed library for training PyTorch models with differential privacy (DP) that’s more scalable than existing state-of-the-art methods. Differential privacy is a mathematically rigorous framework for quantifying the anonymization of sensitive data. It’s often used in analytics, with growing interest in the machine learning (ML) community. With the release of Opacus, …
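For a flavor of the library, here is a minimal DP-SGD training sketch with Opacus's `PrivacyEngine`. The `make_private` interface shown here is from later Opacus releases, and the model and data are toy placeholders:

```python
# Minimal sketch of DP-SGD training with Opacus (assumes opacus >= 1.0;
# the model and dataset are toy placeholders for illustration).
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
loader = DataLoader(data, batch_size=8)

privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,   # scale of Gaussian noise added to gradients
    max_grad_norm=1.0,      # per-sample gradient clipping threshold
)

criterion = torch.nn.CrossEntropyLoss()
for x, y in loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()  # gradients are clipped per-sample, then noised
```

Per-sample gradient clipping plus calibrated Gaussian noise is what yields the differential privacy guarantee; Opacus's speed comes from computing those per-sample gradients efficiently rather than one example at a time.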