Benchmarking Your Dataflow Jobs For Performance, Cost And Capacity Planning
Calling all Dataflow developers, operators, and users… So you've developed your Dataflow job, and you're now wondering how exactly it will perform in the wild. In particular:

- How many workers does it need to handle your peak load, and is there sufficient capacity (e.g. CPU quota)?
- What is your pipeline's total cost of ownership (TCO), and is there room to optimize the performance/cost ratio?
- Will the pipeline meet your expected service-level objectives (SLOs), e.g. daily volume, event throughput, and/or end-to-end latency?

To answer all these questions, you need to performance test your pipeline with real data to measure things like throughput and…
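As a rough illustration of the capacity-planning side of these questions, here is a minimal sketch (all function names and numbers are hypothetical, not from this article) that turns a per-worker throughput measured in a benchmark run into an estimated worker count and the vCPU quota that implies:

```python
import math


def required_workers(peak_eps: float, per_worker_eps: float,
                     headroom: float = 0.2) -> int:
    """Workers needed to absorb a peak events/sec rate with a safety headroom."""
    return math.ceil(peak_eps * (1 + headroom) / per_worker_eps)


def required_vcpus(workers: int, vcpus_per_worker: int = 4) -> int:
    """vCPUs those workers consume, to compare against your CPU quota."""
    return workers * vcpus_per_worker


if __name__ == "__main__":
    # Hypothetical example: a benchmark showed one 4-vCPU worker sustains
    # ~5,000 events/sec, and the expected peak load is 120,000 events/sec.
    workers = required_workers(peak_eps=120_000, per_worker_eps=5_000)
    vcpus = required_vcpus(workers)
    print(f"Plan for ~{workers} workers (~{vcpus} vCPUs of quota).")
```

In practice you would replace the hardcoded figures with numbers measured from an actual benchmark run of your own pipeline under representative load.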