The Power of Careful Benchmarking
A few months ago, I developed a deep appreciation for meticulous profiling of distributed applications. Our team was working on an ambitious demo for a customer where we had to lower the cost of running a specific workflow by about half. I had four weeks to pull off this stunt before the end of the trial.
Since workflows are essentially just programs running for multiple hours on cloud computers, one way to lower the cost is by decreasing the execution time of the workflow. Unfortunately, we couldn’t change the customer’s code, so it wasn’t possible to speed up the algorithm. So how do you decrease the cost of something that you can’t dive in and optimize directly?
I started by simply running the workflow to see which steps were the most expensive. After inspecting the workflow code, I noticed that most of the steps were actually single-threaded, yet they were provisioning multiple cores, which inflated the cost of the workflow tremendously.
I had an idea. What if I ran the workflow and recorded the memory and CPU usage of every step to see whether we were heavily over-provisioning resources? I set up a quick script that monitored the workflow tasks' pods and graphed their resource usage. The test dataset was quite large, so some steps would take 12-18 hours to finish and the whole workflow would take a couple of days to fully run. Since each workflow execution was very slow, I had to learn a lot from only a few executions before the deal deadline.
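To give a feel for what such a monitoring script might look like, here is a minimal Python sketch. It is not the script I actually used. It assumes the pods run on Kubernetes with the metrics server installed, so that `kubectl top pod --no-headers` prints one `NAME CPU MEMORY` line per pod (for example `align-1 950m 1200Mi`). The helper names (`parse_cpu`, `parse_memory`, `update_peaks`, `sample`) and the pod names are made up for illustration.

```python
import subprocess
from collections import defaultdict

# Binary suffixes that Kubernetes uses for memory quantities.
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(value: str) -> float:
    """Convert a CPU quantity such as '950m' (millicores) or '2' to cores."""
    if value.endswith("m"):
        return int(value[:-1]) / 1000
    return float(value)


def parse_memory(value: str) -> int:
    """Convert a memory quantity such as '1200Mi' to bytes."""
    for suffix, factor in MEMORY_UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)  # no suffix: assume plain bytes


def new_peaks():
    """Per-pod record of the highest CPU and memory usage seen so far."""
    return defaultdict(lambda: {"cpu": 0.0, "mem": 0})


def update_peaks(peaks, top_output: str):
    """Fold one `kubectl top pod --no-headers` snapshot into the peaks."""
    for line in top_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, cpu, mem = line.split()[:3]
        peaks[name]["cpu"] = max(peaks[name]["cpu"], parse_cpu(cpu))
        peaks[name]["mem"] = max(peaks[name]["mem"], parse_memory(mem))
    return peaks


def sample(namespace: str) -> str:
    """Take one snapshot (assumption: kubectl and the metrics server exist)."""
    result = subprocess.run(
        ["kubectl", "top", "pod", "-n", namespace, "--no-headers"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `update_peaks(peaks, sample("my-namespace"))` in a loop with a sleep between iterations would leave `peaks` holding the highest CPU and memory reading per pod, which is the number that matters when comparing against what each step requested. Keep in mind that periodic sampling can miss short spikes between snapshots.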
The profiling confirmed my suspicion: some tasks would provision 4 CPUs while only using 1. Other tasks would provision hundreds of gigabytes of memory while only using a fraction of it. After I heavily cut the resources requested for the workflow, it still functioned as intended, but the execution cost decreased by half, which helped close the deal.
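The right-sizing step can be sketched in a few lines of Python as well. This is an illustration under stated assumptions rather than the exact rule I applied: the 20% headroom and the 0.25-unit rounding step are hypothetical defaults, and `recommend_request` and `overprovision_factor` are made-up helper names.

```python
import math


def recommend_request(peak: float, headroom: float = 0.2, step: float = 0.25) -> float:
    """Suggest a resource request: observed peak plus headroom,
    rounded up to the next multiple of `step`.

    Floating-point rounding may occasionally push a value that sits
    exactly on a boundary up by one extra step, which errs on the safe side.
    """
    return math.ceil(peak * (1 + headroom) / step) * step


def overprovision_factor(requested: float, peak: float) -> float:
    """How many times more was requested than was ever used."""
    return requested / peak


# The case from the text: a step requests 4 CPUs but peaks at 1.
requested_cpus, peak_cpus = 4.0, 1.0
suggested = recommend_request(peak_cpus)
factor = overprovision_factor(requested_cpus, peak_cpus)
print(f"requested={requested_cpus} peak={peak_cpus} "
      f"suggested={suggested} over-provisioned {factor:.1f}x")
```

The headroom is there because a profile from a handful of runs is only a lower bound on what a step can use. A task that exceeds its memory limit gets killed, so it is worth being more generous with memory than with CPU.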
What did I learn? Profiling is one of the most useful techniques for speeding up algorithms and making their execution cheaper. Also, you can pull off seemingly impossible tasks when you approach the problem from the right angle.
© Taras Priadka