I had to benchmark an EC2 instance to see whether a database could be safely moved to it. It is a good practice that helps avoid surprises when an instance or its storage is allocated in a noisy neighborhood, where the neighbors use so many resources that it affects the performance of our MySQL database. Of course, one can never get very reliable results on EC2. It is a shared environment after all, so some fluctuations should be expected, but it is still good to know the numbers. I started my benchmarks and everything seemed fine at first, but then some of the statistics I was getting started to look quite odd.
I was running the benchmarks on a High-CPU Extra Large Instance and the results were not consistent at all. At one moment I was getting poor throughput and horrible response times, only to see them improve a lot a few minutes later. I ruled out storage performance as the cause, so I started suspecting MySQL itself - perhaps some sort of contention. But neither the system statistics nor any information from MySQL confirmed that theory.
Finally, I turned to oprofile. I started it during one of the benchmarks, and what I saw surprised me:
Note: The Y axis on the right has values in the reverse order for context switches and interrupts.
While oprofile was running, MySQL performance nearly doubled. This was also reflected in CPU utilization, which increased accordingly and became a lot more stable. After a while, I stopped oprofile and the performance dropped again.
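For anyone who wants to check for the same effect, here is a minimal sketch of how the two phases could be compared numerically rather than by eye. It assumes you have collected throughput samples (for example, queries per second at fixed intervals) for each phase. The function names `summarize` and `compare` are hypothetical, and the numbers in the usage part are made-up placeholders, not measurements from my benchmark.

```python
from statistics import mean, stdev


def summarize(samples):
    """Return (mean, coefficient of variation) for throughput samples.

    The coefficient of variation (stdev / mean) is a unitless measure
    of how much the samples fluctuate; lower means more stable.
    """
    avg = mean(samples)
    return avg, stdev(samples) / avg


def compare(baseline, profiled):
    """Compare two benchmark phases: average speedup and stability of each."""
    base_avg, base_cv = summarize(baseline)
    prof_avg, prof_cv = summarize(profiled)
    return {
        "speedup": prof_avg / base_avg,
        "baseline_cv": base_cv,
        "profiled_cv": prof_cv,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT real measurements.
    without_profiler = [400, 900, 350, 1000, 450]
    with_profiler = [1200, 1250, 1180, 1220, 1210]

    result = compare(without_profiler, with_profiler)
    print(f"speedup:      {result['speedup']:.2f}x")
    print(f"baseline CV:  {result['baseline_cv']:.2f}")
    print(f"profiled CV:  {result['profiled_cv']:.2f}")
```

A speedup well above 1 together with a much lower coefficient of variation in the second phase would match what I saw on the graph.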
At this point I am not sure how to explain this, but it was repeatable. Anyone?