Monster VMs & the dangers of vCPU hotplug

September 7, 2015
Turning on Memory Hot Add and CPU Hot Plug can be a great way to add more resources to a constrained VM without needing to reboot your guest (if the guest OS allows it):
090715_1927_MonsterVMst1.jpg
However, you can run into efficiency problems if you do it willy-nilly (yes, I said willy-nilly). I’m working with a large environment where CPU is the most constrained resource, so we look at every opportunity to improve performance. Let me take this opportunity to recommend Michael Corey’s *excellent* book on tuning virtualized SQL machines, which definitely helped me out in a number of areas.
Scenario: In talking to the SQL DBAs here (we’re a Windows shop), we came up with a little test to validate the difference that vNUMA makes.
1.  On the SQL VM, bump the number of vCPUs to 16. vNUMA is exposed by default on anything over 8 vCPUs, so there is no need to change the numa.vcpu.min setting, but we wanted to go overboard. We have several hundred VMs with more than 8 vCPUs, so this is an important topic.
2.  Make sure “Max degree of parallelism” is set to 0 so that SQL Server uses all available processors.
3.  Turn ON vCPU hotplug to kill NUMA awareness. With CPU hot add enabled, vNUMA is disabled and the guest sees all of its vCPUs as a single NUMA node.
4.  Run Coreinfo (another excellent tool from Mark Russinovich) to confirm that the guest no longer sees multiple NUMA nodes:
090715_1927_MonsterVMst3.png
5.  Run the following SQL query to do a countdown & time it:
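-- Count down from one million, doing a small calculation on each pass,
-- then print the elapsed time in milliseconds.
DECLARE @T DATETIME, @F BIGINT, @cnt INT = 1000000;
SET @T = GETDATE();
WHILE @cnt > 0
BEGIN
    SET @F = POWER(2, 30);
    SET @cnt = @cnt - 1;
END
PRINT DATEDIFF(millisecond, @T, GETDATE());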
090715_1927_MonsterVMst4.png
6.  Then turn OFF vCPU hotplug to make the server NUMA aware again (the VM has to be powered off to change this setting):
090715_1927_MonsterVMst5.jpg
7.  Re-run the query, which does in fact run faster:
090715_1927_MonsterVMst6.png
The countdown query in step 5 doesn’t hammer the CPUs as much as I’d like (if you know of a better one, please share in the comments!), so we’re still doing more research to find out exactly how much performance you lose if you leave hotplug on.
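As a cross-check on the Coreinfo output from step 4, you can also ask SQL Server which NUMA nodes it sees. This is only a minimal sketch, not part of the original test: it reads the built-in sys.dm_os_nodes view, needs VIEW SERVER STATE permission, and the filter on the dedicated admin connection node is my own assumption about what you want to hide.

```sql
-- List the NUMA nodes as SQL Server sees them.
-- Run once with vCPU hotplug ON and once with it OFF, then compare the row counts.
SELECT node_id,
       node_state_desc,
       memory_node_id,
       online_scheduler_count   -- schedulers (logical CPUs) online in this node
FROM sys.dm_os_nodes
WHERE node_state_desc <> N'ONLINE DAC';  -- skip the dedicated admin connection node
```

With hotplug on, I would expect a single row covering all 16 schedulers, and more than one row once hotplug is off and vNUMA is exposed. The exact node count depends on the host's physical layout, so treat that as something to verify rather than a given.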