Months ago, Jimmy May (@aspiringgeek on Twitter) asked me if I'd used the SQL Server -k startup option. I'd read about it, but had never tested it and hadn't seen it deployed on systems I'd worked with. What I'd read to that point had to do with checkpoint throttling. Details on that angle of the -k startup option can be found in kba 929240 below.
FIX: I/O requests that are generated by the checkpoint process may cause I/O bottlenecks if the I/O subsystem is not fast enough to sustain the IO requests in SQL Server 2005
https://support.microsoft.com/en-us/help/929240
Now the systems I work on stress SQL Server in lots of corner-case ways :grin: but an overwhelming checkpoint is something I haven't yet observed.
On the other hand, I do see overwhelming tempdb spills. Tempdb sort/hash spill writes are some of the most aggressive writes to come out of SQL Server. Systems susceptible to them are advised to consider how to mitigate the performance risk to that very system, as well as the risk of becoming a noisy neighbor if shared storage or shared plumbing (ESXi server, top-of-rack switch, etc.) is involved.
The most common performance interventions for tempdb - trace flag 1117 or equivalent, trace flag 1118 or equivalent, increasing data file count to reduce allocation page contention - do not mitigate the risk posed by a tempdb spill write flood. In fact, since none of the resources I'm aware of for those interventions addresses the underlying Windows volume, vHBA, or ESXi host LUN layout for tempdb, there is a chance that actions taken to alleviate allocation page contention will increase the risk posed by tempdb spills. More on that another day - IO weaving is a topic I'll have to prepare some diagrams for :grin:
Most disk IO throttles are also a poor fit for mitigating this risk. VMware provides SIOC and adaptive queue throttling if using vmdks. Neither works well to tame tempdb write floods without also throttling access to persistent databases. Many storage arrays provide QoS controls at their front end adapters for rate limiting by IOPs or bytes/sec. These limits can apply per initiator (host adapter) or per target LUN depending on the array model. Per LUN QoS can be ok... but also unwieldy. What about IO governance in Resource Governor? It works per volume!! Yay! But it's share-based - rather than limit-based - and will kick in under contention only. So... nope, not that either (but do keep in mind that RG IO governance works per Windows volume - I'll come back to that someday and how it fits into my recommendation NOT to co-locate data files for tempdb and persistent databases on the same Windows volume :wink:).
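To make the share-based vs limit-based distinction concrete, here's a toy Python sketch. It is not a model of Resource Governor, SIOC, or any array's QoS internals. The function names, the workload names, and the MB/sec numbers are all made up for illustration. The only thing it tries to show is the behavior described above: a limit-based throttle caps a workload even on an idle device, while a share-based one leaves it alone until total demand exceeds capacity.

```python
# Toy illustration only: hypothetical names and numbers, not SQL Server internals.

def limit_based(demand, cap):
    """Hard cap: the workload never exceeds `cap`, even if the device is idle."""
    return min(demand, cap)


def share_based(demands, shares, capacity):
    """Weighted fair split of `capacity` that applies only under contention.

    demands: {workload: requested MB/sec}
    shares:  {workload: relative weight}
    """
    if sum(demands.values()) <= capacity:
        # No contention, so every workload gets everything it asked for.
        return dict(demands)

    granted = {}
    remaining = capacity
    active = dict(demands)
    while active:
        total_shares = sum(shares[name] for name in active)
        # Workloads whose demand fits inside their weighted slice are fully served.
        satisfied = {
            name: d for name, d in active.items()
            if d <= remaining * shares[name] / total_shares
        }
        if not satisfied:
            # Everyone left wants more than their slice: split what remains by weight.
            for name in active:
                granted[name] = remaining * shares[name] / total_shares
            break
        for name, d in satisfied.items():
            granted[name] = d
            remaining -= d
            del active[name]
    return granted


if __name__ == "__main__":
    shares = {"tempdb_spill": 1, "userdb": 1}
    quiet = share_based({"tempdb_spill": 900, "userdb": 50}, shares, capacity=1000)
    busy = share_based({"tempdb_spill": 900, "userdb": 400}, shares, capacity=1000)
    print("share-based, userdb quiet:", quiet)
    print("share-based, userdb busy: ", busy)
    print("limit-based, cap 200:     ", limit_based(900, cap=200))
```

In the "quiet" case the share-based split never engages, so the spill still goes out at full speed. That's fine for the local persistent database, but it does nothing for a noisy-neighbor problem further down the shared plumbing.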
But here's something tantalizing. A kba about -k startup option initially written for SQL Server 2012. Hmm. It mentions throttling tempdb "work files". Gives an example with checkdb.
Enable the "-k" startup parameter to control the rate that work files can spill to tempdb for SQL Server
https://support.microsoft.com/en-us/help/3133055/enable-the--k-startup-parameter-to-control-the-rate-that-work-files-can-spill-to-tempdb-for-sql-server
Recall that I am using a create index statement with sort_in_tempdb as my proxy for simulating large sort/hash spills. You can see my initial work with that here.
tempdb: "insert into... select" vs "select... into" vs index sort_in_tempdb write behavior
http://sql-sasquatch.blogspot.com/2017/03/tempdb-index-sortintempdb-vs-select.html
So what happens to the create index if the -k option is enabled? I'm glad you asked!
(to be continued...)