Measure the exact performance your SAN/RAID array is giving SQL Server

Brett Hawton – 7 September 2010

One of the most significant “grey areas” in SQL Server performance is the IO performance of your SAN, NAS or RAID array. To be fair, it’s not easy for a SAN administrator to juggle limited resources in order to provide consistently good performance to SQL Server’s widely varying IO requests.

If you view SQL Server’s IO demands through a tool such as ProcMon or FileMon one observes the wildly varying IO requests it generates. Interspersed amongst significantly varying rates of 64KB buffer pages reads are 512KB read-ahead reads and lazy writer writes of random dirty pages. Things are no clearer on the transaction log side. Depending on the amount of data altered as part of an implicit or explicit transaction, log writes vary in size from 512 bytes all the way up to multiple 60KB writes (the log never writes more than 60KB at a time (NOT the 64KB most DBA’s believe would be the case).

So we have both varying read and write sizes in terms of number of bytes and wildly varying rates of read and write requests. The trouble is that most disks/RAID arrays/SAN/NAS devices work in IO’s per second. The SAN admin wants to know how many IO’s per second a specific database and its attendant transaction log require. It’s fiendishly difficult to convert the widely varying IO demands of SQL Server (both in IO size and requests per sec) into metrics understandable to an IO subsystem.

Take a look at the TSQL script below. What it does is measure the performance of your IO subsystem (per SQL Server file) over a short period of time (typically 5 minutes). Run it on your SQL Server 2005 or 2008 production system at a really busy time and see what results it generates:

SET NOCOUNT ON

DECLARE @IOStats TABLE (
	[database_id] [smallint] NOT NULL,
	[file_id] [smallint] NOT NULL,
	[num_of_reads] [bigint] NOT NULL,
	[num_of_bytes_read] [bigint] NOT NULL,
	[io_stall_read_ms] [bigint] NOT NULL,
	[num_of_writes] [bigint] NOT NULL,
	[num_of_bytes_written] [bigint] NOT NULL,
	[io_stall_write_ms] [bigint] NOT NULL)
INSERT INTO @IOStats
	SELECT database_id,
		vio.file_id,
		num_of_reads,
		num_of_bytes_read,
		io_stall_read_ms,
		num_of_writes,
		num_of_bytes_written,
		io_stall_write_ms
	FROM sys.dm_io_virtual_file_stats (NULL, NULL) vio
DECLARE @StartTime datetime, @DurationInSecs int
SET @StartTime = GETDATE()
WAITFOR DELAY '00:05:00'
SET @DurationInSecs = DATEDIFF(ss, @startTime, GETDATE())
SELECT DB_NAME(vio.database_id) AS [Database],
	mf.name AS [Logical name],
	mf.type_desc AS [Type],
		(vio.io_stall_read_ms - old.io_stall_read_ms) / CASE (vio.num_of_reads-old.num_of_reads) WHEN 0 THEN 1 ELSE vio.num_of_reads-old.num_of_reads END AS [Ave read speed (ms)],
	vio.num_of_reads - old.num_of_reads AS [No of reads over period],
	CONVERT(DEC(14,2), (vio.num_of_reads - old.num_of_reads) / (@DurationInSecs * 1.00)) AS [No of reads/sec],
	CONVERT(DEC(14,2), (vio.num_of_bytes_read - old.num_of_bytes_read) / 1048576.0) AS [Tot MB read over period],
	CONVERT(DEC(14,2), ((vio.num_of_bytes_read - old.num_of_bytes_read) / 1048576.0) / @DurationInSecs) AS [Tot MB read/sec],
	(vio.num_of_bytes_read - old.num_of_bytes_read) / CASE (vio.num_of_reads-old.num_of_reads) WHEN 0 THEN 1 ELSE vio.num_of_reads-old.num_of_reads END AS [Ave read size (bytes)],
		(vio.io_stall_write_ms - old.io_stall_write_ms) / CASE (vio.num_of_writes-old.num_of_writes) WHEN 0 THEN 1 ELSE vio.num_of_writes-old.num_of_writes END AS [Ave write speed (ms)],
	vio.num_of_writes - old.num_of_writes AS [No of writes over period],
	CONVERT(DEC(14,2), (vio.num_of_writes - old.num_of_writes) / (@DurationInSecs * 1.00)) AS [No of writes/sec],
	CONVERT(DEC(14,2), (vio.num_of_bytes_written - old.num_of_bytes_written)/1048576.0) AS [Tot MB written over period],
	CONVERT(DEC(14,2), ((vio.num_of_bytes_written - old.num_of_bytes_written)/1048576.0) / @DurationInSecs) AS [Tot MB written/sec],
	(vio.num_of_bytes_written-old.num_of_bytes_written) / CASE (vio.num_of_writes-old.num_of_writes) WHEN 0 THEN 1 ELSE vio.num_of_writes-old.num_of_writes END AS [Ave write size (bytes)],
	mf.physical_name AS [Physical file name],
	size_on_disk_bytes/1048576 AS [File size on disk (MB)]
FROM sys.dm_io_virtual_file_stats (NULL, NULL) vio,
	sys.master_files mf,
	@IOStats old
WHERE mf.database_id = vio.database_id AND
	mf.file_id = vio.file_id AND
	old.database_id = vio.database_id AND
	old.file_id = vio.file_id AND
	((vio.num_of_bytes_read - old.num_of_bytes_read) + (vio.num_of_bytes_written - old.num_of_bytes_written)) > 0
ORDER BY ((vio.num_of_bytes_read - old.num_of_bytes_read) + (vio.num_of_bytes_written - old.num_of_bytes_written)) DESC
GO

Your results should look similar to what is shown below. The results are sorted in descending order by read+write activity so the most active files will show first and the least active last.


There are a number of interesting columns which require further explanation however by far the two most important columns are Ave read speed (ms)/Ave write speed (ms) . On a modern high-quality SAN or RAID subsystem these figures should be less than 4 ms on average. Higher values indicate:

  1. The IO subsystem is flooded with requests.
  2. The LUN does not have sufficient devices allocated to it
  3. That you are not using a RAID 10 configuration for writing and either a RAID 5 or RAID 10 configuration for reading.
  4. That you are not using modern 15K disk drives affording low access times
  5. That the devices are not correctly sector aligned
  6. That write-back cache has not been enabled.

The other columns provide information to the IO administrator in order to correct sub-standard read and write speeds:

The number of reads per sec, MB’s read per sec and the average read size will provide the IO administrator with the metrics required to correctly configure the IO subsystem resulting in improved performance. On the write side it’s essentially the same metrics again; number of writes per sec, MB’s written per sec and average write size which will guide the administrator.

So long as the script is run during SQL Server’s busiest period then the results should give an accurate picture of where improvements need to be made and provide the IO administrator with accurate metrics in order to re-configure the IO subsystem in the best possible way.

Comments

Comment from VA_SQL_GUY
Time November 4, 2010 at 1:46 pm

Hello, I do not see the view named in the statement,sys.dm_io_virtual_file_stats. Are there special circumstances for the view to appear (Enterprise Edition etc)?

Comment from Brett
Time November 4, 2010 at 2:11 pm

I had mentioned in the article that this is only for SQL Server 2005 and 2008 http://msdn.microsoft.com/en-us/library/ms190326.aspx

Comment from VA_SQL_GUY
Time November 4, 2010 at 2:37 pm

Brett, I figured that piece out, but I do believe that the script is truncated in the pane above, tables with aliases are not named in the script available etc. It also ends in a comma. Thanks…

Comment from Brett
Time November 4, 2010 at 5:23 pm

Hi There,
Maybe its a browser thing as I copy the TSQL out of the pane just fine. Here is the TSQL:
SET NOCOUNT ON

DECLARE @IOStats TABLE (
[database_id] [smallint] NOT NULL,
[file_id] [smallint] NOT NULL,
[num_of_reads] [bigint] NOT NULL,
[num_of_bytes_read] [bigint] NOT NULL,
[io_stall_read_ms] [bigint] NOT NULL,
[num_of_writes] [bigint] NOT NULL,
[num_of_bytes_written] [bigint] NOT NULL,
[io_stall_write_ms] [bigint] NOT NULL)
INSERT INTO @IOStats
SELECT database_id,
vio.file_id,
num_of_reads,
num_of_bytes_read,
io_stall_read_ms,
num_of_writes,
num_of_bytes_written,
io_stall_write_ms
FROM sys.dm_io_virtual_file_stats (NULL, NULL) vio
DECLARE @StartTime datetime, @DurationInSecs int
SET @StartTime = GETDATE()
WAITFOR DELAY ’00:05:00′
SET @DurationInSecs = DATEDIFF(ss, @startTime, GETDATE())
SELECT DB_NAME(vio.database_id) AS [Database],
mf.name AS [Logical name],
mf.type_desc AS [Type],
(vio.io_stall_read_ms – old.io_stall_read_ms) / CASE (vio.num_of_reads-old.num_of_reads) WHEN 0 THEN 1 ELSE vio.num_of_reads-old.num_of_reads END AS [Ave read speed (ms)],
vio.num_of_reads – old.num_of_reads AS [No of reads over period],
CONVERT(DEC(14,2), (vio.num_of_reads – old.num_of_reads) / (@DurationInSecs * 1.00)) AS [No of reads/sec],
CONVERT(DEC(14,2), (vio.num_of_bytes_read – old.num_of_bytes_read) / 1048576.0) AS [Tot MB read over period],
CONVERT(DEC(14,2), ((vio.num_of_bytes_read – old.num_of_bytes_read) / 1048576.0) / @DurationInSecs) AS [Tot MB read/sec],
(vio.num_of_bytes_read – old.num_of_bytes_read) / CASE (vio.num_of_reads-old.num_of_reads) WHEN 0 THEN 1 ELSE vio.num_of_reads-old.num_of_reads END AS [Ave read size (bytes)],
(vio.io_stall_write_ms – old.io_stall_write_ms) / CASE (vio.num_of_writes-old.num_of_writes) WHEN 0 THEN 1 ELSE vio.num_of_writes-old.num_of_writes END AS [Ave write speed (ms)],
vio.num_of_writes – old.num_of_writes AS [No of writes over period],
CONVERT(DEC(14,2), (vio.num_of_writes – old.num_of_writes) / (@DurationInSecs * 1.00)) AS [No of writes/sec],
CONVERT(DEC(14,2), (vio.num_of_bytes_written – old.num_of_bytes_written)/1048576.0) AS [Tot MB written over period],
CONVERT(DEC(14,2), ((vio.num_of_bytes_written – old.num_of_bytes_written)/1048576.0) / @DurationInSecs) AS [Tot MB written/sec],
(vio.num_of_bytes_written-old.num_of_bytes_written) / CASE (vio.num_of_writes-old.num_of_writes) WHEN 0 THEN 1 ELSE vio.num_of_writes-old.num_of_writes END AS [Ave write size (bytes)],
mf.physical_name AS [Physical file name],
size_on_disk_bytes/1048576 AS [File size on disk (MB)]
FROM sys.dm_io_virtual_file_stats (NULL, NULL) vio,
sys.master_files mf,
@IOStats old
WHERE mf.database_id = vio.database_id AND
mf.file_id = vio.file_id AND
old.database_id = vio.database_id AND
old.file_id = vio.file_id AND
((vio.num_of_bytes_read – old.num_of_bytes_read) + (vio.num_of_bytes_written – old.num_of_bytes_written)) > 0
ORDER BY ((vio.num_of_bytes_read – old.num_of_bytes_read) + (vio.num_of_bytes_written – old.num_of_bytes_written)) DESC
GO

Comment from VA_SQL_GUY
Time November 8, 2010 at 1:10 pm

Brett, thanks for everything.

Comment from alexhatcher
Time January 11, 2011 at 5:03 pm

minor translation issues, ` from ‘ and some funky minus sign. a quick F5 will show what to replace. (Windows and MS and their ‘helpful’ tricks )

Comment from fast.ryder
Time March 2, 2011 at 9:23 am

Hello everyone.

I have kind of a “newb” question, but don’t @variable tables exist only in RAM? How can we measure IO performance on disk while using these tables in RAM?

Cheers,
Ivo Pereira

Write a comment

You need to login to post comments!