I've been asked to look at a database/application server that's "performing slow". No one could really tell me what was performing slow (except for the database backups) so this has been a discovery process.
This server is a Dell PowerEdge R720
running Windows Server 2008 R2
and SQL Server 2012 Standard (11.0.2100.60) (x64)
.
-Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz, 2500 Mhz, 6 Core(s), 12 Logical Processor(s)
-16GB RAM
-DELL PERC S110 SCSI Disk Device running as a software RAID 5.
The disk is partitioned into two local drives: C and D. The OS files are all on C. SQL Server is installed on D, and the database files (.mdf and .ldf) are also on D. In addition, we're writing our nightly backups to the D drive! :-)
The database mdf file is around 6GB. Backups are taking 30 minutes.
A quick look at the wait types revealed that the WRITELOG
wait type was the most prevalent - consuming 69%:
![enter image description here][1]
Suspecting transaction log/disk performance issues, I enabled various PerfMon counters and established baseline averages over a couple of days. The PerfMon counters include, but are not limited to:
\PhysicalDisk(0 C: D:)\Avg. Disk sec/Write - Avg = 0.073
\SQLServer:Databases\Log Flush Wait Time - Avg = 71.403
\SQLServer:Databases\Log Flushes/sec - Avg = 1.910
\SQLServer:Databases\Log Bytes Flushed/sec - Avg = 1724.604
The transaction log was originally configured with an initial size of 100MB
and set to grow at 10%
. When I first logged onto the machine, the transaction log had grown to 5GB
across 256 VLFs
suggesting some fragmentation.
Paul Randall's [query][2] against sys.dm_io_virtual_file_stats
reveals some serious write latency with both the data and log files - although contrary to the WRITELOG wait type, the data file write latency is quite a bit more:
Data File: 1369
Log File: 66
Based on the [article][3] by Kimberly Tripp, I "rebuilt" the transaction log and assigned it a size of 2GB (somewhat arbitrarily), yielding 24VLFs.
This seemed to decrease the WRITELOG
wait type percentage some (now at ~61%
) as well as the \SQLServer:Databases\Log Flush Wait Time
(now at ~48ms
).
Bus this still seems high.
I realize that the data and log files (and backups) should all be on a different drives for parallel reads/writes and maximum throughput. Best practices aside, we have other production systems configured the same way as this one, but that are running better, e.g. Avg Log Flush Wait Time = < 3 ms.
What else could I look at to figure out why this server in particular is slower with regards to disk read/write response time and backups?
Could index fragmentation/fill factor settings cause this poor of performance?
What about disk fragmentation?
If I need to tell the client we need another disk, then so be it - I just want to be certain before spending money.
Any other ideas???
[1]: https://i.sstatic.net/DVrKm.png
[2]: http://www.sqlskills.com/blogs/paul/how-to-examine-io-subsystem-latencies-from-within-sql-server/
[3]: http://www.sqlskills.com/blogs/kimberly/8-steps-to-better-transaction-log-throughput/