Troubleshooting TMWSuite Database Performance

I’m occasionally asked to troubleshoot database performance in TMWSuite, usually by new customers who are experiencing slow performance or who are preparing to purchase hardware upgrades and want to be sure the expense is justified.

Most often I find that performance is moderately impacted by poor optimization of the SQL Server instance and the TMWSuite database itself, or by sub-optimal usage of the TMWSuite platform (e.g. creating load boards in the Card Planner that contain more than ~50 drivers). These issues can usually be remediated with settings on the server and/or changes to the TMWSuite configuration.

Additionally, I frequently encounter a small number of poorly written custom tasks causing significant degradation of the system overall. These can be Metrics for ResultsNow, WatchDawg Alerts, SQL Jobs, or Update_Move_PostProcessing hooks, any of which can have a large impact because they are executed so frequently.

If you suspect you have jobs causing this type of drain, or just want to see which jobs are the heaviest drain on system resources overall, you can use the Dynamic Management Views (DMVs) available since SQL Server 2005.

The following query will extract the top 10 worst-performing statements (based on CPU here, though it may also be worth sorting by Reads/Writes) and display the portion of the query causing the performance hit, as well as the overall execution plan and statistics.

SELECT	TOP (10)
		[SQL STATEMENT]			= SUBSTRING(QT.[TEXT]
											,(QS.[statement_start_offset] / 2) + 1
											,( (CASE QS.[statement_end_offset] 
													WHEN -1 THEN DATALENGTH(QT.[TEXT])
													ELSE QS.[statement_end_offset]
												END - QS.[statement_start_offset]) / 2 ) + 1)
		,[QUERY PLAN]				= QP.[query_plan]
		,[LAST RUN DATE]			= QS.[last_execution_time]
		,[TOTAL RUN COUNT]		= QS.[execution_count]
		,[TOTAL CPU TIME]			= QS.[total_worker_time]				-- reported in microseconds
		,[TOTAL DISK READS]		= QS.[total_logical_reads]
		,[TOTAL DISK WRITES]		= QS.[total_logical_writes]
		,[TOTAL EXEC TIME]		= QS.[total_elapsed_time] / 1000000		-- converted to seconds
		,[LAST CPU TIME]			= QS.[last_worker_time]					-- reported in microseconds
		,[LAST DISK READS]		= QS.[last_logical_reads]
		,[LAST DISK WRITES]		= QS.[last_logical_writes]
		,[LAST EXEC TIME]			= QS.[last_elapsed_time] / 1000000		-- converted to seconds
FROM	sys.dm_exec_query_stats QS WITH (NOLOCK)
		CROSS APPLY sys.dm_exec_sql_text(QS.[sql_handle]) QT
		CROSS APPLY sys.dm_exec_query_plan(QS.[plan_handle]) QP
ORDER BY [TOTAL CPU TIME] DESC	-- CPU TIME

I typically sort by [TOTAL CPU TIME] to start, as this gives insight into which jobs require the most effort to process.  Many companies suffer from I/O bottlenecks as well, however, so sorting by [TOTAL DISK READS] and [TOTAL DISK WRITES] can also be valuable for isolating those.
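
For example, to focus on I/O instead of CPU, only the final line of the query above needs to change:

ORDER BY [TOTAL DISK READS] DESC	-- or [TOTAL DISK WRITES] DESC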

Begin troubleshooting by taking [SQL STATEMENT] from the first entry and pasting it into a query window for review.  It likely won’t contain the entire process being executed, but it will home in on the worst-performing part of it (you can use the [QUERY PLAN] to see the entire statement if you need to run it down).
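
If you want the complete batch text alongside the offending fragment, one option is to add another column to the query above, since sys.dm_exec_sql_text already returns the full text of the batch or procedure:

		,[FULL BATCH TEXT]		= QT.[TEXT]		-- full text of the batch/procedure the fragment came from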

Exhaustively listing all the ways to improve a statement is well beyond the scope of a single post, but here are some of the common problems I see that should be rewritten whenever possible.

Repeated use of Sub-Queries in the SELECT statement

It’s astonishing how often this happens, and how bad it is.  In short, a query is processed logically starting with the FROM clause, then WHERE, then GROUP BY, then SELECT, and finally ORDER BY.  When you put a subquery directly into the SELECT list as a field, that statement fires for every single row returned, after all the data for the query has already been pulled.  This is a significant impact, and it should only be done when you absolutely cannot get the data to load in the FROM clause.
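
As a rough sketch of the difference (dbo.Orders and dbo.Stops here are made-up names for illustration, not actual TMWSuite tables), the first query fires the subquery once per returned row, while the second gathers the same data once in the FROM clause:

-- Per-row scalar subquery: executed once for every row the outer query returns
SELECT	O.[OrderId]
		,(SELECT COUNT(*) FROM dbo.Stops S WHERE S.[OrderId] = O.[OrderId]) AS [StopCount]
FROM	dbo.Orders O

-- Same result loaded in the FROM clause: the aggregate is computed once and joined
SELECT	O.[OrderId]
		,ISNULL(S.[StopCount], 0) AS [StopCount]
FROM	dbo.Orders O
		LEFT JOIN (	SELECT [OrderId], COUNT(*) AS [StopCount]
					FROM dbo.Stops
					GROUP BY [OrderId] ) S ON S.[OrderId] = O.[OrderId]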

Using a Cursor to iterate data or perform calculations

It is possible that a cursor is the best way to do something (typically only a FAST_FORWARD cursor, for an extremely complex row-by-row evaluation of some data).  In 10 years of SQL work, I’ve found exactly one case where a FAST_FORWARD cursor outperformed every other solution.  Generally speaking, though, it is almost never worth the cost to the server, especially when a set-based method exists instead.
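
To illustrate (dbo.Invoices is a hypothetical table used only for this sketch), the cursor below touches one row per iteration, while the set-based statement beneath it does the same work in a single operation:

-- Cursor approach: row-by-row processing, one UPDATE per fetch
DECLARE @InvoiceId INT, @Amount MONEY
DECLARE InvoiceCursor CURSOR LOCAL FAST_FORWARD FOR
	SELECT [InvoiceId], [Amount] FROM dbo.Invoices WHERE [Posted] = 0
OPEN InvoiceCursor
FETCH NEXT FROM InvoiceCursor INTO @InvoiceId, @Amount
WHILE @@FETCH_STATUS = 0
BEGIN
	UPDATE dbo.Invoices SET [Total] = @Amount * 1.05 WHERE [InvoiceId] = @InvoiceId
	FETCH NEXT FROM InvoiceCursor INTO @InvoiceId, @Amount
END
CLOSE InvoiceCursor
DEALLOCATE InvoiceCursor

-- Set-based equivalent: one statement the optimizer can plan and execute as a whole
UPDATE	dbo.Invoices
SET		[Total] = [Amount] * 1.05
WHERE	[Posted] = 0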

Performing extensive reads without READ UNCOMMITTED or NOLOCK

The SQL community will argue over NOLOCK extensively, but with TMWSuite you have a system that is extremely heavy on writing data in nearly every operation.  Writing data requires exclusive locks, and you will see the system slow down enormously if large read queries are holding locks on those tables and write operations have to wait for them to finish.  While there are dangers to using NOLOCK, I find myself placing it on 99% of my TMWSuite queries simply to avoid grinding the system to a halt with complex queries.  Either each table should explicitly specify WITH (NOLOCK), or SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED should be issued at the top of the query.
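
Both forms look like this in practice (dbo.Orders and dbo.Stops again being hypothetical stand-ins):

-- Option 1: hint each table explicitly
SELECT	O.[OrderId], S.[City]
FROM	dbo.Orders O WITH (NOLOCK)
		INNER JOIN dbo.Stops S WITH (NOLOCK) ON S.[OrderId] = O.[OrderId]

-- Option 2: set the isolation level once at the top of the batch
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

SELECT	O.[OrderId], S.[City]
FROM	dbo.Orders O
		INNER JOIN dbo.Stops S ON S.[OrderId] = O.[OrderId]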

Using Table Variables for large sets of data (>100 Rows)

Table Variables have some advantages over Temporary Tables (a smaller scope and therefore less potential for conflict, and they are supposedly kept entirely in memory, though in practice they are often backed by tempdb), but these advantages only really show with very small numbers of rows, in part because the optimizer keeps no statistics on a table variable.  If there is any chance of a result set having more than 100 rows, use a Temporary Table and consider creating indexes on it for faster operations.
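
As a rough sketch of the two choices (the column names are illustrative only):

-- Table variable: fine for a handful of rows, but limited indexing and no statistics
DECLARE @SmallSet TABLE ([OrderId] INT PRIMARY KEY, [Total] MONEY)

-- Temporary table: better for larger sets; statistics are maintained and
-- additional indexes can be created for the joins and filters that follow
CREATE TABLE #LargeSet
(
	[OrderId]	INT PRIMARY KEY
	,[DriverId]	VARCHAR(8)
	,[Total]	MONEY
)

CREATE NONCLUSTERED INDEX IX_LargeSet_DriverId ON #LargeSet ([DriverId])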
