Replication Troubleshooting: Log Reader Latency

1. Log Reader Latency script
The script provide below should be run on the publisher server and publishing database. The script will highlight to the DBAs on where the log reader is in the log and how far behind it is to catch up.

-- Find log reader agent position in time and latency

-- RUN ON PUBLISHER SERVER

USE csn_product -- Set publication database

GO

DECLARE @LSN NVARCHAR(25)

DECLARE @SPID INT

IF OBJECT_ID('tempdb.dbo.#temp_dbcc_opentran') IS NOT NULL

DROP TABLE #temp_dbcc_opentran

CREATE TABLE #temp_dbcc_opentran (RowName SYSNAME, Value sql_variant)

INSERT INTO #temp_dbcc_opentran EXEC ('DBCC OPENTRAN WITH TABLERESULTS, NO_INFOMSGS');

SELECT @LSN = CAST(Value as NVARCHAR(128)) FROM #temp_dbcc_opentran WHERE RowName = 'REPL_NONDIST_OLD_LSN'

SELECT @SPID = REPLACE(CAST(Value as NVARCHAR(128)),'s','') FROM #temp_dbcc_opentran WHERE RowName = 'OLDACT_SPID'

IF @LSN = '(0:0:0)'

BEGIN

PRINT 'REPL_NONDIST_OLD_LSN = (0:0:0)'

PRINT 'Please try again'

END

ELSE

SELECT @SPID AS [Open transaction SPID] ,

[Begin Time] ,

DATEDIFF(MINUTE, [Begin Time], GETDATE()) AS [Log reader latency (Minutes)]

FROM ::fn_dblog(REPLACE(REPLACE(@LSN,'(',''),')',''),REPLACE(REPLACE(@LSN,'(',''),')',''))

WHERE [Begin Time] IS NOT NULL

IF EXISTS(SELECT 1 FROM tempdb.sys.tables WHERE NAME LIKE '#temp_dbcc_opentran%')

DROP TABLE #temp_dbcc_opentran

PRINT @SPID

IF @SPID IS NOT NULL

EXEC sp_who2 @SPID


This query will return the latency by session.

You can use EXEC sp_who3 SPID to get the query causing latency.


2. Check Log Reader agent latency based on Mslogreader_history table

On the distribution database, check to see if there are any recent errors recorded in the Mslogreader_history table.

use [DistributorDatabase]

go

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT s.srvname PublisherServer,

ma2.publisher_db PublisherDatabase,

ma2.name as JobName,

CASE mh1.runstatus

WHEN 1 THEN 'Start'

WHEN 2 THEN 'Succeed.'

WHEN 3 THEN 'In progress.'

WHEN 4 THEN 'Idle.'

WHEN 5 THEN 'Retry.'

WHEN 6 THEN 'Fail'

END AS Status,

(mh1.delivery_latency / 1000) AS LatencySeconds,

mh1.Comments

FROM MSlogreader_history mh1 WITH (NOLOCK)

INNER JOIN (SELECT mh1.agent_id, MAX(mh1.time) as maxtime

FROM MSlogreader_history mh1 WITH (NOLOCK)

JOIN MSlogreader_agents ma WITH (NOLOCK) on ma.id = mh1.agent_id

GROUP BY mh1.agent_id) AS mh2 ON mh1.agent_id = mh2.agent_id and mh1.time = mh2.maxtime

INNER JOIN MSlogreader_agents ma2 WITH (NOLOCK) on ma2.id = mh2.agent_id

INNER JOIN MSreplservers s WITH (NOLOCK) ON publisher_id = s.srvid

ORDER BY 1;

3. Determine any open long running transactions

On the publisher server/database, determine if there are any long running transactions. Long running transactions can generate a log of data modifications - > which leads to bigger log files → that requests the log reader to read through.


4. Determine if any blocking occurred

5. Reviewing any SQL Agent job that may be contributing to the log file growth (index maintenance)

6. Terminate Replication Monitor connections to the publisher's distributor server

7. Change the log reader' profile to Cleanup

8. Via DPA, determine if any performance issues going on either the Publisher or Distributor

9. Check the message: Replicated transactions are waiting for next Log backup or for mirroring partner to catch up.
This alert inform us that log reader will wait on transaction log backups to happen.
Need to enable trace flag 1448 to fix the issue:

:CONNECT server

-- Check trace flags enabled

DBCC TRACESTATUS (-1)

-- Enable trace flags

DBCC TRACEON (1448, -1);

DBCC TRACESTATUS (-1)

GO