Replication Troubleshooting: Log Reader Latency
1. Log Reader Latency script
The script provide below should be run on the publisher server and publishing database. The script will highlight to the DBAs on where the log reader is in the log and how far behind it is to catch up.
-- Find log reader agent position in time and latency
-- RUN ON PUBLISHER SERVER
USE csn_product -- Set publication database
GO
DECLARE @LSN NVARCHAR(25)
DECLARE @SPID INT
IF OBJECT_ID('tempdb.dbo.#temp_dbcc_opentran') IS NOT NULL
DROP TABLE #temp_dbcc_opentran
CREATE TABLE #temp_dbcc_opentran (RowName SYSNAME, Value sql_variant)
INSERT INTO #temp_dbcc_opentran EXEC ('DBCC OPENTRAN WITH TABLERESULTS, NO_INFOMSGS');
SELECT @LSN = CAST(Value as NVARCHAR(128)) FROM #temp_dbcc_opentran WHERE RowName = 'REPL_NONDIST_OLD_LSN'
SELECT @SPID = REPLACE(CAST(Value as NVARCHAR(128)),'s','') FROM #temp_dbcc_opentran WHERE RowName = 'OLDACT_SPID'
IF @LSN = '(0:0:0)'
BEGIN
PRINT 'REPL_NONDIST_OLD_LSN = (0:0:0)'
PRINT 'Please try again'
END
ELSE
SELECT @SPID AS [Open transaction SPID] ,
[Begin Time] ,
DATEDIFF(MINUTE, [Begin Time], GETDATE()) AS [Log reader latency (Minutes)]
FROM ::fn_dblog(REPLACE(REPLACE(@LSN,'(',''),')',''),REPLACE(REPLACE(@LSN,'(',''),')',''))
WHERE [Begin Time] IS NOT NULL
IF EXISTS(SELECT 1 FROM tempdb.sys.tables WHERE NAME LIKE '#temp_dbcc_opentran%')
DROP TABLE #temp_dbcc_opentran
PRINT @SPID
IF @SPID IS NOT NULL
EXEC sp_who2 @SPID
This query will return the latency by session.
You can use EXEC sp_who3 SPID to get the query causing latency.
2. Check Log Reader agent latency based on Mslogreader_history table
On the distribution database, check to see if there are any recent errors recorded in the Mslogreader_history table.
use [DistributorDatabase]
go
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT s.srvname PublisherServer,
ma2.publisher_db PublisherDatabase,
ma2.name as JobName,
CASE mh1.runstatus
WHEN 1 THEN 'Start'
WHEN 2 THEN 'Succeed.'
WHEN 3 THEN 'In progress.'
WHEN 4 THEN 'Idle.'
WHEN 5 THEN 'Retry.'
WHEN 6 THEN 'Fail'
END AS Status,
(mh1.delivery_latency / 1000) AS LatencySeconds,
mh1.Comments
FROM MSlogreader_history mh1 WITH (NOLOCK)
INNER JOIN (SELECT mh1.agent_id, MAX(mh1.time) as maxtime
FROM MSlogreader_history mh1 WITH (NOLOCK)
JOIN MSlogreader_agents ma WITH (NOLOCK) on ma.id = mh1.agent_id
GROUP BY mh1.agent_id) AS mh2 ON mh1.agent_id = mh2.agent_id and mh1.time = mh2.maxtime
INNER JOIN MSlogreader_agents ma2 WITH (NOLOCK) on ma2.id = mh2.agent_id
INNER JOIN MSreplservers s WITH (NOLOCK) ON publisher_id = s.srvid
ORDER BY 1;
3. Determine any open long running transactions
On the publisher server/database, determine if there are any long running transactions. Long running transactions can generate a log of data modifications - > which leads to bigger log files → that requests the log reader to read through.
4. Determine if any blocking occurred
5. Reviewing any SQL Agent job that may be contributing to the log file growth (index maintenance)
6. Terminate Replication Monitor connections to the publisher's distributor server
7. Change the log reader' profile to Cleanup
8. Via DPA, determine if any performance issues going on either the Publisher or Distributor
9. Check the message: Replicated transactions are waiting for next Log backup or for mirroring partner to catch up.
This alert inform us that log reader will wait on transaction log backups to happen.
Need to enable trace flag 1448 to fix the issue:
:CONNECT server
-- Check trace flags enabled
DBCC TRACESTATUS (-1)
-- Enable trace flags
DBCC TRACEON (1448, -1);
DBCC TRACESTATUS (-1)
GO