Tuesday, June 29, 2010

Copy and Restore Job Errors with Log Shipping

Hopefully this can help some other people out, I couldn't find much on this error when I got it, and after a bit of digging I was able to sort out the problem.

I set up log shipping for a database in SQL 2008, very simple set up, just following the SSMS GUI. The backup job would run fine, but the copy and restore jobs on the secondary server kept failing.

This was the output of the copy job:

Microsoft (R) SQL Server Log Shipping Agent
[Assembly Version = 10.0.0.0, File Version = 10.0.1600.22 ((SQL_PreRelease).080709-1414 )]
Microsoft Corporation. All rights reserved.

2010-06-29 09:41:52.53 ----- START OF TRANSACTION LOG COPY -----
2010-06-29 09:41:52.80 *** Error: Could not retrieve copy settings for secondary ID '[removed]'.(Microsoft.SqlServer.Management.LogShipping) ***
2010-06-29 09:41:52.81 *** Error: The specified agent_id [removed] or agent_type 1 do not form a valid pair for log shipping monitoring processing.(.Net SqlClient Data Provider) ***
2010-06-29 09:41:52.82 *** Error: Could not log history/error message.(Microsoft.SqlServer.Management.LogShipping) ***
2010-06-29 09:41:52.82 *** Error: The specified agent_id [removed] or agent_type 1 do not form a valid pair for log shipping monitoring processing.(.Net SqlClient Data Provider) ***
2010-06-29 09:41:52.83 *** Error: Could not cleanup history.(Microsoft.SqlServer.Management.LogShipping) ***
2010-06-29 09:41:52.84 *** Error: The specified agent_id [removed] or agent_type 1 do not form a valid pair for log shipping monitoring processing.(.Net SqlClient Data Provider) ***
2010-06-29 09:41:52.84 ----- END OF TRANSACTION LOG COPY -----

Exit Status: 1 (Error)

As you can see, the same error message was the Error 32016: The specified agent_id %s or agent_type %d do not form a valid pair for log shipping monitoring processing.

I ended up doing a trace on the primary server whilst running the job, and was able to see that the error was being thrown by this code:

if (sys.fn_MSvalidatelogshipagentid(@agent_id, @agent_type) = 0)
begin
select @agent_idstring = cast(@agent_id as sysname)
raiserror(32016, 16, 1, @agent_idstring, @agent_type)
return 1
end

The function sys.fn_MSvalidatelogshipagentid returns either 1 or 0 using the following code:

return
case
when
((@agent_type = 0) and
exists (select * from msdb.dbo.log_shipping_monitor_primary
where primary_id = @agent_id))
then 1
when
((@agent_type in (1,2)) and
exists (select * from msdb.dbo.log_shipping_monitor_secondary
where secondary_id = @agent_id))
then 1
else 0
end

Now as I knew the agent type was either 1 (copy) or 2 (restore), so I looked at the table msdb.dbo.log_shipping_monitor_secondary on the primary server, which was empty, hence the function returning 0.

After a bit of banging my head on the desk as to why this function was running on the primary server, as that table is only meant to be populated on the secondary server, I had a look at the job to see how it was calling the sqllogship.exe program:

"D:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqllogship.exe" -Copy [ID_removed] -server SRV01

where SRV01 is the name of the primary server.

So I changed the -server parameter over to the secondary server name (SRV02), re-ran the job, and it worked!

Reading the BOL about sqllogship.exe, it states for the -server parameter:

For -copy or -restore, instance_name must be the name of a secondary server in a log shipping configuration.

Which makes sense why the jobs now ran OK as they were now using the correct parameter value.

To summarize: If you are getting error 32016 The specified agent_id %s or agent_type %d do not form a valid pair for log shipping monitoring processing check the command of the Copy and Restore jobs, that the -server parameter is set to the secondary server name.

The weird thing is that log shipping was setup via SSMS, I just filled in the details and the jobs were created automatically - so SQL Server itself had put the parameter of SRV01 there on its own accord. I can't see how something I did made that parameter be the wrong value. Maybe I did do something, but I did recreate the log shipping several times and double-checked what information I had put in.

I'd be interested to hear if anyone else has had this issue where the primary server name ends up in the copy and restore jobs instead of the secondary server name, and how you set up your log shipping.

24 comments:

  1. Thanks you a are life savior, looked up all google to resolve it and your small change in Copy Job, worked like charm.

    ReplyDelete
  2. I was on the phone with someone for three hours trying to figure this out. This post majorly saved me. Thank you so much for posting this it was the exact fix needed.

    ReplyDelete
  3. Your post was exactly what I needed to fix exactly this problem (banging my head for a couple of hours). Well done! Thank God for Google!

    ReplyDelete
  4. Very good trouble shooting, I had similar issue and was able to fix it by chaging the secorndary server name.
    Is this a SQL server bug.

    ReplyDelete
  5. Thanks This saved me a major headache

    ReplyDelete
  6. Thanks for this post it was very helpful. Although I was getting a similar error, the cause wasn't what you explained above. Rather, in my case, the [log_shipping_monitor_secondary] and [log_shipping_secondary_databases] tables had not been properly updated. I had to insert the rows manually. I've been having lots of small issues setting up log shipping. I've noticed that if you make a mistake while setting it up (i.e. a mistyped folder) the system tables won't always be cleaned up properly.

    ReplyDelete
    Replies
    1. Great Thomas ! Your solution helped me to fix this issue.

      Delete
    2. Thanks Thomas Mucha, I also found this post useful however your comment was the key to finding my problem. I had been trying to add an additional Secondary server/database to an existing log shipping configuration via SSMS (2014). Turns out it doesn't do that very well and forgot to write some important entries out to the log shipping/monitoring tables etc. The trick for me was to remove the existing log shipping configuration and set it up from scratch with both secondary servers all at the same time. Hope that helps someone out.

      Cheers,

      Lukas (http://www.cloudjourneyman.com/)

      Delete
  7. Very Nice.. Really helped me in getting the issue resolved.. Super :)

    ReplyDelete
  8. Thank you so much, this really helped a lot and fixed the issue in no time.

    ReplyDelete
  9. Thanks been banging my head since past 24 hours....this really helped!

    ReplyDelete
  10. Thanks!!!

    Saved me wasting a day...

    ReplyDelete
  11. Even I got the same error, I have job monitor on the secondary server and I went on to the activity monitor and found that the parameter is secondary sqlserver name.How to resolve it

    ReplyDelete
  12. 6 years down the track and this post is still helping people out. Thanks.

    ReplyDelete
    Replies
    1. This post just helped me as well. I had the exact same issue on SQL Server 2012. Thanks much.

      Delete
  13. Thanks, saved us a lot of work in our department. Now we can have a weekend!

    ReplyDelete
  14. That worked for me like magic!!!!Thanx from Zimbabwe.

    ReplyDelete
  15. I have read a few of the articles on your website now, and I really like your style of blogging. I added it to my favorites blog site list and will be checking back soon. Please check out my site as well and let me know what you think. gépszállítás Europa-Road Kft

    ReplyDelete
  16. I've scripted LS creation on Windows 2022 and SQL 2019 EE, and the result was exactly the same. Both secondaries were referencing the primary server on copy and restore jobs. Is MS planning to pay attention to the bugs that are lingering around across multiple versions? I just quit that company for that very reason.

    ReplyDelete