Skip to content
Featured / Lagging Behind or How to Find What Your Backup Image Doesn’t Contain

Lagging Behind or How to Find What Your Backup Image Doesn’t Contain

This is a feature that many of you probably know about but was a bit of a revelation to me.
I had a requirement to do a calculation on a value in one row, using the corresponding value from another row. A quick search of t’internet and I found this page.

(I’m not sure why it’s in the Informix documentation. If you look in the Knowledge Centre for the LAG and LEAD functions, you can find it here) but the Informix pages were a bit clearer to my mind.

 

LAG and LEAD

Anyhow; the point is that these two functions will find values from before the current row (LAG), or after it (LEAD). For my purposes the LAG one was what I needed but both work in a similar fashion.

What I am doing is interrogating the SYSIBMADM.DB_HISTORY view and I am in trying to use this to show me details about my customers backup history. This is partly to make sure it is OK and represents a comprehensive body of data from which to Restore the database to a Point-In-Time should (God forbid) we need to conduct such an exercise.

Our customers, on our advice, tend to have a Full Backup once a week, or fortnight, or at whatever time is appropriate, and then successive Incremental Deltas. This keeps the impact of the backups to a minimum whilst still providing a good robust set of backup images.

 

The Standard View

I’m not going to post the entire query as it’s pretty verbose (contact me if you want a copy) but this is roughly what you see:

Lagging behind Db2

 

Quite a lot of useful information about what happened, when and how long it took but the bit that I want to concentrate here is the “Log Span”. That is a calculation of how many logs are included in the backup image.

The SYSIBMADM.DB_HISTORY view includes two columns: FIRSTLOG and LASTLOG, that show the earliest and latest transaction logs associated with an event (and event is not necessarily a backup; all your transaction log archiving will also be in there, for instance). But if you look at the values in those columns for a Backup operation, it will give you the first and last logs that were included. You can then take the numeric values out of the transaction log file name, turn them into integers and do a calculation to show how many logs were included:

Lagging behind

(NB these are fragments from within the query. They will not work if you simply copy them straight into a query of your own).

 

The Extended View

But what I want to know is how many logs are not included: how many logs must be recovered from the transaction log archive in order to conduct a restore. In other words, what is the gap between the last transaction log included in yesterday’s backup and the first transaction log in todays? I refer to that as the “Log Gap” and this is what I want to see:

Lagging behind

That last log included in Sundays (Full) backup was S1010935 and the first one in Mondays (Delta) backup was S1011556, which means 621 transaction logs will have been created and archived in the intervening period. That represents an overhead in terms of recovery and that is what I want to keep an eye on.

So, what is the SQL required?

 

LAG Syntax

Lagging behind DB2

That is saying find the LASTLOG value from the previous row, but you need to get the details in the OVER clause right. Clearly, you’ll want to sort by the timestamp, or you won’t be looking at the details for the previous row. But remember that the SYSIBMADM.DB_HISTORY view contain many other events than just Backups so you need to PARTITION BY OPERATION, or else you’ll be including other events, such as Log Archiving and your Log Gap will look very odd, as it will be based on Previous Log File details like this (last column).

Gotchas

Apart from that there are some other clauses in the LAG and LEAD functions that might be useful. One such advertised is IGNORE (or RESPECT) NULLS. Chapter and verse from the manual is:

  • If you specify the RESPECT NULLS keywords, rows whose expression evaluates to NULL are included when offset rows are counted.
  • If you specify the IGNORE NULLS keywords, any row whose expression evaluates to NULL is not included when offset rows are counted.

…which might be handy.

But if you just put it in after the column name, you get a failure:

Lagging behind

That’s because the IGNORE NULLS will only be recognised if it is preceded by an expression:

Even if you don’t want to override the default actions. So you would need:

lagging behind db2

 

Conclusion

I didn’t actually need the LEAD function but I think it would be similarly easy to use. Most of the issues I had with incorporating this function into my query were to do with the difficulty of reformatting the value passed back. I can see other places this is going to be of value; any of my monitoring scripts where I take a snapshot of the current values and need to calculate a delta using the last recorded figures. But that will (may) be another blog.

Please let me know if you want a copy of any of the scripting referred to here or if you have other ideas for the use of these functions. You can use my email at mark.gillis@triton.co.uk or just comment on the blog on the right.