DB2 for z/OS Locking for Application Developers Part 8
Welcome to part eight of this blog series on DB2 Locking for Application Developers, which is about considerations for coding your application not only for data integrity (the main focus of this series of articles), but also for performance and concurrency, taking into account the transaction isolation levels in effect at run time. This article continues the discussion in part seven about read-only cursors running at a transaction isolation level of cursor stability (CS), supplemented by the fairly common practice of combining read-only cursors with searched updates. I’ll explain later in the article the perils of access-path-dependent cursors and why you should avoid relying on them. The article concludes with a brief discussion of the concept of optimistic locking, also called optimistic concurrency control.
I’ll start off, however, with a reminder that if you intend to read rows via a cursor with the intention of updating them (or deciding whether or not to update them), then where possible you should declare the cursor with the FOR UPDATE clause and use positioned updates: UPDATE … WHERE CURRENT OF … . This is because when you read a row via a for-update cursor, DB2 acquires an update (U) lock1 which is held as long as the cursor is positioned on the row, and is only released when the cursor moves off the row without updating it. The U lock allows concurrent transactions to acquire S locks, but not U or X locks. Of course, if the row is updated, the U lock is promoted to an X lock which is retained until the application reaches a commit point. This mechanism prevents any concurrent transaction from updating the row (or even evaluating the row for update) while the cursor is positioned on it.
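As a sketch of the pattern (the table and column names below are illustrative, not from any real schema):

```sql
-- For-update cursor: DB2 takes a U lock on each row as it is fetched.
DECLARE CUR1 CURSOR FOR
  SELECT ORDER_NO, STATUS
  FROM   ORDERS
  WHERE  STATUS = 'PENDING'
  FOR UPDATE OF STATUS;

OPEN CUR1;
FETCH CUR1 INTO :order-no, :status;

-- Positioned update: the U lock on the fetched row is promoted to an
-- X lock, which is held until the next commit point.
UPDATE ORDERS
  SET   STATUS = 'SHIPPED'
  WHERE CURRENT OF CUR1;
```

If the application decides not to update the fetched row, it simply fetches the next one, and the U lock on the previous row is released.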
Using an update cursor is not always possible, so some applications have to combine a read-only cursor with a searched update (UPDATE … WHERE cola = :hostvar … ). For example:
- ORDER BY is used to ensure that the rows are read by the application in a required order.
- The application coding framework’s default behaviour is to generate read-only cursor declarations combined with searched update statements.
- The application requires maximum concurrency and minimum locking overhead.
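The pattern under discussion looks something like the following sketch (again with illustrative names). Note that the ORDER BY makes the cursor read-only, so the update has to re-locate the row by key:

```sql
-- Read-only cursor: because of the ORDER BY, no U lock is taken on fetch,
-- and with lock avoidance there may be no S lock either.
DECLARE CUR2 CURSOR FOR
  SELECT ORDER_NO, STATUS
  FROM   ORDERS
  WHERE  STATUS = 'PENDING'
  ORDER BY ORDER_NO;

OPEN CUR2;
FETCH CUR2 INTO :order-no, :status;

-- Searched update: finds the row again by key rather than by cursor position.
UPDATE ORDERS
  SET   STATUS = 'SHIPPED'
  WHERE ORDER_NO = :order-no;
```

The rest of this article is about what can go wrong in the gap between the FETCH and the searched UPDATE.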
Let’s deal first with the more obvious integrity problem of using this technique with isolation CS and CURRENTDATA(NO). Because lock avoidance means the application cannot know whether or not an S lock was taken, it must assume there is no lock protecting the row from update by another transaction. The data integrity exposure is a sequence something like the following, where two transactions are trying to update the same row (this is not as rare as you might think):
Time T0 : Transaction A reads Row Z from Table T without acquiring a lock
Time T1 : Transaction B reads Row Z from Table T without acquiring a lock
Time T2 : Transaction A updates Row Z (Z0 → Z1A) and acquires an X lock
Time T3 : Transaction B tries to update Row Z and is suspended because of an incompatible lock held by Transaction A
Time T4 : Transaction A terminates, issuing a commit, and the X lock is released
Time T5 : Transaction B is resumed, acquires an X lock on Row Z and updates it (Z1A → Z1B, but based on the stale value Z0 it read at T1), overwriting Transaction A’s update
Time T6 : Transaction B terminates, issuing a commit, and the X lock is released
The end result is a lost update anomaly, and logical data corruption.
Switching to CURRENTDATA(YES) is not guaranteed to close the exposure to the lost update anomaly. If the result set is sorted or materialised in the workfile database, then by the time the application reads a row there is no lock on it, regardless of the CURRENTDATA setting, and the application remains exposed to the lost update anomaly. Unless you use the correct coding techniques, the only alternative is the unattractive one (from an OLTP performance and concurrency point of view) of using isolation RR or RS.
But let’s assume that the result set is not sorted or materialised in a workfile or in memory, and that S locks are obtained when the rows are read. Take the example above, where two transactions are trying to update the same row, each having read it via a read-only cursor. When Transaction A, which already has an S lock on Row Z, attempts to promote that lock to an X lock, it is suspended by DB2 because Transaction B holds an incompatible lock on Row Z – an S lock. Nevertheless, Transaction A retains its own S lock on Row Z, because its cursor is still positioned on that row. When Transaction B also attempts to promote its S lock to an X lock on Row Z, it too is suspended, because Transaction A holds an incompatible S lock. Both transactions are now suspended and, without DB2’s deadlock detection, would wait indefinitely for their lock promotion requests to succeed. This is an example of how you can get a deadlock on a single row. On the other hand, although the application is vulnerable to deadlocks, data integrity is guaranteed provided that the access path doesn’t change and the result set is not materialised in a workfile.
Similar considerations apply to the singleton SELECT. As discussed in the previous article, because there is no S lock on the row by the time it is returned to the application, using a singleton SELECT followed by a searched update with cursor stability isolation is vulnerable to the lost update anomaly, regardless of the CURRENTDATA setting.
To continue the discussion about cursors: it might be tempting to rely on combining ISOLATION(CS) with CURRENTDATA(YES) and an access path that avoids result set materialisation for a read-only cursor (typically one with an ORDER BY), so as to ensure that there is an S lock on the row when the transaction reads it. On that assumption, although the transaction might be vulnerable to lock waits, timeouts or deadlocks, data integrity is nevertheless guaranteed.
For example, consider a cursor-based SELECT in an ISOLATION(CS), CURRENTDATA(YES) package, where the access path uses an index which satisfies the predicates and provides implicit ordering for the ORDER BY clause, so that no sort is needed and an S lock is held on the page or row while the cursor is positioned on it. The searched UPDATE now appears safe, because the S lock prevents any other transaction from taking an X lock on the page or row.
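Sketched with hypothetical names, and assuming an index exists on ORDERS(ORDER_NO), the situation looks like this:

```sql
-- If DB2 chooses the index on ORDER_NO, the ORDER BY is satisfied without
-- a sort, the result set is not materialised, and with CURRENTDATA(YES)
-- an S lock protects the current row or page while the cursor sits on it.
DECLARE CUR3 CURSOR FOR
  SELECT ORDER_NO, STATUS
  FROM   ORDERS
  WHERE  ORDER_NO BETWEEN :low-key AND :high-key
  ORDER BY ORDER_NO;
-- Nothing in this statement *requires* that access path, however: after a
-- REBIND, the same cursor can sort into a workfile, and the S lock that
-- the searched update was relying on silently disappears.
```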
However, if the access path changes for any of many different reasons2, the assumption that the searched update is safe no longer holds. There is no guarantee that an ORDER BY will be satisfied by an index: there is always a risk that the ORDER BY causes a sort and result set materialisation, either in memory or in a workfile. If this happens, then once again the application is exposed to the lost update anomaly. For this reason, applications which combine read-only cursors with searched updates of rows in the result set should not rely on access-path-dependent cursors to guarantee data integrity.
This brings us back to the discussion about FOR UPDATE cursors and ordering of the result set. If the application relies on an index to impose ordering on a FOR UPDATE cursor (because the columns in the index match the columns in the predicates), that ordering is vulnerable to unwanted or unexpected access path changes. Data integrity is still guaranteed, because the FOR UPDATE cursor exploits DB2’s locking semantics; but if the application logic depends on the ordering of the rows, and that ordering is no longer provided because DB2 selects an index with different ordering or reverts to a tablespace scan, then the logic of the application is compromised.
Examples include batch programs with intermediate commits and restart logic to resume processing after a failure, or batch programs which reopen their cursors after a commit and reposition on the data. If such an application relies on implied ordering, there is a prospect of data corruption if an access path change means the implied ordering is no longer honoured: rows being updated multiple times, or not updated at all.
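One common defence, sketched below with hypothetical names, is to make the required order explicit with an ORDER BY and reposition on the key after each commit, rather than trusting the access path to deliver the rows in index order:

```sql
-- Restart/repositioning sketch: the ordering is stated explicitly, so an
-- access path change cannot silently deliver the rows in a different order.
-- :last-key holds the key of the last row processed before the commit
-- (persisted in a restart table so the program can resume after a failure).
DECLARE CUR4 CURSOR FOR
  SELECT ORDER_NO, STATUS
  FROM   ORDERS
  WHERE  STATUS = 'PENDING'
    AND  ORDER_NO > :last-key
  ORDER BY ORDER_NO;
-- After each commit: save :last-key, CLOSE CUR4, then OPEN CUR4 to reposition.
```

Because the cursor carries an ORDER BY it is read-only, so the updates themselves must follow the safe coding techniques for searched updates discussed in this series.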
It’s time to talk about a topic that so far has only been hinted at: optimistic locking, or optimistic concurrency control. Optimistic locking assumes that concurrent transactions can run successfully without affecting each other, so that each individual transaction does not need to lock the rows being accessed. DB2 transactions running with page/row level locking, cursor stability isolation and CURRENTDATA(NO) exploit one of the variants of optimistic locking3. For read-only transactions, the question is: can they tolerate accessing data without locking it? For update transactions, there is the additional consideration that the row they are updating might have changed since they read it. I would argue that any transaction using a read-only cursor combined with searched updates, even with CURRENTDATA(YES), must be coded as if it were using optimistic locking, because it cannot know with certainty when a lock was taken on the row it is reading – at fetch time or at open cursor time. This takes us back to the earlier points in this article: whether these transactions intentionally use optimistic locking or not, they must be coded as if that were the case; otherwise data integrity is compromised.
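The classic optimistic locking coding technique, sketched here with a hypothetical VERSION column (the names are illustrative; DB2 also offers a built-in row change timestamp that can serve the same purpose), is to re-check at update time that the row is still in the state that was read:

```sql
-- Read without relying on any lock, remembering the version that was seen.
SELECT STATUS, VERSION
  INTO :status, :old-version
  FROM ORDERS
  WHERE ORDER_NO = :order-no;

-- Update only if no other transaction has changed the row since the read.
UPDATE ORDERS
  SET   STATUS  = 'SHIPPED',
        VERSION = VERSION + 1
  WHERE ORDER_NO = :order-no
    AND VERSION  = :old-version;
```

If the UPDATE reports zero rows updated, another transaction got there first: the application re-reads the row and either retries or reports the conflict, instead of silently overwriting the other transaction’s work.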
In the next article, we at last move on to recommendations for coding update applications for data integrity, even when using read-only cursors, using the same techniques that apply to optimistic locking update transactions, to deliver high performance combined with guaranteed data integrity. Arising out of the discussion on data currency and cursors, there are some additional considerations for read-only transactions which probably affect only a tiny minority of applications; these are discussed in a later article.
1 With the exception of when you declare the cursor using the combined WITH RS/RR and USE AND KEEP EXCLUSIVE LOCKS clauses, in which case an X lock is acquired at fetch time.
2 For example, a REBIND following a RUNSTATS, DB2 maintenance, database schema changes, the index being unavailable for reasons such as being in rebuild pending etc.
3 Some might argue that this is not pure optimistic locking as DB2 might take a lock on a page or row. Whether or not DB2 obtains a lock depends on the effectiveness of the lock avoidance mechanism, that is, whether or not DB2 can determine that the row/page contains committed data. And bear in mind that DB2 will take an intent lock on the tablespace/partition.