Introduction
If you are using the COS (Cloud Object Storage, or just “remote storage”) Db2 feature and are planning an upgrade of Db2, then read on!
NOTE: This article specifically discusses AWS S3 remote storage, which happens to be the only available vendor (apart from SoftLayer) in Db2 version 11.5 (where the COS feature was introduced). As for SoftLayer, it apparently didn’t make it to the next Db2 level, v12.1, so it is probably not worth looking into anyway.
The supported COS vendors in Db2 v12.1 are AWS S3 and Azure (as of v12.1.2), which means S3 is the single COS vendor that can survive a Db2 upgrade from v11.5 to v12.1.
Or can it?
Another reason for discussing just the S3 COS is that I had a recent experience with it, which I want to share with the rest of the community.
Here is/was my situation:
I had to upgrade a live two-node HADR cluster running Db2 v11.5.7 (on RHEL8.1). However, a previous (failed) upgrade had left the Standby node running on the upgraded operating system, RHEL9.6. There was nothing wrong with that until I tried to convert the Standby database to a standalone database (in order to perform the Db2 upgrade on that node as quickly as possible, and then take care of the other node and the HADR resync). When I dismantled the HADR cluster and ran the “rollforward DB” command on the Standby database, it hung. It hung so badly that no other Db2 commands issued from the command line worked, so I had to “db2_kill” the instance. According to the logs, the COS libraries were either missing or could not be loaded, which caused all the havoc.
So, what went wrong here?
What happens to COS during a Db2 upgrade?
In short, nothing, and that exactly is the problem!
Why?
Because the AWS S3 COS libraries are linked to other operating-system-specific libraries, which makes them dependent on the OS level, so they will most certainly not work after the OS upgrade.
This becomes immediately apparent when you look at those libraries. Here’s the situation with Db2 v11.5.7 running on RHEL8.1 (slightly redacted, not showing all output):
cd /opt/ibm/db2/V11.5.7/lib64
ls -al libaws*
libaws-c-common.so -> awssdk/RHEL/8.1/libaws-c-common.so
libaws-c-event-stream.so -> awssdk/RHEL/8.1/libaws-c-event-stream.so
libaws-checksums.so -> awssdk/RHEL/8.1/libaws-checksums.so
libaws-cpp-sdk-core.so -> awssdk/RHEL/8.1/libaws-cpp-sdk-core.so
libaws-cpp-sdk-s3.so -> awssdk/RHEL/8.1/libaws-cpp-sdk-s3.so
libaws-cpp-sdk-transfer.so -> awssdk/RHEL/8.1/libaws-cpp-sdk-transfer.so
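The same inspection can be scripted. Below is a minimal sketch, not an IBM-provided tool: the function name check_aws_links and its two arguments are my own invention, and it assumes the symbolic links follow the awssdk/RHEL/<version>/ layout shown in the listing above. It prints each libaws* link with its target and flags links that point at a different version directory or at a file that no longer exists.

```shell
#!/bin/sh
# Hypothetical helper (not part of Db2): report where each libaws* symbolic
# link in a Db2 lib64 directory points, and flag anything suspicious.
#   $1 = Db2 lib64 directory, e.g. /opt/ibm/db2/V11.5.7/lib64
#   $2 = expected version directory under awssdk/RHEL, e.g. 8.1
check_aws_links() {
    lib_dir=$1
    os_ver=$2
    status=0
    for link in "$lib_dir"/libaws*; do
        # Skip anything that is not a symbolic link (including an unmatched glob).
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        case "$target" in
            awssdk/RHEL/"$os_ver"/*) state="ok" ;;
            *) state="MISMATCH"; status=1 ;;
        esac
        # -e follows the link, so this catches targets that do not exist.
        if [ ! -e "$link" ]; then
            state="DANGLING"
            status=1
        fi
        printf '%s -> %s [%s]\n' "${link##*/}" "$target" "$state"
    done
    return "$status"
}
```

For example, check_aws_links /opt/ibm/db2/V11.5.7/lib64 8.1 should list every link as [ok] on the system shown above, and a non-zero return code means at least one link needs attention. Which version directories actually exist under awssdk/RHEL depends on your Db2 installation, so list that directory first rather than assuming one matches your exact OS level.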
The problem should by now be clear:
For Db2 running on RHEL, one of the prerequisites for the installation of Db2 v12.1 is to upgrade the OS to level 9.
So, when you do that and then try to start your instance/database (or, in my case, try to bring the existing database out of the “RFWD pending state”) on the new RHEL level, you will get something similar to the below in the Db2 diagnostic log file (or in my case – a complete hang):
...
MESSAGE : ECF=0x90000076=-1879048074=ECF_LIB_CANNOT_LOAD
Cannot load the specified library
DATA #1 : String, 26 bytes
Unable to load API plug-in
DATA #2 : String, 43 bytes
/home/db2inst1/sqllib/lib/libdb2CosApiS3.so
...
MESSAGE : ZRC=0x870F00B7=-2029059913=SQLO_UNSUPPORTED
"Operation is unsupported."
DATA #1 : String, 34 bytes
Failed to initialize any COS APIs.
...
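If you suspect you have hit this problem, a quick way to confirm it is to search the diagnostic log for the two signatures quoted above. The following is a minimal sketch using plain grep; the function name cos_load_failures is hypothetical, and the log path you pass in is an assumption you need to adapt (the actual location depends on the DIAGPATH setting of your instance).

```shell
#!/bin/sh
# Hypothetical helper (not part of Db2): print the lines, with line numbers,
# of a diagnostic log that carry one of the two COS load-failure signatures.
#   $1 = path to db2diag.log (location depends on your DIAGPATH setting)
# Returns 0 if at least one signature was found, non-zero otherwise.
cos_load_failures() {
    diag_log=$1
    grep -n \
        -e 'ECF_LIB_CANNOT_LOAD' \
        -e 'Failed to initialize any COS APIs' \
        "$diag_log"
}
```

A call such as cos_load_failures /path/to/db2diag.log prints the matching lines; no output and a non-zero return code mean neither signature is present in that file.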
The Solution
The solution to the above-described problem is, thankfully, very simple.
All you have to do is go to your Db2 LIB directory and re-link the AWS S3 COS libraries to match the current OS level – for example (here, “XX.Y” would be your current Db2 level, and “Z.X” your current RHEL level):
cd /opt/ibm/db2/VXX.Y/lib64
rm libaws*
for i in awssdk/RHEL/Z.X/libaws*
do
ln -sf "$i" .
done
Alternatively, if you don’t like fiddling with the symbolic links, do the following instead:
- Completely unconfigure and remove the S3 remote storage support
- Upgrade Db2 (together with the prerequisite OS upgrade)
- Now on the new Db2/OS levels, reconfigure the S3 remote storage support
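For those who do go the re-linking route, here is a slightly more defensive sketch of the same idea. It is only an illustration under my own assumptions: the function name relink_aws_sdk is hypothetical, and it assumes the awssdk/RHEL/Z.X layout described above. The difference from the plain loop is that it first checks that the target directory really contains libaws* files before removing anything, so a mistyped version cannot leave you with no links at all.

```shell
#!/bin/sh
# Hypothetical helper (not part of Db2): re-point the libaws* symbolic links
# in a Db2 lib64 directory at another awssdk/RHEL/<version> directory.
#   $1 = Db2 lib64 directory, e.g. /opt/ibm/db2/VXX.Y/lib64
#   $2 = version directory under awssdk/RHEL to link against (the "Z.X" above)
# The body runs in a subshell, so the caller's working directory is untouched.
relink_aws_sdk() (
    lib_dir=$1
    os_ver=$2
    cd "$lib_dir" || exit 1

    # Collect the new targets first and refuse to continue if there are none.
    set -- awssdk/RHEL/"$os_ver"/libaws*
    if [ ! -e "$1" ]; then
        echo "no libaws* files under $lib_dir/awssdk/RHEL/$os_ver" >&2
        exit 1
    fi

    # Remove only existing symbolic links, never regular files.
    for old in libaws*; do
        if [ -L "$old" ]; then
            rm -f "$old"
        fi
    done

    # Create one relative link per library, named after the library file.
    for lib in "$@"; do
        ln -sf "$lib" "${lib##*/}"
    done
)
```

It would be called as relink_aws_sdk /opt/ibm/db2/VXX.Y/lib64 Z.X, with the same placeholders as in the commands above, by a user who is allowed to write to the Db2 installation directory.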
The fundamental problem with all of the above is that none of it is described in the Db2 Knowledge Centre (or, at least, wasn’t at the time when I needed the info).
If this peculiarity with the AWS S3 COS libraries had been mentioned somewhere in the Db2 Upgrade Prerequisite steps, it would have made my life so much easier during the Db2 upgrade. And it would certainly help anyone else who finds themselves in the same situation!
(Somewhat) to IBM Support’s credit, the problem is already described (and the solution provided) in the following technote:
https://www.ibm.com/support/pages/db2start-fails-after-linux-os-upgrade
(also reachable via: https://www.ibm.com/support/pages/node/7180669)
But… to get to this technote, you have to search for it using very specific keywords, which you most likely took from the db2diag.log error message after the problem had already occurred. So this is in fact a reactive solution, not a proactive one (a proactive one would have been present in the Db2 docs).
Last but not least, many thanks to Ian Bjorhovde (IBM Champion) and Matthew Emmerton (Db2 Dev) for pointing me in the right direction at a time of crisis!