Triton Db2 Geek

Confessions of a DB2 geek

IBM Gold Consultant Program and IBM Premier business Partner

DB2 pureScale - Insert Load & Replication

Published 21 July 2010 - by Clair Ross

Following on from James Gill’s recent blog post about the Coupling Facility today I’m sharing with you our experiences of working with Insert Load & Replication in DB2 pureScale. 

In order to test the load capacity of the IBM DB2 pureScale machine we needed some test data and lots of it!  Being somewhat familiar with the capabilities of the CF on z/OS I knew that a few thousand records would not be enough.

After some checking about, I decided to “stand on the shoulders of giants” and use the EMPLOYEE table from the DB2 SAMPLE database.  Simple, elegant and to the point.  The only problem was that with the current “out of the box” table definition, there would only be 999,999 records that could be produced.

Column nameSchema Data type nameLengthScale Nulls
EMPNO  SYSIBMCHARACTER 6 0 No
FIRSTNME                                                SYSIBMVARCHAR     12  0 No
MIDINIT                                                SYSIBMCHARACTER 10 YES
LASTNAME                                           SYSIBM VARCHAR 150 NO
WORKDEPT             SYSIBMCHARACTER          30 YES
PHONENO                     SYSIBM    CHARACTER   40 YES
HIREDATE                                              SYSIBMDATE 40 YES
JOB                                             SYSIBM  CHARACTER  80 YES
GENDERSYSIBM CHARACTER    10 YES
BIRTHDATE                                      SYSIBMDATE   40 YES
SALARY                                          SYSIBMDECIMAL   92 YES
BONUS                                     SYSIBMDECIMAL 92 YES
COMM                                            SYSIBM    DECIMAL   92 YES

14 record(s) selected.

With a little transformation, we changed the EMPNO to be CHAR(12).  This would allow for the increase of data.  (a possible 999,999,999,999 records)

Column nameSchema Data type nameLengthScale Nulls
EMPNO                               SYSIBM CHARACTER    120 NO
FIRSTNME                                     SYSIBMVARCHAR   120 NO
MIDINIT                          SYSIBM   CHARACTER       10 YES
LASTNAME  SYSIBM VARCHAR 150 NO
WORKDEPT          SYSIBM  CHARACTER   30 YES
PHONENO SYSIBM CHARACTER  40 YES
HIREDATE      SYSIBM  DATE      40 YES
JOB     SYSIBM CHARACTER    80 YES
EDLEVEL  SYSIBM  SMALLINT20 NO
GENDERSYSIBM CHARACTER    10 YES
BIRTHDATE SYSIBM  DATE      40 YES
SALARY    SYSIBM DECIMAL        92 YES
BONUS    SYSIBM  DECIMAL   92 YES
COMM   SYSIBM DECIMAL92 YES

14 record(s) selected. 

We exported the data into a text delimited file (all 41 records), created a table “LIKE” the employee table, then altered the definition.

I wrote a Q&D Perl script to iterate through the delimited data file and reassign the new EMPNO value.  After a little testing, I just started the program and then turned my attention to items.

After a period of time, I checked the progress and we had approximately 8.5 million records.  I then set up a db2batch script to load the data from the single file.

Other loading strategies attempted were made by splitting the 8.5 million records into multiple files and loading from a single member, then from multiple members.   Autocommit was initially on for the first run, but was switched off and commits were performed at every 10K records.  For the multiple member loads the commit count was at every 500 records.

After the initial load, James suggested that the data file be partitioned three ways for the three data members and to test for contention on the single table.

This was set up quickly and we were good to go.

Upcoming Articles

Best Practices for DB2 LUW

 
Website designed and built by Accent Design Geeks