Skip to content
Featured / AI Query Optimization

AI Query Optimization

AI Query Optimization

Enabling and monitoring the new feature

Db2 v12.1 has introduced AI enabled query optimization (or optimisation, depending on which side of the pond you’re on). This initial offering is aimed at estimating cardinality more accurately and more thoroughly than is easily possible using RUNSTATS. It will also give more accurate results by modelling cardinality for predicates with ranges, BETWEEN, IN and OR predicates but

NB it will leave predicates with expressions or parameter markers to the traditional optimiser

Unless REOPT is used. This can be a bit of a headache, and you will find using db2exfmt to examine your SQL a bit more of a chore if you want to use the REOPT option.

 

Enabling

This is basically switching a number of Database Configuration parameters on (unless this is a new database, being created in a v12.1 instance; in which case they are on by default).

But you do need them all and you have to preserve a certain order (although you will get an error message if you don’t).

Basically, you switch AI Modelling on with

UPDATE db cfg USING AUTO_AI_MAINT ON
UPDATE db cfg USING AUTO_MODEL_DISCOVER ON

But (to quote the manual)

— The models are managed as part of the automatic statistics collection operation,
— so auto_tbl_maint, auto_ai_maint and auto_runstats must be set to ON for auto_ai_optimizer to be effective.

But you must not forget to enable AUTO_MAINT too as it is the parent for those other parms. So

UPDATE db cfg USING AUTO_MAINT ON
UPDATE db cfg USING AUTO_TBL_MAINT ON
UPDATE db cfg USING AUTO_RUNSTATS ON

Then you can switch AI Query Optimisation on with

UPDATE db cfg USING AUTO_AI_OPTIMIZER ON

 

Checking

So, what has that actually done? What it should have done is enable a model to be created for each table, and this is what the AI Optimisation will be interrogating. There are a couple of simple queries that will confirm this.

First of all, there is a new System Catalogue called SYSCAT.AIOPT_TABLECARDMODELS. Take a look in there to see which models have been created; you should see something like this

Model Name

I’ve censored the table names as they belong to a customer, but the query has been limited to just the 4 tables that are part of a query I want to be tuned with the aid of the AI Query Optimiser.

It’s worth noting the create times; the models weren’t all created simultaneously but were built using the same triggers (small ‘t’; not Triggers) that initiate automatic Runstats. There might be a considerable gap between first and last, so it’s worth checking that the models you want to use have all been created before checking their use.

You probably want to check what actions are taking place with the training of these models. The best way to do this is to use the PD_GET_DIAG_HIST table function. Here’s the output of a query that is limited to just 1 of those 4 tables and that is not looking at the COLLECT type (COLLECT shows statistics being collected, either using explicit RUNSTATS commands or auto_runstats).

AI Query Optimization

What we see here is the Model being initially built (DISCOVER) and then populated (TRAIN). We only see 2 rows for each Event Type; one start and one success. You will see more TRAIN events as time goes on but this system having very low levels of activity, more TRAINs haven’t been triggered.

In addition, it is possible to see more DISCOVER events on the same table as time goes on. These DISCOVER events evaluate whether the model could benefit from being retrained on additional columns.

If you do want to see what is happening with statistics collection you can use that same query but include another column to see how the statistics were collected:

this shows single table and indicates that a RUNSTATS command was issued, either manually or from a scheduled job. On the 18th May and then auto-runstats kicked in and did an asynchronous collection on the 5th June.

 

Usage

All very well, but what is this feature actually doing for us? The best way of examining this is to take a look at the access paths and see what is going on. The db2exfmt command can be used to give a detailed view of the actual execution plan.

Things to look out for are

An ‘Eyecatcher’ banner

What you should find under the Access Plan header and right after the Total Cost and Query Degree figures is a banner with the text “Assisted by the AI Query Optimizer”.

This tells you that the AI Query Optimizer is active but not what it is actually doing.

NB this has only been introduced at ModPack 1 for v12.1 so you won’t find anything if that, or subsequent, ModPacks haven’t been applied.

Objects Used in Access Plan

At the bottom of the db2exfmt output is a report on the objects that are used. This will be tables, views, indexes etc. and it will now include, if applicable, something like this

If you refer back to the output from the AIOPT_TABLECARDMODELS query, you can see that this is a Model that was detailed in the that query.

But this just tells us that the model is present, but not what, if anything, was the benefit. It being referred to in this section does not mean it was considered for use in the plan.

 

Table Cardinality Model Predicates

This will be in the middle of the db2exfmt output, but this is what you need to find to prove something is being done by the AI Query Optimiser. It might be this simple

and this tells us that the model was used to examine the cardinality for a specific predicate in the query.

Whether it gives a benefit or not can be determined by looking at the access plan generated by db2exfmt.

Unfortunately, it is not as simple as looking at Total Cost. You might see the total cost increase with an AI Optimizer-influenced plan because the cardinality estimates become more realistic when they might have previously (without AI Optimizer) been underestimated.

But if you do see the Total Cost drop in the AI enabled one, it should be indicating a benefit. There’s no substitute for detailed examination of the access plan though.

If you want to compare output extracted whilst the AI Query Optimiser was running and then switching if off and re-running db2exfmt, switching it off is as simple as

UPDATE db cfg USING AUTO_MAINT OFF
And/or
db2set DB2_SELECTIVITY=MODEL_PRED_SEL OFF

It’s worth having a look at the Knowledge Centre for details on how and which predicates are evaluated by the AI Optimiser (see The AI Query Optimizer) but this screenshot gives a handy list of what will and will not be evaluated:

 

Conclusion

This is very much a V1.0 offering. The basic mechanics of switching on the AI Optimiser feature, and invoking model creation and training is there but the impact on your queries is limited to the, albeit crucial, cardinality estimation and, as pointed out in the introduction, won’t be any use if your queries use parameter markers. Which, let’s face it, is a considerable proportion of them.

The other thing I need to evaluate is what impact the training of the models has on the database; any impact on day-to-day performance, checking the storage being used by these models; that sort of thing. Maybe another blog…

But it is a start and a worthwhile and valuable one and provides the basis for introducing more complex AI estimation and, maybe one day, replacing all manual statistics collection.

My thanks again to Brandon Frendo from Db2 Software Development in Toronto for his help with this blog.

Email me at mark.gillis@triton.co.uk if you have any questions or observations, or if you want copies of the SQL I have referred to in this blog post.