Constants vs. Parameterless Functions

Do you use parameterless PL/SQL functions in your queries? Did you know that this may cause performance issues? In this blog post I explain why parameterless functions can be the reason for bad execution plans in any Oracle Database.

I recently had to analyze this problem in a production system and thought it was worth sharing. On the one hand because we did not find a satisfactory solution and on the other hand because this could change in the future when we start discussing it.

For this blog post I used an Oracle Database 19c Enterprise Edition, version 19.5.0.0.0 in a Docker environment. However, you can run the scripts in any edition of an Oracle Database 12c Release 1 or later to reproduce the results.

1. Data Setup

We create a user demo  with ALTER SESSION and SELECT ANY DICTIONARY privileges as follows:

Then as user demo we create a table t with an index t_ind_idx on column ind

and populated table t with the following anonymous PL/SQL block:

The case expression leads to a skewed distribution of column ind. Only around 0.1% of the rows have a value 1 as the following query shows:

Therefore we are gathering statistics for table t  with a histogram for column ind:

Now, we can check the histogram with the following query:

Two rows for the two values of column ind. For value 0 we expect 99899 rows (endpoints) and for value 1  we expect 101 rows (100000 – 99899 endpoints). This is 100 percent accurate.

2. Constant Declaration

In the Trivadis PL/SQL & SQL Guidelines we recommend avoiding the use of literals in PL/SQL code. Every time we see a literal in a PL/SQL code we should consider using a constant instead. Often this makes sense because the name of the constant is more meaningful than the literal, making the code more readable and maintainable.

Hence we create the following PL/SQL package for our representation of boolean values in SQL:

Now we can use these constants in our PL/SQL code as follows:

When developing complex SQL statements I often run them standalone in an IDE until I’m satisfied with the result. But when we run this

we get the following error message:

We have to change the constant const_boolean.co_true  to a literal (1), which is cumbersome and error-prone.

3. Parameterless Functions for Constants

As a workaround we can create a parameterless function for each constant. Like this:

Now we can use the function in PL/SQL and SQL like this:

So far so good.

4. The Problem

The execution plan of the previous statement looks as follows:

When you look at line 12 you see that the optimizer estimates to process 50000 rows. That’s 50 percent of all rows. This is based on the number of distinct values for column ind and the number of rows in the table t. The optimizer gets these information from here:

But unfortunately, the histogram for column ind is ignored. Why? Because the Oracle Database has no idea what the value of const_boolean.true#  is. Hence, a histogram is not helpful in finding an execution plan.

An optimal plan would look like this:

When you look at line 12, you see that

  1. an INDEX RANGE SCAN is used and
  2. the number of rows is estimated correctly.

We get this plan when using a literal 1, a bind variable with the bind value 1 (thanks to bind variable peeking) or a constant with value 1 (which is treated as a bind variable in PL/SQL).

The wrong cardinality is a major problem. Because the cardinality is the most important criterion for choosing an optimal access method, join order and join method. Bad cardinality estimates lead to bad execution plans and bad performance. This cannot be ignored, even if in this demo case the resulting performance is still okay.

The problem occurs only if we are accessing columns with significant skewed data and if these columns have a histogram.

5. Workarounds

We have basically three options to work around the problem:

  1. For PL/SQL code we can use a constant instead of a parameterless function
    (e.g. ind = const_boolean.co_true)
  2. For PL/SQL code or plain SQL like in views, we can use a literal with a comment instead of a parameterless function
    (e.g. ind = 1 -- const_boolean.co_true)
  3. For PL/SQL code or plain SQL like in views, we can query the parameterless function in a subquery and force the optimizer to execute it during parse time [added on 2019-12-14]
    (e.g. ind IN (SELECT /*+ precompute_subquery */ const_boolean.true# FROM DUAL))

The first option has the drawback, that you have to change the SQL to make it runnable outside of PL/SQL. The second option may lead to inconsistencies due to wrong literal/comment combinations or when changing constant values. The third option requires an IN condition that could be accidentally changed to an equal comparison condition due to the scalar subquery, which would make the undocumented precompute_subquery hint ineffective. [added on 2019-12-14]

Of course you can continue to use parameterless functions in SQL and PL/SQL and switch to one of the options if there is a problem or if you know that a histogram exists for a certain column. But this is difficult to apply in a consistent manner. In fact, it makes maintenance more complicated with a certain performance risk or penalty.

6. Considered Alternatives

I had a look at Associate Statistics (Extensible Optimizer Interface). This does not help, because there is no way to access the related table columns to calculate the impact on the selectivity. The feature is useful if a function gets some parameters to calculate the impact on the selectivity, but without parameters this is not possible.

I had considered list partitioned tables based on skewed columns instead of using indexes. This works and can make sense to reduce the overhead of an index (especially indexing non-selective values). But the issues regarding parameterless functions are 100 percent the same.

7. Summary

Parameterless functions are a way to use constants in SQL outside of PL/SQL, for example in views. However, they rob the optimizer of the capability to use histograms and therefore finding an optimal execution plan.

Actually we can only work around the problem. The Oracle Database has to provide the solution. Either by allowing to access package global constants in SQL (outside of PL/SQL) or by implementing some kind of peeking for parameterless, deterministic functions.

If you think I missed something important, especially if you think there is a better workaround or even a solution, then please do not hesitate to leave a comment or contact me directly. Thank you.

Updated on 2019-12-14, added a third option under “5. Workarounds” based on a tweet by Jonathan Lewis. Thanks a lot.

4 Comments

  1. (Just for completeness)

    I took a look at this back in 2016 to see if SYS_CONTEXT provides any help

    https://connor-mcdonald.com/2016/10/20/taking-a-peek-at-sys_context/

  2. Roger says:

    Oracle 20 SQL macros will solve this … looking forward to it.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.