It shouldn’t be possible to get the wrong results by using a hint – but hints are dangerous and the threat may be there if you don’t know exactly what a hint is supposed to do (and don’t check very carefully what has happened when you’ve used one that you’re not familiar with).
This post was inspired by a blog note from Connor McDonald titled “Being Generous to the Optimizer”. In his note Connor gives an example where the use of “flexible” SQL results in an execution plan that is always expensive to run when a more complex version of the query could produce a “conditional” plan which could be efficient some of the time and would be expensive only when there was no alternative. In his example he rewrote the first query below to produce the second query:
select data
from   address
where  ( :choice = 1 and street = :val )
or     ( :choice = 2 and suburb = :val )
;

select data
from   address
where  ( :choice = 1 and street = :val )
union all
select data
from   address
where  ( :choice = 2 and suburb = :val );
(We are assuming that bind variable :choice is constrained to be 1 or 2 and no other value.)
In its initial form the optimizer had to choose a tablescan for the query; in its final form the query can select which half of a UNION ALL plan to execute because the optimizer inserts a pair of FILTER operations that check the actual value of :choice at run-time.
When I started reading the example my first thought was to wonder why the optimizer hadn’t simply used “OR-expansion” (or concatenation if you’re running an older version), then I remembered that by the time the optimizer really gets going it has forgotten that “:choice” is the same bind variable in both cases, so doesn’t realise that it would use only one of two possible predicates. However, that doesn’t mean you can’t tell the optimizer to use concatenation. Here’s a model – modified slightly from Connor’s original:
drop table address purge;

create table address ( street int, suburb int, post_code int, data char(100));

insert into address
select mod(rownum,1e4), mod(rownum,10), mod(rownum,1e2), rownum
from   dual
connect by level <= 1e5 -- > comment to avoid WordPress format issue
;

commit;

exec dbms_stats.gather_table_stats('','ADDRESS')

create index ix1 on address ( street );
create index ix2 on address ( suburb );
create index ix3 on address ( post_code );

variable val number = 6
variable choice number = 1

alter session set statistics_level = all;
set serveroutput off
set linesize 180
set pagesize 60

select /*+ or_expand(@sel$1) */
        count(data)
from    address
where   ( :choice = 1 and street = :val )
or      ( :choice = 2 and suburb = :val )
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last outline'));
I’ve added one more column to the table and indexed it – I’ll explain why later. I’ve also modified the query to show the output but restricted the result set to a count of the data column rather than a (long) list of rows.
Here’s the execution plan output when hinted:
SQL_ID  6zsh2w6d9mddy, child number 0
-------------------------------------
select /*+ or_expand(@sel$1) */ count(data) from address where
( :choice = 1 and street = :val ) or ( :choice = 2 and suburb = :val )

Plan hash value: 3986461375

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                 |      1 |        |      1 |00:00:00.01 |      12 |     27 |
|   1 |  SORT AGGREGATE                         |                 |      1 |      1 |      1 |00:00:00.01 |      12 |     27 |
|   2 |   VIEW                                  | VW_ORE_B7380F92 |      1 |  10010 |     10 |00:00:00.01 |      12 |     27 |
|   3 |    UNION-ALL                            |                 |      1 |        |     10 |00:00:00.01 |      12 |     27 |
|*  4 |     FILTER                              |                 |      1 |        |     10 |00:00:00.01 |      12 |     27 |
|   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| ADDRESS         |      1 |     10 |     10 |00:00:00.01 |      12 |     27 |
|*  6 |       INDEX RANGE SCAN                  | IX1             |      1 |     10 |     10 |00:00:00.01 |       2 |     27 |
|*  7 |     FILTER                              |                 |      1 |        |      0 |00:00:00.01 |       0 |      0 |
|*  8 |      TABLE ACCESS FULL                  | ADDRESS         |      0 |  10000 |      0 |00:00:00.01 |       0 |      0 |
------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('18.1.0')
      DB_VERSION('18.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SET$9162BF3C_2")
      OUTLINE_LEAF(@"SET$9162BF3C_1")
      OUTLINE_LEAF(@"SET$9162BF3C")
      OR_EXPAND(@"SEL$1" (1) (2))
      OUTLINE_LEAF(@"SEL$B7380F92")
      OUTLINE(@"SEL$1")
      NO_ACCESS(@"SEL$B7380F92" "VW_ORE_B7380F92"@"SEL$B7380F92")
      INDEX_RS_ASC(@"SET$9162BF3C_1" "ADDRESS"@"SET$9162BF3C_1" ("ADDRESS"."STREET"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SET$9162BF3C_1" "ADDRESS"@"SET$9162BF3C_1")
      FULL(@"SET$9162BF3C_2" "ADDRESS"@"SET$9162BF3C_2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(:CHOICE=1)
   6 - access("STREET"=:VAL)
   7 - filter(:CHOICE=2)
   8 - filter(("SUBURB"=:VAL AND (LNNVL(:CHOICE=1) OR LNNVL("STREET"=:VAL))))
As you can see we have a UNION ALL plan with two FILTER operations, and the filter operations allow one or other of the two branches of the UNION ALL to execute depending on the value for :choice. Since I’ve reported the rowsource execution statistics you can also see that the table access through index range scan (operations 5 and 6) has executed once (Starts = 1) but the tablescan (operation 8) has not been executed at all.
If you check the Predicate Information you will see that operation 8 has introduced two lnnvl() predicates. Since the optimizer has lost sight of the fact that :choice is the same variable in both cases it has to assume that sometimes both branches will be relevant for a single execution, so it has to add predicates to the second branch to eliminate data that might have been found in the first branch. This is the (small) penalty we pay for avoiding a “fully-informed” manual rewrite.
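If the lnnvl() predicates look unfamiliar, here is a minimal sketch of what they do, run against the model table above with :val = 6 and :choice = 1 still set. The second statement is my own hand-written approximation of the second branch (operations 7 and 8), not SQL that Oracle generates or exposes. It assumes that lnnvl() accepts a comparison against a bind variable in user-written SQL, just as it appears in the predicate section.

```sql
-- lnnvl(condition) is true when the condition is FALSE or UNKNOWN (null),
-- so this counts every row that the street predicate would NOT match.
-- With the model data and :val = 6 that should be 99,990 (100,000 minus the 10 matches).
select  count(*)
from    address
where   lnnvl(street = :val)
;

-- Hand-written approximation (an assumption, not optimizer output) of the second branch:
-- report a suburb match only if the first branch could not have produced the same row.
select  count(data)
from    address
where   :choice = 2
and     suburb = :val
and     ( lnnvl(:choice = 1) or lnnvl(street = :val) )
;
```

Because lnnvl() treats an unknown comparison as true, the second branch still returns a row whose street is null, since the first branch could never have returned it.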
Take a look at the Outline Data – we can see our or_expand() hint repeated there, and we can discover that it’s been enhanced. The hint should have been or_expand(@sel$1 (1) (2)). This might prompt you to modify the original SQL to use the fully qualified hint rather than the bare-bones form we’ve got so far. So let’s assume we do that before shipping the code to production.
Now imagine that a couple of months later an enhancement request appears to allow queries on post_code and the front-end has been set up so that we can specify a post_code query by selecting choice number 3. The developer who happens to pick up the change request duly modifies the SQL as follows:
select /*+ or_expand(@sel$1 (1) (2)) */
        count(data)
from    address
where   ( :choice = 1 and street = :val )
or      ( :choice = 2 and suburb = :val )
or      ( :choice = 3 and post_code = :val)
;
Note that we’ve got the “complete” hint in place, but there’s now a 3rd predicate. Do you think the hint is still complete ? What do you think will happen when we run the query ? Here’s the execution plan when I set :choice to 3.
select /*+ or_expand(@sel$1 (1) (2)) */ count(data) from address where
( :choice = 1 and street = :val ) or ( :choice = 2 and suburb = :val )
or ( :choice = 3 and post_code = :val)

Plan hash value: 3986461375

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name            | Starts | E-Rows | A-Rows |   A-Time   |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                 |      1 |        |      1 |00:00:00.01 |
|   1 |  SORT AGGREGATE                         |                 |      1 |      1 |      1 |00:00:00.01 |
|   2 |   VIEW                                  | VW_ORE_B7380F92 |      1 |  10010 |      0 |00:00:00.01 |
|   3 |    UNION-ALL                            |                 |      1 |        |      0 |00:00:00.01 |
|*  4 |     FILTER                              |                 |      1 |        |      0 |00:00:00.01 |
|   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| ADDRESS         |      0 |     10 |      0 |00:00:00.01 |
|*  6 |       INDEX RANGE SCAN                  | IX1             |      0 |     10 |      0 |00:00:00.01 |
|*  7 |     FILTER                              |                 |      1 |        |      0 |00:00:00.01 |
|*  8 |      TABLE ACCESS FULL                  | ADDRESS         |      0 |  10000 |      0 |00:00:00.01 |
-----------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('18.1.0')
      DB_VERSION('18.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SET$9162BF3C_2")
      OUTLINE_LEAF(@"SET$9162BF3C_1")
      OUTLINE_LEAF(@"SET$9162BF3C")
      OR_EXPAND(@"SEL$1" (1) (2))
      OUTLINE_LEAF(@"SEL$B7380F92")
      OUTLINE(@"SEL$1")
      NO_ACCESS(@"SEL$B7380F92" "VW_ORE_B7380F92"@"SEL$B7380F92")
      INDEX_RS_ASC(@"SET$9162BF3C_1" "ADDRESS"@"SET$9162BF3C_1" ("ADDRESS"."STREET"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SET$9162BF3C_1" "ADDRESS"@"SET$9162BF3C_1")
      FULL(@"SET$9162BF3C_2" "ADDRESS"@"SET$9162BF3C_2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(:CHOICE=1)
   6 - access("STREET"=:VAL)
   7 - filter(:CHOICE=2)
   8 - filter(("SUBURB"=:VAL AND (LNNVL(:CHOICE=1) OR LNNVL("STREET"=:VAL))))
We get a UNION ALL with two branches, one for :choice = 1, one for :choice = 2 and both of them show zero starts – and we don’t have any part of the plan to handle :choice = 3. The query finds no rows, so the count it reports is zero – and if you check the table creation code you’ll see it should have counted 1,000 rows (post_code is mod(rownum,1e2) over 1e5 rows). An incorrect (historically adequate) hint has given us wrong results.
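A quick cross-check makes the discrepancy easy to see. This is only a sketch, and it assumes the model table above and the same SQL*Plus session. The first statement counts the post_code matches directly; the second repeats the three-way query with the hint removed, so the optimizer is free to pick whatever plan it likes.

```sql
variable val    number = 6
variable choice number = 3

-- Direct count for the post_code predicate on its own.
-- With the model data: 1e5 rows over 100 distinct post_code values = 1,000 rows per value.
select  count(data)
from    address
where   post_code = :val
;

-- The three-way query with no hint, for comparison with the hinted result.
select  count(data)
from    address
where   ( :choice = 1 and street    = :val )
or      ( :choice = 2 and suburb    = :val )
or      ( :choice = 3 and post_code = :val )
;
```

Both statements should report 1,000 against the model data. Running this sort of comparison for every legal value of :choice is a cheap way to catch a hint that no longer matches the query text.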
If we want the full hint for this new query we need to specify the 3rd predicate, by adding (3) to the existing hint to get the following plan (and correct results):
select /*+ or_expand(@sel$1 (1) (2) (3)) */ count(data) from address where
( :choice = 1 and street = :val ) or ( :choice = 2 and suburb = :val )
or ( :choice = 3 and post_code = :val)

Plan hash value: 2153173029

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                 |      1 |        |      1 |00:00:00.01 |    1639 |
|   1 |  SORT AGGREGATE                         |                 |      1 |      1 |      1 |00:00:00.01 |    1639 |
|   2 |   VIEW                                  | VW_ORE_B7380F92 |      1 |  11009 |   1000 |00:00:00.01 |    1639 |
|   3 |    UNION-ALL                            |                 |      1 |        |   1000 |00:00:00.01 |    1639 |
|*  4 |     FILTER                              |                 |      1 |        |      0 |00:00:00.01 |       0 |
|   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| ADDRESS         |      0 |     10 |      0 |00:00:00.01 |       0 |
|*  6 |       INDEX RANGE SCAN                  | IX1             |      0 |     10 |      0 |00:00:00.01 |       0 |
|*  7 |     FILTER                              |                 |      1 |        |      0 |00:00:00.01 |       0 |
|*  8 |      TABLE ACCESS FULL                  | ADDRESS         |      0 |  10000 |      0 |00:00:00.01 |       0 |
|*  9 |     FILTER                              |                 |      1 |        |   1000 |00:00:00.01 |    1639 |
|* 10 |      TABLE ACCESS FULL                  | ADDRESS         |      1 |    999 |   1000 |00:00:00.01 |    1639 |
---------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('18.1.0')
      DB_VERSION('18.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SET$49E1C21B_3")
      OUTLINE_LEAF(@"SET$49E1C21B_2")
      OUTLINE_LEAF(@"SET$49E1C21B_1")
      OUTLINE_LEAF(@"SET$49E1C21B")
      OR_EXPAND(@"SEL$1" (1) (2) (3))
      OUTLINE_LEAF(@"SEL$B7380F92")
      OUTLINE(@"SEL$1")
      NO_ACCESS(@"SEL$B7380F92" "VW_ORE_B7380F92"@"SEL$B7380F92")
      INDEX_RS_ASC(@"SET$49E1C21B_1" "ADDRESS"@"SET$49E1C21B_1" ("ADDRESS"."STREET"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SET$49E1C21B_1" "ADDRESS"@"SET$49E1C21B_1")
      FULL(@"SET$49E1C21B_2" "ADDRESS"@"SET$49E1C21B_2")
      FULL(@"SET$49E1C21B_3" "ADDRESS"@"SET$49E1C21B_3")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(:CHOICE=1)
   6 - access("STREET"=:VAL)
   7 - filter(:CHOICE=2)
   8 - filter(("SUBURB"=:VAL AND (LNNVL(:CHOICE=1) OR LNNVL("STREET"=:VAL))))
   9 - filter(:CHOICE=3)
  10 - filter(("POST_CODE"=:VAL AND (LNNVL(:CHOICE=1) OR LNNVL("STREET"=:VAL)) AND (LNNVL(:CHOICE=2) OR LNNVL("SUBURB"=:VAL))))
We now have three branches to the UNION ALL, and the final branch (:choice = 3) has run, showing A-Rows = 1000 selected in the tablescan.
Conclusion
You shouldn’t mess about with hints unless you’re very confident that you know how they work and then test extremely carefully – especially if you’re modifying old code that already contains some hints.