The Oracle Developer Community forum often sees SQL that is hard to read – sometimes because it’s a brutal tangle of subqueries, sometimes because the format it bad, sometimes because the use of table and column aliases is poorly done. One particular case of the last weakness is the code where the same table alias (typically the letter A) is used a dozen times in the course of the query.
I’ve said that every table in a query should have a different alias and the alias should be used at every column usage in the query (the note at this URL includes a couple of refinements). I’ve just discovered another reason why this is a good idea and why you shouldn’t use the same alias twice in a query. Here’s a simplified demonstration of the threat – tested on 19.3.0.0:
rem Script: using_bug.sql rem Author: Jonathan Lewis rem Dated: Jan 2020 rem Purpose: rem rem Last tested rem 19.3.0.0 rem create table t2 as select rownum id, object_id, object_name, object_type, rpad('x',100) padding from all_objects where rownum <= 5000 and mod(object_id,2) = 1 / create table t1 as select rownum id, object_id, object_name, object_type, rpad('x',100) padding from all_objects where rownum <= 5000 and mod(object_id,2) = 0 /
I’ve created two tables from the view all_objects, one of the tables holds rows where the object_id is even, the other where it is odd, so if I join these two tables on object_id the result set will be empty. So here are three queries that join the two tables – with the little twist that I’ve (accidentally) given both tables the same alias X in all three cases:
prompt ======================================= prompt Here's a query that might "confuse" the prompt optimizer when we try to explain it prompt ======================================= explain plan for select max(object_name) from t1 X join t2 X using (object_id); prompt ================================== prompt So might this one, but it doesn't. prompt ================================== explain plan for select max(object_id) from t1 X join t2 X using (object_id); select * from table(dbms_xplan.display); prompt =================================================== prompt With this one A-rows matches E-rows: and it's NOT 0 prompt =================================================== alter session set statistics_level = all; set serveroutput off set linesize 156 select count(*) from t1 X join t2 X using (object_id); select * from table(dbms_xplan.display_cursor(null,null,'cost allstats last'));
In the absence of the explicit aliases the first query should produce an execution plan; but when both tables are given the same alias the attempt to explain (or run) the query produced the error “ORA-00918: column ambiguously defined”.
The second query does better – or worse, depending on your point of view. Nominally the join is perfectly valid and the optimizer produces an execution plan for the query. But the plan predicts a Cartesian merge join with a result set of 25M rows – which doesn’t look like a good estimate – and the plan doesn’t have a Predicate Information section.
So we use a count(*) for the third query – just in case the result set is, somehow, 25M rows – and enable rowsource execution statistics, and acquire the plan from memory after running the query (which takes nearly 14 seconds of hammering the CPU to death). And here’s the output:
COUNT(*) ---------- 25000000 1 row selected. PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 85ygrcg4n3ymz, child number 0 ------------------------------------- select count(*) from t1 X join t2 X using (object_id) Plan hash value: 4259280259 ----------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 67358 (100)| 1 |00:00:13.38 | 200 | 198 | | | | | 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:13.38 | 200 | 198 | | | | | 2 | MERGE JOIN CARTESIAN| | 1 | 25M| 67358 (5)| 25M|00:00:10.55 | 200 | 198 | | | | | 3 | TABLE ACCESS FULL | T2 | 1 | 5000 | 15 (7)| 5000 |00:00:00.01 | 100 | 99 | | | | | 4 | BUFFER SORT | | 5000 | 5000 | 67343 (5)| 25M|00:00:04.54 | 100 | 99 | 133K| 133K| 118K (0)| | 5 | TABLE ACCESS FULL | T1 | 1 | 5000 | 13 (0)| 5000 |00:00:00.01 | 100 | 99 | | | | -----------------------------------------------------------------------------------------------------------------------------------------
Yes, instead of zero rows Oracle managed to produce 25M rows. The execution plan is a Cartesian merge join, and that’s because the optimizer has lost the join predicate (I didn’t supply a Predicate Information section because there wasn’t one – note the absence of any asterisks against any operations in the plan).
Interestingly the Query Block / Alias section of the plan (when I called for it) reported the two aliases as X_0001 and X_0002, so internally Oracle did manage to find two different aliases – but too late, presumably.
Conclusion
Give a little thought to using table aliases sensibly and safely. It’s trivial to fix this example, but some of the messy SQL that goes into production might end up running into the same issue without it being so easy to spot the source of the anomaly.
Footnote
This is Bug 25342699 : WRONG RESULTS WITH ANSI JOIN USING AND IDENTICAL TABLE ALIASES reported Jan 2017 against 12.1.0.2, not yet fixed.