Sorted Hash Clusters have been around for several years, but I’ve not yet seen them being used, or even investigated in detail. This is a bit of a shame, really, because they seem to be engineered to address a couple of interesting performance patterns.
The basic concept is that data items that look alike are stored together (clustered) by applying a hashing function to generate a block address; but on top of that, if you query the data by “hashkey”, the results are returned in sorted order of a pre-defined “sortkey” without any need for sorting. (On top of everything else, the manuals describing what happens and how it works are wrong.)
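As a rough sketch of the pattern the feature seems to be aimed at, here is a hypothetical call-log example. The names (call_detail_cluster, call_detail, phone_number, call_time, duration) and the sizing figures are made up purely for illustration; only the shape of the DDL mirrors the demonstration later in this post.

```sql
-- Hypothetical example: all names and sizing values are illustrative assumptions.
create cluster call_detail_cluster (
	phone_number	number(10,0),		-- hash key: decides which block the rows land in
	call_time	number(12,0) sort	-- sort key: rows for one hash key are kept in this order
)
size 300
hashkeys 1000
;

create table call_detail (
	phone_number	number(10,0),
	call_time	number(12,0),
	duration	number(6,0)
)
cluster call_detail_cluster (
	phone_number, call_time
)
;

-- The intended pay-off: a lookup by hash key that comes back in
-- call_time order without a sort operation. If the order really
-- matters, an explicit "order by call_time" is still the safe way to ask.
select	call_time, duration
from	call_detail
where	phone_number = 6505551212
;
```

The point of the design is that the sort cost is paid once, at insert time, rather than on every query for a given hash key.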
Yesterday I had reason to take a closer look at them, and decided that perhaps the reason no one talks about them is that they simply aren’t safe. Here’s a trivial demonstration, which I’ve run on 10.2.0.5, 11.2.0.3, and 12.1.0.1:
execute dbms_random.seed(0)

create cluster sorted_hash_cluster (
	hash_value	number(6,0),
	sort_value	varchar2(2)	sort
)
size 300
hashkeys 100
;

create table sorted_hash_table (
	hash_value	number(6,0),
	sort_value	varchar2(2),
	v1		varchar2(10),
	padding		varchar2(30)
)
cluster sorted_hash_cluster (
	hash_value, sort_value
)
;

begin
	for i in 1..5000 loop
		insert into sorted_hash_table values(
			trunc(dbms_random.value(0,99)),
			dbms_random.string('U',2),
			lpad(i,10),
			rpad('x',30,'x')
		);
		commit;
	end loop;
end;
/

begin
	dbms_stats.gather_table_stats(
		ownname	=> user,
		tabname	=>'sorted_hash_table'
	);
end;
/

select count(*) from sorted_hash_table where hash_value = 92;
select count(*) from sorted_hash_table where hash_value = 92 and sort_value is null;
select count(*) from sorted_hash_table where hash_value = 92 and sort_value is not null;

select * from sorted_hash_table where hash_value = 92 and sort_value >= 'YR';
select * from sorted_hash_table where hash_value = 92 and sort_value > 'YR';
I think the nature of the last two queries is exactly the type for which the feature has been invented – just check the results, which come from a cut-n-paste after setting echo on:
SQL> select count(*) from sorted_hash_table where hash_value = 92;

  COUNT(*)
----------
        60

1 row selected.

SQL> select count(*) from sorted_hash_table where hash_value = 92 and sort_value is null;

  COUNT(*)
----------
        60

1 row selected.

SQL> select count(*) from sorted_hash_table where hash_value = 92 and sort_value is not null;

  COUNT(*)
----------
        60

1 row selected.

SQL> select * from sorted_hash_table where hash_value = 92 and sort_value >= 'YR';

HASH_VALUE SO V1         PADDING
---------- -- ---------- ------------------------------
        92 YR       4773 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZF        250 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZJ       2046 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZT         65 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

4 rows selected.

SQL>
SQL> select * from sorted_hash_table where hash_value = 92 and sort_value > 'YR';

no rows selected
So: Null is not null (the same 60 rows are counted by both the “is null” and the “is not null” predicate), and ‘ZF’ is not greater than ‘YR’, it’s only greater than or equal to ‘YR’!
I’d be interested to see the test cases that the developer used for this feature that allowed it to ship at all.
