Unfortunately, half of the values in one column of a huge table are NULL, so the query
select count(*) from huge_table where half_null_col is null;
is a performance disaster, even though the column is already indexed:
create index half_null_col_idx on huge_table(half_null_col asc);
There are two questions:
Oracle 11g should support indexing NULL values by adding a constant expression to the index, but I went through the Oracle docs and couldn't find explicit official documentation about it. Please share a reference if anyone knows of one.
How can I alter the existing index instead of dropping and creating it again, to avoid performance issues?
You have at least four options that come to mind:
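Create the "constant expression" index. A B-tree index does not store rows whose indexed columns are all NULL; appending a constant gives every row a non-NULL key part, so the NULL rows get indexed too...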
create index half_null_col_idx
on huge_table (half_null_col, 1);
Create a bitmap index on the table. Bitmap indexes allow for indexing NULLs as well...
create bitmap index half_null_col_idx
on huge_table (half_null_col);
Create a function-based index on NULL-remapped-to-something values and use that remapped value in your queries instead of querying for NULLs...
create index half_null_col_idx
on huge_table (nvl(half_null_col, '<a value that does not appear in the values of the column>'));
select *
from huge_table
where nvl(half_null_col, '<a value that does not appear in the values of the column>')
= '<a value that does not appear in the values of the column>'
;
Repartition the table so that NULL values go all into one partition and the rest of the values into different partition/partitions...
create table huge_table_2
partition by list (half_null_col)
(
partition pt_nulls values (null),
partition pt_others values (default)
)
as
select *
from huge_table;
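With the repartitioned table, the NULL rows can be counted by reading only their partition. A minimal sketch, assuming the huge_table_2 table and the pt_nulls partition were created exactly as above:

```sql
-- Count the NULL rows by naming the partition explicitly.
select count(*)
from huge_table_2 partition (pt_nulls);

-- The same count with an ordinary predicate; the optimizer is expected
-- to prune the scan down to pt_nulls, but confirm that in the plan.
select count(*)
from huge_table_2
where half_null_col is null;
```

The second form keeps the query independent of partition names, which may matter if the partitioning scheme changes later.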
If you select only count(*) from the table, then the bitmap index might be your best option.
If you want to use the full data rows from the table elsewhere (in a join to another table or to export them to CSV or whatever), then the repartitioning might be your best option.
If you can't/don't want to repartition the table and you can't create the bitmap index (e.g., due to heavy concurrent DML activity on the table), then the "constant expression" index or the "NULL-remapped-to-something" index might be your best option.
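Whichever option you pick, it is worth checking that the optimizer actually uses it. A minimal sketch using EXPLAIN PLAN and DBMS_XPLAN against the original query; the plan you get depends on your statistics and Oracle version, so no particular output is assumed here:

```sql
-- Ask the optimizer for the plan of the NULL-counting query.
explain plan for
select count(*)
from huge_table
where half_null_col is null;

-- Display the plan that was just generated.
select *
from table(dbms_xplan.display);
```

For the "NULL-remapped-to-something" index, explain the nvl(...) form of the query instead, since the predicate has to match the indexed expression for that index to be considered.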
To answer your original questions: