Usage

Index can be managed using any of the supported clients, such as hetu-cli located under the bin directory in the installation.

CREATE

To create an index you can run sql statements of the form:

CREATE INDEX [ IF NOT EXISTS ] index_name
USING [ BITMAP | BLOOM | BTREE | MINMAX ]
ON tbl_name (col_name)
WITH ( 'level' = ['STRIPE', 'PARTITION'], "bloom.fpp" = '0.001', [, …] )
WHERE predicate;
  • WHERE predicate can be used to create index on select partition(s)
  • WITH can be used to specify index properties or index level. See individual index documentation to support properties.
  • "level"='STRIPE' if not specified

If the table is partitioned, you can specify a single partition to create an index on, or an in-predicate to specify multiple partitions:

CREATE INDEX index_name USING bloom ON hive.schema.table (column1);
CREATE INDEX index_name USING bloom ON hive.schema.table (column1) WITH ("bloom.fpp"="0.01") WHERE p=part1;
CREATE INDEX index_name USING bloom ON hive.schema.table (column1) WHERE p in (part1, part2, part3);

Note: If the table is multi-partitioned (for example, partitioned by colA and colB), only index creation on the first level is supported (colA).

SHOW

To show all indexes or a specific index_name:

SHOW INDEX;
SHOW INDEX index_name;

DROP

To delete an index by name:

DROP INDEX index_name
WHERE predicate;
  • WHERE predicate be used to delete index for specific partition(s). However, if index was initially created on the entire table, it is not possible to delete index for a single partition.
DROP INDEX index_name where p=part1;

Note: dropped index will be removed from cache after a few seconds, so you may still see the next few queries still using the index.

Notes on resource usage

Disk usage

Heuristic index uses the local temporary directory (default /tmp on linux) while creating and processing indexes while running. Therefore, the temporary directory should have sufficient space. To change the temporary directory set the following property in etc/jvm.config:

-Djava.io.tmpdir=/path/to/another/dir