Index can be managed using any of the supported clients, such as hetu-cli located under the
bin directory in the installation.
To create an index you can run sql statements of the form:
CREATE INDEX [ IF NOT EXISTS ] index_name USING [ BITMAP | BLOOM | BTREE | MINMAX ] ON tbl_name (col_name) WITH ( 'level' = ['STRIPE', 'PARTITION'], "bloom.fpp" = '0.001', [, …] ) WHERE predicate;
WHEREpredicate can be used to create index on select partition(s)
WITHcan be used to specify index properties or index level. See individual index documentation to support properties.
"level"='STRIPE'if not specified
If the table is partitioned, you can specify a single partition to create an index on, or an in-predicate to specify multiple partitions:
CREATE INDEX index_name USING bloom ON hive.schema.table (column1); CREATE INDEX index_name USING bloom ON hive.schema.table (column1) WITH ("bloom.fpp"="0.01") WHERE p=part1; CREATE INDEX index_name USING bloom ON hive.schema.table (column1) WHERE p in (part1, part2, part3);
Note: If the table is multi-partitioned (for example, partitioned by colA and colB), only index creation on the first level is supported (colA).
To show all indexes or a specific index_name:
SHOW INDEX; SHOW INDEX index_name;
To delete an index by name:
DROP INDEX index_name WHERE predicate;
WHEREpredicate be used to delete index for specific partition(s). However, if index was initially created on the entire table, it is not possible to delete index for a single partition.
DROP INDEX index_name where p=part1;
Note: dropped index will be removed from cache after a few seconds, so you may still see the next few queries still using the index.
Notes on resource usage
Heuristic index uses the local temporary directory (default
/tmp on linux) while creating and processing indexes while running.
Therefore, the temporary directory should have sufficient space. To change the temporary directory set the following property in