openLooKeng v1.2.0 Is Officially Released
March 31, 2021
Since it was open-sourced, openLooKeng has attracted a growing number of partners. Community members have recognized its performance and contributed many valuable suggestions. openLooKeng V1.2.0, released this March, is an update to the earlier version. Drawing on partners’ experience and suggestions, it applies several new technologies to improve engine performance and provide a better user experience.
For the engine kernel, converged analysis and performance are enhanced:
The query fault tolerance is enhanced to improve the engine execution reliability.
During a batch query, such as an INSERT or CREATE TABLE AS SELECT operation on a Hive data source, if a worker node fails, its tasks can be recovered on other nodes. Compared with the earlier version, long-running batch query tasks are considerably more stable and reliable.
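The recovery idea can be illustrated with a minimal sketch. It is not openLooKeng code: the names `WorkerFailure`, `run_with_failover`, and the two worker functions are hypothetical, and a real engine would also have to restore task state rather than simply rerun the task.

```python
class WorkerFailure(Exception):
    """Raised when a (simulated) worker node cannot finish a task."""


def run_with_failover(task, workers):
    """Try the task on each worker in turn; return (worker_name, result)."""
    last_error = None
    for name, execute in workers:
        try:
            return name, execute(task)
        except WorkerFailure as err:
            last_error = err  # remember the failure and move to the next worker
    raise RuntimeError("task failed on every worker") from last_error


def faulty_worker(task):
    # Hypothetical worker that goes down before finishing.
    raise WorkerFailure("node went down mid-task")


def healthy_worker(task):
    # Stand-in for writing a batch of rows; here it just sums them.
    return sum(task)


workers = [("worker-1", faulty_worker), ("worker-2", healthy_worker)]
print(run_with_failover([1, 2, 3], workers))
```

In this toy run the task fails on the first worker and completes on the second, so the caller never sees the failure.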
Query performance optimization: The StarTree-based query pre-aggregation capability is enhanced.
StarTree is designed to optimize low-latency aggregation queries. Its query pre-aggregation capability is used to build cubes over different dimensions and aggregation operations. In subsequent queries, if an aggregation subquery matches a cube, the engine reads data directly from the cube instead of querying the original table, which improves query performance.
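As a rough illustration of pre-aggregation, the sketch below builds one cube ahead of time and answers a matching aggregation from it, falling back to the original rows otherwise. The data and the names `build_cube`, `cubes`, and `sum_by` are assumptions made for this example and do not reflect how StarTree stores or matches cubes.

```python
from collections import defaultdict

rows = [
    {"region": "east", "product": "a", "amount": 10},
    {"region": "east", "product": "b", "amount": 5},
    {"region": "west", "product": "a", "amount": 7},
    {"region": "east", "product": "a", "amount": 3},
]


def build_cube(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) grouped by the given dimensions."""
    cube = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


# One cube built in advance, keyed by (dimensions, measure).
cubes = {(("region",), "amount"): build_cube(rows, ("region",), "amount")}


def sum_by(rows, cubes, dimensions, measure):
    """Answer SUM(measure) GROUP BY dimensions, from a cube if one matches."""
    cube = cubes.get((tuple(dimensions), measure))
    if cube is not None:
        return cube, "cube"  # matched: the original rows are not scanned
    return build_cube(rows, dimensions, measure), "table"


print(sum_by(rows, cubes, ("region",), "amount"))
print(sum_by(rows, cubes, ("product",), "amount"))
```

The first query matches the prebuilt cube; the second has no matching cube and is computed from the original rows.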
The Common Table Expression (CTE) optimization technology is introduced to reduce memory usage.
CTE optimization is applied during execution plan optimization. In a complex query, if a subquery (for example, one defined in a WITH clause) is used multiple times, the optimizer automatically generates a CTE node for it. The output of the CTE node is consumed by multiple parent nodes in the execution plan, so the repeated subquery is executed only once. This simplifies query execution, reduces memory usage, and improves engine performance.
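The "execute once, consume many times" behavior can be sketched as follows. `SharedSubquery` is a hypothetical stand-in for a CTE node, and the sample data is invented; the sketch only shows the single-execution property, not how the engine streams or stores the shared output.

```python
class SharedSubquery:
    """Hypothetical CTE node: computes its rows once and serves every consumer."""

    def __init__(self, compute):
        self._compute = compute
        self._result = None
        self.executions = 0  # how many times the subquery actually ran

    def rows(self):
        if self._result is None:
            self.executions += 1
            self._result = self._compute()
        return self._result


orders = [("alice", 30), ("bob", 12), ("alice", 8), ("carol", 50)]


def totals_per_customer():
    """The repeated subquery: total order amount per customer."""
    totals = {}
    for customer, amount in orders:
        totals[customer] = totals.get(customer, 0) + amount
    return totals


cte = SharedSubquery(totals_per_customer)

# Two parent nodes consume the same subquery output.
highest_total = max(cte.rows().values())
customers_over_20 = sum(1 for total in cte.rows().values() if total > 20)

print(highest_total, customers_over_20, cte.executions)
```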
The general operator pushdown framework allows Connectors to participate in execution plan optimization.
To make operator pushdown simpler and more flexible, a new general operator pushdown framework lets each Connector participate in execution plan optimization. When optimizing the execution plan, the engine automatically applies the Connector's optimization rules to push operators down.
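One way to picture connector-supplied rules is the sketch below, in which the optimizer looks up the rules registered for the scanned connector and applies them to the plan. Everything here is hypothetical (`TableScan`, `Filter`, `CONNECTOR_RULES`, the connector names); it does not mirror the actual openLooKeng interfaces.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class TableScan:
    connector: str
    table: str
    pushed_filter: Optional[str] = None  # predicate evaluated by the data source


@dataclass(frozen=True)
class Filter:
    predicate: str
    source: TableScan


def jdbc_filter_pushdown(node):
    """Hypothetical connector rule: fold a Filter into the underlying scan."""
    if isinstance(node, Filter) and node.source.pushed_filter is None:
        return replace(node.source, pushed_filter=node.predicate)
    return node


# Each connector registers its own rules; "files" registers none.
CONNECTOR_RULES = {"jdbc": [jdbc_filter_pushdown], "files": []}


def optimize(node):
    """Apply the rules of the connector that owns the scanned table."""
    scan = node.source if isinstance(node, Filter) else node
    for rule in CONNECTOR_RULES.get(scan.connector, []):
        node = rule(node)
    return node


print(optimize(Filter("price > 10", TableScan("jdbc", "orders"))))
print(optimize(Filter("price > 10", TableScan("files", "orders"))))
```

The plan over the "jdbc" connector collapses into a scan that carries the predicate, while the plan over "files" is left unchanged because that connector contributes no rules.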
The data maintenance performance of the Hive ORC is enhanced.
This feature accelerates the processing of data write and modification (update and deletion) without compromising the read performance. Concurrent access to the Hive Metastore is supported, improving metadata operation performance and further optimizing data write and modification performance.
For the engine portal, the optimization focuses on the southbound ecosystem.
The performance of the HBase Connector is optimized.
To improve the performance of single-table queries, a sharding algorithm is added. In addition, this feature allows HBase to access snapshots, improving the performance of concurrent queries.
Pushdown of User-Defined Functions (UDFs) is supported.
The engine supports registering external UDFs and pushing them down to JDBC data sources. You can keep using the UDFs that already exist in your data source without migrating them, which increases UDF reuse.
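A simplified picture of UDF pushdown: if a function is registered as external, the call is kept in the SQL sent to the data source instead of being evaluated by the engine. The registry `EXTERNAL_UDFS`, the function `mask_phone`, and `build_remote_sql` are assumptions for illustration only, not the engine's registration mechanism.

```python
# Hypothetical registry of UDFs that already exist in the remote JDBC source.
EXTERNAL_UDFS = {"mask_phone"}


def build_remote_sql(table, column, function):
    """Return (sql, pushed_down) for SELECT function(column) FROM table.

    If the function is a registered external UDF, the call stays in the
    remote SQL. Otherwise only the raw column is fetched, and the engine
    would have to apply the function itself.
    """
    if function in EXTERNAL_UDFS:
        return f"SELECT {function}({column}) FROM {table}", True
    return f"SELECT {column} FROM {table}", False


print(build_remote_sql("customers", "phone", "mask_phone"))
print(build_remote_sql("customers", "phone", "local_only_fn"))
```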
These are the performance optimization highlights of openLooKeng V1.2.0. As a key project in the big data field, openLooKeng also attaches great importance to the usability and security of the engine. The following improvements are made in V1.2.0:
Usability | User experience of the Admin Dashboard UI is improved.
UI freezing caused by scheduled full loading is optimized. The function of displaying query history and query results on multiple pages is enhanced. The parameter settings of the new Connector are optimized. Kerberos and password login are supported on the UI, and historical query results can be filtered.
Security | Ranger-based fine-grained permission control supports row filtering and column masking.
Row filtering, column masking, and simulated permission control are added for authenticated users to provide finer-grained permission control.
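The effect of row filtering and column masking can be sketched as below. This is an illustration with invented data and a hypothetical `apply_policy` helper, not the Ranger policy model: the row filter decides which rows a user may see, and the masks rewrite sensitive column values in the rows that remain.

```python
def apply_policy(rows, row_filter, masks):
    """Keep rows accepted by row_filter and mask the configured columns."""
    visible = []
    for row in rows:
        if not row_filter(row):
            continue  # row is hidden from this user
        visible.append(
            {col: masks[col](val) if col in masks else val for col, val in row.items()}
        )
    return visible


employees = [
    {"name": "Ann", "dept": "sales", "ssn": "123-45-6789"},
    {"name": "Bo", "dept": "hr", "ssn": "987-65-4321"},
]

# Hypothetical policy: the user sees only the sales department,
# and only the last four digits of each ssn value.
result = apply_policy(
    employees,
    row_filter=lambda row: row["dept"] == "sales",
    masks={"ssn": lambda value: "***-**-" + value[-4:]},
)
print(result)
```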
Want to try the new version? You can download it at: https://openlookeng.io/download.html.
If you have any suggestions on V1.2.0, send an email to email@example.com. The openLooKeng community appreciates your support and welcomes the participation of more partners.
Join the openLooKeng community and help make big data easier.
openLooKeng official website: https://openlookeng.io
openLooKeng code repository: https://gitee.com/openlookeng