Varada Solution Overview

Varada provides the following key features and capabilities:

Varada Catalog

Varada can work with data stored in an Apache Hive data warehouse, as well as with data stored in the Apache Iceberg table format. Queries that hit tables under the Varada catalog are accelerated automatically.

  • Hive: Varada creates a full copy of the Hive catalog, so that you can access all Hive catalog schemas and tables in Varada under the same full path. For example, you can find the Hive table hive.<schema>.<table> in Varada under varada.<schema>.<table>.

  • Iceberg: Varada creates a full copy of the Iceberg catalog, so that you can access all Iceberg catalog schemas and tables in Varada under the same full path. For example, you can find the Iceberg table iceberg.<schema>.<table> in Varada under varada_iceberg.<schema>.<table>.
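The path mapping described in the two items above can be sketched in a few lines of Python. This is only an illustration: the helper name to_varada_path is hypothetical and not part of any Varada tooling, and the mapping assumes the default catalog names (varada and varada_iceberg) have not been customized.

```python
# Assumed default catalog names, taken from the examples above.
# A deployment that renames its catalogs would need a different mapping.
CATALOG_MAP = {
    "hive": "varada",
    "iceberg": "varada_iceberg",
}


def to_varada_path(table_path: str) -> str:
    """Map <catalog>.<schema>.<table> to the matching Varada path.

    Hypothetical helper for illustration only. It does not handle
    quoted identifiers that contain dots.
    """
    parts = table_path.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected <catalog>.<schema>.<table>, got {table_path!r}")
    catalog, schema, table = parts
    if catalog not in CATALOG_MAP:
        raise ValueError(f"no Varada catalog known for {catalog!r}")
    # Only the catalog segment changes; schema and table keep their names.
    return f"{CATALOG_MAP[catalog]}.{schema}.{table}"
```

For example, to_varada_path("hive.sales.orders") yields varada.sales.orders, where sales and orders are placeholder schema and table names.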

📘

The Iceberg catalog only works with the Hive metastore, and not with the AWS Glue metastore.

👍

The default catalog names are varada for Hive and varada_iceberg for Iceberg. You can customize these by changing the names of the varada.properties and varada_iceberg.properties files.

📘

Learn more about the Hive connector and the Iceberg connector.

Big Data Indexing on the Data Lake

Varada uses the nanoblock indexing mechanism, which is uniquely optimized for high-performance analytics. Instead of storing one large index for each column you select, Varada dynamically creates millions of nanoblocks, each a sub-section of the indexed column a few dozen kilobytes in size.

Each nanoblock stores an index for only a subset of the data, and each nanoblock index is independent and uniquely adapted to that nanoblock. By taking advantage of modern storage systems, such as SSDs, nanoblocks are fast to load and update, which enables Varada to quickly read and execute queries against nanoblock indexes without the overhead inherent in traditional indexes.
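The general idea of many small, independent per-block indexes can be illustrated with a toy Python sketch. This is not Varada's implementation: the function names are hypothetical, blocks are sized by row count rather than kilobytes, and the per-block index (a min/max range plus a value-to-rows map) is an assumption chosen only to keep the example short.

```python
from collections import defaultdict


def build_nanoblocks(column, block_rows):
    """Split a column into fixed-size blocks and index each one independently.

    Toy model only: each block records its own min/max and a map from
    value to the absolute row numbers holding that value.
    """
    if block_rows <= 0:
        raise ValueError("block_rows must be positive")
    blocks = []
    for start in range(0, len(column), block_rows):
        chunk = column[start:start + block_rows]
        index = defaultdict(list)
        for offset, value in enumerate(chunk):
            index[value].append(start + offset)
        blocks.append({"min": min(chunk), "max": max(chunk), "index": dict(index)})
    return blocks


def lookup(blocks, value):
    """Return rows where column == value, skipping blocks that cannot match."""
    rows = []
    for block in blocks:
        # The min/max range lets us skip a block without reading its index.
        if block["min"] <= value <= block["max"]:
            rows.extend(block["index"].get(value, []))
    return rows
```

Because every block carries its own small index, a block can be built, loaded, or replaced without touching the others, which is the property the description above relies on.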

Varada as a Connector

In addition to deploying Varada as a data platform, you can easily deploy Varada as a connector to leverage its performance improvements for your existing Trino clusters.

With this model, Varada deploys its own connector in addition to your existing connectors, and leaves your existing connectors in place. No query rewrites are required, as your workloads continue to query the existing cluster as before.

👍

Varada offers a community edition of the Trino connector, which is free for up to 4 nodes. Click here to get the connector.

Acceleration Instructions

An acceleration instruction defines the materialization type that Varada performs on a column of a table in the Varada Catalog to warm up the data in that column. The actual warmup takes place when a query hits the column.

You can manually define instructions from your Varada Control Center or using REST API commands.

Supported Data Types

Varada supports all of Presto's built-in data types, including structural data types. While all structural data types (ARRAY, MAP, and ROW) are accessible with Varada, only fields inside ROW data types can be indexed (starting from version 360.11).

Text Analytics

Varada provides you with high-performance, full-text-search capabilities based on Lucene. You can create a Lucene index for a string column, and then run queries to quickly and precisely search through large quantities of data.

📘

Text analytics is not supported by the Varada Trino Connector.