In addition to deploying Varada as a data platform, you can easily deploy Varada as a connector to leverage its performance improvements for your existing Trino clusters.
Varada can integrate into clusters running open-source Trino (formerly known as PrestoSQL), as well as the commercial offerings of Amazon EMR, GCP DataProc, and Starburst Enterprise. In this deployment model, Varada seamlessly applies its dynamic and adaptive indexing-based acceleration technology to your existing clusters.
The following diagram shows how Varada integrates into your existing clusters. Varada adds its own connector alongside your existing connectors, leaving them in place. No query rewrites are required; your workloads continue to query the existing cluster as before.
Deploying Varada as a connector is simple:
- Copy the connector tar.gz file to the /tmp/ directory on each of the cluster's nodes.
- Run the following script on each node:
```shell
# create the varada-install directory and extract the connector archive
mkdir /tmp/varada-install
tar -zxf /tmp/presto-connector.tar.gz -C /tmp/varada-install

# grant access permissions on the trino directory
sudo chmod -R 777 /opt/trino-362
```
- Install the connector by running the following script on each of the cluster's machines:
```
installer.py [-e VARADA_ROOT_FOLDER] [-o ORIGINAL_CATALOG] [-c TARGET_CATALOG_NAME] [-w WORKER_TYPE] [-m METADATA_STORAGE_PATH] [-p PLUGIN_DIR_PATH] [-d CONFIG_DIR_PATH] [-b BACKEND_PORT_NUMBER] [-u CLUSTER-UNIQUE-ID]
```
| Parameter | Description |
| --- | --- |
| `-e VARADA_ROOT_FOLDER` | The path to the connector directory. |
| `-o ORIGINAL_CATALOG` | The catalog properties file from which to copy the properties. |
| `-c TARGET_CATALOG_NAME` | The name of the newly generated properties file for the Varada catalog. |
| `-w WORKER_TYPE` | The worker machine type. Specify this parameter only if the coordinator machine type differs from the worker machine type. |
| `-m METADATA_STORAGE_PATH` | The path to the metadata persistency location on S3. |
| `-p PLUGIN_DIR_PATH` | The path to the plugin directory. |
| `-d CONFIG_DIR_PATH` | The path to the configuration directory. |
| `-b BACKEND_PORT_NUMBER` | The Varada port number. Note: the coordinator node and the worker nodes must be able to communicate via this port. |
| `-u CLUSTER-UNIQUE-ID` | A unique ID to set for the cluster. |
For example:

```shell
sudo python3 /tmp/varada-install/varada-connector-362/varada/installer.py \
  -e /tmp/varada-install/varada-connector-362/ \
  -o hive \
  -c varada \
  -m s3://my-s3-bucket/user-2020-12-14-10-53-45/ \
  -p /opt/trino/trino-server-362/plugin/ \
  -d /opt/trino/trino-server-362/etc/ \
  -b 8088 \
  -u my-cluster
```
- Restart the cluster.
That's it! Once the cluster restarts, Varada's solution is integrated and available through the Varada catalog, while the existing Hive or Iceberg catalog remains untouched. All queries running on tables under the Varada catalog will leverage Varada's acceleration.
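Once the cluster is back up, you can confirm that the new catalog is registered. The sketch below submits `SHOW CATALOGS` through Trino's REST statement API using only the Python standard library; the coordinator URL, user name, and the `varada` catalog name from the example above are assumptions to adjust for your environment.

```python
import json
import urllib.request

COORDINATOR = "http://localhost:8080"  # assumed coordinator URL; adjust as needed


def run_statement(sql: str, user: str = "admin") -> list:
    """Submit a statement to Trino's REST API and collect all result rows."""
    req = urllib.request.Request(
        f"{COORDINATOR}/v1/statement",
        data=sql.encode(),
        headers={"X-Trino-User": user},
        method="POST",
    )
    rows = []
    page = json.load(urllib.request.urlopen(req))
    while True:
        rows.extend(page.get("data", []))
        next_uri = page.get("nextUri")  # results are paged; follow nextUri until absent
        if not next_uri:
            break
        page = json.load(urllib.request.urlopen(next_uri))
    return rows


def has_catalog(rows: list, name: str) -> bool:
    """Check whether a catalog name appears in SHOW CATALOGS output rows."""
    return any(row and row[0] == name for row in rows)

# e.g. has_catalog(run_statement("SHOW CATALOGS"), "varada")
```

If the catalog is missing, re-check the installer output and the restart logs on the coordinator.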
The Varada Connector requires the coordinator node and the worker nodes to communicate via port 8088, which is the default `http-rest-port` defined in the varada.properties file.
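To verify connectivity on this port before (or after) restarting the cluster, a quick TCP probe from each node suffices. A minimal sketch; the host names are placeholders:

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# probe the default Varada port on each node (placeholder host names)
# for host in ["coordinator", "worker-1", "worker-2"]:
#     print(host, port_reachable(host, 8088))
```

A `False` result usually points at a security group or firewall rule blocking the port between nodes.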
The worker node instance type must be from the r5d or i3 families that include SSD disks.
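On AWS, you can check a worker's instance family against this requirement from the node itself via the EC2 instance metadata service. A sketch, with the supported-family list taken from the requirement above; note that instances configured for IMDSv2-only will additionally require a session token:

```python
import urllib.request

# families with local SSD disks, per the requirement above
SUPPORTED_FAMILIES = ("r5d", "i3")


def is_supported_instance(instance_type: str) -> bool:
    """Return True if the instance type (e.g. 'r5d.4xlarge') is in a supported family."""
    family = instance_type.split(".", 1)[0]
    return family in SUPPORTED_FAMILIES


def current_instance_type() -> str:
    """Query the EC2 instance metadata service for this node's instance type.

    Works with IMDSv1; IMDSv2-only instances require fetching a token first.
    """
    with urllib.request.urlopen(
        "http://169.254.169.254/latest/meta-data/instance-type", timeout=2
    ) as resp:
        return resp.read().decode()

# e.g. is_supported_instance(current_instance_type())
```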
Varada supports various Trino and Presto cluster deployment methods.
For specific information related to your deployment method, please contact [email protected].
Presto® is a trademark of the Linux Foundation.