Hope springs eternal in the database business. While we’re still hearing about data warehouses (fast analysis databases, typically featuring in-memory columnar storage) and tools that improve the ETL step (extract, transform, and load), we’re also hearing about improvements in data lakes (which store data in its native format) and data federation (on-demand data integration of heterogeneous data stores).
keeps coming up as a fast way to perform on big data that resides in data lake files. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes. Presto allows querying data where it lives, including Hive, Cassandra, relational databases, and proprietary data stores. A single Presto query can combine data from multiple sources. Facebook uses Presto for interactive queries against several internal data stores, including their 300PB data warehouse.
The Presto Foundation is the organization that oversees the development of the Presto open source project. Facebook, Uber, Twitter, and Alibaba founded the Presto Foundation. Additional members now include Alluxio, Ahana, Upsolver, and Intel.
, the subject of this review, is a managed service that simplifies Presto for the cloud. As we’ll see, Ahana Cloud for Presto runs on Amazon, has a fairly simple user interface, and has end-to-end cluster lifecycle management. It runs in Kubernetes and is highly scalable. It has a built-in catalog and easy integration with data sources, catalogs, and dashboarding tools.
Competitors to Ahana Cloud for Presto include , , and . I will draw comparisons at the end of the article.