Amazon QuickSight Gets Presto, Apache Spark Connectors for Big Data Visualization

Amazon Web Services Inc. (AWS) has expanded its Big Data visualization capabilities by adding two connectors, Presto and Apache Spark, to its Amazon QuickSight service.
Amazon QuickSight is a business analytics service that provides visualization, ad-hoc analysis and other insight functionality. AWS claims it offers Business Intelligence (BI) capabilities at a fraction of the cost of traditional solutions.
It allows developers to connect to many types of data sources, including CSV and Excel files, popular databases such as SQL Server, MySQL and PostgreSQL, and a host of AWS services such as Amazon Redshift, Amazon RDS, Amazon Aurora, Amazon Athena and Amazon S3.
Yesterday’s announcement of Presto and Spark connectors means that those AWS sources now also include Amazon EMR, a managed Hadoop framework to support Big Data analytics across Amazon EC2 compute instances.
Apache Spark is a popular open-source tool that adds new capabilities to the original Hadoop-based ecosystem. Presto, a lesser-known distributed SQL query engine, is used by data developers to run interactive analytic queries against data sources ranging in size from gigabytes to petabytes.
Presto supports standard ANSI SQL operations, including complex queries, joins and window functions.
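To illustrate what a window function does, here is a minimal sketch that computes a per-region running total. It is not Presto itself: it uses Python's built-in sqlite3 module as a stand-in (assuming the bundled SQLite is version 3.25 or later, which added window functions), and the table, column and function names are hypothetical. The SUM(...) OVER (PARTITION BY ... ORDER BY ...) syntax is standard SQL of the kind the article says Presto supports.

```python
import sqlite3

# Hypothetical sample data: (region, day, amount).
SAMPLE_SALES = [
    ("east", 1, 10),
    ("east", 2, 20),
    ("west", 1, 5),
    ("west", 2, 15),
]


def running_totals(rows):
    """Return each sale with a running total per region.

    SQLite is used only as a stand-in engine for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    query = """
        SELECT region, day, amount,
               SUM(amount) OVER (PARTITION BY region ORDER BY day) AS running_total
        FROM sales
        ORDER BY region, day
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_totals(SAMPLE_SALES):
        print(row)
```

Unlike a GROUP BY aggregate, the window function keeps every input row and attaches the aggregate computed over that row's partition.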
AWS stated yesterday that Presto’s execution framework is fundamentally different from that of Hive/MapReduce. “Presto uses a custom query and execution engine. The stages of execution are pipelined together, similar to a directed acyclic graph (DAG). All processing takes place in memory to reduce disk I/O. This pipelined execution model allows multiple stages to run simultaneously and streams data from one stage of the pipeline to the next as data becomes available. Presto is a great tool for ad-hoc data exploration of large data sets. Presto can be run on multiple data sources, such as Amazon S3.”
The AWS post explains how to create an EMR cluster, set up Presto and Lightweight Directory Access Protocol (LDAP) with Secure Sockets Layer (SSL), and then use QuickSight for visualization.
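The pipelined model described above can be sketched with Python generators. This is a toy illustration under stated assumptions, not Presto's actual implementation: the stage names (scan, filter_stage, project) and the sample rows are hypothetical, and the stages interleave on a single thread rather than running in parallel across a cluster. What it does show is the streaming idea: each row flows through every stage in memory as soon as it is available, instead of one stage finishing and writing its full output before the next begins.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical sample rows standing in for a data source.
SAMPLE = [
    {"id": 1, "amount": 25},
    {"id": 2, "amount": 5},
    {"id": 3, "amount": 40},
]


def scan(rows: Iterable[dict], log: List[tuple]) -> Iterator[dict]:
    """Stage 1: emit rows one at a time, as if reading from a source."""
    for row in rows:
        log.append(("scan", row["id"]))
        yield row


def filter_stage(rows: Iterable[dict], log: List[tuple], min_amount: int) -> Iterator[dict]:
    """Stage 2: pass along only rows at or above the threshold."""
    for row in rows:
        if row["amount"] >= min_amount:
            log.append(("filter", row["id"]))
            yield row


def project(rows: Iterable[dict], log: List[tuple]) -> Iterator[Tuple[int, int]]:
    """Stage 3: keep only the columns the query asked for."""
    for row in rows:
        log.append(("project", row["id"]))
        yield (row["id"], row["amount"])


def run_pipeline(data: Iterable[dict], min_amount: int = 10):
    """Chain the stages; nothing is materialized between them."""
    log: List[tuple] = []
    stages = project(filter_stage(scan(data, log), log, min_amount), log)
    results = list(stages)  # pulling results drives all stages lazily
    return results, log


if __name__ == "__main__":
    results, log = run_pipeline(SAMPLE)
    print(results)
    print(log)
```

The log records the order in which stages touch each row: the first row reaches the final stage before the second row is even scanned, which is the behavior a pipelined engine relies on to avoid writing intermediate results to disk.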