Athena
2022. 6. 17. 10:56
Overview
- Interactive SQL query service for data in S3
- No need to load data; it stays in S3
- Presto under the hood
- Serverless
- Works with unstructured, semi-structured, or structured data
- Supports many data formats
  - CSV
  - JSON
  - ORC
  - Parquet
  - Avro
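Because the data stays in S3, an Athena table is only metadata that points at an S3 prefix. The following is a minimal sketch of generating that DDL for a few of the formats above. The table name, columns, and bucket path are hypothetical placeholders, and Avro is left out because it also needs a schema definition.

```python
# Sketch: build Athena DDL for an external table over files already in S3.
# Table name, columns, and the S3 path below are hypothetical placeholders.

# Format-specific clauses (Avro omitted: it also needs an Avro schema).
FORMAT_CLAUSES = {
    "csv": "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','",
    "json": "ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'",
    "orc": "STORED AS ORC",
    "parquet": "STORED AS PARQUET",
}


def external_table_ddl(table, columns, data_format, s3_location):
    """Return a CREATE EXTERNAL TABLE statement; no data is copied or loaded."""
    if data_format not in FORMAT_CLAUSES:
        raise ValueError(f"unsupported format: {data_format}")
    if not s3_location.startswith("s3://"):
        raise ValueError("location must be an s3:// URI")
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"{FORMAT_CLAUSES[data_format]}\n"
        f"LOCATION '{s3_location.rstrip('/')}/'"
    )


ddl = external_table_ddl(
    "weblogs",
    [("request_time", "string"), ("status", "int"), ("url", "string")],
    "parquet",
    "s3://example-bucket/weblogs",
)
print(ddl)
```

The statement only registers where the files live and how to read them, which is why switching formats changes one clause and nothing else.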
Examples
- Ad-hoc queries of web logs
- Querying staging data before loading to Redshift
- Analyzing CloudTrail, CloudFront, VPC, ELB, and other logs in S3
- Integration with Jupyter, Zeppelin, and RStudio notebooks
- Integration with QuickSight
- Integration via ODBC/JDBC with other visualization tools
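An ad-hoc query like the web-log example can also be run from code. Below is a rough sketch using the boto3 Athena client calls `start_query_execution`, `get_query_execution`, and `get_query_results`. The `weblogs` table, its `status` column, the database name, and the results bucket are assumptions for illustration only.

```python
import time

# Hypothetical ad-hoc query over a web-log table; names are placeholders.
TOP_STATUS_SQL = """
SELECT status, COUNT(*) AS hits
FROM weblogs
GROUP BY status
ORDER BY hits DESC
LIMIT 10
"""


def run_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Run one query through an Athena client; return rows as lists of strings."""
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"query {query_id} ended in state {state}")

    # Only the first page of results is read here. For SELECT queries
    # the first row holds the column names.
    result = athena.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]


# Usage (needs boto3 and AWS credentials; names are placeholders):
# import boto3
# rows = run_query(boto3.client("athena"), TOP_STATUS_SQL,
#                  "default", "s3://example-bucket/athena-results/")
```

The client is passed in as a parameter so the function can be exercised with a stub, without touching AWS.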
Integration with Glue
- A Glue crawler populates the Glue Data Catalog with table definitions for data in S3
- Athena sees the Glue Data Catalog and exposes its tables automatically
- Athena provides a SQL interface over the underlying Glue structure
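The crawler-to-catalog flow above might be sketched with the boto3 Glue client calls `create_crawler`, `start_crawler`, and `get_tables`. The crawler name, IAM role ARN, database, and S3 path are hypothetical, and the helper names are made up for this note.

```python
def crawl_s3_into_catalog(glue, crawler_name, role_arn, database, s3_path):
    """Create and start a Glue crawler that catalogs one S3 prefix."""
    glue.create_crawler(
        Name=crawler_name,
        Role=role_arn,
        DatabaseName=database,
        Targets={"S3Targets": [{"Path": s3_path}]},
    )
    # The crawl itself runs asynchronously after this call returns.
    glue.start_crawler(Name=crawler_name)


def catalog_table_names(glue, database):
    """List table names in a Glue database (first page of results only)."""
    response = glue.get_tables(DatabaseName=database)
    return [table["Name"] for table in response["TableList"]]


# Usage (needs boto3 and AWS credentials; all names are placeholders):
# import boto3
# glue = boto3.client("glue")
# crawl_s3_into_catalog(glue, "weblogs-crawler",
#                       "arn:aws:iam::123456789012:role/ExampleGlueRole",
#                       "weblogs_db", "s3://example-bucket/weblogs/")
# print(catalog_table_names(glue, "weblogs_db"))
```

Once the crawler has finished, the tables it wrote to the catalog database are the ones Athena can query by name.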
Cross-region concerns
- Athena cannot query across regions on its own, BUT a Glue crawler can
- So you can query S3 data across regions, provided that:
  - the Glue Data Catalog you query is in the same region as Athena
  - the Glue crawler that created the catalog spanned multiple regions