Lake Formation
    Cloud/AWS 2022. 4. 26. 11:43

    Introduction

    • Can tie to IAM users/roles, SAML, or external AWS accounts
    • Can use policy tags (LF-tags) on databases, tables, or columns
    • Can select specific permissions for tables or columns
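
    As a rough illustration of column-level permissions, the sketch below builds the request body for a SELECT grant limited to a few columns. The dictionary shape follows the boto3 Lake Formation `grant_permissions` call; the role ARN, database, table, and column names are hypothetical placeholders, and the helper `column_select_grant` is not part of any AWS library.

```python
import json


def column_select_grant(principal_arn, database, table, columns):
    """Build keyword arguments for a column-level SELECT grant.

    The shape mirrors the boto3 Lake Formation `grant_permissions` request.
    """
    if not columns:
        raise ValueError("at least one column name is required")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical values, for illustration only.
grant = column_select_grant(
    "arn:aws:iam::111122223333:role/DataAnalyst",
    "sales_db",
    "orders",
    ["order_id", "order_date"],
)
print(json.dumps(grant, indent=2))

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("lakeformation").grant_permissions(**grant)
```

    Building the request as plain data keeps the example runnable without AWS credentials; the actual call is left as a comment.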

    Overview

    • “Makes it easy to set up a secure data lake in days”
    • Loading data & monitoring data flows
    • Setting up partitions
    • Encryption & managing keys
    • Defining transformation jobs & monitoring them
    • Access control
    • Auditing
    • Built on top of Glue

    Pricing

    No cost for Lake Formation itself, but the underlying services incur charges:

    • Glue
    • S3
    • EMR
    • Athena
    • Redshift

    Building a Data Lake

    1. Create an IAM user for Data Analyst
    2. Create AWS Glue connection to your data sources
    3. Create S3 bucket for the lake
    4. Register the S3 path in Lake Formation, grant permissions
    5. Create database in Lake Formation for data catalog, grant permissions
    6. Use a blueprint for a workflow (e.g., Database snapshot)
    7. Run the workflow
    8. Grant SELECT permissions to whoever needs to read it (Athena, Redshift Spectrum, etc)
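
    Steps 4, 5, 7, and 8 above can be sketched as an ordered list of API calls. This is only an outline under assumptions: the operation names and parameters follow the boto3 `lakeformation` and `glue` clients, while the bucket, database, workflow, table, and role names are made up, and `data_lake_plan`, `run_plan`, and `RecordingClient` are hypothetical helpers. The workflow itself (step 6) is assumed to have been created from a blueprint already, and the grants on the data location and database are left out for brevity.

```python
def data_lake_plan(bucket, database, workflow, reader_arn, table):
    """Return steps 4, 5, 7 and 8 as (client name, operation, kwargs) tuples."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return [
        # Step 4: register the S3 path with Lake Formation.
        ("lakeformation", "register_resource",
         {"ResourceArn": bucket_arn, "UseServiceLinkedRole": True}),
        # Step 5: create the catalog database.
        ("glue", "create_database", {"DatabaseInput": {"Name": database}}),
        # Step 7: run the workflow that the blueprint created.
        ("glue", "start_workflow_run", {"Name": workflow}),
        # Step 8: let the reader query the resulting table.
        ("lakeformation", "grant_permissions", {
            "Principal": {"DataLakePrincipalIdentifier": reader_arn},
            "Resource": {"Table": {"DatabaseName": database, "Name": table}},
            "Permissions": ["SELECT"],
        }),
    ]


def run_plan(plan, client_factory):
    """Execute each call in order; client_factory(name) returns a client."""
    for client_name, operation, kwargs in plan:
        getattr(client_factory(client_name), operation)(**kwargs)


class RecordingClient:
    """Stand-in client that records calls instead of contacting AWS."""

    def __init__(self, name, log):
        self._name = name
        self._log = log

    def __getattr__(self, operation):
        def call(**kwargs):
            self._log.append(f"{self._name}.{operation}")
        return call


# Dry run with hypothetical names; nothing is sent to AWS.
plan = data_lake_plan(
    "my-lake-bucket", "lake_db", "db-snapshot-workflow",
    "arn:aws:iam::111122223333:user/data-analyst", "orders",
)
log = []
run_plan(plan, lambda name: RecordingClient(name, log))
print(log)
```

    Against a real account, `run_plan(plan, boto3.client)` would issue the same calls in the same order, assuming boto3 is installed and the caller has the required IAM permissions.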

    Finer Points

    Lake Formation does not support manifests in Athena or Redshift queries

    IAM Permissions on the KMS encryption key are needed for encrypted data catalogs in Lake Formation
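
    A minimal sketch of what such an IAM policy might look like, built as a Python dictionary. The list of KMS actions here is an assumption (the exact set depends on how the catalog is used, so check it against the AWS documentation), the key ARN is a placeholder, and `kms_key_policy` is a hypothetical helper.

```python
import json

# Assumed set of actions for working with an encrypted data catalog.
DEFAULT_KMS_ACTIONS = ("kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey")


def kms_key_policy(key_arn, actions=DEFAULT_KMS_ACTIONS):
    """Build an IAM policy document allowing the given actions on one KMS key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": key_arn,
            }
        ],
    }


# Placeholder key ARN, for illustration only.
policy = kms_key_policy(
    "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
)
print(json.dumps(policy, indent=2))
```

    The resulting JSON would be attached to the IAM user or role that reads the encrypted catalog.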

    IAM permissions are needed to create blueprints and workflows

    Cross-account Lake Formation permissions

    • Recipient must be set up as a data lake administrator
    • Can use AWS Resource Access Manager for accounts external to your organization
    • IAM permissions for cross-account access
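
    For a cross-account grant, the principal can be the recipient's 12-digit AWS account ID rather than a user or role ARN. The sketch below builds such a request in the shape used by the boto3 `grant_permissions` call; the account ID, database, and table are placeholders, and `cross_account_grant` is a hypothetical helper.

```python
def cross_account_grant(account_id, database, table, allow_regrant=False):
    """Build a SELECT grant on a table for another AWS account."""
    if not (len(account_id) == 12 and account_id.isdigit()):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    request = {
        "Principal": {"DataLakePrincipalIdentifier": account_id},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": ["SELECT"],
    }
    if allow_regrant:
        # Lets the recipient pass the permission on to its own principals.
        request["PermissionsWithGrantOption"] = ["SELECT"]
    return request


# Placeholder values, for illustration only.
share = cross_account_grant("444455556666", "lake_db", "orders", allow_regrant=True)
print(share)

# With boto3 and credentials available, this could be sent as:
#   boto3.client("lakeformation").grant_permissions(**share)
```

    The grant option matters here because the recipient's data lake administrator usually needs to grant access onward to users in that account.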

    Troubleshooting

    If you encounter an error about not being able to create a blueprint or a workflow, it’s probably an IAM issue

    If you encounter any cross-account permission issue, it probably has something to do with Resource Access Manager (RAM)
