In this blog, you will learn how to load data from an external AWS S3 bucket stage into a Snowflake table.
Upload a file to the AWS S3 bucket, as shown in the screenshot below. The bucket name is sfbucketpbi, the folder name is files, and the file name is empData.txt.
Copy the S3 bucket folder URL and make a note of it for future reference.
Create The IAM Role In AWS
Log into the AWS Management Console, and from the home dashboard, select Identity & Access Management (IAM). Alternatively, you can search for IAM in the search bar.
Click on Roles, then click on the Create Role button.
Next, select the trusted entity type, choose the AWS Account option, leave all other settings unchanged, and click the Next button.
Add Permission Policies
In the search box, type S3 and you will see all S3-related policy names. Choose AmazonS3FullAccess and click the Next button.
Assign the role name and description
Enter a unique role name, provide a description, and then click on the Create Role button.
The role has now been created successfully.
Find The ARN (Amazon Resource Name)
Go to IAM > Roles > Select your role > Copy the ARN and make a note of it somewhere for future use.
AWS S3 Storage Integration with Snowflake
Execute the command below in the Snowflake Web UI.
CREATE OR REPLACE STORAGE INTEGRATION my_aws_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = S3
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::252332417099:role/mysnowflakerole'
  STORAGE_ALLOWED_LOCATIONS = ('s3://sfbucketpbi/files/')
  COMMENT = 'Integration with AWS S3 buckets';
- STORAGE_AWS_ROLE_ARN: Specifies the Amazon Resource Name (ARN) of the AWS IAM role that grants Snowflake the permissions needed to access the specified AWS resources.
- STORAGE_ALLOWED_LOCATIONS: Defines the allowed locations or paths within the specified cloud storage provider (S3 in this case). In this example, data can be loaded from or unloaded to the s3://sfbucketpbi/files/ location within the S3 bucket. Refer to the prerequisite section of this post.
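Note that, by default, creating a storage integration requires the ACCOUNTADMIN role (or a role granted the CREATE INTEGRATION privilege). A minimal sketch, assuming ACCOUNTADMIN is available to you and SYSADMIN is just an illustrative grantee:

```sql
-- Switch to a role allowed to create integrations
USE ROLE ACCOUNTADMIN;

-- Optionally allow another role to use the integration in stage definitions
GRANT USAGE ON INTEGRATION my_aws_integration TO ROLE SYSADMIN;
```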
Retrieve The AWS IAM User For Your Snowflake Account
Execute the command below; Snowflake will return a result set that includes information about the specified integration.
DESC integration my_aws_integration;
Copy the property value of STORAGE_AWS_IAM_USER_ARN from the output, as shown in the image above.
Grant The IAM User Permissions To Access Bucket Objects
Switch back to the AWS Console, navigate to IAM, locate your role, open the Trust relationships tab, and then click Edit trust policy.
In the policy document, replace the Principal's AWS value with your STORAGE_AWS_IAM_USER_ARN, and click the Update policy button.
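For reference, the updated trust policy generally looks like the sketch below. The placeholders stand for the STORAGE_AWS_IAM_USER_ARN you copied earlier and the STORAGE_AWS_EXTERNAL_ID property from the same DESC output; the sts:ExternalId condition is the pattern Snowflake's documentation recommends for restricting who can assume the role:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "<STORAGE_AWS_IAM_USER_ARN from DESC output>"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<STORAGE_AWS_EXTERNAL_ID from DESC output>"
        }
      }
    }
  ]
}
```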
Create File Format Objects
Define a file format that specifies how your data is formatted in the S3 bucket files.
Let’s take a look at a sample file. The file format is TXT, and the data is in comma-separated format.
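Since the screenshot of the sample file is not reproduced here, a hypothetical empData.txt matching this layout (a header row followed by comma-separated employee records) might look like this:

```text
EMP_ID,FIRST_NAME,LAST_NAME,SALARY
1,John,Doe,55000.00
2,Jane,Smith,62000.00
```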
CREATE OR REPLACE FILE FORMAT aws_file_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  SKIP_HEADER = 1;
- Type: Specifies the file format type. In this case, CSV works for a TXT file containing comma-separated values.
- Field Delimiter: Specifies the field delimiter used in the files. Here, fields are separated by commas.
- Skip Header: Specifies the number of header rows to skip in the files.
Create an external stage
External stages are used to access and manage data files from external sources in Snowflake, such as Amazon S3, Azure Blob Storage, or Google Cloud Storage.
Create a stage object that tells Snowflake where to retrieve the files from:
CREATE OR REPLACE STAGE my_stage_aws
  URL = 's3://sfbucketpbi/files/empData.txt'
  STORAGE_INTEGRATION = my_aws_integration
  FILE_FORMAT = aws_file_format;

LIST @my_stage_aws;
Create a Table in Snowflake for Data Load
Use the CREATE TABLE statement below to create the target table, then execute it.
CREATE OR REPLACE TABLE DEVDB.PUBLIC.empData (
  EMP_ID NUMBER(38,0) NOT NULL AUTOINCREMENT,
  FIRST_NAME VARCHAR(50),
  LAST_NAME VARCHAR(50),
  SALARY NUMBER(10,2)
);

SELECT * FROM empData;
Copy command to ingest data from AWS S3
The COPY INTO command is used to load data from external files, such as CSV, TSV, or Parquet files, into a Snowflake table.
COPY INTO DEVDB.PUBLIC.empData FROM @DEVDB.PUBLIC.my_stage_aws;
After that, you can execute the query below to check the data in the empData table.
SELECT * FROM empData;
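If the load fails, or you want to preview problems before loading, COPY INTO supports standard Snowflake options such as VALIDATION_MODE and ON_ERROR; a sketch (the option values shown are standard Snowflake keywords, not specific to this tutorial):

```sql
-- Report parsing errors without actually loading any rows
COPY INTO DEVDB.PUBLIC.empData
  FROM @DEVDB.PUBLIC.my_stage_aws
  VALIDATION_MODE = RETURN_ERRORS;

-- Load the data, skipping any rows that fail to parse
COPY INTO DEVDB.PUBLIC.empData
  FROM @DEVDB.PUBLIC.my_stage_aws
  ON_ERROR = CONTINUE;
```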
Thank you for reading this post, and I hope you enjoyed it!