Notes: Snowflake copy into

Be careful with copy into it has some nuances

/* 
mystage is an externalstage in S3, with a storage integration setup with rwd permissions 
on the bucket
*/

 
/*
This will delete files in S3
You absolutely don't want this if S3 contains the  only copy of your source data
*/
COPY INTO mytable
FROM @mystage
PURGE = true;


-- This will duplicate data in mytable, by loading all files again into the table
COPY INTO mytable
FROM @mystage
FORCE = true;


-- This will do a partition load
COPY INTO mytable
FROM '@mystage/event_date=2023-09-20/';



-- copy into approach with loaded timestamp
create or replace table my_source_table(
v variant, row_last_updated_ts timestamp default current_timestamp::timestamp);

COPY INTO my_source_table(v)
FROM @mystage;




Copy into only loads changed files – running the command on the same set of files that have already loaded will do nothing. Supposedly the mechanism here is a checksum on the filename + contents, although I can’t find any docs explaining how this works in practice.

Leave a comment