Be careful with copy into it has some nuances
/*
mystage is an externalstage in S3, with a storage integration setup with rwd permissions
on the bucket
*/
/*
This will delete files in S3
You absolutely don't want this if S3 contains the only copy of your source data
*/
COPY INTO mytable
FROM @mystage
PURGE = true;
-- This will duplicate data in mytable, by loading all files again into the table
COPY INTO mytable
FROM @mystage
FORCE = true;
-- This will do a partition load
COPY INTO mytable
FROM '@mystage/event_date=2023-09-20/';
-- copy into approach with loaded timestamp
create or replace table my_source_table(
v variant, row_last_updated_ts timestamp default current_timestamp::timestamp);
COPY INTO my_source_table(v)
FROM @mystage;
Copy into only loads changed files – running the command on the same set of files that have already loaded will do nothing. Supposedly the mechanism here is a checksum on the filename + contents, although I can’t find any docs explaining how this works in practice.
Leave a comment