Home  >  Blog  >   Snowflake

Snowflake Cloning - Zero Copy Clone

This article explains the snowflake cloning introduction and zero-copy clone snowflake, their benefits, and how to use them. This topic discusses important things on how to use zero-copy cloning with a snowflake's ability to travel through time at table levels. We will also give some things to consider when cloning particular snowflake objects. This information will give you a good understanding of snowflakes' most important characteristics.

Rating: 4.9
  
 
1483
  1. Share:
Snowflake Articles

Snowflake is a major cloud-computing data warehousing firm that will play an important part in the development of artificial intelligence. 

Snowflake is a business solution that helps companies establish and run a system without substantially relying on DBAs. 

Snowflake offers data analysis, collecting, and analytics solutions that are substantially quicker, easier to use, and more versatile than the standard systems used up until now. In addition to helping with other things, it is helpful for advanced analytics, business intelligence, systems integration, and security and governance.

Table of Content: Snowflake Cloning

What is Cloning in Snowflake?

Snowflake cloning reproduces data from a schema, database, or table without the need for additional storage or long waiting periods. In most organizations, real-time data must be copied from production to development or staging environments to change the object with exact results. We no longer have to sit for long periods to enjoy the environment. 

Snowflake Cloning is significantly faster than any other database for cloning. Depending on the source objects, replication takes only a few minutes. 

The cloned object can be written to and is separate from the source of the clone. Changes made to the original object or the copy don't affect the other. 

  • When you clone a database, all the schemas and tables in that database will also be cloned. 
  • When you clone a schema, all the tables in that schema will also be copied. It shows how to copy databases, tables, and schemas in a script, along with comments that explain what is going on. 

When a table is cloned, its data, structure, and other characteristics are copied. The original table's load history is not preserved in cloned tables. Suppose a data file was successfully loaded into the original table. In that case, it could be loaded into a table copy.

Some of the most advanced features of snowflakes, such as time travel and zero-copy cloning, simplify life for us. The clone command makes it simple to create duplicates of a database or table without manually copying all associated metadata, primary keys, or schema. We may replicate the entire database with a single operation, including its metadata and structure. It's helpful during development when we need to replicate a database or table quickly.

Example of Cloning in Snowflake

The following script should run entirely if you copy and paste it into the Worksheet in the Snowflake online interface:

The following script should run entirely if you copy and paste it into the Worksheet in the Snowflake online interface:

-- Cloning Tables

-- Create a sample table

CREATE OR REPLACE TABLE demo_db.public.employees

 (emp_id number,

 first_name varchar,

 last_name varchar

 );

-- Populate the table with some seed records.

Insert into demo_db.public.employees

values(100,'John','Smith'),

 (200,'Sam','White'),

 (300,'Bob','Jones'),

 (400,'Linda','Carter');

-- Show the content of the table employees in the demo_db database and the public schema

select * from demo_db.public.employees;

Cloning

Want to enhance your skills to become a master in Snowflake Certification, Enroll in our Snowflake Training Certification Course

Snowflake Cloning with Time Travel

One of the many benefits of using a snowflake database is the ability to go back in travel time and retrieve previously modified or deleted data.

Suppose a table or part of its entries were deleted inadvertently. In that case, you could get them back within the table's configured time-travel period. It can be done using the AT or BEFORE clauses, which allow for reference timestamps, offsets from the current time, or statement IDs.

Below is the syntax to use Time Travel to clone a table

CREATE [ OR REPLACE ] { DATABASE | SCHEMA | TABLE | STREAM } [ IF NOT EXISTS ] <object_name>  
CLONE <source_object_name>  
[ { AT | BEFORE } ( { TIMESTAMP => <timestamp> | OFFSET => <time_difference> | STATEMENT => <id> } ) ]

This feature allows you to clone a table precisely as it existed at a given time or a short while/hours ago. 

Here are a few examples of using time travel to duplicate tables in Snowflake:

CREATE TABLE EMPLOYEE_CLONE CLONE EMPLOYEE 
AT(OFFSET => -60*5); 
CREATE TABLE EMPLOYEE_CLONE CLONE EMPLOYEE 
AT(TIMESTAMP => 'SUN, 06 MAR 2022 13:45:00 +0530'::TIMESTAMP_TZ); 

Considerations Of Snowflake Cloning with Time Travel

There are four primary considerations to make when cloning with time travel:

  1. An error is returned if the source object did not exist at the time or location indicated in the AT or BEFORE clause.
  2. Objects whose children did not exist at the time/point indicated in the AT or BEFORE clause are not cloned. There will be no mistakes.
  3. An error is returned if any objects needed to clone the object do not have historical data. The time specified in the AT or BEFORE clause is not part of the object's data retention time.
  4. Time travel data retention keeps data and metadata so that a standard time travel query can get the names of columns and other metadata from the past. A clone, on the other hand, gets its data from the historical store, but it gets its metadata from the source object in its current state. For example, suppose a column in a table has been renamed. The column names will be retrieved from the current state, while the data in the table will be fetched depending on the time/point provided by the AT or BEFORE clause.

Cloning with Time Travel

Related Article: Snowflake Tutorial

What is Zero Copy Cloning in Snowflake?

The cloning process can produce a database copy called "zero-copy cloning." 

When an object is cloned, it receives a copy of all the information previously stored in the original object. As a result of the cloning process, the copied object can be updated independently of the original. To put it another way, if you make changes to either the original object or the clone, you won't see those changes reflected in the other. Through zero-copy cloning, Snowflake makes it simple to make backup copies of your data without increasing your storage needs or requiring you to wait for extended periods.

Snowflake's ability for zero-copy cloning is an outstanding feature. It is the ability to make many copies of something without making a new physical copy or requiring more space to store them. Since expenditures are reduced without restriction, significantly more liberty is available in the setting.

Cloned objects are separate from the original table; thus, we can update them without affecting the original table. However, making a copy of any object won't cost us anything and is economical. All metadata is simple to copy, and storage management is enhanced.

Purpose

Any database or table cloning serves the same purpose as establishing a backup for development and functions as time travel allows us to restore data.

Command

CREATE TABLE <table_name>
CLONE <source_table_name>

What Privileges are Required in Zero Copy Clone Snowflake?

To clone an item, you must have the most basic permissions possible. Your current role ought to have the appropriate permissions on the source object to be able to create a clone of it:

  • Additional items: USAGE
  • Tables: SELECT
  • In addition, to clone a schema or an object contained within a schema, the role you are currently playing needs to have the appropriate privileges on both the object that will serve as the source and the one that will serve as the clone container (s).

How does Snowflake's Zero Copy Cloning work?

A Snowflake table's data is automatically partitioned into micro-partitions, the minor persistent storage units. Uncompressed data ranges from 50 MB to 500 MB across all micro partitions.

Note: Snowflake's actual size is smaller because data is always saved and compressed. All Snowflake tables undergo this micro-partitioning procedure automatically.

In Snowflake, the clone can be duplicated an infinite number of times, and each copy will have access to both shared and dedicated storage. Snowflake assigns a unique identifier (ID) to each table in the database. A similar CLONE GROUP ID that reflects cloning status is assigned to each table. The table is a clone if the IDs in both columns are unique. The Account Admin role is required to access the TABLE STORAGE METRICS view. 

Zero Copy Cloning Work

When you use the Clone database command, the following occurs:

  • Creates a new data source.
  • Everything in the database is generated (notice that particular objects are not cloned due to constraints).

 In place of creating duplicate micro-partitions when cloning an object database, Snowflake instead creates new metadata that refers back to the original object's micro-partitions. Zero-copy cloning describes this process. Due to Snowflake's cloud services layer, the user is not required to perform any of these actions.

Related Article: Snowflake Architecture

Advantages of Zero Copy Clone Snowflake

The Zero Copy Clone Snowflake has many benefits, some of which are as follows:

  • Snowflake "Fast Clone": The zero-copy clone snowflake approach enables users to instantly create as many copies of the data as they need without incurring any of the usual costs of doing so. It also helps the user save time.
  • Easy to use: Zero-copy clone snowflake technology allows duplicate data structures like tables, databases, and schemas without copying the originals. No more paperwork is necessary. Therefore, cloning is a straightforward and elementary procedure that does not need any specialized knowledge or equipment.
  • Saves you Time: The users are required to wait for long periods, including hours, days, and even weeks, to construct a test environment or even develop one from a copy of the production data warehouse. The user saves time and does not have to shell out additional charges for the test and development environment so that all of the replicated data. The user can pay for both environments separately.
  • Saves money on storage: Using Zero clone snowflake, users can make copies of an item without duplicating its storage. Cloning a table doesn't consume any space since it uses the same micro partitioning scheme as the original table, which already exists in the parent database. Rows in a clone can be modified independently of the original table. When a clone is updated, it creates a set of new, CDP-protected micro partitions unique to that clone.

MindMajix Youtube Channel

How to Clone objects in Snowflake?

Cloning objects in Snowflake is accessible and can be done with the help of the simple SQL statement displayed below.

CREATE <object_type> <object_name>
CLONE <source_object_name>

The version presented above is an abridged version. The following is an example of the full syntax that must be used to clone a Snowflake object.

CREATE [ OR REPLACE ] { DATABASE | SCHEMA | TABLE | STREAM | STAGE | FILE FORMAT | SEQUENCE | TASK } [ IF NOT EXISTS ] <object_name>
CLONE <source_object_name>

With the above statements, a copy of a source object will be made into a new object. Even though the cloned object's metadata and the data it holds are the same as the source object's, each object has its life cycle and is separate from the other.

Child Objects in a Clone

When any internal stage is cloned (typically an internal table stage), it is critical to remember that this stage will be empty. Similarly, cloned streams will be empty by default, and cloned pipes and jobs will be paused. Some Snowflake objects have child objects, such as a database with schemas containing tables containing table stages.

A pipe is an example of a child that an object could produce. A pipe that loads data into one of a schema's tables would be an example of something that might be included in a schema. The behavior of the pipe determines whether or not a child pipe is copied along with its parent pipe during a clone operation. Whether or not the pipe refers to internally named stages is the specific factor determining this. Keep in mind that internally named stages are not duplicated in any way. It also applies to pipes that have their internal stages called internally. If a pipe, on the other hand, refers to external stages, then that will clone the pipe.

When a pipe is cloned, the operation will, by default, pause the pipe. It is done to prevent excessive data loading that is not essential, as a clone may not need additional data loading. In addition, this allows the user to verify that the pipe will carry out the intended operation. It is of utmost significance in situations where the pipe definition includes a specification of either the complete namespace or the schema. These will be kept up to date so that data may load incorrectly.

Children of a cloned item will typically also be clones. This regulation has the following exceptions:

  • External Tables
  • Internal Named Stages

Internal table stages, which are cloned along with their parent table, are not to be confused with these.

Which Objects can be Cloned in Zero Copy Clone Snowflake?

Before learning how to clone an object, you must determine which objects can be cloned and any limits. At the time of writing, this is a list of all cloneable objects.

Snowflake's Cloning Documentation has an up-to-date list:

Data Storage Objects such as

  • Schemas 
  • Streams
  • Tables
  • Databases

Data Configuration Objects:

  • File Formats 
  • Stages 
  • Sequences 

The cloning functionality for each category influences how the tasks are categorized and organized into groups.

What is the Point of Cloning an Object in Zero Copy Clone Snowflake?

There are many reasons to copy an item in Snowflake or any other Data Warehouse. Most cloning happens because of one of these three things

  1. Support a wide range of environments, including testing, development, and backup.
  2. To try out potential alterations/development without having to set up a new environment or risk damaging the source object.
  3. To finish a one-time task that uses its source item.

Conclusions

The Zero Copy Clone Snowflake functionality has been detailed in this article. It is one of the most beautiful snowflake features. Because copying entire databases for testing is simple, you may quickly accumulate hundreds of terabytes of redundant storage. You can discover and remove this storage if the Snowflake administrators know the underlying mechanism. Database cloning is basic and straightforward. Cloning can not only save you time, but it will also save you a lot of money!

Join our newsletter
inbox

Stay updated with our newsletter, packed with Tutorials, Interview Questions, How-to's, Tips & Tricks, Latest Trends & Updates, and more ➤ Straight to your inbox!

Course Schedule
NameDates
Snowflake TrainingMar 30 to Apr 14View Details
Snowflake TrainingApr 02 to Apr 17View Details
Snowflake TrainingApr 06 to Apr 21View Details
Snowflake TrainingApr 09 to Apr 24View Details
Last updated: 04 Apr 2023
About Author

 

Madhuri is a Senior Content Creator at MindMajix. She has written about a range of different topics on various technologies, which include, Splunk, Tensorflow, Selenium, and CEH. She spends most of her time researching on technology, and startups. Connect with her via LinkedIn and Twitter .

read more