The Simple Guide To Snowflake Tables

Download as pdf or txt
Download as pdf or txt
You are on page 1of 51

The Simple Guide

to Tables in
Snowflake

masteringsnowflake.com
What you'll learn

The Simple Guide to tables in Snowflake


The 9 different types of table in
Snowflake, how they differ and when
you would use them.

But first we'll cover:

Data Retention Periods


Fail-Safe

Otherwise some of the concepts we


discuss won't make sense!

Adam Morton
Data Retention

The Simple Guide to tables in Snowflake


Snowflake maintains versions of data as
they change over time.

Think of this as an audit trail or ledger of


transactions applied to the data.

Each change in the data creates a new


version of that data in the background.

Snowflake will keep these versions for as


long as the data-retention period is set.

Adam Morton
Data Retention

The Simple Guide to tables in Snowflake


In both standard and enterprise editions of
Snowflake, the retention period is, by default,
one day or 24 hours.

As you can see in the table below, it is possible


to set time travel to 0, which is equivalent to
disabling it, meaning historical data is no
longer available to be queried.

Adam Morton
Fail-Safe

The Simple Guide to tables in Snowflake


After the data-retention period ends,
data cannot be viewed within your
account.

But that’s not the end of the story.

There is one final resting point for


your data, which you cannot see or
access directly.

Adam Morton
Fail-Safe

The Simple Guide to tables in Snowflake


For a further, non-configurable 7
days after data leaves the
database, data from permanent
objects ends up in something
called a fail-safe.

Think of this as a last-chance


saloon where your data rests for a
few days before finally biting the
dust!

Adam Morton
If you cannot see the

The Simple Guide to tables in Snowflake


data in the fail-safe, then
who can?

Well, only Snowflake employees can access


the fail-safe, so it may take several hours to
recover the data from this area.

Snowflake states that it is provided on a best-


endeavor basis.

Meaning you should not rely on this as part of


a disaster recovery scenario!

Adam Morton
The Simple Guide to tables in Snowflake
Ok, now let's move on to
the tables...

Adam Morton
Table types in

The Simple Guide to tables in Snowflake


Snowflake
There are 9 table types in Snowflake.

1. Permanent
2. Temporary
3. Transient
4. External
5. Directory
6. Dynamic (Public Preview)
7. Event (Public Preview)
8. Unistore/Hybrid (Private Preview)
9. Iceberg (Private Preview)

Adam Morton
Permanent Tables

The Simple Guide to tables in Snowflake


You can think of this table type
as the ‘standard’ table type.

These are the default in


Snowflake as they’ll be created
when you execute the following
command:

CREATE TABLE

Adam Morton
Permanent Tables

The Simple Guide to tables in Snowflake


Permanent tables are used to
store permanent data and exist
until explicitly dropped.

They remain until explicitly


dropped and are kept in time
travel for up to 1 day (standard
edition) and 90 days (enterprise
and higher).

Adam Morton
The Simple Guide to tables in Snowflake
Time travel is a whole
other discussion for a
different day and a
different carousel!

Adam Morton
Temporary Tables

The Simple Guide to tables in Snowflake


Snowflake supports creating temporary
tables for storing non-permanent,
transitory data.

These are very handy for short-lived ETL


working tables or session-specific data.

Temporary tables only exist within the


session in which they were created and
persist only for the remainder of the
session.

Adam Morton
Temporary Tables

The Simple Guide to tables in Snowflake


This means they are not visible to
other sessions.

Once the session ends, data stored in


the table is purged completely from
the system

This means it is not recoverable, either


by the user who created the table or
Snowflake.

Adam Morton
Temporary Tables

The Simple Guide to tables in Snowflake


So the takeaway message is don’t store
anything important in there you may need to
retrieve!

Create a temporary table by adding the


'TEMPORARY' key word:

CREATE TEMPORARY TABLE mytemptable


(id NUMBER, creation_date DATE);

Adam Morton
Transient Tables

The Simple Guide to tables in Snowflake


In contrast to temporary tables, transient tables
can be accessed by different users and
sessions.

Transient tables also persist until explicitly


dropped.

CREATE TRANSIENT TABLE mytemptable


(id NUMBER, creation_date DATE);

Adam Morton
So what’s the difference

The Simple Guide to tables in Snowflake


between a transient and
a permanent table?

The key difference is that


transient tables don’t have a fail-
safe period.

Adam Morton
External Tables

The Simple Guide to tables in Snowflake


In a typical table, the data is stored in
the Snowflake database.

In an external table, the data is stored in


files in an external stage (e.g. AWS S3).

In the case you don’t ingest the data


into Snowflake - then external tables
provide a way to query files ‘in-place’.

It also allows you to join multiple files by


joining them into a single table.

Adam Morton
External Tables

The Simple Guide to tables in Snowflake


This enables querying data stored
in files in an external stage as if it
were inside a database.

However, querying data stored


external to the database is likely to
be slower than querying native
database tables as they are not
optimised for Snowflake.

Adam Morton
External Tables

The Simple Guide to tables in Snowflake


External tables can access data stored
in any format supported by COPY
INTO <table> statements.

External tables are read-only,


therefore no DML operations can be
performed on them.

But they can be used for query and


join operations.

Adam Morton
External Tables

The Simple Guide to tables in Snowflake


Views can be created against
external tables.

Materialized views based on


external tables can improve
query performance.

Adam Morton
External Tables

The Simple Guide to tables in Snowflake


Partitioning your external tables is
recommended which allows you to either:

1. Add new partitions manually

or

2. Add new partitions automatically by


refreshing an external table that defines an
expression for each partition column.

Note: Automaticially refereshing comes at


an additional cost as it uses compute
credits in the background.

Adam Morton
Directory Tables

The Simple Guide to tables in Snowflake


A directory table is a little different to
traditional types of RDBMS tables.

It’s purpose is to catalog files which sit outside


of Snowflake in a external stage such as AWS
S3, Azure Blob or Google Cloud Storage (GCS).

You need to enable the directory table on your


stage as follows:

alter stage my_stage_name


set directory = (enable = true);

Adam Morton
Directory Tables

The Simple Guide to tables in Snowflake


Directory tables can be automaticially
refreshed using the corresponding event
notification service for the underlying cloud
provider i.e. AWS SQS, Microsoft Event Grid
or Google Pub/Sub.

You can also force a manual refresh using


the following command:

alter stage my_stage_name refresh;

Adam Morton
Directory Tables

The Simple Guide to tables in Snowflake


Once the directory table has refreshed, you
can then query the data in the table to
retrieve a URL that points to your
unstructured data file, as the following
example demonstrates:

SELECT * FROM DIRECTORY( @my_stage


name );

Once you have the URL, you can use this to


provide access to the file to your
application or users.

Adam Morton
Dynamic Tables

The Simple Guide to tables in Snowflake


A Dynamic Table is a declarative query
which aims to simplify your data
engineering efforts.

This type of table materializes the results of


a SQL query that you specify.

So, instead of creating a separate target


table and writing code to transform and
update the data in that table, you can define
the target table as a dynamic table.

Adam Morton
Dynamic Tables

The Simple Guide to tables in Snowflake


Often these tables are coupled
with Snowpipe Streaming to form
a neat and elegant way to ingest,
transform and make available
source data in a low latency and
efficient way

Adam Morton
Dynamic Tables

The Simple Guide to tables in Snowflake


You actually specify the SQL statement that
performs the transformation.

The following query provides a basic


example:

CREATE OR REPLACE DYNAMIC TABLE


TGT.CUSTOMER
LAG = '1 minute'
WAREHOUSE = COMPUTE_WH
AS
SELECT C_CUSTKEY, C_NAME, C_PHONE,
CURRENT_TIMESTAMP() AS
DATE_UPDATED
FROM STG.CUSTOMER;

Adam Morton
Dynamic Tables

The Simple Guide to tables in Snowflake


In the previous example Snowflake
executes an automated process
which updates the materialized
results automatically through
regular refreshes based upon a
time increment you define - in this
case every minute.

Adam Morton
Event Tables

The Simple Guide to tables in Snowflake


Snowflake offers a pre-
defined type of table which
allows you to capture logs
produced by stored
procedures, UDFs, and
UDTFs in your account.

Adam Morton
Event Tables

The Simple Guide to tables in Snowflake


Once you’ve created an event table
and associated it with your account,
you use an API in the language of
your handler to emit log messages
from handler code.

After you’ve captured log and trace


data, you can query the data to
analyze the results.

Adam Morton
Iceberg Tables

The Simple Guide to tables in Snowflake


Querying data within Snowflake using
External Tables then can work well.

But what if you have other teams in


your organisation who need to use a
different application (other than
Snowflake) to access the same files?

If you’re not using Snowflake, then


you cannot tap into the External
Tables.

Adam Morton
So, how can you ensure

The Simple Guide to tables in Snowflake


that the data is used
consistently across the
enterprise? ​
Well, this is where Apache Iceberg comes
in.

Originally developed by Netflix and Apple,


the challenge they were looking to
address was similar.

Iceberg was the answer and now


Snowflake (and AWS) have
adopted support for this.

Adam Morton
So how does Iceberg

The Simple Guide to tables in Snowflake


work?

Essentially, Iceberg allows you to apply a


Table Format to a group of files.

If also offers support for fine grained options,


ACID support, schema evolution and better
performance due to impro​​ved filtering.

Adam Morton
Promotes Consistency

The Simple Guide to tables in Snowflake


When an application needs to access this data,
it can leverage the table format first which
ensures a consistent format is applied across
the organisation.​​

Iceberg Tables use customer-supplied


storage.

This means you no longer pay Snowflake for


storage costs.

Instead, storage is billed by your cloud


provider.

Adam Morton
How to use Iceberg

The Simple Guide to tables in Snowflake


tables?

The first step in creating an Iceberg


Table is to create an External
Volume to hold the Iceberg Table
data and metadata.

Adam Morton
Create an external

The Simple Guide to tables in Snowflake


volume

create or replace external volume


my_ext_vol
STORAGE_LOCATIONS =
(
(
NAME = 'my-s3-us-east-1'
STORAGE_PROVIDER = 'S3'
STORAGE_BASE_URL = 's3://my-s3-
bucket/data/snowflake_extvol/'
STORAGE_AWS_ROLE_ARN = '****'
)
);

Adam Morton
The Simple Guide to tables in Snowflake
Now you can create one or more
Iceberg Tables using the External
Volume.

Notice on the next page that there is a


new iceberg table qualifier that
distinguishes an Iceberg Table from a
native Snowflake table.

In this example, we will also pre-


populate the table with data from our
sales table.

Adam Morton
Create the iceberg table

The Simple Guide to tables in Snowflake


create or replace iceberg table
my_iceberg_table
with EXTERNAL_VOLUME =
'my_ext_vol'
as select id, date, first_name,
last_name, address, region,
order_number, invoice_amount
from sales;

Adam Morton
Unistore/Hybrid Tables

The Simple Guide to tables in Snowflake


All of us are tired of spending our valuable
time moving data between systems.

We don't want to manage redundant


datasets across multiple solutions.

We want to access data when they need,


and be able to work with virtually all their
data in one place.

This is where Unistore’s comes in.

Adam Morton
Unistore/Hybrid Tables

The Simple Guide to tables in Snowflake


Unistore allows you to build
transactional business applications
directly on Snowflake.

You can run real-time analytical queries


on their transactional data.

With a unified data store it is easier to


achieve a consistent approach to
governance and security.

Adam Morton
Some benefits of

The Simple Guide to tables in Snowflake


Unistore
A single dataset to power the future
of modern development.

Act on transactional data almost


immediately, build better customer
experiences, and get new insights
by integrating transactional and
analytical data in a single dataset.

Simple and streamlined


transactional app development on
Snowflake

Adam Morton
New row-based storage

The Simple Guide to tables in Snowflake


engine

Snowflake have introduced an


entirely new row-based storage
engine so enterprise transactional
applications can now be built
directly on Snowflake.

Hybrid tables underpin the Unistore


feature and getting started with
Hybrid Tables is easy.

Adam Morton
How to create a hybrid

The Simple Guide to tables in Snowflake


table?

Simply create a table the same way


you would any other traditional
Snowflake table.

But to support these transactional


workloads, Hybrid Tables require a
primary key and Snowflake will now
enforce the uniqueness of your
applications’ primary keys.

Adam Morton
The Simple Guide to tables in Snowflake
CREATE HYBRID TABLE Orders (
Orderkey number(38,0) PRIMARY
KEY,
Customerkey number(38,0),
Orderstatus varchar(20),
Totalprice number(38,0),
Orderdate timestamp_ntz,
Clerk varchar(50)
);

Adam Morton
The Simple Guide to tables in Snowflake
Of course, your application requires more
than a single table.

An application’s data model is defined by


the relationships between tables using
primary keys and foreign keys.

And with Hybrid Tables, you can define


the relationship using referential integrity
constraints that are now enforced.

Let’s walk through an example. First we’ll


create the Hybrid Tables and the
associated constraint:

Adam Morton
The Simple Guide to tables in Snowflake
CREATE HYBRID TABLE Customers (
CustomerKey number(38,0)
PRIMARY KEY,
Customername varchar(50)
);

Adam Morton
The Simple Guide to tables in Snowflake
-- Create order table with foreign key
referencing the customer table
CREATE OR REPLACE HYBRID TABLE Orders
(
Orderkey number(38,0) PRIMARY KEY,
Customerkey number(38,0),
Orderstatus varchar(20),
Totalprice number(38,0),
Orderdate timestamp_ntz,
Clerk varchar(50),
CONSTRAINT fk_o_customerkey FOREIGN
KEY (Customerkey) REFERENCES
Customers(Customerkey),
INDEX index_o_orderdate (Orderdate));

Adam Morton
Et Voila!

The Simple Guide to tables in Snowflake


I hope you found The Simple
Guide to Tables in Snowflake
valuable!

If you did please share it with


other who you think would
benefit!
I'm Adam Morton

The Simple Guide to tables in Snowflake


I'll help you unlock your career
potential and supercharge your
career.

Follow me and lets move your


temporary dreams to a
permanent reality.
Work With Me

The Simple Guide to tables in Snowflake


Apply to join my exclusive Mastering
Snowflake program

Link in my profile

Adam Morton

You might also like