Why Was the Company Called Snowflake?: Founders
Snowflake on AWS
Snowflake on Azure
Absolutely, you can now run Snowflake across all three major cloud platforms, and
indeed seamlessly share or replicate data across platforms. A huge number of
third-party tools are also available to load data from Google Cloud Platform into
Snowflake on AWS or Azure.
In addition, each warehouse can be resized within milliseconds, from a single-node
extra-small cluster to a massive 128-node monster. This means users don't have to
put up with poor performance, as the machine size can be adjusted throughout the day
to match the workload. In one benchmark test, I reduced the time to process 1.3
terabytes of data from 5 hours to under 3 minutes.
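Resizing is a single ALTER statement (the warehouse name below is an assumption); the new size applies to subsequently submitted queries:

alter warehouse my_wh set warehouse_size = 'XLARGE';
-- scale back down once the heavy workload has finished
alter warehouse my_wh set warehouse_size = 'XSMALL';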
Finally, in addition to scaling up for larger data volumes, it's also possible to
automatically scale out to support a massive number of users. The diagram below
illustrates how the Snowflake multi-cluster feature automatically scales out and then
back in during the day; the user is only charged for the time the clusters are
actually running.
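A minimal multi-cluster definition might look like this (the name and limits are assumptions; the feature requires Snowflake Enterprise Edition or higher). Snowflake starts and stops clusters between the minimum and maximum counts as concurrency rises and falls:

create or replace warehouse reporting_wh with
  warehouse_size = 'MEDIUM'
  min_cluster_count = 1
  max_cluster_count = 4
  scaling_policy = 'STANDARD'
  auto_suspend = 300
  auto_resume = true;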
What are the layers of the Snowflake service?
The diagram below illustrates the layers in the Snowflake service:
The Service Layer: Provides connectivity to the database and handles
concurrency, transaction management and metadata.
The Compute Layer: Hosts a potentially unlimited number of virtual
warehouses (compute clusters) on which SQL statements are executed.
The Storage Layer: Hosts a data pool of potentially infinite size.
Also, note that the database and schema you just created are now in use for your
current session. This information is displayed in your SnowSQL command
prompt, but can also be viewed using the following context functions:
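The context functions in question are CURRENT_DATABASE and CURRENT_SCHEMA:

select current_database(), current_schema();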
Creating a Table
Create a table named emp_basic in sf_tuts.public using the CREATE
TABLE command:
create or replace table emp_basic (
  first_name string,
  last_name string,
  email string,
  streetaddress string,
  city string,
  start_date date
);
Note that the number of columns in the table, their positions, and their data types
correspond to the fields in the sample CSV data files that you will be staging in
the next step in this tutorial.
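For illustration only (this row is invented, not taken from the tutorial's data files), each CSV line supplies the six fields in column order, with the date in a format such as YYYY-MM-DD:

Aria,Smith,asmith@example.com,123 Main St.,Springfield,2017-01-15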
Creating a Virtual Warehouse
Create an X-Small warehouse named sf_tuts_wh using the CREATE
WAREHOUSE command:
create or replace warehouse sf_tuts_wh with
  warehouse_size = 'X-SMALL'
  auto_suspend = 180
  auto_resume = true
  initially_suspended = true;
Note that the warehouse is not started initially, but it is set to auto-resume, so it
will automatically start running when you execute your first SQL command that
requires compute resources.
Also, note that the warehouse is now in use for your current session. This
information is displayed in your SnowSQL command prompt, but can also be
viewed using the following context function:
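Here the relevant context function is CURRENT_WAREHOUSE:

select current_warehouse();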
Note that this command requires an active, running warehouse, such as the
sf_tuts_wh warehouse you created earlier. If you don't have access to a
warehouse, you will need to create one now.
You can also manipulate the data, such as updating the loaded data or inserting
more data, using standard DML commands.
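For example, using the emp_basic table from earlier (the new values below are assumptions, not tutorial data):

update emp_basic set city = 'London' where email = 'gbassfordo@sf_tuts.co.uk';
insert into emp_basic values
  ('Clement', 'Frend', 'cfrend@example.com', '17 Green Ave.', 'Oslo', '2017-04-01');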
+--------------------------+
| EMAIL |
|--------------------------|
| gbassfordo@sf_tuts.co.uk |
| rtalmadgej@sf_tuts.co.uk |
| madamouc@sf_tuts.co.uk |
+--------------------------+
+------------+-----------+------------------------------+
| FIRST_NAME | LAST_NAME | DATEADD('DAY',90,START_DATE) |
|------------+-----------+------------------------------|
| Granger | Bassford | 2017-03-30 |
| Catherin | Devereu | 2017-03-17 |
| Cesar | Hovie | 2017-03-21 |
| Wallis | Sizey | 2017-03-30 |
+------------+-----------+------------------------------+
Lifecycle Diagram
All user data in Snowflake is logically represented as tables that can be queried
and modified through standard SQL interfaces. Each table belongs to a schema
which in turn belongs to a database.
Organizing Data
You can organize your data into databases, schemas, and tables. Snowflake
does not limit the number of databases you can create or the number of schemas
you can create within a database. Snowflake also does not limit the number of
tables you can create in a schema.
Storing Data
You can insert data directly into tables. In addition, Snowflake provides DML for
loading data into Snowflake tables from external, formatted files.
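As a sketch of that load path (file names and format options are assumptions), files are first uploaded to the table's internal stage with PUT (a SnowSQL-only command), then loaded with COPY INTO:

put file://employees0*.csv @sf_tuts.public.%emp_basic;
copy into emp_basic
  from @%emp_basic
  file_format = (type = csv field_optionally_enclosed_by = '"')
  on_error = 'skip_file';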
Querying Data
Once data is stored in a table, you can issue SELECT statements to query the
data.
Working with Data
Once data is stored in a table, all standard DML operations can be performed on
the data. In addition, Snowflake supports DDL actions such as cloning entire
databases, schemas, and tables.
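Cloning is zero-copy: the clone shares the original's storage until either side is modified. A sketch, with target names that are assumptions:

create table emp_basic_clone clone emp_basic;
create database sf_tuts_backup clone sf_tuts;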
Removing Data
In addition to using the DML command, DELETE, to remove data from a table,
you can truncate or drop an entire table. You can also drop entire schemas and
databases.
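Each of these is a single statement, shown here against the emp_basic table from earlier (the DELETE filter is an assumption):

delete from emp_basic where start_date < '2017-01-01';  -- remove matching rows
truncate table emp_basic;  -- remove all rows but keep the table definition
drop table emp_basic;      -- remove the table itself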
DDL Commands
Data Definition Language (DDL) commands are used to create, manipulate, and
modify objects in Snowflake, such as users, virtual warehouses, databases,
schemas, tables, views, columns, functions, and stored procedures.
They are also used to perform many account-level and session-level operations,
such as setting parameters, initializing variables, and initiating transactions.
The following commands serve as the base for all DDL commands:
ALTER <object>
COMMENT
CREATE <object>
DESCRIBE <object>
DROP <object>
SHOW <objects>
USE <object>
Each command takes an object type and identifier. The remaining parameters
and options that can be specified for the command are object-specific.
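For example, applied to the table created earlier in this tutorial (the comment text is an assumption):

show tables in schema sf_tuts.public;
describe table sf_tuts.public.emp_basic;
comment on table sf_tuts.public.emp_basic is 'Basic employee details loaded from CSV';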
Managed Accounts
CREATE MANAGED ACCOUNT: Currently used to create reader accounts for providers
who wish to share data with non-Snowflake customers.
DROP MANAGED ACCOUNT
SHOW MANAGED ACCOUNTS
SHOW REPLICATION ACCOUNTS
SHOW REPLICATION DATABASES
SHOW REGIONS
Session Parameters
ALTER SESSION: For setting parameters within a session; can be performed by any
user.
SHOW PARAMETERS: For viewing parameter settings for the session (or account);
can also be used to view parameter settings for a specified object.
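For example (the parameter choice is arbitrary):

alter session set timezone = 'UTC';
show parameters like 'TIMEZONE' in session;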
Session Context
USE ROLE: Specifies the role to use in the session.
USE SCHEMA: Specifies the schema to use in the session (the specified schema
must be in the current database for the session).
DESCRIBE RESULT: Describes the columns in the results from a specified query
(the query must have been executed within the last 24 hours).
SHOW TRANSACTIONS
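For example, switching context for the session (SYSADMIN is a default Snowflake role; the schema is the one created earlier in this tutorial):

use role sysadmin;
use schema sf_tuts.public;
select current_role(), current_schema();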