Teradata Certification Associate Teaser
Teradata Certification Associate Teaser
Teradata Certification Associate Teaser
The Tera-Tom Genius Series consists of ten Teradata and Aster Data books. Each book design is for a specific
audience level. The books take a building block approach; always starting simple, then each page builds upon the
previous point. Order them all at www.CoffingDW.com or from your favorite book site.
Tera-Tom- Author of over 50 Teradata Books
Who is the number one technical book author in the world? Tera-Tom! He has written over 75 books on all big
data technologies. His books have helped to educate millions of people worldwide. What people love the most
about the Tera-Tom books is that they are so easy that a seven-year-old boy (raised by wolves) can understand!
Tera-Tom - Author of 25 Books on Other Major Platforms
Writing 75 books on so many different technologies has given me insight into how computer systems work.
The Nexus and Nexus Core Server for the Cloud and On-Premises
Some software companies specialize in query tools, while others focus on data movement. Some companies join
data across different platforms, yet others specialize in gathering data from different systems to build analytics
inside the user’s PC. Some vendors focus on business users through visualization and point-and-click capabilities,
while others focus on the IT developers or DBA teams. And some companies choose to build their applications for
the desktop, while others run their software on servers.
We designed and built the Nexus and NexusCore Server to do all of the above, plus more. The brilliance with our
software approach is that we allow the user to do everything from their desktop and then execute on their PC or
from a server. We call the desktop application the Nexus and the server the NexusCore Server. Users can run
queries, data movement, or cross-system joins (Federated queries) and run them from the PC on smaller data or
from any NexusCore Server on larger data. Nobody in the world has built anything like this! Try out a free trial at
www.CoffingDW.com.
Trademarks and Copyrights
Teradata*, NCR, BYNET, and SQL Assistant have registered trademarks of Teradata Corporation, Dayton, Ohio,
U.S.A., IBM, DB2 and Netezza are registered trademarks of IBM Corporation, ANSI is a registered trademark of
the American National Standards Institute. Microsoft Windows, Windows 2003 Server, .NET, PDW, and SQL
Server, are trademarks of Microsoft. Ethernet is a trademark of Xerox. UNIX is a trademark of The Open Group.
Linux is a trademark of Linus Torvalds. Java and Oracle is a trademark of Oracle. ParAccel is a trademark of
ParAccel. Kognitio is a trademark of Kognitio. Greenplum is a trademark of EMC Corporation. Nexus Query
Chameleon is a trademark of Coffing Data Warehousing.
Coffing Data Warehousing shall have neither liability nor responsibility to any person or entity concerning any loss
or damages arising from the information contained in this book or from the use of programs or program segments.
The manual is not a publication of Teradata Corporation, nor was it produced in conjunction with Teradata
Corporation.
All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any
means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the
publisher — no assumption for patent liability concerning the use of the information contained herein. Although
we took seriously every precaution in the preparation of this book, the publisher and author assume no
responsibility for errors or omissions, neither is any liability assumed for damages resulting from the use of the
information contained herein.
About Tom Coffing
Tom Coffing, better known as Tera-Tom, is the founder of Coffing Data Warehousing, where he has been CEO for the past 20 years.
Tom has written over 50 books on all aspects of Teradata, Netezza, Kognitio, Redshift, ParAccel, Vertica, SQL Server, and
Greenplum. Tom has taught over 1,000 Teradata classes in places such as India, Africa, Europe, China, Malaysia, and throughout
North America.
Tom is also the owner and designer of the Nexus Query Chameleon, the most sophisticated enterprise query tool in the industry. The
Nexus works on all platforms, including Hadoop, converts table structures between all systems, and allows companies to load their
Erwin logical model inside Nexus. The Nexus guides users like GPS. Users point and click on any table or view from any system, and
then shown what joins to what. As users choose the columns they want on their report, the SQL creates automatically.
In High School, Tom was the first athlete from his school to ever place at state. He was selected by his school to represent them at
Buckeye Boys State, and Tom earned his induction into the first class of the Lakota High School Hall of Fame.
At the University of Arizona and University of Nevada Las Vegas, Tom was a two-time All-American wrestler, Sophomore Athlete of
the year, and a two-time winner of the 1980 Olympic wrestling trials. Tom graduated with a bachelor’s degree in Speech
Communications.
After college, Tom became a state and national champion speech winner for Toastmasters and won two orchid awards as an actor.
Tom is the proud father of three wonderful children and is married for the past 38 years.
Contents
Chapter 1 – Database Concepts ................................................................................................................................ 3
Teradata Certification Exams, Releases, and Websites .......................................................................................... 4
Teradata Database Certification and Exam Path .................................................................................................... 5
Teradata Associate Exam Objectives and Section Weightings .............................................................................. 6
What is Parallel Processing? ................................................................................................................................. 7
What is a Database? .............................................................................................................................................. 8
What is SQL? ........................................................................................................................................................ 9
How Can You Use a Database?............................................................................................................................10
There are Two Types of Databases ......................................................................................................................11
Relational Database Model and Mathematical Set Theory Concepts ....................................................................12
Relational Term Review.......................................................................................................................................13
Rules for Teradata Databases, Tables, and Columns ............................................................................................14
What is a Constraint? ...........................................................................................................................................15
What is the Difference Between a Primary Key and a Foreign Key? ....................................................................16
Referential Integrity .............................................................................................................................................17
Quiz - Logical Model Primary Key Quiz .............................................................................................................18
Answer - Logical Model Primary Key Quiz .........................................................................................................19
Quiz - Logical Model Foreign Key Quiz ..............................................................................................................20
Answer - Logical Model Foreign Key Quiz .........................................................................................................21
Dimensional Model – One Fact Table and Many Dimension Tables ....................................................................22
Quiz – What is the Primary Key of the Fact Table? .............................................................................................23
Answer – What is the Primary Key of the Fact Table? .........................................................................................24
Quiz – Name the Foreign Keys ............................................................................................................................25
Answer – Name the Foreign Keys ........................................................................................................................26
What is a Star Schema? ........................................................................................................................................27
Table of Contents
Normalization ......................................................................................................................................................28
A Denormalized Table .........................................................................................................................................29
Getting to First Normal Form ...............................................................................................................................30
A Table that is in Second Normal Form ...............................................................................................................31
Tables Reaching Third Normal Form ...................................................................................................................32
Dimensional Modeling .........................................................................................................................................33
What are the Advantages of a Normalized vs. Dimensional Model? ....................................................................34
Logical Data Model vs. Physical Data Model ......................................................................................................35
Logical Data Model vs. Physical Data Model Example ........................................................................................36
Data Warehouses are Designed for Analytics .......................................................................................................37
Chapter 1 Database Concepts
“I find that the harder I work the more luck I seem to have.”
– Thomas Jefferson
3
Chapter 1 Database Concepts
The Certification Exams use the Teradata releases above. The Teradata Certification Exams will be updated
periodically to the latest Teradata release. The websites listed above provide additional information. The first
website listed above explain the concepts and objectives for each exam.
4
Chapter 1 Database Concepts
Teradata Database
Teradata Database Certified
Certified DBA Advanced DBA
The current Teradata Certification Exams are listed above. You can earn a wide variety of Teradata Certified
Designations depending upon the tests completed. The exams require knowledge plus experience with Teradata
release 16 and higher. Many of the test questions are scenario-based, and Teradata experience is needed to answer
many of the questions, but the Tera-Tom Certification books will teach you what you need to know to give you the
best chance of passing.
5
Chapter 1 Database Concepts
The exam objectives and section weightings above describe the content covered on the Associate Exam.
6
Chapter 1 Database Concepts
Tera-Tom’s
Parallel
Processing
Wash
and Dry
Two guys were having fun on a Saturday night when one said, “I’ve got to go and do my laundry.” The other said,
“What?!” The man explained that if he went to the laundry mat the next morning, he would be lucky to get one
machine and be there all day. But, if he went on Saturday night, he could get all the machines. Then, he could do
all his wash and dry in two hours. Now that’s parallel processing mixed in with a little dry humor!
7
Chapter 1 Database Concepts
What is a Database?
Customer_Table Order_Table
CustNo
_______ Cust Name
________________ Order_No CustNo
________ _______ Order_Date Order_Total
__________ ___________
1 Benny's Boots 1
1000 2019-01-04 299.50
2 Dave's Dogs 2
1001 2019-01-05 1254.67
3 Hanna's Hamsters 4
1002 2019-01-08 67.00
4 Frank's Frogs 5
1003 2019-01-09 12.50
5 Rachel's Rabbits
A table is made up of columns and rows. The Customer_Table has two columns and five
rows. The Order_Table has four columns and four rows.
Each row (also called a record) contains one instance of all the columns (also called
fields) in a table.
Databases provide a mechanism for organizing data. Relational databases store data in
tables. The storing, managing and retrieving of data are qualities of a database.
A table looks almost like an Excel spreadsheet because it has columns and rows, but a spreadsheet is considered a
flat file. Tables in a relational database have relations with other tables. Notice the CustNo column is in two tables.
8
Chapter 1 Database Concepts
What is SQL?
Order_Table
Order_No CustNo
________ _______ Order_Date Order_Total
__________ ___________
Select all 1000 1 2019-01-04 299.50
columns 1001 2 2019-01-05 1254.67 Select two
and sort the 1002 4 2019-01-08 67.00 columns for
data 1003 4 2019-01-09 12.50 CustNo 4
9
Chapter 1 Database Concepts
Customer_Table Order_Table
CustNo
_______ CustName
________________ Order_No CustNo Order_Date
________ _______ Order_Total
__________ ___________
1 Benny's Boots 1000 1 2019-01-04 299.50
2 Dave's Dogs 1001 2 2019-01-05 1254.67
3 Hanna's Hamsters 1002 4 2019-01-08 67.00
4 Frank's Frogs 1003 5 2019-01-09 12.50
5 Rachel's Rabbits
Retrieve rows (records) that Update rows Perform complex Cross-reference rows
match a certain criteria. (records) in mass. Aggregation. in different tables.
Update Order_Table SELECT * FROM
SELECT Cust_Name SELECT
Set Order_Total = Order_Table O
FROM Customer_Table AVG(Order_Total)
Order_Total * 1.1 ; Join Customer_Table C
WHERE Cust_No = 1 ; FROM Order_Table ; On O.CustNo = C.CustNo
Not all databases are alike. Teradata is special. No two database systems are the same.
10
Chapter 1 Database Concepts
The Constitution of the United States Product Report NoSQL databases don't use
50 SQL, but MapReduce or a
We the People of the United States, in 40 proprietary language.
Order to form a more perfect Union, 30 NoSQL only retrieve
establish Justice, insure domestic 20 information. DynamoDB
Tranquility, provide for the common 10 and MongoDB are both
defense, promote the general Welfare, Ice Eggs Tea Ham NoSQL databases.
SQL databases have relations between tables and include constraints on data types and consistency. NoSQL does
away with relational rules for the sake of speed, flexibility, and scale. Relational databases follow strict rules.
NoSQL provides documents, contracts, pictures, and XML or JSON (Java Script Object Notation).
11
Chapter 1 Database Concepts
A Relational database model uses concepts from mathematical set theory. A Union is a set of all objects that are a
member of A, or B, or both. The Intersection is the set of all objects that are members of both A and B.
12
Chapter 1 Database Concepts
Logical Operational
Model Relational File
Theory
& RDBMS Systems
13
Chapter 1 Database Concepts
database Tablename
Teradata has databases and within a database are tables. Teradata doesn't implement schemas like SQL Server, but
a schema in Teradata refers to the organization of data as a logical grouping or blueprint of how the database
divides up into tables. A database schema refers to a set of formulas or sentences, which means integrity constraints
imposed on a database. A NOT NULL constraint, for example, will not allow a NULL value in that column.
14
Chapter 1 Database Concepts
What is a Constraint?
At times it is advisable to add restrictions to columns within a table. Defining constraints is done to provide data
integrity, availability, and ease of use to the table and its data. You can define constraints on column values during
the CREATE Table process. Defining constraints can also be done using the ALTER command once the table
already exists. You define constraints at the column or table level.
15
Chapter 1 Database Concepts
Customer_Table Order_Table
CustNo
_______ Cust Name
________________ Order_No CustNo Order_Date Order_Total
________ _______ __________ __________
1 Benny's Boots 1000 1 2019-01-04 299.50
2 Dave's Dogs 1001 2 2019-01-05 1254.67
3 Hanna's Hamsters 1002 4 2019-01-08 67.00
4 Frank's Frogs 1003 4 2019-01-09 12.50
A PK is usually a single column in a table that is a unique identifier for each row (record).
Above, we have a column named CustNo in the Customer_Table that is a Primary Key (PK).
We also have a column named CustNo in the Order_Table that is a Foreign Key (FK).
CustNo is a unique column in the Customer_Table, but in the Order_Table it is not unique.
The two tables can be joined because they have a PK/FK relation on CustNo.
Do you know what the initials NN, NC, ND mean for a Primary Key, which refers to PK? Logically, a Primary
Key should never be NULL (No Nulls (NN)), and never be changed (No Changes (NC)), but most importantly it
should never have a duplicate (No Dups (ND)).
16
Chapter 1 Database Concepts
Referential Integrity
Employee_Table Payroll_Table
Any primary key field changes must apply to all foreign keys, or not at all. Any foreign key updates (not deletions)
must propagate to the primary parent key. Referential Integrity prevents inconsistencies, which prevents the
database from being corrupted on inserts, updates, merges, and deletes. Referential integrity prevents someone
from being deleted from the Employee_Table and not from the Payroll_Table.
17
Chapter 1 Database Concepts
This tricky quiz is going to teach you a lot in your quest to understand logical and physical modeling. You will
soon understand relational modeling and normalization as well as dimensional modeling.
18
Chapter 1 Database Concepts
Subscriber_No Subscriber_No *
PK
Claim_ID PK
PK Member_No
Subscriber_No
Member_No
Services
PK Service_Code Claim_Service
Services
PK Provider_No
Provider_Code
Above, we have a picture of s simple logical model. The logical modelers theoretically believe that this is the best
way to store, retrieve, and manipulate data for an insurance model. Each PK is unique. A Primary Key is always a
unique identifier for each row in that table, and it should never contain a NULL value.
19
Chapter 1 Database Concepts
Subscriber_No Subscriber_No *
PK
Claim_ID PK
PK Member_No
Subscriber_No
Member_No
Place a FK on Services
each column
that is a PK Service_Code Claim_Service
Foreign Key,
and draw a line
between the Services
PK and FK
PK Provider_No
relationship Provider_Code
A Foreign Key can have duplicate values, can be NULL, and it can be changed. Remember, both a Primary Key
or a Foreign Key can be a single column or a multi-column combination. Also, know that a PK and an FK column
can have different names, while still representing "like" data. Representing “like” data means the columns have
the same domain, which means they represent the same associated data.
20
Chapter 1 Database Concepts
A Foreign Key (FK) is a normal column(s) in a table that joins to a Primary Key (PK) column(s) in another table.
The PK/FK logical model is the relation between the tables, and that is why it is called a "Relational Model." A
Foreign Key can have duplicate values, can be NULL, and it can be changed. Notice that both a Primary Key or a
Foreign Key can be a single column or a multi-column combination. Also, notice that a PK and an FK column can
have different names, but they represent the same type of data. Since the two columns represent like data that is
joinable, they are from the same domain, which means they have the same range of values and data types.
21
Chapter 1 Database Concepts
Dimensional model design is best for business user reporting. Navigation paths are modeled and align with the
work process of users. Relational model design is best for business rules, but a Dimensional model emphasizes
usability. Tables in a Dimension define the way that that particular business uses the data. The model organizes
tables in a way that a given user or group of users think about the data.
22
Chapter 1 Database Concepts
23
Chapter 1 Database Concepts
The Primary Key of the Fact Table is a multi-column Primary Key of the four columns in the
Line_Order_Fact_Table colored in blue. They are LO_Cust_Key, LO_Part_Key, LO_Supp_Key, LO_Order_Date.
24
Chapter 1 Database Concepts
25
Chapter 1 Database Concepts
The FK boxes display each Foreign Key column. The fact table has a multi-column Primary Key. Each
Dimension table has a corresponding Foreign Key so the dimension can join to the fact. Notice, each dimension
table joins to the fact table on a different column.
26
Chapter 1 Database Concepts
Referred to as a
Dimension Snowflake
Table schema.
Dimension Dimension
Table Table
A dimension contains
A star schema model is reference information about
a simple form of a Fact Table the fact, such as date,
dimensional model. product, or customer.
A star schema model is for a data warehouse or business intelligence analytics. A star schema model is a simple
form of a dimensional model. Data organizes into a single fact table and multiple dimension tables. A fact is an
event that is counted or measured, such as a product sale or a customer login. A dimension contains reference
information about the fact, such as date, product, or customer. A Star Schema is also referred to as a Snowflake
schema.
27
Chapter 1 Database Concepts
Normalization
The process of placing columns in the correct tables is called normalization. There are three
major forms of normalization. You start by trying to get to first normal form. You then attempt
second normal form, and your end goal is to get to third normal form.
First Normal Form (1NF) means that columns (attributes) must not repeat within a table. No
repeating groups such as Monday_Sales, Tuesday_Sales, Wednesday_Sales, etc.
Second Normal Form (2NF) means that all columns relate to the entire Primary Key and not just
a portion of the Primary Key. Columns can also relate to each other, such as Dept_No and
Department_Name, by both belonging to the same table. A table that has a single column Primary
Key is always said to be in at least 2NF.
Third Normal Form (3NF) means all columns must relate to the Primary Key and not to each
other. In other words, all columns must relate to the primary key, the whole key, and nothing but
the key, so help you, Codd! It was a gentleman that worked for IBM named Codd that came up with
the normalization techniques.
If a table is in 3NF, it is automatically also in 2NF and 1NF. If a table only achieves 2NF, then it automatically
qualifies to also be in 1NF. Normalization can be confusing, but the next couple of pages will explain it well.
28
Chapter 1 Database Concepts
A Denormalized Table
Product_Daily_Sales_Table
If a business
ProdID Mon ______
Tue Wed Thu Fri Sat Sun
performs best _______ ______ ______ _____ _____ _____ _____
when product 1 90.00 95.44 80.01 91.74 97.48 90.61 81.73
sales are 2 75.50 78.34 85.56 73.76 73.64 80.46 73.74
tracked daily, 3 56.37 59.50 59.57 57.55 51.88 52.67 47.65
and then 4 64.90 61.77 63.65 68.66 41.77 67.75 78.67
compared from 5 87.53 77.16 80.63 97.79 70.76 70.13 87.79
day to day, a 6 92.10 82.23 95.22 89.43 72.93 78.42 99.53
denormalized 7 20.45 27.41 30.46 19.80 23.31 40.56 29.40
table if fine. 8 15.23 25.53 19.83 35.48 24.73 29.43 25.68
The table above is completely denormalized. It has not even reached the First Normal Form (1NF) because it has
repeating groups that are ranging from Mon-Sun. The three major forms of normalization are First Normal Form
(1NF), Second Normal Form (2NF), and Third Normal Form (3NF). If a table is in 3NF, it is automatically also in
2NF and 1NF. If a table only achieves 2NF, then it automatically qualifies to also be in 1NF.
29
Chapter 1 Database Concepts
Prod_Sizes_Price_Table To bring this table into first normal form, we split the
This table is table into two tables.
not in first ProdID ____________
______ Sizes Price
______
normal form 1 Sm, Med, Lg 80.01 Prod_Price_Table Prod_Sizes_Table
because the 2 Sm, Med 85.56
(sizes) column ProdID Price ProdID ________
______ Sizes
3 Med 59.57
contains
multiple 4 Sm 63.65 1 80.01 1 Sm
values. 5 XLG 80.63 2 85.56 1 Med
3 59.57 1 Lg
4 63.65 2 Sm
5 80.63 2 Med
3 Med
4 Sm
5 XLG
The table on the left is not in first normal form, but once we split it into two tables, we have reached the first
normal form.
30
Chapter 1 Database Concepts
Employee_Table
Second Normal Form (2NF) means that all columns relate to the entire Primary Key and not just a portion of the
Primary Key. Columns can also relate to each other.
31
Chapter 1 Database Concepts
Employee_Table Department_Table
This tables have reached third normal form. In each table, the columns
all relate to the primary key, the whole key, and nothing but the key.
A Second Normal Form (2NF) means that all columns relate to the entire Primary Key and not just a portion of the
Primary Key. In the Second Normal Form, columns can also relate to each other. Third Normal Form (3NF) means
all columns must relate to only the Primary Key and not to each other. In other words, all columns must relate to
the primary key, the whole key, and nothing but the key.
32
Chapter 1 Database Concepts
Dimensional Modeling
Dimension Table
Dimension Table
Product_Id Product_Name Price
Store_Id Store_Location
_______ ______________ Address
______ _________ ______________ ______
Fact Table
________ _________
Store_ID Product_ID _________
Sale_Date __________
Daily_Sales
Dimensional models are 11223344 1000 2014-10-16 5240.50
also referred to as Star 11223344 2000 2014-10-16 6023.77
Schema or
11223344 3000 2014-10-16 6787.21
Star Join Schema model.
99998888 1000 2014-10-16 6223.97
99998888 2000 2014-10-16 5543.22
99998888 3000 2014-10-16 6123.85
Dimension Table
Sale_Date ______________
_________ Quarter_of_Year ___________
Day_of_Week _____________
Week_of_Month
2014-10-15 4 Wednesday 2
2014-10-16 4 Thursday 2
Dimensional modeling contains two types of tables. They are fact and dimension tables. A fact table will have
fewer columns, but millions of rows potentially. There is only one fact table because it is the main table that
contains good information such as the products a company sells, sales totals, quantities, and the dates, times, and
places they sold. There will be many dimension tables. Dimension tables generally have more columns, but fewer
rows. Each dimension table that joins back to the main fact table provides detail about the actual place or product
or date the sale took place. Benefits include easy querying, flexibility, and improved performance.
33
Chapter 1 Database Concepts
Order_Table Dimension
Order_No CustNo Order_Date Order_Total Table
________ _______ __________ ___________
1000 1 2019-01-04 299.50
1001 2 2019-01-05 1254.67 Dimension Dimension
1002 4 2019-01-08 67.00 Table
Table
1003 4 2019-01-09 12.50
Fact Table
Customer_Table
Normalized data
is optimized for CustNo
______ CustName
______________
entity level 1 Benny's Boots De-normalized data is optimized for driving
transactions, 2 Dave's Dogs decision making by answering business questions.
such as row 3 Hanna's Hamsters
inserts, updates, 4 Frank's Frogs Dimension denormalization supports dimensional
and deletes. 5 Rachel's Rabbits modeling's top two priorities; speed and simplicity.
Normalized data optimizes for entity-level transactions, such as row inserts, updates, and deletes. De-normalized
data optimizes for driving decision making by answering business questions. Dimension denormalization supports
dimensional modeling's top two priorities; speed and simplicity.
34
Chapter 1 Database Concepts
You would use the same You would use a different physical
logical model on a model on Teradata. A different physical
Teradata, Oracle, DB2, model on Oracle. A different physical
SQL Server or Amazon model on DB2. A different physical
Redshift system. model on SQL Server, etc.
Great logical modelers get a room full of a company's business leaders in a conference room, and they use a
whiteboard or flip chart to build the tables with the proper columns for each table. The logical model is merely
theoretically how the business wants to organize their data. When finished with the logical model, a leader gives
the logical model to the IT staff. The physical modelers and the DBA team build the tables on a Teradata system
(or another system) with proper indexing so that reports, Ad Hoc queries, joins and tactical queries run as fast and
efficiently as possible.
35
Chapter 1 Database Concepts
Same logical model for Teradata and SQL Server Teradata physical (top) SQL Server (bot)
The logical model for how the company wants to organize its tables is the example above on the left. The company
has chosen a relational model that is in third normal form but could have chosen a dimensional model. The
physical model (for the Claims table only) displays above on the right. The top example is the physical model for a
Teradata system. The bottom example is a physical model for Microsoft SQL Server.
36
Chapter 1 Database Concepts
Data warehouses store enterprise-wide detailed data for analysis. This analysis allows for a strategic and tactical
business decision. A data warehouse is a "Decision Support System" or DSS. Teradata allows for transactions
from OLTP systems to immediately load into a Teradata system so that real-time tactical queries can run. A tactical
query is designed to return an answer in seconds. OLTP systems hold days, weeks, or months of data, but a data
warehouse holds years to decades of information.
37