Query Execution: Intro To Database Systems Andy Pavlo

Query Execution
12 Part I
Intro to Database Systems Andy Pavlo

15-445/15-645
Fall 2019 AP Computer Science
Carnegie Mellon University
2
ADMINISTRIVIA
Homework #3 is due Wed Oct 9th @ 11:59pm
Mid-Term Exam is Wed Oct 16th @ 12:00pm
Project #2 is due Sun Oct 20th @ 11:59pm
CMU 15-445/645 (Fall 2019)

3
QUERY PL AN
SELECT R.id, S.cdate
The operators are arranged in a tree. FROM R JOIN S
ON R.id = S.id
Data flows from the leaves of the tree WHERE S.value > 100
up towards the root.
p R.id, S.value
The output of the root node is the

result of the query. ⨝ R.id=S.id
s value>100
R S
CMU 15-445/645 (Fall 2019)
4
T O D AY ' S A G E N D A
Processing Models
Access Methods
Expression Evaluation
CMU 15-445/645 (Fall 2019)

5
PROCESSING MODEL
A DBMS's processing model defines how the

system executes a query plan.
→ Different trade-offs for different workloads.
Approach #1: Iterator Model

Approach #2: Materialization Model
Approach #3: Vectorized / Batch Model
CMU 15-445/645 (Fall 2019)

6
I T E R AT O R M O D E L
Each query plan operator implements a Next

function.
→ On each invocation, the operator returns either a single
tuple or a null marker if there are no more tuples.
→ The operator implements a loop that calls next on its
children to retrieve their tuples and then process them.
Also called Volcano or Pipeline Model.
CMU 15-445/645 (Fall 2019)

7
Next() for t in child.Next():

emit(projection(t)) FROM R JOIN S
ON R.id = S.id
Next() for t1 in left.Next(): WHERE S.value > 100
buildHashTable(t1)
for t2 in right.Next():
if probe(t2): emit(t1⨝t2) p R.id, S.value
Next() for t in child.Next(): ⨝ R.id=S.id
s
if evalPred(t): emit(t)
value>100
Next() for t in R: Next() for t in S:
R S
emit(t) emit(t)
CMU 15-445/645 (Fall 2019)

7
1 for t in child.Next():
ON R.id = S.id
for t1 in left.Next(): WHERE S.value > 100
buildHashTable(t1)
for t in child.Next(): ⨝ R.id=S.id
s
value>100
for t in R: for t in S:
R S
emit(t) emit(t)
CMU 15-445/645 (Fall 2019)

7
ON R.id = S.id
WHERE S.value > 100
2 for t1 in left.Next():
buildHashTable(t1)
s
value>100
R S
emit(t) emit(t)
CMU 15-445/645 (Fall 2019)

7
ON R.id = S.id
WHERE S.value > 100
buildHashTable(t1)
Single Tuple
s
value>100
3 for t in R: for t in S:
R S
emit(t) emit(t)
CMU 15-445/645 (Fall 2019)

7
ON R.id = S.id
WHERE S.value > 100
buildHashTable(t1)
for t in child.Next():
4 ⨝ R.id=S.id
s
value>100
5
R S
emit(t) emit(t)
CMU 15-445/645 (Fall 2019)

7
ON R.id = S.id
WHERE S.value > 100
buildHashTable(t1)
4 ⨝ R.id=S.id
s
value>100
5
R S
emit(t) emit(t)
CMU 15-445/645 (Fall 2019)

8
This is used in almost every DBMS. Allows for

tuple pipelining.
Some operators have to block until their children

emit all of their tuples.
→ Joins, Subqueries, Order By
Output control works easily with this approach.
CMU 15-445/645 (Fall 2019)

9
M AT E R I A L I Z AT I O N M O D E L
Each operator processes its input all at once and

then emits its output all at once.
→ The operator "materializes" its output as a single result.
→ The DBMS can push down hints into to avoid scanning
too many tuples.
→ Can send either a materialized row or a single column.
The output can be either whole tuples (NSM) or

subsets of columns (DSM)
CMU 15-445/645 (Fall 2019)

10
out = [ ]
1 for t in child.Output():
out.add(projection(t)) SELECT R.id, S.cdate
return out FROM R JOIN S
ON R.id = S.id
out = [ ]
for t1 in left.Output(): WHERE S.value > 100
p
buildHashTable(t1)
for t2 in right.Output(): R.id, S.value
if probe(t2): out.add(t1⨝t2)
return out
out = [ ] ⨝ R.id=S.id
s
for t in child.Output():
if evalPred(t): out.add(t)
return out value>100
R S
out = [ ] out = [ ]
out.add(t) out.add(t)
return out return out
CMU 15-445/645 (Fall 2019)
10
out = [ ]
ON R.id = S.id
out = [ ]
WHERE S.value > 100
2
for t1 in left.Output():
p
buildHashTable(t1)
return out
s
All Tuples return out value>100
3
R S
out = [ ] out = [ ]
CMU 15-445/645 (Fall 2019)
10
out = [ ]
ON R.id = S.id
out = [ ]
WHERE S.value > 100
2
p
buildHashTable(t1)
return out
4
s
3 5
R S
out = [ ] out = [ ]
CMU 15-445/645 (Fall 2019)
10
out = [ ]
ON R.id = S.id
out = [ ]
WHERE S.value > 100
2
p
buildHashTable(t1)
return out
4
s
3 5
R S
out = [ ] out = [ ]
CMU 15-445/645 (Fall 2019)
11
Better for OLTP workloads because queries only

access a small number of tuples at a time.
→ Lower execution / coordination overhead.
→ Fewer function calls.
Not good for OLAP queries with large

intermediate results.
CMU 15-445/645 (Fall 2019)

12
V E C T O R I Z AT I O N M O D E L
Like the Iterator Model where each operator

implements a Next function in this model.
Each operator emits a batch of tuples instead of a
single tuple.
→ The operator's internal loop processes multiple tuples at a
time.
→ The size of the batch can vary based on hardware or
query properties.
CMU 15-445/645 (Fall 2019)

13
out = [ ]
for t in child.Next(): 1 SELECT R.id, S.cdate
out.add(projection(t))
if |out|>n: emit(out) FROM R JOIN S
ON R.id = S.id
2 out = [ ]
p
buildHashTable(t1)
for t2 in right.Next(): R.id, S.value
if |out|>n: emit(out)
out = [ ]
⨝ R.id=S.id
if |out|>n: emit(out) s value>100
out = [ ] Tuple Batch out = [ ]

3 for t in R:
out.add(t)
for t in S:
out.add(t)
R S
CMU 15-445/645 (Fall 2019)
13
out = [ ]
for t in child.Next(): 1 SELECT R.id, S.cdate
out.add(projection(t))
if |out|>n: emit(out) FROM R JOIN S
ON R.id = S.id
2 out = [ ]
p
buildHashTable(t1)
for t2 in right.Next(): R.id, S.value
out = [ ]
for t in child.Next(): 4 ⨝ R.id=S.id
if |out|>n: emit(out) s value>100
out = [ ] Tuple Batch out = [ ]

3 5
for t in R:
out.add(t)
for t in S:
out.add(t)
R S
CMU 15-445/645 (Fall 2019)
14
Ideal for OLAP queries because it greatly reduces

the number of invocations per operator.
Allows for operators to use vectorized (SIMD)
instructions to process batches of tuples.
CMU 15-445/645 (Fall 2019)

15
PLAN PROCESSING DIRECTION
Approach #1: Top-to-Bottom

→ Start with the root and "pull" data up from its children.
→ Tuples are always passed with function calls.
Approach #2: Bottom-to-Top

→ Start with leaf nodes and push data to their parents.
→ Allows for tighter control of caches/registers in pipelines.
CMU 15-445/645 (Fall 2019)

16
ACCESS METHODS
An access method is a way that the FROM R JOIN S
DBMS can access the data stored in a ON R.id = S.id
table. WHERE S.value > 100
→ Not defined in relational algebra.
p R.id, S.value
Three basic approaches:

→ Sequential Scan ⨝ R.id=S.id
→ Index Scan
→ Multi-Index / "Bitmap" Scan s value>100
R S
CMU 15-445/645 (Fall 2019)
17
SEQUENTIAL SCAN
For each page in the table: for page in table.pages:

→ Retrieve it from the buffer pool. for t in page.tuples:
→ Iterate over each tuple and check whether if evalPred(t):
to include it. // Do Something!
The DBMS maintains an internal

cursor that tracks the last page / slot
it examined.
CMU 15-445/645 (Fall 2019)

18
S E Q U E N T I A L S C A N : O P T I M I Z AT I O N S
This is almost always the worst thing that the

DBMS can do to execute a query.
Sequential Scan Optimizations:

→ Prefetching
→ Buffer Pool Bypass
→ Parallelization
→ Zone Maps
→ Late Materialization
→ Heap Clustering
CMU 15-445/645 (Fall 2019)

19
ZONE MAPS
Pre-computed aggregates for the attribute values

in a page. DBMS checks the zone map first to
decide whether it wants to access the page.
Original Data Zone Map

val type val
SELECT * FROM table 100 MIN 100
WHERE val > 600 200 MAX 400
300 AVG 280
400 SUM 1400
400 COUNT 5
CMU 15-445/645 (Fall 2019)
20
L AT E M AT E R I A L I Z AT I O N
DSM DBMSs can delay stitching together tuples

until the upper parts of the query plan.
SELECT AVG(foo.c)
γ AVG(foo.c) FROM
ON
foo JOIN bar
foo.b = bar.b
WHERE foo.a > 100
⨝ foo.b=bar.b
s
a b c
a>100 0
1
bar foo 2
3
CMU 15-445/645 (Fall 2019)
20

SELECT AVG(foo.c)
γ AVG(foo.c) FROM
ON
foo JOIN bar
foo.b = bar.b
WHERE foo.a > 100
⨝ foo.b=bar.b
s
a b c
a>100 Offsets 0
1
bar foo 2
3
CMU 15-445/645 (Fall 2019)
20

SELECT AVG(foo.c)
γ AVG(foo.c) FROM
ON
foo JOIN bar
foo.b = bar.b
WHERE foo.a > 100
⨝ foo.b=bar.b Offsets
s
a b c
a>100 Offsets 0
1
bar foo 2
3
CMU 15-445/645 (Fall 2019)
20

SELECT AVG(foo.c)
γ AVG(foo.c) Result FROM
ON
foo JOIN bar
foo.b = bar.b
WHERE foo.a > 100
⨝ foo.b=bar.b Offsets
s
a b c
a>100 Offsets 0
1
bar foo 2
3
CMU 15-445/645 (Fall 2019)
21
HEAP CLUSTERING
Tuples are sorted in the heap's pages Scan Direction

using the order specified by a
clustering index.
If the query accesses tuples using the 101 102 103 104
clustering index's attributes, then the
DBMS can jump directly to the pages
that it needs.
CMU 15-445/645 (Fall 2019)

22
INDEX SCAN
The DBMS picks an index to find the tuples that

the query needs.
Which index to use depends on:

→ What attributes the index contains
→ What attributes the query references
→ The attribute's value domains
→ Predicate composition
→ Whether the index has unique or non-unique keys
CMU 15-445/645 (Fall 2019)

23
INDEX SCAN
SELECT * FROM students
Suppose that we a single table with WHERE age < 30
100 tuples and two indexes: AND dept = 'CS'
→ Index #1: age AND country = 'US'
→ Index #2: dept
Scenario #1 Scenario #2
There are 99 people There are 99 people in
under the age of 30 but the CS department but
only 2 people in the CS only 2 people under the
department. age of 30.
CMU 15-445/645 (Fall 2019)
24
M U LT I - I N D E X S C A N
If there are multiple indexes that the DBMS can

use for a query:
→ Compute sets of record ids using each matching index.
→ Combine these sets based on the query's predicates
(union vs. intersect).
→ Retrieve the records and apply any remaining predicates.
Postgres calls this Bitmap Scan.
CMU 15-445/645 (Fall 2019)

25
With an index on age and an index WHERE age < 30
on dept, AND dept = 'CS'
→ We can retrieve the record ids satisfying AND country = 'US'
age<30 using the first,
→ Then retrieve the record ids satisfying
dept='CS' using the second,
→ Take their intersection
→ Retrieve records and check
country='US'.
CMU 15-445/645 (Fall 2019)

26
Set intersection can be done with WHERE age < 30
bitmaps, hash tables, or Bloom filters. AND dept = 'CS'
AND country = 'US'
age<30 dept='CS'
record ids
CMU 15-445/645 (Fall 2019)

26
AND country = 'US'
age<30 dept='CS'
record ids record ids
CMU 15-445/645 (Fall 2019)

26
AND country = 'US'
age<30 dept='CS'
record ids record ids
fetch records country='US'
CMU 15-445/645 (Fall 2019)

27
I N D E X S C A N PA G E S O R T I N G
Scan Direction
Retrieving tuples in the order that
appear in an unclustered index is
inefficient.
101 102 103 104
The DBMS can first figure out all the
tuples that it needs and then sort them
based on their page id.
CMU 15-445/645 (Fall 2019)

27
Scan Direction
inefficient.
101 102 103 104
The DBMS can first figure out all the
tuples that it needs and then sort them
based on their page id.
CMU 15-445/645 (Fall 2019)

27
Scan Direction
inefficient.
101 102 103 104
The DBMS can first figure out all the Page 102
tuples that it needs and then sort them Page
Page
103
104
based on their page id. Page
Page
104
102
Page 103
Page 102
Page 102
Page 101
Page 103
Page 104
Page 103
CMU 15-445/645 (Fall 2019)

27
Scan Direction
inefficient.
101 102 103 104
The DBMS can first figure out all the Page 102 Page 101
tuples that it needs and then sort them Page
Page
103
104
Page
Page
101
102
based on their page id. Page
Page
104
102
Page
Page
102
102
Page 103 Page 102
Page 102 Page 103
Page 102 Page 103
Page 101 Page 103
Page 103 Page 104
Page 104 Page 104
Page 103 Page 104
CMU 15-445/645 (Fall 2019)

28
E X P R E S S I O N E VA L U AT I O N
The DBMS represents a WHERE clause FROM R JOIN S
as an expression tree. ON R.id = S.id
WHERE S.value > 100
The nodes in the tree represent
different expression types: AND
→ Comparisons (=, <, >, !=)
→ Conjunction (AND), Disjunction (OR)
→ Arithmetic Operators (+, -, *, /, %) = >
→ Constant Values
→ Tuple Attribute References
Attribute(R.id) Attribute(S.id) Attribute(value) Constant(100)
CMU 15-445/645 (Fall 2019)

29
SELECT * FROM S
WHERE B.value = ? + 1
CMU 15-445/645 (Fall 2019)

29
Execution Context
SELECT * FROM S Current Tuple Query Parameters Table Schema
WHERE B.value = ? + 1 (123, 1000) (int:999) S→(int:id, int:value)
Attribute(S.value) +
Parameter(0) Constant(1)
CMU 15-445/645 (Fall 2019)

29
Execution Context
CMU 15-445/645 (Fall 2019)

29
Execution Context
1000
CMU 15-445/645 (Fall 2019)

29
Execution Context
1000
999
CMU 15-445/645 (Fall 2019)
29
Execution Context
1000
999 1
CMU 15-445/645 (Fall 2019)
29
Execution Context
1000 1000
999 1
CMU 15-445/645 (Fall 2019)
29
Execution Context
=
true
1000 1000
999 1
CMU 15-445/645 (Fall 2019)
30
Evaluating predicates in this manner

=
is slow.
→ The DBMS traverses the tree and for each
node that it visits it must figure out what
the operator needs to do. Constant(1) Constant(1)
Consider the predicate "WHERE 1=1"
A better approach is to just evaluate

1 = 1
the expression directly.
→ Think JIT compilation
CMU 15-445/645 (Fall 2019)

31
CONCLUSION
The same query plan be executed in multiple ways.
(Most) DBMSs will want to use an index scan as

much as possible.
Expression trees are flexible but slow.
CMU 15-445/645 (Fall 2019)

32
NEXT CLASS
Parallel Query Execution
CMU 15-445/645 (Fall 2019)

Query Execution: Intro To Database Systems Andy Pavlo

Uploaded by

Copyright:

Available Formats

Query Execution: Intro To Database Systems Andy Pavlo

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Query Execution: Intro To Database Systems Andy Pavlo

Uploaded by

Copyright:

Available Formats

Query Execution

Intro to Database Systems Andy Pavlo

Homework #3 is due Wed Oct 9th @ 11:59pm

Mid-Term Exam is Wed Oct 16th @ 12:00pm

Project #2 is due Sun Oct 20th @ 11:59pm

CMU 15-445/645 (Fall 2019)

The output of the root node is the

CMU 15-445/645 (Fall 2019)

A DBMS's processing model defines how the

Approach #1: Iterator Model

CMU 15-445/645 (Fall 2019)

Each query plan operator implements a Next

Also called Volcano or Pipeline Model.

CMU 15-445/645 (Fall 2019)

Next() for t in child.Next():

Next() for t in child.Next(): ⨝ R.id=S.id

CMU 15-445/645 (Fall 2019)

for t in child.Next(): ⨝ R.id=S.id

CMU 15-445/645 (Fall 2019)

for t in child.Next(): ⨝ R.id=S.id

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

This is used in almost every DBMS. Allows for

Some operators have to block until their children

Output control works easily with this approach.

CMU 15-445/645 (Fall 2019)

Each operator processes its input all at once and

The output can be either whole tuples (NSM) or

CMU 15-445/645 (Fall 2019)

Better for OLTP workloads because queries only

Not good for OLAP queries with large

CMU 15-445/645 (Fall 2019)

Like the Iterator Model where each operator

CMU 15-445/645 (Fall 2019)

out = [ ] Tuple Batch out = [ ]

out = [ ] Tuple Batch out = [ ]

Ideal for OLAP queries because it greatly reduces

CMU 15-445/645 (Fall 2019)

PLAN PROCESSING DIRECTION

Approach #1: Top-to-Bottom

Approach #2: Bottom-to-Top

CMU 15-445/645 (Fall 2019)

Three basic approaches:

For each page in the table: for page in table.pages:

The DBMS maintains an internal

CMU 15-445/645 (Fall 2019)

This is almost always the worst thing that the

Sequential Scan Optimizations:

CMU 15-445/645 (Fall 2019)

Pre-computed aggregates for the attribute values

Original Data Zone Map

DSM DBMSs can delay stitching together tuples

DSM DBMSs can delay stitching together tuples

DSM DBMSs can delay stitching together tuples

DSM DBMSs can delay stitching together tuples

Tuples are sorted in the heap's pages Scan Direction

CMU 15-445/645 (Fall 2019)