Query Optimization 2 - Stanford University
Query Optimization 2
Instructor: Matei Zaharia cs245.stanford.edu
Recap: Data Statistics
Information about tuples in a table that we can use to estimate costs
- Must be approximated for intermediate tables
We saw one way to do this for 4 statistics:
- T(R) = # of tuples in R
- S(R) = average size of tuples in R
- B(R) = # of blocks to hold R's tuples
- V(R, A) = # distinct values of attribute A in R
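As a rough illustration of the four statistics, the sketch below computes them for a small in-memory table. The function name `table_stats`, the fixed tuple size, the block size, and the rule that tuples do not span blocks are all assumptions made for this example; none of them come from the lecture.

```python
import math


def table_stats(rows, attribute, tuple_size_bytes, block_size_bytes):
    """Return T(R), S(R), B(R), V(R, A) for a toy table.

    Assumptions (illustrative only): every tuple has the same size,
    and a tuple is never split across two blocks.
    """
    t = len(rows)                                   # T(R): number of tuples
    s = tuple_size_bytes                            # S(R): average tuple size
    tuples_per_block = block_size_bytes // s        # whole tuples per block
    b = math.ceil(t / tuples_per_block)             # B(R): blocks needed
    v = len({row[attribute] for row in rows})       # V(R, A): distinct A values
    return {"T": t, "S": s, "B": b, "V": v}


# Hypothetical data: 10 tuples whose attribute A cycles through 4 values.
rows = [{"A": i % 4, "B": i} for i in range(10)]
stats = table_stats(rows, "A", tuple_size_bytes=100, block_size_bytes=512)
print(stats)
```

With 100-byte tuples and 512-byte blocks, five tuples fit per block, so the ten tuples need two blocks.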
CS 245
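Another Type of Data Stats: Histograms
[Figure: histogram of attribute A in R. Horizontal axis: A value, with ticks at 10, 20, 30, 40. Vertical axis: number of tuples in R with A value in a given range; the count labels shown are 15, 12, 10, 5.]
Question posed: how many tuples does a selection on A (comparing against a constant a) return from R?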
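One common way to use such a histogram is to estimate how many tuples fall in a range of A values. The sketch below assumes each bucket is a `(low, high, count)` triple and that values are spread uniformly within a bucket. The bucket boundaries and counts are made up for illustration and are not read off the slide's figure, and `estimate_range_count` is a hypothetical helper name.

```python
def estimate_range_count(buckets, lo, hi):
    """Estimate the number of tuples with lo <= A < hi.

    buckets: list of (bucket_low, bucket_high, count) triples.
    Assumes values are uniformly distributed inside each bucket,
    so a partially covered bucket contributes a proportional share.
    """
    total = 0.0
    for b_lo, b_hi, count in buckets:
        overlap = min(hi, b_hi) - max(lo, b_lo)     # width of the covered part
        if overlap > 0:
            total += count * overlap / (b_hi - b_lo)
    return total


# Hypothetical equi-width histogram on attribute A (not the slide's data).
buckets = [(0, 10, 4), (10, 20, 8), (20, 30, 6), (30, 40, 2)]

estimate = estimate_range_count(buckets, 15, 30)
print(estimate)  # half of the 10-20 bucket (4) plus all of the 20-30 bucket (6)
```

The uniformity assumption is the weak point: the narrower the buckets, the less it matters, at the cost of storing more counts.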
Outline
What can we optimize?
Rule-based optimization
Data statistics
Cost models
Cost-based plan selection
Spark SQL