Relational Algebra CSE 544: Relational Operators, Sorting 1. Union and ...
嚜燎elational Algebra
? Operates on relations, i.e. sets
每 Later: we discuss how to extend this to bags
? Five operators:
CSE 544:
Relational Operators, Sorting
Wednesday, 5/12/2004
每
每
每
每
每
Union: ﹍
Difference: Selection: 考
Projection: 旭
Cartesian Product: ℅
? Derived or auxiliary operators:
每 Intersection, complement
每 Joins (natural,equi-join, theta join, semi-join)
每 Renaming: 老
1. Union and 2. Difference
? R1 ﹍ R2
? Example:
ActiveEmployees ﹍ RetiredEmployees
What about Intersection ?
?
?
?
?
It is a derived operator
R1 ﹎ R2 = R1 每 (R1 每 R2)
Also expressed as a join (will see later)
Example
UnionizedEmployees ﹎ RetiredEmployees
? R1 每 R2
? Example:
AllEmployees 每 RetiredEmployees
3. Selection
? Returns all tuples which satisfy a condition
? Notation: 考c(R)
? Examples
考Salary > 40000 (Employee)
考name = ※Smithh§ (Employee)
4. Projection
? Eliminates columns, then removes duplicates
? Notation: 旭 A1,#,An (R)
? Example: project social-security number and
names:
旭 SSN, Name (Employee)
Output schema: Answer(SSN, Name)
? The condition c can be =, , ≡,
1
Cartesian Product Example
5. Cartesian Product
? Each tuple in R1 with each tuple in R2
? Notation: R1 ℅ R2
? Example:
Employee ℅ Dependents
? Very rare in practice; mainly used to
express joins
Renaming
? Changes the schema, not the instance
? Notation: 老 B1,#,Bn (R)
? Example:
老LastName, SocSocNo (Employee)
Output schema: Answer(LastName, SocSocNo)
Employee
Name
John
Tony
SSN
999999999
777777777
Dependents
EmployeeSSN
999999999
777777777
Dname
Emily
Joe
Employee x Dependents
Name
SSN
EmployeeSSN
John
999999999 999999999
John
999999999 777777777
Tony
777777777 999999999
Tony
777777777 777777777
Renaming Example
Employee
Name
John
Tony
SSN
999999999
777777777
老LastName, SocSocNo (Employee)
LastName
John
Tony
Natural Join
? Notation: R1 |℅| R2
? Meaning: R1 |℅| R2 = 旭A(考C(R1 ℅ R2))
? Where:
每 The selection 考C checks equality of all common
attributes
每 The projection eliminates the duplicate common
attributes
Dname
Emily
Joe
Emily
Joe
SocSocNo
999999999
777777777
Natural Join Example
Employee
Name
John
Tony
SSN
999999999
777777777
Dependents
SSN
999999999
777777777
Dname
Emily
Joe
Employee
Dependents =
旭Name, SSN, Dname(考
考 SSN=SSN2(Employee x 老SSN2, Dname(Dependents))
Name
John
Tony
SSN
999999999
777777777
Dname
Emily
Joe
2
Natural Join
? R=
S=
Natural Join
A
B
B
C
X
Y
Z
U
X
Z
V
W
Y
Z
Z
V
Z
V
? R |℅| S=
? Given the schemas R(A, B, C, D), S(A, C, E),
what is the schema of R |℅| S ?
? Given R(A, B, C), S(D, E), what is R |℅| S ?
A
B
C
X
Z
U
X
Z
V
Y
Z
U
Y
Z
V
Z
V
W
? Given R(A, B), S(A, B), what is R |℅| S ?
Theta Join
? A join that involves a predicate
? R1 |℅| 牟 R2 = 考 牟 (R1 ℅ R2)
? Here 牟 can be any condition: =, ≡
Eq-join
? A theta join where 牟 is an equality
? R1 |℅| A=B R2 = 考 A=B (R1 ℅ R2)
? Example:
Employee |℅| SSN=SSN Dependents
? Most useful join in practice
Semijoin
? R |℅ S = 旭 A1,#,An (R |℅| S)
? Where A1, #, An are the attributes in R
? Example:
Employee |℅ Dependents
Semijoins in Distributed
Databases
? Semijoins are used in distributed databases
Dependents
Employee
SSN
Name
...
...
SSN
...
Dname
Age
...
network
Employee
(考 age>71(Dependents))
Employee |℅|
|℅| ssn=ssn
(Dependents))
ssn=ssn (考 age>71
R = Employee |℅ T
T = 旭 SSN 考 age>71 (Dependents)
Answer = R |℅| Dependents
3
Complex RA Expressions
Operations on Bags
旭 name
buyer-ssn=ssn
pid=pid
seller-ssn=ssn
旭 ssn
旭 pid
考name=fred
Person
Purchase
Person
考name=gizmo
A bag = a set with repeated elements
Relational Engines work on bags, not sets !
All operations need to be defined carefully on bags
? {a,b,b,c}﹍{a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f}
? {a,b,b,b,c,c} 每 {b,c,c,c,d} = {a,b,b,d}
? 考C(R): preserve the number of occurrences
? 旭A(R): no duplicate elimination
? Cartesian product, join: no duplicate elimination
Product
Logical Operators in the Bag
Algebra
?
?
?
?
?
?
?
Union, intersection, difference
Selection 考
Projection 旭
Join
Duplicate elimination 汛
Grouping 污
Sorting 而
Relational
Algebra
(on bags)
Query Plan:
? logical tree
? implementation
choice at every
node
? scheduling of
operations.
SELECT
SELECTcity,
city,count(*)
count(*)
FROM
FROMsales
sales
GROUP
GROUPBY
BYcity
city
HAVING
HAVINGsum(price)
sum(price)>>100
100
City=&seattle*
考 p > 100
c
sales
Architecture of a Database Engine
SQL query
buyer
考
旭 city, c
T(city,p,c)
污 city, sum(price) p, count(*)
Physical Operators
SELECT
SELECT S.buyer
S.buyer
FROM
FROMPurchase
PurchaseP,
P,Person
PersonQQ
WHERE
P.buyer=Q.name
WHERE P.buyer=Q.nameAND
AND
Q.city=&seattle*
Q.city=&seattle*AND
AND
Q.phone
>
&5430000*
Q.phone > &5430000*
Example
Parse Query
phone>*5430000*
Buyer=name
(Simple Nested Loops)
Query
optimization
Select Logical Plan
Logical
plan
Select Physical Plan
Purchase
(Table scan)
Person
(Index scan)
Query Execution
Physical
plan
Some operators are from relational
algebra, and others (e.g., scan)
are not.
4
Cost Parameters
In database systems the data is on disks, not in main memory
The cost of an operation = total number of I/Os
Cost parameters:
? B(R) = number of blocks for relation R
? T(R) = number of tuples in relation R
? V(R, a) = number of distinct values of attribute a
Cost
Cost of an operation =
number of disk I/Os needed to:
每 read the operands
每 compute the result
Cost of writing the result to disk is not included on the following
slides
Question: the cost of sorting a table with B blocks ?
Answer:
Sorting While Scanning
? Sometimes it is useful to have the output
sorted
? Three ways to scan it sorted:
每 If there is a primary or secondary index on it,
use it during scan
每 If it fits in memory, sort there
每 If not, use multi-way merge sort
Cost Parameters
? Clustered table R:
每 Blocks consists only of records from this table
每 B(R) > T(R) / blockSize
? Unclustered table R:
每 Its records are placed on blocks with other tables
每 When R is unclustered: B(R) > T(R)
? When a is a key, V(R,a) = T(R)
? When a is not a key, V(R,a)
Scanning Tables
? The table is clustered:
每 Table-scan: if we know where the blocks are
每 Index scan: if we have a sparse index to find the
blocks
? The table is unclustered
每 May need one read for each record
Cost of the Scan Operator
? Clustered relation:
每 Table scan:
? Unsorted: B(R)
? Sorted: 3B(R)
每 Index scan
? Unsorted: B(R)
? Sorted: B(R) or 3B(R)
? Unclustered relation
每 Unsorted: T(R)
每 Sorted: T(R) + 2B(R)
5
................
................
In order to avoid copyright disputes, this page is only a partial summary.
To fulfill the demand for quickly locating and searching documents.
It is intelligent file search solution for home and business.
Related download
- sql select order by
- relational algebra cse 544 relational operators sorting 1 union and
- example paragraph with descending order
- sql order by multiple columns
- oracle mooc sql fundamentals
- ascending or descending order
- gsort — ascending and descending sort stata
- sql sort group by
- sort statement in sql
- title gsort — ascending and descending sort
Related searches
- algebra 1 worksheets and answers
- algebra 1 terms and definitions
- algebra union and intersection definition
- 1 or 2 297 297 1 0 0 0 1 168 1 1 username and password verizon
- 1 or 3 297 297 1 0 0 0 1 168 1 1 username and password verizon
- 1 or 2 905 905 1 0 0 0 1 168 1 1 username and password verizon
- 1 or 3 905 905 1 0 0 0 1 168 1 1 username and password verizon
- 1 or 2 648 648 1 0 0 0 1 168 1 1 username and password verizon
- 1 or 3 648 648 1 0 0 0 1 168 1 1 username and password verizon
- 1 or 2 29 29 1 0 0 0 1 or uskqme9h 168 1 1 username and password verizon
- 1 or 3 29 29 1 0 0 0 1 or uskqme9h 168 1 1 username and password verizon
- 1 or 2 228 228 1 0 0 0 1 168 1 1 username and password verizon