Domains are important in design
E.g. Age: (16 - 70)
Range of each data type (domain)
Figure xxx
Each row is called a tuple
Domain is a set of atomic values
Values | Domain |
SSN | Set of 9 digit SSN |
GPA | (0.00, 4.00) |
Age | (16, 70) |
Dept | CS, EE, ME, MIS, Set of departments) |
Most specific data type for each domain
is denoted as
R (A1, A2, ... An)
where
attribute Ai has domain D dom(Ai)
Degree of relation = # of attributes
e.g. car (name, make, model, msrp)
degree = 4
dom (name) = all the car names
A relation (instance) r of relational schema R(A1, A2, ..., An) is denoted by r (R)
= set of n-triples r = { t1, t2, ..., tn}
OR
A relation r(R) is a subset of the cartesian product of the domains that define R
r(R) "is equal or subset of" (dom(A1) x dom(A2) x ... x dom(An))
=> x specific all possible combos of values from the underlying domain
Total number of triples in Cartesian product
= | dom (A1) | x | dom (A2) | x ... x | dom (An) |
Pick any valid combo => current relation state
| Names | x | Phone # | x | Age |
Order of tuples => does not affect relation
Page 140, Fig 6.1
Atomicity in tuple vales
An n-tuple t in r(R)
=> t = < v1, v2, ... , vn>
where vi represents value of attribute Ai
t[Ai] == t[Name] = <'Bubber Smith'>
t[A1, A2, ... , An] == t[Name, Salary, Age] = <'John', '2500', '40'>
Q, R, S denote rel names
q, r, s denote rel states
t, u, v denote tuples
Specify atomicity in each value of each attribute from dom(A)
data types - int, float, char ...
Set of all its attributes
Key of R
(SK of R) and (K -A = K') where K' is not SK of R
E.g.
SSN is key of STUDENT since there is no 2 students has same SSN
(Any attribute has SSN in there, will be superkey)
t1[SK] not equal t2[SK]
Key is minimal superkey ... at least set of all attriibutes
A relation schema may have more than one key (candidate key) (can be both)
P145, Fig 6.4 - Ned to choose one as primary key
E.g. SSN - primary key, cannot be null
= { R1, R2, ... , Rn}
Set of integnity constraints (IC)
A relational DB instance of S
= {r1, r2, ..., rm} where ri is an instance of Ri and ri satisfies IC.
Entity integnity:
No primary key can be null. E.g. unable to identify same tuples if null.
Referential Integrity Constriant
E.g.
EMPLOYEE
tuple i [------] ---------> DEPARTMENT (Must exist)
tuple j [------]
A foreign key FK is a set of attributes of some Ri if
Figure 6.7 P148
1. attr of Ri FK = attr of Rj PK ==> FK references Rj
(1)
dom of R1 (FK) = dom of R2 (PK)
=> FK refer to R2
(2)
t1 [FK] = t2 [PK]
t1 refers to t2
i.e. value of FK in t1 occurs in value of PK in t2 tuple t1 references tuple t2
(PK cannot be null).
E.g. Retrieve all names of employees whose
salary < salary (supervisor)
All type of constraints should be provided by DDL
Operations
insert - new tuples in a relation
delete - tuples
modify - change values of some attributes
Constraints violated?
1. Key constraints: cannot insert tuple with a PK that occurs in another tuple of the same relation.
t1 = <'B. Smith', '1234', 'xxx', ...>
t2 = <'C. Smith', '1234', null>
1234 is the same, so should be rejected.
2. Entity constraint
t1 = <'B.Smith', '1234', '624-1234'> (All attrs combined are primary key)
t2 = <'C.Smith', '3456', null>
Entity constraint => PK not equal to null
3. Referential Integrity Constraints
Insert
t1 = <'B. Smith', ... , 20>
If there is only 15 dept, when try to refer to 20, we should get invalid error.
Reject operation - inform user
Perform interactive correction to prompt user for correct info.
Should observe all constraints
Delete a primary key that is reference by a foreign key (ref integrity)
Reject operation - inform user
Accept delete and delete all related tuples
R1 --- R2
R1 --- R3
Delete all tuples in R1, R2 and R3. Or null the FK in R2 and R3.
Check if PK or FK, make sure new value refers to an existing tuple
Check correct data type and domain
Can use delete and insert (modify PK)
Operations could be divided into
Set operation (math set theory) OR
Relational DB operations
DE <dno = 4> (EMPLOYEE)
DE <salary > 80000> (EMPLOYEE)
DE <select cond> (<relation name>)
where
<select cond> is set of clauses s.t.
<attr name> <comparsion op><const value>
OR
<attr name> <comparsion op> <attr name>
E.g. DNO = 20
<comparsion op> = {=, <, <=, =>, >, noteq}
<const val> = from attributes domain
AND, OR, NOT
E.g.
DE (DNO = 4 AND SALARY > 80000) OR (DNO = 5 AND SALARY > 20000) (EMPLOYEE)
Unary, i.e. used on one relation
Applied to each tuple individually
Degree of resulting relation = degree (# of attr) of orig relation
# tuples in result <= # tuples in original relation
Faction of tuple get selected
Ration: # of selected tuple / total #
tells the nature of tuples in your relation
increase access performance
tell you what common query and selectivity and can tune system
DE <cond1> (DE <cond2> (R))
= DE <cond2> (DE <cond1> (R))
Also
DE <cond1> (DE <cond2> ( ... (DE <condn> (R)))
converted to single select
DE <cond1> and <cond2> and ... and <condn> (R)
Select certain columns. Reject the resst
E.g. PI (LNAME, FNAME, SALARY) (EMPLOYEE)
Not Communtative
PI AB (PI ABC (R))
does not equal to
PI ABC (PI AB (R))
PI <attribute list> (<Relation name>)
size -> number of tuples
size(PI (R)) <= size (R)
Degree (resulting rel T1) <= Degree (R)
PI <attribute list> (R) never eq Size(R)
P157 Fig 6.9
PI fname, lname, salary (DE dno=5 (EMPLOYEE))
DE dno = 5 (PI fname, lname, salary) (EMPLOYEE)
Use standard math set ops
Binary op: # operands = 2
Validity of results of such operators
=> 2 reln must have same tuple types (aka union compatibility)
=> same degree
=> same domains
Union:
R1 U R2 => all tuples in R1 or in R2 or in both R1 & R2 (duplicates eliminated)
Intersection:
R1 . R2 => all tuples in both R1 and R2
Difference = R1 R2, all tuples in R1 that are not in R2
(P 159, Fig 6.11)
Union - Yes
Intersection - Yes
R1 U R2 = R2 U R1
R1 . R2 = R2 . R1
Difference is not commutative
R1 U (R2 U R3) = (R1 U R2) U R3
R1 . (R2 . R3) = (R1 . R2) . R3
(R1 U R2) . R3 not eq R1 U (R2 . R3)
aka cross product
combine all tuples
union compatability not required
Ra: m
Rb: n
A X B => m x n