Relational Model Concepts
The relational model represents the database as a collection of relations. Informally, each relation resembles a table of values or, to some extent, a flat file of records. It is called a flat file because each record has a simple linear or flat structure.
When a relation is thought of as a table of values:
Each row represents a collection of related data values.
A row typically corresponds to a real-world entity or relationship.
In formal relational model terminology:
A row is called a tuple.
A column header is called an attribute.
The table is called a relation.
The data type that defines the set of values for a column is called a domain.
Domain
A domain (D) is a set of atomic values.
Atomic means each value in the domain is indivisible as far as the relational model is concerned.
A datatype or format is specified for each domain.
Relation Schema
A relation schema (R) is denoted by:
R(A1, A2, A3, ..., An)
Where:
R is the name of the relation.
A1, A2, ..., An are the attributes.
Each attribute Ai is associated with a domain D, denoted as dom(Ai).
The degree or arity of a relation is the number of attributes n in the schema.
Relation Instance
A relation instance (r) of the schema R(A1, A2, A3, ..., An) — denoted as r(R) — is a set of n-tuples:
r = {t1, t2, t3, ..., tm}
Each n-tuple t is an ordered list of n values:
t = ⟨v1, v2, v3, ..., vn⟩
Where:
- Each vi (1 ≤ i ≤ n) is a value from dom(Ai) or a special NULL value.
Characteristics of Relations
Ordering of tuples in a relation:
Tuples are not ordered in a relation. The order in which tuples appear is not significant.Ordering of values within a tuple:
Although tuples are unordered in a set, the position of values within a tuple matters, as it corresponds to attributes.Values and NULLs in tuples:
Each value in a tuple is from the domain of its attribute or can be NULL, representing unknown or inapplicable values.
Relational Model Notation
The following notation is commonly used:
A relation schema R of degree n is denoted by:
R(A1, A2, A3, ..., An)Uppercase letters Q, R, S denote relation names.
Lowercase letters q, r, s denote relation states (instances of relations).
Letters t, u, v denote tuples.
The name of a relation schema (e.g.,
STUDENT
) also refers to the current set of tuples (state) in that relation.An attribute A belonging to a relation R can be qualified using dot notation:
R.AAn n-tuple t in a relation instance r(R) is denoted as:
t = ⟨v1, v2, ..., vn⟩,
where each vi corresponds to the value of attribute Ai.Tuples can also be denoted using named attributes:
- Example:
t[Name] = ⟨'Barbara Benson'⟩
t[Ssn, Gpa, Age] = ⟨'533-69-1238', 3.25, 19⟩
- Example:
Relational Model Constraints and Relational Database Schemas
The state of the whole database corresponds to the states of all its relations at a particular point in time. There are generally many restrictions or constraints on the actual values in a database state. Constraints on databases can generally be divided into four main categories:
Inherent model-based constraints (implicit constraints): Constraints that are inherent in the data model.
Schema-based constraints (explicit constraints): Constraints that can be directly expressed in the schemas of the data model, typically by specifying them in the DDL.
Application-based or semantic constraints (business rules): Constraints that cannot be directly expressed in the schemas of the data model and must be enforced by application programs or other means.
Data dependencies: Constraints that are functional dependencies or multivalued dependencies.
Domain Constraints
A domain constraint specifies that within each tuple, the value of each attribute A must be an atomic value from the domain dom(A).
Key Constraints and Constraints on NULL Values
A relation is defined as a set of tuples. All tuples in a relation must be distinct, meaning that no two tuples can have the same combination of values for all attributes.
A superkey (SK) specifies a uniqueness constraint such that no two distinct tuples in any state of relation R can have the same value for SK.
Every relation has at least one default superkey — the set of all its attributes.
A key (K) of a relation schema R is a superkey with the additional property that removing any attribute A from K results in a set of attributes that is no longer a superkey.
A key satisfies two properties:
Uniqueness: Two distinct tuples in any state of the relation cannot have identical values for all attributes in the key. This also applies to a superkey.
Minimality: A key is a minimal superkey; no attributes can be removed while still preserving uniqueness.
A relation schema may have more than one key. In this case, each is called a candidate key. One of them is usually designated as the primary key, typically underlined when specifying the relation.
A superkey is any set of attributes that uniquely identifies a tuple.
A candidate key is a minimal superkey.
Relational Databases and Relational Database Schemas
A relational database schema (S) is a set of relation schemas:
S = {R1, R2, R3, ..., Rm}
and a set of integrity constraints IC.
A relational database state (DB) of S is a set of relation states:
DB = {r1, r2, ..., rm}
such that each ri is a state of Ri and satisfies the integrity constraints specified in IC.
A database state that does not satisfy all constraints is called not valid.
A database state that does satisfy all constraints is called a valid state.
Integrity constraints are defined on the database schema and must hold on every valid state.
Entity Integrity, Referential Integrity, and Foreign Keys
Entity Integrity
- No primary key value can be NULL.
Referential Integrity
A referential integrity constraint is specified between two relations to maintain consistency among tuples.
A set of attributes FK in relation schema R1 is a foreign key referencing relation R2 if:
The attributes in FK have the same domain(s) as the primary key attributes PK of R2.
For any tuple t1 in the current state of R1, the value of FK either:
Occurs as the value of PK in some tuple t2 in R2, or
Is NULL.
R1 is called the referencing relation.
R2 is the referenced relation.
A foreign key can refer to the same relation it belongs to (self-referencing).
Other Types of Constraints
Semantic integrity constraints: e.g., the salary of an employee should never be more than their supervisor’s. Enforced using triggers or assertions.
State constraints: The database state must always be valid.
Transition constraints: e.g., the salary of an employee can only increase.
Update Operations, Transactions, and Dealing with Constraint Violations
Operations of the Relational Model:
Retrievals: Queries that return new relations by applying relational operators.
Updates: Modify the state of relations using:
Insert: Adds new tuples.
Delete: Removes existing tuples.
Update (Modify): Changes attribute values in existing tuples.
All operations must respect integrity constraints.
The Insert Operation
Provides attribute values for a new tuple t in relation R. Insert can violate:
Domain constraint: If an attribute value does not match its domain.
Key constraint: If the key value already exists.
Entity integrity: If a primary key value is NULL.
Referential integrity: If a foreign key value refers to a non-existent tuple.
The Delete Operation
Can violate referential integrity if the deleted tuple is referenced by foreign keys. Possible options when this happens:
Restrict
Cascade
Set null / Set default
The Update Operation
Used to modify attribute values in one or more tuples of a relation R.
Must specify a condition to identify tuples to update.
Must ensure no integrity constraints are violated.