High-Level Conceptual Data Model
The first step in database design is requirements collection and analysis. During this phase, database designers interview potential users to gather and document their data requirements.
In parallel with data requirements, it is helpful to identify the functional requirements of the application. These include user-defined operations (transactions) such as retrievals and updates that will be applied to the database.
Once the requirements are analyzed, a conceptual schema is created using a high-level conceptual data model. This process is referred to as conceptual design.
The conceptual schema is a concise and complete description of the data requirements, including:
Entity types
Relationships
Constraints
No implementation-related details are specified at this stage.
After designing the conceptual schema, basic data model operations (such as queries and updates) are defined. These confirm that the conceptual schema satisfies all functional requirements.
The next step is to implement the database using a commercial DBMS, typically using an implementation data model such as the relational model (SQL). The conceptual schema is then mapped to this implementation model — a step called logical design or data model mapping.
Logical design may be automated or assisted by database design tools.
Finally, in the physical design phase, storage structures, indexes, file organization, and access paths are specified. In parallel, application programs are designed and implemented as transactions corresponding to user requirements.
Entity Types, Entity Sets, Attributes, and Keys
The Entity-Relationship (ER) model describes data using:
Entities
Relationships
Attributes
Entities and Attributes
An entity is a real-world object that exists independently and can be either:
Physical (e.g., a person, car, or employee)
Conceptual (e.g., a company, job, or course)
Each entity has attributes—properties that describe the entity.
For example, an EMPLOYEE
entity might have: Name, Age, Address, Salary, Job
Attributes will have values, it become part of the data stored in the database.
Types of Attributes
1. Simple (Atomic) vs Composite Attributes
Simple (Atomic) Attributes:
Cannot be subdivided.
Example: Age, Gender, Salary.Composite Attributes:
Can be divided into subparts. The value of a composite attribute is the concatenation of the values of its component simple attributes. Example:Address
→Street_address
,City
,State
,Zip
Street_address
→Number
,Street
,Apartment_number
Composite attributes allow referencing either the entire attribute or its individual components.
2. Single-Valued vs Multivalued Attributes
Single-Valued Attribute:
Has only one value per entity.
Example: Age of a person.Multivalued Attribute:
Can have multiple values per entity.
Example:Colors
of a carCollege_degrees
of a person
Multivalued attributes may have constraints such as minimum and maximum values.
For example, a car may have 1 or 2 color values.
3. Stored vs Derived Attributes
Stored Attributes:
Directly stored in the database.
Example: Birth_dateDerived Attributes:
Calculated from stored attributes or related entities.
Example:Age
derived fromBirth_date
Number_of_employees
derived from counting employees in a department
4. Complex Attributes
A complex attribute involves nesting of composite and multivalued attributes.
We can represent arbitrary nesting by grouping components of a composite attribute between parentheses ( ) and separating the components with commas, and by displaying multivalued attributes between braces { }.
{Address_phone(
{Phone(Area_code, Phone_number)},
Address(Street_address(Number, Street, Apartment_number), City, State, Zip)
)}
Complex attributes provide flexibility to represent hierarchical attribute structures.
NULL Values
A NULL value is used when:
An attribute does not apply to an entity (e.g.,
Apartment_number
for a person living in a house).The value is unknown or missing at the time.
Types of NULL meanings:
Not Applicable — The attribute logically doesn’t apply.
Unknown (Exists but not known) — e.g., Height of a person is unknown.
Unknown (May or may not exist) — e.g., Home_phone of a person is NULL, and it's unclear if it exists at all.
Entity Types, Entity Sets, Keys, and Value Sets
Entity Types and Entity Sets
A database usually contains groups of entities that are similar. For example, employee entities share the same attributes, but each has its own value(s) for each attribute.
An entity type defines a collection (or set) of entities that have the same attributes.
Each entity type is described by its name and attributes.
The collection of all entities of a particular entity type at any point in time is called an entity set (or entity collection).
Entity sets are usually referred to using the same name as the entity type, though they are conceptually distinct.
Example:
EMPLOYEE
refers to both the entity type and the current collection of all employee entities in the database.
ER Diagram Notation
Entity types are represented as rectangular boxes enclosing the entity type name.
Attribute names are enclosed in ovals and connected to their entity type with straight lines.
Composite attributes are connected to their component attributes by straight lines.
Multivalued attributes are shown in double ovals.
Key Attributes of an Entity Type
An important constraint on entities is the key (uniqueness) constraint on attributes.
A key attribute has values that are distinct for each entity in the entity set. This value can be used to identify each entity uniquely.
Keys can be single attributes or composite attributes (a combination of several attributes).
Superfluous attributes should not be included in a key.
In ER diagrams, key attributes are underlined inside the oval.
Example: In the entity type
CAR
, bothVehicle_id
andRegistration
are key attributes.Registration
is a composite key made up ofState
andNumber
.
Some entity types have multiple keys.
An entity type with no key is called a weak entity type.
If two attributes are underlined separately, each is a distinct key.
The ER model doesn't define a primary key; this is chosen during mapping to a relational schema.
Value Sets (Domains) of Attributes
Each simple attribute of an entity type is associated with a value set (or domain), which defines the possible values it may have.
Value sets are similar to data types in programming (e.g.,
integer
,string
,Boolean
,float
,enum
, etc.).They are not typically shown in ER diagrams.
At any given time, only a subset of the domain values exist in the database—this subset reflects the current state of the miniworld.