Data Models, Schemas, and Instances
Data Model
A data model is a collection of concepts that can be used to describe the structure of a database. It provides a formal framework for organizing, representing, and manipulating data. Data models play a central role in supporting data abstraction.
Specifically, a data model provides constructs for describing the data itself, the relationships among data items, the constraints that apply to them, and their meaning (semantics).
Data Abstraction
Data abstraction generally refers to the suppression of details of data organization and storage, and the highlighting of essential features for an improved understanding of data.
One of the main characteristics of the database approach is to support data abstraction so that different users can perceive data at their preferred level of detail.
Categories of Data Models
Many data models have been proposed. They can be categorized according to the types of concepts they use to describe the database structure:
High-level (conceptual) data models: Provide concepts that are close to the way users perceive data. These models are used during the early phases of database design and include concepts like entities, attributes, and relationships.
Low-level (physical) data models: Describe the details of how data is stored on the computer storage media, typically magnetic disks. These models include information such as record layouts, storage structures, and access paths. They are intended for use by system programmers and database administrators.
Representational (implementation) data models: Provide concepts that end users can understand but that are not too far removed from the way data is organized in storage. These models hide many storage details but can be implemented directly on a computer system. They are the models most commonly used in commercial DBMSs.
Components of a Data Model
Structural Component:
Defines how data is organized in the database. This includes:
Data types
Entity: Represents a real-world object or concept, such as an employee or project.
Attribute: Represents a property or characteristic of an entity, such as name or salary.
Relationship: Represents an association among two or more entities.
Constraints (such as keys, cardinality, and data integrity rules)
Manipulative Component:
Specifies the operations that can be performed on the data. These include:
Basic operations such as insert, delete, update, and retrieve
Query languages (e.g., SQL for relational databases)
Advanced or user-defined operations depending on the model (e.g., traversals in graph databases)
Dynamic or Behavioral Component (in some models):
Defines the behavior of the database, that is, how its contents change over time in response to events and application logic. This includes:
Triggers
Rules
User-defined procedures or operations
Support for modeling workflows or application-specific behaviors
Different data models offer different levels of abstraction and are suited to different types of applications.
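The three components described above can be sketched with a small relational example. This is only an illustration, assuming SQLite through Python's standard sqlite3 module; the employee and project entities and the name and salary attributes come from the text, while the works_on, salary_log, and log_salary_change names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structural component: data types, entities, attributes,
# a relationship (works_on), and constraints (keys, NOT NULL, CHECK).
# Behavioral component: a trigger that reacts to salary updates.
conn.executescript("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    salary REAL CHECK (salary >= 0)
);
CREATE TABLE project (
    proj_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);
CREATE TABLE salary_log (
    emp_id     INTEGER,
    old_salary REAL,
    new_salary REAL
);
CREATE TRIGGER log_salary_change
AFTER UPDATE OF salary ON employee
BEGIN
    INSERT INTO salary_log VALUES (OLD.emp_id, OLD.salary, NEW.salary);
END;
""")

# Manipulative component: insert, update, and retrieve through SQL.
conn.execute("INSERT INTO employee VALUES (1, 'Ada', 5000.0)")
conn.execute("INSERT INTO project VALUES (10, 'Catalog')")
conn.execute("INSERT INTO works_on VALUES (1, 10)")
conn.execute("UPDATE employee SET salary = 5500.0 WHERE emp_id = 1")

rows = conn.execute("SELECT name, salary FROM employee").fetchall()
log = conn.execute("SELECT * FROM salary_log").fetchall()
print(rows)  # current employee data
print(log)   # rows written by the trigger, not by the application
```

The application never writes to salary_log directly; the trigger does, which is what distinguishes the behavioral component from the plain operations of the manipulative component.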
Examples of Common Data Models
Relational Model: Uses tables (relations), rows (tuples), and columns (attributes). The most widely used model in commercial DBMSs.
Entity-Relationship (ER) Model: A high-level conceptual model used in database design.
Object-Oriented Model: Represents data as objects, similar to object-oriented programming.
Document Model: Stores data in document formats (e.g., JSON, XML). Used in NoSQL systems.
Graph Model: Represents data as nodes and edges. Useful in applications like social networks and recommendation systems.
Key-Value Model: Simplest form of NoSQL database, where each item is stored as a key-value pair.
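To make the differences between these models concrete, the sketch below represents one fact ("student Ana is enrolled in the Databases course") in four of them, using plain Python structures as stand-ins for real systems. All names and values here are hypothetical, and actual DBMSs expose these structures through their own query languages rather than Python functions.

```python
import json

# Relational model: rows in tables, linked by key values.
student_rows = [(1, "Ana")]            # (student_id, name)
course_rows = [(101, "Databases")]     # (course_id, title)
enrolled_rows = [(1, 101)]             # (student_id, course_id)

# Document model: one nested, self-contained document.
student_doc = {
    "_id": 1,
    "name": "Ana",
    "courses": [{"id": 101, "title": "Databases"}],
}

# Graph model: nodes plus labeled edges between them.
nodes = {
    "s1": {"label": "Student", "name": "Ana"},
    "c101": {"label": "Course", "title": "Databases"},
}
edges = [("s1", "ENROLLED_IN", "c101")]

# Key-value model: an opaque value stored under a single key.
kv_store = {"student:1": json.dumps(student_doc)}


def courses_relational(student_id):
    """Join enrolled_rows with course_rows on course_id."""
    titles = {cid: title for cid, title in course_rows}
    return [titles[cid] for sid, cid in enrolled_rows if sid == student_id]


def courses_document(doc):
    """Read the nested list inside the document."""
    return [c["title"] for c in doc["courses"]]


def courses_graph(node_id):
    """Follow ENROLLED_IN edges leaving the given node."""
    return [nodes[dst]["title"]
            for src, rel, dst in edges
            if src == node_id and rel == "ENROLLED_IN"]


def courses_key_value(key):
    """Fetch by key; the store itself knows nothing about the value."""
    return courses_document(json.loads(kv_store[key]))


print(courses_relational(1), courses_document(student_doc),
      courses_graph("s1"), courses_key_value("student:1"))
```

The same question is answered by a join in the relational form, by nested access in the document form, by edge traversal in the graph form, and by a single key lookup followed by application-side decoding in the key-value form.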
Schemas, Instances, and Database State
In a data model, it is important to distinguish between the description of the database and the actual data stored in it.
A schema is the overall description of the database. It defines the structure of data and is specified during database design. Schemas change infrequently and act as a blueprint for how data is organized. Each object in the schema—such as STUDENT or COURSE—is called a schema construct.
An instance (or database state) is the actual data in the database at a particular moment in time. It reflects the current content of the database and can change frequently as data is inserted, deleted, or modified.
The DBMS is responsible for ensuring that each instance of the database conforms to the constraints and structure defined in the schema. This ensures that every state of the database is valid.
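A minimal sketch of this enforcement, assuming SQLite through Python's sqlite3 module: the STUDENT construct comes from the text, while the column names, the class_year range, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: structure plus constraints, declared once.
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        class_year INTEGER CHECK (class_year BETWEEN 1 AND 5)
    )
""")

# A valid insert moves the database from one valid state to another.
conn.execute("INSERT INTO student VALUES (17, 'Smith', 1)")

# Each of these rows violates one schema constraint.
candidates = [
    (17, "Brown", 2),  # duplicate primary key
    (8, None, 2),      # NOT NULL violated
    (9, "Lee", 9),     # CHECK violated
]

rejected = []
for row in candidates:
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append((row, str(exc)))

state = conn.execute("SELECT * FROM student").fetchall()
print(state)          # only the valid row is part of the state
print(len(rejected))  # number of inserts the DBMS refused
```

Because the DBMS refuses each violating insert, the database state after the loop is the same valid state it was in before.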
The DBMS stores the schema definitions and constraints—also known as metadata—in the DBMS catalog. The system refers to this catalog to interpret and enforce the structure of the database.
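As one concrete illustration, SQLite keeps its catalog in a built-in table named sqlite_master and exposes column metadata through PRAGMA table_info. The sketch below assumes SQLite via Python's sqlite3 module; other DBMSs organize their catalogs differently, and the column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
)

# The catalog lists the schema constructs, independent of any stored data.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# Column-level metadata: each row is (cid, name, type, notnull, dflt_value, pk).
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

# The schema exists even though the current database state is empty.
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(catalog)
print(columns)
print(row_count)
```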
The schema is sometimes referred to as the intension of the database.
The current database state or instance is referred to as the extension of the schema.
Changes to the schema may occasionally be required. For example, we may decide to add a new attribute such as Date_of_birth to the STUDENT entity. This process is called schema evolution.
Most modern DBMSs support schema evolution, and these changes can often be applied while the database remains operational.
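The Date_of_birth example can be sketched as follows, assuming SQLite through Python's sqlite3 module. Storing the date as text and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO student VALUES (17, 'Smith')")

# Schema evolution: add a new attribute to an existing schema construct.
conn.execute("ALTER TABLE student ADD COLUMN date_of_birth TEXT")

# New rows can supply the attribute; existing rows get NULL for it.
conn.execute("INSERT INTO student VALUES (8, 'Brown', '2004-03-15')")

rows = conn.execute("SELECT * FROM student ORDER BY student_id").fetchall()
print(rows)
```

The row inserted before the change remains valid under the new schema, with a NULL in the added column.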